---
title: "Getting started"
author: "Sebastian Lequime"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteDepends{ggplot2}
  %\VignetteDepends{igraph}
  %\VignetteDepends{viridis}
  %\VignetteDepends{ggnetwork}
---

# Introduction

`nosoi` aims to provide a modular framework to conduct epidemiological simulations under a wide range of scenarios, ranging from relatively simple to highly complex and parametric. This flexibility allows `nosoi` to take into account the effect of numerous covariates on the epidemic process, taken individually or in the form of interactions, without having to use advanced mathematics to formalize the model into differential equations.

At its core, `nosoi` generates a transmission chain (a link between hosts), the foundation of every epidemic process, which makes it possible to reconstruct the history of the epidemic in great detail.

`nosoi` is an agent-based model, which means it is centered on individuals, here called "hosts", that enter the simulation when they get infected. `nosoi` is also stochastic, and thus relies heavily on probabilities, mainly the 3 core probabilities (and 2 numbers) described below. It operates in discrete time.

## Critical assumption of `nosoi`

`nosoi` assumes that the maximum number of hosts infected during a simulation is orders of magnitude smaller than the total exposed population. This means that, currently, it does not account for the build-up of herd immunity during the simulated epidemic (although a proxy can be used, as discussed in several of the companion tutorials).

## Essential building blocks: 3 probabilities and 2 numbers

At its core, `nosoi` can be summarized by 3 probabilities and 2 numbers at a specific time point:

- `pExit` the probability to exit the simulation. For hosts, that means, for example, dying, being cured, or leaving the study area.
- `pMove` the probability to move, only relevant when your simulation has some kind of structure, either in a discrete or continuous space. For hosts, that could be, for example, traveling or changing status (gaining access to treatment).
- `sdMove` the standard deviation of the random walk (linked to the distance traveled in coordinate space), only relevant when a structure is imposed or assumed in a continuous space.
- `nContact` the number of contacts. How many potentially infectious contacts does an infected host have?
- `pTrans` the probability to transmit. When a contact occurs between an infected host and an uninfected one, this is the probability that the infection is transmitted to the uninfected host.

Each of these probabilities and numbers is computed during the simulation. In a very simple scenario, each one is a constant. In more complex models supported by `nosoi`, they can depend on a host's own parameters (e.g. genetics), dynamic parameters (for how long the host has been infected), environmental parameters (related to the host's location), or the particular period of the simulation (e.g. seasons, a mitigation campaign). All those parameters can be taken into account either individually, or for the population as a whole.

## What happens at each time step?

Time is discretized in `nosoi`.
Each time step follows the same pattern for each host:

```{r setupMatrix, echo=FALSE, message=FALSE,warning=FALSE}
if (!(requireNamespace("ggplot2", quietly = TRUE) &&
      requireNamespace("viridis", quietly = TRUE) &&
      requireNamespace("igraph", quietly = TRUE) &&
      requireNamespace("ggnetwork", quietly = TRUE))) {
  message("Packages 'ggplot2', 'viridis', 'igraph' and 'ggnetwork' are needed for plotting this figure.")
} else {
  library(ggplot2)
  library(viridis)
  library(igraph)
  library(ggnetwork)

  # each step leads to the next one: pExit -> pMove -> sdMove -> nContact -> pTrans
  transition.matrix = matrix(c(0,0,0,0,0,
                               1,0,0,0,0,
                               0,1,0,0,0,
                               0,0,1,0,0,
                               0,0,0,1,0),
                             nrow = 5, ncol = 5,
                             dimnames = list(c("pExit","pMove","sdMove","nContact","pTrans"),
                                             c("pExit","pMove","sdMove","nContact","pTrans")))

  # melting the matrix to get from -> to in one line with probability
  # melted.transition.matrix <- reshape2::melt(transition.matrix, varnames = c("from","to"),value.name="prob", as.is = TRUE)
  melted.transition.matrix <- as.data.frame.table(transition.matrix, stringsAsFactors = FALSE)
  colnames(melted.transition.matrix) <- c("from", "to", "prob")
  melted.transition.matrix <- subset(melted.transition.matrix, prob != 0)

  graph.Matrix <- igraph::graph.data.frame(melted.transition.matrix, directed = TRUE)
  graph.Matrix2 = igraph::layout_in_circle(graph.Matrix)

  if (utils::packageVersion("ggnetwork") > "0.5.12") {
    # unfortunately, the update to igraph 2.0.0 has broken the connection
    # between ggnetwork and igraph, see:
    # https://github.com/briatte/ggnetwork/issues/74

    # using ggnetwork to provide the layout
    graph.simA.network <- ggnetwork::ggnetwork(graph.Matrix, layout = graph.Matrix2, arrow.gap = 0.18)

    # plotting the network
    ggplot(graph.simA.network, aes(x = x, y = y, xend = xend, yend = yend)) +
      geom_edges(color = "grey70", arrow = arrow(length = unit(1, "lines"), type = "closed"), curvature = 0.2) +
      geom_nodes(aes(color = name), size = 30) +
      geom_nodetext(aes(label = name), color = "white", fontface = "bold", size = 3) +
      scale_color_viridis(guide = "none", discrete = TRUE) +
      theme_blank() +
      ylim(-0.5, 1.2) +
      xlim(-0.5, 1.2)
  } else {
    plot(graph.Matrix, vertex.shape = "none", label.dist = 1, layout = graph.Matrix2,
         vertex.color = "white",
         edge.curved = rep(-0.5, length = ecount(graph.Matrix)),
         edge.arrow.size = 0.5)
  }
}
```

- `pExit`: does the host exit the simulation? This step samples YES or NO based on the probability `pExit`.
- `pMove`: does the host move (if there is structure)? This step samples YES or NO based on the probability `pMove`.
- `sdMove`: how far does the host move (if the structure is a continuous space)? This step returns a number used as a distance in the continuous space, drawn with standard deviation `sdMove`.
- `nContact`: how many infectious contacts does the host have? This step yields an integer number `nContact`.
- `pTrans`: if there is an infectious contact, does the host transmit the infection to a new host? This step samples YES or NO based on the probability `pTrans`.

After these five propagation steps, the simulation moves to the next time step.

# Setting up the core functions

`nosoi` runs under a series of user-defined probabilities and numbers (see [General principles](nosoi.html#essential-building-blocs-3-probabilities-and-2-numbers)). Each is set up following the same principles. We provide here a detailed explanation of how to set up a function correctly so that it can be used in the simulator. This applies to the five core functions already mentioned: `pExit`, `pMove`, `sdMove`, `nContact` and `pTrans`.

## Expected output

Every function whose name starts with a `p` (i.e. `pExit`, `pMove` and `pTrans`) should return a single probability (a number between 0 and 1). `nContact` should return a positive natural number (positive integer). `sdMove` should return a real number (keep in mind that this number relates to your coordinate space).

## Functions' arguments

### `t`

A core function in `nosoi` should always explicitly depend on the time `t` (e.g.
`pExit(t)`), even if `t` is not used within the body of the function. For instance, to return a single constant value of 0.08, the function should be expressed as:

```{r pFunction1, eval = FALSE}
p_Function <- function(t){0.08}
```

The argument `t` represents the time since the host's initial infection, and can be used in the body of the function to model a time-varying probability or number. For instance, a quantity defined by the following function increases over time following the cumulative distribution function of a logistic distribution with parameters $\mu = 10$ and $s = 2$:

```{r pFunction2, eval = FALSE}
p_Function <- function(t){plogis(t,10,2)}
```

### `prestime`

The argument `prestime` can be used to represent the *absolute* time of the simulation, shared by all the hosts (as opposed to the relative time since infection `t`, which is individual-dependent). For instance, the following function can be used to produce a periodic pattern:

```{r pFunction3, eval = FALSE}
p_Function <- function(t,prestime){(sin(prestime/12)+1)/2}
```

### `current.in` and `current.env.value`

The arguments `current.in` (discrete structure) or `current.env.value` (continuous structure) can be used to represent the location of the host (only if a structured population is used), as shown below:

```{r pFunction4, eval = FALSE}
p_Function <- function(t,current.in){
  if(current.in=="A"){return(0)}
  if(current.in=="B"){return(0.5)}
  if(current.in=="C"){return(1)}} #discrete (between states "A","B" and "C")

p_Function <- function(t,current.env.value){current.env.value/100} #continuous
```

For more details on how to set up the influence of the structure, we refer to the tutorials on [discrete](discrete.html) and [continuous](continuous.html) structure.
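
Before handing such functions to the simulator, it can help to call them directly with a few representative inputs. The sketch below is only illustrative: the names `p_time`, `p_season`, `p_location` and `is_probability`, as well as the test values, are hypothetical and not part of `nosoi`. It simply checks that each function returns a single probability between 0 and 1.

```{r pFunctionCheck, eval = FALSE}
# Hypothetical example functions, mirroring the ones shown above
p_time <- function(t){plogis(t,10,2)}
p_season <- function(t,prestime){(sin(prestime/12)+1)/2}
p_location <- function(t,current.in){
  if(current.in=="A"){return(0)}
  if(current.in=="B"){return(0.5)}
  if(current.in=="C"){return(1)}}

# Hypothetical helper (not provided by nosoi): TRUE only for a single,
# non-missing number between 0 and 1
is_probability <- function(p){
  is.numeric(p) && length(p) == 1 && !is.na(p) && p >= 0 && p <= 1
}

p_time(t = 10)                        # 0.5: t equals the location parameter
p_season(t = 3, prestime = 0)         # 0.5: sin(0) is 0
p_location(t = 3, current.in = "B")   # 0.5

# Each call should yield a valid probability
stopifnot(is_probability(p_time(t = 10)),
          is_probability(p_season(t = 3, prestime = 0)),
          is_probability(p_location(t = 3, current.in = "B")))
```

Note that `p_location` returns `NULL` for a state other than "A", "B" or "C", which `is_probability` reports as invalid; a check of this kind can catch such gaps before running a simulation.
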

### `host.count`

The argument `host.count` can be used to represent the number of hosts present at a given location (only for a structured population), as in the following:

```{r pFunction4.2, eval = FALSE}
p_Function <- function(t,current.in, host.count){
  if(current.in=="A"){return(0)}
  if(current.in=="B" && host.count < 300 ){return(0.5)}
  if(current.in=="B" && host.count >= 300 ){return(0)}
  if(current.in=="C"){return(1)}} #discrete (between states "A","B" and "C")

p_Function <- function(t,current.env.value,host.count){(current.env.value-host.count)/100} #continuous
```

## Extra Parameters

Any of the parameters used in these functions can themselves depend on the individual host (in order to include some heterogeneity between hosts). For instance, the $\mu$ parameter of a logistic distribution can be determined for each individual host by another function to be specified:

```{r pFunction5, eval = FALSE}
p_Function <- function(t,pFunction.param1){plogis(t,pFunction.param1,2)}
```

Here, for instance, $\mu$ can be sampled from a normal distribution (mean = 10 and sd = 2):

```{r pFunction6, eval = FALSE}
p_Function_param1 <- function(x){rnorm(x,mean=10,sd=2)} #sampling one parameter for each infected individual
```

Notice that this function is expressed as a function of `x` instead of `t`. `x` appears in the body of the function as the number of draws to make from the distribution.

Every parameter function you specify should be gathered into a list, where the function determining the parameter for each individual (here, `p_Function_param1`) has to be named according to the name used in `p_Function` for this parameter (here, `pFunction.param1`).

```{r pFunction7, eval = FALSE}
p_Function_parameters <- list(pFunction.param1 = p_Function_param1)
```

## Combining arguments

We have previously shown that you can combine the time since infection `t` with other parameters such as `current.in` or `prestime`.
In fact, you can combine as many arguments as you want, making a function dependent on the time since infection, current location, present time and individual host-dependent parameters. However, they need to respect a specific order to be correctly parsed by the simulator: first `t`, then `prestime`, then `current.in` (discrete) or `current.env.value` (continuous) and finally individual host-dependent parameters.

```{r pFunction8, eval = FALSE}
p_Function <- function(t, prestime, current.in, pFunction.param1, pFunction.param2,...){}
```

# Going further

Once your core functions are ready, you can provide everything `nosoi` needs to run a simulation. A series of tutorials will guide you through setting up `nosoi` depending on your case, both for single-host and dual-host scenarios:

1. [Spread of a pathogen in a homogeneous population (no structure) of hosts](none.html), a "simple" scenario.
2. [Spread of a pathogen in a structured (discrete) population of hosts](discrete.html).
3. [Spread of a pathogen in a structured (continuous) population of hosts](continuous.html).

To get basic statistics or approximated phylogenetic trees from your simulation output, you can have a look at [this page](output.html).

To visualize your simulations, you can have a look at [these few examples](https://slequime.github.io/nosoi/articles/examples/viz.html).

A series of practical examples is also available:

- [Spread of dengue virus in a discrete space](https://slequime.github.io/nosoi/articles/examples/dengue.html).
- [Spread of Ebola virus in a continuous space](https://slequime.github.io/nosoi/articles/examples/ebola.html).
- [Epidemiological impact of mosquito vector competence](https://slequime.github.io/nosoi/articles/examples/vector-competence.html).