--- title: "User guide -- microxanox R Package for microbial oxic and anoxic ecosystem simulations" author: "Owen Petchey, Rainer M Krug" date: "`r format(Sys.time(), '%d %B, %Y')`" output: rmarkdown::html_vignette: toc: yes vignette: > %\VignetteIndexEntry{User guide -- microxanox R Package for microbial oxic and anoxic ecosystem simulations} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, echo = FALSE, output = FALSE, message = FALSE} ## should some of the more intense simulations be run... options(mc.cores=8) mdat <- function(name = "") { system.file(file.path("data_pkg", "user_guide", name), package = "microxanox") } ########## ### To generate the data_pkg, please uncomment the following three lines. # # mdat <- function(name = "") { # file.path(".", "data_pkg", "sym-sys-user_guide", name) # } # dir.create(mdat(), recursive = TRUE, showWarnings = FALSE) ########## knitr::opts_chunk$set( cache = FALSE, fig.align = "centre", fig.width = 7, fig.height = 7 ) library(tidyverse) library(deSolve) library(rootSolve) library(microxanox) library(patchwork) ## Set random number generator seed ## Is redundant if there are no stochastic components in the model set.seed(13) ``` # Package version The microxanox package may be update / changed, so please take care to work with a specific version. This user guide is requires at least version 0.4, and the loaded version is: ```{r} packageVersion("microxanox") ``` # Introduction The aims of the `microxanox` package are: - Facilitate reproduction of the results of [Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x). - Include new functionality that allows new research questions to be answered. We added functionality for: - Adding random noise in substrate concentrations. - Adding temporally varying oxygen diffusivity. - Including immigration. - Setting minimum population abundances. - Multiple strains per functional group. We also include a function for visualising results. It is provided for convenience, and not for production of perfectly formatted print-ready graphs. Here (again) is the original publication Bush et al (2017): And here is the Supplementary Information, including the table of parameter values as implemented here when using the `bush` parameter: ## Notation Note that in this package we use the notation from the Table S1 of the [Bush et al 2017 Supplementary information](https://static-content.springer.com/esm/art%3A10.1038%2Fs41467-017-00912-x/MediaObjects/41467_2017_912_MOESM1_ESM.pdf), and not the notation in the main text of [Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x). (There is only one small difference between these: In the main text the notation is simplified to `y_S_PB` ($y_{S,PB}$) (which is `y_SR_PB` ($y_{SR,PB}$) in Table S1) and `y_S_SB` ($y_{S,SB}$) (which is y_SO_SB ($y_{SO,SB}$) in Table S1). We also use `Pr_cB` ($Pr_B$) instead of `P_CB` ($P_{CB}$). ## Model dimensions - Time: hours - Volume: litres - Substrate quantity: micromoles - Organism quantity: cells # Modelling framework The framework used for running a simulation, and for varying aspects of a simulation run is: - Define a parameter set of class `runsim_parameter` (i.e. parameters and initial conditions) by using the functions `new_runsim_parameter()`, `new_initial_state()` and `new_strain_parameter()`. - The function call `run_simulation(parameter)` would therefore run a simulation using the parameter as defined in the object `parameter`. # Conditions that must be defined All parameter needed to run the simulations are encapsulated in an S3 object of class `runsim_parameter`. This object is returned by the function `new_runsim_parameter()`. ```{r} new_runsim_parameter() ``` All parameter values are set initially to NA (except the solver method, which is set to radau) and therefore need to be set. This can either be done by providing the parameters to be set as named arguments to the function `new_runsim_parameter()`: ```{r} new_runsim_parameter(dynamic_model = bushplus_dynamic_model) ``` Parameters can also be set after creation of a parameter set: ```{r} parameter_set1 <- new_runsim_parameter() parameter_set1$dynamic_model <- bushplus_dynamic_model ``` This parameter set has the following elements: ```{r} parameter_set1 ``` Now we will discuss elements in the parameter set that must be set, with an example of setting them: ## The dynamic model The ordinary differential equations for the rates of change are specified in the function `bushplus_dynamic_model()`. This augmented version of the model published in [[Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x)](https://www.nature.com/articles/s41467-017-00912-x) can handle multiple strains within each of the three functional groups, temporal variation in oxygen diffusivity, and events (see below for event definition). ```{r} parameter_set1$dynamic_model <- bushplus_dynamic_model ``` ## The parameter and initial state values The parameter values and initial state values are held in one object. The function `new_starter` can be used to create this object. In the follow, we specify one strain per functional group, all parameter values as defined in [[Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x)](https://www.nature.com/articles/s41467-017-00912-x), and starting conditions as the anoxic favouring conditions in [Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x). ```{r} parameter_set1$strain_parameter <- new_strain_parameter( n_CB = 1, values_CB = "bush", n_PB = 1, values_PB = "bush", n_SB = 1, values_SB = "bush", values_other = "bush", values_initial_state = "bush_anoxic_fig2ab" ) ``` Here, `n_CB`, `n_{PB}`, and `n_{SB}` are the number of strains within each functional group and `values_{CB}` (etc) are the parameter values of the strains, i.e. `"bush"` uses the parameter as used in the Bush paper. The same applies to `values_initial_state`, which sets the initial state of the system to a preset from the [Bush et al 2017](https://www.nature.com/articles/s41467-017-00912-x) publication when using `"bush_anoxic_fig2ab"`. Here are the parameter values of the CB strain, for example ```{r} parameter_set1$strain_parameter$CB ``` and these are initial state values used to start a simulation: ```{r} parameter_set1$strain_parameter$initial_state ``` ### Modifying initial states and parameter values Note that the function `new_initial_state` can be used to get a couple of preset initial states: ```{r} new_initial_state(values = "bush_oxic_fig2cd") ``` And the function `new_CB_strain_parameter` (and SB and PB equivalents) can be used to get preset organism parameters. Currently there is only one preset, however: "bush". ```{r} new_CB_strain_parameter(values = "bush") ``` ## The duration of the simulation: ```{r} parameter_set1$sim_duration <- 3600 ``` ## The sampling interval How frequently are the values of state variables recorded: ```{r} parameter_set1$sim_sample_interval <- 1 ``` ## The oxygen diffusivity dynamics Specified by giving a vector of values of $a_O$. These are then automatically spaced equally along the duration of the simulation. Here we specify only two values of $a_O$, and so these are the start and end values, and they are the same. Hence there is no temporal change in $a_O$. ```{r} parameter_set1$log10a_series <- c( log10(parameter_set1$strain_parameter$a_O), log10(parameter_set1$strain_parameter$a_O) ) ``` Note that here we use the already set default value of $a_O$ and therefore must have already defined the default parameter values. ## Events These are "things" that happen during the simulation. One has to specify what happens, and when it happens. ### What happens There are various available event definitions. Please see the functions for more information. Even definition 1 can add noise to a state variable, and/or assure that abundances are limited to a minimum: ```{r} parameter_set1$event_definition <- event_definition_1 ``` ### When it happens Here set to an event every 10 time units. ```{r} parameter_set1$event_interval <- 10 ``` ### Default noise event options Default to no noise added to substrate concentrations: ```{r} parameter_set1$noise_sigma <- 0 ``` ### Default minimum abundance event Default to following minimum abundances: ```{r} parameter_set1$minimum_abundances <- rep(0, 3) names(parameter_set1$minimum_abundances) <- c("CB", "PB", "SB") ``` ## Setting parameters The parameters can also be set in the command that creates the `run_sim_parameter` object: ```{r} sp <- new_strain_parameter( n_CB = 1, values_CB = "bush", n_PB = 1, values_PB = "bush", n_SB = 1, values_SB = "bush", values_other = "bush", values_initial_state = "bush_anoxic_fig2ab" ) parameter_set1 <- new_runsim_parameter( dynamic_model = bushplus_dynamic_model, strain_parameter = sp, sim_duration = 3600, sim_sample_interval = 1, log10a_series = c( log10(sp$a_O), log10(sp$a_O) ), event_definition = event_definition_1, event_interval = 10, noise_sigma = 0, minimum_abundances = rep(0, 3) ) names(parameter_set1$minimum_abundances) <- c("CB", "PB", "SB") rm(sp) ``` The advantage of this approach is that the names of the parameter are checked and an error is raised when it does not match, e.g. ```{r, error = TRUE} parameter <- new_runsim_parameter( dynamical_model = bushplus_dynamic_model ) ``` # Run some example simulations ## 1 strain anoxic start Once all the conditions are assigned to the parameter object, then these can be given to the `run_simulation()`: ```{r eval = !file.exists(mdat("anoxic_start_sim_results.RDS"))} anoxic_start_sim_results <- run_simulation(parameter_set1) ``` This will return an object of class `run_sim_results` which is identical to `parameter`, except that it contains one additional element, namely `result`, which contains the results of the simulation. Consequently, saving `anoxic_start_sim_results` will save all parameters used to (re)run the simulation as well as the result: ```{r eval = !file.exists(mdat("anoxic_start_sim_results.RDS")), echo = FALSE} saveRDS(anoxic_start_sim_results, file = mdat("anoxic_start_sim_results.RDS")) ``` As a sidenote, the simulation can be run again by simply calling ```{r eval = !file.exists(mdat("anoxic_start_sim_results.RDS"))} anoxic_start_sim_results_2 <- run_simulation(anoxic_start_sim_results) ``` The convenience function `plot_dynamics` can be used to easily visualise the dynamics of a simulation: ```{r echo = FALSE} anoxic_start_sim_results <- readRDS(file = mdat("anoxic_start_sim_results.RDS")) ``` ```{r} plot_dynamics(anoxic_start_sim_results) ``` ## 1 strain oxic start ```{r} parameter_set1$strain_parameter <- new_strain_parameter(values_initial_state = "bush_oxic_fig2cd") ``` ```{r eval = !file.exists(mdat("oxic_start_sim_results.RDS"))} oxic_start_sim_results <- run_simulation(parameter_set1) saveRDS(oxic_start_sim_results, file = mdat("oxic_start_sim_results.RDS")) ``` ```{r echo = FALSE} oxic_start_sim_results <- readRDS(file = mdat("oxic_start_sim_results.RDS")) ``` ```{r} plot_dynamics(oxic_start_sim_results) ``` ## Three strains per functional group Here we simulate a system with three strains per functional group. The system dynamics should not (and do not) differ from the 1 strain version, because the strains within a functional group are (in this case) identical. ```{r} parameter_set1$strain_parameter <- new_strain_parameter( n_CB = 3, n_PB = 3, n_SB = 3 ) ``` For example, here are the parameters of the three CB strains: ```{r} parameter_set1$strain_parameter$CB ``` And the initial state values ```{r} parameter_set1$strain_parameter$initial_state ``` So we can see the different strains, we now give them slightly different initial abundances: ```{r} parameter_set1$strain_parameter$initial_state["CB_1"] <- 5e3 parameter_set1$strain_parameter$initial_state["CB_2"] <- 1e3 parameter_set1$strain_parameter$initial_state["CB_3"] <- 0 parameter_set1$strain_parameter$initial_state["PB_1"] <- 1e7 parameter_set1$strain_parameter$initial_state["PB_2"] <- 1e6 parameter_set1$strain_parameter$initial_state["PB_3"] <- 1e5 parameter_set1$strain_parameter$initial_state["SB_1"] <- 1e7 parameter_set1$strain_parameter$initial_state["SB_2"] <- 1e6 parameter_set1$strain_parameter$initial_state["SB_3"] <- 1e5 # parameter$strain_parameter$initial_state["SO"] <- 300 # parameter$strain_parameter$initial_state["SR"] <- 300 # parameter$strain_parameter$initial_state["O"] <- 1e1 # parameter$strain_parameter$initial_state["P"] <- 1e1 ``` ```{r eval = !file.exists(mdat("multistrain3.RDS"))} parameter_set1$initial_state <- parameter_set1$strain_parameter$initial_state simulation_result3 <- run_simulation(parameter_set1) saveRDS(simulation_result3, file = mdat("multistrain3.RDS")) ``` ```{r} simulation_result3 <- readRDS(file = mdat("multistrain3.RDS")) plot_dynamics(simulation_result3) ``` # Get steady state(s) There are two methods for finding steady states. The first runs a separate simulation for each combination of starting conditions and oxygen diffusivity (let us term this the *Replication method*). The second runs only two simulations, with step-wise and slowly temporally increasing or decreasing oxygen diffusivities and recorded of state just before change to a new oxygen diffusivity (let us term this the *Temporal method*). ## Replication method. ### 1 strain per functional group Finding the steady state of the system is a common task. The function `run_replication_ssfind()` finds the state of a system after a certain length of simulation, for different values of the $a_0$ (oxygen diffusivity) parameter, and for different initial conditions ($N$). Note: The function `run_replication_ssfind()` can be told to use multiple cores by setting the relevant option, e.g., `options(mc.cores=8)` We first set the conditions that will apply to all simulations. Note that these are automatically picked up by the functions called by `run_replication_ssfind()` and so do not need to be passed. We will work with a one strain per functional group model, and unless otherwise changed (which we do) starting with oxic favouring initial conditions. ```{r sim_ss_CB_SB_PB1, eval = TRUE} ma <- rep(0, 3) names(ma) <- c("CB", "PB", "SB") parameter_set2 <- new_replication_ssfind_parameter( dynamic_model = bushplus_dynamic_model, strain_parameter = new_strain_parameter( n_CB = 1, n_PB = 1, n_SB = 1, values_initial_state = "bush_ssfig3" ), sim_duration = 50000, sim_sample_interval = 50000, event_definition = event_definition_1, event_interval = 50000, ss_expt = NA, noise_sigma = 0, minimum_abundances = ma ) rm(ma) ``` Next we need to define the values of $a_0$ and initial conditions that we will get the steady states of. We create quite continuous variation $a_0$. We, however, only need to vary the initial abundance of CB and have it either high or low, in order to find which of the two possible states that can occur do occur. ```{r sim_ss_CB_SB_PB2, eval = !file.exists(mdat("ss_1strain.RDS"))} grid_num_a <- 100 ## number of a_0 values grid_num_N <- 2 ## number of N values initial_CBs <- 10^seq(0, 10, length=grid_num_N) ## sequence of N values initial_PBs <- 1e8 ## not varied initial_SBs <- 1e8 ## not varied a_Os <- 10^seq(-6, -2, length=grid_num_a) ## sequence of a_0 values ## next line creates all possible combinations expt <- expand.grid(N_CB = initial_CBs, N_PB = initial_PBs, N_SB = initial_SBs, a_O = a_Os) parameter_set2$ss_expt <- expt rm(grid_num_a, grid_num_N, initial_CBs, initial_PBs, initial_SBs, a_Os, expt) # next line runs the "experiment", calling various functions to do this. sim_res <- run_replication_ssfind(parameter_set2) saveRDS(sim_res, file = mdat("ss_1strain.RDS")) ``` ```{r echo = FALSE} ss_1strain <- readRDS(file = mdat("ss_1strain.RDS")) ``` And the results: ```{r} p1 <- ss_1strain$result %>% mutate(direction = ifelse(initial_N_CB == 1, "up", "down")) %>% select(a, direction, starts_with("CB"), starts_with("PB"), starts_with("SB")) %>% gather(key = Species, value=Quantity, 3:ncol(.)) %>% ggplot() + geom_path(aes(x = a, y = log10(Quantity+1), col = Species, linetype = direction), size=1) + ylab(expression(Log[10](Abundance+1))) + xlab(expression(Log[10](Oxygen~diffusivity))) + ylim(0,10) + #theme(plot.title=element_text(size=10, hjust=0.5)) + labs(title="Organisms") ##ggsave("figures/CB_SB_PB.png", width = 3, height = 2) p2 <- ss_1strain$result %>% mutate(direction = ifelse(initial_N_CB == 1, "up", "down")) %>% select(a, direction ,SO, SR, O, P) %>% gather(key = Species, value=Quantity, 3:ncol(.)) %>% ggplot() + geom_path(aes(x = a, y = log10(Quantity), col = Species, linetype = direction)) + labs(title="Substrates") p1 / p2 ``` ### 3 strains per functional group And now finding the steady states of a three strain per functional group system, though with identical strains in each group. Hence the steady states should be the same as the one strain version. Note that the `run_replication_ssfind()` function divides the specified starting abundance of a functional group and distributes that equally among the strains of that functional group. ```{r sim_ss_CB_SB_PB4, eval = !file.exists(mdat("ss_3strain.RDS"))} parameter_set2$strain_parameter <- new_strain_parameter( n_CB = 3, n_PB = 3, n_SB = 3, values_initial_state = "bush_ssfig3" ) ss_3strain <- run_replication_ssfind(parameter_set2) saveRDS(ss_3strain, file = mdat("ss_3strain.RDS")) ``` ```{r echo = FALSE} ss_3strain <- readRDS(file = mdat("ss_3strain.RDS")) ``` And the results (plotting code hidden this time): ```{r echo = FALSE} p1 <- ss_3strain$result %>% mutate(direction = ifelse(initial_N_CB == 1, "up", "down")) %>% select(a, direction, starts_with("CB"), starts_with("PB"), starts_with("SB")) %>% gather(key = Species, value=Quantity, 3:ncol(.)) %>% ggplot() + geom_path(aes(x = a, y = log10(Quantity+1), col = Species, linetype = direction), size=1) + ylab(expression(Log[10](Abundance+1))) + xlab(expression(Log[10](Oxygen~diffusivity))) + ylim(0,10) + #theme(plot.title=element_text(size=10, hjust=0.5)) + labs(title="Organisms") ##ggsave("figures/CB_SB_PB.png", width = 3, height = 2) p2 <- ss_3strain$result %>% mutate(direction = ifelse(initial_N_CB == 1, "up", "down")) %>% select(a, direction ,SO, SR, O, P) %>% gather(key = Species, value=Quantity, 3:ncol(.)) %>% ggplot() + geom_path(aes(x = a, y = log10(Quantity), col = Species, linetype = direction)) + labs(title="Substrates") p1 / p2 ``` ## Temporal method The temporal method involves two simulations for a particular system configuration (parameter set). In one simulation the oxygen diffusivity is *increased* in a step-wise fashion. In the other it is *decreased* in a step-wise fashion. That is, oxygen diffusivity is held at a constant level for long enough for steady state to be reach, that state is recorded, and then a slightly higher (or lower) oxygen diffusivity value is set. Hence, at that time point, the system is effectively started with initial conditions that are the state of the system in the previous time step. We hereafter term this "the temporal method". The results below are for a very short simulation, hence the region of bistability is large. I.e. the rate of temporal change is fast, so the system lags, and this creates a larger region of bistability. ```{r temporal_ss_find, eval = !file.exists(mdat("temporal_ss_find.RDS"))} log10a_series <- seq(-8, 0, length = 100) wait_time <- 1e3 # time spent at each step num_strains <- 3 total_initial_abundances <- 10^5 num_CB_strains <- num_strains num_SB_strains <- num_strains num_PB_strains <- num_strains sp <- new_strain_parameter( n_CB = num_CB_strains, n_PB = num_SB_strains, n_SB = num_PB_strains, values_initial_state = "bush_ssfig3" ) parameter <- new_runsim_parameter( dynamic_model = bushplus_dynamic_model, event_definition = event_definition_1, event_interval = 1000, noise_sigma = 0, minimum_abundances = rep(1, 3), strain_parameter = sp, log10a_series = log10a_series ) names(parameter$minimum_abundances) <- c("CB", "PB", "SB") rm(sp) parameter$sim_duration <- wait_time * length(parameter$log10a_series) parameter$sim_sample_interval <- wait_time parameter <- set_temporal_ssfind_initial_state( parameter, total_initial_abundances, total_initial_abundances, total_initial_abundances ) ## be careful, this could take a long time to run if parameters above are changed (the demonstration parameters are with very short wait time). temporal_ss_find_result <- run_temporal_ssfind(parameter) saveRDS(temporal_ss_find_result, file = mdat("temporal_ss_find.RDS")) ``` ```{r fig.height=4} temporal_ss_find_result <- readRDS(file = mdat("temporal_ss_find.RDS")) p1 <- temporal_ss_find_result %>% mutate(a = 10^a_O) %>% gather(species, quantity, 2:(ncol(.) - 2)) %>% mutate(log10_quantity = log10(quantity + 1)) %>% dplyr::filter(species == "O") %>% ggplot(aes(x = log10(a), y = log10_quantity, col = species)) + geom_path() + ylab("Log(Quantity)") + xlab("Oxygen diffusivity") + theme(legend.position="none") + geom_vline(xintercept = c(-8, 0), col = "grey", lwd = 3) p1 ``` # Get measures of nonlinearity and hysteresis ## Replication method The `get_stability_measures()` function takes an object of the type returned by the `run_replication_ssfind()` function and returns various measures of the stability of each of the state variables. The measures of non-linearity and hysteresis were used in [Garnier et al (2020) (link to online article)](https://doi.org/10.1002/ece3.6294). (Warning: Care is required when processing the produced data. Some log transformations with addition of 1 are used; NAs are produced in some cases; and some measures are affected by the number of strains, even when the strains are identical.) For example: ```{r} get_stability_measures(ss_1strain) ``` And with three strains per functional group: ```{r} get_stability_measures(ss_3strain) ``` ## Temporal method All as in the previous section, except that `get_stability_measures()` function takes an object of the type returned by the `run_temporal_ssfind` function. ```{r} get_stability_measures(temporal_ss_find_result) ```