Input data

The model requires time series data about individual titre readings, along with last exposure times. Times can be relative (e.g. day of study) or absolute (i.e. precise calendar dates). This is provided via the data argument when initialising an object of class biokinetics, which must be a data.table containing the following columns:

| name | type | description |
|------|------|-------------|
| pid | numeric or character | Unique identifier to identify a person across observations |
| day | integer or date | The day of the observation. Can be a date or an integer representing a relative day of study |
| last_exp_day | integer or date | The most recent day on which the person was exposed. Must be of the same type as the ‘day’ column |
| titre_type | character | Name of the titre or biomarker |
| value | numeric | Titre value |

It can also contain further columns for any covariates to be included in the model. The data files installed with this package have additional columns infection_history, last_vax_type, and exp_num.
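As a rough illustration of the expected shape, the sketch below builds a small input table by hand and checks it against the column requirements listed above. The values, the `required` vector, and the checks are hypothetical and written for this example only; they are not the package's own validation code.

```r
library(data.table)

# Hypothetical toy data: two people, one titre type, relative study days
dat <- data.table(
  pid = c(1L, 1L, 2L, 2L),
  day = c(10L, 40L, 12L, 45L),
  last_exp_day = c(3L, 3L, 5L, 5L),
  titre_type = "Ancestral",
  value = c(175.9, 607.6, 98.2, 310.4),
  # Optional covariate column, named as in the installed data files
  infection_history = c("Infection naive", "Infection naive",
                        "Previously infected", "Previously infected")
)

# Columns the input table must contain (taken from the table above)
required <- c("pid", "day", "last_exp_day", "titre_type", "value")
missing_cols <- setdiff(required, names(dat))

# 'day' and 'last_exp_day' must share a type (both integer or both date)
same_day_type <- identical(class(dat$day), class(dat$last_exp_day))

stopifnot(length(missing_cols) == 0, same_day_type)
```

Running a check like this before model initialisation can surface a missing column or a mix of dates and integers early.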

The model also accepts a covariate formula to define the regression model. The variables in the formula must correspond to column names in the dataset. Note that all variables will be treated as categorical variables; that is, converted to factors regardless of their input type.
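To make the factor conversion concrete, here is a minimal base R sketch of what "treated as categorical" implies for a numeric covariate such as exp_num. The formula and the toy data are illustrative assumptions, and the conversion shown is not the package's internal code.

```r
# Hypothetical data with one character and one integer covariate
dat <- data.frame(
  infection_history = c("Infection naive", "Previously infected", "Infection naive"),
  exp_num = c(2L, 3L, 2L)
)

# Illustrative formula; every variable must match a column name in the data
covariate_formula <- ~ 0 + infection_history + exp_num
vars <- all.vars(covariate_formula)
stopifnot(all(vars %in% names(dat)))

# Treating every covariate as categorical amounts to a factor conversion,
# so exp_num contributes one level per distinct value rather than a slope
dat[vars] <- lapply(dat[vars], as.factor)
levels(dat$exp_num)
```

If a covariate is genuinely continuous, it may be worth binning it into meaningful categories before fitting, since each distinct value would otherwise become its own level.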

Note also that the value column is assumed to be on a natural scale by default, and will be converted to a log scale for model fitting. If your data is already on a log scale, you must pass the log=TRUE argument when initialising the biokinetics class. See biokinetics.

Example

dat <- data.table::fread(system.file("delta_full.rds", package = "epikinetics"))
head(dat)
#>      pid        day last_exp_day titre_type    value infection_history
#>    <int>     <IDat>       <IDat>     <char>    <num>            <char>
#> 1:     1 2021-03-10   2021-03-08  Ancestral 175.9350   Infection naive
#> 2:     1 2021-04-15   2021-03-08  Ancestral 607.5750   Infection naive
#> 3:     1 2021-07-08   2021-03-08  Ancestral 179.0463   Infection naive
#> 4:     1 2021-03-10   2021-03-08      Alpha   5.0000   Infection naive
#> 5:     1 2021-04-15   2021-03-08      Alpha 416.7905   Infection naive
#> 6:     1 2021-07-08   2021-03-08      Alpha 103.5274   Infection naive
#>    last_vax_type exp_num
#>           <char>   <int>
#> 1:      BNT162b2       2
#> 2:      BNT162b2       2
#> 3:      BNT162b2       2
#> 4:      BNT162b2       2
#> 5:      BNT162b2       2
#> 6:      BNT162b2       2

Output data

After fitting a model, a CmdStanMCMC object is returned. Users who are already familiar with cmdstanr can therefore work with the fitted model directly, using the usual cmdstanr methods.

Important! If you provide data on a natural scale, it will be converted to a base-2 log scale before inference is performed. This means that if you work directly with the fitted CmdStanMCMC object, all values will be on this log scale. The package provides a helper function for converting back to the original scale: convert_log2_scale_inverse.
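The idea of a base-2 log round trip can be sketched in plain R as follows. This assumes a bare log2 transform purely for illustration; the package's own conversion may apply additional scaling or limits, so use convert_log2_scale_inverse rather than this sketch when converting real model output.

```r
# Hypothetical titre values on the natural scale
titre_natural <- c(5, 40, 640)

# Forward transform: natural scale to base-2 log scale
titre_log2 <- log2(titre_natural)

# Inverse transform: base-2 log scale back to the natural scale
titre_back <- 2^titre_log2

# The round trip recovers the original values up to floating point error
all.equal(titre_back, titre_natural)
```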

Three further functions provide model outputs that we think are particularly useful in data.table format. biokinetics contains documentation on each of these functions so please read that first; this vignette provides guidance on the correct interpretation of each column in the returned tables (in these functions data is returned on the original scale).

simulate_population_trajectories

See the documentation for this function under biokinetics. There are two different output formats, depending on whether the summarise argument is TRUE or FALSE.

summarise = TRUE

Returned columns are

| name | type | description |
|------|------|-------------|
| time_since_last_exp | integer | Number of days since last exposure |
| me | numeric | Median titre value |
| lo | numeric | Titre value at the 0.025 quantile |
| hi | numeric | Titre value at the 0.975 quantile |
| titre_type | character | Name of the titre or biomarker |

There will also be a column for each covariate in the regression model.

summarise = FALSE

Returned columns are

| name | type | description |
|------|------|-------------|
| time_since_last_exp | integer | Number of days since last exposure |
| t0_pop | numeric | Titre value at time 0 |
| tp_pop | numeric | Time at peak titre |
| ts_pop | numeric | Time at start of waning |
| m1_pop | numeric | Boosting rate |
| m2_pop | numeric | Plateau rate |
| m3_pop | numeric | Waning rate |
| beta_t0 | numeric | Coefficient to adjust t0 by |
| beta_tp | numeric | Coefficient to adjust tp by |
| beta_ts | numeric | Coefficient to adjust ts by |
| beta_m1 | numeric | Coefficient to adjust m1 by |
| beta_m2 | numeric | Coefficient to adjust m2 by |
| beta_m3 | numeric | Coefficient to adjust m3 by |
| mu | numeric | Titre value |
| .draw | integer | Draw number |
| titre_type | character | Name of the titre or biomarker |

There will also be a column for each covariate in the hierarchical model.

See the model vignette for more detail about the model parameters.
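The two output formats are related: the summarised columns are quantiles taken over draws. As a rough sketch, assuming the summary is computed from mu within each combination of time_since_last_exp and titre_type, draw-level output could be summarised by hand as shown below. The draws here are simulated stand-ins rather than real model output, and the grouping is an assumption about how the package summarises.

```r
library(data.table)

set.seed(1)

# Simulated stand-in for unsummarised output: 200 draws at two time points
draws <- data.table(
  .draw = rep(1:200, times = 2),
  time_since_last_exp = rep(c(0L, 30L), each = 200),
  titre_type = "Ancestral"
)
draws[, mu := rlnorm(.N, meanlog = log(100), sdlog = 0.3)]

# Median and 95% interval of mu for each time point and titre type
summary_dt <- draws[, .(
  me = median(mu),
  lo = quantile(mu, 0.025, names = FALSE),
  hi = quantile(mu, 0.975, names = FALSE)
), by = .(time_since_last_exp, titre_type)]

summary_dt
```

Working from the draw-level output in this way is useful when a different interval width or a different summary statistic is needed.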

simulate_individual_trajectories

See the documentation for this function under biokinetics. There are two different output formats, depending on whether the summarise argument is TRUE or FALSE.

summarise = FALSE

Returned columns are

| name | type | description |
|------|------|-------------|
| pid | character or numeric | Unique person identifier as provided in input data |
| draw | integer | Which draw from the fits this is |
| time_since_last_exp | integer | Number of days since last exposure |
| mu | numeric | Titre value |
| titre_type | character | Name of the titre or biomarker |
| exposure_day | integer | Day of this person’s last exposure |
| calendar_day | integer | Day of this titre value |
| time_shift | integer | The number of days these exposures have been adjusted by, as provided in function arguments |

There will also be a column for each covariate in the regression model.

summarise = TRUE

Returned columns are

| name | type | description |
|------|------|-------------|
| me | numeric | Median titre value |
| lo | numeric | Titre value at the 0.025 quantile |
| hi | numeric | Titre value at the 0.975 quantile |
| titre_type | character | Name of the titre or biomarker |
| calendar_day | integer | Day of this titre value |
| time_shift | integer | The number of days the exposures were adjusted by, as provided in function arguments |