data.Rmd
The model requires time series data about individual titre readings, along with last exposure times. Times can be relative (e.g. day of study) or absolute (i.e. precise calendar dates). This is provided via the `data` argument when initialising an object of class `biokinetics`, which must be a `data.table` containing the following columns:
name | type | description |
---|---|---|
pid | numeric or character | Unique identifier for a person across observations |
day | integer or date | The day of the observation. Can be a date or an integer representing a relative day of study |
last_exp_day | integer or date | The most recent day on which the person was exposed. Must be of the same type as the ‘day’ column |
titre_type | character | Name of the titre or biomarker |
value | numeric | Titre value |
It can also contain further columns for any covariates to be included in the model. The data files installed with this package have additional columns infection_history, last_vax_type, and exp_num.
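As a rough illustration of the required layout, the sketch below builds a small `data.table` with the five required columns and runs some basic checks. The values are invented toy data, and the checks are only an informal mirror of the table above, not the package's own validation.

```r
library(data.table)

# Hypothetical toy data: two people, two titre types, relative study days
dat_toy <- data.table(
  pid          = c(1L, 1L, 2L, 2L),
  day          = c(10L, 40L, 12L, 45L),
  last_exp_day = c(3L, 3L, 5L, 5L),
  titre_type   = c("Ancestral", "Alpha", "Ancestral", "Alpha"),
  value        = c(175.9, 416.8, 90.2, 240.5)
)

required <- c("pid", "day", "last_exp_day", "titre_type", "value")

# Informal checks: all required columns are present, and the two day
# columns share a type (both integer here; both could be dates instead)
missing_cols  <- setdiff(required, names(dat_toy))
same_day_type <- identical(class(dat_toy$day), class(dat_toy$last_exp_day))

stopifnot(length(missing_cols) == 0, same_day_type, is.numeric(dat_toy$value))
```

Covariate columns, if any, would simply be added as further columns of the same table.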
The model also accepts a covariate formula to define the regression model. The variables in the formula must correspond to column names in the dataset. Note that all variables will be treated as categorical variables; that is, converted to factors regardless of their input type.
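The following sketch shows what "treated as categorical" means in practice, using base R and `data.table` only. The formula, the variable name `covariate_formula` and the toy data are assumptions for illustration; the conversion the package performs internally may differ in detail.

```r
library(data.table)

# Hypothetical data with two covariate columns
dat_cov <- data.table(
  pid = 1:4,
  infection_history = c("Infection naive", "Previously infected",
                        "Infection naive", "Previously infected"),
  exp_num = c(2L, 3L, 2L, 3L)
)

# Hypothetical covariate formula
covariate_formula <- ~ 0 + infection_history + exp_num

# Every variable in the formula must be a column of the dataset
covariates <- all.vars(covariate_formula)
stopifnot(all(covariates %in% names(dat_cov)))

# Convert each covariate to a factor, regardless of its input type
dat_factor <- copy(dat_cov)
dat_factor[, (covariates) := lapply(.SD, as.factor), .SDcols = covariates]

# The integer column exp_num now has discrete levels rather than numeric values
levels(dat_factor$exp_num)
```

Note the consequence for numeric covariates such as `exp_num`: each distinct value becomes its own category, so no linear trend across values is assumed.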
Note also that the `value` column is assumed to be on a natural scale by default, and will be converted to a log scale for model fitting. If your data is already on a log scale, you must pass the `log=TRUE` argument when initialising the `biokinetics` class. See biokinetics.
dat <- data.table::fread(system.file("delta_full.rds", package = "epikinetics"))
head(dat)
#> pid day last_exp_day titre_type value infection_history
#> <int> <IDat> <IDat> <char> <num> <char>
#> 1: 1 2021-03-10 2021-03-08 Ancestral 175.9350 Infection naive
#> 2: 1 2021-04-15 2021-03-08 Ancestral 607.5750 Infection naive
#> 3: 1 2021-07-08 2021-03-08 Ancestral 179.0463 Infection naive
#> 4: 1 2021-03-10 2021-03-08 Alpha 5.0000 Infection naive
#> 5: 1 2021-04-15 2021-03-08 Alpha 416.7905 Infection naive
#> 6: 1 2021-07-08 2021-03-08 Alpha 103.5274 Infection naive
#> last_vax_type exp_num
#> <char> <int>
#> 1: BNT162b2 2
#> 2: BNT162b2 2
#> 3: BNT162b2 2
#> 4: BNT162b2 2
#> 5: BNT162b2 2
#> 6: BNT162b2 2
After fitting a model, a `CmdStanMCMC` object is returned. This means that users who are already familiar with `cmdstanr` are free to do what they want with the fitted model.
Important! If you provide data on a natural scale, it will be converted to a base-2 log scale before inference is performed. This means that if you work directly with the fitted `CmdStanMCMC` object, all values will be on this scale. The package provides a helper function for converting back to the original scale: `convert_log2_scale_inverse`.
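To make the scale issue concrete, here is a minimal base R sketch of a base-2 log round trip. It uses a plain `log2()` transform as an assumption; the package's own transform and its `convert_log2_scale_inverse` helper may apply additional scaling, so use the helper for real model output.

```r
# Titre values on the natural scale (taken from the example data above)
natural <- c(175.9350, 607.5750, 5.0000)

# Plain base-2 log transform (an assumption; the package may scale differently)
log2_scale <- log2(natural)

# Inverse transform back to the natural scale
back <- 2^log2_scale

# The round trip recovers the original values up to floating point error
all.equal(back, natural)
```

On this scale, a difference of 1 corresponds to a doubling of the titre on the natural scale.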
Three further functions return model outputs in data.table format that we think are particularly useful. The biokinetics documentation describes each of these functions, so please read that first; this vignette explains how to interpret each column in the returned tables. These functions return data on the original scale.
See the documentation for this function here.
There are 2 different output formats depending on whether the provided `summarise` argument is `TRUE` or `FALSE`.
When `summarise = TRUE`, the returned columns are:
name | type | description |
---|---|---|
time_since_last_exp | integer | Number of days since last exposure |
me | numeric | Median titre value |
lo | numeric | Titre value at the 0.025 quantile |
hi | numeric | Titre value at the 0.975 quantile |
titre_type | character | Name of the titre or biomarker |
There will also be a column for each covariate in the regression model.
When `summarise = FALSE`, the returned columns are:
name | type | description |
---|---|---|
time_since_last_exp | integer | Number of days since last exposure |
t0_pop | numeric | Titre value at time 0 |
tp_pop | numeric | Time at peak titre |
ts_pop | numeric | Time at start of waning |
m1_pop | numeric | Boosting rate |
m2_pop | numeric | Plateau rate |
m3_pop | numeric | Waning rate |
beta_t0 | numeric | Coefficient to adjust t0 by |
beta_tp | numeric | Coefficient to adjust tp by |
beta_ts | numeric | Coefficient to adjust ts by |
beta_m1 | numeric | Coefficient to adjust m1 by |
beta_m2 | numeric | Coefficient to adjust m2 by |
beta_m3 | numeric | Coefficient to adjust m3 by |
mu | numeric | Titre value |
.draw | integer | Draw number |
titre_type | character | Name of the titre or biomarker |
There will also be a column for each covariate in the hierarchical model.
See the model vignette for more detail about the model parameters.
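The two formats can be related with a short `data.table` sketch. It assumes that the summarised `me`, `lo` and `hi` columns correspond to the 0.5, 0.025 and 0.975 quantiles of `mu` across draws, as the column descriptions suggest; the draws are invented toy values, not model output.

```r
library(data.table)

# Hypothetical unsummarised output: 5 draws at each of 2 time points
draws <- data.table(
  time_since_last_exp = rep(c(0L, 30L), each = 5),
  titre_type          = "Ancestral",
  .draw               = rep(1:5, times = 2),
  mu                  = c(10, 12, 14, 16, 18, 100, 110, 120, 130, 140)
)

# Summarise across draws within each time point and titre type
summary_dt <- draws[, .(
  me = quantile(mu, 0.5,   names = FALSE),
  lo = quantile(mu, 0.025, names = FALSE),
  hi = quantile(mu, 0.975, names = FALSE)
), by = .(time_since_last_exp, titre_type)]

summary_dt
```

With covariates in the model, the covariate columns would be added to the `by` grouping so that each covariate combination gets its own interval.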
See the documentation for this function here.
There are 2 different output formats depending on whether the provided `summarise` argument is `TRUE` or `FALSE`.
When `summarise = FALSE`, the returned columns are:
name | type | description |
---|---|---|
pid | character or numeric | Unique person identifier as provided in input data |
draw | integer | Which draw from the fits this is |
time_since_last_exp | integer | Number of days since last exposure |
mu | numeric | Titre value |
titre_type | character | Name of the titre or biomarker |
exposure_day | integer | Day of this person’s last exposure |
calendar_day | integer | Day of this titre value |
time_shift | integer | The number of days these exposures have been adjusted by, as provided in function arguments |
There will also be a column for each covariate in the regression model.
When `summarise = TRUE`, the returned columns are:
name | type | description |
---|---|---|
me | numeric | Median titre value |
lo | numeric | Titre value at the 0.025 quantile |
hi | numeric | Titre value at the 0.975 quantile |
titre_type | character | Name of the titre or biomarker |
calendar_day | integer | Day of this titre value |
time_shift | integer | The number of days the exposures were adjusted by, as provided in function arguments |