The seroCOP package provides Bayesian tools for
analysing correlates of protection (CoP) — the
relationship between an individual’s antibody level and their risk of
infection. Models are fit using brms (backed by Stan)
and expose posterior predictions, performance metrics, and
publication-quality plots through two R6 classes:
| Class | Purpose |
|---|---|
SeroCOP |
Single biomarker; optionally hierarchical by group |
SeroCOPMulti |
Multiple biomarkers fitted and compared simultaneously |
The probability of infection is modelled as a four-parameter logistic curve:
| Parameter | Interpretation |
|---|---|
floor |
Residual infection risk at the highest titres (near-complete protection) |
ceiling |
Maximum infection risk at the lowest titres (baseline attack rate) |
ec50 |
Titre at the inflection point of the curve |
slope |
Steepness of the dose–response (higher = sharper transition) |
library(seroCOP)
# Simulate data
set.seed(123)
n <- 300
titre <- rnorm(n, mean = 2, sd = 1.5)
prob <- 0.05 + 0.65 / (1 + exp(2 * (titre - 1.5)))
infected <- rbinom(n, 1, prob)
# Initialise the model
model <- SeroCOP$new(titre = titre, infected = infected)
# Fit (brms / Stan backend)
model$fit_model(chains = 4, iter = 2000, warmup = 1000, cores = 4)
# Inspect parameters
model$summary()
# Performance metrics
model$get_metrics()
# Plots
model$plot_curve()
model$plot_roc()
model$plot_posteriors()Default priors are weakly informative and data-driven:
| Parameter | Default prior | Rationale |
|---|---|---|
floor |
Beta(1, 9) | Favours low residual risk |
ceiling |
Beta(9, 1) | Favours high baseline risk |
ec50 |
Normal(mid, range/4) | Centred on the data range |
slope |
Normal(0, 2) truncated ≥ 0 | Weakly informative |
Override priors with definePrior():
model$definePrior(
floor_alpha = 2, floor_beta = 18, # E[floor] ≈ 0.10
ceiling_alpha = 18, ceiling_beta = 2, # E[ceiling] ≈ 0.90
ec50_mean = 2.0, ec50_sd = 1.0, # Informative ec50 prior
slope_mean = 0, slope_sd = 1 # Tighter slope prior
)definePrior() can be called any time
before fit_model() and prints a summary of
the updated priors.
Pass a non-negative numeric vector to
SeroCOP$new(weights = ...) to up-/down-weight individual
observations in the likelihood:
# Case-control design: up-weight confirmed cases
w <- ifelse(infected == 1, 1.0, 0.5)
w_model <- SeroCOP$new(titre = titre, infected = infected, weights = w)
w_model$fit_model(chains = 4, iter = 2000, warmup = 1000)Add a group variable to estimate group-specific
ec50 and slope while sharing a common
floor (and optionally ceiling):
age_group <- sample(c("Young", "Middle", "Old"), n, replace = TRUE)
hier <- SeroCOP$new(titre = titre, infected = infected, group = age_group)
# ceiling_group = FALSE (default): ceiling shared across groups
# ceiling_group = TRUE: ceiling also allowed to vary by group
hier$fit_model(chains = 4, iter = 2000, warmup = 1000,
ceiling_group = FALSE)
# Group-specific parameter table
hier$extract_group_parameters()
# Fitted risk curves with marginalised overlay
hier$plot_group_curves(
weights_pop = c(Young = 0.45, Middle = 0.35, Old = 0.20)
)
# Correlates of protection curves
hier$plot_cop_curves(
weights_pop = c(Young = 0.45, Middle = 0.35, Old = 0.20)
)
# Per-group LOO is computed automatically; accessed via:
hier$loo_by_groupSeroCOPMulti fits one independent SeroCOP
model per biomarker and provides comparison tools:
# titre is a matrix: rows = observations, cols = biomarkers
titre_mat <- cbind(IgG = titre_IgG, IgA = titre_IgA, Neutralisation = titre_Neut)
multi <- SeroCOPMulti$new(
titre = titre_mat,
infected = infected,
biomarker_names = c("IgG", "IgA", "Neutralisation")
)
multi$fit_all(chains = 4, iter = 2000, warmup = 1000, cores = 4)
# Compare biomarkers (AUC, Brier, LOO-ELPD)
multi$compare_biomarkers()
multi$plot_comparison()
# All fitted risk curves side-by-side
multi$plot_all_curves()get_metrics() returns and prints:
| Metric | Description |
|---|---|
roc_auc |
Area under the ROC curve (marginalised predictions) |
roc_auc_within_group |
Weighted within-group AUC (hierarchical only) |
brier_score |
Mean squared error of predicted probabilities |
loo_elpd / loo_se
|
LOO expected log predictive density ± SE |
loo_by_group |
Per-group LOO ELPD (hierarchical only) |
extract_parameters() wraps
brms::as_draws_df() and returns a tidy data frame of
posterior mean, median, and credible intervals:
params <- extract_parameters(model, prob = 0.95)
print(params)
# parameter mean median lower upper
# 1 floor 0.052 0.050 0.010 0.110
# 2 ceiling 0.681 0.683 0.612 0.748
# 3 ec50 1.504 1.503 1.363 1.641
# 4 slope 1.983 1.978 1.701 2.275The correlate of protection at titre is:
# Matrix of posterior samples (rows = draws, cols = observations)
cop_samples <- model$predict_protection(newdata = seq(0, 5, 0.1))
cop_mean <- colMeans(cop_samples)
cop_lower <- apply(cop_samples, 2, quantile, 0.025)
cop_upper <- apply(cop_samples, 2, quantile, 0.975)
sessionInfo()
#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 desc_1.4.3 R6_2.6.1 fastmap_1.2.0
#> [5] xfun_0.57 cachem_1.1.0 knitr_1.51 htmltools_0.5.9
#> [9] rmarkdown_2.30 lifecycle_1.0.5 cli_3.6.5 sass_0.4.10
#> [13] pkgdown_2.2.0 textshaping_1.0.5 jquerylib_0.1.4 systemfonts_1.3.2
#> [17] compiler_4.5.3 tools_4.5.3 ragg_1.5.2 evaluate_1.0.5
#> [21] bslib_0.10.0 yaml_2.3.12 jsonlite_2.0.0 rlang_1.1.7
#> [25] fs_2.0.0