Overview of SeroCOP

The seroCOP package provides Bayesian tools for analysing correlates of protection (CoP) — the relationship between an individual’s antibody level and their risk of infection. Models are fit using brms (backed by Stan) and expose posterior predictions, performance metrics, and publication-quality plots through two R6 classes:

Class Purpose
SeroCOP Single biomarker; optionally hierarchical by group
SeroCOPMulti Multiple biomarkers fitted and compared simultaneously

Model Description

The probability of infection is modelled as a four-parameter logistic curve:

P(infectiont)=ceiling×[inv_logit(slope(tec50))×(1floor)+floor] P(\text{infection} \mid t) = \text{ceiling} \times \left[\text{inv\_logit}\!\bigl(-\text{slope}\,(t - \text{ec50})\bigr) \times (1 - \text{floor}) + \text{floor}\right]

Parameter Interpretation
floor Residual infection risk at the highest titres (near-complete protection)
ceiling Maximum infection risk at the lowest titres (baseline attack rate)
ec50 Titre at the inflection point of the curve
slope Steepness of the dose–response (higher = sharper transition)

Quick Start

library(seroCOP)

# Simulate data
set.seed(123)
n        <- 300
titre    <- rnorm(n, mean = 2, sd = 1.5)
prob     <- 0.05 + 0.65 / (1 + exp(2 * (titre - 1.5)))
infected <- rbinom(n, 1, prob)

# Initialise the model
model <- SeroCOP$new(titre = titre, infected = infected)

# Fit (brms / Stan backend)
model$fit_model(chains = 4, iter = 2000, warmup = 1000, cores = 4)

# Inspect parameters
model$summary()

# Performance metrics
model$get_metrics()

# Plots
model$plot_curve()
model$plot_roc()
model$plot_posteriors()

Prior Distributions

Default priors are weakly informative and data-driven:

Parameter Default prior Rationale
floor Beta(1, 9) Favours low residual risk
ceiling Beta(9, 1) Favours high baseline risk
ec50 Normal(mid, range/4) Centred on the data range
slope Normal(0, 2) truncated ≥ 0 Weakly informative

Override priors with definePrior():

model$definePrior(
  floor_alpha   = 2,   floor_beta    = 18,   # E[floor]   ≈ 0.10
  ceiling_alpha = 18,  ceiling_beta  = 2,    # E[ceiling] ≈ 0.90
  ec50_mean     = 2.0, ec50_sd       = 1.0,  # Informative ec50 prior
  slope_mean    = 0,   slope_sd      = 1     # Tighter slope prior
)

definePrior() can be called any time before fit_model() and prints a summary of the updated priors.


Observation Weights

Pass a non-negative numeric vector to SeroCOP$new(weights = ...) to up-/down-weight individual observations in the likelihood:

# Case-control design: up-weight confirmed cases
w <- ifelse(infected == 1, 1.0, 0.5)

w_model <- SeroCOP$new(titre = titre, infected = infected, weights = w)
w_model$fit_model(chains = 4, iter = 2000, warmup = 1000)

Hierarchical Models (Multiple Groups)

Add a group variable to estimate group-specific ec50 and slope while sharing a common floor (and optionally ceiling):

age_group <- sample(c("Young", "Middle", "Old"), n, replace = TRUE)

hier <- SeroCOP$new(titre = titre, infected = infected, group = age_group)

# ceiling_group = FALSE (default): ceiling shared across groups
# ceiling_group = TRUE:  ceiling also allowed to vary by group
hier$fit_model(chains = 4, iter = 2000, warmup = 1000,
               ceiling_group = FALSE)

# Group-specific parameter table
hier$extract_group_parameters()

# Fitted risk curves with marginalised overlay
hier$plot_group_curves(
  weights_pop = c(Young = 0.45, Middle = 0.35, Old = 0.20)
)

# Correlates of protection curves
hier$plot_cop_curves(
  weights_pop = c(Young = 0.45, Middle = 0.35, Old = 0.20)
)

# Per-group LOO is computed automatically; accessed via:
hier$loo_by_group

Multi-Biomarker Analysis

SeroCOPMulti fits one independent SeroCOP model per biomarker and provides comparison tools:

# titre is a matrix: rows = observations, cols = biomarkers
titre_mat <- cbind(IgG = titre_IgG, IgA = titre_IgA, Neutralisation = titre_Neut)

multi <- SeroCOPMulti$new(
  titre           = titre_mat,
  infected        = infected,
  biomarker_names = c("IgG", "IgA", "Neutralisation")
)

multi$fit_all(chains = 4, iter = 2000, warmup = 1000, cores = 4)

# Compare biomarkers (AUC, Brier, LOO-ELPD)
multi$compare_biomarkers()
multi$plot_comparison()

# All fitted risk curves side-by-side
multi$plot_all_curves()

Performance Metrics

get_metrics() returns and prints:

Metric Description
roc_auc Area under the ROC curve (marginalised predictions)
roc_auc_within_group Weighted within-group AUC (hierarchical only)
brier_score Mean squared error of predicted probabilities
loo_elpd / loo_se LOO expected log predictive density ± SE
loo_by_group Per-group LOO ELPD (hierarchical only)

Extracting Parameters

extract_parameters() wraps brms::as_draws_df() and returns a tidy data frame of posterior mean, median, and credible intervals:

params <- extract_parameters(model, prob = 0.95)
print(params)
#   parameter  mean median  lower  upper
# 1     floor 0.052  0.050  0.010  0.110
# 2   ceiling 0.681  0.683  0.612  0.748
# 3      ec50 1.504  1.503  1.363  1.641
# 4     slope 1.983  1.978  1.701  2.275

Predicting Protection

The correlate of protection at titre tt is:

COP(t)=1P(infectiont)ceiling\text{COP}(t) = 1 - \frac{P(\text{infection} \mid t)}{\text{ceiling}}

# Matrix of posterior samples (rows = draws, cols = observations)
cop_samples <- model$predict_protection(newdata = seq(0, 5, 0.1))

cop_mean  <- colMeans(cop_samples)
cop_lower <- apply(cop_samples, 2, quantile, 0.025)
cop_upper <- apply(cop_samples, 2, quantile, 0.975)

Session Info

sessionInfo()
#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.39     desc_1.4.3        R6_2.6.1          fastmap_1.2.0    
#>  [5] xfun_0.57         cachem_1.1.0      knitr_1.51        htmltools_0.5.9  
#>  [9] rmarkdown_2.30    lifecycle_1.0.5   cli_3.6.5         sass_0.4.10      
#> [13] pkgdown_2.2.0     textshaping_1.0.5 jquerylib_0.1.4   systemfonts_1.3.2
#> [17] compiler_4.5.3    tools_4.5.3       ragg_1.5.2        evaluate_1.0.5   
#> [21] bslib_0.10.0      yaml_2.3.12       jsonlite_2.0.0    rlang_1.1.7      
#> [25] fs_2.0.0