
Pool regression fits across posterior draws of correct-match classifications
Source: R/mixbayes_survreg_methods.R, mi_with.survMixBayes.Rd

Use posterior draws of the latent match indicators from survregMixBayes()
to repeatedly identify which records are treated as correct matches, refit a
Cox proportional hazards model on those records, and pool the resulting
estimates using multiple-imputation pooling rules.

Each retained posterior draw defines one subset of records classified as
correct matches. The function fits the specified survival::coxph()
model to that subset, extracts the estimated coefficients and covariance
matrix, and combines the results across draws using Rubin's rules.
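
The per-draw refit step described above can be sketched directly with
survival::coxph(). Here `z` is a stand-in 0/1 match-indicator draw (one
element per record) and the survival package's `lung` data set is used purely
for illustration; neither is part of this package's interface.

```r
library(survival)

# One hypothetical posterior draw of the latent match indicators:
# 1 = record classified as a correct match in this draw.
set.seed(1)
lung2 <- na.omit(lung[, c("time", "status", "age", "sex")])
z <- rbinom(nrow(lung2), 1, 0.8)

# Refit the survival model on the records classified as correct matches,
# then extract the pieces that are pooled across draws.
fit_d <- coxph(Surv(time, status) ~ age + sex, data = lung2[z == 1, ],
               ties = "efron")
coef(fit_d)  # per-draw coefficient estimates
vcov(fit_d)  # per-draw covariance matrix
```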
Usage

# S3 method for class 'survMixBayes'
mi_with(
  object,
  data,
  formula,
  min_n = NULL,
  quietly = TRUE,
  ties = "efron",
  ...
)

Arguments
- object
  A survMixBayes model object containing posterior draws of the latent
  match indicators.
- data
  A data.frame with all candidate records in the same row order as used
  in the model.
- formula
  Model formula for refitting on each draw (required), typically of the
  form survival::Surv(time, event) ~ ....
- min_n
  Minimum number of records required to fit the model for a given
  posterior draw. The default is p + 2, where p is the number of
  non-intercept columns in the model matrix.
- quietly
  If TRUE, draws that lead to fitting errors are skipped without
  printing the full error message.
- ties
  Method for handling tied event times in survival::coxph(). Default is
  "efron".
- ...
  Additional arguments passed to survival::coxph().
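
The pooling step combines the per-draw estimates with Rubin's rules. A minimal
base-R sketch for a single coefficient follows; the helper `pool_rubin` and the
simulated inputs are illustrative assumptions, not exported functions, and the
large-sample degrees-of-freedom formula is used rather than a small-sample
correction.

```r
# Pool m per-draw estimates `est` with standard errors `se` using
# Rubin's rules (large-sample degrees of freedom).
pool_rubin <- function(est, se) {
  m    <- length(est)
  qbar <- mean(est)                # pooled point estimate
  ubar <- mean(se^2)               # within-imputation variance
  b    <- var(est)                 # between-imputation variance
  tv   <- ubar + (1 + 1 / m) * b   # total variance
  df   <- (m - 1) * (1 + ubar / ((1 + 1 / m) * b))^2
  list(estimate = qbar, std.error = sqrt(tv), df = df)
}

set.seed(2)
est <- rnorm(100, mean = -0.95, sd = 0.1)  # illustrative per-draw estimates
se  <- rep(0.40, 100)                      # illustrative per-draw SEs
pool_rubin(est, se)
```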
Value
An object of class c("mi_link_pool_survreg", "mi_link_pool")
containing pooled coefficient estimates, standard errors, confidence
intervals, and related summary information.
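
The confidence limits in the pooled summary are t-based intervals on the
pooled degrees of freedom. As a check, the interval printed in the example
below can be approximately reproduced from its Estimate, Std.Error, and df
columns:

```r
# Reproduce the 95% t-based interval from the pooled summary columns.
# The numeric values are taken from the printed example on this page.
est <- -0.97202   # pooled coefficient (Estimate)
se  <- 0.40304    # pooled standard error (Std.Error)
df  <- 205.7726   # pooled degrees of freedom (df)
ci  <- est + c(-1, 1) * qt(0.975, df) * se
ci  # close to the CI.lwr / CI.upr shown in the example
```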
Examples
# \donttest{
set.seed(301)
n <- 150
trt <- rbinom(n, 1, 0.5)
# Simulate Weibull AFT data
true_time <- rweibull(n, shape = 1.5, scale = exp(1 + 0.8 * trt))
cens_time <- rexp(n, rate = 0.1)
true_obs_time <- pmin(true_time, cens_time)
true_status <- as.integer(true_time <= cens_time)
# Induce linkage mismatch errors in approximately 20% of records
is_mismatch <- rbinom(n, 1, 0.2)
obs_time <- true_obs_time
obs_status <- true_status
mismatch_idx <- which(is_mismatch == 1)
shuffled <- sample(mismatch_idx)
obs_time[mismatch_idx] <- obs_time[shuffled]
obs_status[mismatch_idx] <- obs_status[shuffled]
linked_df <- data.frame(time = obs_time, status = obs_status, trt = trt)
adj <- adjMixBayes(linked.data = linked_df)
fit <- plsurvreg(
  survival::Surv(time, status) ~ trt,
  dist = "weibull",
  adjustment = adj,
  control = list(iterations = 200, burnin.iterations = 100, seed = 123)
)
#>
#> SAMPLING FOR MODEL 'survMixBayes_weibull' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 9.4e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.94 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: WARNING: There aren't enough warmup iterations to fit the
#> Chain 1: three stages of adaptation as currently configured.
#> Chain 1: Reducing each adaptation stage to 15%/75%/10% of
#> Chain 1: the given number of warmup iterations:
#> Chain 1: init_buffer = 15
#> Chain 1: adapt_window = 75
#> Chain 1: term_buffer = 10
#> Chain 1:
#> Chain 1: Iteration: 1 / 200 [ 0%] (Warmup)
#> Chain 1: Iteration: 20 / 200 [ 10%] (Warmup)
#> Chain 1: Iteration: 40 / 200 [ 20%] (Warmup)
#> Chain 1: Iteration: 60 / 200 [ 30%] (Warmup)
#> Chain 1: Iteration: 80 / 200 [ 40%] (Warmup)
#> Chain 1: Iteration: 100 / 200 [ 50%] (Warmup)
#> Chain 1: Iteration: 101 / 200 [ 50%] (Sampling)
#> Chain 1: Iteration: 120 / 200 [ 60%] (Sampling)
#> Chain 1: Iteration: 140 / 200 [ 70%] (Sampling)
#> Chain 1: Iteration: 160 / 200 [ 80%] (Sampling)
#> Chain 1: Iteration: 180 / 200 [ 90%] (Sampling)
#> Chain 1: Iteration: 200 / 200 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.62 seconds (Warm-up)
#> Chain 1: 0.489 seconds (Sampling)
#> Chain 1: 1.109 seconds (Total)
#> Chain 1:
#> Warning: The largest R-hat is 1.18, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#>
#> ......................................................................................
#> . Method Time (sec) Status .
#> ......................................................................................
#> . ECR-ITERATIVE-1 0.096 Converged (2 iterations) .
#> ......................................................................................
#>
#> Relabelling all methods according to method ECR-ITERATIVE-1 ... done!
#> Retrieve the 1 permutation arrays by typing:
#> [...]$permutations$"ECR-ITERATIVE-1"
#> Retrieve the 1 best clusterings: [...]$clusters
#> Retrieve the 1 CPU times: [...]$timings
#> Retrieve the 1 X 1 similarity matrix: [...]$similarity
#> Label switching finished. Total time: 0.1 seconds.
pooled_obj <- mi_with(
  object = fit,
  data = linked_df,
  formula = survival::Surv(time, status) ~ trt
)
print(pooled_obj)
#> Pooled Cox regression results across posterior match classifications:
#> Retained imputations (m): 100
#> Mixture model distribution: weibull
#> Refit model: coxph
#>
#> Estimate Std.Error CI.lwr CI.upr df
#> trt -0.97202 0.40304 -1.76665 -0.1774 205.7726
# }