mdata, inla.surv and counts with exposure (SPDE2mdatasurv.Rmd)

The INLA package supports different kinds of data likelihoods, including inla.surv, which is a generalization of Surv from the survival package, and mdata. It also supports jointly modeling different kinds of outcomes. In this vignette we illustrate the case where we have three outcomes, of different types, and model them jointly.
Usually correlations between outcomes are modeled by sharing common model terms in the linear predictor. These terms can be covariates, random effects or sums of these. In some cases, the likelihood requires more information than just the vector of observations. As an example, the likelihood for survival models requires both observed times and information about censoring.
In this vignette we consider a joint model for three outcomes, simulate data from this model and illustrate how to fit the joint model accounting for shared terms. The linear predictor for each outcome includes covariates and a spatial random effect modeled using the SPDE approach.
The first outcome is known at an aggregated level, in a setting were we model it considering a kind of likelihood designed in INLA to account for all the information available with respect to the aggregation. The second outcome is assumed to be censored, so that we have to consider the observed value and the censoring information. The third outcome is just the usual kind of data: one scalar per data unit.
The SPDE models available in INLA can be applied to processes in $\mathbb{R}^d$, for $d = 1, 2$, and the theory in Lindgren, Rue, and Lindström (2011) includes higher-dimensional domains. We consider a spatial domain $\mathcal{D} \subset \mathbb{R}^2$, to keep the example simple. Without loss of generality we consider $\mathcal{D}$ as a rectangle.
Next, we store the length of each side and its average for later use
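The code defining the domain bounds is not included in the excerpt; the sketch below assumes a 10 x 7 rectangle, which is consistent with the 70 unit-square pixels counted later, and stores the side lengths and their average for later use:

```r
## hypothetical bounds: a 10 x 7 rectangle (70 unit pixels, as counted
## later in the vignette); the true bounds are not shown in the source
bb <- rbind(c(0, 10),  ## x-range
            c(0, 7))   ## y-range
ss <- bb[, 2] - bb[, 1]  ## length of each side
rr <- mean(ss)           ## average side length, used for scaling below
```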
We sample a set of locations, $\mathbf{s}_1, \dots, \mathbf{s}_n$, in this domain, $\mathcal{D}$, completely at random as follows:
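The sampling code is not shown; a sketch, assuming n = 3000 (the total implied by the censoring table later, 1507 + 1493) and a hypothetical seed:

```r
set.seed(1)  ## hypothetical seed, for reproducibility only
n <- 3000    ## inferred from the censoring table (1507 + 1493)
loc <- cbind(runif(n, bb[1, 1], bb[1, 2]),
             runif(n, bb[2, 1], bb[2, 2]))
```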
Instead of simulating from a model to play the role of a spatial smooth term, we define a smooth function on this domain as follows:
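The definition of sfn is not included in the excerpt; any smooth function with range close to [-2, 2] fits the summary of u shown below. A minimal assumed example:

```r
## assumed smooth function; sin(a) + cos(b) lies in [-2, 2], in line
## with the summary of u reported below
sfn <- function(a, b)
    sin(a) + cos(b)
```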
This function is evaluated at the locations and visualized with
u <- sfn(2 * pi * loc[, 1] / rr,
2 * pi * loc[, 2] / rr)
summary(u)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -1.99813 -0.77879 -0.02521 -0.04554 0.65591 1.99664
par(mar = c(2, 2, 0, 0), mgp = c(1.5, 0.5, 0), las = 1)
plot(loc, asp = 1, pch = 19, cex = 0.5 + (2 + u)/4, xlab = "", ylab = "")
In order to define the first outcome, let us partition our domain into a set of sub-regions, $\mathcal{D}_1, \dots, \mathcal{D}_{n_g}$, so that $\bigcup_g \mathcal{D}_g = \mathcal{D}$ and $\mathcal{D}_g \cap \mathcal{D}_{g'} = \emptyset$ for $g \neq g'$.
For simplicity of coding, but without loss of generality, we assume these small sub-regions to be pixels based on the following set of centers:
h <- 1 ## size of the side of each sub-region
group.ce <- as.matrix(expand.grid(
seq(bb[1, 1] + h/2, bb[1, 2], h),
seq(bb[2, 1] + h/2, bb[2, 2], h)))
(ng <- nrow(group.ce))
#> [1] 70
We can avoid dealing with the general spatial objects and operations available in R by assigning each location to a sub-region with an integer vector:
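The grouping code itself is not shown; one way to build it, assuming unit pixels aligned with bb and the expand.grid() ordering of group.ce (first coordinate varying fastest):

```r
## map each location to the pixel it falls in; the index must follow
## the expand.grid() ordering used to build group.ce
nx <- length(seq(bb[1, 1] + h/2, bb[1, 2], h))  ## pixels per row
gx <- floor((loc[, 1] - bb[1, 1]) / h) + 1      ## column index
gy <- floor((loc[, 2] - bb[2, 1]) / h) + 1      ## row index
group.id <- gx + (gy - 1) * nx
```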
This will be used as a grouping index, so that each group is made of the observations falling in each of the pixels. We will visualize the locations with color depending on this grouping later.
We will consider three outcomes.
The first outcome is Gaussian distributed, with unknown mean and variance, and a known scaling factor, $s_i$, such that $y_i \sim \mathrm{N}(\eta_{y,g(i)}, \sigma_y^2/s_i)$, where $g(i)$ is the group of observation $i$.
We assume $\eta_{y,g} = \beta_{y,0} + \beta_{y,1}x_{1,g} + \beta_{y,2}x_{2,g} + \beta_{y,3}x_{3,g} + u_g$ with $\boldsymbol{\beta}_y = (5, 0, -3, 3)$. This is evaluated considering the covariates and the random effect averaged per group:
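The covariate objects xxx and xxx.g used below are not defined in the excerpt; a sketch consistent with their later usage (a matrix with columns x1, x2, x3, and per-group means whose first column is the group label, which is why the code below drops column 1):

```r
## hypothetical covariates, one row per location
xxx <- cbind(x1 = runif(n, -1, 1),
             x2 = runif(n, -1, 1),
             x3 = runif(n, -1, 1))
## per-group means; the first column of xxx.g is the group label
xxx.g <- aggregate(as.data.frame(xxx),
                   by = list(group = group.id), FUN = mean)
```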
beta.y <- c(5, 0, -3, 3)
u.g <- tapply(u, group.id, mean)
eta.y.g <- drop(cbind(1, as.matrix(xxx.g[, 2:ncol(xxx.g)])) %*% beta.y) + u.g
We will simulate the scaling factor for each observation with
s <- rgamma(n, 7, 3)
summary(s)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.209 1.673 2.215 2.327 2.857 7.060
sigma.y <- 1
y <- rnorm(n, eta.y.g[group.id], sqrt(sigma.y/s))
summary(y)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 1.566 4.143 4.944 4.958 5.764 10.431
We simulated $y_i$ for each location $\mathbf{s}_i$. However, we consider that observations of this variable are available only at the aggregated level, for each of the sub-regions $\mathcal{D}_g$. Furthermore, for each sub-region $g$, we only know five summary statistics of the observations, not the individual values.
The function inla.agaussian computes these five statistics, given the observations $y_i$ and the scaling factors $s_i$ for each group. We use this function to compute the statistics, as follows:
library(INLA)
agg <- lapply(1:ng, function(g) {
ig <- which(group.id==g)
if(length(ig)>0)
return(inla.agaussian(y[ig], s[ig]))
return(inla.mdata(list(NA, 0, 1, 1, NA))) ### to deal with missing groups
})
str(YY <- Reduce('rbind', lapply(agg, unlist))) ## five columns matrix
#> num [1:70, 1:5] 0.577 0.34 0.495 0.476 0.415 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:70] "init" "" "" "" ...
#> ..$ : chr [1:5] "Y1" "Y2" "Y3" "Y4" ...
For the second outcome we choose a Weibull distribution with the parametrization in which the survival function is $S(w) = \exp\{-(\lambda w)^{\alpha}\}$, with $\lambda = \exp(\eta_w)$ (variant 1 in INLA).
We define the linear predictor as $\eta_w = \beta_{w,0} + \beta_{w,1}x_1 + \beta_{w,2}x_2 + \beta_{w,3}x_3 + \gamma_w u$, with $\boldsymbol{\beta}_w = (1, -1, 0, 1)$ and $\gamma_w = 0.5$, and we consider the following code to sample from a Weibull distribution:
beta.w <- c(1, -1, 0, 1)
alpha <- 2
gamma.w <- 0.5
lambdaw <- exp(cbind(1, xxx) %*% beta.w + gamma.w * u)
we0 <- rweibull(n, shape = alpha, scale = 1/lambdaw)
summary(we0)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.003639 0.160146 0.298233 0.410024 0.531580 3.814975
However, we suppose that some observations are censored:
summary(u.ev <- runif(n, 0.3, 1)) ## censoring factor
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.3001 0.4823 0.6595 0.6559 0.8347 0.9998
table(event <- rbinom(n, 1, 0.5)) ## censored (=0) or event (=1)
#>
#> 0 1
#> 1507 1493
summary(we <- ifelse(event == 1, we0, we0 * u.ev)) ## censored outcome
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.002706 0.121893 0.238876 0.338448 0.439904 3.814975
We consider that we have a different exposure at each location:
summary(ee <- rgamma(n, 10, 2))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 1.012 3.893 4.858 4.998 5.966 11.994
We assume an outcome following a Poisson distribution with $o_i \sim \mathrm{Poisson}(E_i \delta_i)$, where $E_i$ is the exposure.
We assume the linear predictor as $\delta_i = \exp(\eta_{p,i})$, with $\eta_{p,i} = \beta_{p,0} + \beta_{p,1}x_{1,i} + \beta_{p,2}x_{2,i} + \beta_{p,3}x_{3,i} + \gamma_p u_i$ and $\boldsymbol{\beta}_p = (2, 1, -1, 0)$, $\gamma_p = -0.3$.
This outcome is simulated from these quantities.
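The simulation code for this outcome is not included in the excerpt; a sketch, with coefficient values taken from the 'true' column of the results table later in the vignette:

```r
beta.p <- c(2, 1, -1, 0)  ## matches the 'true' values reported below
gamma.p <- -0.3           ## scaling of the shared spatial effect
delta.p <- exp(drop(cbind(1, xxx) %*% beta.p) + gamma.p * u)  ## rate
po <- rpois(n, lambda = ee * delta.p)  ## counts, with exposure ee
```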
We now have to set up the objects needed to fit the model.
First, we define a mesh over the domain using the following code:
mesh <- inla.mesh.2d(
loc.domain = matrix(bb[c(1,3,3,1,1, 2,2,4,4,2)], ncol = 2),
offset = c(.0001, .5) * rr, ## extensions size
max.edge = c(.05, .3) * rr,
cutoff = 0.025 * rr)
#> Warning: `inla.mesh.2d()` was deprecated in INLA 23.08.18.
#> ℹ Please use `fmesher::fm_mesh_2d_inla()` instead.
#> ℹ For more information, see
#> https://inlabru-org.github.io/fmesher/articles/inla_conversion.html
#> ℹ To silence these deprecation messages in old legacy code, set
#> `inla.setOption(fmesher.evolution.warn = FALSE)`.
#> ℹ To ensure visibility of these messages in package tests, also set
#> `inla.setOption(fmesher.evolution.verbosity = 'warn')`.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
mesh$n
#> [1] 1152
This mesh and the locations, colored by the group defined for the first outcome, can be visualized with
par(mar = c(0, 0, 1, 0), mgp = c(1.5, 0.5, 0))
plot(mesh, asp = 1, edge.color = gray(0.5))
points(loc, pch = 19, col = group.id, cex = 0.3 + (2 + u)/4)
The SPDE model is defined with
spde <- inla.spde2.pcmatern(
mesh,
prior.range=c(1, 0.1),
prior.sigma=c(1, 0.1))
We will fit a joint model considering the linear predictor terms as they were simulated. The inla.stack() function is meant to be used when working with SPDE models. However, it also helps when dealing with multiple likelihoods, as detailed below.
One way to deal with multiple likelihoods in INLA is to define a matrix outcome, where each column is used for one likelihood. However, in this vignette we are dealing with more complex cases. Therefore, we need to use a list on the left-hand side of the formula. This is the key point to implement this example.
The right hand side of the formula includes each term in the linear
predictor. When working with multiple likelihoods, each term in each
likelihood should be named uniquely. For example, if two likelihoods
include the same covariate but with a different coefficient, each one
has to have its own name. Therefore, we will pass the covariate
x1 as x1y, for the first outcome,
x1w for the second outcome, and x1p for the
third outcome. Similarly for the other ones, including the
intercept.
In our example, the SPDE model is in the linear predictor for the first outcome, a copy of this random effect is in the linear predictor for the second outcome, and another copy of this term is in the linear predictor for the third outcome.
The model formula considering all of these details is defined next.
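The formula object fff is not shown in the excerpt; a sketch consistent with the stack definitions and the reported hyperparameters ('Beta for sc1' and 'Beta for sc2' indicate copy terms with fixed = FALSE). The exact left-hand-side construction is an assumption:

```r
## hypothetical reconstruction of the joint model formula: a list on
## the left-hand side, one entry per outcome/likelihood
fff <- list(inla.mdata(Y), inla.surv(w, ev), o) ~ 0 +
    a1 + x1y + x2y + x3y +   ## first outcome: intercept + covariates
    a2 + x1w + x2w + x3w +   ## second outcome
    a3 + x1p + x2p + x3p +   ## third outcome
    f(s, model = spde) +                 ## SPDE field
    f(sc1, copy = "s", fixed = FALSE) +  ## scaled copy, 2nd outcome
    f(sc2, copy = "s", fixed = FALSE)    ## scaled copy, 3rd outcome
```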
The data has to be organized so it contains all the information needed, that is all the outcomes, covariates and random effect indexes. For the SPDE models it also has to include the projector matrices.
The main work of the inla.stack() function is to check
and organize outcomes, effects and projector matrices in order to
guarantee that the dimensions match. Additionally,
inla.stack() will deal with the problem of arranging the
outcomes to work properly with multiple likelihoods.
For the first outcome, we need to supply a five-column matrix, each column holding one of the five statistics previously computed for the ng groups. For the effects we define an intercept, covariates with names matching what the formula specifies, and the index for the SPDE model. There are two projector matrices: one for the fixed effects, which is just 1, and one for the SPDE model.
We will include a different tag for each stack to keep track. This is useful to collect results associated with each outcome. For each outcome we can also define an indicator vector to identify it. This is useful when there are missing data: the indicator can then be used to identify which link function to use when making predictions.
The stack for the first outcome can be defined as
stackY <- inla.stack(
tag = 'y',
data = list(Y = YY),
effects= list(
data.frame(a1 = 1,
x1y = xxx.g$x1,
x2y = xxx.g$x2,
x3y = xxx.g$x3),
s = 1:mesh$n),
A=list(1,
inla.spde.make.A(mesh, group.ce))
)
For the second outcome, we have to supply two vectors (observed time and censoring indicator) to build the inla.surv data, the covariates, with names matching the terms in the formula, and the projector matrices: the single 1 for the covariates and the projector for the shared SPDE term.
The stack for the second outcome can be defined as
stackW <- inla.stack(
tag = 'w',
data = list(w = we,
ev = event),
effects = list(
data.frame(a2 = 1,
x1w = xxx[, 1],
x2w = xxx[, 2],
x3w = xxx[, 3]),
sc1 = 1:mesh$n),
A = list(1,
inla.spde.make.A(mesh, loc))
)
For the third outcome we have to supply the vector of counts and the vector of exposures. The effects and projectors are similar to the previous ones; only the names change.
The stack for the third outcome can be defined as
stackP <- inla.stack(
tag = 'p',
data = list(o = po,
E = ee),
effects = list(
data.frame(a3 = 1,
x1p = xxx[, 1],
x2p = xxx[, 2],
x3p = xxx[, 3]),
sc2 = 1:mesh$n),
A = list(1,
inla.spde.make.A(mesh, loc))
)
We now fit the model and look at some results.
To fit the model we start by joining all the data into one data stack
stacked <- inla.stack(stackY, stackW, stackP)
We now supply this to the inla() function in order to fit the model:
result <- inla(
formula = fff,
family = c("agaussian", "weibullsurv", "poisson"),
data = inla.stack.data(stacked),
E = E,
control.predictor = list(
A=inla.stack.A(stacked)),
control.family = list(
list(),
list(variant = 1),
list()),
control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE)
)
result$cpu
#> NULL
We now compare the true value of each fixed effect with its posterior summary.
round(cbind(
true = c(beta.y, beta.w, beta.p),
result$summary.fix), 2)
#> true mean sd 0.025quant 0.5quant 0.975quant mode kld
#> a1 5 5.70 1.96 1.64 5.70 9.76 5.70 0
#> x1y 0 0.17 0.35 -0.51 0.17 0.85 0.17 0
#> x2y -3 -3.11 0.37 -3.84 -3.11 -2.38 -3.11 0
#> x3y 3 2.53 0.43 1.69 2.53 3.37 2.53 0
#> a2 1 1.03 1.02 -1.09 1.02 3.15 1.02 0
#> x1w -1 -0.99 0.04 -1.07 -0.99 -0.91 -0.99 0
#> x2w 0 -0.01 0.04 -0.09 -0.01 0.07 -0.01 0
#> x3w 1 1.07 0.04 1.00 1.07 1.15 1.07 0
#> a3 2 1.84 0.60 0.59 1.85 3.09 1.85 0
#> x1p 1 1.00 0.01 0.98 1.00 1.02 1.00 0
#> x2p -1 -0.98 0.01 -1.00 -0.98 -0.96 -0.98 0
#> x3p 0 0.01 0.01 -0.01 0.01 0.03 0.01 0
For the hyper-parameters, we have one for the agaussian likelihood, one for the weibullsurv, two for the SPDE model, and two that define the scaling of the shared SPDE terms in the second and third outcomes.
round(cbind(true = c(1/sigma.y^2, alpha, NA, NA, gamma.w, gamma.p),
result$summary.hy), 2)
#> true mean sd 0.025quant 0.5quant
#> Precision for the AggGaussian observations 1.0 1.01 0.03 0.96 1.01
#> alpha parameter for weibullsurv[2] 2.0 2.27 0.06 2.18 2.27
#> Range for s NA 15.38 2.96 10.73 15.00
#> Stdev for s NA 1.72 0.27 1.28 1.69
#> Beta for sc1 0.5 0.54 0.02 0.50 0.54
#> Beta for sc2 -0.3 -0.31 0.01 -0.32 -0.31
#> 0.975quant mode
#> Precision for the AggGaussian observations 1.06 1.01
#> alpha parameter for weibullsurv[2] 2.41 2.24
#> Range for s 22.31 14.07
#> Stdev for s 2.34 1.60
#> Beta for sc1 0.57 0.54
#> Beta for sc2 -0.30 -0.31
To collect outputs associated with each outcome, we use the tag of each data stack to get the data index from the joint stack. For example, the data index for the third outcome is
idx1 <- inla.stack.index(stacked, tag = "p")$data
We can see the summary of the fitted values for the first few observations of this outcome with
head(cbind(true = delta.p,
result$summary.fitted.values[idx1, ]), 3)
#> true mean sd 0.025quant 0.5quant
#> fitted.APredictor.3071 5.920626 5.889987 0.1674931 5.567970 5.887713
#> fitted.APredictor.3072 4.326089 4.242086 0.1142313 4.022118 4.240658
#> fitted.APredictor.3073 10.401938 10.690993 0.3198927 10.077774 10.686015
#> 0.975quant mode
#> fitted.APredictor.3071 6.224950 5.883205
#> fitted.APredictor.3072 4.470182 4.237828
#> fitted.APredictor.3073 11.332578 10.676140
To access the DIC, WAIC and log-score for each outcome, we can use the family indicator stored in the DIC output, as follows:
str(result$dic)
#> List of 14
#> $ dic : num 25497
#> $ p.eff : num 173
#> $ mean.deviance : num 25324
#> $ deviance.mean : num 25151
#> $ dic.sat : num 6359
#> $ mean.deviance.sat: num 6186
#> $ deviance.mean.sat: num 6010
#> $ family.dic : num [1:3] 6216.5 -53.1 19333.4
#> $ family.dic.sat : num [1:3] 94.8 3153.2 3113.8
#> $ family.p.eff : num [1:3] 37.8 23.8 111.3
#> $ family : num [1:6070] 1 1 1 1 1 1 1 1 1 1 ...
#> $ local.dic : num [1:6070] 103.5 92.1 115.5 101.7 108.9 ...
#> $ local.dic.sat : num [1:6070] 1.44 1.25 1.14 1.92 1.24 ...
#> $ local.p.eff : num [1:6070] 0.676 0.543 0.627 0.527 0.643 ...
str(result$waic)
#> List of 4
#> $ waic : num 24277
#> $ p.eff : num 148
#> $ local.waic : num [1:6070] 72.9 72.5 72.5 73.2 72.5 ...
#> $ local.p.eff: num [1:6070] 0.382 0.229 0.185 0.574 0.193 ...
str(result$cpo)
#> List of 3
#> $ cpo : num [1:6070] 3.27e-23 1.05e-20 9.66e-26 6.48e-23 2.57e-24 ...
#> $ pit : num [1:6070] 0.882 0.279 0.347 0.973 0.499 ...
#> $ failure: num [1:6070] 0 0 0 0 0 0 0 0 0 0 ...
result$dic$family.dic ### DIC for each outcome
#> [1] 6216.45015 -53.14955 19333.39448
tapply(result$waic$local.waic,
result$dic$family, ## taken from dic family id
sum) ## WAIC for each outcome
#> 1 2 3
#> 5000.23765 -54.04986 19330.61060
tapply(result$cpo$cpo,
result$dic$family, ## taken from dic family id
function(x)
c(nlCPO = -sum(log(x)))) ## -sum log CPO for each outcome
#> 1 2 3
#> 3109.47483 -26.98744 9665.64534