A hidden Markov model for informative dropout in longitudinal response data with crisis states
Introduction
Longitudinal studies often suffer attrition, in that individuals drop out of the study before the scheduled completion time, and thus present incomplete data. A variety of methods have by now been developed for dealing with the possibility that dropout is related to responses (Hogan et al., 2004, Molenberghs et al., 2004, Philipson et al., 2008, Tsiatis and Davidian, 2004), though caution in using such methods is always needed (Molenberghs et al., 2004, Molenberghs et al., 2008).
Recently le Cessie et al. (2009) recognised that longitudinal data analysis can be complicated by the fact that during follow-up, subjects can change condition or state, examples being remission, relapse and death for cancer patients. Both longitudinal responses and the dropout probability can depend on the current state and this needs to be accounted for in analysis. The methods developed by le Cessie et al. (2009) are appropriate when the underlying state is observed. If a state is defined by the level of a response variable but obscured by measurement error, then the hidden Markov methods of Satten and Longini (1996) or Guihenneuc-Jouyaux et al. (2000) can form a basis for analysis. But, as argued by Liestøl and Andersen (2002), there are circumstances where a subject’s state is either hidden or vaguely defined. For the liver cirrhosis application considered by Liestøl and Andersen (2002) for example, some subjects experienced apparent “crises”, marked by a sudden change in response values. These crises could be transient or could indicate a terminal disease stage.
In this work we build on the ideas of le Cessie et al. (2009) and Liestøl and Andersen (2002) and develop a hidden state modelling approach for longitudinal data subject to dropout. We assume that during follow-up subjects can experience different states, which we will think of as a stable state, a crisis state and dropout. The first two states are transient and reversible, while the third, dropout, is an absorbing state. The crisis state can be defined as an intermediate phase in which significant changes in the response values can be observed and in which the probability of dropout is increased. We assume that the longitudinal response is associated with the underlying state, but that states other than dropout are not directly observed and perhaps not precisely defined. We exclude situations where the state is defined by the response, such as AIDS, when the CD4 T-cell count first reaches a given level, and leukaemia relapse, when a residual leukaemic cell count is over a defined threshold (De Lorenzo et al., 2005).
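The state structure described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the model fitted in this paper: the transition probabilities, the state-specific response means, the common standard deviation and the rule that every subject starts in the stable state are all hypothetical values chosen for illustration.

```python
import random

STABLE, CRISIS, DROPOUT = 0, 1, 2

# Hypothetical one-step transition probabilities (each row sums to 1).
# Stable and crisis communicate with each other; dropout is absorbing
# and is more likely to be entered from the crisis state.
TRANSITION = [
    [0.85, 0.10, 0.05],  # from stable
    [0.30, 0.45, 0.25],  # from crisis
    [0.00, 0.00, 1.00],  # from dropout (absorbing)
]

# Hypothetical state-specific response means: a crisis shifts the response.
RESPONSE_MEAN = {STABLE: 70.0, CRISIS: 95.0}
RESPONSE_SD = 10.0


def simulate_subject(n_visits, rng):
    """Simulate hidden states and responses for one subject.

    Returns (states, responses). A response is None at every visit
    at which the subject has already dropped out.
    """
    states, responses = [], []
    state = STABLE  # assumption: every subject starts in the stable state
    for _ in range(n_visits):
        states.append(state)
        if state == DROPOUT:
            responses.append(None)
        else:
            responses.append(rng.gauss(RESPONSE_MEAN[state], RESPONSE_SD))
        # Draw the next state from the row of the current state.
        state = rng.choices([STABLE, CRISIS, DROPOUT], weights=TRANSITION[state])[0]
    return states, responses


states, responses = simulate_subject(6, random.Random(1))
```

In an analysis only the responses and the dropout time would be available; the stable and crisis labels in `states` are hidden and would have to be inferred.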
In Section 2 we provide brief details of two applications which motivated our work. Our model is introduced and the estimation outlined in Section 3, where we also argue for the merits of proper treatment of time ordering when considering missingness mechanisms for longitudinal data. Section 4 includes summaries of our analyses of the motivating data sets, and some brief comments in Section 5 conclude the paper.
Schizophrenia data
We consider data from a trial of treatments for schizophrenia, previously described by Henderson et al. (2000) and Diggle et al. (2007). There are three treatment groups (standard, placebo and experimental) and the response of interest is the Positive and Negative Symptom Scale (PANSS), which takes high values for subjects in poor condition. Six assessments were scheduled, at weeks 0, 1, 2, 4, 6 and 8, but of the 518 subjects under consideration only 269 completed the trial. Of the remainder, 183
The model and assumptions
We will begin with the general situation. We assume a balanced design with common scheduled assessment times for all subjects recruited into the study. We label the assessment times but do not require equal spacing in calendar time. For the moment we will consider a single generic subject and do not use a subscript to differentiate between individuals. We identify two stochastic processes in : a partly unobservable finite state first-order Markov chain, , and an observable
Schizophrenia data
For the fixed effects component we assume separate quadratic time trends within each of the three treatment groups. We take the logistic model (1) for the probability of a direct transition from a stable state to dropout, but we assume that the coefficient of the previous response is time-constant, i.e. . We assume that transitions out of a crisis state are time-homogeneous: and .
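A logistic dropout probability with a time-constant coefficient on the previous response can be sketched as follows. Because model (1) is not reproduced in this excerpt, the functional form used here (a time-specific intercept plus a shared slope on the previous response) is an assumption, and the function name, intercepts and slope are hypothetical values chosen for illustration only.

```python
import math


def stable_to_dropout_prob(prev_response, intercept_t, beta):
    """Logistic probability of a direct stable-to-dropout transition.

    Assumed form: logit(p_t) = intercept_t + beta * prev_response,
    where beta is shared across all assessment times.
    """
    linear_predictor = intercept_t + beta * prev_response
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical time-specific intercepts and a common slope.
intercepts = {1: -4.0, 2: -3.8, 3: -3.5}
beta = 0.02

# Dropout probabilities at each time for a previous PANSS score of 90.
probs = {t: stable_to_dropout_prob(90.0, a, beta) for t, a in intercepts.items()}
```

With a positive slope, a higher previous response raises the dropout probability at every assessment time, while the time-specific intercepts let the baseline risk vary over follow-up.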
We performed three analyses of these data: standard maximum likelihood, Bayes
Discussion
We have proposed an approach to modelling longitudinal data subject to dropout which might be useful when there are indications that subjects can have high risk or crisis periods during which the response variable can change dramatically and the probability of dropout be affected. Our model is MAR given complete data filtrations but MNAR given only observed data filtrations. We do not claim that our approach will always be appropriate, but we do consider it potentially useful. In both the
Acknowledgements
We gratefully acknowledge Dr. Ton de Craen and Dr. Rudi Westendorp of the Leiden University Medical Centre, for kindly providing the analysed data. We thank the guest editor and an anonymous reviewer for helpful comments on an earlier version of the manuscript.
References (20)
- et al. A high response is not essential to prevent selection bias: results from the Leiden 85-plus study. Journal of Clinical Epidemiology (2002)
- et al. The quest for genetic determinants of human longevity: challenges and insights. Nature Reviews Genetics (2006)
- Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika (1977)
- Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika (1987)
- et al. Analysis of interval-censored longitudinal data with application to onco-haematology. Statistics in Medicine (2005)
- et al. Analysis of longitudinal data with drop-out: objectives, assumptions and a proposal (with discussion). Applied Statistics (2007)
- et al. Analysis of longitudinal studies with death and drop-out: a case study. Statistics in Medicine (2004)
- et al. Modeling markers of disease progression by a hidden Markov process: application to characterizing CD4 cell decline. Biometrics (2000)
- et al. Joint modelling of longitudinal measurements and event time data. Biostatistics (2000)