A hidden Markov model for informative dropout in longitudinal response data with crisis states

doi:10.1016/j.spl.2011.02.005

Statistics & Probability Letters

Volume 81, Issue 7, July 2011, Pages 730-738

https://doi.org/10.1016/j.spl.2011.02.005

Abstract

We adopt a hidden state approach for the analysis of longitudinal data subject to dropout. Motivated by two applied studies, we assume that subjects can move between three states: stable, crisis, dropout. Dropout is observed but the other two states are not. During a possibly transient crisis state both the longitudinal response distribution and the probability of dropout can differ from those for the stable state. We adopt a linear mixed effects model with subject-specific trajectories during stable periods and additional random jumps during crises. We place the model in the context of Rubin’s taxonomy and develop the associated likelihood. The methods are illustrated using the two motivating examples.

Introduction

Longitudinal studies often suffer attrition: individuals drop out of the study before the scheduled completion time and thus present incomplete data. A variety of methods have been developed for dealing with the possibility that dropout is related to responses (Hogan et al., 2004; Molenberghs et al., 2004; Philipson et al., 2008; Tsiatis and Davidian, 2004), though caution in using such methods is always needed (Molenberghs et al., 2004; Molenberghs et al., 2008).

Recently le Cessie et al. (2009) recognised that longitudinal data analysis can be complicated by the fact that during follow-up, subjects can change condition or state, examples being remission, relapse and death for cancer patients. Both longitudinal responses and the dropout probability can depend on the current state and this needs to be accounted for in analysis. The methods developed by le Cessie et al. (2009) are appropriate when the underlying state is observed. If a state is defined by the level of a response variable but obscured by measurement error, then the hidden Markov methods of Satten and Longini (1996) or Guihenneuc-Jouyaux et al. (2000) can form a basis for analysis. But, as argued by Liestøl and Andersen (2002), there are circumstances where a subject’s state is either hidden or vaguely defined. For the liver cirrhosis application considered by Liestøl and Andersen (2002) for example, some subjects experienced apparent “crises”, marked by a sudden change in response values. These crises could be transient or could indicate a terminal disease stage.

In this work we build on the ideas of le Cessie et al. (2009) and Liestøl and Andersen (2002) and develop a hidden state modelling approach for longitudinal data subject to dropout. We assume that during follow-up subjects can experience different states, which we will think of as a stable state, a crisis state and dropout. The first two states are transient and reversible, while the third, dropout, is an absorbing state. The crisis state can be defined as an intermediate phase in which significant changes in the response values can be observed and in which the probability of dropout is increased. We assume that the longitudinal response is associated with the underlying state, but that states other than dropout are not directly observed and perhaps not precisely defined. We exclude situations where the state is defined by the response, such as for AIDS when the CD4 T-cell count first reaches a given level, and leukaemia relapse when a residual leukaemic cell count is over a defined threshold (De Lorenzo et al., 2005).
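
As a sketch, the three-state structure just described can be summarised by a one-step transition matrix. The numbering 1 = stable, 2 = crisis, 3 = dropout and the symbols $\phi_{jk}(t)$ follow the notation used in the schizophrenia analysis; writing out the full matrix in this way is an illustrative assumption, not a formula quoted from the paper.

$$
P(t) =
\begin{pmatrix}
\phi_{11}(t) & \phi_{12}(t) & \phi_{13}(t) \\
\phi_{21}(t) & \phi_{22}(t) & \phi_{23}(t) \\
0 & 0 & 1
\end{pmatrix},
\qquad
\sum_{k=1}^{3} \phi_{jk}(t) = 1 \quad (j = 1, 2).
$$

The final row encodes dropout as an absorbing state, while non-zero $\phi_{12}(t)$ and $\phi_{21}(t)$ allow movement in both directions between the stable and crisis states.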

In Section 2 we provide brief details of two applications which motivated our work. Our model is introduced and the estimation outlined in Section 3, where we also argue for the merits of proper treatment of time ordering when considering missingness mechanisms for longitudinal data. Section 4 includes summaries of our analyses of the motivating data sets, and some brief comments in Section 5 conclude the paper.

Section snippets

Schizophrenia data

We consider data from a trial into treatment of schizophrenia, previously described by Henderson et al. (2000) and Diggle et al. (2007). There are three treatment groups (standard, placebo and experimental) and the response of interest is the Positive and Negative Symptom Scale (PANSS), which is high for subjects with poor condition. Six assessments were scheduled, at weeks 0, 1, 2, 4, 6 and 8, but of the 518 subjects under consideration only 269 completed the trial. Of the remainder, 183

The model and assumptions

We will begin with the general situation. We assume a balanced design with common scheduled assessment times for all $n$ subjects recruited into the study. We label the assessment times $t = 1, 2, \dots, T$ but do not require equal spacing in calendar time. For the moment we will consider a single generic subject and do not use a subscript to differentiate between individuals. We identify two stochastic processes in $t$: a partly unobservable finite state first-order Markov chain, $S_{t}$, and an observable

Schizophrenia data

For the fixed effects component we assume separate quadratic time trends within each of the three treatment groups. We take the logistic model (1) for the probability $\phi_{13}(t)$ of a direct transition from a stable state to dropout, but we assume that the coefficient of the previous response is time-constant, i.e. $\alpha_{1t} = \alpha_{1}$. We assume that transitions out of a crisis state are time-homogeneous: $\phi_{21}(t) = \phi_{21}$ and $\phi_{23}(t) = \phi_{23}$.
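
The transition assumptions in the preceding paragraph can be illustrated with a minimal Python sketch. Model (1) is not reproduced in this excerpt, so the code assumes a plain logistic form with a time-specific intercept (`alpha0_t`, hypothetical) and a time-constant coefficient `alpha1` on the previous response. The stable-to-crisis probability is not specified here either, so it is represented by a hypothetical parameter `p_crisis`, applied conditionally on not dropping out. None of these names or numerical values come from the paper.

```python
import math


def expit(x):
    """Inverse logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def transition_matrix(y_prev, alpha0_t, alpha1, p_crisis, phi21, phi23):
    """One-step transition matrix over (stable, crisis, dropout).

    Assumptions (not taken from the paper):
      * phi13 = expit(alpha0_t + alpha1 * y_prev), a guess at the shape of
        the logistic model for stable -> dropout;
      * p_crisis is a hypothetical probability of entering a crisis,
        given that the subject does not drop out at this step.
    phi21 and phi23 are constants, reflecting time-homogeneous
    transitions out of the crisis state.
    """
    if not 0.0 <= p_crisis <= 1.0:
        raise ValueError("p_crisis must lie in [0, 1]")
    if phi21 < 0.0 or phi23 < 0.0 or phi21 + phi23 > 1.0:
        raise ValueError("phi21 and phi23 must be non-negative with sum <= 1")

    phi13 = expit(alpha0_t + alpha1 * y_prev)   # stable -> dropout
    phi12 = (1.0 - phi13) * p_crisis            # stable -> crisis
    phi11 = (1.0 - phi13) * (1.0 - p_crisis)    # stable -> stable
    phi22 = 1.0 - phi21 - phi23                 # crisis -> crisis

    return [
        [phi11, phi12, phi13],
        [phi21, phi22, phi23],
        [0.0, 0.0, 1.0],                        # dropout is absorbing
    ]


if __name__ == "__main__":
    # Purely illustrative values, not estimates from the schizophrenia data.
    for row in transition_matrix(y_prev=80.0, alpha0_t=-4.0, alpha1=0.02,
                                 p_crisis=0.1, phi21=0.5, phi23=0.2):
        print([round(p, 3) for p in row])
```

With a positive `alpha1`, a higher previous response (worse condition on the PANSS scale) raises the probability of a direct move from the stable state to dropout, while the crisis row stays the same at every assessment time.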

We performed three analyses of these data: standard maximum likelihood, Bayes

Discussion

We have proposed an approach to modelling longitudinal data subject to dropout which might be useful when there are indications that subjects can have high risk or crisis periods during which the response variable can change dramatically and the probability of dropout be affected. Our model is MAR given complete data filtrations but MNAR given only observed data filtrations. We do not claim that our approach will always be appropriate, but we do consider it potentially useful. In both the

Acknowledgements

We gratefully acknowledge Dr. Ton de Craen and Dr. Rudi Westendorp of the Leiden University Medical Centre, for kindly providing the analysed data. We thank the guest editor and an anonymous reviewer for helpful comments on an earlier version of the manuscript.

References (20)

A.B. der Wiel et al.
A high response is not essential to prevent selection bias: results from the Leiden 85-plus study
Journal of Clinical Epidemiology
(2002)
K. Christensen et al.
The quest for genetic determinants of human longevity: challenges and insights
Nature Reviews Genetics
(2006)
R.B. Davies
Hypothesis testing when a nuisance parameter is present only under the alternative
Biometrika
(1977)
R.B. Davies
Hypothesis testing when a nuisance parameter is present only under the alternative
Biometrika
(1987)
P. De Lorenzo et al.
Analysis of interval-censored longitudinal data with application to onco-haematology
Statistics in Medicine
(2005)
P.J. Diggle et al.
Analysis of longitudinal data with drop-out: objectives, assumptions and a proposal (with discussion)
Applied Statistics
(2007)
C. Dufouil et al.
Analysis of longitudinal studies with death and drop-out: a case study
Statistics in Medicine
(2004)
C. Guihenneuc-Jouyaux et al.
Modeling markers of disease progression by a hidden Markov process: application to characterizing CD4 cell decline
Biometrics
(2000)
R. Henderson et al.
Joint modelling of longitudinal measurements and event time data
Biostatistics
(2000)

There are more references available in the full text version of this article.
