Robust plug-in estimators in proportional scatter models☆
Introduction
Many multivariate methods involving several populations assume equality of covariance matrices. Scatter matrices, however, can be related in more complex ways than simply being equal or not. For example, one matrix might be identical to another except that each of its elements is multiplied by a single constant; we then say that the matrices are proportional. Equivalently, proportional matrices share identical eigenvectors (or principal components), while their eigenvalues differ by a proportionality constant. A weaker relationship is that the matrices commute, so that they have the same principal components but different eigenvalues, as in the common principal component (CPC) model proposed by Flury (1984).
Assume that we have k populations in $\mathbb{R}^p$, with covariance matrices $\Sigma_1,\dots,\Sigma_k$. The common principal components model states that $\Sigma_i=\beta\Lambda_i\beta^{\mathrm{T}}$, $1\le i\le k$, where $\Lambda_i$ is a diagonal matrix and $\beta$ is an orthogonal matrix. The proportional model can be seen as a special case of the CPC model, obtained by imposing the further constraints $\Lambda_i=\rho_i\Lambda_1$, $1\le i\le k$, with $\rho_1=1$, on the parameter space.
In the one-group principal component analysis the eigenvectors forming the orthogonal matrix are usually ordered according to an increasing order of the associated eigenvalues. In the proportional model, in order to identify the axes uniquely, it is usually assumed that the eigenvalues of $\Sigma_1$ are distinct and that the columns of $\beta$ are arranged according to increasing values of the eigenvalues of $\Sigma_1$.
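The relationships described above can be checked numerically on a small case. The following sketch is restricted to symmetric 2x2 matrices so that it needs only the Python standard library; the matrices, the constant `rho_2` and all function names are hypothetical choices made for illustration. It shows that a proportional pair shares its principal axis and has eigenvalues in a constant ratio, whereas a pair satisfying only the CPC structure shares the axis (and therefore commutes) without a constant eigenvalue ratio.

```python
import math

def eig_sym_2x2(m):
    """Return (smaller eigenvalue, larger eigenvalue, angle of the major axis)."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    centre = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # eigenvector of the larger eigenvalue
    return centre - radius, centre + radius, theta

def from_axes(theta, small, large):
    """Build a symmetric 2x2 matrix with major axis at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    off = (large - small) * c * s
    return [[large * c * c + small * s * s, off],
            [off, large * s * s + small * c * c]]

def matmul(a, b):
    return [[sum(a[r][j] * b[j][c] for j in range(2)) for c in range(2)]
            for r in range(2)]

# Hypothetical reference scatter matrix and a proportional copy of it.
sigma_1 = [[4.0, 1.0], [1.0, 2.0]]
rho_2 = 2.5
sigma_2 = [[rho_2 * x for x in row] for row in sigma_1]

low_1, high_1, theta_1 = eig_sym_2x2(sigma_1)
low_2, high_2, theta_2 = eig_sym_2x2(sigma_2)

# A CPC-type matrix: same axes as sigma_1, eigenvalues chosen freely.
sigma_3 = from_axes(theta_1, 1.0, 9.0)
low_3, high_3, theta_3 = eig_sym_2x2(sigma_3)

# Matrices with common eigenvectors commute.
ab, ba = matmul(sigma_1, sigma_3), matmul(sigma_3, sigma_1)
commutes = all(math.isclose(ab[r][c], ba[r][c], abs_tol=1e-9)
               for r in range(2) for c in range(2))
```

In this example both eigenvalue ratios of `sigma_2` to `sigma_1` equal `rho_2`, while the two ratios of `sigma_3` to `sigma_1` differ, which is what separates the proportional model from the general CPC model.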
Estimation of proportional covariance matrices in the two-sample case was considered by Khatri (1967) and Pillai et al. (1969), who studied the distribution of the ratios of the characteristic roots of $S_1S_2^{-1}$, where $S_1$ and $S_2$ are the sample covariance matrices. However, these authors neither explicitly constructed tests for proportionality nor indicated how to estimate proportional covariance matrices. Guttman et al. (1985) and Rao (1983) derived the asymptotic distribution of the estimators of the proportionality constant.
For the case of k>2, Kim (1971) showed that, under normal sampling with proportional covariance matrices, there exists at least one solution of the likelihood equations, and derived the joint asymptotic distribution of the k−1 estimated constants of proportionality. Owen (1984) considered the case of several groups in the context of a classification problem and gave an algorithm to find the maximum likelihood estimators. Essentially the same algorithm was considered by Manly and Rayner (1987) and by Eriksen (1987), who, in addition, proved the convergence of the algorithm and the uniqueness of the maximum likelihood estimators. Flury (1986), using a different parametrization based on the CPC model, obtained a system of equations defining the maximum likelihood estimators and gave an algorithm to solve it.
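To make the idea of such an iterative solution concrete, here is a minimal sketch of an alternating scheme. It rests on stated assumptions rather than on any one of the cited papers: under normal sampling with scatter matrices of the form rho_i * Sigma and rho_1 = 1, setting the derivatives of the log-likelihood to zero gives Sigma = (1/n) * sum_i n_i * S_i / rho_i and rho_i = tr(Sigma^{-1} S_i) / p, and the code simply alternates between these two updates. It is restricted to p = 2 so that it runs on the standard library alone, and every function and variable name is hypothetical.

```python
def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def trace_of_product(a, b):
    # tr(AB) = sum over r, c of a[r][c] * b[c][r]
    return sum(a[r][c] * b[c][r] for r in range(2) for c in range(2))

def pooled_matrix(scatters, sizes, rho):
    """Weighted average of the rescaled matrices S_i / rho_i."""
    n = float(sum(sizes))
    return [[sum(n_i * s[r][c] / rho_i
                 for s, n_i, rho_i in zip(scatters, sizes, rho)) / n
             for c in range(2)] for r in range(2)]

def fit_proportional(scatters, sizes, tol=1e-12, max_iter=1000):
    """Alternate the two likelihood equations until rho stabilises."""
    rho = [1.0] * len(scatters)
    for _ in range(max_iter):
        sigma_inv = inverse_2x2(pooled_matrix(scatters, sizes, rho))
        # rho_1 is fixed at 1 for identifiability; p = 2 in this sketch.
        new_rho = [1.0] + [trace_of_product(sigma_inv, s) / 2.0
                           for s in scatters[1:]]
        change = max(abs(new - old) for new, old in zip(new_rho, rho))
        rho = new_rho
        if change < tol:
            break
    return pooled_matrix(scatters, sizes, rho), rho

# Hypothetical input: three exactly proportional 2x2 matrices.
base = [[4.0, 1.0], [1.0, 2.0]]
true_rho = [1.0, 2.5, 0.5]
scatters = [[[r * x for x in row] for row in base] for r in true_rho]
sizes = [50, 60, 40]

sigma_hat, rho_hat = fit_proportional(scatters, sizes)
```

With exactly proportional inputs, as here, the iteration recovers the base matrix and the constants used to build the example. Nothing in the updates requires the inputs to be sample covariance matrices, so the same function accepts any positive definite scatter estimates.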
The asymptotic distribution of the maximum likelihood estimators for the CPC model and for the proportionality model is given in Flury (1988) under normal sampling. A robustified version of these estimators, under a CPC model, was given by Novi Inverardi and Flury (1992). These authors considered robust and independent estimators of the scatter matrices of the k populations, using the affine-equivariant M-estimators studied by Maronna (1976), and plugged them into the equations defining the maximum likelihood estimators of the parameters. Boente and Orellana (2001) established some asymptotic properties of these plug-in estimators for the CPC model when any robust, asymptotically normally distributed scatter matrix estimator is used, and also considered an approach based on projection pursuit principles.
This paper focuses on robust estimation using a plug-in approach, under proportionality of the scatter matrices. In Section 2, we describe the proposal to be considered for the proportionality model. In Section 3, some asymptotic results are established, while an application to test the hypothesis of equality against proportionality is given in Section 4. All proofs are given in the appendix.
Proportionality model
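Let $x_{i1},\dots,x_{in_i}$, $1\le i\le k$, be independent observations from k independent samples in $\mathbb{R}^p$ with location parameter $\mu_i$ and scatter matrix $\Sigma_i$. We are interested in robustly estimating the common eigenvectors of $\Sigma_i$, the eigenvalues of $\Sigma_1$ and the proportionality constants $\rho_i>0$ under the proportionality model $\Sigma_i=\rho_i\Sigma_1$, $1\le i\le k$, with $\rho_1=1$ and $\lambda_1<\cdots<\lambda_p$ the eigenvalues of $\Sigma_1$.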
Let . Flury (1986) obtains the maximum likelihood estimators, for normally distributed
Asymptotic distribution
A standard framework for deriving the asymptotic behavior in robust principal component analysis is to assume that the estimators of the scatter matrix are asymptotically normally distributed and spherically invariant. For that reason, and since the samples of the k populations are independent, we will assume throughout this section that, for $1\le i\le k$, the estimators of the scatter matrices $\Sigma_i$ are independent and satisfy the following assumptions:
- A1.
where has a multivariate normal
A test of equality against proportionality
The results and proposals of the previous sections can be used to test the hypothesis of equality of several scatter matrices against proportionality. This corresponds to the first two levels of similarity among the covariance matrices of k populations considered in Flury (1988). More precisely, assume that we want to test $H_0:\Sigma_1=\cdots=\Sigma_k$ against $H_1:\Sigma_i=\rho_i\Sigma_1$, $1\le i\le k$. The robust estimators defined in Section 2.2 provide statistics more resistant to outlying observations than the classical ones and thus,
References (34)
- Asymptotic theory for robust principal components. J. Multivariate Anal. (1987)
- Proportionality of k covariance matrices. Statist. Probab. Lett. (1986)
- Likelihood tests for relationships between covariance matrices
- An Introduction to Multivariate Statistical Analysis (1984)
- Berrendero, J.R., 1996. Contribuciones a la teoría de la robustez respecto al sesgo. Tesis de Doctorado, Universidad...
- et al. A robust approach to common principal components
- et al. Influence functions and outlier detection under the common principal components model: a robust approach. Biometrika (2002)
- et al. Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies. Biometrika (2000)
- et al. A fast algorithm for robust principal components based on projection pursuit
- Croux, C., Ruiz-Gazen, A., 2000. High breakdown estimators for principal components: The projection-pursuit approach...
- Robust estimation of dispersion matrices and principal components. J. Amer. Statist. Assoc.
- Proportionality of covariance matrices. Ann. Statist.
- Common principal components in groups. J. Amer. Statist. Assoc.
- Common Principal Components and Related Multivariate Models
- Robust Statistics
☆ This research was partially supported by Grant PICT # 03-00000-006277 from ANPCYT at Buenos Aires, Argentina. The research of Graciela Boente was also partially supported by a Guggenheim fellowship.