Determining an Adequate Number of Principal Components
Stanley L. Sclove
A chapter in Advances in Principal Component Analysis from IntechOpen
Abstract:
The problem of choosing the number of principal components (PCs) to retain is analyzed in the context of model selection, using so-called model selection criteria (MSCs). For a prespecified set of models, indexed by $k = 1, 2, \ldots, K$, these criteria take the form $\mathrm{MSC}_k = -2\,\mathrm{LL}_k + a_n m_k$, where, for model $k$, $\mathrm{LL}_k$ is the maximum log likelihood, $m_k$ is the number of independent parameters, and the constant $a_n$ is $a_n = \ln n$ for BIC and $a_n = 2$ for AIC. The maximum log likelihood $\mathrm{LL}_k$ is achieved by using the maximum likelihood estimates (MLEs) of the parameters. In Gaussian models, $\mathrm{LL}_k$ involves the logarithm of the mean squared error (MSE). The main contribution of this chapter is to show how best to use BIC to choose the number of PCs, and to compare the results with ad hoc procedures that have been used. The findings are stated as they apply to the eigenvalues of the correlation matrix, which lie between 0 and $p$ and have an average of 1. With AIC, including an additional component, PC $k+1$, is justified if the corresponding eigenvalue $\lambda_{k+1}$ is greater than $\exp(-2/n)$. With BIC, including PC $k+1$ is justified if $\lambda_{k+1} > n^{1/n}$, a threshold that tends to 1 for large $n$. The BIC rule is therefore in approximate agreement with the average eigenvalue rule for correlation matrices, which states that one should retain dimensions with eigenvalues larger than 1.
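The eigenvalue thresholds quoted in the abstract can be illustrated with a short Python sketch. It takes the two thresholds at face value as the abstract states them (exp(-2/n) for AIC and n^(1/n) for BIC); the function names, the sample size, and the eigenvalues are hypothetical and chosen only for illustration. They are not taken from the chapter.

```python
import math


def aic_threshold(n):
    """AIC eigenvalue threshold as quoted in the abstract: exp(-2/n)."""
    return math.exp(-2.0 / n)


def bic_threshold(n):
    """BIC eigenvalue threshold as quoted in the abstract: n**(1/n)."""
    return n ** (1.0 / n)


def count_retained(eigenvalues, threshold):
    """Count leading PCs whose eigenvalue exceeds the threshold.

    Eigenvalues are sorted in decreasing order, and PCs are added one
    at a time until the next eigenvalue fails the test.
    """
    k = 0
    for lam in sorted(eigenvalues, reverse=True):
        if lam > threshold:
            k += 1
        else:
            break
    return k


# Made-up eigenvalues of a 5 x 5 correlation matrix (they sum to p = 5).
eigenvalues = [2.1, 1.2, 0.99, 0.5, 0.21]
n = 100  # hypothetical sample size

k_aic = count_retained(eigenvalues, aic_threshold(n))
k_bic = count_retained(eigenvalues, bic_threshold(n))
k_avg = count_retained(eigenvalues, 1.0)  # average eigenvalue rule

print("AIC threshold:", aic_threshold(n), "-> PCs retained:", k_aic)
print("BIC threshold:", bic_threshold(n), "-> PCs retained:", k_bic)
print("Average eigenvalue rule (threshold 1) -> PCs retained:", k_avg)
```

In this made-up example, the BIC threshold lies slightly above 1 and the AIC threshold slightly below 1, so the BIC count matches the average eigenvalue rule while AIC admits one more component.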
Keywords: reduction of dimensionality; principal components; model selection criteria; information criteria; AIC; BIC
JEL-codes: C10
Downloads: https://www.intechopen.com/chapters/81645 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:ito:pchaps:256804
DOI: 10.5772/intechopen.104534
Bibliographic data for series maintained by Slobodan Momcilovic.