TY - JOUR
T1 - Two-phase incremental kernel PCA for learning massive or online datasets
AU - Zhao, Feng
AU - Rekik, Islem
AU - Lee, Seong-Whan
AU - Liu, Jing
AU - Zhang, Junying
AU - Shen, Dinggang
N1 - This work was supported in part by the National Natural Science Foundation of China
(Grant No: 61773244, 61373079, 61572344), the National Institutes of Health in the U.S.A.
(AG041721, MH107815, EB006733, EB008374 and EB009634), and the Provincial Natural
Science Foundation of Shanxi in China (2018JM4018).
PY - 2019/2/11
Y1 - 2019/2/11
N2 - As a powerful non-linear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in batch mode, which causes problems when handling massive or online datasets. To overcome this drawback, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm that incorporates data into KPCA incrementally. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experimental results on both synthesized and real datasets show that the proposed TP-IKPCA produces principal components similar to those of conventional batch-based KPCA but is computationally faster than KPCA and several of its incremental variants. Therefore, our algorithm can be applied to massive or online datasets for which the batch method is not feasible.
AB - As a powerful non-linear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in batch mode, which causes problems when handling massive or online datasets. To overcome this drawback, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm that incorporates data into KPCA incrementally. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experimental results on both synthesized and real datasets show that the proposed TP-IKPCA produces principal components similar to those of conventional batch-based KPCA but is computationally faster than KPCA and several of its incremental variants. Therefore, our algorithm can be applied to massive or online datasets for which the batch method is not feasible.
KW - Kernel principal component analysis (KPCA)
KW - Incremental learning
KW - Big data
KW - Orthonormal basis
UR - http://www.scopus.com/inward/record.url?scp=85062375049&partnerID=8YFLogxK
U2 - 10.1155/2019/5937274
DO - 10.1155/2019/5937274
M3 - Article
SN - 1076-2787
VL - 2019
JO - Complexity
JF - Complexity
M1 - 5937274
ER -