Zhejiang Electric Power (浙江电力)

2017, Vol. 36, No. 260 (12): 16-21


A Concave Self-Representation Based Feature Selection for High-dimensional Data

ZHU Guorong, FENG Hao, YE Lingjie

Abstract:

Feature selection plays a vital role in the big data era: it reduces complexity, compresses storage, and improves the generalization capacity of data analysis. For the vast numbers of unlabeled, high-dimensional samples, unsupervised feature selection is particularly useful for alleviating the curse of dimensionality and has been widely applied. A concave-constrained self-representation method is proposed, in which each feature is represented as a linear combination of the other features and the l_(2,p) norm serves as the regularizer for unsupervised feature selection. Compared with conventional convex regularization, the concave constraint yields sparser coefficient solutions, making the proposed method more effective at selecting a salient feature subspace. To solve for the target coefficients, an efficient iterative reweighted least squares algorithm is further devised, which guarantees convergence of the model to a stationary point. Experiments on nine publicly available datasets show that the proposed method outperforms competing algorithms in both classification accuracy and clustering performance.
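To make the described approach concrete, below is a minimal Python/NumPy sketch of an l_(2,p)-regularized self-representation objective, min ||X - XW||_F^2 + lam * sum_i ||w_i||_2^p, solved by iterative reweighted least squares. The function name, the hyperparameter defaults (lam, p, n_iter, eps), and the ridge-style initialization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def l2p_self_representation_fs(X, lam=0.1, p=0.5, n_iter=50, eps=1e-8):
    """Sketch: concave (l_{2,p}, 0 < p < 1) self-representation feature selection.

    X   : (n_samples, n_features) data matrix
    lam : regularization strength (assumed hyperparameter)
    p   : norm exponent; p < 1 gives the concave, sparsity-inducing penalty
    Returns per-feature saliency scores (row norms of W); larger = more salient.
    """
    n, d = X.shape
    XtX = X.T @ X
    # ridge-like initialization of the self-representation coefficients
    W = np.linalg.solve(XtX + lam * np.eye(d), XtX)

    for _ in range(n_iter):
        # reweighting step: D_ii = (p/2) * ||w_i||_2^(p-2), guarded by eps
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        D = np.diag((p / 2.0) * row_norms ** (p - 2.0))
        # weighted least squares update: (X^T X + lam * D) W = X^T X
        W = np.linalg.solve(XtX + lam * D, XtX)

    return np.linalg.norm(W, axis=1)

# usage (hypothetical data): rank features and keep the top k
# X = np.random.randn(100, 50)
# scores = l2p_self_representation_fs(X)
# top_k = np.argsort(scores)[::-1][:10]
```

With p < 1 the penalty is concave, so the reweighting drives the row norms of W toward zero more aggressively than the convex l_(2,1) norm would; features are then ranked by the remaining row norms and the top-scoring subset is retained.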

Keywords: big data; high-dimensional data; self-representation; feature selection


Foundation: Science and Technology Project of State Grid Zhejiang Electric Power Co., Ltd. (5211JY15001V)

Author(s): ZHU Guorong, FENG Hao, YE Lingjie

DOI: 10.19585/j.zjdl.201712004

