Deep-graph semi-supervised classification with fused sample selection


A semi-supervised classification model for fusion sample selection based on depth map

LI Shun-yong1,2, WEN Nan1, ZHAO Xing-wang3 (1. School of Mathematics and Statistics, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Complex Systems and Data Science of Ministry of Education, Shanxi University, Taiyuan 030006, China; 3. School of Computer and Information Technology, Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China)

Abstract: Traditional supervised learning requires a large number of labeled samples for model training, which makes supervised models difficult to apply to tasks that lack labeled samples. To address this issue, a semi-supervised classification model for fusion sample selection based on depth map (SSC-FSSDM) is proposed. The model consists of two parts: graph structure clustering and semi-supervised classification. In graph structure clustering, unlabeled samples are represented as a high-quality graph structure using Laplacian rank constraints, and the class information of the labeled data is used as prior information to cluster the graph structure and obtain pseudo labels for the unlabeled samples. A sample selection mechanism then picks the reliable samples among the pseudo-labeled ones, reducing the impact of noisy samples on model performance. In semi-supervised classification, the reliable samples and their pseudo labels are used as inputs to a deep learning model that predicts the labels of the unlabeled samples in the original data. The SSC-FSSDM model was tested on three datasets, and various indicators showed that it outperformed other semi-supervised classification models.

Key words: sample selection; graph structure; Laplace; clustering; semi-supervised

0 Introduction

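In practical applications such as text classification, speech recognition, e-mail classification, and computer-aided medical diagnosis, there are large amounts of unlabeled data that must be labeled by hand or through experiments, a process that is time-consuming and laborious. Making full use of these unlabeled data to complete the final label prediction is very important, and semi-supervised learning has therefore attracted growing attention.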

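Semi-supervised learning lies between unsupervised learning and supervised learning. It uses data containing both labeled and unlabeled samples to build a model that labels the unlabeled examples, so that the model generalizes better to "new" data at prediction time.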
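The general idea of labeling unlabeled examples and keeping only the reliable ones can be illustrated with a minimal sketch. It is not the SSC-FSSDM model: as a stand-in for graph structure clustering it assigns pseudo labels by nearest class centroid, and as a stand-in for the sample selection mechanism it keeps the samples with the largest distance margin. The function name `select_pseudo_labels`, the `keep_ratio` parameter, and the toy two-class data are all assumptions made for illustration.

```python
import numpy as np


def select_pseudo_labels(X_lab, y_lab, X_unl, keep_ratio=0.5):
    """Pseudo-label X_unl by nearest class centroid and keep the most confident part.

    Hypothetical helper for illustration only; it assumes at least two classes.
    Returns the indices (into X_unl) of the kept samples and their pseudo labels.
    """
    classes = np.unique(y_lab)
    # One centroid per class, estimated from the few labeled samples.
    centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    # Pairwise Euclidean distances, shape (n_unlabeled, n_classes).
    dists = np.linalg.norm(X_unl[:, None, :] - centroids[None, :, :], axis=2)
    pseudo = classes[np.argmin(dists, axis=1)]
    # Confidence = gap between the nearest and second-nearest centroid.
    sorted_d = np.sort(dists, axis=1)
    margin = sorted_d[:, 1] - sorted_d[:, 0]
    n_keep = max(1, int(keep_ratio * len(X_unl)))
    keep = np.argsort(-margin)[:n_keep]
    return keep, pseudo[keep]


# Toy data: two well-separated Gaussian blobs with three labeled samples each.
rng = np.random.default_rng(0)
n_per_class = 100
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(n_per_class, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 2)),
])
y_true = np.array([0] * n_per_class + [1] * n_per_class)

lab_idx = np.array([0, 1, 2, 100, 101, 102])
unl_idx = np.setdiff1d(np.arange(len(X)), lab_idx)

keep, pseudo = select_pseudo_labels(
    X[lab_idx], y_true[lab_idx], X[unl_idx], keep_ratio=0.5
)
# Ground truth is used here only to check the toy example.
selected_acc = np.mean(pseudo == y_true[unl_idx][keep])
print(f"kept {len(keep)} of {len(unl_idx)} unlabeled samples, "
      f"pseudo-label accuracy on kept samples: {selected_acc:.2f}")
```

In a full pipeline the kept samples and their pseudo labels would be added to the training set of a classifier. Discarding the low-margin samples trades coverage for cleaner labels, which is the motivation the abstract gives for sample selection.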
