Abstract

Machine learning and other data-driven methods for industrial applications have developed rapidly with the advent of industrial big data. However, industrial datasets are often poorly suited to supervised learning, which requires extensive domain knowledge to label datasets completely and accurately. To address this challenge, a semi-supervised learning approach is proposed that makes use of partially labeled subsets. The methodology is applied to high-dimensional in-process measurement data, using a convolutional autoencoder (CAE) for unsupervised feature extraction. A multiclass extension for semi-supervised anomaly diagnosis is also proposed: it uses principal component analysis (PCA) as the basis for anomaly scoring and intersects the results of targeted one-against-all phases on partially labeled sets to classify faults. Experiments in a case study on semiconductor manufacturing measurement data explore the relationship between the extracted latent features and anomaly detection performance. The proposed algorithm achieves a true positive detection rate of over 90% with a false positive rate under 9% for both local and global anomaly types, while reducing the dimensionality of the original input data by over 99%. In addition, the approach identifies positive samples that human experts had previously missed. These results are promising for the application of the proposed semi-supervised methodology in real industrial settings.
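
As a rough illustration of PCA-based anomaly scoring, the following sketch scores samples by their squared reconstruction error outside a PCA subspace fitted on reference samples. It is not the authors' implementation: the synthetic 8-dimensional features stand in for CAE latent features, and the function names, the two retained components, and the 99th-percentile threshold are all assumptions made for illustration. The multiclass one-against-all intersection step is not attempted here.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on reference samples; return the mean and top principal axes."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def anomaly_score(X, mean, components):
    """Squared reconstruction error after projecting onto the PCA subspace."""
    Xc = X - mean
    recon = Xc @ components.T @ components
    return np.sum((Xc - recon) ** 2, axis=1)


# Hypothetical stand-in for latent features: normal samples lie close to a
# 2-D subspace of an 8-D space, anomalies deviate strongly from it.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 8))
normal = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 8))
anomalies = rng.normal(size=(20, 2)) @ basis + 2.0 * rng.normal(size=(20, 8))

# Fit on the reference (normal) samples only.
mean, comps = fit_pca(normal, n_components=2)

# Assumed thresholding rule: 99th percentile of the reference scores.
normal_scores = anomaly_score(normal, mean, comps)
threshold = np.quantile(normal_scores, 0.99)

anomaly_flags = anomaly_score(anomalies, mean, comps) > threshold
false_positive_rate = float((normal_scores > threshold).mean())
true_positive_rate = float(anomaly_flags.mean())
```

In this toy setting the threshold fixes the false positive rate on the reference samples by construction; in practice the number of retained components and the threshold would need to be tuned against whatever labeled samples are available.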
