Blanka Bártová, Vladislav Bína
Early Defect Detection Using Clustering Algorithms
Issue: 1/2019
Journal: Acta Oeconomica Pragensia
DOI: 10.18267/j.aop.613
Keywords: manufacturing, data mining, clustering, product quality, quality management, MICE-CART, VIF
emerging product defects. In contrast to traditional methods, modern and constantly evolving
data mining methods are now used increasingly often. The main objective of this
paper is to detect the potential cause or the area of the production process where the majority of
product defects arise. A dataset from a semiconductor manufacturing process was used
for this purpose. First, the quality of the dataset had to be addressed. Significant multicollinearity
was found in the data; correlations and variance inflation factors (VIF) were used to detect and
remove the collinear variables. Because the original dataset contained more than 5% of randomly
missing values, the MICE-CART method was used for imputation. In further analysis, the K-means
clustering method was used to separate the failed products from the flawless ones. Hierarchical
clustering was then applied to the failed products to create groups of product defects with
similar properties. The optimal number of clusters was determined using the Bayesian information
criterion (BIC). Five clusters of products were formed, although only three can be classed as
important for further analysis. These groups of products should be examined directly in the
production process, which can help identify the source of the defects.
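
The collinearity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, and the cut-off of 10 (a common rule of thumb) are assumptions made for illustration only. The sketch relies on the standard identity that the variance inflation factors of a set of variables are the diagonal elements of the inverse of their correlation matrix.

```python
import numpy as np


def vif_scores(X):
    """Return the VIF of every column of X.

    Uses the identity VIF_j = [inv(R)]_jj, where R is the correlation
    matrix of the columns (equivalent to 1 / (1 - R_j^2)).
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_collinear(X, names, threshold=10.0):
    """Drop the column with the largest VIF until all VIFs <= threshold.

    `threshold` is an assumed rule-of-thumb value, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        vifs = vif_scores(X)
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names


# Synthetic stand-in for sensor readings (hypothetical data):
# column "c" is almost an exact copy of column "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.01 * rng.normal(size=200)
X = np.column_stack([a, b, c])

X_kept, kept = drop_collinear(X, ["a", "b", "c"])
print(kept)
```

On this toy data, one of the two nearly identical columns is removed and the independent column is kept. Dropping one variable at a time and recomputing the VIFs avoids discarding both members of a collinear pair.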