Deep Anomaly Detection via Morphological Transformations

Abstract: The goal of deep anomaly detection is to identify abnormal data by using a deep neural network trained on a normal training dataset. In general, industrial visual anomaly detection problems distinguish normal from abnormal data through small morphological differences, such as cracks and stains. Nevertheless, most existing algorithms focus on capturing semantic rather than morphological features of normal data. Therefore, they perform poorly on real-world visual inspection, even though they show superior results in simulations on representative image classification datasets. To solve this problem, we propose a novel deep anomaly detection method that encourages the network to learn salient morphological features of normal data. The main idea behind our algorithm is to train a multi-class model to classify between dozens of morphological transformations applied to all the given data. To this end, the proposed algorithm uses a self-supervised learning strategy, which makes unsupervised learning straightforward. Additionally, we present a kernel size loss to enhance the morphological feature representation power of the proposed neural network. This objective function is defined as the loss between the predicted kernel size and the label kernel size, computed on images morphologically transformed with the label kernel. In all experiments on the industrial dataset, the proposed method demonstrates superior performance. For instance, in the MVTec anomaly detection task, our model achieves an AUROC of 72.92%, which is 8.74% higher than that of the semantic feature-based deep anomaly detection.


Introduction
Deep anomaly detection means identifying abnormal data via a deep neural network trained on normal instances. It is a significant challenge that has been well studied in various application domains, including video surveillance, disease diagnosis, and visual inspection. In this paper, we tackle the problem of deep anomaly detection in images. The intuition behind most existing methodologies for this problem is to train the deep neural network to understand semantically important features of normal data. Hence, most of these studies [1][2][3] reported superior results on representative image classification datasets (e.g., MNIST [4] and CIFAR-10 [5]), which are composed of clearly distinguishable classes. However, from the point of view of industrial inspection, these existing methodologies are not useful for solving real-world problems. In real-world problems, the criterion that discriminates abnormal data from normal data is usually defined by morphological differences such as cracks, stains, and bends, which cannot be described semantically. For ease of understanding, visual descriptions of both semantic and morphological differences are shown in Figure 1. Both images in Figure 1b are sampled from MVTec [7]. A difference such as that between the "good wood" and "scratched wood" classes is called a morphological difference. A morphological difference does not involve a semantic difference. In other words, instances of both "good wood" and "scratched wood" have the same semantic definition.
In order to utilize morphological features in deep anomaly detection, the proposed method is based on a self-supervised learning algorithm. Self-supervised learning is a form of unsupervised learning in which the training data itself provides the supervision. This learning mechanism uses a proxy loss that drives the deep neural network toward the main goal of the target application. In other words, with this training algorithm, the deep neural network can learn what we care about, such as the semantic difference or the morphological difference. There are several previous self-supervised learning-based deep anomaly detection methods [2,8]. These methods focus on training the deep neural network to understand geometric transformations of normal data, including rotation and translation. In particular, training a deep neural network to classify the rotation angle of normal data is an effective strategy for capturing semantic information of normal data [8]. However, learning geometric transformations through self-supervised learning does not help identify abnormal data in the case shown in Figure 1b.
To mitigate this problem, we propose a novel deep anomaly detection algorithm based on self-supervised learning with morphological transformations, including dilation, erosion, and morphological gradient. The proposed method is based on the observation that industrial anomaly detection requires a morphological understanding of normal data. Therefore, the proposed method is trained on a self-labeled dataset constructed from the normal instances and their morphologically transformed variants, which are produced by various morphological transformations. In the test procedure, the trained neural network takes morphologically transformed test data as input, and the distribution of softmax activations learned on normal training data is used to detect abnormal test data. The intuition behind the proposed method is that, by training the classifier to discriminate between transformed images, it has to learn valuable morphological features.
In this paper, we performed deep anomaly detection experiments on the MVTec dataset [7], which is designed to measure anomaly detection performance in industrial inspection. This dataset contains various industrial defect types (e.g., crack, stain, bent) per class. Additionally, to demonstrate the superior performance of the proposed algorithm in the industrial setting, we compared it with the latest state-of-the-art deep anomaly detection method based on self-supervised learning [2].
In summary, the main contributions of this study are as follows:

• The proposed method achieves superior performance in deep anomaly detection for industrial inspection by training the deep neural network to capture salient morphological features of normal data.

• The proposed algorithm can flexibly adapt to various real-world deep anomaly detection problems by choosing an adequate morphological transformation from image processing.

• Because the proposed methodology utilizes self-supervised learning, it has lower computational complexity than other deep anomaly detection methods, such as reconstruction-based algorithms.

Proposed Method
This section describes the morphological transformations-based deep anomaly detection algorithm, which applies to industrial and real-world anomaly detection problems.

Morphological Image Processing
In digital image processing, mathematical morphology is a tool for extracting image components that are useful in representing and describing region shapes, such as boundaries, skeletons, and the convex hull [9]. The proposed deep anomaly detection method learns morphological features through three representative morphological transformations, namely erosion, dilation, and morphological gradient, which are described in the following subsections.

Erosion and Dilation
The erosion at any location (x, y) of image i by a kernel b is the minimum value of i in the region covered by b when the central point (origin) of b is at (x, y). For instance, if b is a 3 × 3 kernel, obtaining the erosion at a pixel requires finding the minimum of the nine values of i contained in the 3 × 3 region determined by the kernel when its origin is at that point. In equation form, the erosion is defined as:

$$[i \ominus b](x, y) = \min_{(s, t) \in b} \{ i(x + s, y + t) \}$$

Likewise, the dilation of i by b is defined as the maximum value of i among all the values of i contained in the region coincident with b. That is,

$$[i \oplus b](x, y) = \max_{(s, t) \in b} \{ i(x - s, y - t) \}$$

Because erosion computes the minimum pixel value of i in every neighborhood of (x, y) coincident with b, it is expected that the size of bright features in i will be reduced and the size of dark features will be increased. Figure 2b,f show eroded images of normal and abnormal data in the "tile" class of MVTec, respectively. As expected, these figures show that the area of dark features is increased in the eroded examples. Similarly, Figure 2c,g show the result of dilation. The effects are the opposite of those obtained with erosion: the bright features are thickened, and the intensities of the dark features are decreased.
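The min/max definitions above can be illustrated with a minimal NumPy sketch. It assumes a flat rectangular kernel with odd side lengths (so reflecting the kernel has no effect) and pads the border so that padded pixels never win the minimum or maximum. The function names `erode` and `dilate` are illustrative and do not come from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(image, kernel_shape, pad_value):
    """Return every kernel-sized neighborhood of a 2-D image (odd kernel sizes assumed)."""
    kh, kw = kernel_shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)),
                    mode="constant", constant_values=pad_value)
    return sliding_window_view(padded, (kh, kw))


def erode(image, kernel_shape=(3, 3)):
    """Grayscale erosion: minimum over the region covered by a flat kernel."""
    # Pad with the largest value so the border never lowers the minimum.
    return _windows(image, kernel_shape, image.max()).min(axis=(-2, -1))


def dilate(image, kernel_shape=(3, 3)):
    """Grayscale dilation: maximum over the region covered by a flat kernel."""
    # Pad with the smallest value so the border never raises the maximum.
    return _windows(image, kernel_shape, image.min()).max(axis=(-2, -1))


# A single bright pixel on a dark background.
img = np.zeros((5, 5))
img[2, 2] = 1.0
dilated = dilate(img)  # the bright pixel grows into a 3 x 3 block
eroded = erode(img)    # the bright pixel disappears
```

In this toy case, dilation thickens the bright feature and erosion removes it, which mirrors the behavior described for Figure 2.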

Morphological Gradient
To obtain the morphological gradient of an image, dilation and erosion can be used in combination with image subtraction. In this paper, this operation is denoted as follows:

$$(i \oplus b) - (i \ominus b)$$

Because the dilation thickens regions in an image and the erosion shrinks them, the difference between them highlights the boundaries between areas. The result is an image in which the edges are emphasized and the homogeneous regions are suppressed, producing a "derivative-like" (gradient) effect. Figure 2d,h show morphological gradient images of normal and abnormal data, respectively. In Figure 2h in particular, it can be seen that this morphological transformation emphasizes the cracked area.
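As a small illustration of this "dilation minus erosion" operation, the following NumPy sketch computes the gradient with a flat rectangular kernel. The edge-replicating padding and the function name `morphological_gradient` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def morphological_gradient(image, kernel_shape=(3, 3)):
    """Dilation minus erosion with a flat kernel (odd kernel sizes assumed)."""
    kh, kw = kernel_shape
    # Edge-replicate so padded pixels never introduce new extremes.
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = sliding_window_view(padded, (kh, kw))
    return windows.max(axis=(-2, -1)) - windows.min(axis=(-2, -1))


# A flat image with a vertical step edge between columns 2 and 3.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
gradient = morphological_gradient(step)
```

For this step image, the response is nonzero only in the two columns adjacent to the edge, while the homogeneous regions on either side map to zero.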

Deep Anomaly Detection via Morphological Transformations
The proposed algorithm aims to teach the deep neural network the morphological features of normal data through a self-supervised learning strategy. To achieve this goal, we propose to train a deep neural network F to discriminate the type of morphological transformation applied to the image given to it as input. Specifically, we define a set of $N_1$ discrete morphological transformations, $N_2$ discrete values for kernel width, and $N_3$ discrete values for kernel height. In other words, the proposed self-labeled dataset is a multi-class dataset that consists of $N_1 N_2 N_3$ classes. For clarity, we denote a kernel b of size $n_2 \times n_3$ as $b_{n_2, n_3}$. Thus, we define a set of $N_1 N_2 N_3$ discrete morphological transformations as follows:

$$\{ g(\cdot \mid n_1, n_2, n_3) \}_{n_1 = 1,\, n_2 = 1,\, n_3 = 1}^{N_1,\, N_2,\, N_3}$$

where $g(\cdot \mid n_1, n_2, n_3)$ denotes the operator that applies to image i the morphological transformation with multi-class label $\{n_1, n_2, n_3\}$, producing the transformed image $i^{n_1, n_2, n_3} = g(i \mid n_1, n_2, n_3)$.
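The construction of the self-labeled dataset can be sketched as follows. This is a minimal illustration under stated assumptions: three transformations and two kernel widths and heights are chosen arbitrarily (the paper's actual values are not given here), labels are zero-based rather than one-based, and the names `TRANSFORMS`, `KERNEL_WIDTHS`, `KERNEL_HEIGHTS`, and `self_labeled_dataset` are hypothetical.

```python
import itertools

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical choices: N_1 = 3 transformations, N_2 = N_3 = 2 kernel sizes.
TRANSFORMS = ("erosion", "dilation", "gradient")
KERNEL_WIDTHS = (3, 5)
KERNEL_HEIGHTS = (3, 5)


def g(image, n1, n2, n3):
    """Apply transformation n1 with a flat kernel of width index n2 and height index n3."""
    kh, kw = KERNEL_HEIGHTS[n3], KERNEL_WIDTHS[n2]
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = sliding_window_view(padded, (kh, kw))
    lo, hi = windows.min(axis=(-2, -1)), windows.max(axis=(-2, -1))
    return {"erosion": lo, "dilation": hi, "gradient": hi - lo}[TRANSFORMS[n1]]


def self_labeled_dataset(normal_images):
    """Yield (transformed image, flat class index, (n1, n2, n3)) for every normal image."""
    shape = (len(TRANSFORMS), len(KERNEL_WIDTHS), len(KERNEL_HEIGHTS))
    for image in normal_images:
        for label in itertools.product(*(range(n) for n in shape)):
            yield g(image, *label), int(np.ravel_multi_index(label, shape)), label


rng = np.random.default_rng(0)
samples = list(self_labeled_dataset([rng.random((8, 8))]))
```

Each normal image yields one sample per class, so a single image here produces 3 × 2 × 2 = 12 labeled variants. The flat class index is one possible way to encode the label triple for a single softmax classifier.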
The deep neural network F takes a transformed image $i^{n_1, n_2, n_3}$ as input. It then produces a softmax probability distribution over all possible morphological transformations, which is denoted as follows:

$$F(i^{n_1, n_2, n_3} \mid \theta) = \{ F^{n_1^*, n_2^*, n_3^*}(i^{n_1, n_2, n_3} \mid \theta) \}_{n_1^* = 1,\, n_2^* = 1,\, n_3^* = 1}^{N_1,\, N_2,\, N_3}$$

where $F^{n_1^*, n_2^*, n_3^*}(i^{n_1, n_2, n_3} \mid \theta)$ is the predicted probability for the morphological transformation with label $\{n_1^*, n_2^*, n_3^*\}$ and θ denotes the parameters of F. Consequently, the proposed objective function is defined over the predictions for the transformation type $n_1$, the kernel width $n_2$, and the kernel height $n_3$. Through this formulation, we force the deep neural network to learn morphological features of normal images by predicting both the transformation type and the kernel size simultaneously. Specifically, training the network to predict the kernel size encourages the proposed algorithm to learn morphological features that are useful in real-world industrial deep anomaly detection. The overall architecture of the proposed method is presented in Figure 3.
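One way to turn the softmax outputs into an anomaly score is sketched below. This is an assumption for illustration rather than the paper's exact scoring rule: it follows the idea, stated in the caption of Figure 3, that a high prediction error on transformed test data indicates an anomaly, and it measures that error as the mean cross-entropy over all transformed variants of one test image. The `logits` input is a hypothetical stand-in for the raw outputs of the trained network F.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def anomaly_score(logits):
    """Mean cross-entropy between predictions and the applied transformation labels.

    logits[k] holds the raw class scores of the network for the test image
    transformed with class k, so the correct class of row k is k itself.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    true_class_probs = np.diag(probs)
    # The small constant guards against log(0) for fully wrong predictions.
    return float(-np.log(true_class_probs + 1e-12).mean())


# Hypothetical outputs for 4 classes: a confident network versus an undecided one.
confident = 10.0 * np.eye(4)
undecided = np.zeros((4, 4))
low_score = anomaly_score(confident)
high_score = anomaly_score(undecided)
```

Under this scoring rule, an image whose transformations the network recognizes easily receives a low score, while an image that leaves the network undecided receives a score near log K for K classes.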

Deep Anomaly Detection on Industrial Dataset
In Table 1, we present the overall experimental results of the proposed method on the representative industrial anomaly dataset, MVTec [7]. The results show that the proposed self-supervised learning, which is designed to capture salient morphological features of normal data, achieves better performance than the semantic feature-based deep anomaly detection. Interestingly, in the comparison between the three types of the proposed method, the type 1 model converges faster than the other cases but produces the lowest performance. This observation implies that an easy self-labeled dataset in self-supervised learning does not lead the deep neural network toward the intended features. This interpretation is further supported by the experimental results for the type 3 case. Overall, these experimental results indicate that utilizing morphological image features improves performance in real-world industrial problems. The proposed method can also verify anomalies with a single network inference, which takes about 0.0125 s. In other words, it has low computational complexity.
Table 1. Comparison of AUROC (area under the receiver operating characteristic curve, %) performance between [2] and the proposed algorithm.
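For readers unfamiliar with the metric, AUROC can be computed directly from anomaly scores as the probability that a randomly chosen abnormal sample scores higher than a randomly chosen normal one. The sketch below is a generic pairwise implementation, not the evaluation code used for Table 1. The function name `auroc` and the convention that label 1 marks abnormal samples are assumptions of this example.

```python
import numpy as np


def auroc(scores, labels):
    """AUROC where label 1 marks abnormal samples and higher scores mean more abnormal."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every abnormal score with every normal score; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Two normal samples (label 0) followed by two abnormal samples (label 1).
value = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A value of 1.0 means every abnormal sample is ranked above every normal one, and 0.5 corresponds to a random ranking. Multiplying by 100 gives the percentage form used in the table caption.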

Figure 1. Visual description of semantic and morphological differences in images: (a) Semantic difference: both images are sampled from the cats-and-dogs dataset [6]. A difference such as that between the "cat" and "dog" classes is called a semantic difference. Generally, a semantic difference involves both semantic and morphological differences. (b) Morphological difference: both images are sampled from MVTec [7]. A difference such as that between the "good wood" and "scratched wood" classes is called a morphological difference. A morphological difference does not involve a semantic difference. In other words, instances of both "good wood" and "scratched wood" have the same semantic definition.

Figure 2. Morphologically transformed images in the "tile" class of MVTec [7]: (a) normal image; (b) eroded normal image; (c) dilated normal image; (d) morphological gradient of normal image; (e) abnormal image; (f) eroded abnormal image; (g) dilated abnormal image; (h) morphological gradient of abnormal image.

Figure 3. The proposed deep anomaly detection method discriminates abnormal data using the morphological features of normal data acquired during training. Therefore, if a given morphologically transformed input produces a high prediction error, it can be considered abnormal.