Automatic Defect Segmentation on Leather with Deep Learning

Sze-Teng Liong stliong@fcu.edu.tw Y.S. Gan ysgn88@gmail.com Yen-Chang Huang yenchang.huang1@gmail.com Chang-Ann Yuan cayuan@fcu.edu.tw Hsiu-Chi Chang aspirine923768@gmail.com Department of Electronic Engineering, Feng Chia University, Taichung, Taiwan Research Center for Healthcare Industry Innovation, NTUNHS, Taipei, Taiwan Department of Mechanical and Computer-Aided Engineering, Feng Chia University, Taichung, Taiwan
Abstract

Leather is a natural and durable material created by tanning the hides and skins of animals. The price of leather is subjective, as it is highly sensitive to the quality of the material and the condition of its surface defects. In the literature, very few works investigate defect detection for leather using automatic image processing techniques. Manual defect inspection is essential in the leather production industry to control the quality of the finished products. However, it is tedious: it is labour intensive, time consuming, causes eye fatigue and is prone to human error. In this paper, a fully automatic defect detection and marking system for calf leather is proposed. The proposed system consists of a piece of leather, an LED light, a high-resolution camera and a robot arm. Succinctly, a machine vision method is presented to identify the position of the defects on the leather using a deep learning architecture. A series of processes is conducted to predict the defect instances, including acquisition of the leather images with a robot arm, training and testing on the images using a deep learning architecture, and determination of the defect boundaries using a mathematical derivation of the geometry. Note that none of the processes involves human intervention, except for the construction of the defect ground truths. The proposed algorithm achieves 91.5% segmentation accuracy on the training data and 70.35% on the test data. We also report the confusion matrix, F1-score, precision, specificity and sensitivity to further verify the effectiveness of the proposed approach.

keywords:
Defect, tick bites, segmentation, geometry, robot arm, Mask R-CNN

1 Introduction

Hides (the skins of large animals, e.g., cows) and skins (normally of small animals, e.g., sheep) are mostly by-products of slaughterhouses. Much of the supply comes from the USA, Brazil and Europe, since they are large producers of beef. Normally, three major steps are carried out in a leather factory: sorting, chemical processing and physical processing. The raw materials are categorized by the number of defects, such as tick bites and scars, which degrade the quality and subsequently require more processing. The materials are then tumbled with specific chemical substances that convert the hides or skins into leather, giving it the superior characteristics of being soft, pliable, and resistant to water and putrefaction. Finally, the leathers are stretched and trimmed to bring out their velvet appearance. The leathers are then sold to leather manufacturing companies to produce high-end leather goods, like bags, shoes and jackets. These companies usually carry out a few rounds of defect sorting and classification by severity, according to how visible each region will be on the finished goods.

One of the earliest research works on leather was carried out by Yeh and Perng yeh2001establishing. They defined a reference standard to classify leather into several grade levels based on the amount of usable area that is eligible for the manufacturing process. As the price of every piece of leather hinges on its grade, such a guideline is established to minimize disputes in each trade transaction. In fact, protracted negotiation always incurs additional cost and argument between suppliers and purchasers. During the defect detection process in yeh2001establishing, a few experienced experts manually annotate and mark the defect regions using a software package, Adobe Photoshop 4.0 photoshop. This process, which requires human effort, is not reliable as it is highly dependent on the individual. Thus, a machine vision technique is necessary to reduce the cost (i.e., workload and time) of the defect annotation task.

To date, most defect detection and annotation procedures in industry are still carried out by highly trained inspectors. In general, there are two visual inspection automation tasks: classification and instance segmentation. The former categorizes the type of defect on the leather, such as cuts, tick bites, wrinkles, scabies and others, whereas the latter localizes the defect region and at the same time annotates the type of defect. Many previous works focus on leather defect classification. In contrast, relatively few researchers predict the precise position of the defects. It is also worth noting that in many of the reported experiments, a test image contains only one type of defect, or the method only identifies the presence of a single defect in a sample image wong2018computer; tong2016differential; villar2011new.

Kwon et al. kwon2004development propose a framework to identify several defect types (i.e., hole, pin hole, scratch and wrinkle) based on the histogram of the pixel intensity values. They find that the pixel intensities of non-defective leather should follow a normal (Gaussian) distribution. For hole defects, the pixel distribution is usually concentrated at the brighter end (i.e., close to a pixel value of 255). In contrast, pin holes have much darker pixels (i.e., close to a pixel value of 0). Defects like scratches and wrinkles normally present patterns that are distinct from the normal distribution. Then, the grade of the leather (i.e., A, B or C) is determined from the analysis result, which refers to the density and the number of defects extracted.
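The histogram reasoning above can be illustrated with a minimal sketch. It is not the implementation of kwon2004development: the synthetic patches, the helper names `intensity_histogram` and `tail_mass`, and the tail thresholds (50 and 205) are all assumptions chosen only to show how a dark, pin-hole-like region shifts histogram mass toward 0.

```python
import numpy as np

def intensity_histogram(gray, bins=256):
    """Return the normalized histogram of an 8-bit grayscale image."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist / hist.sum()

def tail_mass(hist, dark_max=50, bright_min=205):
    """Fraction of pixels in the dark and bright tails (thresholds are assumptions)."""
    return hist[:dark_max + 1].sum(), hist[bright_min:].sum()

rng = np.random.default_rng(0)
# Synthetic "non-defective" patch: intensities roughly Gaussian around mid-grey.
clean = np.clip(rng.normal(128, 12, size=(64, 64)), 0, 255).astype(np.uint8)
# Synthetic patch with a dark 10 x 10 blob standing in for a pin-hole-like defect.
defect = clean.copy()
defect[20:30, 20:30] = 5

dark_clean, bright_clean = tail_mass(intensity_histogram(clean))
dark_defect, bright_defect = tail_mass(intensity_histogram(defect))
print(f"dark-tail mass: clean={dark_clean:.4f}, defect={dark_defect:.4f}")
```

In this toy setting the dark-tail mass of the defective patch is larger than that of the clean patch; a bright blob would raise the bright-tail mass in the same way.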

On the other hand, an image processing technique, fuzzy logic, is employed in krastev2004leather to analyze the feature set of the leather images for surface defect recognition. The leather image is first loaded in grey level and represented using a histogram. Specifically, a few statistical features such as the histogram range, histogram position, median, mean, variance, energy, entropy, contrast, etc. are calculated with their maximal, minimal and average values. However, the sample size is small (i.e., images) and the experimental procedure is ambiguous. For instance, the paper does not explain the experiment configuration or the distribution of the training and testing sets used to evaluate the proposed algorithm.
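As an illustration of the histogram statistics listed above, the following sketch computes a few of them with common textbook definitions. These definitions, and the helper name `histogram_features`, are assumptions; the exact formulas used in krastev2004leather may differ.

```python
import numpy as np

def histogram_features(gray):
    """Compute simple first-order statistics from the grey-level histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                      # probability of each grey level
    levels = np.arange(256)
    mean = float((levels * p).sum())
    variance = float((((levels - mean) ** 2) * p).sum())
    energy = float((p ** 2).sum())             # also called uniformity
    nonzero = p[p > 0]
    entropy = float(-(nonzero * np.log2(nonzero)).sum())  # in bits
    occupied = np.flatnonzero(p)               # grey levels that actually occur
    return {
        "mean": mean,
        "variance": variance,
        "energy": energy,
        "entropy": entropy,
        "range": int(occupied[-1] - occupied[0]),
    }

# Toy image: half the pixels are black, half are white.
toy = np.array([[0, 0], [255, 255]], dtype=np.uint8)
print(histogram_features(toy))
```

For the toy image, two equally likely grey levels give an entropy of 1 bit and an energy of 0.5, which is a quick sanity check on the definitions.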

An automated machine vision system to detect a few defect types (i.e., open cut, closed cut and fly bite) is introduced by Villar et al. villar2011new. They utilize seven popular feature descriptors (i.e., first-order statistics, contrast characteristics, Haralick descriptors, Fourier and cosine transforms, Hu moments with intensity information, local binary patterns and Gabor features) and a selection method to dynamically reduce the feature size. Then, a multilayer perceptron neural classifier is adopted to categorize the type of defect. An overall classification accuracy of 94% on the test images is obtained. Note that the training and testing datasets comprise a total of approximately 1800 sample images of 40 × 40 spatial resolution.

A similar defect categorization work is conducted by Pistori et al. pistori2018defect. In particular, they aim to distinguish four types of defects: tick marks, cuts, scabies and brand marks made by hot iron, on both raw hide and wet blue leather. The former has a more complex exterior with varied surface properties (i.e., textures, colors, shapes, thickness and even serious defects), whereas the latter is a common type of leather that has undergone a tanning process and whose defects appear more noticeable to both human and machine visual inspection. The features of the images are extracted using a popular texture analysis technique, namely the Gray-Level Co-occurrence Matrix (GLCM) jobanputra2004texture; singh2002spatial. The proposed method is validated on a pre-built dataset comprising images from 258 pieces of raw hide and wet blue leather with 17 different defect types. As a result, a perfect classification result (i.e., 100%) is achieved using a Support Vector Machine (SVM) suykens1999least classifier.

Another leather classification work is conducted by Pereira et al. pereiraclassification, who attempt to reduce the discrepancies between specialists' assessments of goat leather quality in order to increase the productivity of quality classification using a feature extractor and a machine learning classifier. The proposed algorithm includes the processes of image acquisition, image preprocessing, feature extraction and machine learning classification. In brief, a new feature extraction approach called Pixel Intensity Analyzer (PIA) is introduced, which emerges as the most cost-effective method for the problem when used together with an Extreme Learning Machine (ELM) classifier huang2006extreme. However, the details of the experimental setup are absent, so it may be difficult to reproduce the framework and compare it to other works.

A recent work is carried out by Winiarti et al. winiarti2018pre, who classify five types of leather by employing two types of feature extractors: handcrafted feature descriptors and a deep learning architecture. The leather types are monitor lizard, crocodile, sheep, goat and cow. For the handcrafted representation, a fusion of statistical color features (i.e., mean, standard deviation, skewness and kurtosis) and statistical texture features (i.e., contrast, energy, correlation, homogeneity and entropy) is adopted. A pre-trained AlexNet is exploited as the deep learning structure. The classification performance suggests that the deep learning method can better capture the characteristics of the leather, exhibiting an overall accuracy of 99.97%. Note that no defect classification is involved in this paper, as all the data are non-defective images.

For the defect localization and segmentation tasks, one of the pioneering research works is conducted by Lovergine et al. lovergine1997leather. They detect and determine the defective areas using a black and white CCD camera. Then, a morphological segmentation giardina1988morphological; lee1996mathematical process is applied to the collected images to extract the texture orientation features of the leather. A few qualitative results are shown in the paper; however, quantitative methods and numerical data to evaluate the proposed algorithm are absent.

Later, Lanzetta and Tantussi lanzetta2002design suggest a laboratory prototype for trimming the external part of a leather. The leather sample images are processed to determine the background and the defective areas by using binarization, opening and Laplacian mask methods to find the trimming path. The proposed defect detection system successfully identifies most of the defects on distinct leather types. However, surface finish and color are still the main factors that could influence the outcome of inspection. Since there is no practical implementation of the proposed prototype, qualitative and quantitative performances are not shown in the paper.

A 6-step inspection method has been proposed by Bong et al. bong2018vision to detect leather defects (i.e., scars, scratches and pinholes) and predict their exact size, shape and position. An image grabbing system is built such that the leather can be captured using a static camera and a consistent light source. Then a series of image processing techniques is applied to the captured images to obtain the defective areas, including color space conversion, Gaussian thresholding, Laplacian detection, median blurring, defect feature extraction (i.e., color moments, color correlogram, Zernike moments, texture features) and SVM classification. The ratio of training to evaluation data for the images that contain defects is about 7.5 : 2.5. The proposed method achieves an average accuracy of 98.8% in detecting a single defect in every image.

The goal of this study is to introduce an automatic defect identification system to segment irregular regions of a specific defect type, viz., tick bites. This type of defect appears as tiny surface damage on the animal skin and is often overlooked in human inspection. A sample defective image is shown in Figure 1. An instance segmentation deep learning model, namely, Mask Region-based Convolutional Neural Network (Mask R-CNN), is utilized to develop a robust architecture to evaluate the test dataset. Then, the details of the defective regions (i.e., a set of coordinates) are transferred to a robotic arm to automatically mark the boundary of the defect area.

Figure 1: A sample defective image

In short, the contributions of this work are summarized as follows:

  1. Proposal of an end-to-end defect detection and segmentation system using a deep convolutional neural network (CNN).

  2. Use of a robotic arm for automatic dataset collection and defect marking on the leather.

  3. Acquisition of a set of optimal coordinates of each irregular shape of defect using mathematical derivation of the geometry.

  4. Thorough experiments on approximately 80 training images and 500 testing images, with several performance metrics presented to demonstrate the effectiveness of the proposed framework.

The rest of the paper is organized as follows: Section 2 describes the experimental setup and the proposed algorithm in detail. The subsequent sections explain the measurement metrics and the experiment settings, report and discuss the experimental results, and conclude the paper.

2 Proposed Method

The proposed automated visual defect inspection system comprises six stages: 1) dataset collection using a robotic arm to capture top-down leather images; 2) manual ground truth annotation of each defect in all the images; 3) training of the deep learning architecture and fine-tuning of the parameters of the trained model; 4) testing of images with the trained model; 5) acquisition of a set of coordinates for each defect; 6) defect highlighting with chalk using the robotic arm. An overview of the system architecture is illustrated in Figure 2. The following subsections describe each of the six steps.
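To make stage 5 concrete, the sketch below shows one simple way to turn a binary defect mask into an ordered set of boundary coordinates. It is not the geometric derivation used in this paper, which is described separately. The helper name `boundary_coordinates`, the 4-neighbour boundary rule and the angular ordering around the centroid (reasonable only for roughly convex or star-shaped regions) are all assumptions for illustration, and the mask here is synthetic rather than a Mask R-CNN output.

```python
import numpy as np

def boundary_coordinates(mask):
    """Return (row, col) boundary pixels of a binary mask, ordered by angle
    around the mask centroid (assumes a roughly star-shaped region)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior if all four 4-neighbours are also inside the mask.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~interior
    rows, cols = np.nonzero(boundary)
    if rows.size == 0:
        return np.empty((0, 2), dtype=int)
    all_rows, all_cols = np.nonzero(mask)
    centre_row, centre_col = all_rows.mean(), all_cols.mean()
    order = np.argsort(np.arctan2(rows - centre_row, cols - centre_col))
    return np.column_stack([rows[order], cols[order]])

# Synthetic 3 x 3 "defect" inside a 7 x 7 image.
demo_mask = np.zeros((7, 7), dtype=bool)
demo_mask[2:5, 2:5] = True
print(boundary_coordinates(demo_mask))
```

For the 3 x 3 square, the eight outer pixels are returned and the centre pixel is excluded. In a real setup the pixel coordinates would still need to be mapped into the robot's workspace, which this sketch does not attempt.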

Figure 2: Flowchart of the proposed leather defect detection and segmentation framework

2.1 Dataset collection using robotic arm

The apparatus involved in the dataset acquisition consists of a six-axis desktop robotic arm, a high-resolution camera, a non-flickering LED light source and 3D-printed plastic components. The leather is placed on a table, and a 2D camera mounted on the robot arm captures the details of the leather from a top-down viewpoint, as illustrated in Figure 3. The experiments are carried out on a six-axis articulated robot, the DRV70L from Delta. The placement of the robot has been optimized to reach the maximum range of leather detection, using the commercial software package Tecnomatix Process Simulation. The robot payload is 5 kg and the weight of the camera is about 1 kg. All the movements of the robot are programmed in the robot language of Delta Robot Automation Studio (DRAS), such that it moves to a few specific pre-configured positions to automatically capture multiple image patches. The control code can be transferred to the robot controller by a direct Ethernet link and runs independently of DRAS during operation. To improve image capture stability during the robot movement, the holding tool for the camera has been optimized. The images are captured using a Canon EOS 77D, and the detailed settings are described in Table 1. The fluorescent lights in the laboratory operate on alternating current (50 Hz) and produce light flickering, which leads to undesirable shadows, reflectance and variable illumination. Thus, a professional lighting source (i.e., DOF D1296 Ultra High Power LED light of 12400 lux) is used to provide a consistent and continuous source of illumination. In particular, the light source is fixed at 45 degrees from the leather, aiming downward.
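A small calculation indicates why mains-powered flicker matters at short exposure times. The sketch assumes a simplified model that is not taken from this paper: the lamp intensity is modulated sinusoidally at twice the mains frequency, and the camera averages the light over one exposure. Under that model, the fraction of the modulation that survives the averaging is |sin(x)/x| with x = π · f_flicker · T. The function name `flicker_residual` is hypothetical.

```python
import math

def flicker_residual(mains_hz, exposure_s):
    """Fraction of a sinusoidal flicker modulation that survives averaging
    over one exposure (simplified model, see the assumptions above)."""
    flicker_hz = 2.0 * mains_hz  # assumed: intensity peaks twice per AC cycle
    x = math.pi * flicker_hz * exposure_s
    return abs(math.sin(x) / x)

# Compare a few exposure times under 50 Hz mains.
for denominator in (200, 100, 60):
    residual = flicker_residual(50, 1 / denominator)
    print(f"exposure 1/{denominator} s -> residual {residual:.3f}")
```

Under these assumptions, an exposure of 1/200 s keeps roughly 64% of the modulation, 1/60 s keeps roughly 17%, and 1/100 s (one full flicker period) cancels it almost entirely. A continuous light source avoids this frame-to-frame brightness variation regardless of the exposure setting.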

Figure 3: Hardware prototype of robot arm and lighting to capture the local image of leather

Feature               Description
Megapixel             24.2
Resolution (pixels)   2400 × 1600
Color representation  sRGB
Frame rate (fps)      60
Shutter Speed (s)     1/60
Exposure time ()      1/200
ISO speed             ISO - 1600
Focal length (mm)     135
Flash mode            No flash

Table 1: Camera specifications and configurations