Smart factories, Industry 4.0, and AI in manufacturing have become areas of growing interest across industries and company sizes as a way to improve work processes and increase operational efficiency. How much money do we need to develop and implement AI, and how long will it take? What type of data do we need, and how much? What problems can we solve with AI? There are still many unknowns about Industrial AI. Success stories about the benefits of AI are easy to find, but guidance for organizations that want to implement AI themselves is limited.
Workflow automation. Anomaly detection. Preventative maintenance of equipment. Production planning. These are only a few examples of how AI is already being used in manufacturing. Yet without sufficient data and clear preparation, PoC (Proof of Concept) projects cannot develop into mass-production solutions.
Just as humans find some tasks simple and others complicated, AI tasks come in several levels of difficulty. Model performance, reliability, and task difficulty all depend heavily on the training data provided. The less data provided, the less information the AI model has to learn from, which raises the difficulty. A high-performance model therefore requires data that fits the purpose of the model.
Consider a company that manufactures ‘Product B’ and holds 1B data entries on the sales and revenue history of ‘Product B’ by region and date. Judging this to be more than enough data for a sales prediction AI model, the company consulted an AI specialist to implement the model. However, the AI specialist responded, “You do not have enough data.” To the company, it made no sense that 1B labeled data entries would not be enough. We will explore what “insufficient data” actually means.
Improper or insufficient problem definition
Illogical data compilation
A typical manufacturing site, even without AI, has its own definition of and system for classifying products as “normal” or “defective.” However, most sites do not track defective products by specific defect type. In addition, current visual inspection and machine vision technology does not have a high accuracy rate, so defective products can be misjudged as normal, and normal products as defective. Yet accurately labeled data is crucial for developing a high-performance AI solution.
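One way to make the labeling problem concrete is to re-inspect a small sample of parts carefully and compare the result with the labels the line originally assigned. The sketch below is a minimal illustration under assumed conditions: the function name `label_error_rates`, the label strings, and the sample data are all hypothetical, and a trusted re-audit of the same parts is assumed to exist.

```python
def label_error_rates(inspection_labels, audited_labels):
    """Compare line-inspection labels against a trusted re-audit of the same parts.

    Both arguments are lists of "normal" / "defective" strings (hypothetical format).
    """
    if len(inspection_labels) != len(audited_labels):
        raise ValueError("label lists must have the same length")

    missed_defects = 0  # audited defective, but inspected as normal
    false_alarms = 0    # audited normal, but inspected as defective
    true_defects = 0
    true_normals = 0

    for inspected, audited in zip(inspection_labels, audited_labels):
        if audited == "defective":
            true_defects += 1
            if inspected == "normal":
                missed_defects += 1
        else:
            true_normals += 1
            if inspected == "defective":
                false_alarms += 1

    return {
        "missed_defect_rate": missed_defects / true_defects if true_defects else 0.0,
        "false_alarm_rate": false_alarms / true_normals if true_normals else 0.0,
    }


# Made-up sample: six parts, labeled once on the line and once by a careful audit.
inspection = ["normal", "normal", "defective", "normal", "defective", "normal"]
audit = ["normal", "defective", "defective", "normal", "normal", "normal"]
rates = label_error_rates(inspection, audit)
print(rates)
```

If either rate is high, the existing labels are probably too noisy to train on directly and may need to be cleaned first.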
Availability of normal data
To detect defects and anomalies, it is obvious that we need data that is considered “abnormal.” By the same logic, we first need to teach the model what “normal” data looks like. Machine learning networks are built to follow human reasoning in this respect. There are various defect patterns, such as cracks or contaminants on a silicon wafer. Yet the prerequisite we often forget is knowing what a cleanly produced silicon wafer looks like.
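The idea of learning “normal” first can be sketched with a very simple baseline. The example below assumes a single numeric measurement per part (a hypothetical thickness value standing in for real wafer images) and flags anything far from the profile learned on normal parts only. The function names, the sample values, and the 3-sigma threshold are illustrative assumptions, not a recommended production method.

```python
from statistics import mean, stdev


def fit_normal_profile(normal_values):
    """Learn what "normal" looks like: mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the normal mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


# Hypothetical measurements taken from parts known to be normal.
normal_thickness = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1]
profile = fit_normal_profile(normal_thickness)

# Two new parts: one close to the normal profile, one far from it.
flags = [is_anomalous(v, profile) for v in [10.05, 11.5]]
print(flags)
```

No defective samples were used to fit the profile. The quality of the “normal” data alone determines what the check can catch, which is why sufficient normal data matters.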