Unbalanced Datasets? No Problem

The impact of artificial intelligence is clear, with research predicting that AI could contribute up to $15.7 trillion to the global economy in 2030. The technology is used across industries, from manufacturing to robotics, for a range of applications, and it delivers benefits such as time and cost savings to a variety of users.

However, we have also learned that these gains are not automatic. The performance of AI depends heavily on the quality and quantity of the data powering it. As with a recipe, getting the proportions of the ingredients (here, the data) right can make or break your application.

The case for ‘good’ data

Let’s look at visual inspection in industrial manufacturing as an example of how much the balance between good and bad examples in a dataset matters. In a factory setting, industrial machines are designed to output high-quality, reliably “good” products. However, machines and raw materials are not perfect, so a small fraction of products with some defect will likely end up on the production line. If a manufacturer wants to detect those defective products, the AI system needs to learn that they are, in fact, “bad.” This presents a challenge, because the training data will typically consist of 90-99% or more good images and only a tiny number of bad ones. Most AI systems have a hard time accurately identifying the bad parts when the ratio of good to bad images can be many thousands to one.
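To make the problem concrete, here is a minimal sketch in Python. The 990/10 split and the always-predict-good "model" are invented for illustration and do not come from any real production line. It shows why a headline accuracy figure can hide a model that never catches a single defect.

```python
def evaluate(labels, predictions, positive="bad"):
    """Return (overall accuracy, recall on the positive class)."""
    pairs = list(zip(labels, predictions))
    correct = sum(1 for y, p in pairs if y == p)
    positives = sum(1 for y in labels if y == positive)
    caught = sum(1 for y, p in pairs if y == positive and p == positive)
    accuracy = correct / len(labels)
    recall = caught / positives if positives else 0.0
    return accuracy, recall


# Hypothetical inspection set: 990 good images and 10 bad ones (99% good).
labels = ["good"] * 990 + ["bad"] * 10

# A trivial "model" that always predicts the majority class.
always_good = ["good"] * len(labels)

accuracy, bad_recall = evaluate(labels, always_good)
print(f"accuracy={accuracy:.2%}, recall on bad parts={bad_recall:.2%}")
# prints: accuracy=99.00%, recall on bad parts=0.00%
```

In other words, on data this skewed a model can look 99% accurate while missing every defective part, which is why recall on the bad class is usually the more telling number for inspection.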

The Solution

While conventional AI and deep learning algorithms are not equipped to face these real-world unbalanced dataset scenarios, an emerging class of techniques known as Lifelong Deep Neural Networks (L-DNN) is. L-DNN is a technology that reduces the data requirements for AI model development and enables continuous learning in the cloud or at the edge. With its help, the industry can get closer to a solution in which models are trained on less data and learn incrementally as they encounter new scenarios, such as those that arise on the production line.

This ability to learn continuously over time, inspired by biological brains, is the defining characteristic of this new breed of AI. L-DNNs can form a ‘prototype’ of what is good by simply observing a few example images. This means that, rather than starting from scratch each time you want to improve the AI or teach it something new, as you would with conventional AI, you can continue to train it incrementally. And, with the ability to learn at the edge, L-DNN also has the potential to improve edge-friendly, AI-powered visual inspection for drones, manufacturing and robotics.
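The article does not describe how L-DNN works internally, so the following is not L-DNN itself. It is only a loose sketch of the general idea of a ‘prototype’ of good that is learned from a few examples and updated incrementally. It assumes each image has already been turned into a numeric feature vector by some upstream feature extractor; the class name GoodPrototype, the two-dimensional vectors and the threshold are all hypothetical.

```python
import math


class GoodPrototype:
    """Running-mean prototype of 'good' feature vectors (illustrative only)."""

    def __init__(self, dim):
        self.mean = [0.0] * dim
        self.count = 0

    def learn(self, features):
        # Incremental mean update: earlier examples never need to be revisited.
        self.count += 1
        for i, x in enumerate(features):
            self.mean[i] += (x - self.mean[i]) / self.count

    def distance(self, features):
        # Euclidean distance between the prototype and a new example.
        return math.dist(self.mean, features)

    def is_good(self, features, threshold):
        # Anything far from the prototype is flagged as a possible defect.
        return self.distance(features) <= threshold


# Learn from only three hypothetical "good" examples.
proto = GoodPrototype(dim=2)
for vector in [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]:
    proto.learn(vector)

near = proto.is_good([1.1, 0.9], threshold=0.5)  # close to the prototype
far = proto.is_good([3.0, 3.0], threshold=0.5)   # far from the prototype
print(near, far)
```

Note that no bad examples are needed to build the prototype, and a new good example can be folded in later with another call to learn() rather than retraining from the beginning.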

While traditional DNNs need large amounts of well-balanced data, L-DNN can work well when these conditions are not met. That’s a huge step toward addressing the data collection challenges of AI for industrial inspection applications.