Computer Vision: intelligent eyes in the computer

Computer vision is a branch of AI that allows connected devices to see and understand their surroundings through cameras and algorithms. Within artificial intelligence, it is considered one of the technologies with the highest potential to transform many industries, from manufacturing to automotive, from energy to healthcare.

Computer vision software allows computers and other systems with processing capabilities to extract information from digital images, videos and other visual inputs, interpret them and provide suggestions on actions to be taken. This is made possible by machine learning and deep learning techniques, which “train” computers to accurately identify and classify objects, understand the visual world and react to what they “see”.

A natural application is in smart factories. On production lines, intelligent systems equipped with computer vision “monitor” assembly operations to detect defects in real time and intervene. Processes, machinery and other assets can be inspected in the same way: computer vision can analyze thousands of products or processes per minute, noticing defects or problems that would otherwise be nearly imperceptible. Another fundamental use case is self-driving cars: the safety of autonomous vehicles relies on computer vision software that can spot objects (and people!) in front of and around them in real time to avoid accidents. Finally, in healthcare, computer vision takes the analysis of diagnostic images to a higher level.
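To make the production-line inspection described above more concrete, here is a minimal sketch in Python. It is not how a production system works: real inspection systems typically rely on trained deep learning models, whereas this example simply compares each product image against a known-good reference image, pixel by pixel. The image format (lists of grayscale values), the function names and the two thresholds are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical image format: grayscale pixels (0-255) stored row by row.
Image = List[List[int]]


def defect_pixels(reference: Image, sample: Image, pixel_tol: int = 30) -> List[Tuple[int, int]]:
    """Return the (row, col) positions where the sample deviates from the reference."""
    same_shape = len(reference) == len(sample) and all(
        len(ref_row) == len(smp_row) for ref_row, smp_row in zip(reference, sample)
    )
    if not same_shape:
        raise ValueError("reference and sample must have the same shape")
    return [
        (y, x)
        for y, (ref_row, smp_row) in enumerate(zip(reference, sample))
        for x, (ref_px, smp_px) in enumerate(zip(ref_row, smp_row))
        if abs(ref_px - smp_px) > pixel_tol
    ]


def is_defective(
    reference: Image,
    sample: Image,
    pixel_tol: int = 30,
    max_bad_fraction: float = 0.01,
) -> bool:
    """Flag the sample when too large a share of its pixels deviates."""
    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("images must not be empty")
    bad = defect_pixels(reference, sample, pixel_tol)
    return len(bad) / total > max_bad_fraction


# Usage: a 4x4 known-good image and a sample with one dark spot.
golden = [[200] * 4 for _ in range(4)]
scratched = [row[:] for row in golden]
scratched[1][2] = 50

print(defect_pixels(golden, scratched))  # positions of deviating pixels
print(is_defective(golden, scratched))   # whether the sample is rejected
```

The two thresholds separate "how different must a pixel be" from "how many different pixels are too many", which is the kind of tuning a real inspection line would also need, whatever model sits behind it.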

Computer vision systems can have different purposes and are therefore instructed accordingly to observe and understand what is relevant to the specific task. For this reason, data remains essential as an input to train AI, as well as an output of its “vision”.

From image to intelligence. What is Computer vision for?

The purpose of Computer vision is to allow machines to extract valuable information from the surrounding environment. It is therefore not just a question of recognizing objects, people or animals within a single image or a sequence of images (video); it is about extracting useful information from them, with growing levels of abstraction and understanding. It counts as a form of artificial intelligence because it can reconstruct a context around the image, giving it meaning.

The applications touch on numerous areas, such as waste reduction (including energy, with consequent cost reduction), supply chain visibility (to react proactively to bottlenecks), quality control in production, predictive maintenance of machinery, warehouse and inventory checking, product and worker safety, and even compliance. Moreover, Computer vision allows you to identify categories for search engines and simplifies media analysis by reducing the time and costs of inserting video ads and producing or editing content.

These are all applications that businesses of all sizes can access. The important thing is to start from your most important asset, data, because your data is the raw material with which Computer vision systems are trained to see and interpret what your business needs.

Integration with the IoT. Deep learning and the digital twin

In 2022, Computer vision will be the major investment area for technology and service provider organizations that already have artificial intelligence technology plans in place, according to a recent Gartner survey conducted between April and June 2021. These companies are predicted to spend an average of $679,000 over two years.

Another area where companies want to bring AI solutions, especially edge AI, is the smart factory. According to Gartner, deep learning, the most advanced form of machine learning, will be included in more than 65% of edge computing use cases in 2027, up from 10% in 2021. This means that artificial intelligence will be increasingly applied directly to objects in the factory (the so-called Industrial Internet of Things, or IIoT), rather than just to centralized processing systems. This largely includes computer vision systems, which, as we have seen, apply AI and deep learning in cameras and sensors to carry out product and process quality inspections and perform predictive maintenance.

Being closely connected with the IoT, Computer vision is very effective in conjunction with the digital twin. A digital twin is a virtual replica of physical assets, whether potential or actual: objects, processes, people, places, infrastructures, systems and devices. The digital twin can be used, for example, in design and prototyping: it allows you to carry out tests more accurately and quickly, verifying the product before it goes into production and shortening the time to market. Computer vision, for its part, monitors the production process and identifies any sources of problems. In both cases, you save time and money because you avoid sending into production something that will not work or that the market will not like, while reducing the impact of anomalies and faults in assembly.

Explainable AI: why did the software give this interpretation?

Computer vision is an AI that not only sees but interprets. These systems know how to classify objects and show us how to act. For example, they tell us if any items in the warehouse have been misplaced. Or they detect deteriorating performance in a machine, alerting us to call maintenance before it stops altogether. The ability to observe and interpret becomes more sophisticated as deep learning is integrated into the systems’ training. But can we trust what the computer sees and, above all, the decisions it makes?

Yes, if we have provided the system with data in the correct way and if we use artificial intelligence techniques that are always transparent. It is still software designed by people! The fundamental element that creates trust in AI is therefore the initial process of data preparation and software training, which we can carry out in the way our human supervision deems correct from both a technical and an ethical point of view, as well as useful for our business strategy.

This is why today we talk about Explainable AI (XAI): methods that allow people who use AI and machine learning systems to understand how those systems arrive at certain interpretations and decisions. Companies and technology suppliers will move precisely towards AI, ML and deep learning solutions that are explainable and understandable by design, meaning that the software itself formulates an explanation for what it predicts or decides.
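One simple way to see what "explaining" an image model can look like is occlusion sensitivity: hide one part of the image at a time and measure how much the model's score drops. The parts whose removal hurts the score most are the ones the model relied on. The sketch below is only an illustration of that idea in Python. The scoring function is a toy stand-in for a trained model, and every name in it is an assumption rather than part of any product or library.

```python
from typing import Callable, List

# Hypothetical image format: rows of brightness values between 0.0 and 1.0.
Image = List[List[float]]


def occlusion_map(
    image: Image,
    score_fn: Callable[[Image], float],
    patch: int = 1,
    fill: float = 0.0,
) -> List[List[float]]:
    """For each patch, record how much the score drops when it is masked."""
    base = score_fn(image)
    height, width = len(image), len(image[0])
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy, so the input stays intact
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            drop = base - score_fn(occluded)
            for y in rows:
                for x in cols:
                    heat[y][x] = drop
    return heat


def toy_score(image: Image) -> float:
    """Toy stand-in for a model: mean brightness of the central 2x2 region of a 4x4 image."""
    return sum(image[y][x] for y in (1, 2) for x in (1, 2)) / 4


# Usage: on a uniformly bright 4x4 image, only the central pixels matter to toy_score.
bright = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(bright, toy_score)
for row in heat:
    print(row)
```

The resulting map is high only where the toy model actually looks, which is the kind of evidence a user can inspect before trusting a prediction. With a real model, `score_fn` would wrap the model's confidence for the predicted class.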

DuneD, our AI for your business

Explainable AI is exactly what DuneD offers with its AI, ML and Computer vision solutions. We make the predictions of artificial intelligence algorithms transparent by explaining the logic of the answers provided by the AI. DuneD’s data products are created to give users reliability and generate confidence in forecasts by making it clear how AI arrives at certain results.

As an Amazon partner, we offer Amazon Rekognition for Computer vision: it provides pre-trained and customizable computer vision capabilities to extract detailed information from images and videos, even for those who are starting from scratch or do not have all the necessary skills in-house.
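As a rough idea of what using a pre-trained service looks like in practice, the Python sketch below asks Amazon Rekognition for the labels it recognizes in an image, through the `detect_labels` operation of the AWS SDK for Python (boto3). It assumes boto3 is installed and that AWS credentials and a region are already configured. The wrapper function, its default values and the file name in the usage comment are illustrative assumptions, not part of the service.

```python
from typing import Any, List, Optional, Tuple


def detect_labels(
    image_bytes: bytes,
    client: Optional[Any] = None,
    max_labels: int = 10,
    min_confidence: float = 80.0,
) -> List[Tuple[str, float]]:
    """Return (label name, confidence) pairs for the given image bytes."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub client.
        import boto3

        # Assumes credentials and region come from the standard AWS configuration.
        client = boto3.client("rekognition")
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


# Usage (hypothetical file name; requires AWS access):
# with open("warehouse_shelf.jpg", "rb") as f:
#     for name, confidence in detect_labels(f.read()):
#         print(f"{name}: {confidence:.1f}%")
```

Passing the client in as a parameter keeps the sketch easy to try without an AWS account, since a stub object with the same method can stand in for the real service.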

Amazon also offers a solution that makes machine learning accessible to all – Amazon SageMaker – to build, train, and deploy machine learning models for any use case with fully managed infrastructure, tools, and workflows.

Furthermore, we have in our portfolio a patented Digital twin platform equipped with Artificial Intelligence and Computer vision capabilities that can categorize images and connect to IoT systems. This product helps companies mitigate risk in both civil and industrial sectors by supporting predictive maintenance, product monitoring and workplace safety. It comes with a simple configuration interface and several proprietary AI algorithms that are ready for use.

All AI systems need to be fed with a large amount of data – images and videos in the case of Computer vision – which, appropriately labeled, will constitute the dataset that can make the algorithm truly intelligent. This is exactly what DuneD does: we collect data (structured, semi-structured and unstructured) and we create Data Lakes that enrich the data coming from all corporate and external sources, making them always available and searchable by all business units. This is why we have always been the Data Product company standing alongside businesses that want to extract value from their most precious asset, data.

Food for thought…

Gartner forecasts on investments in AI and Computer vision.
A brief history of computer vision and neural networks.
AIoT or artificial intelligence of things: the WeForum infographic to illustrate the transformation triggered by the union of IoT with AI, Big data and 5G.

Interesting podcasts…

Computer vision in production, the podcast that brings together representatives of the computer vision community to discuss the applications of this technology.