
Computer vision: the senses reach automation

12 Mar 2024

Computer vision, increasingly integrated into tools such as picking robots, is gaining more ground in companies worldwide. According to DHL, the value of this technology is expected to reach $41.11 billion by 2030. Computer vision is set to continue expanding across businesses in logistics and other sectors over the next five years.

What is computer vision?

Computer vision is a field of artificial intelligence (AI) that allows computer systems to extract information from digital images, videos and other visual inputs. Once algorithms have processed the collected data, computers can take action or make recommendations. While AI enables computers to “think”, computer vision enables them to “see” and “understand” their environment.

These tools must be trained in much the same way humans learn to distinguish objects and interpret what they see. However, the process is faster for computer vision because a system can assimilate a large number of reference images in a short time.

How does computer vision work?

Computer vision systems employ two core technologies:

  • Deep learning. This type of machine learning uses multilayer neural networks that allow computers to teach themselves the context of visual data. Given enough examples, they learn to distinguish between images on their own without needing to be explicitly programmed to recognise each figure.
  • Convolutional neural network (CNN). For machine learning models to discern what they see, they break images down into pixels, which are given numerical values or labels. They then run convolutions over these values: mathematical operations that combine two functions (here, a patch of the image and a small filter) to generate a third. From there, the models make predictions and check their accuracy through a series of iterations, enabling them to recognise objects similarly to how humans do.
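
As a rough illustration of the convolution step described above, the following sketch slides a small filter over a tiny greyscale image. The image values and the filter are made up for the example; in a real CNN the filter weights are learned during training rather than written by hand, and the work is done by an optimised library rather than nested loops.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most CNN libraries, this does not flip the kernel, so strictly
    speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the image patch by the kernel, element by element, and sum.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 greyscale image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written vertical-edge filter (an assumption for this example).
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the image changes from dark to bright, which is how a filter of this kind highlights a vertical edge.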
Deep learning makes it possible for computers to learn on their own without having to be programmed to recognise each object

With this training in place, a computer vision system analyses its surroundings in three steps:

  1. A device captures an image. The device can be a camera or a video camera.
  2. The image is sent to an interpretation system, which applies pattern recognition to compare the scene with others it knows.
  3. When a user requests information, the program provides the results of its analysis.
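
The three steps can be sketched in a few lines of code. This is a deliberately simplified toy, not how a production system works: the "camera" returns a hard-coded 3x3 binary image, the known patterns are invented for the example, and the pattern recognition is a plain pixel-by-pixel comparison rather than a trained model. All function names are hypothetical.

```python
# Reference scenes the system already "knows" (hypothetical 3x3 binary patterns,
# flattened row by row).
KNOWN_PATTERNS = {
    "vertical bar":   [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "horizontal bar": [0, 0, 0, 1, 1, 1, 0, 0, 0],
    "diagonal":       [1, 0, 0, 0, 1, 0, 0, 0, 1],
}


def capture_image():
    """Step 1: stand-in for a camera; returns a slightly noisy vertical bar."""
    return [0, 1, 0, 0, 1, 0, 1, 1, 0]


def pixel_distance(a, b):
    """Count the pixels in which two equally sized images differ."""
    return sum(x != y for x, y in zip(a, b))


def interpret(image):
    """Step 2: compare the image with every known pattern and keep the closest."""
    label = min(KNOWN_PATTERNS, key=lambda name: pixel_distance(image, KNOWN_PATTERNS[name]))
    return label, pixel_distance(image, KNOWN_PATTERNS[label])


def answer_query():
    """Step 3: report the result of the analysis when a user asks for it."""
    label, mismatches = interpret(capture_image())
    return f"Closest known pattern: {label} ({mismatches} pixel(s) differ)"


print(answer_query())
```

Here the noisy capture differs from the stored vertical bar in a single pixel, so that pattern is reported as the closest match.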

Computer vision applications in industry

Companies in sectors such as logistics, medicine, transport and leisure have already incorporated computer vision into their operations. Security cameras, traffic controls, smartphones and other devices provide them with a wealth of data that they use for various purposes. One example is Google Translate: it allows users to capture text with a camera and renders it in another language instantly.

Additionally, several computer vision applications can be seen in Industry 4.0:

  • Augmented reality. Information collected through computer vision serves to place virtual objects in physical environments.
  • Autonomous vehicles. Self-driving cars use real-time identification to detect what’s happening on the road and act accordingly.
  • Manufacturing. Machines can be monitored to ensure their proper operation. It’s also possible to assess the quality of products and packaging on production lines.
  • Spatial analysis. People and objects are identified across spaces and their movements are recorded.
  • Healthcare. Analysing images from medical devices helps healthcare professionals identify pathologies and reach faster, more accurate diagnoses.
  • Agriculture. Overseeing fields with satellites, drones or aeroplanes allows for monitoring crops and detecting emergencies and nutrient deficiencies. Patatas Meléndez, for instance, uses this technology to select the potatoes it distributes to its customers.
  • Text extraction. Automated processing can help discover relevant content within large amounts of text.

Who created computer vision?

Thomas Huang was a researcher and professor emeritus at the University of Illinois Urbana-Champaign and one of the leading figures in computer vision. According to Huang, this technology dates back to the 1960s, when Larry Roberts discussed the possibility of extracting 3D geometric information from 2D perspectives in his thesis at the Massachusetts Institute of Technology (MIT). AI began as a field of study around that time. In 1963, computers started transforming two-dimensional images into three-dimensional ones.

Computer vision can be traced back to Larry Roberts’ thesis at MIT in the 1960s

Optical character recognition (OCR) emerged in 1974, and intelligent character recognition (ICR) later succeeded in deciphering handwritten texts through neural networks. In 1982, neuroscientist David Marr determined that vision functions hierarchically and developed algorithms for machines to detect edges, corners, curves and other geometric shapes. Around the same time, computer scientist Kunihiko Fukushima created Neocognitron, a network of cells for pattern recognition. Advancements continued into the early 21st century, and in 2012, the AlexNet model sharply reduced the error rate in image recognition, achieving a top-5 error of 15.3% in that year’s ImageNet challenge, more than ten percentage points better than the runner-up.

Computer vision in logistics

Logistics and supply chain management are other sectors with great potential and various applications for computer vision, some linked to robotics:

  • Shipping. Intelligent vision is used to calculate the space occupied by objects in transport and storage. Moreover, it enhances the data collected by warehouse management systems (WMSs) and makes sure product labels are legible.
  • Maintenance. Since AI gathers information from various pieces of equipment, it can detect when repairs will be needed.
  • Operations. AI can map out the most efficient routes for operator picking tasks and is effective for access control. Computer vision can detect individuals running through facilities or entering restricted areas, facilitating rapid intervention through alerts.
  • Safety. Monitoring the movements of vehicles and people in warehouses and parking lots makes it possible to take immediate action to minimise risks. Cameras also detect whether personal protective equipment (PPE) is worn properly and monitor drivers to ensure they rest at the first signs of fatigue.
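
As one example of how the access-control case above might be implemented, the sketch below checks whether people detected in a camera frame are standing inside a restricted zone. It assumes an upstream object detector has already produced labelled bounding boxes in pixel coordinates; the `Box` class, the zone and the detections are all hypothetical, and a real installation would also need camera calibration and tracking over time.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in image coordinates (y grows downwards)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def foot_point(self):
        """Bottom-centre of the box, used here as a proxy for where a person stands."""
        return ((self.x_min + self.x_max) / 2, self.y_max)


def in_zone(point, zone):
    """Return True if the (x, y) point lies inside the rectangular zone."""
    x, y = point
    return zone.x_min <= x <= zone.x_max and zone.y_min <= y <= zone.y_max


def restricted_area_alerts(detections, zone):
    """Return the detected people whose foot point falls inside the zone."""
    return [d for d in detections if d.label == "person" and in_zone(d.foot_point(), zone)]


# Hypothetical restricted zone and detector output for one frame.
zone = Box("restricted", 100, 100, 200, 200)
detections = [
    Box("person", 120, 50, 160, 150),     # foot point (140, 150): inside the zone
    Box("person", 300, 50, 340, 150),     # foot point (320, 150): outside the zone
    Box("forklift", 110, 110, 190, 190),  # inside the zone, but not a person
]

alerts = restricted_area_alerts(detections, zone)
print(f"{len(alerts)} alert(s) raised")
```

Using the bottom-centre of the box rather than its centre is a design choice: it approximates where the person's feet touch the floor, which is what matters for a zone drawn on the ground.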

Computer vision in picking robots

Another area where intelligent vision is poised to revolutionise logistics is picking robots and pick-and-place cobots. These machines are best suited to logistics centres that manage large daily shipment volumes. They fill orders at high speed, performing up to 1,000 picks per hour. They operate nonstop, and their computer vision software calculates the most appropriate picking points for each product. AI algorithms enable these robots to pick unknown items without the need for previous training.
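
To give a feel for what calculating a picking point can involve, here is a minimal sketch that takes a binary mask of an object (1 where the object is, 0 elsewhere) and returns its centroid as a candidate grasp location. This is an assumption made for illustration only: the mask is invented, and commercial picking robots rely on far richer inputs (depth data, surface orientation, gripper geometry) than a simple centroid.

```python
def picking_point(mask):
    """Return the (row, col) centroid of the pixels marked 1, or None if there are none."""
    points = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


# Hypothetical segmentation mask of a rectangular item seen from above.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

point = picking_point(mask)
print(point)
```

For a flat, roughly convex item, the centroid is a reasonable first guess for a suction gripper; irregular or hollow shapes would need a more careful strategy.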

At Mecalux, we equip our clients’ warehouses with the latest technologies, leveraging advancements such as computer vision in picking operations. Are you looking to take your business to the next level? Be sure to contact us to find out about our picking robot and other cutting-edge solutions. We’ll advise you on the options that best meet your needs and support you throughout the installation process.
