We all know the elephant in the room: artificial intelligence (AI) including deep learning (DL). Before discussing this trend or, really, this disruptive event, let’s lay some groundwork by discussing the market drivers for machine vision that ultimately power the trends.
The Economics and the Industry
The year 2022 was a great year for machine vision, with exceptional growth. In contrast, 2023 is seeing the market contract, due both to economic uncertainty and to rising interest rates making capital more expensive. Still, the machine vision market will continue its long-term trend of high single-digit compound growth, exceeding the growth of GDP.
Additional drivers indicate machine vision’s annual growth rate should continue to increase: a labor force that is shrinking as a percentage of the population; labor shortages in many areas, including agriculture, health care, and manufacturing; and organized labor’s demands for higher wages and improved working conditions.
Historically, machine vision relied on technologies developed for other fields, such as security, consumer electronics, and, more recently, autonomous vehicles, to provide the advances that spur new trends. Now, though, we see new technologies being introduced from outside these traditional sources, creating new opportunities for growth in machine vision.
A couple of these external drivers are:
- Reconnaissance and satellite imaging giving us better access to shortwave infrared (SWIR) and hyperspectral imaging
- The (anticipated) spread of artificial intelligence (AI) and deep learning (DL) into all aspects of business and consumer affairs. In 2022, more money was spent on developing AI-specific chips for use at the edge than on developing all other integrated circuits.
In the recent past, display inspection and solar panel inspection have been big areas for machine vision to penetrate. Now, the battery industry is exploding and demanding lots of advanced automation to prepare for expected demand and to ensure quality. This raises the question, “What’s the next big thing?” While it’s difficult to say, the life science area (e.g., medical devices) and clean tech are two areas which could experience rapid growth and demand for machine vision.
Artificial Intelligence
Now, let’s look into the big trend: DL. DL is just one facet of AI, and the one most significant for machine vision at this moment. AI is not new to machine vision. The earliest machine vision software package, the SRI Algorithm, included nearest-neighbor classification, an AI technique. Most current software packages include camera calibration, a regression algorithm considered machine learning, and geometric pattern matching, which is also AI.
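The nearest-neighbor classification mentioned above is simple enough to sketch in a few lines. The sketch below is illustrative only; the feature values and part classes are hypothetical, not taken from the SRI Algorithm itself:

```python
import math

def nearest_neighbor(sample, labeled_examples):
    """Classify a feature vector by the label of its closest training example."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    features, label = min(labeled_examples,
                          key=lambda ex: distance(sample, ex[0]))
    return label

# Hypothetical blob features (area, perimeter) for two part classes
training = [((120.0, 44.0), "washer"), ((118.0, 46.0), "washer"),
            ((300.0, 71.0), "bolt"),   ((305.0, 69.0), "bolt")]

print(nearest_neighbor((298.0, 70.0), training))  # -> bolt
```

The same idea, measured on richer features (moments, edge counts, learned embeddings), is still a workhorse for part classification.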
DL, because it’s a disruptive technology, is more a leap than a trend. Ever since Yann LeCun created the convolutional neural network in the 1980s, DL has been progressing rapidly. At first, it wasn’t good enough to adopt into machine vision. With continued development and refinement, its adoption into self-driving automobiles, and a proliferation of college courses on the subject, DL for image processing broke out and became immensely appealing. This was coupled with increasing computational power, not just from advanced CPUs but even more from advances in graphics processing units (GPUs) and field-programmable gate arrays (FPGAs). More recently, there is a proliferation of special-purpose processing chips tailored to execute DL algorithms with amazing efficiency and speed. Many of these chips are suitable for embedding into edge applications like machine vision systems on the factory floor.
What’s the future for DL in machine vision? (See Figure 1.) Long term, it’s very bright. However, the industry is repeating a pattern seen in the early days of machine vision, as well as in other disruptive technologies where the barrier to entry is low. At this moment, a great many entrants into the market have little actual, on-the-floor experience delivering working systems. This is the technology trigger. Customers are inexperienced with bringing the technology into operation. There are more companies with offerings than the market can accommodate. As companies compete for sales, their claims become exaggerated and customers become excited. This is the peak of inflated expectations. It leads to failed projects and a customer base that becomes very skeptical, producing the trough of disillusionment. In time, enterprises that take care to pick good applications and execute them well succeed and grow while their competitors disappear. The technology reaches the plateau of productivity.
The biggest barrier to DL is the cost, which is difficult to estimate up front, of gathering and carefully labeling a comprehensive set of training images to guarantee high-quality performance of the DL system.
A key question is whether DL can replace the current method of programming a machine vision system. The answer, so far, is “not really.” The software tools available for many machine vision applications already accelerate application development. Where DL will have its greatest impact is in applications that are difficult to program because of the range of variations. Examples are defect detection, where there are many types of defects and variations within defects, and the classification and evaluation of natural products, such as food and forest products, where nature causes wide variations.
Finally, we can expect to see large language models (LLMs) emerge as a way of delivering technical support and accelerating application development. At present, LLMs are prone to hallucinations (inventing data). This problem must be addressed before LLMs can be relied upon for technical assistance.
Image Sensors and Cameras
More and more megapixels to extract more detail from the image. Smaller and smaller pixels to keep the image sensor small and reduce camera costs. These are trends driven especially by the consumer side of image sensing. Designs for consumer products place a very high weight on achieving low cost, even at the sacrifice of performance.
Double the number of pixels spanning a camera’s field-of-view, and you have four times as many pixels in the image. This increase places demands on the camera interface for speed to transfer the camera’s data and on the processing power needed to provide timely results.
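The arithmetic behind that claim is easy to check. In the sketch below, the resolution, bit depth, and frame rate are assumed values chosen only to illustrate the scaling:

```python
def camera_data_rate_mb_s(width_px, height_px, bits_per_px, frames_per_s):
    """Raw camera data rate in megabytes per second."""
    bytes_per_frame = width_px * height_px * bits_per_px / 8
    return bytes_per_frame * frames_per_s / 1e6

base = camera_data_rate_mb_s(2048, 2048, 8, 60)      # ~4 MP sensor
doubled = camera_data_rate_mb_s(4096, 4096, 8, 60)   # pixels doubled per axis

# Doubling the pixels along each axis quadruples the data the
# interface must carry and the processor must digest.
print(base, doubled, doubled / base)
```

The same factor of four applies to memory footprint and to any per-pixel processing stage downstream.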
While there is a cost-sensitive portion of the machine vision market that needs only basic functionality, most of the machine vision market is looking for performance. In this context, performance may be speed, or it may be image qualities such as lower noise, or it may be extended camera features.
While the machine vision camera market was formerly driven by consumer and security industry demands, its need for image quality is now being addressed by new generations of image sensors directly targeting the machine vision market. The security and, to a lesser extent, the consumer markets are also adopting these higher-quality image sensors. Back-side-illuminated (BSI) image sensors are delivering lower-noise, higher-sensitivity images. Since BSI image sensors are manufactured from two silicon wafers bonded together, they are more expensive to fabricate.
New image sensing technologies are emerging on the market: short wave infrared (SWIR) imaging, multi-spectral and hyperspectral imaging, time-of-flight (TOF) imaging for depth sensing, and polarization sensing.
Of these four, SWIR seems to be gaining the most adoption. While prices have come down for SWIR cameras, SWIR imaging, which uses image sensors made from InGaAs, is still more expensive than visible imaging, which uses image sensors made from silicon. Also, since people cannot see infrared wavelengths, engineering the imaging solution is a bit more complicated than with visible-light imaging. Still, SWIR addresses many problems that are intractable with visible-light imaging and should see continued adoption.
3D imaging
3D imaging continues to grow, just not as robustly as earlier predicted. Part of the growth limitation is the complexity of most 3D imaging tasks, and part is the computational power required to resolve 3D image data into a form useful to other devices, such as robots.
A key differentiator among 3D imaging techniques is whether or not they rely on triangulation. Triangulation usually provides higher-accuracy data, but with the risk of occlusion, where no data is available.
LiDAR and time-of-flight (TOF) imaging do not rely on triangulation and do not suffer from occlusion. LiDAR is good for longer distance sensing with modest spatial resolution. TOF imaging also has modest spatial resolution with a more limited depth range, and is faster in acquiring depth data than LiDAR. LiDAR and TOF imaging are finding applications in navigation for autonomous guided vehicles.
Stereo, using only two cameras, has found some uses but it requires sharp features to provide depth information. Its depth information can be fairly sparse. When the stereo cameras are paired with a pattern projector, usually projecting a pseudo-random dot pattern, it becomes augmented stereo. The projected dots provide a denser array of features for stereo to sense and there is more depth information. Augmented stereo is finding application in robot guidance and part location for gripping.
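The depth computation underlying stereo (augmented or not) is the standard triangulation relation z = f·B/d, where f is focal length in pixels, B the baseline between the cameras, and d the disparity of a matched feature. The focal length, baseline, and disparity values below are illustrative, not from any particular camera pair:

```python
def stereo_depth_mm(focal_px, baseline_mm, disparity_px):
    """Depth of a matched feature in a rectified stereo pair: z = f * B / d.

    Zero disparity means the feature could not be matched (or is at
    infinity) -- this is where sparse or textureless scenes fail,
    and why projecting a dot pattern helps.
    """
    if disparity_px <= 0:
        raise ValueError("no valid match: disparity must be positive")
    return focal_px * baseline_mm / disparity_px

# 1200 px focal length, 60 mm baseline, 36 px disparity -> 2000 mm depth
print(stereo_depth_mm(1200, 60.0, 36.0))
```

Note that depth resolution degrades with the square of distance: at long range, a one-pixel disparity error swallows a large depth interval, which is why baseline and focal length must be matched to the working distance.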
Laser profilometry, or sheet of light imaging, provides a very dense line of depth information and is finding a growing number of measurement applications.
Structured light where a light pattern is projected at an angle to the camera viewing the scene is still the most widely used depth sensing technique for robot vision. By projecting a series of patterns, very good depth resolution is obtained. Structured light is the leading approach to getting high resolution 3D data for robot part acquisition and bin picking.
Multispectral and Hyperspectral imaging
One relatively new area of imaging is multispectral and hyperspectral imaging. Both multispectral and hyperspectral imaging capture an image in different spectral bands.
The original differentiation between multispectral and hyperspectral imaging was the number of spectral bands. Multispectral imaging has up to 10 bands, and hyperspectral imaging has more than 10 bands. The more recent differentiation is multispectral imaging has several discrete and not necessarily contiguous spectral bands, and hyperspectral imaging has a larger number of contiguous spectral bands. Put another way, hyperspectral imaging is an imaging spectrometer.
Since a material’s chemical composition determines which spectral bands it absorbs and which it reflects or transmits, hyperspectral imaging can be used for “chemical imaging.” This can be useful for imaging and identifying unknown substances, such as in recycling. It can also be used to determine which specific imaging bands to use for multispectral imaging.
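One common way such chemical imaging is implemented is the spectral angle mapper: each pixel’s spectrum is compared against reference spectra of known materials, and the pixel is assigned the material whose spectrum makes the smallest angle with it. The four-band reflectance values below are made up for illustration and are not real spectra of these plastics:

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra; smaller means more similar.
    Using the angle makes the comparison insensitive to overall brightness."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))

def classify_pixel(spectrum, references):
    """Assign the material whose reference spectrum is closest in angle."""
    return min(references, key=lambda name: spectral_angle(spectrum, references[name]))

# Hypothetical 4-band reflectance spectra for two plastics in a recycling line
refs = {"PET": [0.60, 0.35, 0.20, 0.55], "HDPE": [0.30, 0.50, 0.65, 0.25]}
print(classify_pixel([0.58, 0.36, 0.22, 0.50], refs))  # -> PET
```

Brightness invariance is the design point here: a dimly lit flake of PET still points in the same spectral direction as a brightly lit one.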
Hyperspectral and multispectral imaging are fairly complex imaging technologies. Hyperspectral imaging, especially, requires high data bandwidth and significant computational power.
Because hyperspectral imaging usually spans the visible and into the short-wave infrared (SWIR) wavelengths, one challenge for hyperspectral imaging is finding a broad-spectrum, uniform light source. Incandescent lamps do well but have very short lifetimes compared with LEDs. Lighting companies are working to provide broad-spectrum LED sources spanning the visible and SWIR regions, yet these light sources remain spectrally non-uniform and unusable for some applications.
A common approach for larger volume applications is to use hyperspectral imaging to evaluate what portions of the light spectrum are useful for the application, and then design a multi-spectral image sensor to address only the specific spectral bands of interest.
Both of these are new technologies just coming into machine vision and showing promise in agriculture, food sorting, food processing, and recycling. It is possible for multispectral and hyperspectral imaging to open up new opportunities in the chemical industry which has so far not seen much penetration by machine vision.
Lenses
Trends toward larger image sensors with more pixels, and toward smaller pixels, increase the demands on lenses for higher performance.
The optics community is responding, somewhat slowly, to these increasing demands. Optical design tools make the design of better lenses more achievable. Lens manufacturing tolerances already exceed what is practical for most other fabricated components; even so, these tighter tolerances are not enough. A recent article in Vision Systems Design by Jeremy Govier explained that the new demands exceed what is possible with drop-together lens assemblies and require precision assembly to the micron level, either manually or by specially designed automation. Also, traditional threaded retaining rings are not usable at these assembly tolerances, and adhesives are necessary. All of these demands mean lenses for newer, more demanding image sensors take more labor and time to produce, so lens costs for these image sensors will be much higher.
Accompanying the growth of SWIR cameras, more SWIR lenses are becoming available.
One trend in machine vision lenses is ruggedization: hardening lenses against mechanical and environmental stresses. Lenses used on vehicles need to withstand vibration and sometimes mechanical shock. Lenses used outdoors need to be sealed against moisture and tolerant of wide temperature swings while still providing quality images.
Lighting
LEDs continue to dominate lighting for machine vision, with no challengers emerging. Most of the advances in LED lighting continue to be improvements in efficacy (more light power out per unit of electrical power in) and an increasing range of available wavelengths, especially in the NIR and SWIR ranges. LED advances are slow because more resources are being put into LED lights that replace conventional lighting, such as incandescent lamps.
Another continuing development is programmable lighting controllers that create very short, intense light pulses by overdriving the LED. These controllers help machine vision address higher-speed applications.
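Overdriving works because an LED’s thermal limit is set by its average power dissipation, not its peak: a brief, intense pulse at a low duty cycle keeps the average current well inside the continuous rating. The current ratings and timings below are assumed for illustration, not specifications for any real LED or controller:

```python
def average_current_ma(peak_ma, pulse_us, period_us):
    """Average drive current of a pulsed LED = peak current * duty cycle."""
    duty_cycle = pulse_us / period_us
    return peak_ma * duty_cycle

# Hypothetical LED rated 1000 mA continuous, overdriven to 5000 mA peak
peak = 5000.0
pulse_us, period_us = 20.0, 1000.0   # 20 us strobe every 1 ms (2% duty cycle)

avg = average_current_ma(peak, pulse_us, period_us)
# 100 mA average -- far below the 1000 mA continuous rating, so the
# LED survives while each exposure gets 5x the rated light output.
print(avg)
```

This is why overdrive controllers tie the strobe to the camera trigger: the short pulse both freezes motion and keeps the duty cycle, and therefore the average current, safely low.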
In support of multi-spectral and hyperspectral imaging, LED light sources are becoming available using an array of LEDs with different wavelengths to cover the needed spectral range. Presently, these light sources are spectrally non-uniform, and work continues, especially on phosphors to use on top of LEDs, to mitigate this problem.
Another growth area for lighting is ultraviolet (UV) illumination. This is used both as an exciter for fluorescence imaging and to directly image in the UV range with cameras and lenses adapted to work in the UV range.
More work is being done to create ruggedized LED light sources for machine vision, able to withstand washdown as well as mechanical and environmental stresses.
Compute Power, Energy Efficient Computing
For CPUs, power is a challenge. It is a key reason Intel is losing ground to ARM. Many people don’t realize that data centers worldwide consume about 416 terawatt-hours of electricity a year, roughly 3% of all electrical energy. No wonder data centers push for CPUs and GPUs having the same or greater computing power while consuming less electrical power.[i]
With the rise of AI and DL, the demand for and capability of graphical processing units (GPUs) has taken off. GPUs typically contain an array of processors capable of operating in parallel. Figure 3 shows a popular GPU development platform.
At the low end, a GPU might have just over 100 cores; at the high end, around 4,000. GPU cores can’t be compared directly with CPU cores; they are slower and more limited, delivering their processing power through specialized architecture and massive parallelism.
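A rough throughput model shows why many slow cores can beat a few fast ones on pixel-parallel work. The core counts, clock rates, and single-operation-per-cycle assumption below are illustrative only, not specifications for any real CPU or GPU, and the model deliberately ignores memory bandwidth, occupancy, and instruction mix:

```python
def peak_throughput_gops(cores, clock_ghz, ops_per_cycle=1):
    """Idealized peak throughput in billions of operations per second.
    Real devices fall well short of this, but the scaling holds."""
    return cores * clock_ghz * ops_per_cycle

cpu = peak_throughput_gops(cores=8, clock_ghz=4.0)      # few fast cores
gpu = peak_throughput_gops(cores=4000, clock_ghz=1.5)   # many slow cores

# For image processing, where the same operation runs on millions of
# independent pixels, the parallel device wins despite its slower clock.
print(cpu, gpu)
```

The caveat is that only data-parallel workloads see this benefit; branchy, sequential logic still belongs on the CPU, which is why machine vision systems typically use both.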
The field-programmable gate array, or FPGA, is customizable logic that can be tailored to handle very complex computations without needing to fetch instructions. FPGAs are used in most machine vision cameras as well as in many processors for image processing.
As mentioned earlier in the discussion of DL, there are special purpose processors available to power DL models executing on the edge. Already, products are emerging that contain these DL specific processors, and more are sure to follow.
Software as a Service
Software as a service (SaaS) is a rapidly growing business model for enterprise software. For sellers, it has the advantage of a steady revenue stream. For users, it has the benefits of lower initial cost and enhanced service and support.
However, for the factory floor, there is resistance to its adoption. Some of the areas of resistance are:
- Security – connecting factory devices to the web is worrisome for security reasons. A hacker can bring down production lines with little effort. As the movement toward Industry 4.0, which envisions interconnected devices (often wireless), grows, measures to combat security threats will develop. For example, the OPC UA standard includes encryption of all data.
- Budgeting – industry allocates funds to various areas. One of these is capex (capital expenditures), in which purchases can be depreciated. Another would be something like utilities, which are expensed. The choice between the two spending alternatives affects the company’s bottom line and book value. For SaaS to work, budgets need to be realigned.
- Reliability – most SaaS programs rely on continuous connectivity to the cloud and risk a shutdown if data transmission is interrupted anywhere along its path. A failure in the data path could shut a factory down for some period of time. However, Microsoft’s Office 365 offers a possible model to alleviate this problem: computation is local, with the requirement that the device connect to the internet at least once a month to verify the license and download updates.
- Automatic updates – automatic software updates reduce the maintenance burden and help ensure security. However, there are cases where an update might also require an alteration to the application program. This would be disruptive to the factory unless the SaaS supplier has an appropriate process for migrating updates.
Standards
Vision & Sensors
A Quality Special Section
Voluntary standards, such as those developed by A3, EMVA, and JIIA, have accelerated the proliferation of machine vision by improving the ability to interface with cameras of ever-increasing capability. Although no new camera standards are expected, the existing standards have revisions in process to keep pace with developments in image sensors and cameras.
Mandatory standards addressing safety and environmental issues are diverging. Countries are developing their own standards and, in some cases, requiring in-country testing to verify the products meet the requirements. This trend puts pressure on costs and delays new product introductions.
Summary
- Machine vision continues to grow somewhat faster than the GDP.
- Battery manufacturing is the latest “big thing.”
- DL will become a significant factor after the hype and disillusionment wear off.
- Cameras are trending toward higher image resolutions and faster speeds. SWIR is a growing area for cameras.
- Multispectral and hyperspectral imaging are poised to offer new markets for machine vision.
- Lenses are responding to the need for larger sensors and smaller pixels with tighter tolerances and higher precision assembly. Ruggedized lenses are becoming more available.
- Lighting continues to expand into UV and SWIR with special attention to sources that work for hyperspectral imaging. Ruggedization is also becoming more evident.
- Compute power continues to increase with more attention paid to power consumption.
- SaaS is a latent trend needing significant attention to these concerns before it becomes widely adopted for machine vision.
- Safety and environmental standard compliance might restrict the introduction of new components.
Acknowledgements
The following individuals, listed in alphabetical order, provided insight into these trends. None of them is the author of this article, and any errors or shortcomings are exclusively the fault of the author and not of the people listed below.
Romik Chatterjee, Graftek Imaging
Jeremy Govier, Edmund Optics
Steve Kinney, Smart Vision Lights
Mark Kolvites, Metaphase Technologies
Andy Long, Cyth Systems
Paul Proios, Metaphase Technologies
Rob Quick, 1st Vision