For machine vision, identifying an object purely by sight, without any imprint such as a bar code or data code, would be a silver bullet. This is now becoming reality with so-called sample-based identification (SBI), a unique software feature.
This technology identifies trained objects based solely on characteristic features such as color or texture, thereby eliminating the need for special imprints like bar codes or data codes. It is capable of differentiating thousands of objects, and it works even with warped objects or varying perspective views. A 3-D object can also be learned from any side by using samples that show all relevant views.
In many applications it is desirable to identify objects without the use of special imprints like bar codes or data codes. Either the objects carry no such imprints (e.g., vegetables or fruits), or it cannot be guaranteed that the imprint is always visible (e.g., if the imprint is on the bottom side of an object lying on a conveyor belt). Until now, automatic identification of such objects with machine vision was often either impossible or required the expensive design of a complex and usually very specialized solution. Because industrial users kept reporting these problems, a new technology was developed to assist them. From the start of the research phase, the new technology was required to be general with regard to the type of object, to show a high degree of robustness, to be fast even when a large number of objects must be distinguished, and to offer extremely high usability even for non-expert users. Sample-based identification meets all of these goals.
How does SBI work?
SBI is separated into an offline and an online phase. In the offline phase, the user provides at least one example image (sample) of each object to be identified. Furthermore, the user may specify whether only (gray-value) texture information should be used for identification or whether the color of the objects should be used as an additional feature. In the example application shown in Figure 1, a single example image per poster is sufficient. Here, the color information helps to distinguish the posters from each other, and hence is used to increase the robustness of the identification. Then, based on the example images, a so-called sample identifier is first prepared and subsequently trained.
For this, SBI automatically extracts predefined features in each example image. Then attributes are internally computed for each extracted feature. Basically, the features and their attributes describe the texture of the object. If color should be used, then additional attributes are computed. The idea is that the set of all attributes is characteristic for the example image, and hence, can be used to describe the object. The attributes are used for both preparing and training the identifier.
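The idea of an attribute set with optional color attributes can be illustrated with a deliberately simple sketch. It is not the SBI algorithm, whose features and attributes are internal to the software. Here one global gray-value histogram stands in for the texture attributes, and the mean of each color channel stands in for the additional color attributes. The function name `describe` and the parameters `use_color` and `bins` are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def describe(image: Sequence[Sequence[Pixel]], use_color: bool, bins: int = 4) -> List[float]:
    """Return an attribute vector for one image (illustrative stand-in only).

    Texture stand-in: normalized histogram of gray values.
    Color stand-in (optional): mean R, G, B scaled to 0..1.
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        raise ValueError("image must contain at least one pixel")

    # Gray-value histogram, normalized so that image size does not matter.
    hist = [0.0] * bins
    for r, g, b in pixels:
        gray = (r + g + b) // 3
        hist[min(gray * bins // 256, bins - 1)] += 1.0
    attributes = [count / len(pixels) for count in hist]

    # Additional attributes are appended only if color should be used.
    if use_color:
        for channel in range(3):
            mean = sum(p[channel] for p in pixels) / len(pixels)
            attributes.append(mean / 255.0)
    return attributes


# A pure red and a pure blue image have identical gray values ...
red = [[(255, 0, 0)] * 2] * 2
blue = [[(0, 0, 255)] * 2] * 2
same_without_color = describe(red, use_color=False) == describe(blue, use_color=False)
same_with_color = describe(red, use_color=True) == describe(blue, use_color=True)
```

In this toy case the two images are indistinguishable from gray values alone and become distinguishable once the color attributes are added, which mirrors the role color plays in the poster example.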
The preparation step is essential to adapt the internal data structure of the sample identifier to the kind of objects to be identified. A prepared sample identifier can be thought of as a warehouse, optimized to handle a specific group of objects. For a typical application, the preparation process must be performed only once. The prepared sample identifier is then trained with samples of the individual objects to be identified. In contrast to the preparation step, the training step takes only a few milliseconds and consumes hardly any memory. As a result, training hundreds or thousands of objects takes very little time.
In the online phase, the trained sample identifier is applied to a run-time image in order to identify the object. For this, features and their attributes are computed in the run-time image in the same way as in the images that were used for training. The sample identifier then searches its database for the object with the most similar attributes and returns it as the identified object.
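The train-then-query flow can be sketched as a nearest-neighbor lookup over attribute vectors. This is a minimal illustration under stated assumptions, not the product's implementation or API. The class `ToySampleIdentifier` and its methods `train` and `identify` are hypothetical names, the attribute vectors are made-up numbers, and Euclidean distance is an assumed similarity measure. The preparation step is omitted.

```python
import math
from typing import Dict, List, Sequence


class ToySampleIdentifier:
    """Hypothetical stand-in for a sample identifier (illustration only)."""

    def __init__(self) -> None:
        # Object name -> attribute vectors of all its training samples.
        self._samples: Dict[str, List[List[float]]] = {}

    def train(self, name: str, attributes: Sequence[float]) -> None:
        """Offline phase: store one sample's attribute vector for an object."""
        self._samples.setdefault(name, []).append(list(attributes))

    def identify(self, attributes: Sequence[float]) -> str:
        """Online phase: return the object whose sample is most similar."""
        if not self._samples:
            raise ValueError("no objects have been trained")
        best_name, best_dist = "", math.inf
        for name, vectors in self._samples.items():
            for vec in vectors:
                dist = math.dist(vec, attributes)  # Euclidean distance
                if dist < best_dist:
                    best_name, best_dist = name, dist
        return best_name


identifier = ToySampleIdentifier()
identifier.train("poster_a", [0.9, 0.1, 0.0])
identifier.train("poster_b", [0.1, 0.2, 0.7])

# Attributes from a run-time image, slightly different from any sample.
result = identifier.identify([0.8, 0.2, 0.0])  # "poster_a"
```

The linear scan used here gets slower with every trained sample. It is chosen only for readability and says nothing about how SBI organizes its database.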
In practice, the run-time image might show the objects under completely different conditions compared to the images that were used for training. Returning to the example of posters in public places, the run-time images might be acquired under different viewing angles (from a side view), under different lighting conditions (at a different time of day, with flash, or in different weather), partially occluded, in a different orientation (with a tilted camera), in a different size (from a different distance or with a different focal length), or with clutter objects (other people, parts of neighboring objects, cars, buses, street furniture). However, SBI is very robust to these “disturbances,” which often occur even in controlled industrial environments. The run-time of the identification process ranges from a few hundredths to a few tenths of a second per image.
One great advantage of SBI is that the run-time is almost independent of the number of trained objects. Even for very large databases that include thousands of different objects, the run-time barely increases, which is one important advantage over previous software products. In addition, the software exploits the power of modern multi-core processors to achieve high performance. This allows SBI to be used even for time-critical applications.
Note that there is only a single parameter the user has to set. This parameter is intuitive and determines whether only texture information should be used for identification or whether the color information should be used as well. The choice of this parameter strongly depends on the application.
In more demanding applications, the objects are not planar but have a 3-D shape. Consider products in a supermarket: the appearance of many products depends on the viewing direction. To be able to identify 3-D objects from an arbitrary viewing direction (e.g., at an automatic self-checkout), a single example image per object is not enough for training. Instead, multiple example images taken from different viewing directions are necessary for each object. Usually, one image approximately every 45 degrees suffices, which is reasonable for practical use. Note that this is only necessary for object rotations out of the image plane. Rotations within the image plane need not be sampled because SBI is invariant to them.
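As a rough worked example of the sampling effort, the following sketch lists the viewing angles needed when an object is rotated about a single axis in steps of at most 45 degrees. The single-axis setup and the helper name `view_angles` are assumptions made for illustration. Covering an object from truly arbitrary directions would need additional views, for example from above and below.

```python
import math
from typing import List


def view_angles(max_step_degrees: float = 45.0) -> List[float]:
    """Evenly spaced out-of-plane viewing angles around one rotation axis.

    The spacing between neighboring views never exceeds max_step_degrees.
    """
    if not 0 < max_step_degrees <= 360:
        raise ValueError("max_step_degrees must be in (0, 360]")
    count = math.ceil(360 / max_step_degrees)
    return [i * 360.0 / count for i in range(count)]


angles = view_angles(45.0)
# -> [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0], i.e. 8 sample images
```

Under this assumption, a full turn about one axis needs eight example images per object. In-plane rotations add nothing to this count, since they do not have to be sampled.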
Another demanding application is the identification of deformable (planar or 3-D) objects. Staying in the supermarket a little longer, we see objects like bags of sweets, potato chips, newspapers, fruits, vegetables, or salad. Their appearance in the image changes depending on how they are deformed. Even more difficult, it seems, is the identification of objects like a sack of potatoes or onions, where no fixed base structure is available for identification at all. SBI is robust to this as well. For such objects, providing more than one example image can increase the robustness.
Relevance and Advantage for the Machine Vision Industry
SBI combines the intelligence of current state-of-the-art machine vision algorithms with the requirements of the machine vision industry, i.e. robustness, speed, flexibility, and usability. This enables the machine vision industry to enter new identification markets or penetrate existing markets more intensively.