Introduction


Ocular Intelligence is at the forefront of several exciting emerging markets: artificial intelligence, machine learning and machine vision.

Most people are familiar with facial recognition. Our aim is to expand this technology to recognize clothing, cars, buildings and many other object types.

Our unique technology can identify and categorize objects within a flat 2D photo or video.

While artificial intelligence technology such as Google TensorFlow can achieve similar results with still photos, no one has effectively applied this technology to live video.

Teaching computers to recognize photos of cats is one thing, but teaching them to recognize objects in live video, in real time, is quite another. Today this remains an incredibly complex and challenging task. Google TensorFlow itself has only been around since 2015, so, as you can imagine, the technology is still in its infancy.

One of our key focus areas is clothing recognition. Much as face recognition identifies who a person is, clothing recognition identifies what a person is wearing.

At this stage, we are focused on developing the raw white-label technology, and are constantly improving our machine learning models. The next stage will be to develop our own unique business or consumer-level products implementing our own technology.

What is Machine Vision?


Machine vision is the ability of a computer to 'see', and more importantly, analyze what it sees; it employs one or more video cameras, analog-to-digital conversion (ADC) and digital signal processing (DSP). The resulting data goes to a computer or robot controller. Machine vision is similar in complexity to voice recognition.

Two important specifications in any vision system are the sensitivity and the resolution. Sensitivity is the ability of a machine to see in dim light, or to detect weak impulses at invisible wavelengths. Resolution is the extent to which a machine can differentiate between objects. In general, the better the resolution, the more confined the field of vision. Sensitivity and resolution are interdependent. All other factors held constant, increasing the sensitivity reduces the resolution, and improving the resolution reduces the sensitivity.

Human eyes are sensitive to electromagnetic wavelengths ranging from 390 to 770 nanometers (nm). Video cameras can be sensitive to a range of wavelengths much wider than this. Some machine-vision systems function at infrared (IR), ultraviolet (UV), or X-ray wavelengths.

Binocular (stereo) machine vision requires a computer with an advanced processor. In addition, high-resolution cameras, a large amount of random access memory (RAM), and artificial intelligence (AI) programming are required for depth perception.

Image Processing

After an image is acquired, it is processed. Processing generally runs through multiple stages in sequence, ending in the desired result. A typical sequence might start with tools such as filters, which modify the image, followed by extraction of objects, then extraction of data from those objects (e.g. measurements, reading of codes), followed by communicating that data, or comparing it against target values to create and communicate "pass/fail" results.
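As a rough illustration of such a sequence, the sketch below chains four stages in plain Python: a 3x3 median filter, object extraction by gray level, a width measurement, and a pass/fail comparison. The function names, the gray level of 128, the target width and the tolerance are all assumptions chosen for this example; they are not taken from any particular vision system.

```python
from statistics import median


def median_filter(image):
    """Stage 1 (filter): replace each pixel with the median of its 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = median(window)
    return out


def extract_object(image, gray_level):
    """Stage 2 (object extraction): coordinates of pixels brighter than gray_level."""
    return [(x, y)
            for y, row in enumerate(image)
            for x, value in enumerate(row)
            if value > gray_level]


def measure_width(pixels):
    """Stage 3 (data extraction): horizontal extent of the object, in pixels."""
    xs = [x for x, _ in pixels]
    return max(xs) - min(xs) + 1 if xs else 0


def judge(measured, target, tolerance):
    """Stage 4 (comparison): "pass" if the measurement is within tolerance of the target."""
    return "pass" if abs(measured - target) <= tolerance else "fail"


def inspect(image, gray_level=128, target_width=4, tolerance=1):
    """Run all four stages and return (measured width, verdict)."""
    pixels = extract_object(median_filter(image), gray_level)
    width = measure_width(pixels)
    return width, judge(width, target_width, tolerance)


def demo_image():
    """Synthetic 9x8 image (hypothetical data): a 4x4 bright part plus one noise pixel."""
    image = [[10] * 9 for _ in range(8)]
    for y in range(2, 6):
        for x in range(2, 6):
            image[y][x] = 200
    image[3][7] = 255  # isolated bright pixel simulating sensor noise
    return image


if __name__ == "__main__":
    print(inspect(demo_image()))
```

On this synthetic image, the single noise pixel would stretch the unfiltered width measurement to 6 pixels and fail the check. After the median filter removes it, the measured width is 4 pixels and the part passes, which is why filtering usually comes first in the sequence.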

Machine vision image processing methods include:

Stitching/Registration: combining adjacent 2D or 3D images.

Filtering (e.g. morphological filtering).

Thresholding: Thresholding starts with setting or determining a gray value that will be useful for the following steps. The value is then used to separate portions of the image, and sometimes to transform each portion of the image to simply black and white based on whether it is below or above that grayscale value.

Pixel counting: counts the number of light or dark pixels.

Segmentation: Partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.

Edge detection: finding object edges.

Color Analysis: Identify parts, products and items using color, assess quality from color, and isolate features using color.

Blob discovery & manipulation: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. These blobs frequently represent optical targets for machining, robotic capture, or manufacturing failure.

Neural net / deep learning processing: weighted and self-training multi-variable decision making.

Pattern recognition, including template matching: finding, matching, and/or counting specific patterns. This may include locating an object that is rotated, partially hidden by another object, or varying in size.

Barcode, Data Matrix and "2D barcode" reading.

Optical character recognition: automated reading of text such as serial numbers.

Gauging/Metrology: measurement of object dimensions (e.g. in pixels, inches or millimeters).

Comparison against target values to determine a "pass or fail" or "go/no go" result. For example, with code or barcode verification, the read value is compared to the stored target value. For gauging, a measurement is compared against the proper value and tolerances. For verification of alphanumeric codes, the OCR'd value is compared to the proper or target value. For inspection for blemishes, the measured size of the blemishes may be compared to the maximums allowed by quality standards.
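Several of the methods listed above are simple enough to sketch in a few lines. The example below is a minimal pure-Python sketch, not an optimized implementation: it thresholds a small grayscale image, counts light and dark pixels, groups the light pixels into 4-connected blobs, and compares each blob's size against a maximum allowed blemish size. The function names, the gray level of 128 and the size limits are illustrative assumptions.

```python
from collections import deque


def threshold(image, gray_level):
    """Thresholding: 1 (light) where a pixel is above gray_level, otherwise 0 (dark)."""
    return [[1 if value > gray_level else 0 for value in row] for row in image]


def count_pixels(binary):
    """Pixel counting: return (number of light pixels, number of dark pixels)."""
    light = sum(sum(row) for row in binary)
    total = sum(len(row) for row in binary)
    return light, total - light


def find_blobs(binary):
    """Blob discovery: group light pixels into 4-connected blobs of (x, y) coordinates."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited light pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            blob = []
            while queue:
                cx, cy = queue.popleft()
                blob.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            blobs.append(blob)
    return blobs


def blemish_check(blobs, max_blemish_pixels):
    """Comparison: "pass" only if no blob is larger than the allowed blemish size."""
    return "pass" if all(len(b) <= max_blemish_pixels for b in blobs) else "fail"


def sample_image():
    """Hypothetical 6x4 grayscale image with one 2x2 bright patch and one bright pixel."""
    return [
        [20, 20, 20, 20, 20, 20],
        [20, 220, 220, 20, 20, 20],
        [20, 220, 220, 20, 240, 20],
        [20, 20, 20, 20, 20, 20],
    ]


if __name__ == "__main__":
    binary = threshold(sample_image(), 128)
    blobs = find_blobs(binary)
    print(count_pixels(binary), [len(b) for b in blobs], blemish_check(blobs, 3))
```

Here the blobs are treated as blemishes, so the 4-pixel patch fails a limit of 3 pixels and passes a limit of 4. The same blob list could instead serve as landmarks, for instance by taking each blob's centre as a target position.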