Computer and machine vision based on video signal processing and analysis is a critical function in systems like autonomous vehicles, medical imaging diagnostic equipment, facial recognition and eye tracking applications, smart cities, supply chain management, and robotics. It requires rapid and accurate object and feature recognition and extraction. Implementation of computer vision is complex and can benefit from using artificial intelligence (AI), machine learning (ML), convolutional neural network (CNN) inferencing engines, and other advanced computing techniques.
Designers of computer vision systems need tools that assist in merging high-performance hardware and software. Sources where designers can get the needed electronic design automation (EDA) tools include component suppliers, general and specialist EDA tool makers, and even focused industry organizations that offer open-source EDA solutions. This second of two FAQs looks at examples of the wide range of free and open-source machine vision and video EDA tools available to designers. Part one focused on tools available from component suppliers and EDA tool makers.
OpenCV (open-source computer vision) is an Apache 2.0-licensed open-source collection of computer vision and machine learning tools that designers can modify as needed. OpenCV is written in C++ and has an interface that works with standard template library (STL) containers. It also has Python, Java, and MATLAB interfaces and supports Windows, Linux, Android, and macOS. It includes over 2,500 algorithms for basic computer vision applications and advanced machine learning environments. The algorithms support facial recognition, classification of human actions, eye movement tracking, object movement tracking, generation of three-dimensional (3D) point clouds, stitching of multiple images, creation of augmented reality environments, and more.
OpenCV users working in .NET can turn to EmguCV, a cross-platform .NET add-on for OpenCV. EmguCV is designed for use with .NET-compatible languages like C#, VB, VC++, and IronPython. It's compatible with Visual Studio, Xamarin Studio, and Unity and can run on Windows, macOS, iOS, Linux, and Android.
Designers of medical imaging systems can use MIScnn (Medical Image Segmentation with Convolutional Neural Networks), an open-source, Python-based application programming interface (API) for developing CNN and deep learning (DL) models with minimal coding. MIScnn is optimized for medical image processing. It handles 2D and 3D medical image segmentation and includes preprocessing and data augmentation tools for biomedical images. Its features include (Figure 1):
- Full image and patch-wise analysis
- DL model library with fast model training
- Multiple automatic image evaluation techniques
- Based on Keras, an open-source library that provides a Python interface for CNNs and has a TensorFlow backend
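To make the idea of patch-wise analysis concrete, the sketch below splits a 2D image into fixed-size patches that a model could process one at a time. This is not the MIScnn API: the function name `extract_patches`, its parameters, and the use of plain Python lists are assumptions chosen to keep the example self-contained.

```python
def extract_patches(image, patch_size, stride):
    """Split a 2D image (a list of rows) into square patches.

    Returns a list of (top, left, patch) tuples. Rows or columns that do
    not fill a complete patch at the image border are skipped in this
    simplified version.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append((top, left, patch))
    return patches


# Hypothetical 4x4 image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = extract_patches(image, patch_size=2, stride=2)
for top, left, patch in patches:
    print(top, left, patch)
```

A stride smaller than the patch size would produce overlapping patches, which is one way to trade extra computation for smoother predictions at patch borders.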
DeepFace is an open-source facial recognition and facial attribute analysis tool that is written in Python and available on GitHub. The DeepFace library bundles the AI models it needs and runs the processing pipeline in the background, so users only need to import the library and supply an image path as input. DeepFace features include (Figure 2):
- Face Verification
- Face Recognition
- Facial Attribute Analysis
- Real-Time Face Analysis
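Face verification is commonly implemented by converting each face to an embedding vector and comparing the vectors. The sketch below shows that comparison step using cosine distance. It is an illustration of the general technique, not DeepFace's code: the function names, the three-element embeddings, and the 0.4 threshold are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_same_person(embedding_1, embedding_2, threshold=0.4):
    """Treat two faces as a match when their embeddings are close.

    The threshold is a hypothetical value; in practice it depends on the
    embedding model and the chosen distance metric.
    """
    return cosine_distance(embedding_1, embedding_2) <= threshold


# Toy embeddings; real models produce vectors with many more dimensions.
print(is_same_person([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))
print(is_same_person([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

Face recognition can reuse the same comparison by measuring one query embedding against every embedding in a database and keeping the closest matches.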
JavaScript
Open-source computer vision tools are also available in JavaScript. For example, tracking.js is a library of computer vision algorithms for use in a browser environment. Tracking.js is based on HTML5 and has a core that’s only about 7 KB. Functions include color tracking and facial recognition.
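The basic idea behind color tracking can be sketched in a few lines: find the pixels close to a target color and report the box that encloses them. The example is written in Python for consistency with the other sketches in this FAQ and does not reproduce the tracking.js implementation. The function name `track_color`, the tolerance test, and the tiny test frame are assumptions.

```python
def track_color(pixels, target, tolerance):
    """Return the bounding box (x, y, width, height) of all pixels whose
    RGB channels are each within `tolerance` of `target`, or None if no
    pixel matches."""
    xs, ys = [], []
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            if (abs(r - target[0]) <= tolerance
                    and abs(g - target[1]) <= tolerance
                    and abs(b - target[2]) <= tolerance):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


# Hypothetical 5x4 black frame containing a 2x2 magenta-like blob.
BLACK, BLOB = (0, 0, 0), (250, 10, 240)
frame = [[BLACK] * 5 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = BLOB

print(track_color(frame, target=(255, 0, 255), tolerance=20))
```

Running this once per video frame and following the box from frame to frame gives a simple color tracker. A production tracker would typically also separate multiple blobs of the same color.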
For developers of eye-tracking applications, WebGazer.js can use common webcams to infer the real-time eye-gaze locations of visitors on a web page. WebGazer.js runs entirely in the browser, so no video information is sent to a server. It requires user consent to access the webcam. Features include:
- Compatibility with most common browsers
- Self-calibration using clicks and cursor movements
- Multiple gaze prediction models
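Self-calibration from clicks can be pictured as a regression problem: each click pairs an eye measurement with a known on-screen position, and a model is fitted to map one to the other. The sketch below is a one-dimensional simplification in Python, not WebGazer.js code. The single `pupil_offset` feature, the sample values, and the ordinary least-squares fit are assumptions made for illustration.

```python
def fit_line(features, targets):
    """Ordinary least-squares fit of targets = slope * feature + intercept."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    variance = sum((f - mean_f) ** 2 for f in features)
    covariance = sum((f - mean_f) * (t - mean_t)
                     for f, t in zip(features, targets))
    slope = covariance / variance
    intercept = mean_t - slope * mean_f
    return slope, intercept


# Hypothetical calibration samples: a normalized horizontal pupil offset
# recorded at the moment of each click, and the click's x position in pixels.
pupil_offset = [-0.2, 0.0, 0.2]
click_x = [160.0, 640.0, 1120.0]

slope, intercept = fit_line(pupil_offset, click_x)


def predict_gaze_x(offset):
    """Estimate the on-screen x position for a new pupil offset."""
    return slope * offset + intercept


print(round(predict_gaze_x(0.1)))
```

Every additional click or cursor movement adds another sample, so the fit can be refreshed as the visitor interacts with the page.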
Sports performance analysis
Last, but not least interesting, there's Kinovea, a free and open-source video annotation tool optimized for sports performance analysis. Built-in utilities enable users to capture, slow down, compare, annotate, and measure motion. For example:
- Arrows, descriptions, and other commentary can be added to videos.
- Two videos can be observed side-by-side and synchronized for comparative analysis.
- Users can measure angles, distances, and times manually or use the semiautomated tracking tool to follow the trajectories of specific points on the video.
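The measurements mentioned above reduce to simple geometry once points have been marked on the video. The following sketch shows the arithmetic for an angle, a distance, and an elapsed time. It is not taken from Kinovea: the function names, the metres-per-pixel calibration factor, and the sample coordinates are assumptions.

```python
import math


def angle_degrees(a, vertex, b):
    """Unsigned angle in degrees at `vertex` between the rays to a and b."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (b[0] - vertex[0], b[1] - vertex[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.degrees(math.atan2(cross, dot)))


def distance_metres(p, q, metres_per_pixel):
    """Real-world distance between two pixel positions, given a scale
    obtained by calibrating against an object of known length."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * metres_per_pixel


def elapsed_seconds(start_frame, end_frame, frames_per_second):
    """Time between two frames of a constant-frame-rate video."""
    return (end_frame - start_frame) / frames_per_second


# Hypothetical joint positions in pixels (hip, knee, ankle).
hip, knee, ankle = (100, 50), (100, 150), (200, 150)
print(angle_degrees(hip, knee, ankle))            # knee angle
print(distance_metres(hip, knee, 0.005))          # thigh length
print(elapsed_seconds(30, 90, 60))                # duration of a movement
```

Applying the same functions to a point tracked across many frames yields a trajectory, from which speeds can be estimated by dividing distances by elapsed times.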
Summary
Machine and computer vision is important in a wide range of industrial, transportation, medical, and consumer applications. Designing vision systems is complex and requires a mix of high-performance hardware and sophisticated software. Fortunately, there’s a variety of EDA tools available from component suppliers, dedicated EDA tool makers, and even focused industry organizations. Tools from component suppliers and commercial EDA tool makers were reviewed in part 1.
References
A Microscope for Your Videos, Kinovea
Cross platform .Net addon for OpenCV for image processing, EmguCV
Deepface, serengil
DeepFace – The Most Popular Open Source Facial Recognition Library, viso.ai
Democratizing Webcam Eye Tracking on the Browser, Webgazer
Medical Image Segmentation with Convolutional Neural Networks, frankkramer-lab
Open Source Computer Vision Library, OpenCV
Tracking.js, tracking.js