Advantages of Object Detection for Computer Vision

Data analysis plays a crucial role in creating and implementing business development strategies in any industry, whether it is retail, construction or hospitality. Computer Vision (CV) technologies can help you handle business tasks by extracting valuable insights automatically and in real time. All you need is a camera and a custom software solution that meets your specific needs.

Developing a Computer Vision solution involves decomposing the business task into subtasks such as object detection, tracking and identification. The primary task of any CV project is to train a system to detect visible objects reliably and separate them from each other and from the background. For some needs, object detection alone is enough to build a standalone software solution. You can also combine detection with identification to determine a unique object: a specific person, a specific car make and model and so on. How can we use object detection for traffic analysis and what insights can we get?

Object detection for traffic analysis

Together with Kirill Lozovoi and Artem Khodakov from the Exposit Machine Learning team, we’ve considered a traffic analysis case to describe how a Computer Vision system is developed and to share how object detection can help you monitor and collect data, make informed decisions and improve business strategies. Who might need CV-driven traffic analysis, and why?

Advertising companies
Companies that rent out billboard space or hire professional promoters can use traffic analysis to select locations and justify prices. The system can analyze traffic in real time, reporting the number of moving objects (people, cars, etc.), the average speed of cars and even their makes and models with market prices.
Construction companies
Traffic analysis that combines detection and identification can support smart and safe parking management in residential areas by letting only residents’ cars pass. The principle of operation is simple: one camera and a license plate recognition system. This way you don’t need to hire a dedicated attendant or use a garage door opener.
Public safety companies
Object detection can help improve security systems in private residential areas. Cameras can use Computer Vision to monitor the territory and detect uninvited guests at night. With identification technologies added to the system, it can also determine a person’s identity.

Computer Vision traffic systems can also help optimize transportation plans and programs, manage city traffic, reduce pollution and more. Below we describe how we created an object detection traffic analysis system designed to report the number of vehicles and their speed on a specific road at different times of the day.

Computer Vision-based solution: development process

When you have shared and discussed the project idea with your stakeholders, you can start the development process. The process of Computer Vision solution development begins with business analysis and R&D.

Step 1: Business objectives analysis

We start work on a project by analyzing business objectives and tasks to identify the key characteristics required for technical implementation. Our analysis included the following steps:

Determining target objects for detection: a list of objects to be searched for in the frame. In the case of traffic analysis, our target objects are people and vehicles of all categories (motorcycles, cars, trucks, public transport, etc.).
Determining the characteristics of incoming data: image/video stream resolution and the approximate size of the target object in the frame. In our case, the target objects are physically large and the camera is located close to the road, so the highest resolution is not needed and HD quality should be enough.
Determining the amount of incoming data: the amount of data to process. The amount of incoming data may vary depending on your needs. For example, you may want to analyze traffic on a specific part of a road or monitor all roads in the city at once. Our case focuses on a single region of interest (ROI): a bridge in the city center.

Step 2: Neural network training

Computer Vision tasks like detection or identification are solved using neural networks. A neural network is a mathematical model that can be trained to make decisions based on input data. The speed, accuracy and efficiency of neural networks vary depending on their purpose: a network that is perfect for image recognition can be useless for speech recognition and translation. For our project, we used Convolutional Neural Networks (CNNs), which are the leaders in image object recognition.

CNNs break up the image into frequently repeated simple features (lines, gradient transitions, distinctive details) and then combine them into more complex features or objects. As a result, a complex object such as a person or a car is assembled from a set of simple features. In the image above you can see how a CNN breaks down the photo of a car into features and identifies the vehicle make and model.
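To make the idea of a "simple feature" concrete, here is a minimal sketch of the sliding-window operation a convolutional layer applies. It is an illustration only: the toy image and the hand-written vertical-edge kernel are our assumptions, whereas a real CNN learns its kernels during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x6 grayscale image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-written Sobel-like kernel that responds to vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 4, 4, 0], [0, 4, 4, 0]]
```

The non-zero values appear only where the window covers the dark-to-bright transition, which is the "line" feature in this toy image. Deeper layers combine many such feature maps into more complex patterns.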

If you want to train a neural network to remember distinguishing visual features, you need a large collection of images in which the required objects are marked (with their boundaries and categories indicated). Such a collection of images is called a dataset. For a successful result, you need a huge number of photos showing different cars of different colors and models in different contexts. Training a neural network can sometimes take about ten thousand photographs of a single object.

Collecting and labeling images to create a high-quality dataset from scratch requires a lot of resources. If you are doing research or want to build an MVP and test its effectiveness, you can use publicly available datasets with already labeled data that can include up to 80 categories of different objects. In our case, we used the ready-made COCO dataset to create a prototype of the solution.
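As an illustration of what "labeled data" looks like, the sketch below filters a COCO-style annotation structure down to the traffic-related categories. The layout (images, categories, annotations, and boxes stored as [x, y, width, height]) follows the COCO format, but the sample record, the file name and the `select_targets` helper are hypothetical and are not part of our prototype.

```python
# Hypothetical miniature annotation file in COCO layout (not real COCO data).
coco_sample = {
    "images": [
        {"id": 1, "file_name": "bridge_0001.jpg", "width": 1280, "height": 720}
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 8, "name": "truck"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 3, "bbox": [100, 200, 80, 40]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [400, 300, 30, 20]},
        {"id": 12, "image_id": 1, "category_id": 1, "bbox": [600, 250, 20, 60]},
    ],
}

# Categories of interest for traffic analysis (an assumed selection).
TARGET_NAMES = {"person", "car", "truck", "bus", "motorcycle"}


def select_targets(coco, target_names):
    """Keep only annotations of target categories, with corner-style boxes."""
    id_to_name = {
        cat["id"]: cat["name"]
        for cat in coco["categories"]
        if cat["name"] in target_names
    }
    selected = []
    for ann in coco["annotations"]:
        name = id_to_name.get(ann["category_id"])
        if name is None:
            continue  # not a traffic-related object
        x, y, w, h = ann["bbox"]  # COCO stores top-left corner plus size
        selected.append(
            {"image_id": ann["image_id"], "label": name, "box": (x, y, x + w, y + h)}
        )
    return selected


targets = select_targets(coco_sample, TARGET_NAMES)
print([t["label"] for t in targets])  # ['car', 'person']
```

Converting boxes to corner coordinates (x1, y1, x2, y2) at load time is a design choice that keeps later geometry, such as overlap checks, simpler.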

Step 3: Performance testing

When the training of the neural network is completed, we need to test its performance. Object detection is visualized with a rectangle drawn around the boundaries of each detected object in the image. The CNN predicts each object’s position and marks it with a “box”, as you can see in the image below.
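One common way to check how well a predicted box matches a labeled one is intersection over union (IoU): the area shared by both boxes divided by the area they cover together. The sketch below is a generic illustration; we do not claim it is the exact metric used for this prototype, and the (x1, y1, x2, y2) box layout is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


labeled = (0, 0, 10, 10)
predicted = (5, 0, 15, 10)
print(iou(labeled, predicted))  # overlap 50 / union 150, about 0.33
```

A prediction is typically counted as correct when its IoU with a labeled box exceeds a chosen threshold, for example 0.5.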

For the demonstration of our solution, we use YOLOv4, which can detect many objects in a single frame. Before starting, we should specify the resolution to which the input image is converted while maintaining its aspect ratio. Note that a higher-resolution image takes longer to process. For example, at a resolution of 512px you will need approximately 245ms per frame (about 4 FPS), and at 1024px approximately 530ms per frame (about 2 FPS).
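The arithmetic behind these numbers can be sketched as follows. We assume a letterbox-style resize, where the frame is scaled to fit a square network input and the remainder is padded; the function names are ours and are not part of YOLOv4.

```python
def letterbox_size(width, height, target):
    """Scale a frame to fit a target x target input, keeping aspect ratio.

    Returns the resized width and height plus the total padding needed
    on each axis to fill the square input.
    """
    scale = target / max(width, height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    return new_w, new_h, target - new_w, target - new_h


def fps_from_latency(ms_per_frame):
    """Convert per-frame processing time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# An HD frame (1280x720) resized for a 512px network input.
print(letterbox_size(1280, 720, 512))  # (512, 288, 0, 224)

# Timings quoted in the text.
print(round(fps_from_latency(245), 2))  # 4.08
print(round(fps_from_latency(530), 2))  # 1.89
```

Going from 512px to 1024px roughly doubles the processing time in the figures above, so the input resolution is a direct trade-off between the level of detail and the frame rate.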

The result of our work has been successful: we have a working prototype that detects different types of vehicles. Rectangles or “boxes” don’t provide useful information by themselves, but you can derive valuable insights from them. Our prototype allows users to select a certain area on the road (ROI, a region of interest) and count the number of cars per second. Another option is to calculate the speed of the cars and the level of traffic using perspective changes and a projection of the coordinate plane onto the road.
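A minimal sketch of how boxes can turn into such insights is shown below. It counts boxes whose center lies inside a polygonal ROI and estimates speed from two positions on the road. The helper names are hypothetical, and for the speed estimate we assume the positions have already been projected onto the road plane and are expressed in meters; the projection itself is not shown.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_in_roi(boxes, roi):
    """Count boxes (x1, y1, x2, y2) whose center falls inside the ROI."""
    count = 0
    for x1, y1, x2, y2 in boxes:
        center_x, center_y = (x1 + x2) / 2, (y1 + y2) / 2
        if point_in_polygon(center_x, center_y, roi):
            count += 1
    return count


def speed_kmh(start_m, end_m, seconds):
    """Speed between two road-plane positions given in meters."""
    meters = math.hypot(end_m[0] - start_m[0], end_m[1] - start_m[1])
    return meters / seconds * 3.6


roi = [(0, 0), (100, 0), (100, 50), (0, 50)]  # ROI in pixel coordinates
boxes = [(10, 10, 30, 30), (90, 40, 130, 80), (40, 20, 60, 40)]
print(count_in_roi(boxes, roi))  # 2

# A car that moved 25 m along the road in 1.5 s travels at about 60 km/h.
print(round(speed_kmh((0, 0), (0, 25), 1.5), 1))  # 60.0
```

Counting by box center rather than by any overlap avoids counting a vehicle that only touches the edge of the ROI. Measuring speed requires matching the same vehicle across frames, which is the tracking task mentioned earlier.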

Benefits of the object detection system in your business

Object recognition can help you automate specific processes, bringing competitive advantages to Retail, Healthcare, Manufacturing, Transportation and other industries. Exposit Machine Learning engineers have extensive experience in solving complex business tasks by implementing smart software platforms. Contact us if you want to turn your business idea into a powerful Computer Vision solution.
