This article will guide you through the process of creating a mask detection application with deep learning. With the no-code computer vision platform Viso Suite, you can develop a production-ready image recognition application without writing code from scratch.
Building with Viso saves a lot of costly time and ensures future-proof compatibility across platforms (cameras, processing hardware, AI models). It provides a reliable way to develop computer vision projects with enterprise-grade security layers in place.
We will cover the following:
- Planning the computer vision project
- Developing the computer vision application without coding
- Configuring the application modules
- Deploying the AI vision application to the edge
- Previewing the output video stream
- Flexible maintenance and privacy
Requirements:
- You need a Viso Workspace that provides all the tools and capabilities in one place.
- Optional: USB or IP camera, and an edge computer that is a generic Intel device (x86-64 platform, desktop, embedded)
Planning Phase
Log into the Viso Workspace, and navigate to “Library” and “Applications.” This is where you can manage and edit all your computer vision applications and versions. Here, you also see the “Modules” currently installed in your workspace. Install new modules to add new extensions and functionality to the Viso Builder, the visual modeling tool of Viso Suite. You can add modules from the marketplace or import your custom code as Docker containers.
To build a mask detection application, we start with the design of the computer vision application. Therefore, we need to understand the application flow and ensure the required modules are installed in the workspace.
The concept of the video recognition application is very intuitive. Viso Suite lets you use a visual no-code modeling tool to build the computer vision pipeline without writing code from scratch. This saves time, avoids bugs, and makes it easy to follow best practices.
Video-Input
This module grabs the video frames that provide the visual input for the computer vision engine. With Viso, you can seamlessly switch between a video file and any digital video camera.
Focus area
The region-of-interest module is used to focus on a specific area within the video frames. For example, it is common to perform mask detection at the entrance of a shopping mall or retail store. By focusing on the entrance area, the subsequent image recognition algorithms only apply to this region of interest. This usually leads to substantial improvements in computer vision accuracy and performance because it reduces the workload significantly.
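To make the idea of a region of interest concrete, here is a minimal Python sketch of one possible approach: keeping only detections whose bounding-box center falls inside a polygon. The function names, the box format, and the center-point rule are assumptions for illustration; the Viso region-of-interest module may work differently (for example, by restricting the frame area before inference).

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: toggle 'inside' for every edge a rightward ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def boxes_in_roi(boxes: List[Box], roi: List[Point]) -> List[Box]:
    """Keep only the boxes whose center lies inside the ROI polygon."""
    kept = []
    for x_min, y_min, x_max, y_max in boxes:
        center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
        if point_in_polygon(center, roi):
            kept.append((x_min, y_min, x_max, y_max))
    return kept
```

With a square ROI covering the entrance, a face box centered inside the square is kept and a box elsewhere in the frame is dropped, so later stages have fewer candidates to process.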
Face Detection
Many computer vision applications combine multiple different computer vision tasks that each require specialized AI models. The concept of a flow with multiple stages is also called a computer vision pipeline.
The first computer vision module detects all faces in the focus area before the subsequent module performs the actual mask detection for every detected face. Technically, face detection is an instance of the fundamental computer vision task called object detection: recognizing and localizing an object (here: human faces) in an image (here: in the focus area).
With Viso Suite, you can select different AI frameworks (TensorFlow, OpenVINO, etc.) and a broad list of AI models to perform this task. Viso manages the integration, compatibility, and orchestration of ML model serving containers fully automatically. The ability to easily change and update the AI model is essential because technology advances rapidly (see our article on real-time Object Detection). The list of available, pre-trained AI models is very extensive; you can manage them in your workspace (Library -> AI models) and add your own.
Mask Recognition
The next module in the pipeline is also a computer vision module, this time for image recognition. The output of the face detector is fed into this deep learning model to recognize the presence of a mask. Hence, the output is either “mask” or “no-mask” for each face, expressed with a level of probability (e.g., 0.98, indicating 98% probability).
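The two-stage flow described above (detect faces, then classify each face) can be sketched in Python as follows. This is only an illustration under stated assumptions: `detect_faces` and `mask_probability` are hypothetical callables standing in for the real models, and the 0.5 threshold is an arbitrary choice, not a Viso Suite default.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x_min, y_min, x_max, y_max)


@dataclass
class FaceResult:
    box: Box
    label: str          # "mask" or "no-mask"
    probability: float  # probability of the reported label


def run_pipeline(
    frame: object,
    detect_faces: Callable[[object], Sequence[Box]],    # hypothetical stage 1
    mask_probability: Callable[[object, Box], float],   # hypothetical stage 2
    threshold: float = 0.5,                             # assumed decision threshold
) -> List[FaceResult]:
    """Run face detection, then classify every detected face."""
    results = []
    for box in detect_faces(frame):
        p_mask = mask_probability(frame, box)
        if p_mask >= threshold:
            results.append(FaceResult(box, "mask", p_mask))
        else:
            # Report the probability of the label that was chosen.
            results.append(FaceResult(box, "no-mask", 1.0 - p_mask))
    return results
```

Passing the two stages in as callables mirrors the modular idea of the pipeline: either model can be swapped without touching the surrounding logic.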
Select “create a new application” and start with a new application from scratch. Set a unique new application name. In our case, we name it “Mask Detection” and click “confirm” to proceed. The application is automatically initialized, and the editor will show an empty canvas.
Output Logic
The output data stream of a computer vision deep learning model is usually not useful without processing and aggregation. In simple terms, the model would constantly send the messages “person: mask, person: mask, person: mask” every time it processes an image, at a rate that could even exceed the FPS of the video (depending on the performance and configuration).
To make sense of this stream, a counting or aggregation logic needs to be applied before the data can be safely sent to third-party systems or the computer vision dashboards within Viso Suite.
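As a rough illustration of such aggregation logic, the following Python sketch groups per-frame labels into fixed time windows and counts them. The event format `(timestamp, label)` and the function name are assumptions made for this example; note that it counts detections per frame, not unique people, which would require tracking.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def aggregate_by_window(
    events: Iterable[Tuple[float, str]],  # assumed format: (timestamp in seconds, label)
    window_seconds: float,
) -> List[Dict[str, object]]:
    """Group events into fixed-length windows and count each label per window."""
    buckets: Dict[int, Counter] = {}
    for timestamp, label in events:
        index = int(timestamp // window_seconds)
        buckets.setdefault(index, Counter())[label] += 1

    summaries = []
    for index in sorted(buckets):
        counts = buckets[index]
        summaries.append({
            "window_start": index * window_seconds,
            "mask": counts["mask"],        # Counter returns 0 for missing labels
            "no-mask": counts["no-mask"],
        })
    return summaries
```

Instead of one message per processed frame, a downstream system then receives one summary per window, which is far easier to chart or store.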

Build the computer vision application
To create the mask detector, we drag the following modules from the Viso Builder side panel to the canvas: Video feed, video view, region of interest, object detection, object recognition, function, MQTT. Then we wire the modules together to form the pipeline.

Configure the application modules
For the Video Feed module, select “video file” as the image source. Later, you can change the input to a network/IP camera (most security cameras) or USB (webcams, etc.).
In the object detection module, select the OpenVINO framework. You see the available hardware that can be used to process the visuals. Needless to say, only select hardware that is actually available on your computing devices.
You can select CPU for the AI inferencing task, but also VPU (Vision Processing Unit, Myriad X), iGPU (the new Intel Xe GPU series), and others. In the tutorial, we use Intel Myriad Vision Processing Units, powerful and cost-efficient AI accelerators for processing AI workloads with neural networks.
It’s important to note that the hardware’s processing power significantly impacts the application’s performance (computed FPS, frames per second). CPUs are usually the most readily available but least powerful option. Nvidia or the new Intel GPUs provide the most computing power but are relatively expensive and have a high energy usage. VPUs such as Myriad X provide excellent power- and cost-efficiency ($/FPS and Watt/FPS). Just like GPUs, they can be used in combination to increase image processing performance. In our tutorial, we use 4 Myriad X processors.
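The two efficiency metrics mentioned here, $/FPS and Watt/FPS, are simple ratios. The Python sketch below shows how they could be computed when comparing accelerators; the function names are hypothetical, and no hardware figures are included because none are given in this tutorial. Measure price, power draw, and FPS on your own setup.

```python
def cost_per_fps(price_usd: float, fps: float) -> float:
    """Hardware price divided by achieved frames per second ($/FPS)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return price_usd / fps


def watts_per_fps(power_watts: float, fps: float) -> float:
    """Power draw divided by achieved frames per second (Watt/FPS)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return power_watts / fps
```

Lower values are better for both metrics, so a cheaper, lower-power device can win even if its absolute FPS is lower.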
For the model selection, we use the pre-trained face detection model of “OpenVINO for Retail.” As mentioned before, you could use your own custom AI model. But for many standardized settings, pre-trained ML models provide good results.
Next, we select which trained object class we want the model to detect. The object detection class depends on what the neural network has been trained for. As we already selected the Retail Face Detection model, we now select the class “Face.”
In the object recognition module, we select the Keras framework to run on the CPU. We select the ML model “Mask Detection” to be applied.
Next, we configure the video view node. This module displays the processed video output. It’s important to note that this module may be removed in a production application because the computer vision system works fully autonomously. Usually, costs can be saved if resources are dedicated to better application performance instead of processing a visual output video stream. In the tutorial, we set “/video” as the local URL to preview the output directly in the browser, given the client is within the same network as the computing device.
Then, we configure the function node. We use this module to define the logic that processes the output data of the computer vision pipeline. The logic is especially important if multiple regions of interest are used. The easiest way is to set the logic with a short JavaScript code snippet that converts the output into the desired data format, depending on how the information will be used.
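The function node itself takes a JavaScript snippet; the Python sketch below only illustrates the kind of transformation such logic might perform, namely turning per-region label lists into one compact JSON message. The input structure, the field names, and the function name are assumptions for illustration, not the actual data format of the Viso modules.

```python
import json
from typing import Dict, List


def build_message(roi_labels: Dict[str, List[str]], timestamp: float) -> str:
    """Convert per-ROI label lists into a single JSON string.

    roi_labels maps an ROI name to the labels seen in it, e.g.
    {"entrance": ["mask", "no-mask"]} (assumed structure).
    """
    regions = []
    for roi_name in sorted(roi_labels):
        labels = roi_labels[roi_name]
        regions.append({
            "roi": roi_name,
            "mask": labels.count("mask"),
            "no_mask": labels.count("no-mask"),
        })
    return json.dumps({"timestamp": timestamp, "regions": regions})
```

Keeping one entry per region of interest makes it straightforward for a consumer to chart each area separately.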
Finally, we configure the MQTT out node to send the information about whether people are wearing masks or not via the built-in MQTT broker of Viso Suite. We define the “Topic” to which the endpoint will subscribe. This will be important for using the time-series data, for example, in dashboard widgets. The data can be seamlessly used within the dashboard builder of Viso Suite to create custom real-time dashboards with numerous chart widgets.
Deploy the application
After the application modules have been configured, we save the app as an initial version. It will be created in the workspace library, under applications. The dependencies of the modules and the AI models they use are automatically managed; you can view them in the module or AI model sections of the workspace library.
To deploy the application, add it to a profile. We assign that profile to a device that has been enrolled in the workspace. Once the device is online and available, the Viso Solution Manager will automatically deploy the application with all dependencies to the computing device.

Configure the deployed application
We have not yet set the region of interest to focus the mask detection on a specific area of the video/camera feed. This is only possible after the application is deployed because it is a configuration on the local level. After the application has been deployed and is running, we navigate to the device in Viso Suite (Deployment -> Devices) and access Local configuration. We see all the deployed modules that are part of the running application and select the “region of interest” module.
After the cloud service requests an image frame, we can draw the region of interest (ROI) using the polygon shape. After defining the area in which the mask detection should be performed, we are prompted to set its name. We click “Save” to confirm the local configuration.
Previewing the results
Since we have the output video module deployed as part of the application, we can now visualize the output of the computer vision pipeline. Because we set the output source as “localhost,” we can view it directly from within the browser. The output view node could be used to monitor the results from multiple input (multiple cameras) and output (parallel application paths) image sources. In our tutorial, we used one input feed (video file). Therefore, we have one result output stream available.
The results show that only faces within the region of interest (ROI) are detected. For each detected face, the computer vision app returns a value “mask” or “no-mask,” which is updated in real-time if a person puts on or takes off a mask.
Flexibility and Agility
The application can now be easily updated and maintained using the Viso Builder. For example, we can switch the entire processing hardware platform to GPU without rewriting the code of the application. For instance, Viso Suite makes it easy to use the just-released Intel iGPU, which boosts performance and efficiency, resulting in significant cost savings for large-scale production systems.
We can switch from a video file to a webcam that we plug into the device, or use the video feed of any network security camera. It is easy to use multiple video inputs or increase the complexity of the application logic. And, importantly, we can easily migrate and roll out the application to new hardware devices.
The enterprise-grade versioning system and dependency management of Viso Suite allow us to securely roll out new applications to a fleet of devices. The built-in device fleet management ensures end-to-end security and scalability. If needed, we can roll out new versions in batches and easily roll them back to prior application versions.
Privacy
We believe privacy and security are the most critical aspects of computer vision. Computer vision often involves sensitive data, including images of people (employees, customers) or intellectual property (machines, processes, etc.). Industry leaders use Viso Suite in healthcare, manufacturing, retail, government/public services, and other industries requiring the highest data privacy levels.
Because connecting separate tools is vulnerable to severe security and privacy issues, Viso Suite provides end-to-end capabilities to manage and protect the entire process. This is why we need to fully manage the devices, ML model deployment, inference pipelines, access management, and encryption, all in one place.
All visual data is processed locally, with on-device machine learning (Edge AI). This is not only much more performance- and cost-efficient, it also ensures that no video images are sent to a cloud. We can process the images in real-time, without even storing them on-device. In our tutorial app, we included an output video preview that can only be accessed on a local level. We can always deploy the application without the module or even remove the local preview entirely.
The output data consists of the text strings “mask”/“no-mask” without sensitive visual data. This edge intelligence moves AI capabilities from the cloud to the edge device instead of moving all data to the cloud before processing it there, which enables private deep learning applications.
Get started
Learn more about creating no-code computer vision systems with Viso Suite. Get in touch with our team and see how you can deliver enterprise AI vision rapidly, securely, and in a future-proof way.