Face detection tips, suggestions, and best practices

In this tutorial, you will learn my tips, suggestions, and best practices to achieve high face detection accuracy with OpenCV and dlib.

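We’ve covered face detection four times on the PyImageSearch blog: once with OpenCV’s Haar cascades, once with OpenCV’s deep learning-based face detector, and once each for dlib’s HOG + Linear SVM and MMOD CNN face detectors.

Note: The two dlib detectors are covered in the same tutorial, as that guide covers both HOG + Linear SVM and the MMOD CNN face detector.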

Today we’ll compare and contrast each of these methods, giving you a good idea of when you should be using each, allowing you to balance speed, accuracy, and efficiency.

To learn my face detection tips, suggestions, and best practices, just keep reading.

Face detection tips, suggestions, and best practices

In the first part of this tutorial, we’ll recap the four primary face detectors you’ll encounter when building your own computer vision pipelines, including:

OpenCV and Haar cascades
OpenCV’s deep learning-based face detector
Dlib’s HOG + Linear SVM implementation
Dlib’s CNN face detector

We’ll then compare and contrast each of these methods. Additionally, I’ll give you the pros and cons for each, along with my personal recommendation on when you should be using a given face detector.

I’ll wrap up this tutorial with my recommendation for a “default, all-purpose” face detector that should be your “first try” when building your own computer vision projects that require face detection.

4 popular face detection methods you’ll often use in your computer vision projects

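There are four primary face detection methods that we’ve covered on the PyImageSearch blog:

1. OpenCV and Haar cascades
2. OpenCV’s deep learning-based face detector
3. Dlib’s HOG + Linear SVM implementation
4. Dlib’s CNN face detector

Note: #3 and #4 are covered in the same tutorial, as that guide covers both HOG + Linear SVM and the MMOD CNN face detector.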

Before continuing, I suggest you review each of those posts individually so you can better appreciate the compare/contrast we’re about to perform.

Pros and cons of OpenCV’s Haar cascade face detector

**Figure 1:** OpenCV’s Haar cascade face detector is very fast but prone to false-positive detections.

OpenCV’s Haar cascade face detector is the original face detector that shipped with the library. It’s also the face detector that is familiar to most everyone.

Pros:

Very fast, capable of running in super real-time
Low computational requirements — can easily be run on embedded, resource-constrained devices such as the Raspberry Pi (RPi), NVIDIA Jetson Nano, and Google Coral
Small model size (just over 400KB; for reference, most deep neural networks will be anywhere between 20-200MB).

Cons:

Highly prone to false-positive detections
Typically requires manual tuning of the parameters to the detectMultiScale function (e.g., scaleFactor and minNeighbors)
Not anywhere near as accurate as its HOG + Linear SVM and deep learning-based face detection counterparts

My recommendation: Use Haar cascades when speed is your primary concern, and you’re willing to sacrifice some accuracy to obtain real-time performance.

If you’re working on an embedded device like the RPi, Jetson Nano, or Google Coral, consider:

Using the Movidius Neural Compute Stick (NCS) on the RPi — that will allow you to run deep learning-based face detectors in real-time
Reading the documentation associated with your device — the Nano and Coral have specialized inference engines that can run deep neural networks in real-time
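Since most of the tuning effort with Haar cascades goes into detectMultiScale, here is a minimal sketch of how that call might be parameterized. The preset names and values are illustrative assumptions, not recommended settings, and the code assumes the frontal-face cascade bundled with the opencv-python package (located via cv2.data.haarcascades).

```python
# Hypothetical presets for illustration only; tune the values on your own images.
HAAR_PRESETS = {
    # Coarser pyramid, fewer required neighbors: faster, more false positives.
    "fast": {"scaleFactor": 1.3, "minNeighbors": 3, "minSize": (30, 30)},
    # Finer pyramid, more required neighbors: slower, fewer false positives.
    "strict": {"scaleFactor": 1.05, "minNeighbors": 7, "minSize": (30, 30)},
}


def detect_faces_haar(image_path, preset="strict"):
    """Return a list of (x, y, w, h) face boxes found by the Haar cascade."""
    import cv2  # imported here so the presets can be inspected without OpenCV

    # Assumption: the cascade XML ships with the installed OpenCV package.
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Haar cascades operate on single-channel (grayscale) images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    rects = detector.detectMultiScale(gray, **HAAR_PRESETS[preset])
    return [tuple(int(v) for v in rect) for rect in rects]
```

If you see too many false positives, raising minNeighbors is usually the first knob to try; if faces are being missed, a smaller scaleFactor gives a finer image pyramid at the cost of speed.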

Pros and cons of OpenCV’s deep learning face detector

**Figure 2:** OpenCV’s deep learning SSD face detector is both fast and accurate, capable of running in real-time on modern laptop/desktop CPUs.

OpenCV’s deep learning face detector is based on a Single Shot Detector (SSD) with a small ResNet backbone, allowing it to be both accurate and fast.

Pros:

Accurate face detector
Utilizes modern deep learning algorithms
No parameter tuning required
Can run in real-time on modern laptops and desktops
Model is reasonably sized (just over 10MB)
Relies only on OpenCV’s cv2.dnn module (no additional libraries required)
Can be made faster on embedded devices by using OpenVINO and the Movidius NCS

Cons:

More accurate than Haar cascades and HOG + Linear SVM, but not as accurate as dlib’s CNN MMOD face detector
May have unconscious biases in the training set — may not detect darker-skinned people as accurately as lighter-skinned people

My recommendation: OpenCV’s deep learning face detector is your best “all-around” detector. It’s very simple to use, doesn’t require additional libraries, and relies on OpenCV’s cv2.dnn module, which is baked into the OpenCV library.

Furthermore, if you are using an embedded device, such as the Raspberry Pi, you can plug in a Movidius NCS and utilize OpenVINO to easily obtain real-time performance.

Perhaps the biggest downside of this model is that I’ve found that the face detections on darker-skinned people aren’t as accurate as those on lighter-skinned people. That’s not necessarily a problem with the model itself but rather with the data it was trained on. To remedy that problem, I suggest training/fine-tuning the face detector on a more diverse set of ethnicities.
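To make the cv2.dnn workflow concrete, here is a rough sketch of running an SSD face detector and turning its raw output into pixel boxes. The 300x300 input size and the mean values (104, 177, 123) are assumptions that match the commonly distributed Caffe face model; the prototxt and model paths are placeholders you would supply yourself, and parse_ssd_detections is a hypothetical helper name.

```python
def parse_ssd_detections(detections, width, height, min_confidence=0.5):
    """Convert raw SSD output of shape (1, 1, N, 7) into pixel boxes.

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2], with the
    box corners expressed as fractions of the image width and height.
    """
    faces = []
    for row in detections[0][0]:
        confidence = float(row[2])
        if confidence < min_confidence:
            continue  # discard weak detections
        # Scale to pixels and clip to the image bounds.
        x1 = max(0, min(width - 1, int(round(float(row[3]) * width))))
        y1 = max(0, min(height - 1, int(round(float(row[4]) * height))))
        x2 = max(0, min(width - 1, int(round(float(row[5]) * width))))
        y2 = max(0, min(height - 1, int(round(float(row[6]) * height))))
        faces.append((confidence, (x1, y1, x2, y2)))
    return faces


def detect_faces_dnn(image, prototxt_path, model_path, min_confidence=0.5):
    """Run the SSD face detector on a BGR image via OpenCV's cv2.dnn module."""
    import cv2  # imported here so the parser above works without OpenCV

    net = cv2.dnn.readNetFromCaffe(prototxt_path, model_path)
    (h, w) = image.shape[:2]
    # Assumed preprocessing: resize to 300x300 and subtract per-channel means.
    blob = cv2.dnn.blobFromImage(
        cv2.resize(image, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0)
    )
    net.setInput(blob)
    return parse_ssd_detections(net.forward(), w, h, min_confidence)
```

The only value you would normally adjust here is min_confidence, which is a post-processing threshold rather than a detector parameter.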

Pros and cons of dlib’s HOG + Linear SVM face detector

**Figure 3:** HOG + Linear SVM is a classic algorithm in the object detection/face detection literature. Use it when you need more accuracy than Haar cascades but cannot commit to the computational complexity of deep learning-based detectors.

The HOG + Linear SVM algorithm was first introduced by Dalal and Triggs in their seminal 2005 work, Histograms of Oriented Gradients for Human Detection.

Similar to Haar cascades, HOG + Linear SVM relies on image pyramids and sliding windows to detect objects/faces in an image.

The algorithm is a classic in computer vision literature and is still used today.
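To get a feel for why image pyramids and sliding windows are computationally expensive, the sketch below simply counts how many window positions a detector would need to score. The window size, step, and pyramid scale are illustrative assumptions and are not the values dlib uses internally.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=(64, 64)):
    """Yield (w, h) for each pyramid layer, shrinking by `scale` per layer."""
    w, h = width, height
    # Stop once the layer is smaller than the detection window.
    while w >= min_size[0] and h >= min_size[1]:
        yield (w, h)
        w, h = int(w / scale), int(h / scale)


def count_windows(width, height, window=(64, 64), step=8, scale=1.5):
    """Count the window positions visited across every pyramid layer."""
    total = 0
    for w, h in pyramid_sizes(width, height, scale, window):
        cols = (w - window[0]) // step + 1
        rows = (h - window[1]) // step + 1
        total += cols * rows
    return total


if __name__ == "__main__":
    # Features must be computed and classified at every one of these positions.
    print(count_windows(640, 480))
```

A smaller step or a pyramid scale closer to 1.0 increases the count quickly, which is the speed/accuracy tradeoff both Haar cascades and HOG + Linear SVM have to manage.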

Pros:

More accurate than Haar cascades
More stable detection than Haar cascades (i.e., fewer parameters to tune)
Expertly implemented by dlib creator and maintainer, Davis King
Extremely well documented, both in terms of the dlib implementation and the HOG + Linear SVM framework in the computer vision literature

Cons:

Only works on frontal views of the face — profile faces will not be detected as the HOG descriptor does not tolerate changes in rotation or viewing angle well
Requires an additional library (dlib) be installed — not necessarily a problem per se, but if you’re using just OpenCV, then you may find adding another library into the mix cumbersome
Not as accurate as deep learning-based face detectors
For the accuracy, it’s actually quite computationally expensive due to image pyramid construction, sliding windows, and computing HOG features at every step of the window

My recommendation: HOG + Linear SVM is a classic object detection algorithm that every computer vision practitioner should understand. That said, for the accuracy HOG + Linear SVM gives you, the algorithm itself is quite slow, especially when you compare it to OpenCV’s SSD face detector.

I tend to use HOG + Linear SVM in places where Haar cascades aren’t accurate enough, but I cannot commit to using OpenCV’s deep learning face detector.

Pros and cons of dlib’s CNN face detector

**Figure 4:** Dlib’s CNN face detector is the most accurate of the bunch but is quite slow. Use it when you need accuracy above all else.

Davis King, the creator of dlib, trained a CNN face detector based on his work on max-margin object detection. The method is highly accurate, thanks to the design of the algorithm itself, along with the care Davis took in curating the training set and training the model.

That said, without GPU acceleration, this model cannot realistically run in real-time.

Pros:

Incredibly accurate face detector
Small model size (under 1MB)
Expertly implemented and documented

Cons:

Requires an additional library (dlib) be installed
Code is more verbose — end-user must take care to convert and trim bounding box coordinates if using OpenCV
Cannot run in real-time without GPU acceleration
Not out-of-the-box compatible for acceleration via OpenVINO, Movidius NCS, NVIDIA Jetson Nano, or Google Coral
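The "convert and trim" step mentioned above could look something like the following. This is a sketch under assumptions: convert_and_trim_bb and detect_faces_dlib_cnn are hypothetical helper names, and the model path is a placeholder for wherever you have stored dlib's MMOD face detector weights.

```python
def convert_and_trim_bb(image_height, image_width, rect):
    """Turn a dlib-style rectangle into a clipped OpenCV-style (x, y, w, h)."""
    # dlib may return corners that fall outside the image, so clip them.
    start_x = max(0, rect.left())
    start_y = max(0, rect.top())
    end_x = min(rect.right(), image_width)
    end_y = min(rect.bottom(), image_height)
    return (start_x, start_y, end_x - start_x, end_y - start_y)


def detect_faces_dlib_cnn(rgb_image, model_path, upsample=1):
    """Run dlib's MMOD CNN detector and return clipped (x, y, w, h) boxes."""
    import dlib  # imported here so the helper above works without dlib

    detector = dlib.cnn_face_detection_model_v1(model_path)
    results = detector(rgb_image, upsample)
    (h, w) = rgb_image.shape[:2]
    # Each CNN result wraps its rectangle in a `.rect` attribute.
    return [convert_and_trim_bb(h, w, r.rect) for r in results]
```

Note that dlib expects RGB channel ordering, so an image loaded with OpenCV would need to be converted from BGR before being passed in.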

My recommendation: I tend to use dlib’s MMOD CNN face detector when batch processing face detection offline, meaning that I can set up my script and let it run in batch mode without worrying about real-time performance.

In fact, when I build training sets for face recognition, I often use dlib’s CNN face detector to detect faces before training the face recognizer itself. When I’m ready to deploy my face recognition model, I’ll often swap out dlib’s CNN face detector for a more computationally efficient one that can run in real-time (e.g., OpenCV’s DNN face detector).

The only place I tend not to use dlib’s CNN face detector is when I’m using embedded devices. This model will not run in real-time on embedded devices, and it’s not out-of-the-box compatible with embedded device accelerators like the Movidius NCS.

That said, you just cannot beat the face detection accuracy of dlib’s MMOD CNN, so if you need accurate face detections, go with this model.

My personal suggestions for face detection

**Figure 5:** For a good all-around face detector, go with OpenCV’s deep learning-based face detector. It’s accurate and capable of running in real-time on modern laptops and desktops.

When it comes to a good, all-purpose face detector, I suggest using OpenCV’s DNN face detector:

It achieves a nice balance of speed and accuracy
As a deep learning-based detector, it’s more accurate than its Haar cascade and HOG + Linear SVM counterparts
It’s fast enough to run in real-time on CPUs
It can be further accelerated using USB devices such as the Movidius NCS
No additional libraries/packages are required — support for the face detector is baked into OpenCV via the cv2.dnn module

That said, there are times when you would want to use each of the face detectors mentioned above, so be sure to read through each of those sections carefully.
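Whether a detector is "fast enough" depends on your hardware, so it can help to measure it directly. Below is a minimal timing sketch; measure_fps is a hypothetical helper that accepts any callable taking a single frame, so it should work with whichever detector you wrap.

```python
import time


def measure_fps(detect, frames, warmup=1):
    """Return the average frames per second of `detect` over `frames`."""
    frames = list(frames)
    # Warm-up calls absorb one-time costs (model loading, caches) untimed.
    for frame in frames[:warmup]:
        detect(frame)

    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start

    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in "detector" that only sleeps; swap in a real detector call.
    fps = measure_fps(lambda frame: time.sleep(0.01), range(20))
    print(f"approx. {fps:.1f} FPS")
```

Running the same set of frames through each candidate detector gives you a like-for-like comparison on the machine you actually plan to deploy on.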

Summary

In this tutorial, you learned my tips, suggestions, and best practices for face detection.

In summary, they are:

Use OpenCV’s Haar cascades when speed is your primary concern (e.g., when you’re using an embedded device like the Raspberry Pi). Haar cascades aren’t as accurate as their HOG + Linear SVM and deep learning-based counterparts, but they make up for it in raw speed. Just be aware there will certainly be some false-positive detections and parameter tuning required when calling detectMultiScale.
Use dlib’s HOG + Linear SVM detector when Haar cascades are not accurate enough, but you cannot commit to the computational requirements of a deep learning-based face detector. The HOG + Linear SVM object detector is a classic algorithm in the computer vision literature that is still relevant today. The dlib library does a fantastic job implementing it. Just be aware that running HOG + Linear SVM on a CPU will likely be too slow for your embedded device.
Use dlib’s CNN face detection when you need super-accurate face detections. When it comes to face detection accuracy, dlib’s MMOD CNN face detector is incredibly accurate. That said, there is a tradeoff — with higher accuracy comes slower run-time. This method cannot run in real-time on a laptop/desktop CPU, and even with GPU acceleration, you’ll struggle to hit real-time performance. I typically use this face detector on offline batch processing where I’m less concerned about how long face detection takes (and instead, all I want is high accuracy).
Use OpenCV’s DNN face detector as a good balance. As a deep learning-based face detector, this method is accurate, and since it’s a Single Shot Detector (SSD) with a small ResNet backbone, it’s easily capable of running in real-time on a CPU. Furthermore, since you can use the model with OpenCV’s cv2.dnn module, you can increase speed further by (1) using a GPU or (2) utilizing the Movidius NCS on your embedded device.

In general, OpenCV’s DNN face detector should be your “first stop” when applying face detection. You can try other methods based on the accuracy the OpenCV DNN face detector gives you.
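As a rough way to encode the recommendations above, here is a small decision helper. The function name, parameter names, and the order of the checks are my own simplification of the guidance in this post, so treat it as a starting point rather than a rule.

```python
def recommend_detector(needs_realtime=True, embedded=False,
                       has_accelerator=False, accuracy_first=False,
                       deep_learning_ok=True):
    """Suggest a face detector based on the tradeoffs summarized above."""
    # Offline batch work where accuracy matters most: dlib's MMOD CNN.
    if accuracy_first and not needs_realtime and not embedded:
        return "dlib MMOD CNN"
    # Embedded device with no accelerator (e.g., no Movidius NCS): raw speed wins.
    if embedded and not has_accelerator:
        return "OpenCV Haar cascade"
    # Need more accuracy than Haar but cannot commit to deep learning.
    if not deep_learning_ok:
        return "dlib HOG + Linear SVM"
    # Default, all-purpose choice.
    return "OpenCV DNN (SSD)"


if __name__ == "__main__":
    print(recommend_detector())               # laptop/desktop, real-time
    print(recommend_detector(embedded=True))  # Raspberry Pi without an NCS
```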

The post Face detection tips, suggestions, and best practices appeared first on PyImageSearch.