Age and gender recognition with JavaCV and CNN

What is JavaCV?

Most of us have heard of OpenCV, the well-known C++ library for computer vision. Because it is written in C++, many wrappers and bindings have been created to expose its functionality to other languages. The OpenCV Python binding is one such popular wrapper, used by many developers for image processing applications. Then there is Emgu CV, the .NET wrapper for OpenCV. OpenCV has a Java binding too, but that is not JavaCV. JavaCV is more than a Java binding.

JavaCV can be considered a collection of wrappers for most of the popular libraries used in image processing and computer vision. That is, JavaCV includes wrappers for OpenCV, FFmpeg, OpenKinect and many more. In short, you can think of JavaCV as everything required for computer vision and image processing in one place. At the time of writing, the latest JavaCV release (1.3) wraps OpenCV 3.1 through JNI. Even though JavaCV comes with a lot of features (OpenCV and many more), it is not very well known among developers. Therefore, I decided to shed some light on JavaCV and show what it is capable of. Let’s get started …

Let’s get started

Here, we will do some face detection and, furthermore, use CNNs (Convolutional Neural Networks) for age and gender prediction. Since most of you have seen how to do face detection with Haar cascades and face recognition with Fisherfaces and so on, the interesting part will be the use of CNNs for age and gender prediction. First, let’s create a Maven project and add the JavaCV dependency as follows. I will be using JavaCV version 1.2. The latest version available is JavaCV 1.3, which has several improvements over the version I’m using here.

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>javacv</artifactId>
        <version>1.2</version>
    </dependency>

Capturing video from a camera

To capture video, I have used the FFmpegFrameGrabber shipped with JavaCV. From what I have read in several places, it is favored over the conventional OpenCVFrameGrabber on Linux machines. First we have to create the frame grabber and configure it as we want.

frameGrabber = new FFmpegFrameGrabber("/dev/video0");
frameGrabber.setFormat("video4linux2");
frameGrabber.setImageWidth(1280);
frameGrabber.setImageHeight(720);

Here, the constructor parameter of FFmpegFrameGrabber is the camera device to capture frames from. “/dev/video0” is the default laptop webcam; if you connect another camera, the Linux system will identify it as “/dev/video1”, and so on. Since I’m a Linux fan, I have done this example on an Ubuntu machine. The camera device and format (“video4linux2”) may differ on Windows. However, the steps to follow are still the same. Then the width and height of the captured frames are specified. If you have a 720p HD camera, 1280×720 is a good fit. Once set up, we can start the frame grabber.

frameGrabber.start();
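
As noted above, the capture format depends on the operating system. The following is a minimal sketch of how the format string could be chosen at runtime. CameraFormat and forOs are hypothetical names that are not part of the example project, and the “dshow” and “avfoundation” values are FFmpeg's capture formats for Windows and macOS, which I have not tested with this application.

```java
import java.util.Locale;

final class CameraFormat {

    // Maps an os.name value to an FFmpeg capture input format.
    // "video4linux2" comes from this article; "dshow" (Windows) and
    // "avfoundation" (macOS) are assumptions that were not tested here.
    static String forOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("mac")) {
            return "avfoundation";
        }
        if (os.contains("win")) {
            return "dshow";
        }
        // Default to the Linux format used throughout this article.
        return "video4linux2";
    }
}
```

It could be used as frameGrabber.setFormat(CameraFormat.forOs(System.getProperty("os.name"))). Keep in mind that the device string passed to the constructor changes too; with “dshow”, for example, FFmpeg expects something of the form “video=<camera name>” rather than a /dev path.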

Once started we have to repeatedly grab frames from the frame grabber as follows.

boolean running = true;
while (running) {
    try {
        // Here we grab frames from our camera
        final Frame frame = frameGrabber.grab();

    } catch (FrameGrabber.Exception e) {
        logger.error("Error when grabbing the frame", e);
    } 
}

In the above code, we just capture frames from the webcam without even showing them in a UI. Let’s do that part along with adding the gender and age recognition functionality to our code.

Face detection with Haar cascades

This is a part most of us have at least heard of. OpenCV/JavaCV provides direct methods to load Haar cascades and use them to detect faces. I will not explain this part in depth. You can refer to this class on GitHub and this blog post to learn more about using Haar cascades to detect faces. I have written a HaarFaceDetector class to detect faces. The following code snippet initializes the Haar cascade classifier.

File haarCascade = new File(this.getClass().getResource("/detection/haarcascade_frontalface_alt.xml").toURI());

haarClassifierCascade = new opencv_objdetect.CvHaarClassifierCascade(cvLoad(haarCascade.getAbsolutePath()));

Then the following method detects faces in a grabbed frame and crops them. I put the cropped faces, along with their coordinates (as CvRect objects), into a Map.

public Map<CvRect, Mat> detect(Frame frame) {
    Map<CvRect, Mat> detectedFaces = new HashMap<>();

    IplImage iplImage = iplImageConverter.convert(frame);

    /*
     * Returns a CvSeq (a kind of list) with one rectangle per detected face.
     * Each element is a CvRect: top-left corner (x, y) plus width and height.
     */
    CvSeq detectObjects = cvHaarDetectObjects(iplImage, haarClassifierCascade, storage, 1.5, 3, CV_HAAR_DO_CANNY_PRUNING);
    Mat matImage = toMatConverter.convert(frame);

    int numberOfPeople = detectObjects.total();
    for (int i = 0; i < numberOfPeople; i++) {
        CvRect rect = new CvRect(cvGetSeqElem(detectObjects, i));
        Mat croppedMat = matImage.apply(new Rect(rect.x(), rect.y(), rect.width(), rect.height()));
        detectedFaces.put(rect, croppedMat);
    }

    return detectedFaces;
}
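
The two numeric arguments passed to cvHaarDetectObjects above, 1.5 and 3, are the scale factor and the minimum number of neighbouring detections. To get a feel for the scale factor, here is a rough sketch that lists the search window sizes a multi-scale detector would step through. HaarScales is a hypothetical helper, the 20×20 base window is my assumption about haarcascade_frontalface_alt.xml, and OpenCV's actual stepping may differ in details such as rounding.

```java
import java.util.ArrayList;
import java.util.List;

final class HaarScales {

    // Lists the square window sizes tried when the base window is repeatedly
    // enlarged by scaleFactor until it no longer fits inside the image.
    static List<Integer> windowSizes(int baseWindow, double scaleFactor,
                                     int imageWidth, int imageHeight) {
        if (scaleFactor <= 1.0) {
            throw new IllegalArgumentException("scaleFactor must be greater than 1");
        }
        List<Integer> sizes = new ArrayList<>();
        int limit = Math.min(imageWidth, imageHeight);
        for (double factor = 1.0; Math.round(baseWindow * factor) <= limit; factor *= scaleFactor) {
            sizes.add((int) Math.round(baseWindow * factor));
        }
        return sizes;
    }
}
```

For a 1280×720 frame, HaarScales.windowSizes(20, 1.5, 1280, 720) yields 9 window sizes, while a smaller factor such as 1.1 yields many more. A larger factor therefore makes detection faster, at the risk of missing faces whose size falls between two steps.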

That’s it for face detection. Let’s see how these detected faces are processed to detect gender and age.

Gender Recognition with CNN

Gender recognition using OpenCV’s Fisherfaces implementation is quite popular, and some of you may have tried it or read about it. In this example, however, I will be using a different approach to recognize gender. This method was introduced by two Israeli researchers, Gil Levi and Tal Hassner, in 2015. I have used the CNN models trained by them in this example. We are going to use OpenCV’s dnn package, where “dnn” stands for “Deep Neural Networks”.

In the dnn package, OpenCV/JavaCV provides a class called Net which represents a neural network. Furthermore, this package supports importing neural network models from well-known deep learning frameworks such as Caffe, TensorFlow and Torch. The researchers mentioned above have published their CNN models as Caffe models. Therefore, we will be using the CaffeImporter to import those models into our application.

A Caffe model has two associated files:

  1. .prototxt — The definition of the CNN goes in here. This file defines the layers in the neural network, each layer’s inputs, outputs and functionality.
  2. .caffemodel — This contains the information of the trained neural network (trained model).

The following code segment is responsible for loading the CNN definition and the trained model (CNNGenderDetector.java).

genderNet = new Net();
File protobuf = new File(getClass().getResource("/caffe/deploy_gendernet.prototxt").toURI());
File caffeModel = new File(getClass().getResource("/caffe/gender_net.caffemodel").toURI());
Importer importer = createCaffeImporter(protobuf.getAbsolutePath(), caffeModel.getAbsolutePath());
importer.populateNet(genderNet);
importer.close();
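
One caveat about the loading code above: new File(getClass().getResource(...).toURI()) works while the resources sit on the file system (for example when running from the IDE), but it throws an exception once they are packaged inside a JAR, because a jar: URI cannot be turned into a File. A possible workaround, sketched below, is to copy the resource stream to a temporary file and hand that path to the importer. ResourceFiles and copyToTempFile are hypothetical names that are not part of the example project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ResourceFiles {

    // Copies a resource stream into a temporary file so that native code,
    // which needs a real file path, can read it. The stream is closed here.
    static Path copyToTempFile(InputStream resource, String suffix) throws IOException {
        Path target = Files.createTempFile("model-", suffix);
        // Best-effort cleanup when the JVM exits.
        target.toFile().deleteOnExit();
        try (InputStream in = resource) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

With this helper, the model path could be obtained as ResourceFiles.copyToTempFile(getClass().getResourceAsStream("/caffe/gender_net.caffemodel"), ".caffemodel").toString() and passed to createCaffeImporter in place of caffeModel.getAbsolutePath().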

The “deploy_gendernet.prototxt” file contains the definition of the CNN responsible for gender recognition, and the “gender_net.caffemodel” file contains the trained Caffe model. Note the following lines,

createCaffeImporter(protobuf.getAbsolutePath(),caffeModel.getAbsolutePath())

importer.populateNet(genderNet)

which import the model and populate the CNN. I have written the detectGender(Mat face, Frame frame) method in the CNNGenderDetector class to detect the gender of a face. The first parameter is the cropped face whose gender we want to detect, and the second parameter is the captured frame in which the face was detected (used inside the method for color and depth information). The following is the processing part of the method.

Mat croppedMat = new Mat();
resize(face, croppedMat, new Size(256, 256));
normalize(croppedMat, croppedMat, 0, Math.pow(2, frame.imageDepth), NORM_MINMAX, -1, null);

Blob inputBlob = new Blob(croppedMat);
genderNet.setBlob(".data", inputBlob);
genderNet.forward();
Blob prob = genderNet.getBlob("prob");

Indexer indexer = prob.matRefConst().createIndexer();
logger.debug("CNN results {},{}", indexer.getDouble(0, 0), indexer.getDouble(0, 1));
if (indexer.getDouble(0, 0) > indexer.getDouble(0, 1)) {
    logger.debug("Male detected");
    return Gender.MALE;
} else {
    logger.debug("Female detected");
    return Gender.FEMALE;
}

First, I resize the face (the Mat passed as a method parameter) into a new Mat, because the model has been trained to take inputs of size 256×256. Then I normalize the resized Mat with min-max normalization, which stretches its values to the range from 0 to 2 raised to the frame's image depth (256 for an 8-bit frame).
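
To make the NORM_MINMAX step more concrete, here is a plain Java sketch of min-max normalization on a small array. It only illustrates the arithmetic and is not OpenCV's implementation; MinMax is a hypothetical class, and returning the lower bound for a constant input is my own choice for the sketch.

```java
final class MinMax {

    // Linearly rescales values so that the smallest becomes alpha
    // and the largest becomes beta.
    static double[] normalize(double[] values, double alpha, double beta) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant input has no range to stretch, so map it to alpha.
            out[i] = (range == 0) ? alpha : alpha + (values[i] - min) / range * (beta - alpha);
        }
        return out;
    }
}
```

For example, MinMax.normalize(new double[] {10, 20, 30}, 0, 256) gives 0, 128 and 256: the darkest pixel moves to the bottom of the range and the brightest to the top, regardless of how much contrast the original crop had.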

Then I create a Blob from the face Mat and set that Blob as the input of the first layer of the CNN, the .data layer. genderNet.forward() then passes the input through the layers of the neural network, and finally we take the output of the last layer of the CNN, which is the probability layer (prob). In JavaCV we have to use indexers to read matrices and blobs, since JavaCV just uses JNI to invoke OpenCV methods. This gender recognition CNN outputs two values, indexed as (0,0) and (0,1) in a single-row matrix. The value at (0,0) is the probability of the face being male and the value at (0,1) is the probability of it being female. Since these two values sum to 1, the greater one gives the predicted gender. The complete class is available here.
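
The reason the two outputs sum to 1 is, as far as I can tell, that the prob layer is a softmax layer, which is the usual final layer of a Caffe classification network. I am assuming that here rather than quoting the prototxt. The sketch below shows what a softmax does to two raw scores; Softmax is a hypothetical class that is not part of the example project.

```java
final class Softmax {

    // Turns raw scores into probabilities that are positive and sum to 1.
    static double[] apply(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            // Subtracting the maximum keeps Math.exp from overflowing.
            out[i] = Math.exp(scores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }
}
```

With two classes, whichever score is larger ends up with a probability above 0.5, so comparing the values at (0,0) and (0,1) is the same as asking which class the network considers more likely.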

Age Recognition with CNN

This is very similar to the gender detection part, except that the corresponding prototxt file and Caffe model file are “deploy_agenet.prototxt” and “age_net.caffemodel”. Furthermore, the output layer (probability layer) of this CNN consists of 8 values for 8 age classes (“0–2”, “4–6”, “8–13”, “15–20”, “25–32”, “38–43”, “48–53” and “60-”). The CNNAgeDetector class is written to process detected faces and predict their age classes.

Mat resizedMat = new Mat();
resize(face, resizedMat, new Size(256, 256));
normalize(resizedMat, resizedMat, 0, Math.pow(2, frame.imageDepth), NORM_MINMAX, -1, null);

Blob inputBlob = new Blob(resizedMat);
ageNet.setBlob(".data", inputBlob);
ageNet.forward();
Blob prob = ageNet.getBlob("prob");

DoublePointer pointer = new DoublePointer(new double[1]);
Point max = new Point();
minMaxLoc(prob.matRefConst(), null, pointer, null, max, null);
return AGES[max.x()];

As in the gender recognition scenario, I first resize the face (the Mat passed as a method parameter) into a new Mat, because the model has been trained to take inputs of size 256×256. Then I normalize the resized Mat.

Then I create a Blob from the face Mat and set that Blob as the input of the first layer of the CNN, the .data layer. ageNet.forward() then passes the input through the layers of the neural network, and finally we take the output of the last layer of the CNN, which is the probability layer (prob). The minMaxLoc(prob.matRefConst(), null, pointer, null, max, null) call is used to get the predicted age class: the neural network returns an array of 8 probabilities, one per age class, and minMaxLoc reports the location of the largest one, whose x coordinate is the index into the AGES array. The complete class can be found here.
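
What minMaxLoc does here is essentially an arg-max over the 8 probabilities. The plain Java sketch below shows the same selection without OpenCV. The AGES name is taken from the snippet above, but its contents are my assumption based on the class list given earlier (written with plain hyphens), and AgeClasses is a hypothetical class.

```java
final class AgeClasses {

    // Assumed contents of AGES, following the class list in this article.
    static final String[] AGES = {
        "0-2", "4-6", "8-13", "15-20", "25-32", "38-43", "48-53", "60-"
    };

    // Returns the index of the largest value (the first one on ties).
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    // Maps the network's 8 output probabilities to an age class label.
    static String predict(double[] probabilities) {
        if (probabilities.length != AGES.length) {
            throw new IllegalArgumentException("Expected " + AGES.length + " probabilities");
        }
        return AGES[argMax(probabilities)];
    }
}
```

For instance, an output of {0.01, 0.02, 0.05, 0.10, 0.60, 0.15, 0.05, 0.02} has its largest value at index 4, so predict returns "25-32".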

Putting things together

I have created the main class, JavaCVExample, to put together the things we have done so far into a small UI application. You can refer to that class to understand what I have done there. I have also put some text around detected faces stating the predicted gender and age.

Let’s try this out

Now we are ready to run our application. Let’s try this out. The main method is located in the JavaCVExample class. When you run this class, your webcam will be switched on and the feed captured through that camera will be processed by our neural networks.

I would love to put a screenshot of myself being detected by my application. But why should I ruin your nice day 😀. You will see a red box around your face, with the gender and age labels shown in real time. Gender prediction is considerably accurate, while age prediction is less so. However, age prediction can still predict the class nearest to your actual age.

I hope you could follow the entire article and get the same results. The complete source code is available at https://github.com/IMS94/javacv-example. If you have any questions, feel free to leave a comment. I will do my best to answer them. Thank you for reading.
