A cartoon character is one of the people or animals in an animated film, such as Mickey Mouse or Tom & Jerry. Cartoons are an essential part of every childhood. They are certainly the most popular form of entertainment for children, but they are also much more than that: with the help of cartoons, kids can learn about the world around them, about new emotions, life issues, and other important things. So, just for fun, the goal of this project is to recognize cartoon characters using a deep learning algorithm.
In this article, we will look at how to use a deep neural network model to perform Cartoon Character Recognition with OpenCV.
REQUIREMENTS:
- 1. Python (python==3.7)
- 2. OpenCV (opencv==4.2.0), used for image loading and preprocessing
- 3. Keras (keras==2.3.1) on TensorFlow (tensorflow>=1.15.2), used for the Convolutional Neural Network (CNN) model
- 4. imutils==0.5.3
- 5. matplotlib==3.2.1
- 6. argparse==1.1
The images in the dataset were collected through Google searches and from various sites such as Disney's. The dataset contains 4 categories (Mickey Mouse, Donald Duck, Minions, and Winnie the Pooh) as of now, with a total of 2215 images. The dataset was converted into a structured format (collected in one place and manually labeled), and all preprocessing (resizing the images, applying filters to remove noise, etc.) was done with OpenCV-Python.
DEEP LEARNING ALGORITHM:
- 1. Model Architecture: For an introduction to CNN models, please go through an introductory article on CNNs. The architecture here differs in the number of layers, the parameters, and the hyperparameter tuning, but the basics are the same.
- 2. Calculating the number of parameters in CNNs: If you have been playing with CNNs, it is common to encounter a summary of parameters like the one shown in the image below. We all know it is easy to calculate the activation size, since it is merely the product of the width, the height, and the number of channels in that layer.
FIRST, WHAT ARE PARAMETERS?
Parameters, in general, are weights that are learned during training. They are weight matrices that contribute to the model's predictive power and are changed during back-propagation. What governs the change? The training algorithm you choose, particularly the optimization strategy, makes them change their values.
Now that you know what “parameters” are, let’s dive into calculating the number of parameters in the sample summary we saw above.
- 1. Input Layer: The input layer has nothing to learn; at its core, all it does is provide the input image's shape. So there are no learnable parameters here. Thus the number of parameters = 0.
- 2. CONV layer: This is where the CNN learns, so certainly we have weight matrices. To calculate the learnable parameters here, we multiply the filter width m, the filter height n, and the number of filters in the previous layer d, and account for all k such filters in the current layer. Don't forget the bias term for each filter. The number of parameters in a CONV layer is therefore ((m * n * d) + 1) * k, where the + 1 accounts for the bias term of each filter. The same expression can be written as ((width of the filter * height of the filter * number of filters in the previous layer) + 1) * number of filters, where “number of filters” refers to the current layer.
- 3. POOL layer: This has no learnable parameters, because all it does is compute a specific number; no backprop learning is involved. Thus the number of parameters = 0.
- 4. Fully Connected Layer (FC): This certainly has learnable parameters; in fact, compared to the other layers, this category has the highest number of parameters. Why? Because every neuron is connected to every neuron in the previous layer. The count is the product of the number of neurons in the current layer c and the number of neurons in the previous layer p, plus one bias term per current-layer neuron. Thus the number of parameters here is (c * p) + (1 * c).
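The two formulas above can be expressed as small helper functions (a minimal sketch; the function names are our own, not from the project's code):

```python
def conv_params(m, n, d, k):
    """Learnable parameters of a CONV layer:
    ((filter width m * filter height n * previous-layer filters d) + 1 bias) * k filters."""
    return ((m * n * d) + 1) * k

def fc_params(p, c):
    """Learnable parameters of a fully connected layer:
    (previous-layer neurons p * current-layer neurons c) + 1 bias per current neuron."""
    return (c * p) + (1 * c)

print(conv_params(3, 3, 3, 32))  # 896, a 3x3 conv over an RGB input with 32 filters
print(fc_params(238144, 4))      # 952580, a 4-way output over 238144 flattened neurons
```

These helpers make the layer-by-layer walkthrough below easy to check by hand.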
Now let’s follow these pointers and calculate the number of parameters, shall we?
- 1. The first input layer has no parameters.
- 2. Parameters in the second layer, CONV1 (filter shape = 3*3, stride = 1): ((width of filter * height of filter * number of filters in the previous layer) + 1) * number of filters = ((3*3*3) + 1) * 32 = 896.
- 3. Parameters in the third layer, CONV2 (filter shape = 3*3, stride = 1): ((3*3*32) + 1) * 32 = 9248.
- 4. The fourth layer, POOL1, has no parameters.
- 5. Parameters in the fifth layer, CONV3 (filter shape = 3*3, stride = 1): ((3*3*32) + 1) * 64 = 18496.
- 6. Parameters in the sixth layer, CONV4 (filter shape = 3*3, stride = 1): ((3*3*64) + 1) * 64 = 36928.
- 7. The seventh layer, POOL2, has no parameters.
- 8. The Softmax layer has (c * p) + (1 * c) parameters = (238144 * 4) + (1 * 4) = 952580.
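The whole walkthrough above can be reproduced in a few lines of plain arithmetic, which also gives the total trainable-parameter count:

```python
# Per-layer learnable parameters, following the formulas above:
# CONV: ((filter width * filter height * previous-layer filters) + 1) * filters
# FC:   (current neurons * previous neurons) + 1 bias per current neuron
counts = [
    ("INPUT",   0),
    ("CONV1",   ((3 * 3 * 3)  + 1) * 32),   # 896
    ("CONV2",   ((3 * 3 * 32) + 1) * 32),   # 9248
    ("POOL1",   0),
    ("CONV3",   ((3 * 3 * 32) + 1) * 64),   # 18496
    ("CONV4",   ((3 * 3 * 64) + 1) * 64),   # 36928
    ("POOL2",   0),
    ("SOFTMAX", (238144 * 4) + (1 * 4)),    # 952580
]

total = sum(p for _, p in counts)
print(total)  # 1018148 trainable parameters in all
```

Note how the fully connected softmax layer alone accounts for the vast majority of the parameters, which is typical for CNNs that flatten a large feature map before the classifier.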
HOW TO EXECUTE CODE:
- 1. You will first have to download the repository and then extract the contents into a folder.
- 2. Make sure you have the correct version of Python installed on your machine. This code runs on Python 3.6 and above.
- 3. Now, install the libraries required.
- 4. Now, you can create your own dataset and put it in the Data/ folder in the following format: Data/Category1, Data/Category2, and so on.
- 5. Training of the CNN Model: You can check Training.ipynb for training, and save the trained model inside the Model/ folder.
- 6. Testing of the CNN Model: You can use the pretrained model and run the commands below.
- 7. For recognizing a cartoon character in images, run the following command:
python Cartoon_Character_Recognition_in_Image.py --path Data/Donald.jpeg --model Model/model.h5
- 8. For recognizing cartoon characters in a real-time video stream, run the following command:
python Cartoon_Character_Recognition_in_Video.py --path Data/video.mp4 --model Model/model.h5
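Both scripts above share the same command-line interface. A minimal sketch of the argument parsing they would need (the flag names come from the commands above; the default value is our own assumption):

```python
import argparse

def build_parser():
    # Both recognition scripts accept an input path and a trained model file.
    parser = argparse.ArgumentParser(
        description="Cartoon Character Recognition")
    parser.add_argument("--path", required=True,
                        help="input image or video file, e.g. Data/Donald.jpeg")
    parser.add_argument("--model", default="Model/model.h5",
                        help="path to the trained Keras model (assumed default)")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.path, args.model)
```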
EXPERIMENTS AND RESULTS:
In this article, cartoon characters are recognized accurately. We used 4 categories; this can be extended by training the model on more categories, and a dataset for object detection could also be created (we plan to do this in the future, just for learning object detection).
Devashi Choudhary has been working on this inspiring project, “Cartoon Character Recognition using Deep Learning.” This initiative will potentially help AI developers spend more time on creative tasks and research the latest AI topics.