Transfer learning deep convolutional neural network for RGB-D face recognition

Puveneswari Shunmugam


Date Issued

2020

Author(s)

Puveneswari Shunmugam

Handle (URI)

https://hdl.handle.net/20.500.14170/13681

Abstract

Two-dimensional face recognition has been researched for the past few decades. With the recent development of deep learning approaches based on Deep Convolutional Neural Networks (DCNNs), two-dimensional face recognition has achieved impressive recognition accuracy. However, challenges such as pose variation, scene illumination, facial emotions, and facial occlusions still remain in two-dimensional face recognition. These challenges can be addressed by adding Depth images as input, since they provide valuable information that helps model facial boundaries, understand facial features, and supply low-frequency patterns. RGB-D images are more robust than RGB images alone. Unfortunately, the lack of large RGB-D face databases for training a DCNN is the main reason this area was not explored earlier. With the transfer learning approach, a few studies on RGB-D face recognition have appeared very recently. As the first contribution, this research constructed a new RGB-D face database covering various face challenges (illumination, occlusion, emotion, and face pose) using the Intel RealSense D435 Depth Camera, which has better Depth resolution than the Microsoft Kinect Camera. As the second contribution, this research developed a Twin-DCNN architecture, based on the Inception-ResNet-V2 and VGG16 models, which takes RGB-D images as input. The RGB stream (Inception-ResNet-V2) processes RGB images, while the Depth stream (VGG16) processes Depth images separately. All upper layers of each pre-trained model were retained, and the lower layers were fine-tuned by adding a fully connected layer to each model. The RGB and Depth stream DCNN models were then concatenated. Finally, a softmax layer with 50 output classes was added. The developed Twin-DCNN model achieved 96% accuracy on our newly constructed RGB-D database.
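The fusion step described in the abstract (one fully connected layer per stream, concatenation of the two streams, then a softmax over 50 classes) can be sketched in plain Python. This is only an illustrative sketch, not the thesis implementation: the pre-trained Inception-ResNet-V2 and VGG16 backbones are replaced by random stand-in feature vectors, and the feature sizes (1536 and 512), the fully connected width (64), the ReLU activation, the random weights, and all function names are assumptions that do not come from the abstract. Only the two-stream layout and the 50 output classes are taken from the text.

```python
import math
import random

NUM_CLASSES = 50      # output classes stated in the abstract
RGB_FEATURES = 1536   # assumed size of the pooled RGB-stream feature vector
DEPTH_FEATURES = 512  # assumed size of the pooled Depth-stream feature vector
FC_UNITS = 64         # hypothetical width of the added fully connected layers


def random_layer(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply a fully connected layer: one weighted sum plus bias per unit."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU (the activation choice is an assumption)."""
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def twin_head(rgb_feat, depth_feat, rgb_fc, depth_fc, classifier):
    """Per-stream fully connected layer, concatenation, then softmax."""
    rgb_out = relu(dense(rgb_feat, rgb_fc))
    depth_out = relu(dense(depth_feat, depth_fc))
    fused = rgb_out + depth_out  # list concatenation of the two streams
    return softmax(dense(fused, classifier))


# Stand-in inputs and untrained weights, for shape checking only.
rng = random.Random(0)
rgb_fc = random_layer(RGB_FEATURES, FC_UNITS, rng)
depth_fc = random_layer(DEPTH_FEATURES, FC_UNITS, rng)
classifier = random_layer(2 * FC_UNITS, NUM_CLASSES, rng)

rgb_feat = [rng.random() for _ in range(RGB_FEATURES)]
depth_feat = [rng.random() for _ in range(DEPTH_FEATURES)]
probs = twin_head(rgb_feat, depth_feat, rgb_fc, depth_fc, classifier)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

In this sketch the classifier sees a 128-element fused vector, so each stream contributes equally sized evidence regardless of how large its backbone features are. With untrained random weights the predicted class is meaningless; the example only shows how the two streams meet before the 50-way softmax.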