Part of word segmentation and recognition techniques for Urdu handwritten text

Date Issued
2020
Author(s)
Muhammad Kashif Siddhu
Handle (URI)
https://hdl.handle.net/20.500.14170/9556
Abstract
This thesis investigates the recognition of Urdu handwritten text. Handwriting recognition systems have several applications, such as forms processing, postal automation, document digitization and bank cheque processing. Text recognition in Urdu handwritten documents is in its infancy, and this is the first work on segmentation-based Urdu handwritten text recognition. The thesis addresses all three phases required for segmentation-based handwritten text recognition: segmentation, data augmentation and recognition. It proposes a novel algorithm that segments Urdu handwritten text line images into Parts of Words (POWs). Because the available dataset is too small to train a learning-based classifier, a data augmentation technique is designed to increase the amount of data. For this purpose, Auxiliary Classifier Generative Adversarial Networks (ACGANs), a variation of Deep Convolutional Generative Adversarial Networks (DCGANs), are used in combination with affine transformations to generate images that look as if they were written by a human. This is the pioneering work that implements a deep generative model for data augmentation of Urdu POWs. For POW recognition, three deep learning classifiers are analyzed: AlexNet, VGG16 and VGG19. All of them are deep Convolutional Neural Networks (CNNs) that have achieved state-of-the-art performance on natural images as well as on handwritten text images. The classifiers are trained with transfer learning, starting from pre-trained models of these architectures. Experiments are performed on an Urdu handwritten dataset named UNHD. For POW segmentation, experiments are also performed on an Arabic handwritten dataset named IFN/ENIT for comparison with other segmentation algorithms proposed in the literature.
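The affine-transformation part of the augmentation described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the parameter ranges (rotation, shear, scale, shift) and the nearest-neighbour resampling are all assumptions chosen for illustration, and the ACGAN component is not covered. The image is a plain list of rows so that the example needs only the Python standard library.

```python
import math
import random


def affine_matrix(angle_deg=0.0, shear=0.0, scale=1.0, tx=0.0, ty=0.0):
    """Build a 2x3 matrix: x-shear, then rotation, then scale, then shift."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        [scale * cos_a, scale * (cos_a * shear - sin_a), tx],
        [scale * sin_a, scale * (sin_a * shear + cos_a), ty],
    ]


def warp(image, matrix, fill=0):
    """Apply the transform about the image centre (nearest neighbour)."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("affine matrix is not invertible")
    # Inverse of the linear part, used to look up the source pixel
    # for every output pixel (inverse mapping avoids holes).
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            u, v = x - cx - tx, y - cy - ty
            src_x = int(round(ia * u + ib * v + cx))
            src_y = int(round(ic * u + id_ * v + cy))
            if 0 <= src_y < h and 0 <= src_x < w:
                out[y][x] = image[src_y][src_x]
    return out


def random_variant(image, rng, max_angle=5.0, max_shear=0.15, max_shift=1.0):
    """Draw one randomly perturbed copy; the ranges are hypothetical."""
    matrix = affine_matrix(
        angle_deg=rng.uniform(-max_angle, max_angle),
        shear=rng.uniform(-max_shear, max_shear),
        scale=rng.uniform(0.95, 1.05),
        tx=rng.uniform(-max_shift, max_shift),
        ty=rng.uniform(-max_shift, max_shift),
    )
    return warp(image, matrix)


if __name__ == "__main__":
    # Toy 5x7 "stroke" standing in for a segmented POW image.
    pow_image = [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 1, 0],
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    rng = random.Random(0)
    for row in random_variant(pow_image, rng):
        print("".join("#" if px else "." for px in row))
```

Small perturbations of this kind keep the class label of a POW unchanged while varying slant, size and position, which is why they are commonly paired with generated samples when the training set is small.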
The results show excellent performance of the proposed segmentation algorithm on both datasets: a detection rate of 80.22% is achieved on the UNHD dataset and 94.73% on the IFN/ENIT dataset. For POW augmentation and recognition, the experiments are performed on the UNHD dataset, using the original data (without augmentation) as well as the augmented data. The results show a significant improvement in performance when the augmented data is used. The best recognition accuracy, 96.48%, is achieved by VGG16 with the augmented data, which is 4.48% better than the accuracy on the original data. This work therefore provides solutions for all phases of the segmentation-based Urdu handwritten text recognition pipeline and achieves excellent results. As the first work on segmentation-based Urdu handwritten text recognition, it can serve as a benchmark for future research in this direction.
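The abstract does not state the accuracy obtained on the original data, nor whether the 4.48% gain is an absolute or a relative difference. The short calculation below works out the implied baseline under both readings. Only the two input figures come from the abstract; which reading applies is an assumption that the full text would settle.

```python
best_accuracy = 96.48  # VGG16 with augmented data (from the abstract)
reported_gain = 4.48   # improvement over the original data (from the abstract)

# Reading 1: the gain is an absolute difference in percentage points.
baseline_absolute = round(best_accuracy - reported_gain, 2)

# Reading 2: the gain is relative to the baseline accuracy.
baseline_relative = round(best_accuracy / (1 + reported_gain / 100), 2)

print("baseline if absolute gain:", baseline_absolute)
print("baseline if relative gain:", baseline_relative)
```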
Subjects
  • Text recognition
  • Handwriting
  • Urdu
  • Part of word

File(s)
  • Pages 1-24.pdf (1.03 MB)
  • Full Text.pdf (1.53 MB)
  • Declaration Form.pdf (732.85 KB)