Vol. 10, No. 4, November 2021. ISSN: 2217-8309

eISSN: 2217-8333

 

TEM Journal

 

TECHNOLOGY, EDUCATION, MANAGEMENT, INFORMATICS

Association for Information Communication Technology Education and Science


Subword Recognition in Historical Arabic Documents using C-GRUs

 

Hanadi Hassen, Somaya Al-Madeed, Ahmed Bouridane

 

© 2021 Hanadi Hassen, published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. (CC BY-NC-ND 4.0)

 

Citation Information: TEM Journal. Volume 10, Issue 4, Pages 1630-1637, ISSN 2217-8309, DOI: 10.18421/TEM104-19, November 2021.

 

Received: 10 August 2021.

Revised: 27 September 2021.
Accepted: 06 October 2021.
Published: 26 November 2021.

 

Abstract:

 

Recent years have witnessed a growing effort to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end users direct access to the images. Recognizing Arabic handwriting is challenging because of the highly cursive nature of the script and the additional problems associated with historical documents, such as degradation. This paper presents an end-to-end system for recognizing Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, achieving recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords.

 

Keywords – handwriting recognition, Arabic historical documents, CNNs, GRUs, classification.
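
The sequence-modelling role of the GRU layers mentioned in the abstract can be illustrated with a minimal sketch of the standard GRU update. This is not the authors' implementation: the abstract gives no layer sizes, weights, or training details, so the function names (gru_step, run_gru, zero_params), the parameter layout, and the idea of feeding one feature vector per image column are all assumptions made for illustration. The sketch uses the common convention h_t = (1 - z_t) * h_{t-1} + z_t * h~_t; some references swap the roles of z_t and 1 - z_t.

```python
import math


def sigmoid(x):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, params):
    """One standard GRU update for input x and previous hidden state h_prev.

    params is a hypothetical dict holding W_g (hidden x input),
    U_g (hidden x hidden) and b_g (hidden) for g in {"z", "r", "h"}.
    """
    def affine(gate, h):
        wx = matvec(params["W_" + gate], x)
        uh = matvec(params["U_" + gate], h)
        return [a + b + c for a, b, c in zip(wx, uh, params["b_" + gate])]

    z = [sigmoid(a) for a in affine("z", h_prev)]      # update gate
    r = [sigmoid(a) for a in affine("r", h_prev)]      # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]     # reset-scaled state
    h_cand = [math.tanh(a) for a in affine("h", gated)]  # candidate state
    # Blend the previous state and the candidate state with the update gate.
    return [(1.0 - zi) * hp + zi * hc
            for zi, hp, hc in zip(z, h_prev, h_cand)]


def run_gru(sequence, params, hidden_size):
    """Run the GRU over a sequence of feature vectors, starting from zeros.

    In a CNN-GRU recognizer, each vector would typically be one column of
    the convolutional feature map (an assumption, not stated in the abstract).
    """
    h = [0.0] * hidden_size
    for x in sequence:
        h = gru_step(x, h, params)
    return h


def zero_params(input_size, hidden_size):
    """Build all-zero parameters; useful only for checking the arithmetic."""
    params = {}
    for gate in ("z", "r", "h"):
        params["W_" + gate] = [[0.0] * input_size for _ in range(hidden_size)]
        params["U_" + gate] = [[0.0] * hidden_size for _ in range(hidden_size)]
        params["b_" + gate] = [0.0] * hidden_size
    return params
```

With all-zero parameters both gates evaluate to 0.5 and the candidate state is zero, so a single step simply halves the previous hidden state. That makes the blending rule easy to check by hand before substituting trained weights.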

 


 


Copyright © 2021 UIKTEN
Copyright licence: All articles are licensed under the Creative Commons CC BY-NC-ND 4.0 licence.