Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids

Ahmed, Saad Bin; Naz, Saeeda; Razzak, Muhammad Imran; Yusof, Rubiyah

doi:10.1007/978-3-030-04167-0_28

Conference paper
First Online: 17 November 2018

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11301)

Abstract

Camera-captured images offer many aspects to investigate, and the emphasis of research generally depends on the regions of interest: the focus may be on color segmentation, object detection, or scene text analysis. Image analysis, visibility analysis, and layout analysis come easily to humans, as human behavioural traits suggest, but the same tasks are challenging when performed by machines. Learning machines learn from the properties associated with the samples they are given. Numerous approaches to scene text extraction and recognition have been designed in recent years, and efforts to improve accuracy are under way. Convolutional approaches have provided reasonable results on non-cursive text appearing in natural images. The work presented in this manuscript exploits the strength of linear pyramids by treating each pyramid as a feature of the provided sample. Each pyramid image is processed through various empirically selected kernels. Performance was investigated on Arabic text in each image pyramid of the EASTR-42k dataset. An error rate of 0.17% is reported on Arabic scene text recognition.
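
As a rough illustration of the idea summarized in the abstract (every pyramid level of an image is passed through a set of kernels), the following is a minimal sketch only. The abstract does not specify how the pyramids are built, which kernels were selected, or how the network is structured, so everything here is an assumption: the pyramid is a plain 2x2 average-pooling pyramid, the two kernels are hand-picked stand-ins for the "empirically selected kernels", and the function names (`downsample`, `build_pyramid`, `convolve2d_valid`, `pyramid_features`) are hypothetical.

```python
import numpy as np


def downsample(image: np.ndarray) -> np.ndarray:
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = image.shape
    h2, w2 = h - h % 2, w - w % 2  # drop an odd trailing row/column, if any
    blocks = image[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    return blocks.mean(axis=(1, 3))


def build_pyramid(image: np.ndarray, levels: int) -> list:
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def convolve2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Sliding-window cross-correlation without padding ('valid' mode)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def pyramid_features(image: np.ndarray, kernels: list, levels: int = 3) -> list:
    """Apply every kernel to every pyramid level.

    Returns a list (one entry per level) of lists (one map per kernel).
    """
    return [
        [convolve2d_valid(level, k) for k in kernels]
        for level in build_pyramid(image, levels)
    ]


# Assumed example kernels, not taken from the paper.
BOX_BLUR = np.full((3, 3), 1.0 / 9.0)
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))  # stand-in for a grayscale text crop
    features = pyramid_features(image, [BOX_BLUR, SOBEL_X], levels=3)
    for level_maps in features:
        print([m.shape for m in level_maps])
```

With a 32x32 input, three levels, and 3x3 kernels, the resulting feature maps are 30x30, 14x14, and 6x6 per kernel. In a trained network the kernels would be learned or tuned rather than fixed by hand as they are in this sketch.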

Acknowledgement

The authors would like to thank the Ministry of Education Malaysia and Universiti Teknologi Malaysia for funding this research project.

Author information

Authors and Affiliations

King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
Saad Bin Ahmed
Malaysia Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Saad Bin Ahmed & Rubiyah Yusof
University of Technology, Sydney, Australia
Muhammad Imran Razzak
Higher Education Department, Government Post Graduate College No. 01, Abbottabad, Pakistan
Saeeda Naz

Corresponding author

Correspondence to Muhammad Imran Razzak.

Editor information

Editors and Affiliations

The Chinese Academy of Sciences, Beijing, China
Long Cheng
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi Sing Leung
Kobe University, Kobe, Japan
Seiichi Ozawa

About this paper

Cite this paper

Ahmed, S.B., Naz, S., Razzak, M.I., Yusof, R. (2018). Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11301. Springer, Cham. https://doi.org/10.1007/978-3-030-04167-0_28

DOI: https://doi.org/10.1007/978-3-030-04167-0_28
Published: 17 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04166-3
Online ISBN: 978-3-030-04167-0
eBook Packages: Computer Science, Computer Science (R0)