Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions (ICANN 2019)

Abstract

Sign language recognition is a challenging task that plays an important social role in facilitating communication for deaf people. We propose a new approach to sign language recognition that applies a Spatial-Temporal Graph Convolutional Network to human skeletal movements. The method uses graphs to capture the dynamics of signs along two dimensions, spatial and temporal, taking the complex aspects of the language into account. Additionally, we present a new dataset of human skeletons for sign language, based on ASLLVD, to support future related studies.
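To make the idea of capturing sign dynamics "in two dimensions, spatial and temporal" more concrete, the following is a minimal NumPy sketch of a single spatial-temporal graph convolution step over a skeleton sequence. It is not the authors' implementation: the 5-joint chain skeleton, the function names `normalized_adjacency` and `st_gcn_layer`, the random weights, and the fixed averaging kernel over time are all assumptions made for illustration. The spatial step uses the common symmetric normalization of the adjacency matrix with self-loops, whereas the network described in [20] uses learned temporal filters and joint-partitioning strategies that are omitted here.

```python
import numpy as np

def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = np.eye(num_joints)                # self-loops
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0           # undirected bone between joints i and j
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def st_gcn_layer(x, a_hat, w, temporal_kernel):
    """Apply one simplified spatial-temporal graph convolution.

    x: (T, V, C_in) frames x joints x channels
    a_hat: (V, V) normalized adjacency
    w: (C_in, C_out) feature weights
    temporal_kernel: (K,) weights over neighboring frames, K odd
    """
    # Spatial step: mix features of connected joints within each frame.
    spatial = np.einsum("uv,tvc,cd->tud", a_hat, x, w)
    # Temporal step: combine each joint's features across neighboring frames,
    # zero-padding the sequence so the output keeps T frames.
    k = len(temporal_kernel)
    pad = k // 2
    padded = np.pad(spatial, ((pad, pad), (0, 0), (0, 0)))
    out = sum(temporal_kernel[i] * padded[i:i + x.shape[0]] for i in range(k))
    return np.maximum(out, 0.0)           # ReLU

# Hypothetical 5-joint chain (not the skeleton layout used in the paper).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(5, edges)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 5, 2))           # 10 frames, 5 joints, (x, y) coordinates
w = rng.normal(size=(2, 8))               # project 2 input channels to 8 features
y = st_gcn_layer(x, a_hat, w, np.ones(3) / 3)
print(y.shape)                            # (10, 5, 8)
```

In a trained model, `w` and the temporal filter would be learned, and several such layers would be stacked before a classifier over sign labels.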

Notes

  1. Dactylology - digital or manual alphabet, generally used by the deaf to introduce a word that does not yet have an equivalent sign [11].

  2. Available at https://github.com/yysijie/st-gcn.

  3. Available at http://csr.bu.edu/asl/asllvd/annotate/index.html.

  4. Available at http://www.cin.ufpe.br/~cca5/asllvd-skeleton.

  5. Available at http://www.cin.ufpe.br/~cca5/st-gcn-sl.

  6. Available at http://www.cin.ufpe.br/~cca5/asllvd-skeleton-20.

References

  1. Athitsos, V., et al.: The American sign language lexicon video dataset. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–8, June 2008. https://doi.org/10.1109/CVPRW.2008.4563181

  2. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: CVPR (2017). https://doi.org/10.1109/CVPR.2017.143

  3. Das, A., Gawde, S., Suratwala, K., Kalbande, D.: Sign language recognition using deep learning on custom processed static gesture images. In: 2018 International Conference on Smart City and Emerging Technology (ICSCET), pp. 1–6, January 2018. https://doi.org/10.1109/ICSCET.2018.8537248

  4. ElBadawy, M., Elons, A.S., Shedeed, H.A., Tolba, M.F.: Arabic sign language recognition with 3D convolutional neural networks. In: 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), pp. 66–71, December 2017. https://doi.org/10.1109/INTELCIS.2017.8260028

  5. Konstantinidis, D., Dimitropoulos, K., Daras, P.: Sign language recognition based on hand and body skeletal data. In: 2018 - 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), pp. 1–4, June 2018. https://doi.org/10.1109/3DTV.2018.8478467

  6. Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, June 2008. https://doi.org/10.1109/CVPR.2008.4587756

  7. Lim, K.M., Tan, A.W., Tan, S.C.: Block-based histogram of optical flow for isolated sign language recognition. J. Vis. Commun. Image Represent. 40, 538–545 (2016). https://doi.org/10.1016/j.jvcir.2016.07.020

  8. Neidle, C., Thangali, A., Sclaroff, S.: Challenges in development of the American sign language Lexicon video dataset (ASLLVD) corpus. In: 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, LREC 2012, Istanbul, Turkey, May 2012. https://open.bu.edu/handle/2144/31899

  9. World Health Organization: Deafness and hearing loss, March 2018. https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss. Online

  10. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011). http://www.ee.columbia.edu/~ronw/pubs/jmlr2011-scikit-learn.pdf

  11. Pereira, M.C.d.C., Choi, D., Vieira, M.I., Gaspar, P., Nakasato, R.: Libras - Conhecimento Além Dos Sinais, 1st edn. Pearson, São Paulo (2011)

  12. Peres, S.M., Flores, F.C., Veronez, D., Olguin, C.J.M.: Libras signals recognition: a study with learning vector quantization and bit signature. In: 2006 Ninth Brazilian Symposium on Neural Networks (SBRN 2006), pp. 119–124, October 2006. https://doi.org/10.1109/SBRN.2006.26

  13. Pigou, L., Herreweghe, M.V., Dambre, J.: Gesture and sign language recognition with temporal residual networks. In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 3086–3093, October 2017. https://doi.org/10.1109/ICCVW.2017.365

  14. de Quadros, R.M., Karnopp, L.B.: Língua de sinais brasileira: estudos linguísticos, vol. 1. Artmed, Porto Alegre (2004)

  15. Rao, G.A., Syamala, K., Kishore, P.V.V., Sastry, A.S.C.S.: Deep convolutional neural networks for sign language recognition. In: 2018 Conference on Signal Processing And Communication Engineering Systems (SPACES), pp. 194–197, January 2018. https://doi.org/10.1109/SPACES.2018.8316344

  16. Sajanraj, T.D., Beena, M.: Indian sign language numeral recognition using region of interest convolutional neural network. In: 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 636–640, April 2018. https://doi.org/10.1109/ICICCT.2018.8473141

  17. Simon, T., Joo, H., Matthews, I.A., Sheikh, Y.: Hand keypoint detection in single images using multiview bootstrapping (2017). http://arxiv.org/abs/1704.07809

  18. Taskiran, M., Killioglu, M., Kahraman, N.: A real-time system for recognition of American sign language by using deep learning. In: 2018 41st International Conference on Telecommunications and Signal Processing (TSP), pp. 1–5 (2018). https://doi.org/10.1109/TSP.2018.8441304

  19. Wei, S., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4724–4732, June 2016. https://doi.org/10.1109/CVPR.2016.511

  20. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. CoRR abs/1801.07455 (2018). http://arxiv.org/abs/1801.07455

  21. Zheng, L., Liang, B., Jiang, A.: Recent advances of deep learning for sign language recognition. In: 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–7, November 2017. https://doi.org/10.1109/DICTA.2017.8227483

Author information

Corresponding author

Correspondence to Cleison Correia de Amorim.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

de Amorim, C.C., Macêdo, D., Zanchettin, C. (2019). Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_59

  • DOI: https://doi.org/10.1007/978-3-030-30493-5_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30492-8

  • Online ISBN: 978-3-030-30493-5

  • eBook Packages: Computer Science (R0)
