Convolutional Neural Network for Spatial Perception of InMoov Robot Through Stereoscopic Vision as an Assistive Technology

DOI:

https://doi.org/10.29019/enfoqueute.776

Keywords:

Humanoid Robotics, Convolutional Neural Networks, Spatial Perception, Transfer Learning

Abstract

In the development of assistive robots, a major challenge is improving the robot's spatial perception so that it can identify objects in varied scenarios, which requires tools for analyzing and processing artificial stereo vision data. This paper describes a convolutional neural network (CNN) algorithm, implemented on a Raspberry Pi 3 placed on the head of a replica of the open-source humanoid robot InMoov, that estimates the X, Y, Z position of an object within a controlled environment. The paper covers the construction of the InMoov robot head, the application of transfer learning to detect and segment an object within a controlled environment, the development of the CNN architecture, and the assignment and evaluation of training parameters. The resulting estimated average error was 27 mm in the X coordinate, 21 mm in the Y coordinate, and 4 mm in the Z coordinate. These figures matter when the coordinates are used to guide a robotic arm that reaches for and grasps the object, a topic left for future work.
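As an illustration of how a per-coordinate average error like the one reported above could be computed, the following minimal sketch takes the mean absolute error separately on each axis. It assumes the metric is a mean absolute error, which the abstract does not state; the function name `per_axis_mae` and the sample positions are hypothetical and are not taken from the paper.

```python
from statistics import mean


def per_axis_mae(predicted, actual):
    """Return the mean absolute error on each of the X, Y, Z axes.

    Both arguments are sequences of (x, y, z) positions in the same unit.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return tuple(
        mean(abs(p[axis] - a[axis]) for p, a in zip(predicted, actual))
        for axis in range(3)
    )


# Hypothetical positions in millimetres (illustrative only, not paper data).
actual = [(100, 200, 500), (150, 180, 520), (90, 210, 480)]
predicted = [(110, 190, 502), (130, 200, 516), (120, 210, 480)]

mae_x, mae_y, mae_z = per_axis_mae(predicted, actual)
print(f"X: {mae_x} mm, Y: {mae_y} mm, Z: {mae_z} mm")
```

Reporting the error per axis, rather than as a single Euclidean distance, shows which coordinate limits accuracy when the estimate is passed on to an arm controller.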

Published

2021-10-01

How to Cite

Cortes Zarta, J. F., Giraldo Tique, Y. A., & Vergara Ramirez, C. F. (2021). Convolutional Neural Network for Spatial Perception of InMoov Robot Through Stereoscopic Vision as an Assistive Technology. Enfoque UTE, 12(4), 88-104. https://doi.org/10.29019/enfoqueute.776

Section

Miscellaneous