The problem of improving computer vision accuracy through the use of convolutional neural networks for programmable robotic systems on the "Arduino" platform

Alakina-Kaminskaya A.I., Okhapkin V.P.

Russian State University for the Humanities, Institute for Information Sciences and Security Technologies, dep. Complex information security, dep. Information technologies and systems, Russia, 117534, Moscow, Kirovogradskaya St., 25, Bldg. 2, phone: 8(495)250-62-66

The synthesis of mathematical methods of object detection with the technical means of industrial manufacturing has created an extensive field of engineering. The first theoretical work on object detection was done by scientists of the Soviet Union at the beginning of the second half of the twentieth century. Among those developments, it is necessary to note the research of A. A. Kharkevich, V. M. Glushkov, V. S. Mikhalevich and V. S. Pugachev, known as scientists of the USSR Academy of Sciences, of RAS academician Y. I. Zhuravlev, and the works of other domestic and foreign scientists in cybernetics theory, functional analysis, stochastic systems and machine learning theory.

The development of robotic systems has revealed new challenges in the creation of image recognition methods and algorithms, notably the recognition of objects viewed at an angle at the moment the snapshot is taken.

A convolutional neural network (hereinafter CNN) is one of the effective ways to identify objects in an image [1]. The basis of such a network is the convolution operation, which analyzes the original image by moving a size-limited weight matrix (the convolution kernel) across it. In CNN theory, the convolution kernel can be regarded as a graphical code of a feature, for instance, of a certain object located at an angle to the device taking the shot. In that case, every subsequent layer of the CNN receives a feature map of the image being analyzed.
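
The sliding of a kernel over an image can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name convolve2d_valid, the 4x5 test image and the 3x3 edge kernel are hypothetical, and the sketch assumes a single-channel image, stride 1 and no padding. As is usual in CNN practice, the kernel is applied without flipping.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Both arguments are lists of rows. The kernel is applied without flipping,
    which is the convention commonly used in CNN layers.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1

    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            acc = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x5 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hypothetical 3x3 kernel that responds to a left-to-right increase in intensity.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d_valid(image, kernel)
```

For this input the feature map has two rows and three columns, with zero response over the uniform dark region and a positive response wherever the kernel overlaps the edge. In this sense the kernel acts as a code for one feature, and the resulting map is what the next layer would receive.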

Hypothesis: there exists a convolution kernel whose singular value decomposition allows the analyzed image to be rotated and scaled in three-dimensional space.
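
One possible reading of the hypothesis rests on the standard geometric interpretation of the singular value decomposition. The sketch below assumes a real 3×3 kernel K; the kernel size and the symbols K, U, Σ, V and x are illustrative assumptions, since the abstract does not specify how the decomposition is applied to the image.

```latex
K = U \Sigma V^{\mathsf{T}}, \qquad
U^{\mathsf{T}} U = V^{\mathsf{T}} V = I, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3), \quad
\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge 0 .

% Acting on a vector x \in \mathbb{R}^3:
K x = U \bigl( \Sigma \, ( V^{\mathsf{T}} x ) \bigr)
```

Read from right to left, the orthogonal factor V^T rotates (or reflects) x, the diagonal factor Σ scales it along the coordinate axes, and the orthogonal factor U rotates (or reflects) the result. Under these assumptions, the rotation and the scaling mentioned in the hypothesis correspond to the orthogonal and diagonal factors respectively.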

A computational experiment to verify the formulated hypothesis is implemented on the "Arduino" robotic platform equipped with a video camera module.


[1] LeCun Y., Bengio Y. Convolutional networks for images, speech, and time-series // The handbook of brain theory and neural networks, Vol. 3361, Issue 10, 1995, pp. 1995-2009.
