Journal of Shanghai Jiaotong University (Science)
Sensorimotor Self-Learning Model Based on Operant Conditioning for Two-Wheeled Robot
Online published: 2017-04-04
Traditional control methods for two-wheeled robots are usually model-based and require a precise mathematical model of the robot, which is hard to obtain. A sensorimotor self-learning model named SMM TWR is presented in this paper to address this problem. The model consists of seven elements: the discrete learning time set, the sensory state set, the motion set, the sensorimotor mapping, the state orientation unit, the learning mechanism and the model's entropy. The learning mechanism for SMM TWR is designed based on the theory of operant conditioning (OC); it adjusts the sensorimotor mapping at every learning step, which in turn guides the robot's choice of motions. The learning direction of the mechanism is decided by the state orientation unit. Simulation results show that, with the designed sensorimotor model, the robot is endowed with the abilities of self-learning and self-organizing, and that it can learn the skill of keeping itself balanced through interaction with the environment.
ZHANG Xiaoping1,2* (张晓平), RUAN Xiaogang1 (阮晓钢), XIAO Yao1 (肖尧), HUANG Jing1 (黄静). Sensorimotor Self-Learning Model Based on Operant Conditioning for Two-Wheeled Robot[J]. Journal of Shanghai Jiaotong University (Science), 2017, 22(2): 148-155. DOI: 10.1007/s12204-017-1814-8