Training Neural Networks for Event-Based End-to-End Robot Control
End-to-end robot controllers using DQN, Braitenberg Vehicle, SNN, R-STDP
I reproduced this project with a Pioneer P3-DX mobile robot in the V-REP simulator on 2023-10-28. There are four controllers: Braitenberg, DQN, DQN+SNN, and R-STDP.
Check the code on GitHub
The robot is equipped with a DVS camera, as shown in the picture below:
A link to a video demo is attached below:
Host Machine Environment
I created an Ubuntu 16.04 virtual machine on ESXi 8.0 with 4 cores, 8 GB of memory, and a 25 GB hard disk.
Dependencies
- ROS-kinetic
- NEST2.10
- V-REP 3.6
- Python2
Running the V-REP simulator
$ cd /V-REP_PRO_V3_6_2_Ubuntu16_04
$ ./vrep.sh
A screenshot of the running simulator is attached below:
Start the Controller
Braitenberg
$ python Controller/Braitenberg/controller.py
After you start the controller, the output looks like the picture below:
Framework
Simulated Environment
The experiment is based on a lane-following scenario, depicted in the picture below.
Input Event data
The input data pre-processing is depicted in the picture below:
The raw 128x128 DVS frames are divided into small 4x4 regions, and the events in each region are counted over consecutive frames regardless of polarity, which gives a 32 × 32 image. The image is then cropped at the top and bottom, resulting in a 32 × 16 image. Finally, the image is binarized so that every value is either 0 or 1.
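The pre-processing steps can be sketched with NumPy as follows. This is a minimal sketch, not the code from the repository: the function name `preprocess_dvs`, the input layout `(T, 128, 128)`, the equal crop of 8 rows at the top and 8 at the bottom, and the "at least one event" binarization threshold are all assumptions made for illustration.

```python
import numpy as np


def preprocess_dvs(frames, block=4, crop_top=8, crop_bottom=8):
    """Turn a stack of DVS frames into a binary low-resolution image.

    frames: array of shape (T, 128, 128). Nonzero entries mark events;
            the sign (polarity) is ignored.
    """
    frames = np.asarray(frames)
    # Count events per pixel over the consecutive frames, ignoring polarity.
    counts = (frames != 0).sum(axis=0)
    h, w = counts.shape
    # Sum the counts inside each block x block region (128x128 -> 32x32).
    pooled = counts.reshape(h // block, block, w // block, block).sum(axis=(1, 3))
    # Crop rows at the top and bottom (32 rows -> 16 rows). Assumed split.
    cropped = pooled[crop_top:pooled.shape[0] - crop_bottom, :]
    # Binarize: 1 where at least one event fell into the region (assumed threshold).
    return (cropped > 0).astype(np.uint8)


if __name__ == "__main__":
    demo = np.zeros((3, 128, 128), dtype=int)
    demo[0, 40, 5] = 1    # ON event
    demo[2, 40, 6] = -1   # OFF event in the same 4x4 region
    state = preprocess_dvs(demo)
    print(state.shape)    # 16 rows by 32 columns, i.e. a 32 x 16 image
```

The returned array has 16 rows and 32 columns, which matches the 32 × 16 (width × height) image described above.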
DQN
Defined Rewards of MDP per time-step
The reward is defined as a Gaussian function of the distance to the lane center.
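A Gaussian-shaped reward of this kind could look like the sketch below. The function name `lane_reward`, the peak value of 1.0, and the width `sigma=0.1` are assumptions for illustration; the values used in the actual project may differ.

```python
import math


def lane_reward(distance, sigma=0.1):
    """Gaussian-shaped reward of the lane-center distance.

    Returns 1.0 on the lane center and decays symmetrically as the robot
    moves away from it. `sigma` (assumed value) sets how fast it decays.
    """
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    for d in (0.0, 0.05, 0.1, 0.3):
        print(d, lane_reward(d))
```

With this shape the agent is rewarded most for staying on the lane center, and deviations to the left or right are penalized equally.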
Communication between components of the DQN controller
DQN+SNN
The SNN is trained to approximate the policy given by the Q-network of the DQN algorithm.
The Algorithm of Converting ANNs to SNNs
Model-based normalization algorithm for converting ANNs into SNNs, according to Diehl et al. [2] ("Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing").
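As a rough illustration of model-based normalization, the sketch below rescales each layer by the largest positive input any of its neurons could receive. It assumes bias-free layers with inputs in [0, 1] and weight matrices of shape `(n_out, n_in)`; the function name `model_based_normalize` is hypothetical and this is not the code from the repository.

```python
import numpy as np


def model_based_normalize(layer_weights):
    """Rescale each layer's weights by its largest possible positive input.

    layer_weights: list of arrays, each of shape (n_out, n_in).
    Returns new arrays; the inputs are left untouched.
    """
    normalized = []
    for w in layer_weights:
        w = np.asarray(w, dtype=float)
        # Worst case per neuron: every positive input weight is active at once.
        positive_input = np.clip(w, 0.0, None).sum(axis=1)
        max_pos_input = positive_input.max()
        # Leave the layer unscaled if it has no positive weights at all.
        scale = max_pos_input if max_pos_input > 0 else 1.0
        normalized.append(w / scale)
    return normalized


if __name__ == "__main__":
    weights = [np.array([[0.5, -1.0, 1.5], [2.0, 1.0, -3.0]])]
    print(model_based_normalize(weights)[0])
```

After scaling, no neuron can receive more than one unit of positive input per time step, so the spiking network does not need to fire more than once per step to represent the activation.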
Communication between components of the DQN-SNN controller
The communication diagram is attached below:
Braitenberg
Communication and components of the Braitenberg vehicle controller:
The network architecture of the Braitenberg vehicle using DVS frames as input.
R-STDP
The network architecture differs from the previous Braitenberg vehicle: it uses a single SNN to control both the left and right motor outputs.
Comparison of different controllers on the outer lane
The results show that R-STDP is the most robust and stable controller, judged by its distance from the lane center.
References
Papers:
[1] Towards a framework for end-to-end control of a simulated vehicle with spiking neural networks
[2] Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing
Relevant Links:
V-REP previous versions:
NEST2.10: