Training Neural Networks for Event-Based End-to-End Robot Control

End-to-end robot controllers using DQN, Braitenberg Vehicle, SNN, R-STDP

I reproduced this project with a Pioneer P3-DX mobile robot in the V-REP simulator on 2023-10-28. There are four kinds of controllers: Braitenberg, DQN, DQN+SNN, and R-STDP.

Check the code on GitHub

The robot carries a DVS camera, as shown in the picture below:

A video demo link is attached below:

Host Machine Environment

I created an Ubuntu 16.04 virtual machine on ESXi 8.0 with 4 cores, 8 GB of memory, and a 25 GB hard disk.

Dependencies

  1. ROS-kinetic
  2. NEST2.10
  3. V-REP 3.6
  4. Python2

Running the V-REP simulator

$ cd /V-REP_PRO_V3_6_2_Ubuntu16_04
$ ./vrep.sh

A screenshot of the running simulator is attached below:

Start the Controller

Braitenberg

$ python Controller/Braitenberg/controller.py 

After you start the controller, it looks like the picture below:

Framework

Simulated Environment

The experiment is based on a lane-following scenario, depicted in the picture below.

Input Event data

The input data pre-processing is depicted in the picture below:

The raw 128x128 DVS frames are divided into small 4x4 regions, and the events in each region are counted over consecutive frames regardless of polarity. Then the image is cropped at the top and bottom, resulting in a 32 × 16 image. Finally, the input image is transformed into binary values, so each pixel is either 0 or 1.
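The pre-processing steps above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name `preprocess`, the `(x, y, polarity)` event format, the symmetric crop of 8 rows at the top and bottom, and the "any event sets the pixel to 1" threshold are all assumptions.

```python
def preprocess(event_frames, size=128, block=4, crop_top=8, crop_bottom=8):
    """Pool, crop and binarize a sequence of DVS event frames.

    event_frames: iterable of frames, each a list of (x, y, polarity)
    events with 0 <= x, y < size (assumed event format).
    """
    pooled_size = size // block  # 128 / 4 = 32 cells per side
    counts = [[0] * pooled_size for _ in range(pooled_size)]

    # Count every event per 4x4 region over all frames, ignoring polarity.
    for frame in event_frames:
        for x, y, _polarity in frame:
            counts[y // block][x // block] += 1

    # Crop rows at the top and bottom: 32 rows -> 16 rows (assumed split).
    cropped = counts[crop_top:pooled_size - crop_bottom]

    # Binarize: a cell is 1 if it saw at least one event (assumed threshold).
    return [[1 if c > 0 else 0 for c in row] for row in cropped]
```

With the default arguments the result has 16 rows of 32 values, which matches the 32 × 16 input image described above.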

DQN

Defined Rewards of MDP per time-step

The reward is defined as a Gaussian function of the distance to the lane center, so it is highest on the center line and decays as the robot drifts away from it.
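A minimal sketch of such a reward is shown below. The function name `lane_reward`, the width `sigma`, and the choice to leave the Gaussian unnormalized (peak value 1 at the lane center) are assumptions made for illustration; the values used in the project may differ.

```python
import math


def lane_reward(distance, sigma=0.1):
    """Gaussian-shaped reward of the signed distance to the lane center.

    sigma is a hypothetical width parameter (same unit as distance).
    """
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))
```

The reward is symmetric, so drifting left or right by the same amount is penalized equally.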

Communication between components of the DQN controller

DQN+SNN

The SNN is trained to approximate the policy given by the Q-network of the DQN algorithm.

The Algorithm of Converting ANNs to SNNs

The model-based normalization algorithm for converting ANNs into SNNs follows Diehl et al. [2] ("Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing").
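As a rough illustration of the idea, the sketch below rescales each layer by its largest possible positive input, which is how model-based normalization is commonly summarized for [2]. It assumes layers without biases and input activations in the range 0 to 1; the function name and the nested-list weight layout are hypothetical and not taken from the project code.

```python
def model_based_normalization(layers):
    """Rescale each layer's weights by its maximum positive input sum.

    layers: list of weight matrices; layers[l][j][i] is the weight from
    input i to neuron j of layer l (assumed layout).
    """
    normalized = []
    for weights in layers:
        # Largest sum of positive incoming weights over all neurons,
        # i.e. the worst-case input when every presynaptic unit is at 1.
        max_pos_input = 0.0
        for neuron_weights in weights:
            input_sum = sum(w for w in neuron_weights if w > 0)
            max_pos_input = max(max_pos_input, input_sum)

        # Leave the layer unchanged if it has no positive weights.
        scale = 1.0 / max_pos_input if max_pos_input > 0 else 1.0
        normalized.append([[w * scale for w in row] for row in weights])
    return normalized
```

After this rescaling no neuron can receive more than one threshold's worth of input in a single time step, at the cost of possibly lower firing rates.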

Communication between components of the DQN-SNN controller

The communication diagram is attached below:

Braitenberg

Communication and components of the Braitenberg vehicle controller:

The network architecture of the Braitenberg vehicle using DVS frames as input.

R-STDP

The network architecture differs from the Braitenberg vehicle above: it uses a single SNN to control both the left and right motor outputs.

Comparison of different controllers on the outer lane

Measured by the distance from the lane center, the results show that R-STDP is the most robust and stable of the four controllers.

References

Papers:

[1] Towards a framework for end-to-end control of a simulated vehicle with spiking neural networks

[2] Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing

Relevant Links:

Link 1

Link 2

Link 3

Link 4

V-REP previous versions:

V-REP

NEST2.10:

NEST Simulator