EEG Classification and Robot Simulation

EEG Controlled Robot Simulation

Above is a picture of a simulated EEG-controlled robot (a 2D rover). The robot is controlled by two signals, one for a desired left turn and one for a desired right turn. These can be entered manually with the buttons, or read from a text program file of 0s and 1s. Each desired control class (0 or 1) is used to randomly select an EEG trace (displayed at lower left) from one of two sets of EEG data (test set #4, shown below), which I recorded previously. The selected 1D vector is then submitted to a time-series classifier, which returns a 0 or 1 depending on which class it judges the vector to come from, and that value is then used as the actual control signal to the robot. Each vector submitted can also be used to train the classifier, if desired.

After about 30-40 training vectors, the classifier produces near-perfect results. In the plot above, the robot is cruising in a slight right turn, producing a large circle. The grid and robot wrap around in the X and Y coordinates, producing the overlapping pattern. In future versions of the software, the random EEG selection will be replaced with real-time EEG input, and the simulated robot will be replaced by an actual radio-controlled robot.

The process for guiding the robot is as follows:

  1. A button press, or a program line, generates a desired left (0) or right (1) turn.
  2. An EEG vector is randomly selected from the desired control class (shown in lower left window).
  3. That vector is submitted to the classifier.
  4. The result of classification (0/1) is then used as an actual left/right control signal to the robot.
  5. Optionally, the correct class of the submitted EEG vector is used to further train the classifier.
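The five steps above can be sketched as a single function. This is only an illustration under stated assumptions: the real classifier is proprietary, so `RunningMeanClassifier` is a hypothetical stand-in that compares each vector's average value against running class means, and the `pools` of "EEG" vectors are toy Gaussian data, not recorded traces. The names `control_step`, `pools`, and `program` are likewise invented for this example.

```python
import random


class RunningMeanClassifier:
    """Hypothetical stand-in: picks the class whose running mean is nearest."""

    def __init__(self):
        self.sums = [0.0, 0.0]    # per-class sum of vector averages
        self.counts = [0, 0]      # per-class number of training vectors

    def classify(self, vec):
        feature = sum(vec) / len(vec)
        best, best_dist = 0, float("inf")   # answers 0 before any training
        for label in (0, 1):
            if self.counts[label] == 0:
                continue                    # an unseen class cannot win
            dist = abs(feature - self.sums[label] / self.counts[label])
            if dist < best_dist:
                best, best_dist = label, dist
        return best

    def train(self, vec, label):
        self.sums[label] += sum(vec) / len(vec)
        self.counts[label] += 1


def control_step(desired, pools, classifier, rng, train=True):
    """One pass through steps 1-5; returns the actual control signal (0 or 1)."""
    vec = rng.choice(pools[desired])     # step 2: random trace from the desired class
    actual = classifier.classify(vec)    # steps 3-4: the classifier output drives the robot
    if train:
        classifier.train(vec, desired)   # step 5: optional training with the true class
    return actual


rng = random.Random(0)
# Toy pools (assumption): class 0 centred on -1, class 1 centred on +1.
pools = {
    0: [[rng.gauss(-1.0, 0.3) for _ in range(64)] for _ in range(30)],
    1: [[rng.gauss(+1.0, 0.3) for _ in range(64)] for _ in range(30)],
}
clf = RunningMeanClassifier()
program = [0, 1, 1, 0] * 10                      # step 1: desired turns
signals = [control_step(d, pools, clf, rng) for d in program]
errors = sum(s != d for s, d in zip(signals, program))
print(f"{errors} misclassified command(s) out of {len(program)}")
```

Note that each vector is classified before it is used for training, so the control signal never depends on the vector's own label.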


A program consisting of 25 repetitions of the sequence '0 1 1 0' (100 lines) was used to train the classifier to direct the robot to move in a straight line. At first the robot moves erratically, but by the end of training, after 100 vectors (or fewer), it moves reliably in a straight line:

Running a program

Once the classifier has been trained, the training flag can be turned off and other programs run. Shown below is the result of running a 'figure eight' program which consists of 24 desired left turns followed by 24 desired right turns of 15 degrees each:

The path of the robot can also be displayed from robot coordinates:

While the classifier is correct most of the time, it does occasionally make errors. The long-term path of the robot looping through the 'figure eight' program is therefore not just a repeat of two osculating circles, although the robot does tend to repeat the double-circle path at different places in the grid:

The robot wanders over the grid in the picture above because no feedback is provided to correct its path; it is simply executing pre-programmed instructions, which may occasionally be misclassified by the software. In a real robotic system, real-time feedback to the user will allow the robot's path to be corrected as it moves. Feedback can also be applied in the simulation by using the buttons to control the robot, instead of a program.
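The effect of occasional misclassification on an open-loop program can be illustrated with a small simulation. This is a sketch under assumptions, not the simulator used here: the 15 degree turn and the two programs come from the text, but the step length, the grid size, the "turn, then move one step" motion model, and the fixed per-command error rate are all invented for the example.

```python
import math
import random

TURN_DEG = 15.0    # turn per command, from the text
STEP = 1.0         # assumed forward distance per command
GRID = 100.0       # assumed grid width and height (wraps in X and Y)


def figure_eight():
    """24 desired left turns (0) followed by 24 desired right turns (1)."""
    return [0] * 24 + [1] * 24


def straight_line():
    """25 repetitions of '0 1 1 0' (100 commands)."""
    return [0, 1, 1, 0] * 25


def run(program, error_rate=0.0, loops=1, seed=0):
    """Drive the rover; each command is flipped with probability error_rate."""
    rng = random.Random(seed)
    x, y, heading = GRID / 2, GRID / 2, 0.0
    path = [(x, y)]
    for _ in range(loops):
        for desired in program:
            # A misclassification delivers the opposite command to the robot.
            actual = 1 - desired if rng.random() < error_rate else desired
            heading += TURN_DEG if actual == 0 else -TURN_DEG   # 0 = left, 1 = right
            x = (x + STEP * math.cos(math.radians(heading))) % GRID
            y = (y + STEP * math.sin(math.radians(heading))) % GRID
            path.append((x, y))
    return path


def offset(path):
    """Distance between the first and last points of a path."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    return math.hypot(x1 - x0, y1 - y0)


perfect = run(figure_eight(), error_rate=0.0, loops=3)
noisy = run(figure_eight(), error_rate=0.05, loops=3)
straight = run(straight_line())
print(f"perfect classifier: end offset {offset(perfect):.3f}")
print(f"5% error rate:      end offset {offset(noisy):.3f}")
```

With no errors, 24 turns of 15 degrees close each circle exactly, so the robot returns to its starting point after every loop. A single flipped command changes the heading for everything that follows, which is why the double-circle pattern reappears at different places on the grid.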

Future work

In this project, I am interested in two modes of robotic operation, which have been simulated above:

  1. Programmed: This will involve writing a set of pre-programmed instructions to, for example, navigate a maze or follow a course. These instructions will select EEG candidate vectors and submit them to the classifier for conversion to actual robot instructions. Evaluation of the classifier/control system will be based on how well the robot does in the maze/course without additional supervision or interaction.

  2. Live: Similar to #1, but with live visual feedback so that when errors are made by the classifier/control system, the user can instantly respond with correcting signals.

Both modes of operation could be evaluated in a maze/race/competition environment, in order to better simulate real-world operation. Two users and robots might compete for a 'prize' in order to provide more motivation for correct and compensating control, both in the electronics/software and in the human systems.

EEG Classification

Classification of EEG signals is a subset of the classification of general time-series data. The basic idea is that there exist two or more distinct physical processes (such as in the brain) that can each generate examples of time-series data (such as a time-varying effect on an electrical current running through the scalp) so that the signals generated by one process are, in some sense, more like each other and less like the signals generated by another process. If the signals generated by one process are truly different from the signals generated by another process, it should be possible to invent a mathematical analysis, and to program a computer to execute that analysis, which can tell the difference between signals generated by different processes.

Thus, the computer program can classify datasets by assigning to each a number indicating which process is likely to have generated that dataset. This number can then be used to provide a control signal for a mechanical device such as a 2D robot rover. For example, one class of data might represent a 'turn left' command to the robot, while another class of data represents 'turn right'. If the computer classifier is accurate enough, then the raw, unclassified signals can be used to control the robot via the use of software. In this project, the signals used are the EEG waveforms which are acquired from a single pair of electrodes placed on the scalp (i.e., from one channel of data). The underlying processes are the mental states representing a desire to 'turn left' and 'turn right' in some way, although any two externally discernible mental states would serve as well (or possibly better).

Although it will eventually be desirable to use more than one channel of data, and to classify that data into more than two distinct classes (and therefore more than two possible commands to a robot), a single channel of EEG data is sufficient as a proof of concept for two classes.

The classifier software developed in this project is proprietary. However, some features that may distinguish it from classifiers used by other researchers are worth noting:

  1. It can be used to classify both EEG data and other time-series data.
  2. It may make use of a number of techniques including both time-series and Fourier analyses.
  3. It does not make use of neural nets or ARMA algorithms.
  4. It is causal (incremental), in the sense that it uses only the data which have been submitted to it, in the order submitted, to perform classification. That is, to classify vector N, it only uses the previous N - 1 vectors.
  5. Vectors submitted for classification can simultaneously be used to train the classifier, or not, depending on the setting of a flag.
  6. It is fast to train, requiring only 30-40 vectors (or fewer, depending on the data) to achieve near 100% accuracy.
  7. It is fast to operate, and can classify and train 100 vectors of 256 points in less than 2 seconds on an 800 MHz Macintosh G4.
  8. It is written in Unix/ANSI C.
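Because the classifier itself is proprietary, the following is not its algorithm. It is a minimal stand-in, written in Python rather than C for brevity, that only illustrates three of the properties listed above: it uses a Fourier analysis, it is causal (vector N is classified using only earlier training), and training is controlled by a flag. The nearest-centroid rule, the class name `IncrementalSpectralClassifier`, the `submit` interface, and the 256 Hz sampling rate in the demonstration are all assumptions made for this sketch.

```python
import cmath
import math


def magnitude_spectrum(vec, n_bins=32):
    """Magnitudes of the first n_bins DFT coefficients (naive DFT, stdlib only)."""
    n = len(vec)
    mags = []
    for k in range(n_bins):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(vec))
        mags.append(abs(coeff) / n)
    return mags


class IncrementalSpectralClassifier:
    """Nearest-centroid rule over magnitude spectra (a stand-in, not the real method)."""

    def __init__(self, n_bins=32):
        self.n_bins = n_bins
        self.sums = {}      # class label -> element-wise sum of training spectra
        self.counts = {}    # class label -> number of training vectors seen

    def submit(self, vec, true_class=None, train=False):
        """Classify vec from past training only, then optionally train on it."""
        spec = magnitude_spectrum(vec, self.n_bins)
        predicted = self._nearest(spec)
        if train and true_class is not None:
            total = self.sums.setdefault(true_class, [0.0] * self.n_bins)
            for i, m in enumerate(spec):
                total[i] += m
            self.counts[true_class] = self.counts.get(true_class, 0) + 1
        return predicted

    def _nearest(self, spec):
        best, best_dist = 0, float("inf")   # default answer before any training
        for label, total in self.sums.items():
            count = self.counts[label]
            dist = sum((m - s / count) ** 2 for m, s in zip(spec, total))
            if dist < best_dist:
                best, best_dist = label, dist
        return best


def sine(freq_hz, phase, n=256, rate_hz=256.0):
    """One sine-wave vector; the 256 Hz sampling rate is an assumption."""
    return [math.sin(2 * math.pi * freq_hz * t / rate_hz + phase) for t in range(n)]


clf = IncrementalSpectralClassifier()
clf.submit(sine(10, 0.3), true_class=0, train=True)   # training flag on
clf.submit(sine(12, 1.1), true_class=1, train=True)
first = clf.submit(sine(10, 2.0))                     # training flag off
second = clf.submit(sine(12, 0.7))
print(first, second)
```

The magnitude spectrum discards phase, which is why two sine waves of the same frequency but different phases fall into the same class in this toy case. Real EEG data would need considerably more than one training vector per class.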

Several synthetic and real datasets have been used during development of the classifier. Four examples are shown below. In each example, a file of vectors was created which contained data from two different classes, ordered randomly in the file. During each test run, a new random ordering was created for the file, the vectors were then sampled without replacement according to that ordering, and submitted to the classifier one at a time. The correct class of each vector was also used to train the classifier. Each run is different in that it represents a potentially different permutation of the elements in each file.

In the text output below, there are six values reported after each vector has been classified. These values (columns) are:

  1. Total number of vectors submitted.
  2. Index of vector submitted, in random order.
  3. Actual class of vector.
  4. Classified class of vector.
  5. Total success rate for all submitted vectors.
  6. Success rate of last 20 submitted vectors (in 5% increments).
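The test procedure and the six output columns can be sketched as follows. The harness shuffles the vector order, submits each vector once, trains with the correct class, and records the two running success rates. Several things here are assumptions: the stand-in classifier and its `classify`/`train` interface, the toy data, the 1-based vector index in column 2, and the choice to average over fewer than 20 vectors at the start of a run.

```python
import random
from collections import deque


class NearestMeanStandIn:
    """Hypothetical classifier used only to exercise the harness."""

    def __init__(self):
        self.stats = {}    # label -> (sum of vector averages, count)

    def classify(self, vec):
        if not self.stats:
            return 0       # default answer before any training
        feature = sum(vec) / len(vec)
        return min(self.stats,
                   key=lambda c: abs(feature - self.stats[c][0] / self.stats[c][1]))

    def train(self, vec, label):
        total, count = self.stats.get(label, (0.0, 0))
        self.stats[label] = (total + sum(vec) / len(vec), count + 1)


def run_test(vectors, labels, classifier, seed=None, window=20):
    """Return one six-column row per submitted vector."""
    rng = random.Random(seed)
    order = list(range(len(vectors)))
    rng.shuffle(order)                 # a new random ordering for each run
    recent = deque(maxlen=window)      # outcomes of the last `window` vectors
    correct = 0
    rows = []
    for n, idx in enumerate(order, start=1):    # sampling without replacement
        predicted = classifier.classify(vectors[idx])
        classifier.train(vectors[idx], labels[idx])   # train with the correct class
        hit = predicted == labels[idx]
        correct += hit
        recent.append(hit)
        rows.append((n, idx + 1, labels[idx], predicted,
                     correct / n, sum(recent) / len(recent)))
    return rows


data_rng = random.Random(42)
# Toy data (assumption): 20 vectors per class, centred on -1 and +1.
vectors = [[data_rng.gauss(-1.0 if c == 0 else 1.0, 0.3) for _ in range(32)]
           for c in (0, 1) for _ in range(20)]
labels = [c for c in (0, 1) for _ in range(20)]

rows = run_test(vectors, labels, NearestMeanStandIn(), seed=1)
for n, idx, actual, predicted, total, recent in rows[-3:]:
    print(f"{n:3d} {idx:3d} {actual} {predicted} {total:.3f} {recent:.3f}")
```

With a window of 20, the last column can only change in steps of 1/20, which matches the 5% increments noted above.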

In the plots below, the data classes are color-coded (blue = 0, red = 1). For each test run, only the last 11 lines of output are shown here. To see the entire output, click on the 'Entire run' links.

Test set 1

This dataset consisted of 100 vectors of sine waves, 50 at 10 Hz, 50 at 12 Hz, all with random phases:

Classifier output:

 90  94 1 1 0.956 1.000
 91  72 0 0 0.956 1.000
 92  54 0 0 0.957 1.000
 93  90 1 1 0.957 1.000
 94  56 1 1 0.957 1.000
 95  10 0 0 0.958 1.000
 96  17 0 0 0.958 1.000
 97  59 0 0 0.959 1.000
 98  87 1 1 0.959 1.000
 99  77 1 1 0.960 1.000
100  75 1 1 0.960 1.000
Entire run

Test set 2

This dataset is similar to #1, but has random noise added to each sine wave:

Classifier output:

 90  22 0 0 0.956 1.000
 91  82 1 1 0.956 1.000
 92   5 0 0 0.957 1.000
 93  51 1 1 0.957 1.000
 94  37 0 0 0.957 1.000
 95  65 1 1 0.958 1.000
 96  44 1 1 0.958 1.000
 97  61 0 0 0.959 1.000
 98  21 0 0 0.959 1.000
 99  12 0 0 0.960 1.000
100  79 0 0 0.960 1.000
Entire run
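Synthetic data of the kind described for test sets 1 and 2 could be generated along the following lines. The class sizes, frequencies, and random phases come from the descriptions above; the number of points per vector (256), the sampling rate (256 Hz), the unit amplitude, and the Gaussian noise with standard deviation 0.5 are assumptions, since the text does not state them for these sets.

```python
import math
import random


def make_sine_set(n_per_class=50, freqs=(10.0, 12.0), n_pts=256,
                  rate_hz=256.0, noise_sd=0.0, seed=0):
    """Return (vectors, labels): one class per frequency, random phase per vector."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for label, freq in enumerate(freqs):
        for _ in range(n_per_class):
            phase = rng.uniform(0.0, 2.0 * math.pi)
            vec = []
            for t in range(n_pts):
                sample = math.sin(2.0 * math.pi * freq * t / rate_hz + phase)
                if noise_sd:
                    sample += rng.gauss(0.0, noise_sd)   # additive noise (test set 2)
                vec.append(sample)
            vectors.append(vec)
            labels.append(label)
    return vectors, labels


clean, clean_labels = make_sine_set()                 # like test set 1
noisy, noisy_labels = make_sine_set(noise_sd=0.5)     # like test set 2
print(len(clean), len(clean[0]), clean_labels.count(0), clean_labels.count(1))
```

The vectors are returned grouped by class; a test run would then submit them in a random order, as described earlier.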

Test set 3

This dataset consists of simulated EEG data using my cerebral cortex simulator (picture). The two data classes were created by using different parameters for the stimulatory and inhibitory settings of the simulator:

Classifier output:

 90  53 1 1 0.922 1.000
 91  96 1 1 0.923 1.000
 92  83 1 1 0.924 1.000
 93  98 0 0 0.925 1.000
 94  59 0 0 0.926 1.000
 95  71 1 1 0.926 1.000
 96  67 0 0 0.927 1.000
 97   5 0 0 0.928 1.000
 98  29 1 1 0.929 1.000
 99  89 0 0 0.929 1.000
100  95 1 1 0.930 1.000
Entire run

Test set 4

This dataset consists of 60 actual EEG data vectors which I recorded previously. Both classes are from occipital electrode configurations (PO3 and PO4):

Classifier output:

 50  38 0 0 0.940 1.000
 51  29 1 1 0.941 1.000
 52  42 1 1 0.942 1.000
 53  45 0 0 0.943 1.000
 54   1 0 0 0.944 1.000
 55  43 0 0 0.945 1.000
 56  40 1 1 0.946 1.000
 57   8 0 0 0.947 1.000
 58  12 0 1 0.931 0.950
 59  26 0 0 0.932 0.950
 60  27 1 1 0.933 0.950
Entire run

Here is an archive of the above data (in binary PPC format and text format), a Mac OS X PPC program to read and display it, and a Mac OS X PPC program to convert it to text. Several other data sets in addition to those shown above are included. The text files contain one column for each vector of data. The header shows version(1), dtype(0), #vectors, #pts/vec, rate(Hz), duration(secs), freq0(not applicable to time series data), freqn(N/A). Each column is preceded by the vector's class(0/1) and a secondary class(-1). Text files are in Unix line format.
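A reader for the text format might look like the sketch below. The exact layout is an assumption inferred from the description above, not taken from the archive: the eight header values are assumed to sit on the first line, separated by whitespace, followed by one line of classes, one line of secondary classes, and then one line per time point with one column per vector. The function name and the sample text are invented for illustration, so check a real file before relying on this.

```python
def parse_text_dataset(text):
    """Parse the ASSUMED layout: header, class row, secondary-class row, data rows."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    head = lines[0]
    info = {
        "version": int(head[0]),
        "dtype": int(head[1]),
        "n_vectors": int(head[2]),
        "pts_per_vector": int(head[3]),
        "rate_hz": float(head[4]),
        "duration_s": float(head[5]),
        "freq0": head[6],    # kept as raw text; not applicable to time series
        "freqn": head[7],
    }
    classes = [int(v) for v in lines[1]]
    secondary = [int(v) for v in lines[2]]
    rows = [[float(v) for v in ln] for ln in lines[3:3 + info["pts_per_vector"]]]
    if (len(rows) != info["pts_per_vector"]
            or any(len(r) != info["n_vectors"] for r in rows)):
        raise ValueError("data block does not match the header dimensions")
    vectors = [list(col) for col in zip(*rows)]   # one list per column (vector)
    return info, classes, secondary, vectors


# Invented sample: 3 vectors of 4 points each at 256 Hz.
SAMPLE = """\
1 0 3 4 256 0.015625 0 0
0 1 0
-1 -1 -1
0.0 0.5 1.0
0.1 0.6 1.1
0.2 0.7 1.2
0.3 0.8 1.3
"""

info, classes, secondary, vectors = parse_text_dataset(SAMPLE)
print(info["n_vectors"], info["pts_per_vector"], classes, vectors[0])
```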

© Sky Coyote 2003-2007