Traffic Signs Classifier
Challenge
Train a convolutional neural network to classify traffic sign images using the German Traffic Sign data set; then use the trained model to classify traffic signs found on the web.

Actions
install Anaconda, set up an environment, install pandas, TensorFlow and OpenCV (pickle is part of the Python standard library)
explore, summarise and visualise the data set
design, train and test a model architecture
use the model to make predictions on new images
analyse the softmax probabilities of the new images
Tools
This project used a combination of Python, pandas, matplotlib, OpenCV and tensorflow-gpu. These tools were installed in an Anaconda environment and run in a Jupyter notebook.
The complete project implementation is available here: https://github.com/FlorinGh/SelfDrivingCar-ND-pr2-Traffic-Signs-Classifier/blob/master/Traffic_Sign_Classifier.ipynb.
Data Set Summary and Exploration
The data provided was already split into training, validation and test sets; a few lines of code are enough to inspect the data set:
The size of the training set is 34799
The size of the validation set is 4410
The size of the test set is 12630
The shape of a traffic sign image is 32 x 32 pixels
The number of unique classes/labels in the data set is 43
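The figures listed above could be obtained along the following lines. This is a minimal sketch, not the notebook's actual code: it assumes the splits are already loaded as NumPy arrays of shape (N, 32, 32, 3), and the function name `summarise_dataset` and the synthetic stand-in arrays are hypothetical.

```python
import numpy as np

def summarise_dataset(X_train, y_train, X_valid, X_test):
    """Return the basic figures describing the data set splits."""
    return {
        "n_train": X_train.shape[0],
        "n_valid": X_valid.shape[0],
        "n_test": X_test.shape[0],
        "image_shape": X_train.shape[1:3],      # (height, width)
        "n_classes": len(np.unique(y_train)),   # distinct class IDs
    }

# Tiny synthetic stand-in (assumption); the real arrays would come
# from the pickled data files.
X_demo = np.zeros((6, 32, 32, 3), dtype=np.uint8)
y_demo = np.array([0, 1, 2, 2, 1, 0])
summary = summarise_dataset(X_demo, y_demo, X_demo[:2], X_demo[:3])
print(summary)
```

On the real data the same function would be expected to report the sizes and the 43 classes listed above.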
The data is composed of 32 x 32 pixel images containing traffic signs; there are 43 different traffic signs, and each type is assigned a number from 0 to 42 representing the class ID; we will train a model to estimate the ID based on the input image; then, with the predicted ID, we can look up the corresponding sign name in a table.
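The lookup from predicted class ID to sign name can be sketched with the standard `csv` module. The column names `ClassId` and `SignName`, the helper `load_sign_names` and the three-row excerpt are assumptions made for illustration; the real table would cover all 43 IDs.

```python
import csv
import io

# Illustrative three-row excerpt (assumption), not the full 43-row table.
SIGN_TABLE = """ClassId,SignName
0,Speed limit (20km/h)
13,Yield
14,Stop
"""

def load_sign_names(handle):
    """Build a {class ID: sign name} dictionary from a CSV file handle."""
    return {int(row["ClassId"]): row["SignName"]
            for row in csv.DictReader(handle)}

sign_names = load_sign_names(io.StringIO(SIGN_TABLE))
predicted_id = 14   # hypothetical model output
print(sign_names[predicted_id])
```

In practice the handle would be an opened CSV file rather than an in-memory string.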
A bar chart was created to visualise the train, validation and test sets:
this chart works as a frequency plot: it shows how many examples we have for each of the classes
for any given set, the distribution is not uniform across classes, but the same distribution is kept in all three sets; this means the validation and test sets reflect the class mix seen during training, so the measured accuracies are not skewed by a different class balance
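The comparison of class distributions described above can also be checked numerically. The sketch below assumes integer label arrays; the function name `class_frequencies` and the toy labels are hypothetical.

```python
import numpy as np

def class_frequencies(labels, n_classes):
    """Fraction of examples belonging to each class ID (sums to 1)."""
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()

# Illustrative labels only: the second set is twice as large
# but has the same class mix as the first.
train_labels = np.array([0, 0, 0, 1, 2, 2])
test_labels = np.array([0, 0, 0, 1, 2, 2, 0, 0, 0, 1, 2, 2])

freq_train = class_frequencies(train_labels, 3)
freq_test = class_frequencies(test_labels, 3)

# Largest per-class difference between the two distributions.
max_gap = np.abs(freq_train - freq_test).max()
print(freq_train, freq_test, max_gap)
```

A small `max_gap` between the training, validation and test labels would support the observation that the three sets share one distribution.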

Design and Test a Model Architecture
The starting point for this project was the LeNet-5 architecture, with changes to the pre-processing and the number of classes; because the accuracy obtained with the original architecture was below the accepted threshold, some improvements were made; training was performed using tensorflow-gpu 1.6.
Preprocessing consisted in:
grayscale: all images were converted to grayscale, reducing the three colour channels to one; it was noticed that the colour information adds complexity with no additional accuracy; furthermore, using grayscale images improves learning speed

normalisation: we have to make sure the data has mean zero and equal variance; given the images are now grayscale, each pixel has a value between 0 and 255; applying (pixel - 128)/128 is a quick way to approximately normalise the data; normalising the images increased accuracy by about 3.5% and improved learning speed
shuffle: presenting the examples in random order ensures better model accuracy; a fixed random state ensures the same random order is obtained every time the code is run; this helps in assessing improvements between two different runs
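The three preprocessing steps above can be sketched with NumPy. This is a sketch under stated assumptions rather than the notebook's code: grayscale is taken as the plain mean of the colour channels (the project may have used OpenCV's weighted conversion instead), the shuffle uses a seeded NumPy generator in place of whatever "random state" mechanism the notebook used, and all function names are hypothetical.

```python
import numpy as np

def to_grayscale(images):
    """Average the colour channels: (N, 32, 32, 3) -> (N, 32, 32, 1)."""
    return images.astype(np.float32).mean(axis=3, keepdims=True)

def normalise(images):
    """Apply (pixel - 128) / 128, mapping [0, 255] to roughly [-1, 1)."""
    return (images.astype(np.float32) - 128.0) / 128.0

def shuffle_together(images, labels, seed=0):
    """Apply one seeded permutation to images and labels alike."""
    order = np.random.default_rng(seed).permutation(len(images))
    return images[order], labels[order]

# Synthetic stand-in for a batch of colour images (assumption).
X = np.random.default_rng(1).integers(0, 256, size=(8, 32, 32, 3)).astype(np.uint8)
y = np.arange(8)

X_pre = normalise(to_grayscale(X))
X_shuf, y_shuf = shuffle_together(X_pre, y, seed=42)
print(X_pre.shape, X_pre.min(), X_pre.max())
```

Because the permutation is seeded, two runs with the same seed see the examples in the same order, which is what makes results comparable between runs.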
An iterative approach was taken in order to design the neural network:
started from the LeNet-5 architecture and tested it on the colour images, reaching an accuracy of 88.1%; I chose LeNet-5 because it was the only complete network that I had used successfully before
this architecture was originally created for a simpler problem, MNIST, with only one channel of data and only 10 classes; so it was expected to have lower accuracy than on MNIST
to improve accuracy I first ran a couple of studies on the effect of batch size and learning rate; 128 and 0.001 turned out to be the best values
then I switched to tensorflow-gpu and increased the number of epochs from 10 to 300, which gave an improvement of about 2%, getting to 90.2%
then I applied normalisation, which increased the accuracy by another 3.6%, getting to 93.8% on the test data, and 95% on the validation data; these values are above the threshold for submitting, but I knew I could do better
converting to grayscale did not improve accuracy much; it only gained a bit of speed
the last addition was dropout, applied to all layers except the first and last; the keep probability was 0.75 for training (so 25% of outputs are ignored) and 1.0 for evaluation; this increased accuracy to 95.0% on the test set
Lessons learned: colours don't matter much in the machine world; always normalise so that all numbers are in the same range (this gave the biggest improvement in accuracy); use dropout, which makes training harder in order to make testing easier
The final architecture consisted of the following layers:
| Layer | Description |
|---|---|
| Input | 32x32x1 random, grayscale, normalised image |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
| RELU | Activation |
| Max pooling | 2x2 stride, valid padding, outputs 14x14x6 |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
| RELU | Activation |
| Max pooling | 2x2 stride, valid padding, outputs 5x5x16 |
| Flatten | Outputs 400 |
| Dropout | Ignore 25% of outputs |
| Fully connected | Outputs 120 |
| RELU | Activation |
| Dropout | Ignore 25% of outputs |
| Fully connected | Outputs 84 |
| RELU | Activation |
| Dropout | Ignore 25% of outputs |
| Fully connected | Outputs 43 |
| Softmax | Outputs 43 |
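The output sizes in the layer list follow from the usual formula for valid padding, output = floor((input - kernel) / stride) + 1. The short check below reproduces them; it assumes a 2x2 pooling window (the list only states the 2x2 stride), and the helper name `valid_output_size` is hypothetical.

```python
def valid_output_size(size, kernel, stride):
    """Spatial output size of a conv or pooling layer with valid padding."""
    return (size - kernel) // stride + 1

# (name, kernel, stride, output depth) for the convolutional part.
# The 2x2 pooling window is an assumption.
LAYERS = [
    ("conv1", 5, 1, 6),
    ("pool1", 2, 2, 6),
    ("conv2", 5, 1, 16),
    ("pool2", 2, 2, 16),
]

size = 32  # input images are 32x32x1
shapes = []
for name, kernel, stride, depth in LAYERS:
    size = valid_output_size(size, kernel, stride)
    shapes.append((name, size, size, depth))
    print(f"{name}: {size}x{size}x{depth}")

# Flattening the last feature map gives the input of the first dense layer.
flattened = size * size * LAYERS[-1][3]
print("flatten:", flattened)
```

The result, 5 x 5 x 16 = 400, matches the flatten output feeding the first fully connected layer.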
Training setup:
epochs: 50
batch size: 128
learning rate: 0.001
dropout: keep probability set to 0.75 during training (25% of outputs ignored) and changed to 1.0 when evaluating against the validation and test data sets
optimizer: AdamOptimizer.
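The keep probability in the training setup can be illustrated with a small NumPy version of inverted dropout, which zeroes each output with probability 1 - keep_prob and rescales the survivors by 1 / keep_prob. This is an illustrative re-implementation, not the TensorFlow op used in the project; the function name and the synthetic activations are assumptions.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Inverted dropout: drop units with probability 1 - keep_prob."""
    if keep_prob >= 1.0:
        return activations                       # evaluation: nothing dropped
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob        # rescale the kept units

rng = np.random.default_rng(0)
x = np.ones((1000, 120), dtype=np.float32)       # stand-in activations

train_out = dropout(x, 0.75, rng)   # training: about 25% of outputs zeroed
eval_out = dropout(x, 1.0, rng)     # evaluation: identity

kept_fraction = float((train_out != 0).mean())
print(f"kept during training: {kept_fraction:.3f}")
```

Rescaling by 1 / keep_prob keeps the expected activation unchanged, which is why the same weights can be used with keep probability 1.0 at evaluation time.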
My final model results were:
training set accuracy of 99.9%
validation set accuracy of 95.5%
test set accuracy of 95.0%
Test a Model on New Images
To test my model I used 12 traffic signs that were not in the original data set:

Image no. 4 might be difficult to classify because half of the sign is covered in snow. The table below lists the predictions made by the trained network:
| Image | Prediction |
|---|---|
| Go straight or left | Go straight or left |
| Keep right | Keep right |
| No entry | No entry |
| No vehicles | No vehicles |
| Speed limit (50km/h) | Speed limit (70km/h) |
| Yield | Yield |
| Road work | Road work |
| Speed limit (20km/h) | Speed limit (20km/h) |
| Ahead only | Ahead only |
| Stop | Stop |
| Speed limit (50km/h) | Speed limit (50km/h) |
| Priority road | Priority road |
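The model correctly classified 11 out of 12 traffic signs, which gives an accuracy of 91.7%; the one error was a Speed limit (50km/h) sign predicted as Speed limit (70km/h). This is lower than the test set accuracy of 95.0%, but with only 12 images a single error changes the accuracy by about 8 percentage points, so the two results are broadly consistent.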
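The accuracy on the new images can be reproduced directly from the image and prediction pairs listed above. The snippet is plain Python; the variable names are arbitrary.

```python
# (actual sign, predicted sign) pairs copied from the list of predictions.
results = [
    ("Go straight or left", "Go straight or left"),
    ("Keep right", "Keep right"),
    ("No entry", "No entry"),
    ("No vehicles", "No vehicles"),
    ("Speed limit (50km/h)", "Speed limit (70km/h)"),
    ("Yield", "Yield"),
    ("Road work", "Road work"),
    ("Speed limit (20km/h)", "Speed limit (20km/h)"),
    ("Ahead only", "Ahead only"),
    ("Stop", "Stop"),
    ("Speed limit (50km/h)", "Speed limit (50km/h)"),
    ("Priority road", "Priority road"),
]

correct = sum(actual == predicted for actual, predicted in results)
accuracy = correct / len(results)
print(f"{correct}/{len(results)} correct, accuracy {accuracy:.1%}")
```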
In the following we can see the top 5 softmax probabilities for each of the 12 new traffic signs:
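For reference, the top 5 softmax probabilities can be computed from the network's raw outputs (logits) roughly as follows. This is a NumPy sketch with random logits standing in for the model output, not the TensorFlow calls used in the notebook; the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def top_k(probabilities, k=5):
    """Return the k largest probabilities per row and their class IDs."""
    indices = np.argsort(probabilities, axis=1)[:, ::-1][:, :k]
    values = np.take_along_axis(probabilities, indices, axis=1)
    return values, indices

# Random logits for 12 images and 43 classes (assumption, not model output).
logits = np.random.default_rng(0).normal(size=(12, 43))
probs = softmax(logits)
top_values, top_ids = top_k(probs, k=5)
print(top_ids[0], top_values[0])
```

Each row of `top_ids` holds the five most likely class IDs for one image, ordered from most to least probable.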

Visualising the Neural Network
While neural networks can be a great learning device, they are often referred to as a black box. This is not entirely true, as we can look under the hood to see how the data evolves from one layer to the next; TensorFlow lets us give each operation a name, so we can plot the output of each operation to understand what is happening in each layer.
For instance, layer two focuses on the complete image taking into account a lot of details; as we go deeper in the network, the amount of data is smaller for each image and fewer and fewer details are considered.

Results
Using LeNet-5 as a starting point, a neural network architecture was developed to classify the German traffic signs; several technologies were learned along the way, such as tensorflow-gpu, neural network architectures, grayscale conversion, normalisation, types of layers, convolution, pooling, activation with RELU, dropout, and how to save and restore a trained model.
In the end the network performance is the following:
training set accuracy: 99.9%
validation set accuracy: 95.5%
test set accuracy: 95.0%
For more details on this project visit the following github repository: https://github.com/FlorinGh/SelfDrivingCar-ND-pr2-Traffic-Signs-Classifier