Firstly, if you really want to learn in depth how a neural network works, then this is not the post for you, as I will not be going into much detail about the mathematics, at least not in this blog post. However, if you are just interested in quickly building one and playing around with it, then this is definitely for you.
Here are the Python libraries we will need. We will be using Keras, as it simplifies the usage of various machine learning techniques, and NumPy for array manipulation (for convenience; it is not strictly necessary).
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
Now we need a dataset! As we are using supervised learning techniques, we need rows of sample data, each with an input and its corresponding output. Remember that all neural networks only approximate functions (i.e. they map a given input to its expected output).
For this tutorial, we will be using the UCI Pima Indians Diabetes Database. The dataset is a simple CSV (Comma-Separated Values) file and each line in the file is one data point. The first 8 values of each line are the inputs (independent variables) and the 9th value is the output we are trying to predict (dependent variable). More information about the dataset can be found in the link above.
Numpy has a built-in function to load CSV files into arrays so we will be using this for simplicity. After loading the data into the program, we will separate the inputs and outputs for training the neural network later on.
# load the dataset
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
# shuffle the rows of the dataset (not strictly necessary)
np.random.seed(1)  # for repeatability
np.random.shuffle(dataset)  # shuffles in place and returns None, so do not reassign
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
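If you want to try the load, shuffle and split steps without downloading the file first, here is a minimal sketch that feeds `np.loadtxt` from an in-memory string. The four rows are made up purely for illustration (they are not values from the real dataset) and only mimic its layout of 8 inputs followed by 1 label.

```python
import io
import numpy as np

# Made-up rows, NOT real data: 8 inputs followed by a 0/1 label.
csv_text = "\n".join([
    "0,1,2,3,4,5,6,7,0",
    "10,11,12,13,14,15,16,17,1",
    "20,21,22,23,24,25,26,27,0",
    "30,31,32,33,34,35,36,37,1",
])

# np.loadtxt accepts a file-like object as well as a filename.
data = np.loadtxt(io.StringIO(csv_text), delimiter=',')

np.random.seed(1)
result = np.random.shuffle(data)  # shuffles the rows in place, returns None

X = data[:, 0:8]  # first 8 columns are the inputs
y = data[:, 8]    # 9th column is the label

print(X.shape, y.shape)  # (4, 8) (4,)
print(result)            # None
```

Because the shuffle moves whole rows, each input row still sits next to its own label afterwards.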
For evaluating any neural network, you will need training and testing data.
To see how many data points are in the dataset, we can do a quick check with
print(len(dataset))
which will show us that there are 768 points. For this tutorial, we split them into 512 training points and 256 test points (i.e. 2:1), but the ratio can be whatever you like. More training data generally lets the neural network train better. More test data gives you a more reliable estimate of the network's true accuracy. So it is a delicate balance that takes practice to get right, and more data helps on both sides.
# split into train and test data
train_X = X[:512]
train_Y = Y[:512]
test_X = X[512:]
test_Y = Y[512:]
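If you would rather not hard-code 512, the same split can be written in terms of a fraction. This is just a sketch: `split_train_test` and its `train_fraction` parameter are names I made up for illustration, and the arrays are zero-filled placeholders with the same shape as the real data.

```python
import numpy as np

def split_train_test(X, y, train_fraction=2 / 3):
    """Split already-shuffled data into a leading train part and a trailing test part."""
    n_train = int(round(len(X) * train_fraction))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Placeholder arrays with the same shape as the tutorial's data (768 rows, 8 inputs).
X_demo = np.zeros((768, 8))
y_demo = np.zeros(768)

demo_train_X, demo_train_y, demo_test_X, demo_test_y = split_train_test(X_demo, y_demo)
print(len(demo_train_X), len(demo_test_X))  # 512 256
```

Changing `train_fraction` to, say, 0.8 is then a one-word experiment rather than four edited slices.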
Now we need to define the architecture of our neural network! No one has this perfect and the only way to improve at this is by practice and understanding how neural networks actually work! For the purpose of this tutorial, I will not be dwelling too deeply on the hyperparameters.
# define the keras model
model = Sequential()
model.add(Dense(64, input_dim=8, activation='relu'))
model.add(Dense(24, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Our architecture is essentially the following:
4 Layers -> Nodes: 8 Inputs (1st), 64 ReLUs (2nd), 24 ReLUs (3rd), 1 Sigmoid Output (4th)
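To get a feel for how big this architecture is, we can count its trainable parameters by hand. A fully connected layer has one weight per input-output pair plus one bias per output. The helper name `dense_param_count` is my own for illustration and is not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # weights
        total += n_out         # biases
    return total

sizes = [8, 64, 24, 1]  # inputs, hidden, hidden, output
param_total = dense_param_count(sizes)
print(param_total)  # 2161  (576 + 1560 + 25)
```

This should match the total that `model.summary()` reports for the model above.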
Now, to update the weights of the neural network, we use the Adam optimiser with binary cross-entropy as the loss function (as our output is binary). The metric we care most about is accuracy; Keras always reports the loss alongside it, and further metrics can be added to the list.
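For the curious, binary cross-entropy is simple enough to sketch in NumPy. This is a plain illustration of the formula, not Keras's own implementation; the function name and the `eps` clipping value are my assumptions, added to avoid taking the log of zero.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

labels = [1, 0, 1, 0]
good_loss = binary_cross_entropy(labels, [0.9, 0.1, 0.8, 0.2])  # confident and right
bad_loss = binary_cross_entropy(labels, [0.1, 0.9, 0.2, 0.8])   # confident and wrong
print(good_loss < bad_loss)  # True
```

The loss is small when the predicted probability sits close to the true label and grows quickly for confident mistakes, which is exactly what the optimiser tries to push down.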
Training a Keras neural network is possibly the simplest line of code!
# fit the keras model on the dataset
model.fit(train_X, train_Y, epochs=500, batch_size=32, verbose=2)
We give the function the training inputs and the corresponding outputs. The number of epochs is the number of times the neural network trains on the complete dataset (one forward pass and one backward pass of all the training examples). Here the neural network will go through all the training data 500 times. The batch size is the number of samples that are propagated through the network before each weight update. We have set it to 32, which divides 512 (the number of training examples) evenly, so every batch is full.
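To make the epoch and batch-size arithmetic concrete, here is a small sketch that works out how many weight updates the settings above lead to. The function name is mine, purely for illustration.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of batches (and hence weight updates) in one pass over the data."""
    return math.ceil(n_samples / batch_size)

steps = updates_per_epoch(512, 32)
total_updates = steps * 500  # 500 epochs

print(steps)          # 16
print(total_updates)  # 8000
```

With a batch size that does not divide the data evenly, the last batch of each epoch is simply smaller, which is why the helper rounds up.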
Verbose is by default 1, which shows a progress bar; 2 prints out one line per epoch and 0 prints out nothing. Verbose 0 is fastest, as printing to the console slows the program down, but here it has been set to 2 just for demonstration.
# evaluate the keras model
_, accuracy = model.evaluate(train_X, train_Y)
print('Train Accuracy: %.2f' % (accuracy*100))
This is essentially a measure of how much the network has learnt from the training dataset and a high accuracy is necessary to get anywhere.
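# make class predictions with the model
# (predict_classes was removed from newer Keras releases, so threshold the sigmoid output instead)
predictions = (model.predict(test_X) > 0.5).astype(int).reshape(-1)
accuracy = np.count_nonzero(predictions == test_Y) / len(test_Y)
print('Test Accuracy: %.2f' % (accuracy*100))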
This is the true measure of accuracy meaning that this shows how well a network can predict outputs for data it has not seen before.
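The test accuracy calculation boils down to two steps: turn the sigmoid outputs into 0/1 classes, then count the matches. Here is a NumPy-only sketch of that idea. The probabilities and labels are made up for illustration, the helper names are mine, and the 0.5 threshold is the usual convention rather than anything the model requires.

```python
import numpy as np

def to_class_labels(probabilities, threshold=0.5):
    """Flatten an (n, 1) array of sigmoid outputs into n class labels (0 or 1)."""
    return (np.asarray(probabilities).reshape(-1) > threshold).astype(int)

def accuracy_percent(predicted, actual):
    """Percentage of predictions that match the true labels."""
    return 100.0 * float(np.mean(predicted == np.asarray(actual)))

# Made-up outputs shaped like a single-output sigmoid layer's predictions: (n, 1).
probs = np.array([[0.9], [0.2], [0.6], [0.4]])
true_labels = np.array([1, 0, 0, 0])

preds = to_class_labels(probs)
print(preds)                                 # [1 0 1 0]
print(accuracy_percent(preds, true_labels))  # 75.0
```

The `reshape(-1)` matters: comparing an (n, 1) array against an (n,) array would broadcast to an (n, n) grid and give a meaningless count.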
If the training accuracy is high but the test accuracy is low then the neural network is overfitting and you need more data and/or hyperparameter tuning.
Export & Import
# save architecture and weights to a single file
model.save("model.h5")
It is possible to save the architecture and weights separately, but for simplicity it is easier to save both with one line of code.
To use your model later in some other program, you will only need to add the following lines of code:
from keras.models import load_model
# load model
model = load_model('model.h5')
# summarize model (optional)
model.summary()
As you can see from the above tutorial, it is actually very easy to do machine learning with these human-friendly APIs. The real difficulty is understanding how it works and being able to design your own. I have in the past made my own neural networks for my projects, so I will some day write a tutorial on designing a neural network from scratch, as well as on more advanced network designs.