The Artificial Neural Network (ANN) is inspired by the animal brain. Although ANNs are not yet as powerful as the brain, they are among the most powerful learning models in machine learning.
Understanding the Human Brain and ANNs
Let's first understand how the human brain works and how it influences ANNs.
A biological neuron receives signals through its dendrites. These signals are either amplified or inhibited as they pass along the axon to the dendrites of other neurons.
ANNs are based on a similar concept: an ANN is a collection of a large number of simple units called artificial neurons. The network learns to perform a task (such as identifying a car) by training the neurons to fire when a particular input (such as an image of a car) is provided.
The perceptron is one of the earliest proposed models. It takes a weighted sum of its inputs and applies an activation function to it. It is a single-layer artificial network with only one neuron.
The perceptron consists of 4 parts.
- Input values
- Weights and Bias
- Net sum
- Activation Function
The perceptron works as follows:
- Each input (x1, x2, ...) is multiplied by its corresponding weight (w1, w2, ...), giving x1·w1, x2·w2, and so on.
- The products are summed and the bias is added. The result is called the weighted sum.
- An activation function is then applied to the weighted sum. A step function can be used as the activation function. A common definition outputs 1 when the weighted sum is at or above a threshold (typically 0) and 0 otherwise.
A perceptron is usually used to classify data into two classes, so it is also known as a linear binary classifier. An example is classifying a population into male and female.
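The perceptron steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the step convention (1 at or above zero, 0 otherwise), and the weights and bias that make the neuron behave like a logical AND are all assumptions chosen for the example.

```python
def step(z):
    """Step activation: 1 if z >= 0, else 0 (one common convention)."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the step function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(weighted_sum)


# Hypothetical parameters: with these values the perceptron fires
# only when both inputs are 1, i.e. it acts like a logical AND.
and_weights = [1.0, 1.0]
and_bias = -1.5

outputs = [perceptron([a, b], and_weights, and_bias) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

Because the decision depends only on the sign of a weighted sum, the boundary between the two classes is a straight line (or a hyperplane in higher dimensions), which is why the perceptron is a linear classifier.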
Artificial Neuron
An artificial neuron is similar to a perceptron, except that the activation function is not a step function.
A good activation function has the following properties.
- It should be smooth, with no abrupt changes.
- It should make the relationship between inputs and outputs non-linear, at least to some extent. Non-linearity lets the network represent relationships that a purely linear model cannot. Without it, any stack of layers collapses into a single linear transformation.
Below are some commonly used activation functions.
- Hyperbolic tangent (tanh) function
- Leaky and parametric ReLU
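The activation functions listed above can be written with NumPy as follows. This is a sketch under assumptions: the default slope of 0.01 for the leaky variant is a common choice rather than something the text specifies, and in a real network the parametric slope `alpha` would be learned during training instead of being passed in by hand.

```python
import numpy as np


def tanh(z):
    """Hyperbolic tangent: squashes inputs into the range (-1, 1)."""
    return np.tanh(z)


def leaky_relu(z, slope=0.01):
    """Leaky ReLU: identity for positive inputs, a small fixed slope for the rest."""
    return np.where(z > 0, z, slope * z)


def parametric_relu(z, alpha):
    """Parametric ReLU: same shape as leaky ReLU, but alpha is a trainable parameter."""
    return np.where(z > 0, z, alpha * z)


z = np.array([-2.0, 0.0, 3.0])
print(tanh(z))
print(leaky_relu(z))
print(parametric_relu(z, alpha=0.25))
```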
Artificial Neural Network
An artificial neural network (ANN) is a network of such neurons.
Neurons in a neural network are arranged in layers. The first and the last layer are called the input and output layers.
The input layer has as many neurons as there are attributes in the data set.
For a classification problem, the output layer has as many neurons as there are classes in the target variable.
For a regression problem, the output layer has a single neuron.
Structure of ANN
An ANN has the following parts:
- Network Topology
- Input Layer
- Output Layer
- Weights
- Activation functions
- Biases
The leftmost layer in this network is called the input layer (x), and the neurons within the layer are called input neurons. The rightmost or output layer (O) contains the output neurons. The middle layers are called hidden layers (h1, h2, ..., hj), since the neurons in these layers are neither inputs nor outputs.
Given the complex nature of ANNs, we make the following simplifying assumptions:
- Neurons are arranged in layers, sequentially.
- Neurons within the same layer do not interact with each other.
- Neurons are densely connected i.e. all neurons in layer n are connected to all neurons in layer n+1.
- There is a weight associated with each interconnection in the neural network, and each neuron has a bias associated with it.
- All neurons in a given layer use the same activation function.
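Under these assumptions, the parameters of a network are fully described by one weight matrix and one bias vector per pair of consecutive layers. The sketch below shows the resulting shapes. The helper name `init_parameters`, the layer sizes, and the small random initialization are illustrative assumptions, not part of the original text.

```python
import numpy as np


def init_parameters(layer_sizes, seed=0):
    """Create one (weights, biases) pair for each pair of consecutive layers.

    Dense connectivity means every neuron in layer n connects to every neuron
    in layer n+1, so the weight matrix has shape (size of n+1, size of n).
    Each neuron in layer n+1 also gets one bias.
    """
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = rng.normal(0.0, 0.1, size=(n_out, n_in))
        biases = np.zeros(n_out)
        params.append((weights, biases))
    return params


# Hypothetical network: 4 input attributes, one hidden layer of 5 neurons, 3 classes.
layer_sizes = [4, 5, 3]
params = init_parameters(layer_sizes)
for i, (weights, biases) in enumerate(params, start=1):
    print(f"layer {i}: weights {weights.shape}, biases {biases.shape}")
```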
Feedforward in Neural Networks
In a feedforward neural network, the output from one layer is used as input to the next layer.
In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.
The main goal of a feedforward network is to approximate some function f*. For example, a regression function y = f*(x) maps an input x to a value y. A feedforward network defines a mapping y = f(x; θ) and learns the values of the parameters θ that result in the best function approximation.
The layers between the input layer and the output layer are known as hidden layers because the training data does not show the desired output for these layers. A network can contain any number of hidden layers, each with any number of hidden units. A unit is essentially a neuron that takes input from the units in the previous layer and calculates its own activation value.
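The mapping y = f(x; θ) can be sketched as a loop over layers, where θ is the list of weight matrices and bias vectors. This is only an illustration: the tanh activation on hidden layers, the linear output layer (a common choice for regression), and the hand-picked parameter values are assumptions made for the example.

```python
import numpy as np


def forward(x, params):
    """Feed x through the network one layer at a time, in the forward direction only."""
    activation = x
    for i, (weights, biases) in enumerate(params):
        z = weights @ activation + biases  # weighted sum plus bias for every unit
        is_output_layer = i == len(params) - 1
        # Hidden layers apply tanh; the output layer is left linear.
        activation = z if is_output_layer else np.tanh(z)
    return activation


# Hypothetical parameters for a 2-2-1 network (theta in the text's notation).
params = [
    (np.array([[0.5, -0.5], [1.0, 1.0]]), np.array([0.0, 0.0])),
    (np.array([[1.0, -1.0]]), np.array([0.5])),
]
x = np.array([1.0, 1.0])
y = forward(x, params)
print(y)
```

Each layer's output is computed only from the previous layer's output, so the computation contains no cycles or loops back to earlier layers.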
Backpropagation in Neural Networks
Backpropagation is the essence of neural network training. It is the method of fine-tuning the weights of a neural network based on the error obtained in the previous iteration or epoch. Proper tuning of the weights reduces the error and makes the model more reliable by improving its generalization.
Backpropagation is short for "backward propagation of errors." It is a standard method of training artificial neural networks: it computes the gradient of a loss function with respect to all the weights in the network.
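To make the idea concrete, here is a minimal sketch of backpropagation for a network with one hidden layer. Several things are assumed for illustration: a tanh hidden layer, a linear output, a squared-error loss on a single example, random data, and a learning rate of 0.01. The analytic gradient for one weight is compared against a numerical estimate as a sanity check.

```python
import numpy as np


def loss_and_grads(x, y, W1, b1, W2, b2):
    """Forward pass, then propagate the error backward with the chain rule."""
    # Forward pass.
    z1 = W1 @ x + b1
    a1 = np.tanh(z1)
    y_hat = W2 @ a1 + b2
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Backward pass: start at the output error and move toward the input.
    d_y_hat = y_hat - y                  # dLoss/dy_hat
    dW2 = np.outer(d_y_hat, a1)
    db2 = d_y_hat
    d_a1 = W2.T @ d_y_hat                # error reaching the hidden layer
    d_z1 = d_a1 * (1.0 - a1 ** 2)        # derivative of tanh is 1 - tanh^2
    dW1 = np.outer(d_z1, x)
    db1 = d_z1
    return loss, (dW1, db1, dW2, db2)


rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = np.array([1.0])
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

loss, (dW1, db1, dW2, db2) = loss_and_grads(x, y, W1, b1, W2, b2)

# Numerical check of one entry of dW1 using a central difference.
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
loss_plus, _ = loss_and_grads(x, y, W1_plus, b1, W2, b2)
loss_minus, _ = loss_and_grads(x, y, W1_minus, b1, W2, b2)
numeric = (loss_plus - loss_minus) / (2 * eps)
print(dW1[0, 0], numeric)

# One gradient-descent update using the backpropagated gradients.
lr = 0.01
new_loss, _ = loss_and_grads(
    x, y, W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2
)
print(loss, new_loss)
```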
There is one important thing to note here. We minimize the average loss, not the total loss. Because the average is just the total divided by the number of data points, minimizing the average loss also minimizes the total loss.
The loss function is defined in terms of the network output F(xi) and the ground truth yi. Since F(xi) depends on the weights and biases, the loss is in turn a function of (w, b). The average loss across all data points, that is, the mean of the per-example losses, is denoted by G(w, b), and this is the quantity we want to minimize.
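The following sketch computes G(w, b) as the mean of per-example losses and checks that the average and total loss are minimized by the same parameter value. The single linear neuron standing in for F, the squared-error loss, and the toy data are assumptions for illustration only. The text does not say which loss is used.

```python
import numpy as np


def F(x, w, b):
    """Stand-in for the network output: a single linear neuron."""
    return w * x + b


def per_example_loss(prediction, target):
    """Squared error, chosen here only as an example loss."""
    return (prediction - target) ** 2


def G(w, b, xs, ys):
    """Average loss over all data points, as a function of (w, b)."""
    return np.mean(per_example_loss(F(xs, w, b), ys))


# Toy data generated by y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2.0 * xs + 1.0

# Scan candidate values of w with b fixed at 1.0.
candidates = np.linspace(0.0, 4.0, 41)
average_losses = np.array([G(w, 1.0, xs, ys) for w in candidates])
total_losses = average_losses * len(xs)

best_by_average = candidates[np.argmin(average_losses)]
best_by_total = candidates[np.argmin(total_losses)]
print(best_by_average, best_by_total)
```

The total loss is the average loss scaled by a positive constant, so both curves reach their minimum at the same w.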
Artificial Neural Network using Python
Here we use the MNIST dataset. Let's see how we can use Python to create a neural network.
Loading and previewing the data.
Here the target variable has to be converted to a one-hot matrix. We use the function one-hot to convert the target variable to one-hot encoding.
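The original helper is not shown, so the following is only a sketch of what such a conversion might look like with NumPy. The function name `one_hot`, its signature, and the sample labels are hypothetical. The 10 classes match the ten digit classes of MNIST.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into a matrix with a single 1 per row.

    Hypothetical helper: row i has a 1 in column labels[i] and 0 elsewhere.
    """
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes))
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Made-up digit labels, not taken from the dataset.
labels = [3, 0, 9]
encoded = one_hot(labels, num_classes=10)
print(encoded)
```

With this encoding, each column of the matrix lines up with one output neuron, which matches the rule that a classifier's output layer has one neuron per class.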
