Artificial neural networks

Mathematical models designed to imitate the way the brain processes information.

Introduction

ANNs are commonly used to classify remote sensing (RS) images, or to extract information and patterns from them that are often missed by traditional statistical tools.

A typical feedforward ANN consists of three or more inter-connected layers of nodes ― an input layer, one or more hidden intermediate layers (often just 1), and an output layer (Figure 8‑15). The arrows indicate the direction of information flow, feeding information forward from input to output. Note that there may be any number of nodes at each level of the network, and not all nodes need to be connected to every node in the next layer. For example, in Figure 8‑15 hidden node H2 only receives input from input nodes I1 and I3, and only provides output to node O1. This arrangement can be seen as a directed graph, or as a rather complex-looking function mapping.

The connections between the input layer and hidden layer can be described using a weight matrix, W, where the row/column entries wij are positive or negative real-valued weights, or 0 if no connection exists. Likewise, the connections between the hidden layer and the output layer can also be viewed as a weight matrix, Z say, again consisting of a set of weights, zjk. Positive weights in each case imply a reinforcement process associated with the source node or input, whilst negative weights correspond to inhibition.
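The two weight matrices above can be sketched in code. This is a minimal illustration with made-up weights, assuming a network like the one in Figure 8‑15 with 3 input nodes, 3 hidden nodes, and 2 output nodes; zero entries encode missing connections (H2 receives input only from I1 and I3, and feeds only O1).

```python
import numpy as np

# Hypothetical weights for a 3-input, 3-hidden, 2-output network.
# A zero entry means "no connection".
W = np.array([[0.5, -0.2,  0.1],   # I1 -> H1..H3
              [0.3,  0.0, -0.4],   # I2 -> H1..H3 (I2 not connected to H2)
              [-0.1, 0.6,  0.2]])  # I3 -> H1..H3
Z = np.array([[0.7, -0.3],
              [0.4,  0.0],         # H2 feeds only O1
              [-0.5, 0.2]])

x = np.array([1.0, 0.5, -1.0])     # one input vector

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

h = sigmoid(x @ W)   # hidden-layer outputs
o = sigmoid(h @ Z)   # output-layer values
```

Feeding information forward is then just two matrix multiplications interleaved with an activation function.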

 

Advantages

  • Can capture complex patterns
  • Can model complex relationships and non-linear functions
  • Can process high-dimensional data
  • Can make predictions very quickly once trained
  • Can incorporate continuous and discrete inputs/outputs

Disadvantages

  • Require training data
  • Selecting the right architecture and tuning hyperparameters (such as the number of layers, learning rate, etc.) can be challenging
  • Hard to interpret
  • Risk of overfitting

Terminologies in ANN

  • Size: The number of nodes in the model
  • Width: The number of nodes in a specific layer
  • Depth: The number of layers in a neural network (The input layer is often not counted)
  • Architecture: The specific arrangement of the layers and nodes in the network.

 

 

Explanation

 

Basic Concepts

 

1. Neurons (Nodes/Perceptrons):

  • The basic unit of an ANN, analogous to biological neurons.
  • Computes a weighted sum of its inputs, which is then transformed using an activation function

2. Layers:

  • Combination of perceptrons
  • Input Layer: The first layer, which receives the input data.
  • Hidden Layers: Intermediate layers where computations are performed. There can be multiple hidden layers.
  • Output Layer: The final layer that produces the output of the network.

3. Weights and Biases:

  • Weights are parameters that determine the importance of the input signals.
  • Each connection between neurons has a weight associated with it.
  • Biases are additional parameters added to the input of each neuron.

4. Activation Function:

  • A mathematical function applied to the weighted sum of inputs plus bias to determine the output of a neuron.
  • Common activation functions include Sigmoid, ReLU (Rectified Linear Unit), and Tanh.
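The concepts above can be combined into a single artificial neuron. This is an illustrative sketch with made-up input, weight, and bias values; the three activation functions are the common ones named above.

```python
import numpy as np

def neuron(x, w, b, activation):
    """Weighted sum of inputs plus bias, passed through an activation function."""
    return activation(np.dot(w, x) + b)

# Common activation functions
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
relu = lambda a: np.maximum(0.0, a)
tanh = np.tanh

# Hypothetical inputs, weights, and bias for one neuron
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1

print(neuron(x, w, b, sigmoid))  # about 0.40
```

Swapping the activation function changes the neuron's output range: Sigmoid maps to (0, 1), Tanh to (-1, 1), and ReLU to [0, ∞).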

Structure of an ANN

 

1. Feedforward Neural Network:

  • Information moves in one direction from input to output.
  • No cycles or loops in the network.

2. Recurrent Neural Network (RNN):

  • Neurons are connected in cycles, allowing information to be retained in the network, making them suitable for sequential data.
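The recurrence can be sketched as a single update step that is applied once per element of a sequence. This is a minimal illustration with random weights and a hypothetical tanh-activated hidden state; the key point is that the state h is carried over between steps, which is how the network retains information.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent update: new state depends on the input AND the previous state."""
    return np.tanh(x_t @ Wx + h_prev @ Wh + b)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 5))  # input-to-hidden weights
Wh = rng.normal(size=(5, 5))  # hidden-to-hidden (recurrent) weights
b = np.zeros(5)

h = np.zeros(5)                       # initial hidden state
for x_t in rng.normal(size=(10, 3)):  # a sequence of 10 input vectors
    h = rnn_step(x_t, h, Wx, Wh, b)   # h is retained across time steps
```

A feedforward network, by contrast, would process each of the 10 vectors independently, with no state carried between them.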

 

The dataset is normally split into three parts:

-Training set -> presented to the network to learn the weights

-Validation set -> used for early stopping to minimize the risk of “overfitting”

-Test set -> used to obtain a final error measure (e.g. RMSE)
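A three-way split can be sketched as follows. The data, the 60/20/20 proportions, and the sizes are hypothetical; the point is that the three subsets are disjoint and serve different purposes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 100 samples, 4 features each
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Shuffle, then split roughly 60/20/20
idx = rng.permutation(len(X))
train, val, test = idx[:60], idx[60:80], idx[80:]

X_train, y_train = X[train], y[train]  # used to fit the weights
X_val, y_val = X[val], y[val]          # monitored for early stopping
X_test, y_test = X[test], y[test]      # final error measure (e.g. RMSE)
```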

Examples

How to train an ANN

  1. Initialize weights and biases randomly
  2. Calculate the weighted sum of inputs plus bias for each neuron in the hidden layer
  3. Apply the activation function to get the output of each neuron
  4. Pass the outputs from the hidden layer to the output neuron
  5. Apply the Sigmoid function to the output neuron to get the final prediction
  6. Compute the loss using a loss function
  7. Calculate the gradient of the loss with respect to the weights and biases
  8. Update the weights and biases
  9. Iterate until the loss converges
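The steps above can be sketched end-to-end for a tiny network. This is an illustrative implementation, assuming a hypothetical setup: the XOR problem, 2 inputs, 4 hidden neurons, 1 sigmoid output, mean squared error as the loss, and plain gradient descent for the update; the step numbers from the list appear as comments.

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Toy problem (XOR): 2 inputs, 1 output
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# 1. Initialize weights and biases randomly (4 hidden neurons)
W = rng.normal(size=(2, 4)); bW = np.zeros(4)
Z = rng.normal(size=(4, 1)); bZ = np.zeros(1)
lr = 0.5

for epoch in range(5000):              # 9. Iterate
    # 2-3. Hidden layer: weighted sum plus bias, then activation
    h = sigmoid(X @ W + bW)
    # 4-5. Pass hidden outputs to the output neuron, apply Sigmoid
    p = sigmoid(h @ Z + bZ)
    # 6. Mean squared error loss
    loss = np.mean((p - y) ** 2)
    if epoch == 0:
        first_loss = loss
    # 7. Gradients via the chain rule (backpropagation)
    dp = 2 * (p - y) / y.size * p * (1 - p)
    dZ, dbZ = h.T @ dp, dp.sum(axis=0)
    dh = dp @ Z.T * h * (1 - h)
    dW, dbW = X.T @ dh, dh.sum(axis=0)
    # 8. Update the weights and biases (gradient descent)
    W -= lr * dW; bW -= lr * dbW
    Z -= lr * dZ; bZ -= lr * dbZ
```

Steps 7 and 8 together are the backpropagation / gradient descent phase; steps 2-5 are the forward pass.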

How many layers and nodes to use in ANN?

  • The number of neurons in the input layer is equal to the number of features (columns) in the data.
  • (Some configurations add one additional node for a bias term)
  • The output layer has one node for each output.
    • A single node in the case of regression
    • K nodes in the case of K-class classification
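These sizing rules can be read straight off the data. The sketch below uses hypothetical dimensions (150 samples, 4 features, 3 classes); only the hidden width is a free hyperparameter.

```python
import numpy as np

# Hypothetical dataset: 150 samples, 4 features, 3 classes
X = np.zeros((150, 4))
n_classes = 3

n_input = X.shape[1]   # one input node per feature (plus optionally a bias node)
n_output = n_classes   # K output nodes for K-class classification (1 for regression)
n_hidden = 8           # hidden width is a tunable hyperparameter

print(n_input, n_hidden, n_output)
```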


Contributors

  • Sandra
  • Tong Jiang