Artificial neural networks

Mathematical models designed to imitate the way the brain processes information.

Introduction

ANNs are commonly used to classify remote sensing (RS) images, or to extract information and patterns from them that are often missed by traditional statistical tools.

A typical feedforward ANN consists of three or more inter-connected layers of nodes ― an input layer, one or more hidden intermediate layers (often just 1), and an output layer (Figure 8‑15). The arrows indicate the direction of information flow, feeding information forward from input to output. Note that there may be any number of nodes at each level of the network, and not all nodes need to be connected to every node in the next layer. For example, in Figure 8‑15 hidden node H2 only receives input from input nodes I1 and I3, and only provides output to node O1. This arrangement can be seen as a directed graph, or as a rather complex-looking function mapping.

The connections between the input layer and hidden layer can be described using a weight matrix, W, where the row/column entries wij are positive or negative real-valued weights, or 0 if no connection exists. Likewise, the connections between the hidden layer and the output layer can also be viewed as a weight matrix, Z say, again consisting of a set of weights, zjk. Positive weights in each case imply a reinforcement process associated with the source node or input, whilst negative weights correspond to inhibition.
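The two weight matrices above can be sketched in code. This is a minimal illustration with made-up weights, assuming a network like the one in Figure 8‑15 with 3 input nodes, 3 hidden nodes, and 2 output nodes; zero entries encode missing connections (H2 receives input only from I1 and I3, and feeds only O1).

```python
import numpy as np

# Hypothetical weights for a 3-input, 3-hidden, 2-output network.
# A zero entry means "no connection".
W = np.array([[0.5, -0.2,  0.1],   # I1 -> H1..H3
              [0.3,  0.0, -0.4],   # I2 -> H1..H3 (I2 not connected to H2)
              [-0.1, 0.6,  0.2]])  # I3 -> H1..H3
Z = np.array([[0.7, -0.3],
              [0.4,  0.0],         # H2 feeds only O1
              [-0.5, 0.2]])

x = np.array([1.0, 0.5, -1.0])     # one input vector

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

h = sigmoid(x @ W)   # hidden-layer outputs
o = sigmoid(h @ Z)   # output-layer values
```

Feeding information forward is then just two matrix multiplications interleaved with an activation function.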

 

Advantages

  • Can capture complex patterns
  • Can model complex relationships and non-linear functions
  • Can process high-dimensional data
  • Can make predictions very quickly once trained
  • Can incorporate continuous and discrete inputs/outputs

Disadvantages

  • Require training data
  • Selecting the right architecture and tuning hyperparameters (such as the number of layers, learning rate, etc.) can be challenging
  • Hard to interpret
  • Risk of overfitting

Terminologies in ANN

  • Size: The number of nodes in the model
  • Width: The number of nodes in a specific layer
  • Depth: The number of layers in a neural network (The input layer is often not counted)
  • Architecture: The specific arrangement of the layers and nodes in the network.

 

 

Explanation

 

Basic Concepts

 

1. Neurons (Nodes/Perceptrons):

  • The basic unit of an ANN, analogous to biological neurons.
  • Computes a weighted sum of its inputs, which is then transformed using an activation function

2. Layers:

  • Combination of perceptrons
  • Input Layer: The first layer, which receives the input data.
  • Hidden Layers: Intermediate layers where computations are performed. There can be multiple hidden layers.
  • Output Layer: The final layer that produces the output of the network.

3. Weights and Biases:

  • Weights are parameters that determine the importance of the input signals.
  • Each connection between neurons has a weight associated with it.
  • Biases are additional parameters added to the input of each neuron.

4. Activation Function:

  • A mathematical function applied to the weighted sum of inputs plus bias to determine the output of a neuron.
  • Common activation functions include Sigmoid, ReLU (Rectified Linear Unit), and Tanh.
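The concepts above can be combined into a single artificial neuron. This is an illustrative sketch with made-up input, weight, and bias values; the three activation functions are the common ones named above.

```python
import numpy as np

def neuron(x, w, b, activation):
    """Weighted sum of inputs plus bias, passed through an activation function."""
    return activation(np.dot(w, x) + b)

# Common activation functions
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
relu = lambda a: np.maximum(0.0, a)
tanh = np.tanh

# Hypothetical inputs, weights, and bias for one neuron
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1

print(neuron(x, w, b, sigmoid))  # about 0.40
```

Swapping the activation function changes the neuron's output range: Sigmoid maps to (0, 1), Tanh to (-1, 1), and ReLU to [0, ∞).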

Structure of an ANN

 

1. Feedforward Neural Network:

  • Information moves in one direction from input to output.
  • No cycles or loops in the network.

2. Recurrent Neural Network (RNN):

  • Neurons are connected in cycles, allowing information to be retained in the network, making them suitable for sequential data.
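The recurrence can be sketched as a single update step that is applied once per element of a sequence. This is a minimal illustration with random weights and a hypothetical tanh-activated hidden state; the key point is that the state h is carried over between steps, which is how the network retains information.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent update: new state depends on the input AND the previous state."""
    return np.tanh(x_t @ Wx + h_prev @ Wh + b)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 5))  # input-to-hidden weights
Wh = rng.normal(size=(5, 5))  # hidden-to-hidden (recurrent) weights
b = np.zeros(5)

h = np.zeros(5)                       # initial hidden state
for x_t in rng.normal(size=(10, 3)):  # a sequence of 10 input vectors
    h = rnn_step(x_t, h, Wx, Wh, b)   # h is retained across time steps
```

A feedforward network, by contrast, would process each of the 10 vectors independently, with no state carried between them.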

 

The dataset is normally split into three parts:

-Training set -> presented to the network to learn the weights

-Validation set -> used for early stopping to minimize the risk of “overfitting”

-Test set -> used to obtain a final error measure (e.g. RMSE)
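A three-way split can be sketched as follows. The data, the 60/20/20 proportions, and the sizes are hypothetical; the point is that the three subsets are disjoint and serve different purposes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 100 samples, 4 features each
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Shuffle, then split roughly 60/20/20
idx = rng.permutation(len(X))
train, val, test = idx[:60], idx[60:80], idx[80:]

X_train, y_train = X[train], y[train]  # used to fit the weights
X_val, y_val = X[val], y[val]          # monitored for early stopping
X_test, y_test = X[test], y[test]      # final error measure (e.g. RMSE)
```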

Examples

How to train an ANN

  1. Initialize weights and biases randomly
  2. Calculate the weighted sum of inputs plus bias for each neuron in the hidden layer
  3. Apply the activation function to get the output of each neuron
  4. Pass the outputs from the hidden layer to the output neuron
  5. Apply the Sigmoid function to the output neuron to get the final prediction
  6. Compute the loss using a loss function
  7. Calculate the gradient of the loss with respect to the weights and biases
  8. Update the weights and biases
  9. Iterate until the loss converges
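The steps above can be sketched end-to-end for a tiny network. This is an illustrative implementation, assuming a hypothetical setup: the XOR problem, 2 inputs, 4 hidden neurons, 1 sigmoid output, mean squared error as the loss, and plain gradient descent for the update; the step numbers from the list appear as comments.

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Toy problem (XOR): 2 inputs, 1 output
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# 1. Initialize weights and biases randomly (4 hidden neurons)
W = rng.normal(size=(2, 4)); bW = np.zeros(4)
Z = rng.normal(size=(4, 1)); bZ = np.zeros(1)
lr = 0.5

for epoch in range(5000):              # 9. Iterate
    # 2-3. Hidden layer: weighted sum plus bias, then activation
    h = sigmoid(X @ W + bW)
    # 4-5. Pass hidden outputs to the output neuron, apply Sigmoid
    p = sigmoid(h @ Z + bZ)
    # 6. Mean squared error loss
    loss = np.mean((p - y) ** 2)
    if epoch == 0:
        first_loss = loss
    # 7. Gradients via the chain rule (backpropagation)
    dp = 2 * (p - y) / y.size * p * (1 - p)
    dZ, dbZ = h.T @ dp, dp.sum(axis=0)
    dh = dp @ Z.T * h * (1 - h)
    dW, dbW = X.T @ dh, dh.sum(axis=0)
    # 8. Update the weights and biases (gradient descent)
    W -= lr * dW; bW -= lr * dbW
    Z -= lr * dZ; bZ -= lr * dbZ
```

Steps 7 and 8 together are the backpropagation / gradient descent phase; steps 2-5 are the forward pass.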

How many layers and nodes to use in ANN?

  • The number of neurons in the input layer is equal to the number of features (columns) in the data.
  • (Some configurations add one additional node for a bias term)
  • The output layer has one node for each output.
    • A single node in the case of regression
    • K nodes in the case of K-class classification
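These sizing rules can be read straight off the data. The sketch below uses hypothetical dimensions (150 samples, 4 features, 3 classes); only the hidden width is a free hyperparameter.

```python
import numpy as np

# Hypothetical dataset: 150 samples, 4 features, 3 classes
X = np.zeros((150, 4))
n_classes = 3

n_input = X.shape[1]   # one input node per feature (plus optionally a bias node)
n_output = n_classes   # K output nodes for K-class classification (1 for regression)
n_hidden = 8           # hidden width is a tunable hyperparameter

print(n_input, n_hidden, n_output)
```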


Contributors

  • Sandra
  • Tong Jiang