Introduction to Neural Networks: From Biological Inspiration to Artificial Intelligence

Figure 2. Neural networks draw inspiration from biological nervous systems (biological neuron alongside an artificial network architecture)

Neural networks have become the cornerstone of modern artificial intelligence, powering everything from image recognition to natural language processing. But what exactly are they, and how do they work? In this comprehensive guide, we'll explore the fundamentals of neural networks, from their biological inspiration to the mathematical models that make them so powerful.

1. From Biological Neurons to Artificial Networks

The Biological Inspiration

The human brain contains approximately 86 billion neurons, each connected to thousands of others through synapses. These biological neurons:

  • Receive electrical signals through their dendrites
  • Process these inputs in the cell body
  • Fire an output signal through the axon if the input exceeds a threshold
  • Adjust synaptic strengths based on experience (learning)
Figure 2.1 Key components of a biological neuron

The Artificial Neuron Model

Frank Rosenblatt's perceptron (1958) was one of the first artificial neuron models. A modern artificial neuron:

  1. Takes multiple inputs (x₁, x₂, ..., xₙ)
  2. Applies weights (w₁, w₂, ..., wₙ) representing synaptic strengths
  3. Sums the weighted inputs (plus a bias term b)
  4. Passes the result through an activation function f

output = f(w₁x₁ + w₂x₂ + ... + wₙxₙ + b)
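As a sketch, the four steps above map directly onto a few lines of NumPy. The inputs, weights, bias, and step activation here are made-up values for illustration only:

```python
import numpy as np

def artificial_neuron(x, w, b, f):
    """Weighted sum of inputs plus bias, passed through activation f."""
    return f(np.dot(w, x) + b)

# A step activation: the neuron "fires" only if its input exceeds 0
step = lambda z: 1.0 if z > 0 else 0.0

x = np.array([0.5, -1.0, 2.0])   # inputs x1..x3
w = np.array([0.8, 0.2, 0.4])    # synaptic weights w1..w3
b = -0.5                         # bias term

print(artificial_neuron(x, w, b, step))  # prints 1.0: weighted sum 0.5 > 0
```

Note how the bias shifts the firing threshold: with a more negative bias (say b = -1.5), the same inputs no longer make the neuron fire.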

Figure 2.2 Mathematical model of an artificial neuron

2. Activation Functions: Introducing Non-Linearity

Activation functions determine whether and how strongly a neuron should fire. They introduce non-linearities that allow neural networks to learn complex patterns.

Activation Function | Formula            | Properties                         | Use Cases
Sigmoid             | σ(x) = 1/(1+e⁻ˣ)   | Smooth, outputs in (0, 1)          | Binary classification output
ReLU                | max(0, x)          | Simple, avoids vanishing gradients | Hidden layers
Leaky ReLU          | max(αx, x), α≈0.01 | Fixes the "dying ReLU" problem     | When ReLU performs poorly
Softmax             | eˣᵢ/∑eˣⱼ           | Outputs a probability distribution | Multi-class output
Figure 2.3 Comparison of common activation functions

Practical Tip: ReLU is generally the best first choice for hidden layers due to its simplicity and effectiveness. Use sigmoid for binary classification outputs and softmax for multi-class outputs.
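All four functions from the table fit in a few lines of NumPy. This is a minimal sketch; the max-subtraction inside softmax is a standard numerical-stability trick, not part of the formula in the table:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))      # smooth squashing into (0, 1)

def relu(x):
    return np.maximum(0, x)          # zero for negatives, identity otherwise

def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)  # small negative slope keeps units alive

def softmax(x):
    e = np.exp(x - np.max(x))        # subtract max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))      # prints [0. 0. 3.]
print(sigmoid(z))   # values strictly between 0 and 1
print(softmax(z))   # non-negative values summing to 1
```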

3. Forward Propagation: Making Predictions

Forward propagation is the process of passing input data through the network to generate predictions. In a 3-layer network (input, hidden, output):

  1. Input layer receives the raw data
  2. Hidden layer computes: h = σ(W₁x + b₁)
  3. Output layer computes: ŷ = σ(W₂h + b₂)
  4. Compare ŷ to the true label y using a loss function
# Python implementation of forward propagation
import numpy as np

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

def forward_propagation(x, W1, b1, W2, b2):
  # Hidden layer computation
  z1 = np.dot(W1, x) + b1
  a1 = sigmoid(z1)

  # Output layer computation
  z2 = np.dot(W2, a1) + b2
  a2 = sigmoid(z2)

  return a2
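To see the function in action, we can call it with randomly initialized parameters. The layer sizes below (3 inputs, 4 hidden units, 1 output) are illustrative choices, not prescribed by the post; the definitions are repeated so the snippet runs on its own:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_propagation(x, W1, b1, W2, b2):
    a1 = sigmoid(np.dot(W1, x) + b1)      # hidden activations h
    return sigmoid(np.dot(W2, a1) + b2)   # output prediction y_hat

rng = np.random.default_rng(0)
x = rng.normal(size=3)                          # 3 input features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 4 units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer: 1 unit

y_hat = forward_propagation(x, W1, b1, W2, b2)
print(y_hat.shape)  # prints (1,)
```

Because the output passes through a sigmoid, y_hat always lands strictly between 0 and 1, which is why this setup suits binary classification.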

4. Backpropagation: Learning from Mistakes

Backpropagation is the algorithm that allows neural networks to learn from data by efficiently computing gradients. The process:

  1. Compute loss between prediction and true value
  2. Calculate gradient of loss with respect to each parameter
  3. Update parameters in the opposite direction of the gradient
Figure 2.4 Backpropagation efficiently computes gradients using the chain rule

The chain rule from calculus makes backpropagation efficient by breaking the gradient calculation into smaller, reusable parts:

# Backpropagation implementation (chain rule, layer by layer)
def backward_propagation(x, y, a1, a2, W1, W2):
  # Output layer error: MSE loss derivative chained with the sigmoid derivative
  dL_da2 = 2 * (a2 - y)            # d(loss)/d(a2) for MSE
  delta2 = dL_da2 * a2 * (1 - a2)  # d(loss)/d(z2), since sigmoid' = a2(1 - a2)

  # Output layer gradients: each weight's gradient is error x input activation
  dL_dW2 = np.outer(delta2, a1)
  dL_db2 = delta2

  # Hidden layer error: propagate delta2 back through W2
  delta1 = (W2.T @ delta2) * a1 * (1 - a1)

  # Hidden layer gradients
  dL_dW1 = np.outer(delta1, x)
  dL_db1 = delta1

  return dL_dW1, dL_db1, dL_dW2, dL_db2

5. Training Neural Networks: The Complete Process

Putting it all together, training a neural network involves:

  1. Initialization: Set weights to small random values
  2. Forward Pass: Compute predictions
  3. Loss Calculation: Measure prediction error
  4. Backward Pass: Compute gradients
  5. Parameter Update: Adjust weights via gradient descent
  6. Repeat: Until convergence or a stopping criterion is met
Figure 2.5 The complete neural network training cycle
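The six steps above can be stitched into a minimal gradient-descent loop. The 2-2-1 network, the single toy training example, the learning rate, and the epoch count below are all illustrative choices, not prescriptions:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
# Step 1: initialize weights to small random values (2 inputs, 2 hidden, 1 output)
W1, b1 = 0.1 * rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = 0.1 * rng.normal(size=(1, 2)), np.zeros(1)

x, y = np.array([1.0, 0.5]), np.array([1.0])  # one toy training example
lr = 0.5                                      # learning rate

for epoch in range(100):
    # Step 2: forward pass
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    # Step 3: loss calculation (mean squared error)
    loss = np.mean((a2 - y) ** 2)
    # Step 4: backward pass via the chain rule
    delta2 = 2 * (a2 - y) * a2 * (1 - a2)
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)
    # Step 5: parameter update, opposite the gradient direction
    W2 -= lr * np.outer(delta2, a1); b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x);  b1 -= lr * delta1
    # Step 6: repeat for the next epoch

print(loss)  # shrinks toward 0 as the prediction approaches y
```

On this toy example the loss drops from roughly 0.25 (an uninformed sigmoid output near 0.5) toward zero over the 100 epochs.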
Modern Practice: While understanding these fundamentals is crucial, in practice we use frameworks like PyTorch or TensorFlow that handle automatic differentiation, making manual backpropagation implementation unnecessary for most applications.

Conclusion

Neural networks, inspired by biological neurons but implemented through mathematical models, provide a powerful framework for learning from data. By understanding the roles of activation functions, forward propagation, and backpropagation, you gain insight into how these models learn and make predictions.

In our next post, we'll dive deeper into training neural networks, exploring loss functions, optimization algorithms, and techniques to prevent overfitting.

Figure 2.6 From simple perceptrons to complex deep learning architectures

🔍 Curious about Deep Learning? Read our next post on Convolutional Neural Networks

Follow DrASR Deep Learning for more in-depth tutorials, fundamentals, and research-backed content in Deep Learning.

If you found this helpful, leave a comment or share it with your peers. Let’s grow together in AI learning!
