Discussion 9

The document outlines the implementation of a 2-layer neural network in Rust to classify handwritten digits from the MNIST dataset. It details the dataset structure, neural network architecture, and the steps for forward and backward passes, weight updates, and testing for accuracy. Additionally, it highlights key solution aspects such as data loading, network initialization, and the training loop.


Homework Overview

HW 9 Objective: Implement a 2-layer neural network in Rust using the ndarray crate to classify handwritten digits from the MNIST dataset.
Dataset
• Files: mnist_train.csv (60,000 entries) and mnist_test.csv (10,000 entries).
• Format:
  • Each row: label, pixel1, pixel2, ..., pixel784.
  • Label: the digit (0–9).
  • Pixel values: grayscale values (0 to 255) for each pixel in a 28x28 image.
• Purpose:
  • Train the model with mnist_train.csv.
  • Test its accuracy with mnist_test.csv.
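
As a sketch of the loading step (the highlights slide later names the csv crate), each row splits into a label and 784 pixel columns. The MnistRecord name, the assumption that the files have no header row, and the division by 255.0 to scale pixels into [0, 1] are illustrative choices, not assignment requirements:

    use std::error::Error;

    /// One MNIST record: a digit label plus its 784 grayscale pixels.
    /// (`MnistRecord` is an illustrative name, not part of the assignment.)
    struct MnistRecord {
        label: u8,
        pixels: Vec<f64>, // 784 values, scaled from 0..=255 down to 0.0..=1.0
    }

    /// Parse mnist_train.csv / mnist_test.csv, where each row is
    /// label, pixel1, pixel2, ..., pixel784.
    fn load_mnist(path: &str) -> Result<Vec<MnistRecord>, Box<dyn Error>> {
        let mut reader = csv::ReaderBuilder::new()
            .has_headers(false)
            .from_path(path)?;
        let mut records = Vec::new();
        for row in reader.records() {
            let row = row?;
            let label: u8 = row[0].parse()?;
            // The remaining 784 columns are pixels; divide by 255
            // so the network's inputs sit in [0, 1].
            let pixels = row
                .iter()
                .skip(1)
                .map(|p| p.parse::<f64>().map(|v| v / 255.0))
                .collect::<Result<Vec<f64>, _>>()?;
            records.push(MnistRecord { label, pixels });
        }
        Ok(records)
    }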
Neural Network Architecture
• Input Layer:
• Nodes: 784 (one for each pixel).
• Hidden Layer:
• Nodes: User-defined (e.g., 128 or 64).
• Activation Function: Sigmoid.
• Output Layer:
• Nodes: 10 (one for each digit).
• Activation Function: Sigmoid.
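
A minimal sketch of how these two layers might be represented, assuming the ndarray and rand crates; the Network struct name is illustrative, and there are no bias terms because the equations on the next slide omit them:

    use ndarray::Array2;

    /// The 2-layer network from this slide: 784 inputs, a user-defined
    /// hidden layer, 10 outputs, sigmoid activations on both layers.
    struct Network {
        w1: Array2<f64>, // 784 x n_hidden
        w2: Array2<f64>, // n_hidden x 10
    }

    impl Network {
        /// Fill both weight matrices with random values in [0, 1),
        /// matching the initialization described on the highlights slide.
        fn new(n_hidden: usize) -> Self {
            Network {
                w1: Array2::from_shape_fn((784, n_hidden), |_| rand::random::<f64>()),
                w2: Array2::from_shape_fn((n_hidden, 10), |_| rand::random::<f64>()),
            }
        }
    }

Calling Network::new(128) would then give the user-defined hidden layer size from this slide.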
Steps to Solve

(A Rust sketch of one complete training step, plus the epoch loop from step 3, follows this list.)

1. Forward Pass
   1. Compute the hidden layer values:
      • z1 = x · W1 (dot product of input and weights).
      • Apply sigmoid: a1 = σ(z1), where σ(x) = 1 / (1 + e^(−x)).
   2. Compute the output layer values:
      • z2 = a1 · W2.
      • Apply sigmoid: a2 = σ(z2).
   3. The final output, a2, is a 10-element vector indicating the predicted probabilities for each digit.
2. Backward Pass
   1. Compute the error:
      • E = y − a2, where y is the one-hot encoded true label.
   2. Update the output layer weights:
      • Compute the gradient (element-wise): δ2 = E ⊙ σ′(z2), where σ′(x) = σ(x) · (1 − σ(x)); since a2 = σ(z2), this evaluates to a2 · (1 − a2).
      • Update: W2 = W2 + β · a1ᵀ · δ2.
   3. Update the hidden layer weights:
      • Compute the gradient (element-wise): δ1 = (δ2 · W2ᵀ) ⊙ σ′(z1), i.e. (δ2 · W2ᵀ) ⊙ a1 ⊙ (1 − a1).
      • Update: W1 = W1 + β · xᵀ · δ1.
3. Training
   • Use gradient descent to iteratively adjust the weights W1 and W2 over multiple epochs.
   • The learning rate β (e.g., 0.1 or 0.05) controls the step size.
4. Testing
   • Evaluate the trained network on mnist_test.csv.
   • Compute accuracy: the percentage of correctly classified digits.
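
Below is a minimal Rust sketch of one training step implementing the equations above with ndarray. The train_step name, its signature, and the per-example (stochastic) update are illustrative assumptions rather than the assignment's prescribed structure:

    use ndarray::{Array1, Array2, Axis};

    /// Sigmoid activation (same definition as on the highlights slide).
    fn sigmoid(x: f64) -> f64 {
        1.0 / (1.0 + (-x).exp())
    }

    /// One gradient-descent step on a single example, following the
    /// equations above. `x` holds the 784 pixels, `y` the one-hot
    /// label, `beta` the learning rate.
    fn train_step(
        w1: &mut Array2<f64>, // 784 x n_hidden
        w2: &mut Array2<f64>, // n_hidden x 10
        x: &Array1<f64>,
        y: &Array1<f64>,
        beta: f64,
    ) {
        // Forward pass: z1 = x · W1, a1 = σ(z1); z2 = a1 · W2, a2 = σ(z2).
        let a1 = x.dot(&*w1).mapv(sigmoid);
        let a2 = a1.dot(&*w2).mapv(sigmoid);

        // Backward pass: E = y − a2, with the sigmoid derivative
        // expressed through the activations as a · (1 − a).
        let err = y - &a2;
        let delta2 = &err * &a2.mapv(|a| a * (1.0 - a));
        let delta1 = &delta2.dot(&w2.t()) * &a1.mapv(|a| a * (1.0 - a));

        // Updates: W2 += β · a1ᵀ · δ2 and W1 += β · xᵀ · δ1. Each outer
        // product is a column view (n x 1) times a row view (1 x m).
        let d_w2 = a1
            .view()
            .insert_axis(Axis(1))
            .dot(&delta2.view().insert_axis(Axis(0)));
        let d_w1 = x
            .view()
            .insert_axis(Axis(1))
            .dot(&delta1.view().insert_axis(Axis(0)));
        w2.scaled_add(beta, &d_w2);
        w1.scaled_add(beta, &d_w1);
    }

The epoch loop from step 3 could then be as simple as:

    // xs, ys, epochs, and beta are placeholders for the loaded data
    // and hyperparameters from the earlier sketches.
    for _epoch in 0..epochs {
        for (x, y) in xs.iter().zip(ys.iter()) {
            train_step(&mut w1, &mut w2, x, y, beta); // e.g. beta = 0.05
        }
    }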
Solution Highlights (Referenced from the Uploaded File)
• 1. Data Loading
  • Use the csv and serde crates to parse the dataset efficiently.
  • Convert CSV rows into input vectors (x) and labels (y).
• 2. Network Initialization
  • Define weight matrices W1 and W2 with random values between 0 and 1 using ndarray::Array.
• 3. Forward Pass
  • Perform dot products for z1 and z2 using ndarray's dot method.
  • Apply the sigmoid function: fn sigmoid(x: f64) -> f64 { 1.0 / (1.0 + (-x).exp()) }
• 4. Backward Pass
  • Compute the gradients δ1 and δ2 using element-wise operations.
  • Update the weights using gradient descent.
• 5. Training Loop
  • Iterate through the dataset for a fixed number of epochs.
  • Perform forward and backward passes on each example to adjust the weights.
• 6. Testing and Accuracy Calculation
  • Pass the test data through the network.
  • Compare predictions (a2) with the actual labels and compute the percentage of correct classifications (see the accuracy sketch after this list).
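
A minimal sketch of the accuracy calculation in step 6, reusing the hypothetical MnistRecord type and sigmoid function from the earlier sketches; taking the index of the largest output activation as the predicted digit is the usual convention, though the slides do not spell it out:

    use ndarray::{Array1, Array2};

    /// Accuracy as a percentage of correctly classified test digits.
    fn accuracy(w1: &Array2<f64>, w2: &Array2<f64>, test: &[MnistRecord]) -> f64 {
        let correct = test
            .iter()
            .filter(|rec| {
                // Forward pass only: no weight updates at test time.
                let x = Array1::from(rec.pixels.clone());
                let a1 = x.dot(w1).mapv(sigmoid);
                let a2 = a1.dot(w2).mapv(sigmoid);
                // Predicted digit = index of the largest output activation.
                let pred = a2
                    .iter()
                    .enumerate()
                    .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
                    .map(|(i, _)| i)
                    .unwrap();
                pred == rec.label as usize
            })
            .count();
        100.0 * correct as f64 / test.len() as f64
    }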
Additional announcements

Please keep working on your project!
