Homework_6
Assignment Goals
• Get PyTorch set up for your environment.
• Familiarize yourself with the tools.
• Implement and train a basic neural network using PyTorch.
• Happy deep learning :)
Summary
Home-brewing every machine learning solution is not only time-consuming but also error-prone. One of
the reasons we’re using Python in this course is that it has some very powerful machine learning tools. Besides
common scientific computing packages such as SciPy and NumPy, it’s very helpful in practice to use frameworks
such as Scikit-Learn, TensorFlow, PyTorch, and MXNet to support your projects. These frameworks
are developed by teams of professionals and undergo rigorous testing and verification.
In this homework, we’ll be exploring the PyTorch framework. Please complete the functions in the template
provided: intro_pytorch.py.
You can work on your own machine, but remember to test on Gradescope. The following are the installation steps
for Linux. If you don’t have a Linux computer, you can use the CS lab computers for this homework. For more
instructions, see How to access CSL Machines Remotely. For example, you can connect to the CSL Linux computers
using ssh with your CS account username and password. In your terminal, simply type:
ssh {csUserName}@best-linux.cs.wisc.edu
You can use scp to transfer files: scp source destination. For example, to upload a file to the CSL
machine:
scp Desktop/intro_pytorch.py {csUserName}@best-linux.cs.wisc.edu:/home/{csUserName}
You will be working with Python 3 (instead of Python 2, which is no longer supported), version 3.8 or later.
Read more about PyTorch and Python versions here. To check your Python version, use:
python -V or python3 -V
If you have an alias set for python=python3, then both should show the same version (3.x.x).
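The same check can also be done from inside Python. The following is a minimal sketch, assuming the 3.8 minimum stated above; the helper name is_supported is hypothetical and not part of the template.

```python
import sys

MIN_VERSION = (3, 8)  # minimum version stated in this handout


def is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) part of version_info meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report the interpreter that is actually running this script.
print("Python", sys.version.split()[0], "supported:", is_supported())
```

Running this with the interpreter you plan to use (python or python3) shows whether that specific interpreter meets the requirement.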
Step 1: For simplicity, we use the venv module (feel free to use other virtual envs such as Conda).
torch, torchvision, and the Python standard library are the only imports allowed on this assignment. The grader
will likely not handle any others.
The following 5 sections explain the details of each of the functions above that you are required to implement,
respectively.
Hint 1: Note that PyTorch already provides various datasets for you to use, so there is no need to download them
manually from the Internet. Specifically,
torchvision.datasets.FashionMNIST()
can be used to retrieve and return a Dataset object of type torchvision.datasets.FashionMNIST, a wrapper that
contains image inputs (as 2D arrays) and labels ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal',
'Shirt', 'Sneaker', 'Bag', 'Ankle Boot'):
train_set = datasets.FashionMNIST('./data', train=True,
                                  download=True, transform=custom_transform)
test_set = datasets.FashionMNIST('./data', train=False,
                                 transform=custom_transform)
The train set contains images and labels we’ll be using to train our neural network; the test set contains images
and labels for model evaluation. Here we set the location where the dataset is downloaded as the data folder in the
current directory.
Note that input preprocessing can be done by passing our custom_transform as the transform argument (you don’t
need to change this part):
custom_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
• transforms.Normalize() normalizes the tensor using a mean and a standard deviation, which are passed as its
first and second arguments, respectively. Feel free to check the official documentation for more details.
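To make the normalization concrete, here is a small worked example in plain Python. It is only a sketch of the per-channel arithmetic, output = (input - mean) / std, applied to single pixel values that ToTensor() has already scaled to [0, 1]; the helper name normalize_pixel is hypothetical.

```python
MEAN, STD = 0.1307, 0.3081  # the values passed to transforms.Normalize above


def normalize_pixel(value, mean=MEAN, std=STD):
    """Apply the same arithmetic Normalize uses on each element: (x - mean) / std."""
    return (value - mean) / std


# A black pixel, a mid-gray pixel, and a white pixel after ToTensor() scaling.
for pixel in (0.0, 0.5, 1.0):
    print(f"{pixel:.1f} -> {normalize_pixel(pixel):+.4f}")
```

A pixel exactly equal to the mean maps to 0, darker pixels map to negative values, and brighter pixels map to positive values.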
Hint 2: After obtaining the Dataset object, you may wonder how to retrieve images and labels during training
and testing. Luckily, PyTorch provides a class called torch.utils.data.DataLoader that implements the iterator
protocol. It also provides useful features such as:
• Batching the data
• Shuffling the data
• Loading the data in parallel using multiprocessing
• ...
Below is the full signature (for more details, check here):
DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
batch_sampler=None, num_workers=0, collate_fn=None,
pin_memory=False, drop_last=False, timeout=0,
worker_init_fn=None, *, prefetch_factor=2,
persistent_workers=False)
As this is an introductory project, we won’t use complicated features. We ask you to set batch_size=64 for both
the train loader and the test loader. In addition, set shuffle=False for the test loader. Given a Dataset object
data_set, we can obtain its DataLoader as follows:
loader = torch.utils.data.DataLoader(data_set, batch_size = 64)
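To see what iterating over a DataLoader yields, the sketch below wraps synthetic data in a TensorDataset. The data, its 1x4x4 shape, and the variable names are assumptions made for illustration; no download is needed and this is not the assignment's dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data (assumption): 100 fake 1x4x4 "images" with labels 0-9.
images = torch.randn(100, 1, 4, 4)
labels = torch.randint(0, 10, (100,))
toy_set = TensorDataset(images, labels)

# Same batch size as requested above; shuffle=False keeps the original order.
loader = DataLoader(toy_set, batch_size=64, shuffle=False)

# Each iteration yields one (inputs, labels) batch.
batch_shapes = [(x.shape, y.shape) for x, y in loader]
for x_shape, y_shape in batch_shapes:
    print(tuple(x_shape), tuple(y_shape))
```

With 100 samples and a batch size of 64, the loader produces two batches, and the last one is smaller because drop_last defaults to False.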
Putting it all together, you should be ready to implement the get_data_loader() function. Note that when the
optional argument is unspecified, the function should return the DataLoader for the training set. If the optional
argument is set to False, the DataLoader for the test set is returned. The expected output is as follows:
>>> train_loader = get_data_loader()
>>> print(type(train_loader))
<class 'torch.utils.data.dataloader.DataLoader'>
>>> print(train_loader.dataset)
Dataset FashionMNIST
    Number of datapoints: 60000
    Root location: ./data
    Split: Train
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(0.1307,), std=(0.3081,))
           )
>>> test_loader = get_data_loader(False)
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(?, ?),
    nn.ReLU(),
    nn.Linear(?, ?),
    ...
)
After building the model, the expected output should be like this:
>>> model = build_model()
>>> print(model)
Sequential(
  (0): Flatten()
  (1): Linear(in_features=?, out_features=?, bias=True)
  (2): ReLU()
  (3): Linear(in_features=?, out_features=?, bias=True)
  ...
)
Note that the Flatten layer just serves to reformat the data.
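The effect of Flatten can be checked by inspecting tensor shapes. The sketch below uses deliberately tiny, hypothetical sizes (1x4x4 inputs, 8 hidden units, 3 classes) that are not the values required by the assignment.

```python
import torch
from torch import nn

# Hypothetical toy sizes (assumptions, not the assignment's values).
toy_model = nn.Sequential(
    nn.Flatten(),       # (N, 1, 4, 4) -> (N, 16); no learnable parameters
    nn.Linear(16, 8),   # in_features must match the flattened size
    nn.ReLU(),
    nn.Linear(8, 3),    # one raw score (logit) per class
)

batch = torch.randn(5, 1, 4, 4)   # a batch of 5 fake images
flat = toy_model[0](batch)        # output of the Flatten layer alone
logits = toy_model(batch)         # output of the whole model
print(flat.shape, logits.shape)
```

The batch dimension is preserved throughout; only the per-sample dimensions are collapsed, which is why the first Linear layer's in_features equals the number of values in one input.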
The standard training procedure contains two for loops: the outer for loop iterates over epochs, while the inner
for loop iterates over batches of (images, labels) pairs from the train DataLoader. Feel free to check the “Train
the network” part of this official tutorial for more details. Please pay attention to the order of zero_grad(),
backward(), and step(). A kind reminder: please set your model to train mode before iterating over the
dataset. This can be done with the following call:
model.train()
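The two-loop structure and the call order can be sketched on synthetic data as follows. The toy data, the model sizes, the choice of SGD with lr=0.1, CrossEntropyLoss, and the function name train_toy are all assumptions for illustration, not the settings required by the assignment.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 2 features, label = whether x0 + x1 > 0.
features = torch.randn(256, 2)
targets = (features.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(features, targets), batch_size=64)

model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()                          # assumed loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # assumed optimizer


def train_toy(model, loader, criterion, optimizer, epochs):
    """Return the average training loss of each epoch."""
    model.train()                                  # train mode before iterating
    epoch_losses = []
    for epoch in range(epochs):                    # outer loop: epochs
        running_loss, seen = 0.0, 0
        for inputs, labels in loader:              # inner loop: batches
            optimizer.zero_grad()                  # 1. clear old gradients
            loss = criterion(model(inputs), labels)
            loss.backward()                        # 2. compute new gradients
            optimizer.step()                       # 3. update the weights
            running_loss += loss.item() * inputs.size(0)
            seen += inputs.size(0)
        epoch_losses.append(running_loss / seen)
    return epoch_losses


losses = train_toy(model, loader, criterion, optimizer, epochs=5)
print([round(value, 3) for value in losses])
```

Multiplying each batch loss by the batch size before summing gives a per-sample average that stays correct even when the last batch is smaller than the others.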
We ask you to print the training status after every epoch of training in the following format (it should have 3
components per line):
Train Epoch: ? Accuracy: ?/?(??.??%) Loss: ?.???
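One way to produce a line in this shape is with an f-string. This is a sketch only: the helper name format_status is hypothetical, and the single spaces between the three components are an assumption, so compare the result against the grader's expected output.

```python
def format_status(epoch, correct, total, avg_loss):
    """Build one status line: epoch, correct/total(percent), and average loss."""
    accuracy = 100.0 * correct / total
    return (f"Train Epoch: {epoch} "
            f"Accuracy: {correct}/{total}({accuracy:.2f}%) "
            f"Loss: {avg_loss:.3f}")


print(format_status(1, 3, 4, 0.5))
```

The .2f and .3f format specifiers fix the number of decimal places, matching the ??.?? and ?.??? placeholders in the format above.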
Then the training process (for 5 epochs) will be similar to the following (numbers can be different):
model.eval()
Besides, there is no need to track gradients during testing, which can be disabled with the context manager:
with torch.no_grad():
    for data, labels in test_loader:
        ...
You are expected to print both the test loss and the test accuracy if show_loss is set to True (print the accuracy
only otherwise) in the following format:
>>> evaluate_model(model, test_loader, criterion, show_loss = False)
Accuracy: 85.39%
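The accuracy computation inside such a loop can be sketched on a tiny hand-made example. Everything here is an assumption for illustration: the inputs are already class scores, nn.Identity() stands in for a trained model, and evaluate_toy is a hypothetical name rather than the required function.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy setup (assumption): three samples whose "features" are already class scores.
scores = torch.tensor([[2.0, 1.0], [0.0, 3.0], [5.0, 4.0]])
labels = torch.tensor([0, 1, 1])
loader = DataLoader(TensorDataset(scores, labels), batch_size=2, shuffle=False)
model = nn.Identity()
criterion = nn.CrossEntropyLoss()


def evaluate_toy(model, loader, criterion):
    """Return (number correct, number seen, average loss) without tracking gradients."""
    model.eval()
    correct, total, loss_sum = 0, 0, 0.0
    with torch.no_grad():
        for data, target in loader:
            output = model(data)
            loss_sum += criterion(output, target).item() * data.size(0)
            predicted = output.argmax(dim=1)       # index of the highest score
            correct += (predicted == target).sum().item()
            total += data.size(0)
    return correct, total, loss_sum / total


correct, total, avg_loss = evaluate_toy(model, loader, criterion)
print(f"Accuracy: {100.0 * correct / total:.2f}%")
```

Here the predicted classes are 0, 1, and 0, so two of the three samples match their labels.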
Deeper Learning
Now build a similar model with 2 additional hidden layers. Deeper networks with more hidden layers can learn
more complex patterns in data, though this comes at the cost of longer training times and a greater potential for
overfitting.
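One rough way to see the added cost is to count learnable parameters. The sketch below compares two hypothetical toy models; the layer sizes (16 inputs, 8 hidden units, 3 classes) and the helper name count_parameters are assumptions, not the layers required for this part.

```python
from torch import nn


def count_parameters(model):
    """Total number of learnable values (weights and biases) in the model."""
    return sum(p.numel() for p in model.parameters())


# Hypothetical toy sizes (assumptions, not the assignment's layers).
shallow = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 3))
deeper = nn.Sequential(
    nn.Linear(16, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(),   # additional hidden layer
    nn.Linear(8, 8), nn.ReLU(),   # additional hidden layer
    nn.Linear(8, 3),
)
print(count_parameters(shallow), count_parameters(deeper))
```

Each Linear(a, b) layer contributes a * b weights plus b biases, so every extra hidden layer adds parameters that must be updated at each training step.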
Build the deeper network with the following layers in this order:
Then use the same procedure as before to train and evaluate this model.
The index is assumed to be valid. We assume the class names are as follows (note that there is no leading or
trailing whitespace in any class name):
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt',
               'Sneaker', 'Bag', 'Ankle Boot']
Deliverable
A single file named intro_pytorch.py containing the functions mentioned in the program specification section.
Please pay close attention to the format of the print statements in your functions. An incorrect format will lead
to point deductions.
Submission
Please submit your file “intro_pytorch.py” to Gradescope. Do not submit a Jupyter notebook .ipynb file. All code
except imports should be contained in functions OR under an
if __name__=="__main__":
check so that it will not run if your code is imported to another program.
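A minimal sketch of this layout follows. The function name main and its body are placeholders for illustration; the point is only that nothing other than imports and definitions runs at import time.

```python
def main():
    """Placeholder for the calls you use to try out your own functions."""
    message = "running as a script"
    print(message)
    return message


# Runs when the file is executed directly (python intro_pytorch.py),
# but not when another program imports the file.
if __name__ == "__main__":
    main()
```

With this structure, an importing program can call your functions without triggering any training or printing as a side effect of the import.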