lec-3
CS285 Deep RL
Instructor: Kyle Stachowicz
https://colab.research.google.com/drive/12nQiv6aZHXNuCfAAuTjJenDWKQbIt2Mz
http://bit.ly/cs285-pytorch-2023
Goal of this course
Train an agent to perform useful tasks
[Diagram: a loop in which the agent collects data and the data is used to train the model]
How do we train a model?
You define:
PyTorch computes:
100x faster!
Multidimensional Arrays
Multidimensional Indexing
A (Axis 0 indexes rows, Axis 1 indexes columns):
32 27 5 54 1
99 4 23 3 57
76 42 34 82 5
A.shape == (3, 5)
A[0, 3] selects a single element (row 0, column 3)
A[:, 3] selects all of column 3
A[0, :] selects all of row 0
A[0, 2:4] selects columns 2 and 3 of row 0
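The four indexing expressions above can be checked directly. This is a minimal sketch that rebuilds the slide's 3 x 5 array as a PyTorch tensor; the variable names other than `A` are illustrative, not from the slides.

```python
import torch

# The 3 x 5 array from the slide.
A = torch.tensor([
    [32, 27, 5, 54, 1],
    [99, 4, 23, 3, 57],
    [76, 42, 34, 82, 5],
])

single = A[0, 3]    # one element: row 0, column 3
column = A[:, 3]    # every row, column 3
row = A[0, :]       # row 0, every column
window = A[0, 2:4]  # row 0, columns 2 and 3 (the stop index 4 is excluded)

print(A.shape)          # torch.Size([3, 5])
print(single.item())    # 54
print(column.tolist())  # [54, 3, 82]
print(row.tolist())     # [32, 27, 5, 54, 1]
print(window.tolist())  # [5, 54]
```

The same expressions behave identically on a NumPy array.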
Multidimensional Indexing
[Figure: a 3-D array A drawn as a stack of grids, with axes labeled Axis 0, Axis 1, and Axis 2]
A.shape == (3, 5, 4)
A[0, ...] selects index 0 along Axis 0 (result shape (5, 4))
A[..., 1] selects index 1 along Axis 2 (result shape (3, 5))
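A short sketch of the ellipsis indexing above. The figure's values are not recoverable, so the tensor here is an arbitrary stand-in with the same shape (3, 5, 4); the names `first_slice` and `last_axis` are illustrative.

```python
import torch

# A stand-in 3-D tensor with the slide's shape; the values are arbitrary.
A = torch.arange(3 * 5 * 4).reshape(3, 5, 4)

first_slice = A[0, ...]  # index 0 on axis 0, everything on the other axes
last_axis = A[..., 1]    # index 1 on the last axis, everything before it

print(first_slice.shape)  # torch.Size([5, 4])
print(last_axis.shape)    # torch.Size([3, 5])

# The ellipsis is shorthand for as many full slices (:) as needed.
assert torch.equal(first_slice, A[0, :, :])
assert torch.equal(last_axis, A[:, :, 1])
```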
Broadcasting
TL;DR: Shape (1, 3, 2) acts like (6, 5, 4, 3, 2) when added to shape (6, 5, 4, 3, 2)
https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html
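The broadcasting rule in the TL;DR can be sketched with the two shapes it names. The tensor contents and the names `small`, `big`, `result`, and `explicit` are illustrative assumptions.

```python
import torch

small = torch.arange(6.0).reshape(1, 3, 2)
big = torch.zeros(6, 5, 4, 3, 2)

# Shapes are aligned from the right: (1, 3, 2) is treated as (1, 1, 1, 3, 2),
# and each size-1 axis is stretched to match the other operand.
result = big + small
print(result.shape)  # torch.Size([6, 5, 4, 3, 2])

# The same result with the expansion written out explicitly.
explicit = big + small.expand(6, 5, 4, 3, 2)
assert torch.equal(result, explicit)
```

Broadcasting does not copy `small`; the stretched axes are virtual, which is why it is cheaper than tiling the data by hand.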
Shape Operations
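The slide's own examples are not preserved in this text. As a stand-in, here is a small sketch of a few common PyTorch shape operations; the choice of operations and all variable names are assumptions, not the slide's content.

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)  # 24 elements arranged as (2, 3, 4)

flat = x.reshape(6, 4)        # merge the first two axes
inferred = x.reshape(-1, 4)   # -1 lets PyTorch infer that axis (6 here)
swapped = x.permute(2, 0, 1)  # reorder axes: (2, 3, 4) -> (4, 2, 3)
added = x.unsqueeze(0)        # insert a size-1 axis: (1, 2, 3, 4)
removed = added.squeeze(0)    # drop it again: (2, 3, 4)

print(flat.shape)     # torch.Size([6, 4])
print(swapped.shape)  # torch.Size([4, 2, 3])
print(added.shape)    # torch.Size([1, 2, 3, 4])
```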
Device Management
• NumPy: all arrays live in CPU RAM
• Torch: tensors can live in either CPU or GPU memory
• Move to GPU with .to("cuda") / .cuda()
• Move to CPU with .to("cpu") / .cpu()
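A minimal sketch of the device moves listed above. It assumes a machine that may or may not have a GPU, so it falls back to the CPU when CUDA is unavailable; the variable names are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(4, 3)  # created in CPU memory by default
x_dev = x.to(device)   # move to the chosen device (returns x itself if already there)
y_dev = x_dev * 2      # the computation runs on whichever device holds x_dev
y_cpu = y_dev.cpu()    # bring the result back to CPU memory

print(x.device, x_dev.device, y_cpu.device)
```

Selecting the device once and passing it to `.to(device)` keeps the same script usable on machines with and without a GPU.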
[Diagram: computation graph with nodes P, b, x, y, and loss]
Computing Gradients
[Diagram: computation graph with nodes P, b, x, y, and loss, with .detach() marked next to y]
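The diagram names P, b, x, y, and loss but its equations are not preserved. The sketch below assumes a hypothetical relationship, y = P @ x + b with a sum-of-squares loss, purely to illustrate `backward()` and `.detach()`; it should not be read as the slide's actual model.

```python
import torch

torch.manual_seed(0)

# Hypothetical setup: P and b are parameters, x is an input, y = P @ x + b.
P = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
x = torch.randn(3)

y = P @ x + b
loss = (y ** 2).sum()
loss.backward()  # fills P.grad and b.grad

print(P.grad.shape, b.grad.shape)  # torch.Size([2, 3]) torch.Size([2])

# .detach() returns a tensor that shares y's data but is cut out of the graph,
# so no gradient flows back through it to P or b.
y_const = y.detach()
print(y.requires_grad, y_const.requires_grad)  # True False
```

For this particular loss, the gradient with respect to b is 2 * y, which gives a quick way to sanity-check the result.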
Training Loop
REMEMBER THIS!
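The slide's code is not preserved in this text. A common shape for a PyTorch training loop is sketched below on a toy regression problem; the data, model, optimizer, and hyperparameters are all assumptions chosen for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy regression data (hypothetical): targets follow y = 3x + 1.
inputs = torch.linspace(-1, 1, 64).unsqueeze(1)
targets = 3 * inputs + 1

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(inputs), targets).item()

for step in range(200):
    optimizer.zero_grad()                   # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)  # forward pass
    loss.backward()                         # backward pass: compute gradients
    optimizer.step()                        # update the parameters

final_loss = loss_fn(model(inputs), targets).item()
print(initial_loss, final_loss)
```

The order of the four calls inside the loop is the part worth memorizing: zero the gradients, compute the loss, backpropagate, step the optimizer.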
Converting NumPy / PyTorch
NumPy -> PyTorch:
torch.from_numpy(numpy_array).float()
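A small sketch of the conversion above, plus the reverse direction. The reverse call chain is not shown in this text and is included as a commonly used pattern, not as the slide's content; the array values are illustrative.

```python
import numpy as np
import torch

numpy_array = np.array([[1.0, 2.0], [3.0, 4.0]])  # dtype float64 by default

# from_numpy keeps the NumPy dtype (float64); .float() converts to float32.
tensor = torch.from_numpy(numpy_array).float()
print(tensor.dtype)  # torch.float32

# Going back (an assumption, not from the slide):
# detach from the graph, move to the CPU, then convert.
back = tensor.detach().cpu().numpy()
print(back.dtype)  # float32
```

The `.float()` call matters because PyTorch layers default to float32 parameters, and mixing float64 inputs with them raises a dtype error.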