CS229 Section: Python Tutorial: Maya Srikanth
Content adapted from past CS229 iterations
Python
Python 2.0 released in 2000 (Python 2.7 "end-of-life" in 2020)
https://www.researchgate.net/figure/Genealogy-of-Programming-Languages-doi101371-journalpone0088941g001_fig1_260447599
Text editor/IDE options (don't settle for Notepad)
• PyCharm (IDE)
• Atom
• Notepad++/gedit
Sorting
sorted(random_list)
random_list_2 = [(3, 'z'), (12, 'r'), (6, 'e'),
                 (8, 'c'), (2, 'g')]
sorted(random_list_2, key=lambda x: x[1])
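A minimal sketch of the two sorted() calls above. The contents of random_list are an assumption (copied from the supplementary list slide); random_list_2 is the slide's own data.

```python
# random_list values are assumed for illustration; random_list_2 is from the slide.
random_list = [3, 12, 5, 6]
random_list_2 = [(3, 'z'), (12, 'r'), (6, 'e'), (8, 'c'), (2, 'g')]

# sorted() returns a new list and leaves the input unchanged.
ascending = sorted(random_list)

# key= chooses what to compare: here the letter at index 1 of each tuple.
by_letter = sorted(random_list_2, key=lambda x: x[1])

# Without a key, tuples compare element by element, so the number decides.
by_number = sorted(random_list_2)

print(ascending)
print(by_letter)
print(by_number)
```

Use list.sort() instead when an in-place sort is wanted and the original order is no longer needed.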
Dictionary and Set
Set (unordered, unique)
my_set = {i ** 2 for i in range(10)}
{0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
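A short sketch of what "unordered, unique" means in practice. The names dup, has_49, and ordered are illustrative and not from the slides.

```python
my_set = {i ** 2 for i in range(10)}

# Unique: repeated values collapse to a single element.
dup = {x % 3 for x in range(10)}

# Membership tests are the typical use of a set.
has_49 = 49 in my_set

# Unordered: do not rely on print order; sort when a fixed order is needed.
ordered = sorted(my_set)

print(dup)
print(has_49)
print(ordered)
```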
and why?
Convenient math functions, read before use!
Python Command          Description
array.dtype             Check data type of array (for precision, for weird behavior)
array_1 + 5             NumPy supports many types of algebra on an entire array
array_1 * 5
np.sqrt(array_1)
np.power(array_1, 2)
np.exp(array_1)
np.log(array_1)
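The commands in the table can be tried on a small array. The values in array_1 and int_array are assumptions chosen so the results are easy to check by hand.

```python
import numpy as np

# Assumed example data; any numeric array works.
array_1 = np.array([1.0, 4.0, 9.0])
print(array_1.dtype)  # check the dtype before doing math

# Each operation applies to every element at once.
added = array_1 + 5
scaled = array_1 * 5
roots = np.sqrt(array_1)
squares = np.power(array_1, 2)
round_trip = np.log(np.exp(array_1))  # log undoes exp, up to float error

# dtype matters: true division of an integer array yields floats.
int_array = np.array([1, 2, 3])
halves = int_array / 2
print(halves.dtype)
```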
Dot product and matrix multiplication
A few ways to write dot product:
array_1 @ array_2
array_1.dot(array_2)
np.dot(array_1, array_2)
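A small check that the three spellings agree, plus the difference between @ and *. All array values here are assumptions for illustration.

```python
import numpy as np

# Assumed 1-D vectors.
array_1 = np.array([1, 2, 3])
array_2 = np.array([4, 5, 6])

d1 = array_1 @ array_2
d2 = array_1.dot(array_2)
d3 = np.dot(array_1, array_2)
print(d1, d2, d3)

# For 2-D arrays, @ is matrix multiplication while * is elementwise.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
matmul = A @ B
elementwise = A * B
print(matmul)
print(elementwise)
```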
# Notice that the results here are THE SAME!
print(op1 + op3)
print(op1 + op3.T)
array([[ 1, 3, 5],
       [ 4, 6, 8],
       [ 7, 9, 11]])
array([[ 1, 3, 5],
       [ 4, 6, 8],
       [ 7, 9, 11]])
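One way to reproduce the output above. The definitions of op1 and op3 are not visible in this extract, so the values below are assumptions picked to match the printed result. The point is that transposing a 1-D array is a no-op, so both sums broadcast the same way.

```python
import numpy as np

# Assumed definitions; the slide's own op1 and op3 are not shown here.
op1 = np.arange(9).reshape(3, 3)
op3 = np.array([1, 2, 3])

# .T does nothing to a 1-D array, so both sums add op3 to every row.
print(op1 + op3)
print(op1 + op3.T)
same = np.array_equal(op1 + op3, op1 + op3.T)

# To add along columns instead, reshape to an explicit (3, 1) column.
column = op3.reshape(3, 1)
by_column = op1 + column
print(by_column)
```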
Broadcasting for pairwise distance
samples = np.random.random((15, 5))
# Without broadcasting
expanded1 = np.expand_dims(samples, axis=1)
tile1 = np.tile(expanded1, (1, samples.shape[0], 1))
expanded2 = np.expand_dims(samples, axis=0)
tile2 = np.tile(expanded2, (samples.shape[0], 1, 1))
diff = tile2 - tile1
distances = np.linalg.norm(diff, axis=-1)
# With broadcasting
diff = samples[:, np.newaxis, :] - samples[np.newaxis, :, :]
distances = np.linalg.norm(diff, axis=-1)
# Both approaches produce the same pairwise distances
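A quick sanity check, as a sketch, that the tiled and broadcast versions agree. The seeded generator and the names dist_tiled and dist_broadcast are assumptions added so the run is repeatable.

```python
import numpy as np

# Seeded generator (an assumption) so results are repeatable.
rng = np.random.default_rng(0)
samples = rng.random((15, 5))

# Explicit copies with np.tile: both arrays have shape (15, 15, 5).
tile1 = np.tile(samples[:, np.newaxis, :], (1, samples.shape[0], 1))
tile2 = np.tile(samples[np.newaxis, :, :], (samples.shape[0], 1, 1))
dist_tiled = np.linalg.norm(tile2 - tile1, axis=-1)

# Broadcasting: shapes (15, 1, 5) and (1, 15, 5) stretch to (15, 15, 5).
diff = samples[:, np.newaxis, :] - samples[np.newaxis, :, :]
dist_broadcast = np.linalg.norm(diff, axis=-1)

print(dist_broadcast.shape)
print(np.allclose(dist_tiled, dist_broadcast))
```

The two diff arrays differ in sign, but the norm removes that, so the distance matrices match.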
print(dot)
Wall time: 345ms Wall time: 2.9ms
An example with pairwise distance
Speedup depends on setup and nature of computation
# With loops
total_dist = []
for s1 in samples:
    for s2 in samples:
        d = np.linalg.norm(s1 - s2)
        total_dist.append(d)
avg_dist = np.mean(total_dist)
# With broadcasting
diff = samples[:, np.newaxis, :] - samples[np.newaxis, :, :]
distances = np.linalg.norm(diff, axis=-1)
avg_dist = np.mean(distances)
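A rough way to time both versions yourself. The sample size of 200 and the use of time.perf_counter are assumptions; the numbers printed will vary by machine and will not match the wall times quoted on the slide.

```python
import time
import numpy as np

rng = np.random.default_rng(0)      # seeded for repeatability (assumption)
samples = rng.random((200, 5))      # assumed size; larger inputs widen the gap

# Loop version: one norm call per pair of samples.
start = time.perf_counter()
total_dist = []
for s1 in samples:
    for s2 in samples:
        total_dist.append(np.linalg.norm(s1 - s2))
loop_avg = np.mean(total_dist)
loop_seconds = time.perf_counter() - start

# Vectorized version: one broadcast subtraction and one norm call.
start = time.perf_counter()
diff = samples[:, np.newaxis, :] - samples[np.newaxis, :, :]
vec_avg = np.mean(np.linalg.norm(diff, axis=-1))
vec_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f}s, vectorized: {vec_seconds:.4f}s")
print(np.isclose(loop_avg, vec_avg))
```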
Matplotlib / Seaborn
• Visualization (line, scatter, bar, images, and even interactive 3D)
Pandas (https://pandas.pydata.org/)
• DataFrame (database/Excel-like)
• Easy filtering, aggregation (also plotting, but fewer features than dedicated data visualization packages)
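A minimal sketch of DataFrame filtering and aggregation. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: names, sections, and scores are invented.
df = pd.DataFrame({
    "name": ["ana", "ben", "cat", "dan"],
    "section": ["A", "A", "B", "B"],
    "score": [90, 72, 85, 60],
})

# Filtering: a boolean mask keeps the rows where the condition holds.
passed = df[df["score"] >= 75]

# Aggregation: group rows by a column, then reduce each group.
mean_by_section = df.groupby("section")["score"].mean()

print(passed)
print(mean_by_section)
```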
Example plots
https://matplotlib.org/3.1.1/gallery/index.html
Import
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
Plotting
fig, ax = plt.subplots()
ax.plot(t, s)
Save/show
fig.savefig("test.png")
plt.show()
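The snippet above does not show how t and s are defined. A runnable sketch, assuming a sine wave for the data and the non-interactive "Agg" backend so it also works without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for a window
import matplotlib.pyplot as plt
import numpy as np

# Assumed data: two seconds of a shifted sine wave.
t = np.linspace(0.0, 2.0, 200)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)
ax.set(xlabel="time (s)", ylabel="signal", title="Simple line plot")

# Save to disk; with an interactive backend, plt.show() would open a window.
fig.savefig("test.png")
```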
Plot with dashed lines and legend
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.legend()
plt.show()
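The plotting calls for this slide are missing from the extract. One possible version, with assumed data (sin and cos) and assumed labels:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)  # assumed x range

fig, ax = plt.subplots()
# The format string sets the line style: "--" is dashed, ":" is dotted.
ax.plot(x, np.sin(x), "--", label="sin(x)")
ax.plot(x, np.cos(x), ":", label="cos(x)")

# legend() collects the label= text given to each plot call.
ax.legend()
fig.savefig("dashed_legend.png")
```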
Using subplot
plt.grid()
plt.plot(x, y_sin)
plt.title('Sine Wave')
plt.grid()
plt.tight_layout()
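Only part of the subplot code survives above. A fuller sketch, assuming a two-row layout with sine on top and cosine below (x, y_cos, and the second title are assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)  # assumed x range
y_sin = np.sin(x)
y_cos = np.cos(x)

fig = plt.figure()

# subplot(rows, cols, index) selects the active panel, counting from 1.
plt.subplot(2, 1, 1)
plt.plot(x, y_sin)
plt.title('Sine Wave')
plt.grid()

plt.subplot(2, 1, 2)
plt.plot(x, y_cos)
plt.title('Cosine Wave')
plt.grid()

# tight_layout() adjusts spacing so titles and tick labels do not overlap.
plt.tight_layout()
fig.savefig("subplots.png")
```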
Plot area under curve
Confusion matrix
https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
fig, ax = plt.subplots()
im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
ax.figure.colorbar(im, ax=ax)
# We want to show all ticks...
ax.set(xticks=np.arange(cm.shape[1]),
       yticks=np.arange(cm.shape[0]),
       xticklabels=classes, yticklabels=classes,
       ylabel='True label', xlabel='Predicted label',
       title=title)
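The snippet above relies on cm, classes, cmap, and title being defined elsewhere. A self-contained sketch with made-up values for all four; the cell annotations at the end are an addition for readability, not part of the slide:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical inputs: a 2x2 confusion matrix and its class names.
cm = np.array([[5, 1],
               [2, 7]])
classes = ["negative", "positive"]
title = "Confusion matrix"

fig, ax = plt.subplots()
im = ax.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
ax.figure.colorbar(im, ax=ax)
ax.set(xticks=np.arange(cm.shape[1]),
       yticks=np.arange(cm.shape[0]),
       xticklabels=classes, yticklabels=classes,
       ylabel='True label', xlabel='Predicted label',
       title=title)

# Write each count in its cell (rows are true labels, columns predictions).
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center")

fig.savefig("confusion_matrix.png")
```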
Questions?
Supplementary Slides
Where does my program start?
It just works
A function
Properly
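The callouts above annotate code that is not in this extract. A sketch of what they likely describe, with hypothetical function names: top-level statements run as soon as the file is executed, while the "proper" entry point sits behind a __name__ check.

```python
def add(a, b):
    """A function: reusable and importable without side effects."""
    return a + b


def main():
    # Hypothetical entry point; the names add and main are illustrative.
    print("2 + 3 =", add(2, 3))


# Runs only when the file is executed directly (python script.py),
# not when another file imports it.
if __name__ == "__main__":
    main()
```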
What is a class?
Instance variable
Does something with the instance
To use a class
Instantiate a class, get an instance
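The class code these labels point at is not included here. A small stand-in, with a hypothetical Counter class, that shows an instance variable, a method that uses the instance, and instantiation:

```python
class Counter:
    """Hypothetical example class; not from the original slides."""

    def __init__(self, start=0):
        # Instance variable: each instance keeps its own count.
        self.count = start

    def increment(self, step=1):
        # Does something with the instance, reached through self.
        self.count += step
        return self.count


# Instantiate the class to get an instance.
c = Counter()
c.increment()
c.increment(5)

# A second instance has independent state.
other = Counter(10)
print(c.count, other.count)
```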
2D list
list_of_list = [[1,2,3], [4,5,6], [7,8,9]]
List comprehension
initialize_a_list = [i for i in range(9)]
initialize_a_list = [i ** 2 for i in range(9)]
initialize_2d_list = [[i + j for i in range(5)] for j in range(9)]
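A comprehension is shorthand for a loop that appends. A brief sketch; the names squares, squares_loop, evens, and grid are illustrative.

```python
# Comprehension form.
squares = [i ** 2 for i in range(9)]

# Equivalent explicit loop.
squares_loop = []
for i in range(9):
    squares_loop.append(i ** 2)

# An optional if clause filters elements.
evens = [i for i in range(9) if i % 2 == 0]

# Nested comprehension: the outer loop (j) builds rows, the inner (i) columns.
grid = [[i + j for i in range(5)] for j in range(9)]

print(squares == squares_loop)
print(evens)
print(len(grid), len(grid[0]))
```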
Insert/Pop
my_list.insert(0, 'stuff')
print(my_list.pop(0))
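The snippet above assumes my_list already exists. A runnable sketch with an assumed starting list:

```python
my_list = [1, 2, 3]  # assumed starting contents

# insert(index, value) shifts later elements to the right.
my_list.insert(0, 'stuff')

# pop(index) removes and returns the element at that position.
first = my_list.pop(0)

# pop() with no argument removes and returns the last element.
last = my_list.pop()

print(first, last, my_list)
```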
More on List
Sort a list
random_list = [3,12,5,6]
sorted_list = sorted(random_list)
Comprehension
my_dict = {i: i ** 2 for i in range(10)}
my_set = {i ** 2 for i in range(10)}
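Dict comprehensions follow the same pattern as list comprehensions, with a key: value pair in front. A short sketch; inverse and even_only are illustrative names.

```python
my_dict = {i: i ** 2 for i in range(10)}

# Swap keys and values (safe here because the squares are all distinct).
inverse = {value: key for key, value in my_dict.items()}

# Filter entries with an if clause.
even_only = {key: value for key, value in my_dict.items() if key % 2 == 0}

print(my_dict[7])
print(inverse[49])
print(even_only)
```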