AI Notes

The document provides an overview of an orientation for an AI/machine learning course. It discusses what tools will be used like Zoom, Canvas, and Colab and defines key concepts like machine learning, artificial intelligence, and data science. Challenges in the field like data quality, interpretability, privacy, and bias are also outlined. Current machine learning techniques like deep learning and generative AI are explained.

Uploaded by

pantherz1617
Copyright
© All Rights Reserved

9/24/23 Orientation

We will be using Zoom. Canvas is the learning management system, the hub for course material.
Colab is the coding environment.

What is machine learning?

● It’s all around us


● You use it every day, for example on YouTube and Netflix, where machine learning algorithms
generate the suggestions.
● Self-driving cars use AI
● Spam filters are also machine learning algorithms
● Google search autocomplete also uses machine learning

● AI is the study of how to create intelligent machines that can simulate human thinking
capabilities and behavior. AI answers the question: how can we make machines smart?
● ML is a subfield of AI: how a machine can learn without being explicitly programmed.
It asks how we can make machines improve through experience.
● ML and AI are often used interchangeably
● Data science: a lot of overlap with AI.
● Deep learning: a type of machine learning. More advanced and used in things like ChatGPT
● Neural network: a bunch of perceptrons that work together to learn. Loosely inspired by
the networks of neurons in the human brain
● Generative AI: can create new content (art, text, etc.) without being explicitly programmed
to produce that specific output
● 1966: ELIZA was a chatbot that acted like a therapist. It used simple pattern matching
to respond. It had a limited vocabulary and set of responses, and no understanding of context.
It sparked debate on AI ethics.
● Generative AI can be integrated into spreadsheets
Current challenges
● Data quality and availability: AI uses lots of data. The challenge is getting large amounts
of data while protecting individuals' right to privacy, as well as getting comprehensive
datasets with representative samples.
● Interpretability and explainability: the ability to understand and explain how a machine
learning model arrives at its predictions or decisions.
○ Black box: we know what we’re inputting and what comes out, but we don’t know how the
output is produced.
○ It’s hard to improve models when you don’t understand the black box.
● Privacy and security: need to keep data private and secure
● Generalization: the ability of machine learning models to perform accurately on unseen
data. For example, someone who learns to ride a bike needs to be able to ride it in multiple
settings, not just where they practiced.
● Ethics and bias: need for comprehensive ethical frameworks that are adhered to across
industries and countries
DO WE HAVE TO DOWNLOAD THE COLAB APP

9/30/23 Lab 1 Orientation


PyPI (the Python Package Index) lists all the different Python projects (packages) you can install

Question
length = 5
width = 4
area = length*width
print(area)

Solution: this will print 20

Don’t use spaces in variable names: use _ between words

When you divide with / in Python, you will always get a float. When you print a
value, you can set how many decimals you want to print. Any number can be stored as a float.

100/13 → 7.69…
If you do //, you get floor division: the result is rounded down to a whole number. So 100//13 → 7

If you put a decimal in an operation (like when you’re multiplying), then you will get a float.
If you only have integers, then you will get an integer (except with /, which always gives a float).
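
A short sketch of the division and float rules above. The variable names are made up for illustration.

```python
true_division = 100 / 13      # / always gives a float
floor_division = 100 // 13    # // rounds down to a whole number
even_division = 10 / 2        # still a float, even though it divides evenly
int_product = 3 * 4           # only integers in, integer out
float_product = 3 * 4.0       # one decimal in, float out

print(floor_division)            # 7
print(even_division)             # 5.0
print(int_product)               # 12
print(float_product)             # 12.0
print(f"{true_division:.2f}")    # 7.69 (prints two decimals)
print(round(true_division, 3))   # 7.692 (rounds to three decimals)
```

The `:.2f` inside the f-string only changes how the number is printed; the stored value keeps all its decimals.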

Code:
import numpy as np
print(np.pi)

This code prints pi. NumPy is a library, and I called pi from the library. Instead of
needing to write out numpy every time, I can just write np. In this case I called pi, so I wrote
np.pi.
You have to import numpy before you can use it. After “as”, you can use whatever name you
want (np is the usual one).
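
A small sketch of the import alias idea, assuming NumPy is installed (it is preinstalled in Colab). The names radius, circle_area, and my_numpy are made up for illustration.

```python
import numpy as np

radius = 2
circle_area = np.pi * radius ** 2   # area of a circle = pi * r^2
print(circle_area)

# The name after "as" is your choice; np is just the common convention.
import numpy as my_numpy
print(my_numpy.pi == np.pi)   # True, both names point at the same library
```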

10/7/23 Lab 2
What we will learn: Objects, methods, attributes
Methods are behaviors that objects can perform
Variables are a way to store objects
Computers will read code from top to bottom and left to right

IDE (short for integrated development environment): an area where you can create, edit, and run
programs (PyCharm and Jupyter Notebook are good platforms)

Attributes (data): data that an object stores. Basically a variable that belongs to the object.
It stores data, and you can come back to it and change it, etc.

Methods (behaviors): actions that we perform with data. For example, the print() function is a
behavior (strictly speaking, print() is a built-in function; a method is a function that belongs
to an object)

Objects: what you get when you combine attributes with methods. Collections of attributes and
methods. The kind of object is defined by its “class”

Analogy: attributes are nouns, methods are verbs

Let’s say you were coding a phone simulator.


● Attributes: battery left, time, list of contacts, etc.
● Method: set an alarm, open app, make a call
Methods: all the apps would be a type of function, which is a method.
Arithmetic operators are also methods
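
A rough sketch of the phone simulator idea above. The class name Phone, its attributes (battery_left, contacts, alarms), and its methods (set_alarm, make_call) are all hypothetical, made up for illustration. Defining classes is not covered in these notes, so just focus on which parts are attributes and which are methods.

```python
class Phone:
    """A made-up phone object: attributes hold data, methods are actions."""

    def __init__(self, battery_left, contacts):
        # Attributes (nouns): data the phone stores
        self.battery_left = battery_left
        self.contacts = contacts
        self.alarms = []

    # Methods (verbs): actions the phone can perform
    def set_alarm(self, time):
        self.alarms.append(time)

    def make_call(self, name):
        if name in self.contacts:
            return f"Calling {name}..."
        return f"{name} is not in contacts"


my_phone = Phone(battery_left=80, contacts=["Ana", "Ben"])
my_phone.set_alarm("7:00")            # method: has ()
print(my_phone.battery_left)          # attribute: no ()  -> 80
print(my_phone.alarms)                # ['7:00']
call_result = my_phone.make_call("Ana")
print(call_result)                    # Calling Ana...

# Arithmetic operators are methods too: 2 + 3 calls the __add__ method of 2.
operator_result = (2).__add__(3)
print(operator_result)                # 5
```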
Objects:
● Types, classes, and data types

Type
● Integer
● Float (decimals)
○ Integers are easier to store. (think about bits)
○ For floats, you have many decimals to keep track of, which takes more bits to
store.
○ Integers are more efficient and make code run faster
● String: "Hi", "This is a string", "apple"
● Boolean: True, False
● List: a group of values. [1, 2, 4, 77, 98, "apple"]
● Dictionary
○ {1: "response 1", 2: "response 2"}, {"apples": 5, "bananas": 21}
○ The thing on the left is a key. The thing on the right is a value.
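
A quick sketch of the types listed above, using the built-in type() function to check each one. The variable names are made up for illustration.

```python
count = 5                                      # integer
price = 2.5                                    # float
greeting = "Hi"                                # string
is_open = True                                 # boolean
numbers = [1, 2, 4, 77, 98, "apple"]           # list
fruit_counts = {"apples": 5, "bananas": 21}    # dictionary: key on the left, value on the right

print(type(count))      # <class 'int'>
print(type(price))      # <class 'float'>
print(type(greeting))   # <class 'str'>
print(type(is_open))    # <class 'bool'>
print(type(numbers))    # <class 'list'>
print(type(fruit_counts))   # <class 'dict'>

print(numbers[0])               # 1 (lists are looked up by position, starting at 0)
print(fruit_counts["apples"])   # 5 (dictionaries are looked up by key)
```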

Built-in behaviors: +, -, /, *, ** (exponent), % (the mod operator, which gives the remainder)
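
Each of the operators above, tried on two made-up numbers:

```python
a = 7
b = 2

total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # division (always a float)
power = a ** b       # exponent: 7 squared
remainder = a % b    # mod: what is left over after dividing 7 by 2

print(total)        # 9
print(difference)   # 5
print(product)      # 14
print(quotient)     # 3.5
print(power)        # 49
print(remainder)    # 1
```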

In python, everything after a # is a comment


Compilers are what read the code (for Python, this is the interpreter)
Shortcuts in the notebook: Ctrl+Enter will run the cell. Shift+Enter will run it and go to the next
cell.

When you multiply decimals, you can get a result like 609.5999999999999… when you expected 609.6.
This is a floating-point error. It happens because decimal numbers are stored in binary, and many
decimals have no exact binary form.

decimal is a library that will solve this. You don’t have to use it.
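
A minimal sketch of the floating-point error and the decimal fix, using 0.1 + 0.2 as the example (a swapped-in example, not the one from the lab). Decimal comes from Python's standard library, so nothing needs to be installed.

```python
from decimal import Decimal

float_sum = 0.1 + 0.2
print(float_sum)            # 0.30000000000000004
print(float_sum == 0.3)     # False

# Pass the numbers as strings so Decimal stores them exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)                      # 0.3
print(decimal_sum == Decimal("0.3"))    # True
```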

Variables: store objects

variable_name = value
= is the assignment operator

In variable names
● use lower case
● use underscores to separate words
● use names that make it clear what the specific variable stores
● camelCase: first word is lowercase, and the other words are capitalized
○ Variables mostly
● snake_case: lowercase words separated by underscores
○ Functions mostly
● PascalCase (PurdueCase): each word is capitalized
○ Use this for classes
Do not use
● Numbers at the start
○ Try not to use numbers in general
● Spaces
● Special characters ($, %, @, etc.)
● Reserved words, e.g. for
○ Python already uses these to mean other things
● Built-in names: print, str

print = 'blue'
print(12)
Will give you an error, because print now holds the string 'blue' and is no longer the function

str(1)
Will turn 1 into a string

Python follows PEMDAS (order of operations)
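
A few made-up calculations showing the order of operations:

```python
no_parentheses = 2 + 3 * 4        # multiplication happens before addition
with_parentheses = (2 + 3) * 4    # parentheses go first
exponent_first = 2 * 3 ** 2       # exponents happen before multiplication
division_first = 10 - 4 / 2       # division happens before subtraction

print(no_parentheses)     # 14
print(with_parentheses)   # 20
print(exponent_first)     # 18
print(division_first)     # 8.0
```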


10/14/2023 Lab Python Basics 2
Just like a cookie cutter defines the shape of a cookie, a type defines the specifics of an object.

Built in behaviors: arithmetic operators

Object:
● frac.numerator
○ Attributes are not followed by (), but methods are. You use this to tell them apart
● object.method()
● object.attribute
● object.num_rows
○ Attribute
● object.method(arg1)
○ arg1 is an argument: a value you pass to the method inside the (), which changes
what the method does
● object.denominator
○ Attribute
● object.method(arg1, arg2)
● fraction.evaluate()
○ Method
● object.capitalize()
○ Method
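
To practice recognizing the format, here is a sketch using real objects from the Python standard library. Fraction has numerator and denominator attributes; it has no evaluate() method (that name in the list above is just an example), so float() is used here instead. The variable names are made up.

```python
from fractions import Fraction

frac = Fraction(3, 4)
print(frac.numerator)      # attribute, no ()  -> 3
print(frac.denominator)    # attribute, no ()  -> 4
print(float(frac))         # 0.75

word = "hello"
capitalized = word.capitalize()     # method with no arguments
l_count = word.count("l")           # method with one argument
replaced = word.replace("l", "L")   # method with two arguments

print(capitalized)   # Hello
print(l_count)       # 2
print(replaced)      # heLLo
```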

max = baby()
● The object name would be max
baby() is the function call that creates the object

If you want to do stuff with the baby object:

print(max.attribute_name)
max.method_name()

Just recognize the format


Since there’s no period in max = baby(), baby() is not an attribute or method
Python doesn’t know what a baby() object is. We have to define it first, but we won’t learn that today.

Methods are actions

This is an example of a method
● my_message = 'hi'
● new_message = my_message.upper()
● print(new_message)
HI

Another example
● grocery_list = ["flour", "eggs"]
● grocery_list.append("milk")
● print(grocery_list)
['flour', 'eggs', 'milk']

Debugging:
● As a programmer, you need to diagnose problems in your code
Types of errors
● Syntax errors
● Runtime errors
○ Occur during execution, similar to a car breaking down during a drive
■ Maybe Python on your device crashes
■ Your IDE crashes
● Logic Errors
○ The code runs but produces unexpected results, similar to a car behaving
unpredictably
○ For example, the program tells us 2+3 = 8
○ Can be tricky to diagnose
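
A sketch of one example of each error type, written so the whole block still runs. The broken code for the syntax error sits inside a string and is checked with the built-in compile() function; the add function and its bug are made up to match the 2+3 = 8 example.

```python
# 1. Syntax error: Python cannot even read the code (missing closing parenthesis).
try:
    compile("print('hi'", "<example>", "exec")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False
print(syntax_ok)   # False

# 2. Runtime error: the code is readable, but it fails while running.
try:
    result = 10 / 0
    runtime_error_name = None
except ZeroDivisionError as error:
    runtime_error_name = type(error).__name__
print(runtime_error_name)   # ZeroDivisionError

# 3. Logic error: the code runs with no error message, but the answer is wrong.
def add(a, b):
    return a * b + 2   # bug: this should be a + b

wrong_answer = add(2, 3)
print(wrong_answer)   # 8, but we wanted 5
```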
Comments
● You can comment your code
● You can debug code by commenting out a line
Use hashtags (#) for comments

● Stack Overflow
○ You can ask a specific question and random people can answer, but answers are usually
monitored by moderators
● Documentation:

Objects: things we write in Python. They have data, which are attributes. With the attributes, you
can do behaviors, which are methods.
Attributes are an object’s variables

Python: object-oriented programming

Generally, you don’t want to modify attributes directly. Sometimes you can, but sometimes you can’t.
If you want to change an attribute, you should use methods.
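
A sketch of both cases. Battery is a hypothetical class made up for illustration: its charge() method changes the percent attribute safely. Fraction (from the standard library) is an example of an object whose attributes cannot be changed directly.

```python
from fractions import Fraction


class Battery:
    """Made-up example: change the attribute through a method."""

    def __init__(self):
        self.percent = 50

    def charge(self, amount):
        # The method can enforce a rule: never go above 100.
        self.percent = min(100, self.percent + amount)


battery = Battery()
battery.charge(80)
print(battery.percent)   # 100, not 130

# Sometimes you can't modify an attribute at all.
frac = Fraction(1, 2)
try:
    frac.numerator = 5
    could_modify = True
except AttributeError:
    could_modify = False
print(could_modify)      # False
print(frac)              # 1/2, unchanged
```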
10/21/23 Lab
Built-in objects are things like num = 1, frac = 2.5

General (custom) objects are things we create: baby(), rect(arg1, arg2), etc.

For attributes, the name is usually a noun


For methods, the name is usually an action

Libraries: collections of pre-written Python code that you can import and use

Codebooks: datasets often come with codebooks. They give critical info about the data and act as a
dictionary or map for your dataset.

A codebook can include labels, variables, classes, unique values, missing data, descriptions, etc.

With a codebook, you can understand what the labels mean. This helps you understand your data.

pandas: a python library that is used to read, analyze, and clean data. It serves as the backbone
of most data projects

Other libraries

pandas (overview of everything), matplotlib (plotting), seaborn (plotting and analytical graphs)

Installation:
!pip install pandas

Importing it:
import pandas as pd
● If you import it as pd, then you reference pandas as pd. pd is conventionally what is used for
pandas.

DataFrames: a 2D data structure, like a 2D list/array or a table with rows and columns. Pandas
turns data into tables

Pandas DataFrame serves as the primary object for storing and manipulating data in a
structured format.

df = pd.DataFrame()
Creates an (empty) DataFrame
df is the variable name
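
A minimal sketch of creating a DataFrame, assuming pandas is installed. The data is made up for illustration: passing a dictionary makes each key a column name and each list a column.

```python
import pandas as pd

# Made-up data: each key is a column, each list holds that column's values
data = {
    "name": ["Ana", "Ben", "Cy"],
    "age": [12, 15, 11],
}
df = pd.DataFrame(data)
print(df)

# With nothing in the (), the DataFrame is empty
empty_df = pd.DataFrame()
print(empty_df.shape)   # (0, 0)
```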

You can use the pandas documentation to look up many things


How pandas reads data. Two ways:

df = pd.read_csv("www.url_to_csv_file")

Or

from sklearn import datasets


example = datasets.load_example_dataset()
(load_example_dataset is a placeholder name; a real one is datasets.load_iris())

head() syntax:

df.head(NUM OF ROWS)

When you start working with data, head() helps you quickly glimpse the first few rows of your
DataFrame. The default, with nothing in the (), is 5 rows.

df.info()

Gives a concise summary of the DataFrame, including how many non-null values each column has.
It helps in identifying missing data and understanding the datatype composition

df.attribute
df.shape
Shows (rows, columns)
Types of useful attributes:
index, columns, dtypes, size, ndim (number of dimensions)

The number of instances is the number of rows.

df.describe()
Gives a statistical summary of the numeric columns (like mean, std, and the median, shown as the 50% row)
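
A sketch that tries head(), info(), the attributes, and describe() on a small made-up DataFrame, assuming pandas is installed. Note that methods have () and attributes do not.

```python
import pandas as pd

# Made-up data for illustration
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay"],
    "age": [12, 15, 11, 14, 13, 15],
    "grade": [7, 9, 6, 9, 8, 9],
})

first_rows = df.head()    # default: first 5 rows
first_two = df.head(2)    # first 2 rows
print(first_two)

df.info()                 # column names, non-null counts, and data types

print(df.shape)           # (6, 3) -> (rows, columns)
print(df.size)            # 18 -> rows * columns
print(df.ndim)            # 2 -> number of dimensions
print(list(df.columns))   # ['name', 'age', 'grade']

summary = df.describe()   # statistics for the numeric columns only
print(summary)
```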

value_counts()
df['column name'].value_counts()
Counts how many times each unique value appears in the column.

unique()
df['column name'].unique()
Returns an array of the unique values in the column
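
A short sketch of value_counts() and unique() on a made-up column, assuming pandas is installed:

```python
import pandas as pd

# Made-up data for illustration
df = pd.DataFrame({
    "fruit": ["apple", "banana", "apple", "cherry", "apple", "banana"],
})

counts = df["fruit"].value_counts()   # how many times each value appears
print(counts)
print(counts["apple"])                # 3

unique_fruits = df["fruit"].unique()  # each different value once, in order of appearance
print(list(unique_fruits))            # ['apple', 'banana', 'cherry']
```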
Data Analysis Functions

df['column 1 name']

Look at the data in one column

df[['column 1 name', 'column 2 name']]

There are 2 brackets because it’s a list inside the brackets: the inner list holds the column names

df.iloc[first row:last row]


View or analyze specific rows. The last row is excluded. So if the last row is 16,
then row 16 isn’t part of it; the rows stop at 15.

df[df['column name'] conditional]

Keeps only the rows where the condition is true

Exploring values

df['column name'][row number]

df.iloc[row number, column number]

iloc is used to access rows (and columns) by position
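
The selection formats above, tried on a small made-up DataFrame (assuming pandas is installed). The condition age > 12 is just an example of a conditional.

```python
import pandas as pd

# Made-up data for illustration
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "age": [12, 15, 11, 14],
})

ages = df["age"]                  # one column
subset = df[["name", "age"]]      # several columns: a list inside the brackets
first_two_rows = df.iloc[0:2]     # rows 0 and 1 (row 2 is excluded)
older = df[df["age"] > 12]        # only rows where the condition is true
one_value = df["age"][1]          # the age in row 1
cell = df.iloc[2, 0]              # row 2, column 0

print(first_two_rows)
print(older)
print(one_value)   # 15
print(cell)        # Cy
```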

10/28/2023 Lab (Plotting in Matplotlib)


Data Visualization is in all parts of the ML process

Scatter plots
● Plot continuous data. X values can have multiple y values. Scatter plots are good for
showing correlations.

Line Plots
● Consecutive data points are connected. One x value has one y value. Tracks things
over time. For continuous data.

Bar Plots
● For discrete x data and continuous y data, where each data point is visualized as a bar whose
height corresponds to its y-value. The x axis is for categorical data, and the y axis is for
numerical data
Libraries:
● Visualization libraries
○ Software tools that assist in data visualization
○ Picking the right library can affect the results you get
● Matplotlib
○ Highly customizable, extensive community support. Lots of people use matplotlib,
so there is lots of documentation. You don’t need to memorize a lot because you can just
search it up.
○ Used for academic research, quick data analysis, creating basic graphs
○ Sometimes the plots look kind of ugly
● Seaborn
○ Similar to matplotlib, with similar syntax
○ Looks better and has extensive customization
○ Better for analyzing more complex data
● Plotly and ggplot
○ Not as common
○ Specialized
○ Plotly: interactive, web-ready plots. For dashboards and data apps.
○ ggplot: known for its structured approach, which makes complex visualizations
easier to create and understand.
!pip install matplotlib
import matplotlib.pyplot as plt

● pyplot is a submodule of matplotlib, and it is where most of the plotting functions are.
Other submodules could be for customizing.

Scatter Plots

plt.scatter(x_var, y_var)


plt.show()

x_var is a list of all the x coordinates, and y_var is a list of all the y coordinates. They should
correspond to each other (same length, same order).

Plot title (set it before calling plt.show()):
● plt.title("TITLE")
X and Y axes
● plt.xlabel("label")
● plt.ylabel("label")
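
Putting the scatter plot pieces together, as a sketch with made-up data (assuming matplotlib is installed). The title and labels are set before plt.show(), which draws the finished plot.

```python
import matplotlib.pyplot as plt

# Made-up data for illustration: hours studied vs. test score
hours_studied = [1, 2, 3, 4, 5]
test_scores = [55, 62, 70, 78, 90]

points = plt.scatter(hours_studied, test_scores)
title = plt.title("Study Time vs. Test Score")
plt.xlabel("Hours studied")
plt.ylabel("Test score")
plt.show()
```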

Line Plots

plt.plot(x_var, y_var)


plt.show()
Plot title (set it before calling plt.show()):
● plt.title("TITLE")
X and Y axes
● plt.xlabel("label")
● plt.ylabel("label")

Bar Graphs

plt.bar(x_categories, y_heights)
plt.show()

● x_categories are usually strings, and y_heights are numbers


Bar Width

plt.bar(x, height, width=0.8)


plt.show()

● 0.8 is the default width
● x is the name of the list of categories
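
A sketch of a bar graph with a custom width, using made-up categories and heights (assuming matplotlib is installed):

```python
import matplotlib.pyplot as plt

# Made-up data for illustration
fruits = ["apples", "bananas", "cherries"]   # categories on the x axis
counts = [5, 21, 8]                          # bar heights on the y axis

bars = plt.bar(fruits, counts, width=0.5)    # thinner than the 0.8 default
plt.title("Fruit Counts")
plt.xlabel("Fruit")
plt.ylabel("Count")
plt.show()
```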

You can use markers in matplotlib. This is for customizing.


● Colors, size, transparency, etc.

Legends: give each plot a label, then call plt.legend()

plt.scatter(x, y, label='text')

plt.legend()
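
A sketch combining a line plot, a scatter plot, some customization, and a legend. The data and labels are made up; color, marker, s (marker size), and alpha (transparency) are real matplotlib parameters.

```python
import matplotlib.pyplot as plt

# Made-up data for illustration
days = [1, 2, 3, 4, 5]
team_a = [2, 4, 5, 7, 9]
team_b = [1, 3, 6, 6, 8]

# label is the text that will appear in the legend
plt.plot(days, team_a, label="Team A", color="blue", marker="o")
plt.scatter(days, team_b, label="Team B", color="red", s=60, alpha=0.5)

legend = plt.legend()   # call this after the plots that have labels
plt.xlabel("Day")
plt.ylabel("Points")
plt.show()
```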

11/4/23 Lab
Library for ML:
sklearn

Everything we will learn can be done with sklearn


They have built-in datasets to learn from, and there is good documentation
Kaggle is a good website for ML

import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets, model_selection

url = ‘the-css
model_selection has a function (train_test_split) that automatically splits your data. You give it
the features, the labels, and the test_size, and it returns four datasets: training features,
test features, training labels, and test labels.
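
A minimal sketch of the split, assuming scikit-learn is installed. It uses the built-in iris dataset as a stand-in (the lab's own dataset is not shown in these notes), and the test_size of 0.2 and random_state of 42 are arbitrary choices for illustration.

```python
from sklearn import datasets, model_selection

iris = datasets.load_iris()   # built-in dataset with 150 rows
features = iris.data          # the measurements (inputs)
labels = iris.target          # the flower species (answers)

# test_size=0.2 keeps 20% of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    features, labels, test_size=0.2, random_state=42
)

print(X_train.shape)   # (120, 4)
print(X_test.shape)    # (30, 4)
print(y_train.shape)   # (120,)
print(y_test.shape)    # (30,)
```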

Model = sklearn_model
