Numpy Array

The document provides an overview of data science fundamentals, focusing on the use of NumPy for numerical computing in Python. It covers key concepts such as creating and manipulating NumPy arrays, performing basic and advanced operations, and utilizing universal functions for efficient computations. Additionally, it discusses aggregation techniques for data analysis, highlighting their purpose and applications in summarizing datasets.

OCS353 - DATA SCIENCE FUNDAMENTALS

8. Environment Management
• %env: List environment variables
• %store: Store a variable in the IPython database

9. Extension Management
• %load_ext: Load an IPython extension
• %unload_ext: Unload an IPython extension

10. Help and Documentation
• %help: Show help for a magic command or function
• %quickref: Show a quick reference guide for IPython

11. Memory Usage
• %memit: Measure memory usage of a statement or expression
• %mprun: Profile memory usage of a statement or expression

12. Parallel Computing
• %parallel: Run a command in parallel
• %px: Run a command in parallel with Xtrae
• %pxresult: Get the result of a parallel computation

13. Other
• %install_ext: Install an IPython extension
• %install_nbext: Install a Jupyter notebook extension
• %uninstall_ext: Uninstall an IPython extension
• %uninstall_nbext: Uninstall a Jupyter notebook extension

4. NUMPY ARRAYS

NumPy is a powerful library for numerical computing in Python. It provides support for arrays, which
are more efficient than Python lists for numerical operations. Here are some basic and advanced
operations you can perform with NumPy arrays.
Creating NumPy Arrays
 numpy.array(): Create an array from a Python list or tuple
 numpy.zeros(): Create an array filled with zeros
 numpy.ones(): Create an array filled with ones
 numpy.random.rand(): Create an array with random values
NumPy Array Properties
 shape: The number of dimensions and size of each dimension
 dtype: The data type of the array elements
 size: The total number of elements in the array
Indexing and Slicing
 arr[index]: Access a single element
 arr[start:stop:step]: Access a slice of elements
 arr[start:stop]: Access a slice of elements with default step size 1
Basic Operations
 arr + arr: Element-wise addition
 arr - arr: Element-wise subtraction
 arr * arr: Element-wise multiplication
 arr / arr: Element-wise division
Advanced Operations
 numpy.dot(): Compute the dot product of two arrays
 numpy.cross(): Compute the cross product of two arrays

 numpy.inner(): Compute the inner product of two arrays

 numpy.outer(): Compute the outer product of two arrays
Array Functions
 numpy.sum(): Compute the sum of all elements in an array
 numpy.mean(): Compute the mean of all elements in an array
 numpy.median(): Compute the median of all elements in an array
 numpy.std(): Compute the standard deviation of all elements in an array
 numpy.var(): Compute the variance of all elements in an array
Array Comparison
 numpy.equal(): Compare two arrays element-wise
 numpy.not_equal(): Compare two arrays element-wise
 numpy.greater(): Compare two arrays element-wise
 numpy.less(): Compare two arrays element-wise
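As a minimal sketch of the product and comparison functions listed above, assuming two small 3-element vectors (u and v are illustrative names, not taken from the notes):

```python
import numpy as np

# Two illustrative 1-D vectors (assumed values)
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

dot = np.dot(u, v)        # 1*4 + 2*5 + 3*6 = 32
inner = np.inner(u, v)    # same as the dot product for 1-D arrays
cross = np.cross(u, v)    # vector perpendicular to both u and v
outer = np.outer(u, v)    # 3x3 matrix with outer[i, j] = u[i] * v[j]

print("dot:", dot)
print("inner:", inner)
print("cross:", cross)
print("outer:\n", outer)

# Comparison functions return a boolean array of the same shape
print("u < v:", np.less(u, v))
```

For 1-D inputs np.dot and np.inner agree; they differ only for arrays with more dimensions.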

Creating a NumPy array
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)

Output:
[1 2 3 4 5]

Basic operations
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(arr1 + arr2)
print(arr1 * arr2)

Output:
[5 7 9]
[ 4 10 18]

Indexing and slicing
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(arr[3])
print(arr[2:5])

Output:
4
[3 4 5]

Array functions
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(np.sum(arr))
print(np.mean(arr))

Output:
15
3.0

Reshaping an array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
arr = arr.reshape(2, 3)
print(arr)

Output:
[[1 2 3]
 [4 5 6]]

Array Comparison
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 4])
print(np.equal(arr1, arr2))

Output:
[ True  True False]

Concatenating arrays
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)

Output:
[1 2 3 4 5 6]

Splitting an array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
arr1, arr2 = np.split(arr, 2)
print(arr1)
print(arr2)

Output:
[1 2 3]
[4 5 6]

Example Program: Overall Operations using Numpy Array

import numpy as np

# Creating arrays
a = np.array([1, 2, 3])
b = np.array([(1, 2, 3), (4, 5, 6)])
c = np.arange(0, 10, 2)
d = np.linspace(0, 1, 5)
e = np.zeros((2, 3))
f = np.ones((2, 3))
g = np.eye(3)
h = np.random.random((2, 3))

# Displaying arrays
print("Array a:\n", a)
print("Array b:\n", b)
print("Array c (arange):\n", c)
print("Array d (linspace):\n", d)
print("Array e (zeros):\n", e)
print("Array f (ones):\n", f)
print("Array g (identity matrix):\n", g)
print("Array h (random values):\n", h)

# Array properties
print("Shape of array b:", b.shape)
print("Size of array b:", b.size)
print("Data type of array a:", a.dtype)

# Array operations
i = np.array([1, 2, 3])
j = np.array([4, 5, 6])
print("i + j:\n", i + j)
print("i * j:\n", i * j)

# Matrix operations
k = np.array([[1, 2], [3, 4]])
l = np.array([[5, 6], [7, 8]])
print("Matrix product of k and l:\n", np.dot(k, l))

# Aggregate functions
m = np.array([1, 2, 3, 4, 5])
print("Sum of array m:", np.sum(m))
print("Mean of array m:", np.mean(m))
print("Standard deviation of array m:", np.std(m))

# Indexing and slicing

print("First element of array a:", a[0])
n = np.array([1, 2, 3, 4, 5])
print("Elements from index 1 to 3 of array n:", n[1:4])

5. UNIVERSAL FUNCTIONS

Universal functions (ufuncs) in NumPy are functions that operate element-wise on arrays, supporting
broadcasting, type casting, and other standard features. They are essential for performing vectorized
operations, which are both more concise and more efficient than using Python loops.
Key Characteristics of Ufuncs
1. Element-wise Operations: Ufuncs apply operations element-wise, which means they operate
on each element of the input arrays independently.
2. Broadcasting: Ufuncs support broadcasting, which allows them to work with arrays of different
shapes in a flexible manner.
3. Performance: Ufuncs are implemented in C and are optimized for performance, making them
much faster than equivalent Python loops.
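The characteristics above can be sketched as follows. This is an illustrative example only: the array values and variable names are assumptions, and it demonstrates equivalence of results rather than measuring speed.

```python
import numpy as np

values = np.arange(1, 6)          # [1 2 3 4 5]

# Explicit Python loop: squares one element at a time
squares_loop = np.empty_like(values)
for idx in range(values.size):
    squares_loop[idx] = values[idx] ** 2

# Ufunc: the same operation applied to the whole array in one call
squares_ufunc = np.square(values)

# Broadcasting: a (3, 1) column combined with a (3,) row gives a (3, 3) table
column = np.array([[1], [2], [3]])
row = np.array([10, 20, 30])
table = np.add(column, row)

print(squares_ufunc)
print(np.array_equal(squares_loop, squares_ufunc))
print(table)
```

Both approaches give the same values; the ufunc version avoids the per-element Python overhead of the loop.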
Common Ufuncs:

The examples below share the following setup:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

Arithmetic Operations
 np.add
result = np.add(a, b)
print("Addition:", result)
 np.subtract
result = np.subtract(a, b)
print("Subtraction:", result)
 np.multiply
result = np.multiply(a, b)
print("Multiplication:", result)
 np.divide
result = np.divide(a, b)
print("Division:", result)
 np.power
result = np.power(a, 2)
print("Power:", result)

Mathematical Functions
 Square Root: np.sqrt
result = np.sqrt(a)
print("Square Root:", result)
 Exponential: np.exp
result = np.exp(a)
print("Exponential:", result)
 Logarithm: np.log
result = np.log(a)
print("Logarithm:", result)
 Trigonometric Functions: np.sin, np.cos, np.tan
angle = np.array([0, np.pi/2, np.pi])
print("Sine:", np.sin(angle))
print("Cosine:", np.cos(angle))
print("Tangent:", np.tan(angle))

Statistical Functions (these reduce an array to a single value; strictly speaking they are aggregation functions rather than ufuncs)
 Mean: np.mean
result = np.mean(a)
print("Mean:", result)
 Standard Deviation: np.std
result = np.std(a)
print("Standard Deviation:", result)
 Sum: np.sum
result = np.sum(a)
print("Sum:", result)

Comparison Operators
 Greater Than: np.greater
result = np.greater(a, b)
print("Greater Than:", result)
 Less Than: np.less
result = np.less(a, b)
print("Less Than:", result)
 Equal: np.equal
result = np.equal(a, b)
print("Equal:", result)

Example Code:
import numpy as np
# Arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Arithmetic Operations
print("Addition:", np.add(a, b))
print("Subtraction:", np.subtract(a, b))
print("Multiplication:", np.multiply(a, b))
print("Division:", np.divide(a, b))
# Mathematical Functions
print("Square Root:", np.sqrt(a))
print("Exponential:", np.exp(a))
print("Logarithm:", np.log(a))
# Trigonometric Functions
angle = np.array([0, np.pi/2, np.pi])
print("Sine:", np.sin(angle))
print("Cosine:", np.cos(angle))
print("Tangent:", np.tan(angle))
# Statistical Functions
print("Mean:", np.mean(a))
print("Standard Deviation:", np.std(a))
print("Sum:", np.sum(a))
# Comparison Operators
print("Greater Than:", np.greater(a, b))
print("Less Than:", np.less(a, b))
print("Equal:", np.equal(a, b))
6. AGGREGATIONS

Aggregation in data science refers to the process of summarizing or combining multiple data points to produce a
single result or a smaller set of results. This is a fundamental concept used to simplify and analyze large datasets,
making it easier to draw insights and make decisions. Aggregation can be performed in various ways, depending
on the type of data and the analysis being conducted.

Definition: Aggregation is the process of combining multiple pieces of data to produce a summary
result.

Purpose: The primary purpose of aggregation is to simplify and summarize data, making it easier to
analyze and interpret. This helps in identifying trends, patterns, and anomalies.

6.1 AGGREGATION TECHNIQUES

Group By: Grouping data based on one or more columns and then applying an aggregation function. For
example, grouping sales data by region and then calculating the total sales per region.

Pivot Tables: Reshaping data by turning unique values from one column into multiple columns,
providing a summarized dataset.
Rolling Aggregation: Calculating aggregates over a rolling window, such as a moving average.
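The three techniques above can be sketched with pandas. The DataFrame below is hypothetical sample data (the column names region, quarter, and sales are assumptions made for illustration):

```python
import pandas as pd

# Hypothetical sales data, for illustration only
df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales": [250, 400, 150, 100],
})

# Group By: total sales per region
by_region = df.groupby("region")["sales"].sum()

# Pivot table: one row per region, one column per quarter
pivot = df.pivot_table(index="region", columns="quarter",
                       values="sales", aggfunc="sum")

# Rolling aggregation: moving average over a window of 2 rows
moving_avg = df["sales"].rolling(window=2).mean()

print(by_region)
print(pivot)
print(moving_avg)
```

The first value of the moving average is NaN because a window of 2 rows is not yet full at the first row.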

Common Aggregation Techniques:

1. Sum:
 Adds up all the values in a dataset. Commonly used to calculate total sales, total expenses, etc.
 Calculate the total value of a column or group of data points.

Function: SUM()

total_sales = df['sales'].sum()

2. Mean (Average):
 Calculates the average value of a dataset.
 Calculate the mean value of a column or group of data points.

Function: AVG()

average_age = df['age'].mean()
3. Median:
Finds the middle value in a dataset, which is less affected by outliers than the mean.
median_income = df['income'].median()
4. Mode:
Identifies the most frequently occurring value in a dataset.
most_common_category = df['category'].mode()
5. Count:
Counts the number of entries in a dataset, often used to determine the number of occurrences of a
specific value.
count_of_sales = df['sales'].count()
6. Min and Max:
 Finds the minimum and maximum values in a dataset.
 Find the maximum or minimum value in a column or group of data points.
Function: MAX(), MIN()

min_salary = df['salary'].min()
max_salary = df['salary'].max()
7. Standard Deviation and Variance:
 Measures the spread or dispersion of the data around the mean.
 Calculate the spread or dispersion of a column or group of data points.
Function: STDDEV(), VAR()

std_dev = df['scores'].std()
variance = df['scores'].var()
8. Group By:
 Aggregates data based on one or more categories. This is often used in conjunction with other
aggregation functions.
 Group data by one or more columns and apply aggregations.
sales_by_region = df.groupby('region')['sales'].sum()
6.2 APPLICATIONS OF AGGREGATION
Descriptive Statistics:
Aggregation is used to describe the main features of a dataset quantitatively. For example, summarizing
the central tendency and dispersion of data.
Data Cleaning:
Aggregation can help in identifying and handling missing values, outliers, and inconsistencies in the
data.
Data Visualization:
Aggregated data is often used to create plots and charts, making it easier to visualize trends and
patterns.
Feature Engineering:
Aggregation can be used to create new features from existing data, improving the performance of
machine learning models.
Reporting:
Aggregated data is commonly used in business reports and dashboards to provide a high-level overview
of key metrics.
Example Code: Using Pandas
import pandas as pd
# Sample data
data = {
'region': ['North', 'South', 'East', 'West', 'North', 'South'],
'sales': [250, 150, 200, 300, 400, 100],
'expenses': [100, 50, 80, 120, 150, 60]

}
df = pd.DataFrame(data)
# Sum of sales
total_sales = df['sales'].sum()
# Average expenses
average_expenses = df['expenses'].mean()
# Sales by region
sales_by_region = df.groupby('region')['sales'].sum()
print(f"Total Sales: {total_sales}")
print(f"Average Expenses: {average_expenses}")
print("Sales by Region:")
print(sales_by_region)
7. COMPUTATION ON ARRAYS
Computation on arrays is a fundamental aspect of data science, enabling efficient data manipulation,
analysis, and machine learning. Arrays, especially as implemented in libraries like NumPy, provide a
powerful way to handle large datasets and perform a wide range of mathematical operations. Here,
we'll explore the essential aspects of array computations in data science.
Key Concepts
1. Array Creation and Initialization
Creating and initializing arrays is the first step in performing any computation. Arrays can be created
from lists, using functions like np.array, or from scratch using functions like np.zeros, np.ones, and
np.full.
import numpy as np
# From a list
arr = np.array([1, 2, 3, 4])
# From scratch
zeros = np.zeros((3, 3))
ones = np.ones((2, 2))
full = np.full((2, 3), 7)

2. Array Operations
NumPy supports a variety of element-wise operations, such as addition, subtraction, multiplication, and
division, as well as more complex mathematical functions like exponentiation, logarithms, and
trigonometric functions.
# Element-wise operations
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

sum_arr = arr1 + arr2

diff_arr = arr1 - arr2

prod_arr = arr1 * arr2

quot_arr = arr1 / arr2
3. Broadcasting
Broadcasting (covered in detail in Section 7.1) allows operations on arrays of different shapes, so there is
no need to reshape arrays explicitly.
arr = np.array([[1, 2, 3], [4, 5, 6]])
scalar = 3
result = arr + scalar # Broadcasting scalar to the shape of arr

4. Indexing and Slicing

Efficiently accessing and manipulating array elements is crucial. NumPy provides powerful indexing
and slicing capabilities.
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Indexing
element = arr[1, 2] # Accessing the element at row 1, column 2
# Slicing
slice_arr = arr[:, 1:3] # Slicing columns 1 to 2 for all rows

5. Aggregation
Aggregation functions like sum, mean, median, min, and max help summarize data.
arr = np.array([[1, 2, 3], [4, 5, 6]])
total_sum = np.sum(arr)
column_mean = np.mean(arr, axis=0)
row_max = np.max(arr, axis=1)

6. Linear Algebra
NumPy provides support for linear algebra operations, including dot products, matrix multiplication,
determinants, and inverses.
# Dot product
vec1 = np.array([1, 2])
vec2 = np.array([3, 4])
dot_product = np.dot(vec1, vec2)
# Matrix multiplication
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
mat_mult = np.matmul(mat1, mat2)
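Determinants and inverses are mentioned above but not shown. A minimal sketch, assuming a small invertible 2x2 matrix chosen for illustration:

```python
import numpy as np

# Illustrative invertible matrix (assumed values)
mat = np.array([[1.0, 2.0], [3.0, 4.0]])

det = np.linalg.det(mat)      # 1*4 - 2*3 = -2, up to floating-point rounding
inv = np.linalg.inv(mat)      # the inverse exists because det is non-zero

# The @ operator is equivalent to np.matmul for 2-D arrays
identity_check = mat @ inv

print("Determinant:", det)
print("Inverse:\n", inv)
print("mat @ inv is identity:", np.allclose(identity_check, np.eye(2)))
```

Comparing with np.allclose rather than == is the safer choice here, since floating-point rounding usually leaves tiny errors in the product.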
7.1 BROADCASTING
Broadcasting is a powerful mechanism in NumPy (a popular library for numerical computations in
Python) that allows for element-wise operations on arrays of different shapes. When performing
arithmetic operations, NumPy automatically stretches the smaller array along the dimension with size
1 to match the shape of the larger array. This allows for efficient computation without the need for
explicitly replicating the data.
Broadcasting Rules:
To understand how broadcasting works, it's important to know the rules that govern it:
 If the arrays differ in their number of dimensions, the shape of the array with fewer dimensions
is padded with ones on its left side.
 If the shapes of the two arrays do not match in some dimension, the array whose size is 1 in
that dimension is stretched to match the other shape.
 If in any dimension the sizes differ and neither is equal to 1, an error is raised.
In summary, these rules make arrays compatible for element-wise operations as follows:
 Align Shapes: Shapes are compared after the shape with fewer dimensions has been padded
with ones on its left side.
 Shape Compatibility: Arrays are compatible for broadcasting if, in every dimension, the sizes
are equal or one of them is 1.
 Result Shape: The resulting shape is the maximum size along each dimension from the input
arrays.
Examples of Broadcasting
Example 1: Adding a Scalar to an Array
import numpy as np
arr = np.array([1, 2, 3])
scalar = 5
result = arr + scalar
print(result)

Output: [6 7 8]

Example 2: Adding Arrays of Different Shapes

arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3])
result = arr1 + arr2
print(result)

Output: [[2 4 6]
[5 7 9]]
Example 3: More Complex Broadcasting
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[1], [2]])
result = arr1 + arr2
print(result)

Output: [[2 3 4]
[6 7 8]]
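The broadcasting rules can also be checked without doing any arithmetic. The sketch below uses np.broadcast to report the result shape and shows the error raised for incompatible shapes; the arrays a, b, c, and d are illustrative assumptions.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones(3)         # treated as shape (1, 3), then stretched to (2, 3)
c = np.ones((2, 1))    # the size-1 axis is stretched to size 3
d = np.ones(2)         # treated as (1, 2): 2 vs 3 in the last axis, incompatible

print(np.broadcast(a, b).shape)
print(np.broadcast(a, c).shape)

try:
    a + d
    compatible = True
except ValueError as err:
    compatible = False
    print("Broadcast error:", err)
```

The failing case illustrates the third rule: the trailing dimensions are 3 and 2, and neither is 1.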
Practical Applications
Normalizing Data
Broadcasting is useful for normalizing data, subtracting the mean, and dividing by the standard
deviation for each feature.

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
normalized_data = (data - mean) / std
print(normalized_data)

Output:
[[-1.22474487 -1.22474487 -1.22474487]
[0. 0. 0. ]
[ 1.22474487 1.22474487 1.22474487]]

Element-wise Operations
Broadcasting simplifies scaling each column of a matrix by a different factor.
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
scaling_factors = np.array([0.1, 0.2, 0.3])
scaled_matrix = matrix * scaling_factors
print(scaled_matrix)

Output:
[[0.1 0.4 0.9]
[0.4 1. 1.8]
[0.7 1.6 2.7]]
8. FANCY INDEXING
Fancy indexing, also known as advanced indexing, is a technique in data science and programming
(particularly in Python with NumPy and pandas) that allows for more flexible and powerful ways to
access and manipulate data arrays or dataframes. It involves using arrays or sequences of indices to
select specific elements or slices from an array or dataframe.
Fancy indexing refers to using arrays of indices to access multiple elements of an array simultaneously.
Instead of accessing elements one by one, you can pass a list or array of indices to obtain a subset of
elements. This technique can be used for both reading from and writing to arrays.
NumPy Fancy Indexing
 NumPy is a fundamental package for scientific computing with Python, providing support for
arrays and matrices.
In NumPy, fancy indexing is done by passing arrays of indices inside square brackets. Here’s an
example:
import numpy as np
# Create a NumPy array
arr = np.array([10, 20, 30, 40, 50])
# Fancy indexing with a list of indices
indices = [0, 2, 4]
subset = arr[indices]
print(subset)

Output: [10 30 50]
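Fancy indexing also works on the left-hand side of an assignment. A minimal sketch of writing through an index list, using a separate array named scores (an assumed name, chosen so the arr example is left unchanged):

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

# Writing: assign new values at positions 0, 2 and 4
scores[[0, 2, 4]] = [11, 31, 51]
print(scores)

# In-place update of a subset
scores[[1, 3]] += 5
print(scores)

# Reading with fancy indexing returns a copy, not a view
picked = scores[[0, 1]]
picked[0] = -1
print(scores[0])   # unchanged by the edit to picked
```

Because the read produces a copy, modifying picked leaves scores untouched; only direct assignment through the index list changes the original array.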

Boolean Indexing
Another form of fancy indexing is boolean indexing, where you use boolean arrays to select elements:
mask = arr > 30
subset = arr[mask]
print(subset)

Output: [40 50]

Fancy Indexing in 2D Arrays
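Fancy indexing can also be applied to multi-dimensional arrays. The index arrays for each axis are
paired element by element, so the example below selects the elements at positions (0, 2), (1, 1), and (2, 0).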
# Create a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Fancy indexing with row and column indices
row_indices = [0, 1, 2]
col_indices = [2, 1, 0]
subset = arr2d[row_indices, col_indices]
print(subset)
Output: [3 5 7]
Fancy Indexing in pandas
Using loc and iloc:
loc is used for label-based indexing, while iloc is used for integer-based indexing.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({
'A': [10, 20, 30, 40, 50],
'B': [5, 10, 15, 20, 25]
})
# Fancy indexing with .iloc (integer-location based indexing)
subset = df.iloc[[0, 2, 4]]
print(subset)
Combined Indexing Techniques
Fancy indexing can be combined with other indexing techniques to achieve complex selections.
# Combined indexing
subset = df.iloc[[0, 2, 4], [0, 1]]
print(subset)
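The examples above use iloc only. For comparison, here is a sketch of label-based selection with loc, assuming a hypothetical copy of the same data with string row labels (labeled_df and the labels a to e are illustrative):

```python
import pandas as pd

# Same values as above, but with hypothetical string labels as the index
labeled_df = pd.DataFrame(
    {"A": [10, 20, 30, 40, 50], "B": [5, 10, 15, 20, 25]},
    index=["a", "b", "c", "d", "e"],
)

# Label-based: rows 'a', 'c', 'e' and column 'B'
by_label = labeled_df.loc[["a", "c", "e"], ["B"]]

# Position-based: the same selection expressed with integer positions
by_position = labeled_df.iloc[[0, 2, 4], [1]]

# loc also accepts a boolean condition on the rows
by_condition = labeled_df.loc[labeled_df["A"] > 30, "B"]

print(by_label)
print(by_label.equals(by_position))
print(by_condition)
```

Here loc and iloc return the same rows because the labels a, c, and e sit at positions 0, 2, and 4.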

Applications in Data Science

Fancy indexing is particularly useful in various data science tasks, including:
 Data Cleaning: Selecting and modifying subsets of data based on certain conditions.
 Data Analysis: Efficiently extracting and analyzing specific parts of a dataset.
 Machine Learning: Preprocessing data by selecting specific features or samples.
 Visualization: Selecting specific data points to visualize.
 Data Selection: Extract specific elements, rows, or columns from large datasets.
 Data Filtering: Filter data based on conditions or criteria.
 Data Transformation: Apply operations to specific subsets of data.
 Efficient Computations: Perform efficient computations on selected data without looping.
9. SORTING ARRAYS
Sorting means putting elements in an ordered sequence.
An ordered sequence is any sequence whose elements follow a defined order, such as numeric or
alphabetical, ascending or descending.
The NumPy ndarray object has a method called sort() that sorts the array in place.
Sorting in NumPy
1. Simple Sorting
 numpy.sort() returns a sorted copy of the array.
 numpy.ndarray.sort() sorts the array in-place.
import numpy as np
arr = np.array([3, 1, 2, 5, 4])
sorted_arr = np.sort(arr)
print(sorted_arr) # Output: [1 2 3 4 5]
arr.sort()
print(arr)
Output: [1 2 3 4 5]
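np.sort always sorts in ascending order. A minimal sketch of two common ways to obtain descending order, with an assumed example array named values:

```python
import numpy as np

values = np.array([3, 1, 2, 5, 4])

# Option 1: sort ascending, then reverse with a slice
descending = np.sort(values)[::-1]

# Option 2: sort the negated values, then negate back (numeric data only)
descending_alt = -np.sort(-values)

print(descending)
print(descending_alt)
print(values)   # np.sort returned copies, so the original is unchanged
```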
2. Sorting Multi-dimensional Arrays
 You can sort along a specified axis using the axis parameter.
arr_2d = np.array([[3, 1, 2], [5, 4, 6]])
sorted_arr_2d = np.sort(arr_2d, axis=0) # Sort down each column (along axis 0)
print(sorted_arr_2d) # Each column is already ascending, so the result equals the input

Output: [[3 1 2]
[5 4 6]]
sorted_arr_2d = np.sort(arr_2d, axis=1) # Sort within each row (along axis 1)
print(sorted_arr_2d)

Output: [[1 2 3]
[4 5 6]]
3. Argsort for Indirect Sorting
 numpy.argsort() returns the indices that would sort an array.
arr = np.array([3, 1, 2, 5, 4])
indices = np.argsort(arr)
print(indices)

Output: [1 2 0 4 3]
sorted_arr = arr[indices]
print(sorted_arr)

Output: [1 2 3 4 5]
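Indirect sorting is most useful for reordering one array according to the values in another. A minimal sketch, assuming two hypothetical parallel arrays of student names and scores:

```python
import numpy as np

# Hypothetical parallel arrays: student_scores[i] belongs to student_names[i]
student_names = np.array(["Asha", "Ravi", "Meena"])
student_scores = np.array([72, 95, 88])

order = np.argsort(student_scores)         # indices from lowest to highest score
ranked_names = student_names[order]        # names in ascending score order

# Reverse the order and keep the first two entries for the top scorers
top_two = student_names[order[::-1][:2]]

print(order)
print(ranked_names)
print(top_two)
```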

Sorting by Multiple Keys

 You can sort a structured array by multiple fields.
data = np.array([(1, 'first', 200),
(2, 'second', 100),
(3, 'third', 150)],
dtype=[('id', 'i4'), ('name', 'U10'), ('score', 'i4')])
sorted_data = np.sort(data, order=['score', 'id'])
print(sorted_data)

Output: [(2, 'second', 100) (3, 'third', 150) (1, 'first', 200)]
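Custom Sorting
 You can use numpy.lexsort() to sort by several keys held in separate arrays. The last key in the
tuple is the primary sort key, so the example below sorts by name first and by age among equal names.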
names = np.array(['Betty', 'John', 'Alice', 'Alice'])
ages = np.array([25, 34, 30, 22])
indices = np.lexsort((ages, names))
sorted_data = list(zip(names[indices], ages[indices]))
print(sorted_data)

Output: [('Alice', 22), ('Alice', 30), ('Betty', 25), ('John', 34)]

10. STRUCTURED DATA

NumPy's Structured Arrays:
NumPy's structured arrays (also known as record arrays) are a powerful feature for handling
heterogeneous data, where each element can have multiple fields of different data types. Structured
arrays allow you to define complex data structures and perform efficient operations on them.
Creating Structured Arrays
1. Defining Data Types
 You can define a structured array by specifying a list of tuples, where each tuple represents a
field's name and its data type.
import numpy as np
dtype = [('name', 'U10'), ('age', 'i4'), ('weight', 'f4')]
data = np.array([('Alice', 25, 55.5), ('Bob', 30, 75.2)], dtype=dtype)
print(data)

Output: [('Alice', 25, 55.5) ('Bob', 30, 75.2)]

2. Accessing Fields
 You can access individual fields of the structured array using the field names.
names = data['name']
ages = data['age']
weights = data['weight']
print(names)

Output: ['Alice' 'Bob']

print(ages)
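Field access combines naturally with boolean masks and in-place updates. A minimal sketch using the same field layout as above; the array name people and the age threshold are assumptions for illustration:

```python
import numpy as np

dtype = [("name", "U10"), ("age", "i4"), ("weight", "f4")]
people = np.array([("Alice", 25, 55.5), ("Bob", 30, 75.2)], dtype=dtype)

# Filter records with a boolean mask built from one field
older = people[people["age"] > 26]
print(older["name"])

# A field is a view into the array, so in-place updates change the records
people["age"] += 1
print(people["age"])

# A single record can be indexed by position, then by field name
first = people[0]
print(first["name"], first["weight"])
```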
