Unit III Using NumPy
NumPy stands for Numerical Python. It is a Python library for working with arrays. In plain Python, a list can
serve as an array, but lists are slow to process for numerical work. The NumPy array is a powerful N-dimensional
array object, and the library also provides linear algebra, Fourier transform, and random number capabilities.
For numerical operations, its array object is much faster than traditional Python lists.
Types of Array:
1. One Dimensional Array
2. Multi-Dimensional Array
One Dimensional Array:
A one-dimensional array is a type of linear array.
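# importing numpy module
import numpy as np
# creating list
list_1 = [1, 2, 3, 4]
# creating numpy array from the list
sample_array = np.array(list_1)
print("List in python :", list_1)
print("Numpy Array in python :", sample_array)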
Output:
List in python : [1, 2, 3, 4]
Numpy Array in python : [1 2 3 4]
print(type(list_1))
print(type(sample_array))
Output:
<class 'list'>
<class 'numpy.ndarray'>
Multi-Dimensional Array:
Data in multidimensional arrays are stored in tabular form.
Example:
# importing numpy module
import numpy as np
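# creating list
list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]
list_3 = [9, 10, 11, 12]
# creating numpy array from the three lists
sample_array = np.array([list_1, list_2, list_3])
print("Numpy multi dimensional array in python\n", sample_array)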
Output:
Numpy multi dimensional array in python
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
4. Data type objects (dtype): A data type object (dtype) is an instance of the numpy.dtype class. It describes how the
bytes in the fixed-size block of memory corresponding to an array item should be interpreted.
Example:
# Import module
import numpy as np
# Creating the array
sample_array_1 = np.array([[0, 4, 2]])
sample_array_2 = np.array([0.2, 0.4, 2.4])
# display data type
print("Data type of the array 1 :",sample_array_1.dtype)
print("Data type of array 2 :",sample_array_2.dtype)
Output:
Data type of the array 1 : int32
Data type of array 2 : float64
1. numpy.array(): This function creates an array from a list, tuple, or other array-like object.
Syntax: numpy.array(parameter)
Example:
# import module
import numpy as np
# creating an array
arr = np.array([3,4,5,5])
print("Array :",arr)
Output:
Array : [3 4 5 5]
2. numpy.fromiter(): The fromiter() function creates a new one-dimensional array from an iterable object.
Example:
#Import numpy module
import numpy as np
# iterable
iterable = (a*a for a in range(8))
arr = np.fromiter(iterable, float)
print("fromiter() array :",arr)
Output:
fromiter() array : [ 0. 1. 4. 9. 16. 25. 36. 49.]
3. numpy.arange(): This is an inbuilt NumPy function that returns evenly spaced values within a given interval.
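A minimal sketch of numpy.arange(), in the same style as the other examples. The start, stop, and step values are chosen only for illustration; note that the stop value itself is excluded from the result.

```python
import numpy as np

# Values from 0 up to (but not including) 10, in steps of 2
evens = np.arange(0, 10, 2)
print("arange() array :", evens)

# With a single argument, arange counts from 0 in steps of 1
first_five = np.arange(5)
print("arange(5) array :", first_five)
```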
4. numpy.linspace(): This function returns a specified number of evenly spaced values between two limits (both limits are included by default).
Example:
import numpy as np
np.linspace(3.5, 10, 3)
Output:
array([ 3.5 , 6.75, 10. ])
5. numpy.empty(): This function creates a new array of given shape and type without initializing its values, so the contents are arbitrary and may differ from run to run.
Example:
import numpy as np
np.empty([4, 3], dtype = np.int32, order = 'f')
Output:
array([[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11],
[ 4, 8, 12]])
6. numpy.ones(): This function is used to get a new array of given shape and type, filled with ones(1).
Example:
import numpy as np
np.ones([4, 3], dtype = np.int32, order = 'f')
Output:
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
7. numpy.zeros(): This function is used to get a new array of given shape and type, filled with zeros(0).
Example:
import numpy as np
np.zeros([4, 3], dtype = np.int32, order = 'f')
Output:
array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])
8. numpy.full(): This function creates an array of a given shape and fills it with a specified value, so that all
the elements have the same predefined value. This is useful when you want to initialize an array with a specific value.
Example:
import numpy as np
full_array_2d = np.full((3, 4), 5)
print(full_array_2d)
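Output :
[[5 5 5 5]
[5 5 5 5]
[5 5 5 5]]
The basic attributes of an array (its type, number of dimensions, shape, size, and element type) can be printed in
the same way. For a 2 x 3 integer array, the report looks like this:
Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64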
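The attribute report above can be reproduced with a sketch like the following. The array values are an assumption made for illustration; only the 2 x 3 shape and integer element type are taken from the report. The element type printed on the last line depends on the platform (int64 on most systems, int32 on some).

```python
import numpy as np

# Illustrative 2 x 3 integer array (the values are assumed)
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

print("Array is of type:", type(arr))
print("No. of dimensions:", arr.ndim)
print("Shape of array:", arr.shape)
print("Size of array:", arr.size)
print("Array stores elements of type:", arr.dtype)
```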
Array Creation
There are various ways to create arrays in NumPy.
For example, you can create an array from a regular Python list or tuple using the array function.
The type of the resulting array is deduced from the type of the elements in the sequences.
Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers
several functions to create arrays with initial placeholder content. These minimize the necessity
of growing arrays, an expensive operation.
For example: np.zeros, np.ones, np.full, np.empty, etc.
To create sequences of numbers, NumPy provides a function analogous to range that returns arrays
instead of lists.
arange: returns evenly spaced values within a given interval; the step size is specified.
linspace: returns evenly spaced values within a given interval; the number of elements (num) is specified.
Reshaping array: We can use the reshape method to reshape an array. Consider an array with shape
(a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2, b3, …,
bM). The only required condition is:
a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM (i.e., the total number of elements in the array remains unchanged).
Flatten array: We can use the flatten method to get a copy of the array collapsed into one dimension. It
accepts an order argument. The default value is ‘C’ (for row-major order). Use ‘F’ for column-major
order.
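A short sketch of the reshape condition and the two flatten orders described above. The 3 x 4 starting array is chosen only for illustration.

```python
import numpy as np

# 12 elements arranged as 3 rows x 4 columns
arr = np.arange(12).reshape(3, 4)

# Reshaping to 2 x 2 x 3 is allowed because 3 * 4 == 2 * 2 * 3
reshaped = arr.reshape(2, 2, 3)
print("Reshaped array shape:", reshaped.shape)

# flatten() returns a one-dimensional copy of the array
flat_c = arr.flatten()            # row-major ('C') order, the default
flat_f = arr.flatten(order='F')   # column-major ('F') order
print("Row-major flatten:", flat_c)
print("Column-major flatten:", flat_f)
```

Asking for a shape whose element count differs from 12, such as arr.reshape(5, 2), raises a ValueError.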
Example:
# Python program to demonstrate
# array creation techniques
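import numpy as np
# Creating array from list with type float
a = np.array([[1, 2, 4], [5, 8, 7]], dtype='float')
print("Array created using passed list:\n", a)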
Output :
Array created using passed list:
[[ 1. 2. 4.]
[ 5. 8. 7.]]
Array Indexing
Knowing the basics of array indexing is important for analysing and manipulating the array object. NumPy offers
many ways to do array indexing.
Slicing: Just like lists in python, NumPy arrays can be sliced. As arrays can be multidimensional,
you need to specify a slice for each dimension of the array.
Integer array indexing: In this method, lists are passed for indexing for each dimension. One to
one mapping of corresponding elements is done to construct a new arbitrary array.
Boolean array indexing: This method is used when we want to pick elements from array which
satisfy some condition.
Example:
# Python program to demonstrate
# indexing in numpy
import numpy as np
# An exemplar array
arr = np.array([[-1, 2, 0, 4],
[4, -0.5, 6, 0],
[2.6, 0, 7, 8],
[3, -7, 4, 2.0]])
# Slicing array
temp = arr[:2, ::2]
print("Array with first 2 rows and alternate columns (0 and 2):\n", temp)
# Integer array indexing example
temp = arr[[0, 1, 2, 3], [3, 2, 1, 0]]
print("\nElements at indices (0, 3), (1, 2), (2, 1), (3, 0):\n", temp)
# boolean array indexing example
cond = arr > 0 # cond is a boolean array
temp = arr[cond]
print("\nElements greater than 0:\n", temp)
Output :
Array with first 2 rows and alternate columns (0 and 2):
[[-1. 0.]
[ 4. 6.]]
Elements at indices (0, 3), (1, 2), (2, 1), (3, 0):
[4. 6. 0. 3.]
Elements greater than 0:
[2. 4. 4. 6. 2.6 7. 8. 3. 4. 2.]
a = np.array([1, 2, 5, 3])
# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])
print("Transpose of array:\n", a.T)
arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])
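import numpy as np
a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])
# add arrays
print("Array sum:\n", a + b)
# multiply arrays (elementwise multiplication)
print("Array multiplication:\n", a * b)
# matrix multiplication
print("Matrix multiplication:\n", a.dot(b))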
Output:
Array sum:
[[5 5]
[5 5]]
Array multiplication:
[[4 6]
[6 4]]
Matrix multiplication:
[[ 8 5]
[20 13]]
Universal functions (ufunc): NumPy provides familiar mathematical functions such as sin, cos,
exp, etc. These functions also operate elementwise on an array, producing an array as output.
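Note: All the operations we did above using overloaded operators can be done using ufuncs like np.add, np.subtract,
np.multiply, np.divide, etc. Aggregation functions such as np.sum are also available, although np.sum itself is a reduction rather than a ufunc.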
# Python program to demonstrate
# universal functions in numpy
import numpy as np
# exponential values
a = np.array([0, 1, 2, 3])
print("Exponent of array elements:", np.exp(a))
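The equivalence between overloaded operators and ufuncs mentioned in the note can be checked with a small sketch. The arrays a and b here are illustrative 2 x 2 arrays; np.array_equal is used only to compare the results.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3], [2, 1]])

# The ufunc and the overloaded operator give the same elementwise result
sum_ufunc = np.add(a, b)
prod_ufunc = np.multiply(a, b)
print("np.add matches a + b:", np.array_equal(sum_ufunc, a + b))
print("np.multiply matches a * b:", np.array_equal(prod_ufunc, a * b))

# Mathematical ufuncs also work element by element
angles = np.array([0, np.pi / 2, np.pi])
sines = np.sin(angles)
print("Sine of array elements:", sines)
```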
Data Type
Every ndarray has an associated data type (dtype) object, which describes the layout of the array.
It gives us information about:
Type of the data (integer, float, Python object etc.)
Size of the data (number of bytes)
Byte order of the data (little-endian or big-endian)
If the data type is a sub-array, what is its shape and data type.
The values of an ndarray are stored in a buffer, which can be thought of as a contiguous block of memory bytes. The
dtype object specifies how these bytes are interpreted.
Every NumPy array is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative
integers. NumPy provides a large set of numeric data types that can be used to construct arrays.
At the time of array creation, NumPy tries to guess a data type, but functions that construct arrays usually also
include an optional argument to explicitly specify the data type.
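# dt is assumed here to be a 32-bit integer dtype; its original definition is not shown
dt = np.dtype(np.int32)
print("Size is:", dt.itemsize)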
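As a rough illustration of dtype guessing versus an explicit dtype argument, consider the sketch below. The array contents and the choice of np.int16 are assumptions made for the example. The inferred integer type depends on the platform, so no fixed output is shown for it.

```python
import numpy as np

# NumPy infers the dtype from the elements
ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.0, 3.0])
print("Inferred dtypes:", ints.dtype, floats.dtype)

# The dtype can also be specified explicitly at creation time
small = np.array([1, 2, 3], dtype=np.int16)
print("Explicit dtype:", small.dtype)
print("Bytes per element:", small.dtype.itemsize)
print("Byte order:", small.dtype.byteorder)
```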
x = np.array([1, 2, 3, 4, 5])
x < 3 # less than
array([ True, True, False, False, False], dtype=bool)
x > 3 # greater than
array([False, False, False, True, True], dtype=bool)
x <= 3 # less than or equal
array([ True, True, True, False, False], dtype=bool)
x >= 3 # greater than or equal
array([False, False, True, True, True], dtype=bool)
x != 3 # not equal
array([ True, True, False, True, True], dtype=bool)
x == 3 # equal
array([False, False, True, False, False], dtype=bool)
rng = np.random.RandomState(0)
x = rng.randint(10, size=(3, 4))
x
array([[5, 0, 3, 3],
[7, 9, 3, 5],
[2, 4, 7, 6]])
x < 6
array([[ True, True, True, True],
[False, False, True, True],
[ True, True, False, False]], dtype=bool)
In each case, the result is a Boolean array, and NumPy provides a number of straightforward patterns
for working with these Boolean results.
If we're interested in quickly checking whether any or all the values are true, we can use (you guessed
it) np.any or np.all:
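A minimal sketch of np.any and np.all, reusing the values of the 3 x 4 array x shown above. The thresholds 8 and 10 are chosen only for illustration.

```python
import numpy as np

x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

# Is at least one value greater than 8?
any_above_8 = np.any(x > 8)
# Are all values less than 10?
all_below_10 = np.all(x < 10)
# Passing an axis checks each row separately
rows_all_below_8 = np.all(x < 8, axis=1)

print(any_above_8, all_below_10, rows_all_below_8)
```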
Boolean operators
We've already seen how we might count, say, all days with rain less than four inches, or all days with
rain greater than two inches. But what if we want to know about all days with rain less than four
inches and greater than one inch? This is accomplished through Python's bitwise logic operators, &, |,
^, and ~. Like with the standard arithmetic operators, NumPy overloads these as ufuncs which work
element-wise on (usually Boolean) arrays.
The following table summarizes the bitwise Boolean operators and their equivalent ufuncs:
Operator    Equivalent ufunc
&           np.bitwise_and
|           np.bitwise_or
^           np.bitwise_xor
~           np.bitwise_not
x
array([[5, 0, 3, 3],
[7, 9, 3, 5],
[2, 4, 7, 6]])
We can obtain a Boolean array for this condition easily, as we've already seen:
x < 5
array([[False, True, True, True],
[False, False, True, False],
[ True, True, False, False]], dtype=bool)
Now to select these values from the array, we can simply index on this Boolean array; this is known
as a masking operation:
x[x < 5]
array([0, 3, 3, 3, 2, 4])
What is returned is a one-dimensional array filled with all the values that meet this condition; in other
words, all the values in positions at which the mask array is True.
We are then free to operate on these values as we wish. For example, we can compute some relevant
statistics on our Seattle rain data:
# construct a mask of all rainy days
rainy = (inches > 0)
# construct a mask of all summer days (June 21st is the 172nd day)
days = np.arange(365)
summer = (days > 172) & (days < 262)
print("Median precip on rainy days in 2014 (inches): ",
np.median(inches[rainy]))
print("Median precip on summer days in 2014 (inches): ",
np.median(inches[summer]))
print("Maximum precip on summer days in 2014 (inches): ",
np.max(inches[summer]))
print("Median precip on non-summer rainy days (inches):",
np.median(inches[rainy & ~summer]))
Median precip on rainy days in 2014 (inches): 0.194881889764
Median precip on summer days in 2014 (inches): 0.0
Maximum precip on summer days in 2014 (inches): 0.850393700787
Median precip on non-summer rainy days (inches): 0.200787401575
By combining Boolean operations, masking operations, and aggregates, we can very quickly answer
these sorts of questions for our dataset.
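The same pattern can be tried on a small array. The rainfall values below are hypothetical and are not the Seattle data; they only show how comparison results are combined with & and ~ and then used as masks. The parentheses around each comparison are needed because & binds more tightly than < and >.

```python
import numpy as np

# Hypothetical daily rainfall in inches (not the Seattle data set)
inches = np.array([0.0, 0.5, 1.2, 0.0, 2.5, 0.8, 4.1, 1.5])

# Days with more than 1 inch AND less than 4 inches of rain
moderate = (inches > 1) & (inches < 4)
print("Number of moderate days:", np.sum(moderate))
print("Rain on moderate days:", inches[moderate])

# Rainy days that are NOT moderate
rainy = inches > 0
other_rainy = inches[rainy & ~moderate]
print("Rain on other rainy days:", other_rainy)
```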
When you use and or or, it's equivalent to asking Python to treat the object as a single Boolean entity.
In Python, all nonzero integers will evaluate as True. Thus:
bool(42), bool(0)
(True, False)
bool(42 and 0)
False
bool(42 or 0)
True
When you use & and | on integers, the expression operates on the bits of the element, applying the and
or the or to the individual bits making up the number:
bin(42)
'0b101010'
bin(59)
'0b111011'
bin(42 & 59)
'0b101010'
bin(42 | 59)
'0b111011'
Notice that the corresponding bits of the binary representation are compared in order to yield the
result.
When you have an array of Boolean values in NumPy, this can be thought of as a string of bits where
1 = True and 0 = False, and the result of & and | operates similarly to above:
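A small sketch of this behaviour follows. The Boolean arrays A and B are assumed values chosen for illustration.

```python
import numpy as np

# Illustrative Boolean arrays (assumed values)
A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

# | and & compare the arrays element by element
either = A | B
both = A & B
print(either)
print(both)
```

Using or on such arrays instead asks for a single truth value for the whole array object, which is not a well-defined value: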
A or B
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-38-5d8e4f2e21c0> in <module>()
----> 1 A or B
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or
a.all()
Similarly, when doing a Boolean expression on a given array, you should use | or & rather than or or
and:
x = np.arange(10)
(x > 4) & (x < 8)
array([False, False, False, False, False, True, True, True, False, False], dtype=bool)
Trying to evaluate the truth or falsehood of the entire array, as in (x > 4) and (x < 8), will give the same
ValueError we saw previously:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or
a.all()
So remember this: and and or perform a single Boolean evaluation on an entire object, while & and |
perform multiple Boolean evaluations on the content (the individual bits or bytes) of an object. For
Boolean NumPy arrays, the latter is nearly always the desired operation.
Fancy Indexing
Fancy indexing is conceptually simple: it means passing an array of indices to access multiple array
elements at once. For example, consider the following array:
In [1]:
import numpy as np
rand = np.random.RandomState(42)
x = rand.randint(100, size=10)
print(x)
[51 92 14 71 60 20 82 86 74 74]
Suppose we want to access three different elements. We could do it like this:
In [2]:
[x[3], x[7], x[2]]
Out[2]:
[71, 86, 14]
Alternatively, we can pass a single list or array of indices to select several elements in one step:
In [3]:
ind = [3, 7, 4]
x[ind]
Out[3]:
array([71, 86, 60])
When using fancy indexing, the shape of the result reflects the shape of the index arrays rather than
the shape of the array being indexed:
In [4]:
ind = np.array([[3, 7],
[4, 5]])
x[ind]
Out[4]:
array([[71, 86],
[60, 20]])
Fancy indexing also works in multiple dimensions. Consider the following array:
In [5]:
X = np.arange(12).reshape((3, 4))
X
Out[5]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Like with standard indexing, the first index refers to the row, and the second to the column:
In [6]:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
X[row, col]
Out[6]:
array([ 2, 5, 11])
Notice that the first value in the result is X[0, 2], the second is X[1, 1], and the third is X[2, 3]. The
pairing of indices in fancy indexing follows all the broadcasting rules that were mentioned
in Computation on Arrays: Broadcasting. So, for example, if we combine a column vector and a row
vector within the indices, we get a two-dimensional result:
In [7]:
X[row[:, np.newaxis], col]
Out[7]:
array([[ 2, 1, 3],
[ 6, 5, 7],
[10, 9, 11]])
Here, each row value is matched with each column vector, exactly as we saw in broadcasting of
arithmetic operations. For example:
In [8]:
row[:, np.newaxis] * col
Out[8]:
array([[0, 0, 0],
[2, 1, 3],
[4, 2, 6]])
It is always important to remember with fancy indexing that the return value reflects the broadcasted
shape of the indices, rather than the shape of the array being indexed.
Combined Indexing
For even more powerful operations, fancy indexing can be combined with the other indexing schemes
we've seen:
In [9]:
print(X)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
We can combine fancy and simple indices:
In [10]:
X[2, [2, 0, 1]]
Out[10]:
array([10, 8, 9])
We can also combine fancy indexing with slicing:
In [11]:
X[1:, [2, 0, 1]]
Out[11]:
array([[ 6, 4, 5],
[10, 8, 9]])
And we can combine fancy indexing with masking:
In [12]:
mask = np.array([1, 0, 1, 0], dtype=bool)
X[row[:, np.newaxis], mask]
Out[12]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
All of these indexing options combined lead to a very flexible set of operations for accessing and
modifying array values.
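As a brief sketch of the modifying side, fancy indexing can also appear on the left-hand side of an assignment. The array and the index values here are illustrative.

```python
import numpy as np

x = np.arange(10)
ind = np.array([2, 1, 8, 4])

# Assign one value to all of the selected positions
x[ind] = 99
print(x)

# Any assignment-type operator works, for example in-place subtraction
x[ind] -= 10
print(x)
```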
Example: Selecting Random Points
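One common use of fancy indexing is the selection of subsets of rows from a matrix. For example, the
following code draws 100 points from a two-dimensional normal distribution and stores them as a 100 x 2 matrix: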
In [13]:
mean = [0, 0]
cov = [[1, 2],
[2, 5]]
X = rand.multivariate_normal(mean, cov, 100)
X.shape
Out[13]:
(100, 2)
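One way to pick a random subset of these points, given as a sketch rather than the only approach, is to draw distinct row indices and pass them as a fancy index. The subset size of 20 and the use of the choice method of RandomState are assumptions made for illustration.

```python
import numpy as np

rand = np.random.RandomState(42)
mean = [0, 0]
cov = [[1, 2],
       [2, 5]]
X = rand.multivariate_normal(mean, cov, 100)

# Choose 20 distinct row indices at random (no repeats)
indices = rand.choice(X.shape[0], 20, replace=False)

# Fancy indexing with those indices selects the matching rows
selection = X[indices]
print(selection.shape)
```

Because the index array has 20 entries and each row has 2 columns, the selection has shape (20, 2).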