
data science

An ndarray is a multidimensional container for homogeneous data, characterized by its shape and dtype. Functions such as np.zeros, np.ones, and np.empty create arrays, while the astype method converts between data types and np.pad adds padding. Functions such as np.ptp and np.any report the range of the values and test whether any element is true.

Uploaded by

Rudra Abhishek

An ndarray is a generic multidimensional container for homogeneous data; that is, all of the elements must be the same type. Every array has a shape, a tuple indicating the size of each dimension, and a dtype, an object describing the data type of the array. Nested sequences, like a list of equal-length lists, will be converted into a multidimensional array.
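As a small sketch of the nested-sequence conversion described above (the variable names nested and arr2d are only illustrative), a list of two equal-length lists becomes a 2D array:

```python
import numpy as np

# A list of equal-length lists becomes a two-dimensional array.
nested = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2d = np.array(nested)

print(arr2d.shape)  # (2, 4): two rows, four columns
print(arr2d.ndim)   # 2
print(arr2d.dtype)  # an integer dtype; the exact width depends on the platform
```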

In addition to np.array, there are a number of other functions for creating new arrays. As examples, zeros and ones create arrays of 0s or 1s, respectively, with a given length or shape. empty creates an array without initializing its values to any particular value. It's not safe to assume that np.empty will return an array of all zeros. In many cases it will return uninitialized garbage values.

The imshow() function in the pyplot module of the matplotlib library is used to display data as an image, i.e., on a 2D regular raster.

To create a higher dimensional array with these methods, pass a tuple for the shape:
In [23]: np.zeros(10)
Out[23]: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
In [24]: np.zeros((3, 6))
Out[24]:
array([[ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.]])
In [25]: np.empty((2, 3, 2))
Out[25]:
array([[[ 4.94065646e-324, 4.94065646e-324],
        [ 3.87491056e-297, 2.46845796e-130],
        [ 4.94065646e-324, 4.94065646e-324]],
       [[ 1.90723115e+083, 5.73293533e-053],
        [ -2.33568637e+124, -6.70608105e-012],
        [ 4.42786966e+160, 1.27100354e+025]]])

You can explicitly convert or cast an array from one dtype to another using ndarray's astype method. By default, calling astype creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.

In [31]: arr = np.array([1, 2, 3, 4, 5])
In [32]: arr.dtype
Out[32]: dtype('int64')
In [33]: float_arr = arr.astype(np.float64)
In [34]: float_arr.dtype
Out[34]: dtype('float64')
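
The copy behaviour of astype can be checked directly. The following is a minimal sketch, assuming the default arguments of astype; the names same_dtype and floats are illustrative only:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Casting to the dtype the array already has still returns a new array by default.
same_dtype = arr.astype(arr.dtype)
same_dtype[0] = 99
print(arr[0])                             # 1: the original is untouched
print(np.shares_memory(arr, same_dtype))  # False

# Casting floats to integers discards the fractional part.
floats = np.array([3.7, -1.2, 0.5])
print(floats.astype(np.int32))            # [ 3 -1  0]
```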

Broadcasting is the mechanism NumPy uses to carry out operations between arrays of different shapes. It solves the problem of mismatched shapes by conceptually stretching the smaller array across the larger one, so that both arrays have compatible shapes for the operation, without actually copying the data.

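A short sketch of broadcasting in practice (the arrays matrix, row, and column are made up for illustration): a 1D row is added to every row of a 2D array, and a column of shape (2, 1) is added to every column.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])          # shape (3,)

# The row is treated as if it were repeated for each row of the matrix.
result = matrix + row
print(result)
# [[10 21 32]
#  [13 24 35]]

# A column of shape (2, 1) is stretched across the columns instead.
column = np.array([[100], [200]])
result_col = matrix + column
print(result_col)
# [[100 101 102]
#  [203 204 205]]
```
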
The np.pad() function in NumPy is used to pad an array with extra values along its edges. Its three main arguments are:
array: The array to pad.
pad_width: The number of values to add before and after the data along each axis. It can be a single integer (the same padding everywhere), a pair (before, after) applied to every axis, or one (before, after) pair per axis, such as ((top, bottom), (left, right)) for a 2D array.
mode: A string that specifies the type of padding to use. The default is constant; other common values include edge, reflect, symmetric, and wrap.
np.pad() returns a new array with the specified padding applied.

import numpy as np
array = np.array([1, 2, 3, 4])
# Add two zeros before and two zeros after the existing values
padded_array = np.pad(array, pad_width=(2, 2), mode='constant')
print(padded_array)

# Output: [0 0 1 2 3 4 0 0]
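
For a 2D array, pad_width can give a separate (before, after) pair for each axis. The sketch below uses a made-up 2 x 2 array called grid and also shows the edge mode on a 1D array:

```python
import numpy as np

grid = np.array([[1, 2],
                 [3, 4]])

# One row above and none below; two columns on the left and one on the right.
padded = np.pad(grid, pad_width=((1, 0), (2, 1)), mode='constant', constant_values=0)
print(padded)
# [[0 0 0 0 0]
#  [0 0 1 2 0]
#  [0 0 3 4 0]]

# 'edge' repeats the border values instead of inserting a constant.
edge_padded = np.pad(np.array([1, 2, 3]), pad_width=1, mode='edge')
print(edge_padded)
# [1 1 2 3 3]
```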

An important first distinction from lists is that array slices are views on the original array. This means that the data is not copied, and any modifications to the view will be reflected in the source array. If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the array; for example arr[5:8].copy().
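
The difference between a view and a copy can be seen in a few lines. This is a minimal sketch; the names view, copied, lst, and sub are illustrative:

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]            # a view: shares memory with arr
view[:] = 12
print(arr)                 # [ 0  1  2  3  4 12 12 12  8  9]

copied = arr[5:8].copy()   # an independent copy
copied[:] = 0
print(arr[5:8])            # [12 12 12]: arr is unchanged by edits to the copy

# By contrast, slicing a Python list always produces a new list.
lst = list(range(10))
sub = lst[5:8]
sub[0] = 99
print(lst[5])              # 5
```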

The ptp() function in NumPy is used to calculate the peak-to-peak value of an array. The peak-to-peak value is the difference between the maximum and minimum values in an array.

By default, the ptp() function takes an array as input and returns a scalar value that represents the peak-to-peak value of the array.
Here is an example of how to use the ptp() function:

import numpy as np

array = np.array([1, 2, 3, 4, 5])

peak_to_peak_value = np.ptp(array)

print(peak_to_peak_value)
# Output: 4
In this example, the array variable is initialized with the integers 1, 2, 3, 4,
and 5. The ptp() function then returns the peak-to-peak value of the array, which
is 4.

The ptp() function can also be used on a multidimensional array. Without an axis argument it still returns a single scalar computed over all elements. When an axis is given, it returns an array that contains the peak-to-peak value of each slice along that axis.
Here is an example of how to calculate the peak-to-peak value of each row of a multidimensional array:
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

peak_to_peak_values = np.ptp(array, axis=1)

print(peak_to_peak_values)
# Output: [2 2]
In this example, the array variable is initialized with a 2D array with two rows
and three columns. The ptp() function is then used to calculate the peak-to-peak
value of the array along the second dimension. The ptp() function returns a vector
with two elements, which represent the peak-to-peak value for each row of the
array.
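
To compare the effect of the axis argument, the sketch below runs np.ptp on the same 2 x 3 array with no axis, with axis=0, and with axis=1 (the result names are illustrative):

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

overall = np.ptp(array)             # over all elements: 6 - 1
per_column = np.ptp(array, axis=0)  # down each column
per_row = np.ptp(array, axis=1)     # across each row

print(overall)     # 5
print(per_column)  # [3 3 3]
print(per_row)     # [2 2]

# ptp is equivalent to max minus min.
print(array.max() - array.min())  # 5
```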

The any() function in NumPy (also available as an array method) is used to test whether any element in an array evaluates to True. By default it takes an array as input and returns a single Boolean value that indicates whether at least one element is True.

Here is an example of how to use the any() method:

import numpy as np

array = np.array([True, False, True, False])

any_true_value = np.any(array)

print(any_true_value)
# Output: True
In this example, the array variable is initialized with an array with four elements. The any() function returns True because at least one element in the array is True (here, two of them are).

The any() function can also be used to test whether any element in an array is greater than a certain value. The comparison array > 2 produces a Boolean array, which any() then reduces to a single value. For example, the following code tests whether any element in the array is greater than 2:

import numpy as np

array = np.array([1, 2, 3, 4])

any_greater_than_2 = np.any(array > 2)

print(any_greater_than_2)
# Output: True
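
Like ptp(), any() accepts an axis argument. The following sketch uses a made-up 2 x 3 integer array called grid, where non-zero values count as True:

```python
import numpy as np

grid = np.array([[0, 0, 3],
                 [0, 0, 0]])

anywhere = np.any(grid)          # one Boolean for the whole array
per_row = np.any(grid, axis=1)   # one Boolean per row
per_col = np.any(grid, axis=0)   # one Boolean per column

print(anywhere)  # True
print(per_row)   # [ True False]
print(per_col)   # [False False  True]

# The same test is available as a method on the array.
print(grid.any())  # True
```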

Every NumPy array comes with the following attributes:
ndim: The number of dimensions
shape: The size of each dimension
size: The total number of elements in the array
dtype: The data type of the array (for example, int, float, string, and so on)

print("int_arr ndim: ", int_arr.ndim)
print("int_arr shape: ", int_arr.shape)
print("int_arr size: ", int_arr.size)
print("int_arr dtype: ", int_arr.dtype)
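
The notes do not show how int_arr is defined, so the self-contained sketch below assumes a hypothetical one-dimensional array of ten integers and prints the same four attributes:

```python
import numpy as np

# Hypothetical definition; the original notes do not show how int_arr was created.
int_arr = np.arange(10)

print("int_arr ndim: ", int_arr.ndim)    # 1
print("int_arr shape: ", int_arr.shape)  # (10,)
print("int_arr size: ", int_arr.size)    # 10
print("int_arr dtype: ", int_arr.dtype)  # an integer dtype; width depends on the platform
```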

Let's say that we want to create an array with three rows and five columns, with all the elements initialized to zero. If we don't specify a data type, NumPy will default to using floats:
In [23]: arr_2d = np.zeros((3, 5))

As you probably know from your OpenCV days, this could be interpreted as a 3 x 5 grayscale image with all pixels set to 0 (black). Analogously, if we wanted to create a tiny 2 x 4 pixel image with three color channels (R, G, B), but all pixels set to white, we would use NumPy to create a 3D array with the dimensions 3 x 2 x 4:
In [24]: arr_float_3d = np.ones((3, 2, 4))
... arr_float_3d
Out[24]: array([[[ 1., 1., 1., 1.],
... [ 1., 1., 1., 1.]],
...
... [[ 1., 1., 1., 1.],
... [ 1., 1., 1., 1.]],
...
... [[ 1., 1., 1., 1.],
... [ 1., 1., 1., 1.]]])

Here, the first dimension defines the color channel (red, green, and blue in this example; OpenCV itself orders the channels blue, green, red). Thus, if this was real image data, we could easily grab the color information in the first channel by slicing the array:
In [25]: arr_float_3d[0, :, :]
Out[25]: array([[ 1., 1., 1., 1.],
... [ 1., 1., 1., 1.]])

In OpenCV, images either come as 32-bit float arrays with values between 0 and 1 or they come as 8-bit integer arrays with values between 0 and 255. Hence, we can also create a 2 x 4 pixel, all-white RGB image using 8-bit integers by specifying the dtype attribute of the NumPy array and multiplying all the ones in the array by 255:
In [26]: arr_uint_3d = np.ones((3, 2, 4), dtype=np.uint8) * 255
... arr_uint_3d
Out[26]: array([[[255, 255, 255, 255],
... [255, 255, 255, 255]],
...
... [[255, 255, 255, 255],
... [255, 255, 255, 255]],
...
... [[255, 255, 255, 255],
... [255, 255, 255, 255]]], dtype=uint8)

Let's say that we want to produce a simple line plot of the sine function, sin(x). We want the function to be evaluated at points on the x-axis where 0 ≤ x ≤ 10. We will use NumPy's linspace function to create a linear spacing on the x-axis, from x values 0 to 10 (both endpoints are included by default), and a total of 100 sampling points:
In [3]: import numpy as np
In [4]: x = np.linspace(0, 10, 100)
We can evaluate the sin function at all points x using NumPy's sin function and visualize the result by calling plt's plot function (this assumes matplotlib.pyplot has been imported as plt):
In [5]: plt.plot(x, np.sin(x))
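
The sampling points produced by linspace can be inspected without plotting. This is a small sketch using only NumPy; the names step, x_open, and y are illustrative, and the endpoint=False variant is shown for the case where 10 itself should be excluded:

```python
import numpy as np

x = np.linspace(0, 10, 100)
print(x.shape)        # (100,)
print(x[0], x[-1])    # 0.0 10.0: both endpoints are included by default

# With 100 points there are 99 equal intervals between 0 and 10.
step = x[1] - x[0]
print(np.isclose(step, 10 / 99))  # True

# endpoint=False leaves out the stop value, giving 0 <= x < 10.
x_open = np.linspace(0, 10, 100, endpoint=False)
print(x_open[-1] < 10)            # True

# np.sin is applied element by element, so y has the same shape as x.
y = np.sin(x)
print(y.shape)        # (100,)
```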

# images are 0-indexed, but subplots are 1-indexed.
