Summer Training Report On Python Cum Computer Vision
on
BACHELOR OF TECHNOLOGY
In
By
SAZID
1712231093
EC-73
Affiliated to
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of a project would be
incomplete without mention of the people who made it possible. It was my pleasure to learn
under these people, who directly or indirectly contributed to the development of this work and
who influenced my thinking, behavior, and actions during the course of study.
I am thankful to Mr. VIVEK PATEL for the support, cooperation, and motivation he provided
during the training, and for his constant inspiration, presence, and blessings.
I would like to express my sincere appreciation to Dr. Vibha Srivastava and Er. Pratishtha
Gupta who provided their valuable suggestions and precious time in accomplishing my training
report.
Lastly, I would like to thank the almighty and my parents for their moral support and my friends
with whom I shared my day-to-day experience and received lots of suggestions that improved
my quality of work.
I hope that I can build upon the experience and knowledge that I have gained and make a
valuable contribution towards the industry in the coming future.
SAZID
1712231093
PREFACE
Summer training is a vital part of a B.Tech. course because it links theory with actual
practice. I therefore consider myself fortunate to have received training in the great learning
environment of the organization, viz. Blue Heart Lab.
In this summer training we learned about Python and Computer Vision. Computer vision is
the field of computer science that focuses on replicating parts of the complexity of the human
vision system and enabling computers to identify and process objects in images and videos in the
same way that humans do. Until recently, computer vision worked only in a limited capacity. Here,
I performed various operations on images and videos with Python, using OpenCV as the
library. This summer training report therefore covers the basic concepts of the Python
programming language, OOP concepts, and Computer Vision and its applications.
SAZID
1712231093
CONTENT
CERTIFICATE
ACKNOWLEDGEMENT
PREFACE
INTRODUCTION TO PYTHON
COMPUTER VISION
CONCLUSION
Industrial Training Report Python cum Computer Vision
INTRODUCTION
The main vision of the institute is to be established as a world-class lab facility providing best-in-class
training in technology, aiming for human welfare and creating brilliant minds, and to revolutionize the
field of technical education through a greater focus on practical work with available technology in a smarter way.
INTRODUCTION TO PYTHON
Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed
to be highly readable. Python was developed by Guido van Rossum in the late eighties and early nineties
at the National Research Institute for Mathematics and Computer Science in the Netherlands.
Python is named after the British comedy group Monty Python—makers of the 1970s BBC comedy
series Monty Python’s Flying Circus and a handful of later full-length films, including Monty Python
and the Holy Grail, that are still widely popular today.
Developer productivity: Python boosts developer productivity many times beyond compiled or
statically typed languages such as C, C++, and Java. Python code is typically one-third to one-
fifth the size of equivalent C++ or Java code.
Easy-to-learn: Python has few keywords, simple structure, and a clearly defined syntax. This
allows the student to pick up the language quickly.
A broad standard library: The bulk of Python's library is very portable and cross-platform
compatible on UNIX, Windows, and Macintosh.
Interactive Mode: Python has support for an interactive mode which allows interactive testing
and debugging of snippets of code.
Portable: Python can run on a wide variety of hardware platforms and has the same interface on
all platforms.
Extendable: You can add low-level modules to the Python interpreter. These modules enable
programmers to add to or customize their tools to be more efficient.
VARIABLE TYPES
The data stored in memory can be of many types. For example, a person's age is stored as a numeric
value and his or her address is stored as alphanumeric characters. Python has various standard data types
that define the operations possible on them and the storage method for each of them. The
six standard data types covered here are:
1. Numbers
Number data types store numeric values. Number objects are created when you assign a value to
a name. Python 3 supports three numerical types:
int (signed integers of unlimited size, which can also be written in octal and hexadecimal; Python 2 had a separate long type)
float (floating point real values)
complex (complex numbers)
2. String
Strings in Python are identified as a contiguous set of characters represented in the quotation
marks. Python allows for either pairs of single or double quotes. Subsets of strings can be taken
using the slice operator ([ ] and [:]) with indexes starting at 0 in the beginning of the string and
working their way from -1 at the end. The plus (+) sign is the string concatenation operator and
the asterisk (*) is the repetition operator.
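The slicing, concatenation, and repetition operators described above can be sketched as follows. The variable names and the sample text are assumptions chosen only for illustration.

```python
# Illustrative string; the value is an assumption for this sketch.
text = "Computer Vision"

first_char = text[0]      # index 0 is the first character
last_char = text[-1]      # index -1 is the last character
first_word = text[0:8]    # slice from index 0 up to (not including) index 8
joined = text + " Lab"    # + concatenates two strings
repeated = "ab" * 3       # * repeats a string

print(first_char, last_char, first_word, joined, repeated)
```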
3. List
Lists are the most versatile of Python's compound data types. A list contains items separated by
commas and enclosed within square brackets ([]). To some extent, lists are similar to arrays in C.
One difference between them is that the items belonging to a list can be of different data types.
The values stored in a list can be accessed using the slice operator ([ ] and [:]) with indexes
starting at 0 at the beginning of the list and working their way back from -1 at the end.
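A short sketch of list indexing, slicing, and in-place modification is given here. The sample values are assumptions; any mix of types would work the same way.

```python
# A list may mix types and can be changed in place (sample values are assumed).
items = ["abcd", 786, 2.23, "john"]

print(items[0])       # first element
print(items[1:3])     # elements at index 1 and 2
print(items[-1])      # last element

items[1] = 1000       # replace an element
items.append(70.2)    # grow the list by one element
print(items)
```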
4. Tuple
A tuple is another sequence data type that is similar to the list. A tuple consists of a number of
values separated by commas. Unlike lists, however, tuples are enclosed within parentheses. The
main differences between lists and tuples are: Lists are enclosed in brackets ( [ ] ) and their
elements and size can be changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be
updated. Tuples can be thought of as read only lists.
Figure 3 : TUPLE
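The read-only behaviour of tuples can be checked with a small sketch like the one below. The tuple contents and the names point and update_error are assumptions for illustration.

```python
point = (3, 4, "label")       # sample tuple; the values are assumed

print(point[0], point[1:])    # indexing and slicing work as for lists

try:
    point[0] = 10             # tuples do not allow item assignment
except TypeError as err:
    update_error = str(err)
    print("Cannot update tuple:", update_error)
```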
5. Dictionary
Python's dictionaries are a kind of hash table. They work like the associative arrays or hashes
found in Perl and consist of key-value pairs. A dictionary key can be any hashable Python object, but
keys are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object.
Dictionaries are enclosed by curly braces ({ }) and values can be assigned and accessed using
square brackets ([]).
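Assigning and accessing dictionary values with square brackets might look like the following sketch. The keys and values are assumptions made up for this example.

```python
# Sample key-value pairs; the keys and values are assumed for illustration.
student = {"name": "Asha", "branch": "EC", 1: "first"}

print(student["name"])                   # access a value by its key
student["year"] = 4                      # add a new key-value pair
student["branch"] = "ECE"                # overwrite an existing value
city = student.get("city", "unknown")    # get() returns a default for a missing key

print(student, city)
```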
6. Sets
A set is an unordered collection of distinct items. Unordered means that items aren’t stored in
any particular order. Something is either in the set or it’s not, but there’s no notion of it being the
first, second, or last item. Distinct means that any item appears in a set at most once; in other
words, there are no duplicates. A set is written as comma-separated values inside braces { }.
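The "distinct" and "unordered" properties of sets can be seen in a brief sketch. The set names and members are assumptions for illustration.

```python
primes = {2, 3, 5, 7, 3, 2}    # duplicates are discarded on creation
odds = {1, 3, 5, 7, 9}

print(len(primes))             # number of distinct items
print(3 in primes)             # membership test
common = primes & odds         # intersection
combined = primes | odds       # union
empty = set()                  # note: {} creates an empty dictionary, not a set

print(common, combined, empty)
```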
MODULE
A module is a collection of variables and functions that are grouped together in a single file. The
variables and functions in a module are usually related to one another in some way; for example, module
math contains the variable pi and mathematical functions such as cos (cosine) and sqrt (square root). A
module allows you to logically organize your Python code. Grouping related code into a module makes
the code easier to understand and use. A module is a Python object with arbitrarily named attributes that
you can bind and reference.
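As a small illustration of the math module mentioned above, the sketch below uses its variable pi and its functions sqrt and cos. The radius value is an assumption.

```python
import math

radius = 2.0                    # sample value, assumed for this sketch
area = math.pi * radius ** 2    # pi is a variable defined in module math
root = math.sqrt(16)            # sqrt is a function defined in module math
cosine = math.cos(0.0)          # cos takes an angle in radians

print(area, root, cosine)
```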
Importing Modules
You can use any Python source file as a module by executing an import statement in some other
Python source file. When the interpreter encounters an import statement, it imports the module if the
module is present in the search path. A search path is a list of directories that the interpreter searches
before importing a module.
Example of a module, saved as module_print.py:
    def print_function(input_text):
        print("Hello", input_text)
        return
The import has the following syntax:
    import module_print
    module_print.print_function("Introduction to Python")
The output will be: Hello Introduction to Python
FILE HANDLING
Python supports file handling and allows users to read and write files, along with
many other file handling options. There are many kinds of files. Text files, music files,
videos, and various word processor and presentation documents are common. A file has two key
properties: a filename (usually written as one word) and a path. The path specifies the location of a file
on the computer.
Opening a File
When you want to write a program that opens and reads a file, that program needs to tell Python
where that file is. By default, Python assumes that the file you want to read is in the same
directory as the program that is doing the reading.
For example, to open a file named 'file_example.txt', the following code is needed:
file = open('file_example.txt', 'r')
contents = file.read()
print(contents)
file.close()
Here, 'r' is passed as the second argument to open() to open the file in read mode.
Because every call to the function open should have a corresponding call to the method close, Python
provides the with statement, which automatically closes a file when the end of the block is reached.
Here is the same example using the with statement:
with open('file_example.txt', 'r') as file:
    contents = file.read()
print(contents)
Writing Files
Python allows you to write content to a file in a way similar to how the print() function “writes”
strings to the screen. Write mode will overwrite the existing file and start from scratch, just like
when you overwrite a variable’s value with a new value. Pass 'w' as the second argument to
open() to open the file in write mode.
Append mode, on the other hand, will append text to the end of the existing file. Pass 'a' as the
second argument to open() to open the file in append mode.
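For example, to write Computer Science to the file named 'file_example.txt', open it with mode 'w'
and pass the text to the write() method of the file object.
To add to our previous file file_example.txt, we can append the words Software Engineering:
with open('file_example.txt', 'a') as output_file:
    output_file.write('Software Engineering')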
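The write, append, and read modes can be tried together in one sketch. It works in a temporary directory, which is an assumption made here so that no real file is overwritten; the names folder, path, and contents are likewise illustrative.

```python
import os
import tempfile

# Work in a temporary directory so that no existing file is overwritten.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "file_example.txt")

    with open(path, "w") as output_file:    # 'w' creates or overwrites the file
        output_file.write("Computer Science\n")

    with open(path, "a") as output_file:    # 'a' appends to the end of the file
        output_file.write("Software Engineering\n")

    with open(path, "r") as input_file:     # 'r' opens the file for reading
        contents = input_file.read()

print(contents)
```

Opening the same path again with 'w' would discard both lines, which is the difference between write mode and append mode.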
OBJECT-ORIENTED PROGRAMMING
Python has been an object-oriented language since its inception. Because of this, creating and using classes
and objects is straightforward. The main terms used in OOP are:
Class: A user-defined prototype for an object that defines a set of attributes that characterize any
object of the class. The attributes are data members (class variables and instance variables) and
methods, accessed via dot notation.
Class variable: A variable that is shared by all instances of a class. Class variables are defined
within a class but outside any of the class's methods. Class variables are not used as frequently as
instance variables are.
Data member: A class variable or instance variable that holds data associated with a class and its
objects.
Function overloading: The assignment of more than one behavior to a particular function. The
operation performed varies by the types of objects or arguments involved.
Instance variable: A variable that is defined inside a method and belongs only to the current
instance of a class.
Inheritance: The transfer of the characteristics of a class to other classes that are derived from it.
Instance: An individual object of a certain class. An object obj that belongs to a class Circle, for
example, is an instance of the class Circle.
Instantiation: The creation of an instance of a class.
Method: A special kind of function that is defined in a class definition.
Object: A unique instance of a data structure that's defined by its class. An object comprises both
data members (class variables and instance variables) and methods.
Operator overloading: The assignment of more than one function to a particular operator.
Inheritance: A process of reusing the details of an existing class in a new class without modifying the existing
class.
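Several of the terms above can be seen together in one small sketch. The classes Shape and Circle, their attributes, and the rounded value of pi are assumptions invented for this illustration.

```python
class Shape:
    count = 0                        # class variable shared by all instances

    def __init__(self, name):
        self.name = name             # instance variable
        Shape.count += 1

    def area(self):                  # method
        return 0.0

    def __add__(self, other):        # operator overloading for +
        return self.area() + other.area()


class Circle(Shape):                 # Circle inherits from Shape
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def area(self):                  # overrides Shape.area
        return 3.14159 * self.radius ** 2


obj = Circle(1.0)                    # instantiation: obj is an instance of Circle
other = Circle(2.0)
total_area = obj + other             # calls Shape.__add__
print(obj.name, Shape.count, total_area)
```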
COMPUTER VISION
Computer Vision, often abbreviated as CV, is defined as a field of study that seeks to develop
techniques to help computers “see” and understand the content of digital images such as photographs
and videos. It is a multidisciplinary field that could broadly be called a subfield of artificial
intelligence and machine learning, which may involve the use of specialized methods and make
use of general learning algorithms.
“Computer vision is the automated extraction of information from images. Information can mean
anything from 3D models, camera position, object detection and recognition to grouping and
searching image content.”
— Page ix, Programming Computer Vision with Python, 2012.
APPLICATIONS
Computer vision is used today in a wide variety of real-world applications.
Some of them are as follows:
Optical character recognition (OCR) : reading handwritten postal codes on letters and
automatic number plate recognition (ANPR)
Motion Capture : using retro-reflective markers viewed from multiple cameras or other vision-
based techniques to capture actors for computer animation;
Surveillance : monitoring for intruders, analyzing highway traffic (Figure 1.4f), and monitoring
pools for drowning victims;
Face detection : for improved camera focusing as well as more relevant image searching
OpenCV
We have several programming language choices for computer vision – OpenCV using C++, OpenCV
using Python, or MATLAB. However, most engineers have a personal favorite, depending on the task
they perform. Beginners often pick OpenCV with Python for its flexibility. It’s a language most
programmers are familiar with, and owing to its versatility is very popular among developers.
Image processing
Image processing is focused on processing raw images to apply some kind of transformation.
Usually, the goal is to improve images or prepare them as an input for a specific task, while in
computer vision the goal is to describe and explain images. For instance, noise reduction,
contrast, or rotation operations, typical components of image processing, can be performed at
pixel level and do not need a complex grasp of the image that allows for some understanding of
what is happening in it.
Machine vision
This is a particular case where computer vision is used to perform some actions, typically in
production or manufacturing lines. In the chemical industry, machine vision systems can help
with the manufacturing of products by checking the containers in the line (are they clean, empty,
and free of damage?) or by checking that the final product is properly sealed.
Computer vision
Computer vision can solve more complex problems such as facial recognition (used, for
example, by Snapchat to apply filters), detailed image analysis that allows for visual searches
like the ones Google Images performs, or biometric identification methods.
OPERATIONS PERFORMED
THRESHOLDING
Thresholding is a technique in OpenCV that assigns pixel values in relation to a
given threshold value. In thresholding, each pixel value is compared with the threshold value.
If the pixel value is less than or equal to the threshold, it is set to 0; otherwise, it is set to a maximum
value (generally 255). Thresholding is a very popular segmentation technique, used for
separating an object considered as foreground from its background. A threshold is a value
that divides pixel values into two regions, those below the threshold and those above it.
In Computer Vision, thresholding is done on grayscale images, so the
image first has to be converted to the grayscale color space.
o cv2.THRESH_BINARY: If the pixel intensity is greater than the set threshold, the value is set to
the maximum value (generally 255), else it is set to 0 (black).
o cv2.THRESH_BINARY_INV: Inverted or opposite case of cv2.THRESH_BINARY.
o cv2.THRESH_TRUNC: If the pixel intensity is greater than the threshold, it is truncated to
the threshold value. All other values remain the same.
o cv2.THRESH_TOZERO: The pixel intensity is set to 0 for all pixels whose intensity is less than
or equal to the threshold value. All other values remain the same.
o cv2.THRESH_TOZERO_INV: Inverted or opposite case of cv2.THRESH_TOZERO.
The syntax used here is :
cv2.threshold(source, thresholdValue, maxVal, thresholdingTechnique)
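The five rules can be written out in plain Python to make their behaviour concrete. This is only an illustrative sketch and not how OpenCV implements them; the function threshold_pixel, the mode strings, and the sample row are assumptions. With OpenCV itself, a call such as cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY) returns the threshold used together with the output image.

```python
def threshold_pixel(value, thresh, max_val, mode):
    """Apply one thresholding rule to a single grayscale value."""
    above = value > thresh
    if mode == "BINARY":
        return max_val if above else 0
    if mode == "BINARY_INV":
        return 0 if above else max_val
    if mode == "TRUNC":
        return thresh if above else value
    if mode == "TOZERO":
        return value if above else 0
    if mode == "TOZERO_INV":
        return 0 if above else value
    raise ValueError("unknown mode: " + mode)


row = [10, 100, 127, 128, 250]    # sample grayscale values (assumed)
for mode in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"):
    print(mode, [threshold_pixel(v, 127, 255, mode) for v in row])
```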
o ADAPTIVE THRESHOLDING
When an image has different lighting conditions in different areas, adaptive
thresholding can help. Here, the algorithm determines the threshold for a pixel based on a
small region around it. So we get different thresholds for different regions of the same
image, which gives better results for images with varying illumination.
The adaptive thresholding algorithm produces an image in which threshold values vary over
the image as a function of local image characteristics.
In its simplest form, adaptive thresholding involves the following two steps:
1. Divide the image into strips.
2. Apply a global threshold method to each strip.
Adaptive thresholding changes the threshold dynamically over the image. It typically
takes a gray scale or color image as input and, in the simplest implementation, outputs a
binary image representing the segmentation.
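A minimal pure-Python sketch of the neighbourhood-based variant is shown here: each pixel is compared with the mean of a small window around it, minus a constant. The function name, the parameters block_size and c, the clipping of the window at the image border, and the sample image are all assumptions; OpenCV provides this kind of operation through cv2.adaptiveThreshold.

```python
def adaptive_threshold(image, block_size=3, c=2, max_val=255):
    """Compare each pixel with the mean of its neighbourhood minus c."""
    height, width = len(image), len(image[0])
    half = block_size // 2
    result = []
    for r in range(height):
        out_row = []
        for col in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(height, r + half + 1))
                      for cc in range(max(0, col - half), min(width, col + half + 1))]
            local_thresh = sum(window) / len(window) - c
            out_row.append(max_val if image[r][col] > local_thresh else 0)
        result.append(out_row)
    return result


# Dark left half and bright right half, each with one brighter spot (assumed data).
image = [
    [10, 10, 10, 200, 200, 200],
    [10, 50, 10, 200, 250, 200],
    [10, 10, 10, 200, 200, 200],
]
binary = adaptive_threshold(image)
for out_row in binary:
    print(out_row)
```

A single global threshold of 127 would miss the spot of value 50 in the dark half, while the local threshold picks out both spots.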
FILTERING
2. Median Filtering - Here, the function cv2.medianBlur() computes the median of all the
pixels under the kernel window and the central pixel is replaced with this median value.
This is highly effective in removing salt-and-pepper noise. One interesting thing to note
is that, in the Gaussian and box filters, the filtered value for the central element can be a
value which may not exist in the original image. However this is not the case in median
filtering, since the central element is always replaced by some pixel value in the image.
This reduces the noise effectively. The kernel size must be a positive odd integer.
3. Bilateral Filtering - As we noted, the filters we presented earlier tend to blur edges.
This is not the case for the bilateral filter, cv2.bilateralFilter(), which was defined for, and
is highly effective at noise removal while preserving edges. But the operation is slower
compared to other filters.
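The median filter described above can be sketched in plain Python to show why salt-and-pepper noise disappears. This is an illustration only; the function median_filter, the repetition of edge pixels at the border, and the sample image are assumptions. In OpenCV the corresponding call is cv2.medianBlur(image, ksize).

```python
def median_filter(image, ksize=3):
    """Replace each pixel with the median of its ksize x ksize window."""
    if ksize < 1 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd integer")
    height, width = len(image), len(image[0])
    half = ksize // 2
    result = []
    for r in range(height):
        out_row = []
        for c in range(width):
            # Gather the window, repeating edge pixels at the borders.
            window = sorted(
                image[min(max(rr, 0), height - 1)][min(max(cc, 0), width - 1)]
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
            )
            out_row.append(window[len(window) // 2])    # middle value
        result.append(out_row)
    return result


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],    # "salt" pixel
    [10, 10, 0, 10],      # "pepper" pixel
    [10, 10, 10, 10],
]
cleaned = median_filter(noisy)
for out_row in cleaned:
    print(out_row)
```

Because the window always holds an odd number of values, the result is always a value that already exists in the image.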
MORPHOLOGICAL TRANSFORMATION
Morphological transformations are simple operations based on the image shape. They are
normally performed on binary images. They need two inputs: the original image and a
structuring element, or kernel, which decides the nature of the operation.
The kernel determines how the value of a pixel is changed by combining it with a varying number of
neighbouring pixels.
1. Erosion: shrinks image regions
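Erosion with a square kernel can be sketched in plain Python: a pixel stays set only if every pixel under the kernel is set. The function erode, the handling of the border (pixels outside the image are ignored), and the sample image are assumptions; OpenCV offers the operation as cv2.erode(image, kernel).

```python
def erode(image, ksize=3):
    """Binary erosion: keep a pixel only if its whole neighbourhood is set."""
    height, width = len(image), len(image[0])
    half = ksize // 2
    return [
        [
            min(
                image[rr][cc]
                for rr in range(max(0, r - half), min(height, r + half + 1))
                for cc in range(max(0, c - half), min(width, c + half + 1))
            )
            for c in range(width)
        ]
        for r in range(height)
    ]


# A 3x3 white square on a black background (assumed sample data).
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = erode(square)
for out_row in eroded:
    print(out_row)
```

Only the centre pixel of the square survives, which shows how erosion shrinks a region from its boundary inwards.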
EDGE DETECTION
Edge detection is a method of segmenting an image into regions at discontinuities in intensity. It is a widely
used technique in digital image processing, in areas such as:
Pattern Recognition
Image Morphology
Feature Extraction
Edge Detection Operators are of two types:
1. Gradient-based operators, which compute first-order derivatives in a digital image, such as:
Sobel operator
Prewitt operator
Roberts operator
2. Gaussian-based operators, which smooth the image with a Gaussian before differentiating,
such as:
Canny edge detector (gradient of the smoothed image, followed by non-maximum suppression and hysteresis thresholding)
Laplacian of Gaussian (second-order derivatives)
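As an illustration of a gradient-based operator, the Sobel kernels can be applied by hand to a tiny image. This is a sketch under stated assumptions: the helper names, leaving border pixels at zero, and the sample image are invented here. In OpenCV the equivalent operations are available as cv2.Sobel and cv2.Canny.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges


def apply_kernel(image, kernel, r, c):
    """Weighted sum of the 3x3 neighbourhood centred on (r, c)."""
    return sum(kernel[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            result[r][c] = math.hypot(gx, gy)
    return result


# Dark left half, bright right half: one vertical edge (assumed sample data).
step = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_magnitude(step)
for out_row in edges:
    print(out_row)
```

The magnitude is large only in the two columns next to the intensity step and zero in the flat regions.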
FACE DETECTION
Face detection is a computer technology being used in a variety of applications that identifies
human faces in digital images. Face detection using Haar cascades is a machine learning based
approach where a cascade function is trained with a set of input data. OpenCV already contains
many pre-trained classifiers for face, eyes, smiles, etc. One thing to note when detecting faces
in images: the detection works only on grayscale images.
OCR is the automatic process of converting typed, handwritten, or printed text to machine-
encoded text that we can access and manipulate via a string variable.
pytesseract Installation :
1) Open cmd and run: pip install pytesseract
2) Download the binary file from: https://github.com/UB-Mannheim/tesseract/wiki
3) Install it (64 or 32 bit, depending on your system)
4) Look for the "Tesseract-OCR" folder; in my case: C:\Program Files\Tesseract-OCR
Program:
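import cv2
import imutils
import numpy as np
import pytesseract  # pip install pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# Load the input image and convert it to grayscale.
# Assumption: 'car.jpg' is a placeholder file name; the original path is not given.
img = cv2.imread('car.jpg')
if img is None:
    raise FileNotFoundError('car.jpg could not be read')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# A bilateral filter (blurring) removes unwanted detail from the image while keeping edges.
gray = cv2.bilateralFilter(gray, 13, 15, 15)
# Edge detection
edged = cv2.Canny(gray, 50, 200)
cv2.imshow("Edge_image", edged)
# Find the contours and keep the ten largest by area.
contours = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
contours = imutils.grab_contours(contours)
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:10]
# The number plate is taken to be the first contour with four corners.
screenCnt = None
for c in contours:
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.018 * peri, True)
    if len(approx) == 4:
        screenCnt = approx
        break
print(screenCnt)
if screenCnt is None:
    raise SystemExit("No four-sided contour detected")
# Assumption: the plate mask is built by filling the detected contour with white.
mask = np.zeros(gray.shape, np.uint8)
cv2.drawContours(mask, [screenCnt], 0, 255, -1)
# Now crop: np.where returns the row and column indices where mask == 255.
(x, y) = np.where(mask == 255)
(topx, topy) = (np.min(x), np.min(y))
(bottomx, bottomy) = (np.max(x), np.max(y))
Cropped = gray[topx:bottomx+1, topy:bottomy+1]
cv2.imshow("cropped_plate", Cropped)
cv2.imwrite("crop.jpg", Cropped)
# Assumption: the plate text is read from the cropped image with pytesseract.
text = pytesseract.image_to_string(Cropped)
print("Detected number plate:", text)
cv2.waitKey(0)
cv2.destroyAllWindows()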
The accuracy depends on the clarity of image, orientation, light exposure etc. To get better results you
can try implementing Machine learning algorithms along with this.
CONCLUSION
It was a wonderful learning experience at Blue Heart Lab over a period of 52 days. During this training
period I learned about the Python programming language, object-oriented programming, and Computer
Vision. The training proved to be fruitful, as it provided an opportunity to learn how the use of technology
can help us in different aspects of our life. I learnt about the concept of face detection using OpenCV in
Python. There are a number of detectors other than the face detector in the library. We are
free to experiment with them and create detectors for coins, number plate recognition, etc.
This will be very handy when we are trying to develop applications that require image recognition and
similar principles. We should now also be able to use these concepts to develop applications easily with
the help of OpenCV in Python.
This training has given us great insight into and understanding of the new and continuously advancing
technologies in the current world. As a result of this summer training, we are more confident about building our
future careers.