IVA LAB
EX. NO: 01
T-PYRAMID
AIM:
To write a program that computes the T-pyramid of an image.
ALGORITHM:
1. Read the input image.
2. Set the scaling factor used between successive pyramid levels (here 0.8).
3. Initialize the pyramid with the original image as level 0.
4. Repeatedly resize the previous level by the scaling factor to produce the next level, until four levels have been built.
5. Display all four levels of the image pyramid.
6. Display each level individually using cv2.imshow(), passing str(i) (the current loop index) as the window title.
PROGRAM:
# Scaling factor
scale_factor = 0.8
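Only the scaling factor of the program survives in this record. A minimal sketch of the complete exercise, assuming a hypothetical input file input.jpg and building four levels with cv2.resize() that are then shown with cv2.imshow() using the loop index as the window title (as described in the algorithm), could look like this:

import cv2

# Hypothetical input path -- replace with the actual image used in the lab
image = cv2.imread("input.jpg")

# Scaling factor between successive pyramid levels
scale_factor = 0.8

# Build four levels: each level is a scaled-down copy of the previous one
levels = [image]
for i in range(1, 4):
    smaller = cv2.resize(levels[-1], None, fx=scale_factor, fy=scale_factor,
                         interpolation=cv2.INTER_AREA)
    levels.append(smaller)

# Display each level in its own window, titled with the level index
for i, level in enumerate(levels):
    cv2.imshow(str(i), level)

cv2.waitKey(0)
cv2.destroyAllWindows()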
RESULT:
Thus, a program that computes the T-pyramid of an image was written and executed successfully.
EX. NO: 02
QUADTREE
AIM:
To write a program that derives the quadtree representation of an image using the homogeneity criterion of equal intensity.
ALGORITHM:
STEP-1: Define a quadtree node structure to represent each node in the quadtree. Each node should contain the following information:
• Position (x, y): the top-left corner of the node within the image.
• Size: the width and height of the node.
• Color: the dominant color of the node.
• Children: an array or a dictionary to store child nodes.
STEP-2: Initialize the quadtree by creating the root node, which represents the entire image.
STEP-3: Check whether the current node satisfies the termination condition, i.e. all pixels in its region have (approximately) equal intensity or the node has reached the minimum size; if so, mark it as a leaf.
STEP-4: If the termination condition is not met, subdivide the current node into four quadrants, each representing a subregion of the image:
• Divide the current node's size by 2.
• Create four child nodes, one for each quadrant.
• Determine the dominant color for each quadrant.
• Recursively apply the quadtree algorithm to each child node.
STEP-5: Repeat the subdivision process for each child node until the
termination condition is met for each leaf node.
PROGRAM:
import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
class QuadTree:
    def __init__(self, img, threshold):
        self.img = img
        self.threshold = threshold
        self.height, self.width = img.shape
        self.tree = np.zeros((self.height, self.width), dtype=np.uint8)
        self.subdivide(0, 0, self.width, self.height)
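The constructor above calls self.subdivide(), which is not shown in this record. A minimal sketch of that method, assuming a block counts as homogeneous (a leaf) when its intensity range is at most threshold and that self.tree stores each leaf block's mean intensity, could be:

    # Hypothetical completion of the class above (a sketch, not the recorded code):
    # recursively split a block until its intensity range is within the threshold,
    # then fill that block of self.tree with the block's mean intensity (a leaf).
    def subdivide(self, x, y, w, h):
        region = self.img[y:y + h, x:x + w]
        if w <= 1 or h <= 1 or int(region.max()) - int(region.min()) <= self.threshold:
            self.tree[y:y + h, x:x + w] = int(region.mean())
            return
        half_w, half_h = w // 2, h // 2
        self.subdivide(x, y, half_w, half_h)                            # top-left
        self.subdivide(x + half_w, y, w - half_w, half_h)               # top-right
        self.subdivide(x, y + half_h, half_w, h - half_h)               # bottom-left
        self.subdivide(x + half_w, y + half_h, w - half_w, h - half_h)  # bottom-right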
RESULT:
Thus, a program that derives the quadtree representation of an image using the homogeneity criterion of equal intensity was written and executed successfully.
EX. NO: 03
Geometric Transformation of Image
AIM:
To apply geometric transformations such as skewing, rotation, scaling, and affine transformation to an image, and to visualize the results using OpenCV and Matplotlib.
ALGORITHM:
Step 1: Load Image
• Read the input image using cv2.imread().
• Convert the image from BGR to RGB using cv2.cvtColor().
Step 2: Get Image Dimensions
• Extract image dimensions (rows, cols) using the shape attribute.
Step 3: Apply Skewing
• Define the skew matrix.
• Apply the skew transformation using cv2.warpAffine().
Step 4: Apply Rotation
• Define the rotation matrix with cv2.getRotationMatrix2D() (e.g.,
rotate 45°).
• Apply the rotation with cv2.warpAffine().
Step 5: Apply Scaling
• Scale the image using cv2.resize() with scaling factors (e.g., 1.5x).
Step 6: Apply Affine Transformation
• Select three points and define the affine transformation matrix using cv2.getAffineTransform().
• Apply affine transformation with cv2.warpAffine().
Step 7: Display the Images
• Store original and transformed images in a list.
• Use plt.imshow() to display each image in a subplot.
• Display the results using plt.show().
PROGRAM:
import cv2
import numpy as np
import matplotlib.pyplot as plt
image = cv2.cvtColor(cv2.imread("C:/Users/Hp/Downloads/thala.jpg"), cv2.COLOR_BGR2RGB)
rows, cols, _ = image.shape
skewed_image = cv2.warpAffine(image, np.float32([[1, 0.5, 0], [0.5, 1, 0]]), (cols, rows))
rotated_image = cv2.warpAffine(image, cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 1), (cols, rows))
scaled_image = cv2.resize(image, None, fx=1.5, fy=1.5)
M_affine = cv2.getAffineTransform(np.float32([[50, 50], [200, 50], [50, 200]]),
                                  np.float32([[10, 100], [200, 50], [100, 250]]))
affine_transformed_image = cv2.warpAffine(image, M_affine, (cols, rows))
images = [image, skewed_image, rotated_image, scaled_image, affine_transformed_image]
titles = ['Original', 'Skewed', 'Rotated', 'Scaled', 'Affine Transformed']
plt.figure(figsize=(10, 7))
for i, (img, title) in enumerate(zip(images, titles)):
    plt.subplot(2, 3, i + 1)
    plt.imshow(img)
    plt.title(title)
    plt.axis('off')
plt.tight_layout()
plt.show()
OUTPUT:
RESULT:
Thus, geometric transformations such as skewing, rotation, scaling, and affine transformation were applied to an image and the results were visualized using OpenCV and Matplotlib, successfully.
EX. NO: 04
Object Detection and Recognition
AIM:
To detect and recognize objects in an image using a pre-trained deep learning model, and visualize the results by drawing bounding boxes and labels on detected objects.
ALGORITHM:
1. Load Model
Load the pre-trained object detection model.
Load the associated class labels for object categories.
2. Load and Preprocess Image
Read the input image.
Preprocess the image (resize and convert to a blob) for the model.
3. Run Object Detection
Feed the preprocessed image to the model.
Perform a forward pass to get object detections.
4. Filter Detections
Set a confidence threshold (e.g., 0.5).
Filter out detections below the confidence threshold.
5. Non-Maximum Suppression (NMS)
Apply NMS to eliminate overlapping boxes.
Keep the most confident bounding boxes.
6. Draw Bounding Boxes and Labels
Draw bounding boxes around detected objects.
Label the objects with their class names and confidence scores.
7. Display or Save the Result
Display the image with detected objects.
Optionally, save the image with detection results.
PROGRAM:
import cv2
import numpy as np
model_proto = "C:/Users/harih/Downloads/New folder/deploy.prototxt"
model_weights = "C:/Users/harih/Downloads/New folder/mobilenet_iter_73000.caffemodel"
net = cv2.dnn.readNetFromCaffe(model_proto, model_weights)
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]
image_path = "C:/Users/harih/Downloads/New folder/860x394.jpg"
image = cv2.imread(image_path)
if image is None:
    print("Error: Could not load the image. Please check the file path or format.")
else:
    (h, w) = image.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    confidence_threshold = 0.5
    boxes = []
    confidences = []
    class_ids = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > confidence_threshold:
            idx = int(detections[0, 0, i, 1])
            label = CLASSES[idx]
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            boxes.append([startX, startY, endX - startX, endY - startY])
            confidences.append(float(confidence))
            class_ids.append(idx)
    indices = cv2.dnn.NMSBoxes(boxes, confidences, confidence_threshold, 0.4)
    if len(indices) > 0:
        for i in indices.flatten():
            (startX, startY, width, height) = boxes[i]
            endX = startX + width
            endY = startY + height
            label = f"{CLASSES[class_ids[i]]}: {confidences[i] * 100:.2f}%"
            cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)
            cv2.putText(image, label, (startX, startY - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2)
    cv2.imshow("Object Detection", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    output_path = "C:/Users/harih/Downloads/detected_image.jpg"
    cv2.imwrite(output_path, image)
OUTPUT:
RESULT:
Thus, objects in an image were detected and recognized using a pre-trained deep learning model, and the results were visualized by drawing bounding boxes and labels on the detected objects, successfully.
EX. NO: 05
Motion Analysis
AIM:
To detect and highlight significant movements in a video by
comparing consecutive frames. It enables real-time identification of
motion, which can be applied in surveillance, activity recognition, and
video analytics.
ALGORITHM:
Initialization: Load the video and set up variables.
First Frame: Capture the first frame, convert it to grayscale, and apply Gaussian blur.
Frame Processing:
While the video is open:
• Read the next frame; exit if unavailable.
• Convert it to grayscale and blur it.
• Compute the absolute difference with the previous frame.
• Threshold and dilate the difference image.
• Find contours in the thresholded image.
• For each detected contour, if its area is above a certain threshold (to ignore noise), retrieve the bounding box coordinates and draw a rectangle around the detected motion area.
Display Results:
Show the processed frame with the bounding boxes drawn around detected moving objects.
Cleanup:
Release the video capture object to free resources.
Close any open display windows to end the application.
PROGRAM:
import cv2
import numpy as np
from moviepy.editor import VideoFileClip
import threading
def play_audio(video_path):
    clip = VideoFileClip(video_path)
    clip.audio.preview()
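Only the audio helper of the program survives in this record. A minimal sketch of the frame-differencing loop described in the algorithm, assuming a hypothetical input file video.mp4, a fixed difference threshold of 25, and a contour-area threshold of 500 (and reusing the cv2 import above), might look like this:

cap = cv2.VideoCapture("video.mp4")  # hypothetical input path

# First frame: convert to grayscale and blur; this becomes the reference frame
ret, prev_frame = cap.read()
if not ret:
    raise SystemExit("Error: could not read the video.")
prev_gray = cv2.GaussianBlur(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)

    # Absolute difference with the previous frame, then threshold and dilate
    delta = cv2.absdiff(prev_gray, gray)
    thresh = cv2.threshold(delta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.dilate(thresh, None, iterations=2)

    # Draw bounding boxes around contours that are large enough to ignore noise
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("Motion Analysis", frame)
    prev_gray = gray
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()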
RESULT:
Thus, significant movements in a video were detected and highlighted by comparing consecutive frames, enabling real-time identification of motion for applications such as surveillance, activity recognition, and video analytics, successfully.
EX. NO: 06
Facial Detection and Recognition
AIM:
To detect and locate human faces in an image using a pre-trained
Haar Cascade classifier. This enables visualization by drawing bounding
boxes around the identified faces.
ALGORITHM:
Input Image:
• Load an image file (e.g., JPEG, PNG) that may contain one
or more human faces.
Preprocessing:
• Convert the loaded color image to a grayscale image to
simplify processing and improve detection efficiency.
Model Loading:
• Load the pre-trained Haar Cascade Classifier model
specifically designed for frontal face detection from the
OpenCV library.
Face Detection:
• Use the detectMultiScale() function on the grayscale image to
identify faces. This function analyzes the image and returns
coordinates of rectangles that indicate where faces are
located.
Draw Rectangles:
• For each detected face, draw a rectangle around it in the
original color image using the coordinates obtained from the
detection step. This helps visualize the detected faces.
Display Results:
• Present the original image with the overlayed rectangles
around detected faces in a window using OpenCV’s imshow()
function.
Cleanup:
• Wait for user input to close the displayed image window, then
clean up resources by releasing any allocated memory and
closing all OpenCV windows.
PROGRAM:
import cv2
import numpy as np
path = "C:/Users/harih/OneDrive/Pictures/Camera
Roll/WIN_20240827_09_27_58_Pro (2).jpg"
img = cv2.imread(path)
if img is None:
print("Error: Could not load image.")
else:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
detector = cv2.CascadeClassifier(cv2.data.haarcascades +
"haarcascade_frontalface_default.xml")
faces = detector.detectMultiScale(gray, scaleFactor=1.1,
minNeighbors=7, minSize=(30, 30))
for (x, y, w, h) in faces:
cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2.imshow('Face Detection', img)
cv2.waitKey(0)
cv2.destroyAllWindows( )
OUTPUT:
RESULT:
Thus, human faces in an image were detected and located using a pre-trained Haar Cascade classifier, and visualized by drawing bounding boxes around the identified faces, successfully.
EX. NO: 07
Hand Gesture Recognition
AIM:
To detect and recognize hand gestures in real time from a webcam feed using MediaPipe hand landmarks, and to display the recognized gesture on each frame for visualization.
ALGORITHM:
Hand Gesture Recognition Using MediaPipe
Initialize Modules:
• Initialize the MediaPipe Hands and drawing utility modules, and define the gesture labels to be recognized (e.g., Thumbs Up, Palm Open).
Capture Video:
• Open the webcam using cv2.VideoCapture() and read frames in a loop.
Preprocess Frame:
• Convert each captured frame from BGR to RGB, since MediaPipe processes RGB images.
Detect Hand Landmarks:
• Process the frame with the MediaPipe Hands model to obtain hand landmarks for any detected hands.
Classify Gesture:
• For each detected hand, compare the y-coordinates of the fingertip landmarks (thumb, index, middle, ring, pinky) to classify the gesture; return "Unknown Gesture" if no rule matches.
Draw and Display:
• Draw the hand landmarks and connections on the frame, overlay the recognized gesture label with cv2.putText(), and display the frame.
Cleanup:
• Exit the loop when 'q' is pressed, release the webcam, and close all OpenCV windows.
PROGRAM:
import cv2
import mediapipe as mp

# Initialize MediaPipe Hands and Drawing modules
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

# Gesture labels
GESTURE_LABELS = {
    'THUMBS_UP': 'Thumbs Up',
    'PALM_OPEN': 'Palm Open'
}

# Function to classify gesture based on the hand landmarks
def classify_gesture(landmarks):
    # Extracting tip coordinates of the thumb, index, middle, ring, and pinky fingers
    thumb_tip = landmarks[mp_hands.HandLandmark.THUMB_TIP]
    index_tip = landmarks[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    middle_tip = landmarks[mp_hands.HandLandmark.MIDDLE_FINGER_TIP]
    ring_tip = landmarks[mp_hands.HandLandmark.RING_FINGER_TIP]
    pinky_tip = landmarks[mp_hands.HandLandmark.PINKY_TIP]
    # Classifying gesture based on y-coordinate of the fingertips
    if thumb_tip.y < index_tip.y < middle_tip.y < ring_tip.y < pinky_tip.y:
        return GESTURE_LABELS['THUMBS_UP']
    if thumb_tip.y > index_tip.y > middle_tip.y > ring_tip.y > pinky_tip.y:
        return GESTURE_LABELS['PALM_OPEN']
    return "Unknown Gesture"

# Initialize webcam capture
cap = cv2.VideoCapture(0)

# Initialize MediaPipe Hands module
with mp_hands.Hands(min_detection_confidence=0.7,
                    min_tracking_confidence=0.7) as hands:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Convert the frame to RGB (MediaPipe works in RGB format)
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # Process the frame with MediaPipe Hands
        results = hands.process(frame_rgb)
        # If hands are detected
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                # Draw landmarks and connections on the frame
                mp_drawing.draw_landmarks(frame, hand_landmarks,
                                          mp_hands.HAND_CONNECTIONS)
                # Classify gesture based on landmarks
                gesture = classify_gesture(hand_landmarks.landmark)
                # Display the gesture on the frame
                cv2.putText(frame, gesture, (50, 50),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        # Display the frame with the detected hand landmarks and gesture
        cv2.imshow('Hand Gesture Recognition', frame)
        # Exit the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# Release the webcam and close the OpenCV windows
cap.release()
cv2.destroyAllWindows()
OUTPUT:
RESULT:
Thus, hand gestures were detected and recognized in real time from a webcam feed using MediaPipe hand landmarks, and the recognized gesture was displayed on each frame for visualization, successfully.