This tutorial provides instructions for building a lane detection system using OpenCV. It describes using the Hough transform algorithm to detect straight lines in video frames as the first step. It then discusses preprocessing frames by applying Canny edge detection and creating a triangular mask to segment the lane area. The next steps involve calculating the slopes and positions of left and right lane lines, and visualizing the detected lines overlaid on the original frames.
Tutorial: Build a lane detector

Approach 1: Hough Transform

Most lanes are designed to be relatively straight, not only to encourage orderliness but
also to make it easier for human drivers to steer vehicles at consistent speeds. Therefore, our
intuitive approach may be to first detect prominent straight lines in the camera feed through edge
detection and feature extraction techniques. We will be using OpenCV, an open source library of
computer vision algorithms, for implementation. The following diagram is an overview of our
pipeline.

1. Setting up your environment

If you do not already have OpenCV installed, open Terminal and run:

pip install opencv-python

Now, clone the tutorial repository by running:

git clone https://github.com/chuanenlin/lane-detector.git

Next, open detector.py with your text editor. We will be writing all of the code for this section in
this Python file.

2. Processing a video

We will feed in our sample video for lane detection as a series of continuous frames (images) at
intervals of 10 milliseconds. We can also quit the program at any time by pressing the 'q' key.

import cv2 as cv

# The video feed is read in as a VideoCapture object
cap = cv.VideoCapture("input.mp4")
while (cap.isOpened()):
    # ret = a boolean return value from getting the frame, frame = the current frame being projected in the video
    ret, frame = cap.read()
    # Frames are read by intervals of 10 milliseconds. The program breaks out of the while loop when the user presses the 'q' key
    if cv.waitKey(10) & 0xFF == ord('q'):
        break
# The following frees up resources and closes all windows
cap.release()
cv.destroyAllWindows()

3. Applying Canny Detector

The Canny Detector is a multi-stage algorithm optimized for fast real-time edge detection. The
fundamental goal of the algorithm is to detect sharp changes in luminosity (large gradients), such
as a shift from white to black, and define them as edges, given a set of thresholds. The Canny
algorithm has four main stages:

A. Noise reduction

As with all edge detection algorithms, noise is a crucial issue that often leads to false detection. A
5x5 Gaussian filter is applied to convolve (smooth) the image to lower the detector's sensitivity to
noise. This is done by running a kernel (in this case, a 5x5 kernel) of normally distributed numbers
across the entire image, setting each pixel value equal to the weighted average of its
neighboring pixels.
5x5 Gaussian kernel. Asterisk denotes convolution operation.

B. Intensity gradient

A Sobel, Roberts, or Prewitt kernel (Sobel is used in OpenCV) is then applied to the smoothed
image along the x-axis and y-axis to detect whether the edges are horizontal, vertical, or
diagonal.

Sobel kernel for calculation of the first derivative of horizontal and vertical directions

C. Non-maximum suppression

Non-maximum suppression is applied to "thin" and effectively sharpen the edges. For each pixel,
the value is checked to see whether it is a local maximum in the direction of the gradient calculated previously.
Non-maximum suppression on three points

A is on an edge with a vertical direction. As the gradient is normal to the edge direction, the pixel
values of B and C are compared with the pixel value of A to determine whether A is a local maximum.
If A is a local maximum, non-maximum suppression moves on to the next point. Otherwise, the pixel value
of A is set to zero and A is suppressed.
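This comparison can be sketched in a few lines of numpy. The toy example below (illustrative only, not OpenCV's internal implementation) handles the simplest case of a horizontal gradient direction, where each pixel is compared against its left and right neighbors:

```python
import numpy as np

# Toy gradient magnitudes: a blurry vertical edge several pixels wide
magnitude = np.array([
    [0, 1, 3, 1, 0],
    [0, 2, 5, 2, 0],
    [0, 1, 4, 1, 0],
], dtype=float)

thinned = np.zeros_like(magnitude)
for r in range(magnitude.shape[0]):
    for c in range(1, magnitude.shape[1] - 1):
        # Keep the pixel only if it is a local maximum along the gradient direction
        if magnitude[r, c] >= magnitude[r, c - 1] and magnitude[r, c] >= magnitude[r, c + 1]:
            thinned[r, c] = magnitude[r, c]

# Only the ridge down the middle column survives: the edge is thinned to one pixel wide
```

The real algorithm first quantizes each pixel's gradient direction (horizontal, vertical, or one of the two diagonals) and compares against the two neighbors along that direction; the principle is the same.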

D. Hysteresis thresholding

After non-maximum suppression, strong pixels are confirmed to be in the final map of edges.
However, weak pixels should be further analyzed to determine whether they constitute edges or
noise. Applying two pre-defined minVal and maxVal threshold values, we set that any pixel with an
intensity gradient higher than maxVal is an edge, while any pixel with an intensity gradient lower than
minVal is not an edge and is discarded. Pixels with intensity gradients between minVal and
maxVal are only considered edges if they are connected to a pixel with an intensity gradient above
maxVal.

Hysteresis thresholding example on two lines


Edge A is above maxVal, so it is considered an edge. Edge B is between maxVal and minVal but
is not connected to any edge above maxVal, so it is discarded. Edge C is between maxVal and
minVal and is connected to edge A, an edge above maxVal, so it is considered an edge.
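The same classification can be illustrated with a toy 1D example (assumed gradient values, not OpenCV internals), where weak pixels are promoted only if they connect back to a strong pixel:

```python
import numpy as np

minVal, maxVal = 50, 150
# Assumed gradient magnitudes along a 1D contour of pixels
grad = np.array([200, 100, 100, 30, 100, 40], dtype=float)

strong = grad > maxVal                 # definite edges
weak = (grad >= minVal) & ~strong      # candidates needing connectivity

edges = strong.copy()
changed = True
while changed:
    changed = False
    for i in range(len(grad)):
        if weak[i] and not edges[i]:
            # A weak pixel is promoted if an adjacent pixel is already an edge
            if (i > 0 and edges[i - 1]) or (i < len(grad) - 1 and edges[i + 1]):
                edges[i] = True
                changed = True

# → edges: [True, True, True, False, False, False]
# Indices 1-2 (weak) connect back to the strong pixel at index 0;
# index 4 is weak but isolated, so it is discarded as noise
```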

For our pipeline, the frame is first grayscaled because we only need the luminance channel for
detecting edges, and a 5x5 Gaussian blur is applied to decrease noise and reduce false edges.

4. Segmenting lane area


We will handcraft a triangular mask to segment the lane area and
discard the irrelevant areas in the frame to increase the
effectiveness of our later stages.

# import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# def do_canny(frame):
#     gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
#     blur = cv.GaussianBlur(gray, (5, 5), 0)
#     canny = cv.Canny(blur, 50, 150)
#     return canny

def do_segment(frame):
    # Since an image is a multi-dimensional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
    # frame.shape[0] gives us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
    height = frame.shape[0]
    # Creates a triangular polygon for the mask defined by three (x, y) coordinates
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    # Creates an image filled with zero intensities with the same dimensions as the frame
    mask = np.zeros_like(frame)
    # Fills the triangular area of the mask with values of 255, leaving the other areas filled with values of 0
    cv.fillPoly(mask, polygons, 255)
    # A bitwise and operation between the mask and frame keeps only the triangular area of the frame
    segment = cv.bitwise_and(frame, mask)
    return segment

# cap = cv.VideoCapture("input.mp4")
# while (cap.isOpened()):
#     ret, frame = cap.read()
#     canny = do_canny(frame)
    # First, visualize the frame to figure out the three coordinates defining the triangular mask
    plt.imshow(frame)
    plt.show()
    segment = do_segment(canny)
#     if cv.waitKey(10) & 0xFF == ord('q'):
#         break
# cap.release()
# cv.destroyAllWindows()
# import cv2 as cv
# import numpy as np
# # import matplotlib.pyplot as plt

# def do_canny(frame):
#     gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
#     blur = cv.GaussianBlur(gray, (5, 5), 0)
#     canny = cv.Canny(blur, 50, 150)
#     return canny

# def do_segment(frame):
#     height = frame.shape[0]
#     polygons = np.array([
#         [(0, height), (800, height), (380, 290)]
#     ])
#     mask = np.zeros_like(frame)
#     cv.fillPoly(mask, polygons, 255)
#     segment = cv.bitwise_and(frame, mask)
#     return segment

def calculate_lines(frame, lines):
    # Empty arrays to store the coordinates of the left and right lines
    left = []
    right = []
    # Loops through every detected line
    for line in lines:
        # Reshapes line from 2D array to 1D array
        x1, y1, x2, y2 = line.reshape(4)
        # Fits a linear polynomial to the x and y coordinates and returns a vector of coefficients which describe the slope and y-intercept
        parameters = np.polyfit((x1, x2), (y1, y2), 1)
        slope = parameters[0]
        y_intercept = parameters[1]
        # If slope is negative, the line is to the left of the lane, and otherwise, the line is to the right of the lane
        if slope < 0:
            left.append((slope, y_intercept))
        else:
            right.append((slope, y_intercept))
    # Averages out all the values for left and right into a single slope and y-intercept value for each line
    left_avg = np.average(left, axis = 0)
    right_avg = np.average(right, axis = 0)
    # Calculates the x1, y1, x2, y2 coordinates for the left and right lines
    left_line = calculate_coordinates(frame, left_avg)
    right_line = calculate_coordinates(frame, right_avg)
    return np.array([left_line, right_line])

def calculate_coordinates(frame, parameters):
    slope, intercept = parameters
    # Sets initial y-coordinate as height from top down (bottom of the frame)
    y1 = frame.shape[0]
    # Sets final y-coordinate as 150 above the bottom of the frame
    y2 = int(y1 - 150)
    # Sets initial x-coordinate as (y1 - b) / m since y1 = mx1 + b
    x1 = int((y1 - intercept) / slope)
    # Sets final x-coordinate as (y2 - b) / m since y2 = mx2 + b
    x2 = int((y2 - intercept) / slope)
    return np.array([x1, y1, x2, y2])

def visualize_lines(frame, lines):
    # Creates an image filled with zero intensities with the same dimensions as the frame
    lines_visualize = np.zeros_like(frame)
    # Checks if any lines are detected
    if lines is not None:
        for x1, y1, x2, y2 in lines:
            # Draws lines between two coordinates with green color and 5 thickness
            cv.line(lines_visualize, (x1, y1), (x2, y2), (0, 255, 0), 5)
    return lines_visualize

# cap = cv.VideoCapture("input.mp4")
# while (cap.isOpened()):
#     ret, frame = cap.read()
#     canny = do_canny(frame)
#     # plt.imshow(frame)
#     # plt.show()
#     segment = do_segment(canny)
#     hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
    # Averages multiple detected lines from hough into one line for left border of lane and one line for right border of lane
    lines = calculate_lines(frame, hough)
    # Visualizes the lines
    lines_visualize = visualize_lines(frame, lines)
    # Overlays lines on frame by taking their weighted sums and adding an arbitrary scalar value of 1 as the gamma argument
    output = cv.addWeighted(frame, 0.9, lines_visualize, 1, 1)
    # Opens a new window and displays the output frame
    cv.imshow("output", output)
#     if cv.waitKey(10) & 0xFF == ord('q'):
#         break
# cap.release()
# cv.destroyAllWindows()

Now, open Terminal and run python detector.py to test your simple lane detector! In case you
have missed any code, here is the full solution with comments:
import cv2 as cv
import numpy as np
# import matplotlib.pyplot as plt

def do_canny(frame):
    # Converts frame to grayscale because we only need the luminance channel for detecting edges - less computationally expensive
    gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
    # Applies a 5x5 gaussian blur with deviation of 0 to frame - not mandatory since Canny will do this for us
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    # Applies Canny edge detector with minVal of 50 and maxVal of 150
    canny = cv.Canny(blur, 50, 150)
    return canny

def do_segment(frame):
    # Since an image is a multi-dimensional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
    # frame.shape[0] gives us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
    height = frame.shape[0]
    # Creates a triangular polygon for the mask defined by three (x, y) coordinates
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    # Creates an image filled with zero intensities with the same dimensions as the frame
    mask = np.zeros_like(frame)
    # Fills the triangular area of the mask with values of 255, leaving the other areas filled with values of 0
    cv.fillPoly(mask, polygons, 255)
    # A bitwise and operation between the mask and frame keeps only the triangular area of the frame
    segment = cv.bitwise_and(frame, mask)
    return segment

def calculate_lines(frame, lines):
    # Empty arrays to store the coordinates of the left and right lines
    left = []
    right = []
    # Loops through every detected line
    for line in lines:
        # Reshapes line from 2D array to 1D array
        x1, y1, x2, y2 = line.reshape(4)
        # Fits a linear polynomial to the x and y coordinates and returns a vector of coefficients which describe the slope and y-intercept
        parameters = np.polyfit((x1, x2), (y1, y2), 1)
        slope = parameters[0]
        y_intercept = parameters[1]
        # If slope is negative, the line is to the left of the lane, and otherwise, the line is to the right of the lane
        if slope < 0:
            left.append((slope, y_intercept))
        else:
            right.append((slope, y_intercept))
    # Averages out all the values for left and right into a single slope and y-intercept value for each line
    left_avg = np.average(left, axis = 0)
    right_avg = np.average(right, axis = 0)
    # Calculates the x1, y1, x2, y2 coordinates for the left and right lines
    left_line = calculate_coordinates(frame, left_avg)
    right_line = calculate_coordinates(frame, right_avg)
    return np.array([left_line, right_line])

def calculate_coordinates(frame, parameters):
    slope, intercept = parameters
    # Sets initial y-coordinate as height from top down (bottom of the frame)
    y1 = frame.shape[0]
    # Sets final y-coordinate as 150 above the bottom of the frame
    y2 = int(y1 - 150)
    # Sets initial x-coordinate as (y1 - b) / m since y1 = mx1 + b
    x1 = int((y1 - intercept) / slope)
    # Sets final x-coordinate as (y2 - b) / m since y2 = mx2 + b
    x2 = int((y2 - intercept) / slope)
    return np.array([x1, y1, x2, y2])

def visualize_lines(frame, lines):
    # Creates an image filled with zero intensities with the same dimensions as the frame
    lines_visualize = np.zeros_like(frame)
    # Checks if any lines are detected
    if lines is not None:
        for x1, y1, x2, y2 in lines:
            # Draws lines between two coordinates with green color and 5 thickness
            cv.line(lines_visualize, (x1, y1), (x2, y2), (0, 255, 0), 5)
    return lines_visualize

# The video feed is read in as a VideoCapture object
cap = cv.VideoCapture("input.mp4")
while (cap.isOpened()):
    # ret = a boolean return value from getting the frame, frame = the current frame being projected in the video
    ret, frame = cap.read()
    canny = do_canny(frame)
    cv.imshow("canny", canny)
    # plt.imshow(frame)
    # plt.show()
    segment = do_segment(canny)
    hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
    # Averages multiple detected lines from hough into one line for left border of lane and one line for right border of lane
    lines = calculate_lines(frame, hough)
    # Visualizes the lines
    lines_visualize = visualize_lines(frame, lines)
    cv.imshow("hough", lines_visualize)
    # Overlays lines on frame by taking their weighted sums and adding an arbitrary scalar value of 1 as the gamma argument
    output = cv.addWeighted(frame, 0.9, lines_visualize, 1, 1)
    # Opens a new window and displays the output frame
    cv.imshow("output", output)
    # Frames are read by intervals of 10 milliseconds. The program breaks out of the while loop when the user presses the 'q' key
    if cv.waitKey(10) & 0xFF == ord('q'):
        break
# The following frees up resources and closes all windows
cap.release()
cv.destroyAllWindows()
