Most lanes are designed to be relatively straight, not only to encourage orderliness but also to make it easier for human drivers to steer vehicles at a consistent speed. Therefore, an intuitive approach is to first detect prominent straight lines in the camera feed through edge detection and feature extraction techniques. We will be using OpenCV, an open-source library of computer vision algorithms, for the implementation. The following diagram is an overview of our pipeline.
If you do not already have OpenCV installed, open Terminal and run:
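pip install opencv-python

This assumes a pip-based Python setup; the PyPI package for OpenCV is opencv-python. The code later in this section also uses numpy and matplotlib, which can be installed the same way if you do not already have them.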
Next, open detector.py in your text editor. We will be writing all of the code in this section in this Python file.
Processing a video
We will feed our sample video into the lane detector as a series of continuous frames (images), read at 10-millisecond intervals. We can also quit the program at any time by pressing the 'q' key.
import cv2 as cv
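The snippet below is a minimal sketch of that loop, assuming the sample video is saved as input.mp4 in the same folder as detector.py (the same file name is used in the later snippets):

cap = cv.VideoCapture("input.mp4")
while cap.isOpened():
    # ret is a boolean indicating whether a frame was read; frame is the current image
    ret, frame = cap.read()
    if not ret:
        break
    cv.imshow("frame", frame)
    # Frames are read at 10-millisecond intervals; pressing the 'q' key exits the loop
    if cv.waitKey(10) & 0xFF == ord('q'):
        break
# Frees the capture object and closes all OpenCV windows
cap.release()
cv.destroyAllWindows()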
Applying the Canny detector
The Canny detector is a multi-stage algorithm optimized for fast, real-time edge detection. The fundamental goal of the algorithm is to detect sharp changes in luminosity (large gradients), such as a shift from white to black, and to mark them as edges, given a set of thresholds. The Canny algorithm has four main stages:
A. Noise reduction
As with all edge detection algorithms, noise is a crucial issue that often leads to false detections. A 5x5 Gaussian filter is applied to convolve (smooth) the image, lowering the detector's sensitivity to noise. This is done by running a kernel (in this case, a 5x5 kernel) of normally distributed numbers across the entire image and setting each pixel value equal to the weighted average of its neighboring pixels.
5x5 Gaussian kernel. Asterisk denotes convolution operation.
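As a rough sketch of this stage, the same smoothing can be written explicitly in OpenCV by building the 5x5 kernel and convolving it with a grayscale image (image.jpg is a hypothetical placeholder; the pipeline below simply calls cv.GaussianBlur):

import cv2 as cv

gray = cv.imread("image.jpg", cv.IMREAD_GRAYSCALE)   # hypothetical sample image
# 5x1 column of normally distributed weights; the sigma is derived from the kernel size when 0 is passed
g = cv.getGaussianKernel(5, 0)
kernel = g @ g.T                                      # outer product gives the 5x5 Gaussian kernel
smoothed = cv.filter2D(gray, -1, kernel)              # convolve the kernel across the whole image
# Equivalent one-liner used later in do_canny
blur = cv.GaussianBlur(gray, (5, 5), 0)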
B. Intensity gradient
A Sobel, Roberts, or Prewitt kernel (Sobel is used in OpenCV) is then applied to the smoothed image along the x-axis and y-axis to detect whether the edges are horizontal, vertical, or diagonal.
Sobel kernels for calculating the first derivative in the horizontal and vertical directions
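A rough sketch of this stage (cv.Canny performs it internally, so it is not part of the pipeline code) computes the horizontal and vertical Sobel gradients of the smoothed image and combines them into a magnitude and a direction; blur here is the smoothed grayscale image from the previous step:

import cv2 as cv
import numpy as np

gx = cv.Sobel(blur, cv.CV_64F, 1, 0, ksize=3)   # first derivative along x
gy = cv.Sobel(blur, cv.CV_64F, 0, 1, ksize=3)   # first derivative along y
magnitude = np.sqrt(gx ** 2 + gy ** 2)          # edge strength (intensity gradient)
direction = np.arctan2(gy, gx)                  # edge orientation in radians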
C. Non-maximum suppression
Non-maximum suppression is applied to "thin" and effectively sharpen the edges. For each pixel, the algorithm checks whether its value is a local maximum in the direction of the gradient calculated in the previous stage.
Non-maximum suppression on three points
A lies on an edge with a vertical direction. Since the gradient is normal to the edge direction, the pixel values of B and C are compared with the pixel value of A to determine whether A is a local maximum. If A is a local maximum, non-maximum suppression is tested for the next point. Otherwise, the pixel value of A is set to zero and A is suppressed.
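The following is a simplified sketch of this idea, assuming the magnitude and direction arrays from the previous stage (cv.Canny implements this step internally and far more efficiently):

import numpy as np

def non_max_suppression(magnitude, direction):
    # Keeps a pixel only if it is a local maximum along its gradient direction
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    # Quantize the gradient direction to 0, 45, 90 or 135 degrees
    angle = np.rad2deg(direction) % 180
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:      # horizontal gradient: compare left and right neighbors
                neighbors = (magnitude[i, j - 1], magnitude[i, j + 1])
            elif a < 67.5:                  # 45-degree gradient: compare one diagonal
                neighbors = (magnitude[i - 1, j + 1], magnitude[i + 1, j - 1])
            elif a < 112.5:                 # vertical gradient: compare the neighbors above and below
                neighbors = (magnitude[i - 1, j], magnitude[i + 1, j])
            else:                           # 135-degree gradient: compare the other diagonal
                neighbors = (magnitude[i - 1, j - 1], magnitude[i + 1, j + 1])
            if magnitude[i, j] >= max(neighbors):
                out[i, j] = magnitude[i, j]
    return out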
D. Hysteresis thresholding
After non-maximum suppression, strong pixels are confirmed to be in the final map of edges. However, weak pixels need to be further analyzed to determine whether they constitute edges or noise. Applying two pre-defined threshold values, minVal and maxVal, we set that any pixel with an intensity gradient higher than maxVal is an edge, and any pixel with an intensity gradient lower than minVal is not an edge and is discarded. Pixels with an intensity gradient between minVal and maxVal are only considered edges if they are connected to a pixel with an intensity gradient above maxVal.
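As a simplified illustration of this rule (cv.Canny implements it internally; minVal and maxVal are simply the two threshold arguments passed to cv.Canny below), operating on the gradient magnitude from the earlier sketch:

import numpy as np

def hysteresis_threshold(magnitude, min_val=50, max_val=150):
    # Strong pixels are above max_val; weak pixels fall between min_val and max_val
    strong = magnitude >= max_val
    weak = (magnitude >= min_val) & (magnitude < max_val)
    edges = strong.copy()
    # Repeatedly promote weak pixels that touch an already-confirmed edge pixel
    changed = True
    while changed:
        padded = np.pad(edges, 1)
        neighbor_edge = np.zeros_like(edges)
        for di in (0, 1, 2):
            for dj in (0, 1, 2):
                if di == 1 and dj == 1:
                    continue  # skip the pixel itself
                neighbor_edge |= padded[di:di + edges.shape[0], dj:dj + edges.shape[1]]
        promoted = weak & neighbor_edge & ~edges
        changed = bool(promoted.any())
        edges |= promoted
    return edges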
Segmenting the lane area
We now add do_canny and do_segment to detector.py. do_segment masks out everything outside a triangular region of interest covering the lane area in front of the vehicle, so that the later stages only consider edges inside the lane. Visualizing a frame with plt.imshow first makes it easier to pick the three coordinates that define the triangle.

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

def do_canny(frame):
    gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    canny = cv.Canny(blur, 50, 150)
    return canny

def do_segment(frame):
    # Since an image is a multi-dimensional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
    # frame.shape[0] gives us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
    height = frame.shape[0]
    # Creates a triangular polygon for the mask defined by three (x, y) coordinates
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    # Creates an image filled with zero intensities with the same dimensions as the frame
    mask = np.zeros_like(frame)
    # Fills the triangular area of the mask with values of 255 and leaves the other areas at 0
    cv.fillPoly(mask, polygons, 255)
    # A bitwise and operation between the mask and frame keeps only the triangular area of the frame
    segment = cv.bitwise_and(frame, mask)
    return segment

cap = cv.VideoCapture("input.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    canny = do_canny(frame)
    # First, visualize the frame to figure out the three coordinates defining the triangular mask
    plt.imshow(frame)
    plt.show()
    segment = do_segment(canny)
    # Frames are read at 10-millisecond intervals; pressing 'q' exits the loop
    if cv.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv.destroyAllWindows()
Hough transform
With the edges restricted to the lane area, we can detect the lane borders as straight lines by running the probabilistic Hough transform, cv.HoughLinesP, on the segmented frame. The matplotlib calls used to pick the mask coordinates are commented out now that the triangle has been chosen, and the loop relies on two helper functions, calculate_lines and visualize_lines, whose roles are described in the comments below.

import cv2 as cv
import numpy as np
# import matplotlib.pyplot as plt

def do_canny(frame):
    gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    canny = cv.Canny(blur, 50, 150)
    return canny

def do_segment(frame):
    height = frame.shape[0]
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    mask = np.zeros_like(frame)
    cv.fillPoly(mask, polygons, 255)
    segment = cv.bitwise_and(frame, mask)
    return segment

cap = cv.VideoCapture("input.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    canny = do_canny(frame)
    # plt.imshow(frame)
    # plt.show()
    segment = do_segment(canny)
    hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
    # Averages multiple detected lines from hough into one line for the left border of the lane and one line for the right border of the lane
    lines = calculate_lines(frame, hough)
    # Visualizes the lines
    lines_visualize = visualize_lines(frame, lines)
    # Overlays lines on frame by taking their weighted sums and adding an arbitrary scalar value of 1 as the gamma argument
    output = cv.addWeighted(frame, 0.9, lines_visualize, 1, 1)
    # Opens a new window and displays the output frame
    cv.imshow("output", output)
    # Frames are read at 10-millisecond intervals; pressing 'q' exits the loop
    if cv.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv.destroyAllWindows()
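The excerpt above calls calculate_lines and visualize_lines without defining them. As a rough sketch rather than the tutorial's exact code, one possible implementation averages the Hough segments by slope sign into a single left line and a single right line and then draws them on a blank image (it assumes at least one segment is detected on each side of the lane):

def calculate_coordinates(frame, parameters):
    # Converts a (slope, intercept) pair into pixel endpoints spanning the lower part of the frame
    slope, intercept = parameters
    y1 = frame.shape[0]              # bottom of the frame
    y2 = int(y1 * 0.6)               # an arbitrary height partway up the frame
    x1 = int((y1 - intercept) / slope)
    x2 = int((y2 - intercept) / slope)
    return np.array([x1, y1, x2, y2])

def calculate_lines(frame, lines):
    # Separates the detected segments by slope sign and averages each group into one line
    left, right = [], []
    for line in lines:
        x1, y1, x2, y2 = line.reshape(4)
        # Fits a first-degree polynomial to the two endpoints to get a slope and intercept
        slope, intercept = np.polyfit((x1, x2), (y1, y2), 1)
        if slope < 0:
            left.append((slope, intercept))    # segments sloping toward the left lane border
        else:
            right.append((slope, intercept))   # segments sloping toward the right lane border
    left_avg = np.average(left, axis=0)
    right_avg = np.average(right, axis=0)
    left_line = calculate_coordinates(frame, left_avg)
    right_line = calculate_coordinates(frame, right_avg)
    return np.array([left_line, right_line])

def visualize_lines(frame, lines):
    # Draws the averaged lane lines on a black image with the same dimensions as the frame
    lines_visualize = np.zeros_like(frame)
    if lines is not None:
        for x1, y1, x2, y2 in lines:
            cv.line(lines_visualize, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 5)
    return lines_visualize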
For reference, here are do_canny and do_segment again with their comments in full:

def do_canny(frame):
    # Converts frame to grayscale because we only need the luminance channel for detecting edges - less computationally expensive
    gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
    # Applies a 5x5 gaussian blur with deviation of 0 to frame - not mandatory since Canny will do this for us
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    # Applies Canny edge detector with minVal of 50 and maxVal of 150
    canny = cv.Canny(blur, 50, 150)
    return canny

def do_segment(frame):
    # Since an image is a multi-dimensional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
    # frame.shape[0] gives us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
    height = frame.shape[0]
    # Creates a triangular polygon for the mask defined by three (x, y) coordinates
    polygons = np.array([
        [(0, height), (800, height), (380, 290)]
    ])
    # Creates an image filled with zero intensities with the same dimensions as the frame
    mask = np.zeros_like(frame)
    # Fills the triangular area of the mask with values of 255 and leaves the other areas at 0
    cv.fillPoly(mask, polygons, 255)
    # A bitwise and operation between the mask and frame keeps only the triangular area of the frame
    segment = cv.bitwise_and(frame, mask)
    return segment