Event Info Extraction From Flyers: Yang Zhang, Hao Zhang, Haoran Li
Previous EE368 students have worked on a project that imports contacts from business cards automatically [1]. Here we target posters, which have more variation and involve more complex image components, so accurate text detection across diverse posters is more challenging. We also want to improve the performance of our system by allowing users to take photos from different angles and under different illumination conditions.

3) Text Detection: Two heuristic methods, based on text variance and edge density, are first used to locate the text area. Filters then eliminate useless regions by area, height, orientation, solidity, etc. This is discussed in Section III.
4) OCR: Each detected text bounding box is sent to the Tesseract OCR engine separately; the results are concatenated into one text string and pushed back to the Android device over the Internet. Section IV covers this step.
5) NLP Information Extraction: The downloaded text file is parsed into dates and venue by natural language processing methods, as introduced in Section V.
6) Importing to Google Calendar: The date and venue information is added to Google Calendar, as discussed in Section V.

Figure 2: System pipeline. The blocks in red are implemented on the Android device and the blocks in blue on the server.

III. IMAGE PREPROCESSING

Preprocessing is an essential step for obtaining accurate text recognition results with the Tesseract OCR tool, because a poster image captured by a hand-held camera is far from ideal input for an OCR engine. 1) The image has perspective distortion, while the OCR engine assumes the text is captured from a perpendicular, upright view. 2) The illumination of the image is not uniform everywhere. 3) There are usually multiple text blocks in a flyer, and the blocks are not homogeneous (different font sizes, colors, and backgrounds). Our preprocessing workflow is designed to tackle these issues; its steps are described in the following subsections.

A. Correcting Perspective Distortion

We assume that the surface of the poster/flyer forms a flat plane in 3-D space and is presented on a rectangular sheet of paper. Therefore, as long as we can find the bounding quadrilateral of the poster/flyer in the image, we can form a homographic transform that maps the irregular quadrilateral to a horizontal rectangular domain. To find the quadrilateral, we run the following steps (code sketches are given below):
1. Use the Canny edge detector to generate an edge map of the original image. The parameters of the Canny edge detector can be tuned to better suit our purpose: because the edges we want to detect should be relatively strong and spatially long, the gradient threshold is raised to rule out weaker edges, and a pre-smoothing filter is applied to remove edges that come from local image texture.
2. Compute the Hough transform of the edge map to obtain boundary line candidates.
3. Locate the four sides of the bounding quadrilateral by detecting two near-horizontal and two near-vertical edge candidates, using the houghpeaks and houghlines functions. Because there can be more candidates than we need, we select and rank the candidates based on several criteria:
a. We only accept edge candidates within ±15 deg of the nominal axis directions.
b. We only accept edges that are longer than 1/4 of the image's dimension along the corresponding axis, and longer edges get higher priority.
c. We assign higher priority to edges that are farther from the image center.
Some safeguarding is also added to handle the case where some edges cannot be found (because of occlusion, or because the actual edges lie outside the captured image). Figure 3 shows the detection of candidate edges that could form the bounding quadrilateral.
4. Once we have determined the encompassing quadrilateral, we form a homographic transform that maps the image content inside the distorted quadrilateral to a standard rectangular area, thus achieving geometric correction. Figure 4 shows the detected final encompassing quadrilateral, and Figure 5 shows the result after undoing the perspective distortion.
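As a concrete illustration of steps 1–3, the following MATLAB sketch produces a strong-edge map, runs the Hough transform, and keeps only long, near-axis-aligned line candidates. The Canny thresholds, smoothing sigma, Hough peak count, and segment-length settings are illustrative assumptions rather than the exact values used in our system.

% Sketch of steps 1-3; threshold, sigma, peak-count, and length values are
% illustrative assumptions, not the exact settings of our system.
I = rgb2gray(imread('flyer.jpg'));     % 'flyer.jpg' is a placeholder input image
I = imresize(I, 0.25);                 % subsample the camera image

% Step 1: Canny with a raised gradient threshold and extra pre-smoothing (sigma = 3),
% so that only strong, spatially long edges survive and texture edges are suppressed.
E = edge(I, 'canny', [0.1 0.3], 3);

% Step 2: Hough transform of the edge map.
[H, theta, rho] = hough(E);

% Step 3: strongest peaks -> line segments via houghpeaks/houghlines.
P = houghpeaks(H, 20, 'Threshold', 0.3 * max(H(:)));
L = houghlines(E, theta, rho, P, 'FillGap', 20, 'MinLength', 30);

% Criteria (a) and (b): keep near-axis-aligned candidates (within +/-15 deg) that are
% longer than 1/4 of the image dimension along the corresponding axis.
segLen = arrayfun(@(l) hypot(l.point2(1)-l.point1(1), l.point2(2)-l.point1(2)), L);
th     = [L.theta];                                % theta = 0 corresponds to a vertical line
vertOK = abs(th) <= 15 & segLen > size(I,1)/4;     % candidate left/right sides
horzOK = abs(th) >= 75 & segLen > size(I,2)/4;     % candidate top/bottom sides

% Criterion (c): rank surviving segments by how far their midpoints lie from the image center;
% the two best vertical and two best horizontal candidates form the bounding quadrilateral.
mids  = cell2mat(arrayfun(@(l) (l.point1 + l.point2)/2, L(:), 'UniformOutput', false));
score = hypot(mids(:,1) - size(I,2)/2, mids(:,2) - size(I,1)/2);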
Figure 3: Using the Canny edge map enables the detection of candidate edges that could form bounding quadrilaterals. (a): input subsampled camera image; (b): Canny edge detection result; (c): near-horizontal edges from the Hough transform; (d): near-vertical edges from the Hough transform.
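Step 4 then warps the detected quadrilateral onto an upright rectangle. A minimal sketch, assuming the four corners found above have been collected (in top-left, top-right, bottom-right, bottom-left order) into a 4x2 array quadCorners, which is an assumed variable name:

% Sketch of step 4: undo the perspective distortion with a projective warp.
% quadCorners is an assumed 4x2 [x y] array of the detected corners; outW and
% outH are a chosen size for the rectified output, not values from the paper.
outW = 800;  outH = 1100;
rectCorners = [1 1; outW 1; outW outH; 1 outH];   % target rectangle, same corner order

tform     = fitgeotrans(quadCorners, rectCorners, 'projective');   % homographic transform
rectified = imwarp(I, tform, 'OutputView', imref2d([outH outW]));  % geometrically corrected image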
V. POSTPROCESSING
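The postprocessing stage parses the concatenated OCR text into dates and venue (step 5 of the pipeline) before importing them into Google Calendar. As a purely illustrative stand-in for that parsing, and not the NLP method actually used, the MATLAB snippet below pulls month/day/year strings out of the text with a regular expression; the file name and pattern are assumptions.

% Illustrative date extraction from the downloaded OCR text; the regular
% expression and the file name are assumptions, not the system's NLP method.
ocrText = fileread('ocr_result.txt');   % placeholder path for the concatenated OCR output

% Match dates such as "March 7, 2014" or "Mar 7": month name, day, optional year.
datePat = ['(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+' ...
           '\d{1,2}(st|nd|rd|th)?(,?\s*\d{4})?'];
dates = regexpi(ocrText, datePat, 'match');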
Figure 10: Result images of each step for a sample input image. (a) is the input image, (b) is the rectified image, (c) is the text crops that we identify, (d) is the result text of OCR, and (e) is a screenshot of our app.

Figure 1: Put events in the calendar by snapping a picture.