
Event Info Extraction from Flyers

Yang Zhang*, Hao Zhang*, Haoran Li*

Department of Geophysics, Stanford University, Stanford, CA
[email protected]

Department of Electrical Engineering, Stanford University, Stanford, CA
{hzhang22, aimeeli}@stanford.edu

Abstract—In this study, we developed an Android app that adds poster dates and venue information to Google Calendar. Several digital image processing methods are used to preprocess the captured poster image, which may have been taken from a difficult perspective, under bad lighting conditions, or with partial occlusions. We also cope with the complex background of the poster to make it easier to locate the text areas. After locating the text bounding boxes, the text is extracted by an optical character recognition (OCR) engine, Tesseract, and parsed to isolate dates and venues. The image processing part is done on a server by several heuristic methods, and the resulting concatenated text is pushed back to the phone. Natural language processing (NLP) techniques are then used to analyze the text string and extract the date and venue, which are automatically added to Google Calendar. Test results show that our algorithm performs well at locating the text areas and extracting the key information.

Keywords—Android, Image Processing, OCR, event info extraction

I. INTRODUCTION

Every day we encounter a lot of event information on the paper flyers posted on hallway walls, entrance doors and elevators in our buildings. Whether it is a presentation, a forum discussion or a concert, they all come with a date, time and venue. People might want to keep a record of interesting events in their calendar when they come across those posters. But since the information is not digitized, it is very inconvenient to manually input it into our smartphone calendars. Therefore, we aim to build an OCR-based mobile image processing system that can automatically extract event information from flyer pictures and directly integrate it into the users' digital calendars. Our goal is to design a small tool on the Android platform that lets the user put the event info (time and location) into his or her personal calendar by simply snapping a picture of the related flyer/poster (Figure 1).

Previous EE368 students have worked on a project for importing contacts from business cards automatically [1]. Here we are targeting posters, which have more variations and involve more complex image components; accurate text detection on various posters is therefore more challenging. We also want to improve the performance of our system by allowing users to take photos from different angles and under different illumination conditions.

Figure 1: Put events in calendar by snapping a picture.

II. SYSTEM OVERVIEW

Since Tesseract OCR has a high computational cost, running the engine on a mobile client with responsive interactions is infeasible. We therefore offload the preprocessing and OCR work to a high-performance server over the network, so the Android app uses a client-server communication framework [2]. A flowchart of the application pipeline is shown in Figure 2.

There are six major stages in the pipeline:

1) Capture and upload image: The Android device captures the image; after downsizing it, the app uploads it to the server (a client-side sketch follows this list).

2) Preprocessing: Several image processing methods are used to prepare the image for text detection, including image resizing, edge detection, histogram equalization and homography rectification. Section III discusses this stage in detail.

3) Text detection: Two heuristic methods based on text variance and edge density are first used to locate the text areas; filters then eliminate useless regions by area, height, orientation, solidity, etc. This is also discussed in Section III.

4) OCR: Each detected text bounding box is sent to the Tesseract OCR engine separately, and the results are concatenated into one text string that is pushed back to the Android device over the network. Section IV covers this step.

5) NLP information extraction: The downloaded text is parsed into dates and venue by natural language processing methods, as introduced in Section V.

6) Importing into Google Calendar: The dates and venue information are added to Google Calendar, as discussed in Section V.
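As an illustration of stage 1 (not the app's actual code), the minimal client-side sketch below downsizes the captured bitmap and posts it to the server. The endpoint URL, working resolution and JPEG quality are placeholder assumptions; the actual transport follows the client-server tutorial in [2].

```java
// Minimal sketch of stage 1: downscale the captured bitmap and POST it to the server.
// The endpoint URL, target width and JPEG quality are illustrative assumptions.
// Note: on Android this must run off the UI thread (e.g., in a background thread).
import android.graphics.Bitmap;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Scanner;

public class FlyerUploader {
    public static String uploadFlyer(Bitmap captured) throws Exception {
        // Downsize so that the upload and the server-side processing stay fast.
        int targetWidth = 1024;  // assumed working resolution
        int targetHeight = captured.getHeight() * targetWidth / captured.getWidth();
        Bitmap small = Bitmap.createScaledBitmap(captured, targetWidth, targetHeight, true);

        ByteArrayOutputStream jpeg = new ByteArrayOutputStream();
        small.compress(Bitmap.CompressFormat.JPEG, 80, jpeg);

        // Hypothetical endpoint; the real server address is not given in this report.
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://flyer-ocr.example.com/upload").openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "image/jpeg");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(jpeg.toByteArray());
        }

        // The server replies with the concatenated OCR text (see Section IV).
        try (Scanner s = new Scanner(conn.getInputStream(), "UTF-8").useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }
}
```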
Figure 2: System pipeline. The blocks in red are implemented on the Android device and the blocks in blue on the server.

III. IMAGE PREPROCESSING

Preprocessing is an essential step for obtaining accurate text recognition results with the Tesseract OCR tool, because a poster image captured by a hand-held camera is far from ideal input for an OCR engine. 1) The image has perspective distortion, while the OCR engine assumes the image containing text is taken from a perpendicular, upright view. 2) The illumination of the image is not uniform everywhere. 3) There are usually multiple text blocks in a flyer, and the blocks are not homogeneous (different font sizes, colors and backgrounds). Our preprocessing workflow is designed to tackle these issues, and its steps are described in the following subsections.

A. Correcting Perspective Distortion

We assume that the surface of the poster/flyer forms a flat plane in 3-D space and is presented on a rectangular sheet of paper. Therefore, as long as we can find the bounding quadrilateral of the poster/flyer in the image, we can form a homographic transform that maps the irregular quadrilateral to an upright rectangular domain. To find the quadrilateral, we run the following steps (a code sketch follows the list):

1. Use the Canny edge detector to generate an edge map of the original image. The parameters of the Canny edge detector can be refined to better suit our purpose: because the edges we want to detect should be relatively strong and spatially long, the gradient threshold is set to rule out weaker edges, and a pre-smoothing filter is applied in order to get rid of edges that come from local image textures.

2. Compute the Hough transform on the edge map to obtain boundary line candidates.

3. Locate the four sides of the bounding quadrilateral by detecting two near-horizontal and two near-vertical edge candidates, using the houghpeaks and houghlines functions. Because there can be more candidates than we need, we select and rank the candidates based on several criteria:

a. We only accept edge candidates within ±15 degrees of the nominal axis directions.

b. We only accept edges that are longer than 1/4 of the image's dimension along the corresponding axis, and longer edges get higher priority.

c. We assign higher priority to edges that are further away from the image center.

Some safeguarding is also added to handle the case where some of the edges cannot be found (because of occlusion, or because the actual edges lie outside the captured image domain). Figure 3 shows the detection of candidate edges that could form the bounding quadrilateral.

4. Once we determine the encompassing quadrilateral, we form a homographic transform to map the image content inside the distorted quadrilateral to a standard rectangular area, thus achieving geometric correction. Figure 4 shows the detected final encompassing quadrilateral, and Figure 5 shows the result after undoing the perspective distortion.
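The rectification in step 4 can be sketched as follows. The edge and line detection itself uses the houghpeaks and houghlines functions as described above; the sketch below assumes the four quadrilateral corners have already been found and uses OpenCV's Java bindings purely as an illustration, not as our actual implementation.

```java
// Sketch of step 4: warp the detected quadrilateral onto an upright rectangle.
// OpenCV Java bindings are an assumption; the corner ordering is topLeft, topRight,
// bottomRight, bottomLeft.
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint2f;
import org.opencv.core.Point;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

public class PerspectiveCorrector {
    public static Mat rectify(Mat image, Point[] corners, int outWidth, int outHeight) {
        MatOfPoint2f src = new MatOfPoint2f(corners);
        MatOfPoint2f dst = new MatOfPoint2f(
                new Point(0, 0),
                new Point(outWidth - 1, 0),
                new Point(outWidth - 1, outHeight - 1),
                new Point(0, outHeight - 1));

        // Homography between the four detected corners and the target rectangle.
        Mat homography = Imgproc.getPerspectiveTransform(src, dst);

        // Map the poster content into the upright rectangular domain.
        Mat rectified = new Mat();
        Imgproc.warpPerspective(image, rectified, homography, new Size(outWidth, outHeight));
        return rectified;
    }
}
```

The output size can be chosen from the side lengths of the detected quadrilateral so that the rectified poster keeps a roughly correct aspect ratio.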

Figure 3: Using the Canny edge map enables the detection of candidate edges that could form bounding quadrilaterals. (a): input subsampled camera image; (b): Canny edge detection result; (c): near-horizontal edges from the Hough transform; (d): near-vertical edges from the Hough transform.

Figure 4: Final selection of the bounding quadrilaterals, drawn in green lines. Blue dots show the vertices.

Figure 5: The geometrically corrected (rectified) image after performing the inferred homographic mapping.
B. Text Detection and Individual Textbox Extraction

We use the geometrically corrected image to detect text regions. This module has two major steps: first, detect where the text is located; second, group the areas containing text into individual text blocks. By making the content inside each text block homogeneous, we achieve a much better recognition result from Tesseract. To find text regions, we use two heuristic criteria (a code sketch follows the list):

1. Local image variance. For every image location, we compute the image's brightness variance within a neighborhood window of 31 pixels. Locations with brightness variance lower than a certain threshold (10% of the maximum variance found in the image) are identified as non-text (i.e., plain background).

2. Local edge density. We first compute an edge map using Canny's method. Then, for every image location, we compute the proportion of edge pixels among all pixels in a neighborhood window of 31 pixels; we refer to this quantity as "edge density". Locations whose edge density lies within a certain range [0.05, 0.5] are considered text. This measure differentiates text regions from regions containing image content that has rich local detail and thus a higher edge density.

Figure 6 shows the two heuristics computed for the example image.

Figure 6: The computed local brightness variance (left) and edge density map (right). The two quantities are filtered with different threshold values.
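Both heuristic maps can be computed with simple box filtering. The sketch below is an illustrative version using OpenCV's Java bindings (an assumption); the 31-pixel window follows the description above, while the Canny parameters are placeholders.

```java
// Sketch of the two text-detection heuristics: local brightness variance and local edge density.
// OpenCV Java bindings and the Canny thresholds are assumptions; the 31x31 window follows the text.
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

public class TextHeuristics {
    private static final Size WINDOW = new Size(31, 31);

    /** Local variance: E[x^2] - (E[x])^2 over a 31x31 neighborhood. */
    public static Mat localVariance(Mat gray) {
        Mat grayF = new Mat();
        gray.convertTo(grayF, CvType.CV_32F);

        Mat mean = new Mat(), squares = new Mat(), meanOfSquares = new Mat();
        Imgproc.boxFilter(grayF, mean, -1, WINDOW);
        Core.multiply(grayF, grayF, squares);
        Imgproc.boxFilter(squares, meanOfSquares, -1, WINDOW);

        Mat meanSquared = new Mat(), variance = new Mat();
        Core.multiply(mean, mean, meanSquared);
        Core.subtract(meanOfSquares, meanSquared, variance);
        return variance;
    }

    /** Edge density: fraction of Canny edge pixels inside a 31x31 neighborhood. */
    public static Mat edgeDensity(Mat gray) {
        Mat edges = new Mat();
        Imgproc.Canny(gray, edges, 50, 150);   // placeholder thresholds

        Mat edgesF = new Mat(), density = new Mat();
        edges.convertTo(edgesF, CvType.CV_32F, 1.0 / 255.0);  // edge pixels become 1.0
        Imgproc.boxFilter(edgesF, density, -1, WINDOW);       // local mean = proportion of edge pixels
        return density;
    }
}
```

Thresholding the variance map at 10% of its maximum and keeping edge-density values within [0.05, 0.5] then yields the binary text-candidate map used below.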

After the candidate text regions are identified as a binary indicator map, we apply image dilation to close small holes and group the text regions based on spatial connectivity. The grouped candidate regions are further filtered by several heuristics (a code sketch is given below):

1. Regions whose orientation deviates from the horizontal direction by more than 10 degrees are rejected.

2. Regions with too large or too small an area are rejected.

3. Regions with an aspect ratio (width/height) less than 2.0 are rejected (texts are assumed to be horizontal).

4. Regions with solidity less than 60% are rejected (these regions are too far from being rectangular).

Figure 7 shows the identified text regions and the final result after property filtering (yellow labels).

Figure 7: The identified potential text regions from analyzing the two heuristics shown in Figure 6. The regions with yellow number labels are the final ones that passed the filtering on region properties.

Finally, we take bounding boxes of the accepted text regions and window them out as individual text blocks. The resulting text blocks are shown in Figure 8.

Figure 8: The six individual text block images extracted using the region information shown in Figure 7, and their corresponding binarized images.
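The property filtering can be sketched as connected-component analysis on the dilated candidate map. The version below, again using OpenCV's Java bindings as an assumption, checks area, aspect ratio and solidity; the orientation check (rejecting regions more than 10 degrees off horizontal) is omitted for brevity, and the numeric area bounds are placeholders.

```java
// Condensed sketch of the region filters: area, aspect ratio and solidity checks on the
// connected components of the dilated text-candidate map. OpenCV Java bindings and the
// numeric area bounds are assumptions; the orientation check is omitted here.
import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Mat;
import org.opencv.core.MatOfInt;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Point;
import org.opencv.core.Rect;
import org.opencv.imgproc.Imgproc;

public class RegionFilter {
    public static List<Rect> acceptedTextBoxes(Mat binaryCandidates) {
        List<MatOfPoint> contours = new ArrayList<>();
        Imgproc.findContours(binaryCandidates, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        List<Rect> accepted = new ArrayList<>();
        for (MatOfPoint contour : contours) {
            Rect box = Imgproc.boundingRect(contour);
            double area = Imgproc.contourArea(contour);
            double aspect = (double) box.width / box.height;

            // Solidity: region area divided by the area of its convex hull.
            MatOfInt hullIdx = new MatOfInt();
            Imgproc.convexHull(contour, hullIdx);
            Point[] pts = contour.toArray();
            Point[] hullPts = new Point[(int) hullIdx.total()];
            for (int i = 0; i < hullPts.length; i++) {
                hullPts[i] = pts[(int) hullIdx.get(i, 0)[0]];
            }
            double hullArea = Imgproc.contourArea(new MatOfPoint(hullPts));
            double solidity = hullArea > 0 ? area / hullArea : 0;

            // Aspect >= 2.0 and solidity >= 60% follow the text; area bounds are placeholders.
            if (area > 500 && area < 200000 && aspect >= 2.0 && solidity >= 0.6) {
                accepted.add(box);
            }
        }
        return accepted;
    }
}
```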

We could apply more sophisticated schemes to improve the text detection results; nonetheless, within the scope of our project we found the results sufficient for our purpose: to correctly identify the key event information from these camera images.

C. Binarization

We binarize each text block separately using Otsu's method, which essentially achieves locally adaptive binarization. We noticed that Tesseract is able to read both black text on a white background and vice versa, so our program does not need to make that distinction.
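Per-block Otsu thresholding is a one-call operation in most image processing libraries; the short sketch below again assumes OpenCV's Java bindings rather than our actual implementation.

```java
// Sketch of Section III.C: binarize each extracted text block with Otsu's method.
// OpenCV Java bindings are an assumption.
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

public class BlockBinarizer {
    public static Mat binarize(Mat grayTextBlock) {
        Mat binary = new Mat();
        // Otsu picks a threshold per block, which amounts to locally adaptive binarization overall.
        Imgproc.threshold(grayTextBlock, binary, 0, 255,
                Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU);
        return binary;
    }
}
```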
IV. OPTICAL CHARACTER RECOGNITION

A. Tesseract

Tesseract is an open-source OCR engine developed at HP Labs and now maintained by Google. It is one of the most accurate open-source OCR engines, with the ability to read a wide variety of image formats and convert them to text in over 60 languages [3]. The Tesseract library is used on our server.

The Tesseract algorithm assumes its input is a binary image and does its own preprocessing first, followed by a recognition stage. An adaptive classifier is used to identify the words.

B. Server-Side OCR

On our server, the input to the Tesseract engine is not the entire image but the set of identified text locations indicated by the bounding boxes detected by our algorithm. The resulting text of each region is then concatenated as the final output of OCR. Though there may be some false-positive regions, we consider the OCR output correct as long as the date and venue information are included in the final output text.
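As an illustration only (the exact server-side invocation is not detailed in this report), one simple possibility is to crop each detected bounding box, write it to a temporary image file and call the tesseract command-line tool per region, concatenating the per-region results.

```java
// Sketch of Section IV.B: run Tesseract on each detected text region and concatenate the output.
// Calling the `tesseract` command-line tool (tesseract input.png outbase -> outbase.txt) is an
// assumption; the report only states that the Tesseract library runs on the server.
import java.io.File;
import java.nio.file.Files;
import java.util.List;
import org.opencv.core.Mat;
import org.opencv.core.Rect;
import org.opencv.imgcodecs.Imgcodecs;

public class RegionOcr {
    public static String ocrRegions(Mat rectifiedGray, List<Rect> textBoxes) throws Exception {
        StringBuilder allText = new StringBuilder();
        for (Rect box : textBoxes) {
            // Crop the region and write it to a temporary PNG for Tesseract.
            Mat crop = new Mat(rectifiedGray, box);
            File png = File.createTempFile("region", ".png");
            Imgcodecs.imwrite(png.getAbsolutePath(), crop);

            // Recognized text is written to <outBase>.txt by the tesseract CLI.
            String outBase = png.getAbsolutePath().replace(".png", "");
            new ProcessBuilder("tesseract", png.getAbsolutePath(), outBase)
                    .inheritIO().start().waitFor();

            allText.append(new String(Files.readAllBytes(new File(outBase + ".txt").toPath())));
            allText.append('\n');
        }
        return allText.toString();  // pushed back to the Android client
    }
}
```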

V. POSTPROCESSING

A. Text Information Extraction

Once we get the text from a poster via Tesseract OCR, we want to extract the time and location information from it. For time extraction, we use a time parser library in Java [4]. The parser extracts all times from the given string; we then filter those times based on the matching text and text position and choose a start time and an end time. A standard time we obtain looks like "Tue Feb 04 18:30:00 PST 2014".
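The library referenced in [4] is the natty date parser; a minimal usage sketch is shown below (the position-based filtering described above is omitted).

```java
// Sketch of the time-extraction step using the natty parser [4].
// Only the basic parse-and-collect part is shown; the position-based filtering
// described in the text is omitted.
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import com.joestelmach.natty.DateGroup;
import com.joestelmach.natty.Parser;

public class TimeExtractor {
    public static List<Date> extractTimes(String ocrText) {
        Parser parser = new Parser();
        List<Date> times = new ArrayList<>();
        // Each DateGroup corresponds to one matched fragment of the input text.
        for (DateGroup group : parser.parse(ocrText)) {
            times.addAll(group.getDates());
        }
        return times;  // e.g. [Tue Feb 04 18:30:00 PST 2014, ...]
    }
}
```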
For location extraction, we first do string matching against common places on the Stanford campus, such as "Gates", "Packard" and "Huang". Then we match keywords that indicate locations, such as "building" and "room". If numbers follow such a keyword, we append them to the keyword; otherwise, we take the words preceding the keyword and insert them in front of it. Standard locations we obtain look like "Gates Room 104" or "Huang Mackenzie Room".
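An illustrative, simplified reimplementation of these matching rules (not our exact code, and with a shortened place list) could look as follows.

```java
// Illustrative sketch of the location-extraction rules described above (not the report's code).
// The place list is shortened and the regular expressions are assumptions.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LocationExtractor {
    // Known campus places, matched first.
    private static final Pattern PLACE = Pattern.compile("\\b(Gates|Packard|Huang)\\b");
    // Location keywords optionally followed by a number ("Room 104"),
    // or preceded by a word ("Mackenzie Room").
    private static final Pattern KEYWORD =
            Pattern.compile("(?:(\\w+)\\s+)?(Building|Room|Hall)(?:\\s+(\\d+))?",
                    Pattern.CASE_INSENSITIVE);

    public static String extractLocation(String text) {
        StringBuilder location = new StringBuilder();
        Matcher place = PLACE.matcher(text);
        if (place.find()) location.append(place.group(1));

        Matcher kw = KEYWORD.matcher(text);
        if (kw.find()) {
            if (location.length() > 0) location.append(' ');
            if (kw.group(3) != null) {
                location.append(kw.group(2)).append(' ').append(kw.group(3)); // "Room 104"
            } else if (kw.group(1) != null) {
                location.append(kw.group(1)).append(' ').append(kw.group(2)); // "Mackenzie Room"
            } else {
                location.append(kw.group(2));
            }
        }
        return location.toString();  // e.g. "Gates Room 104"
    }
}
```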
B. Adding Event Information to Calendar

After the event information is extracted from the string, we add it to Google Calendar. On the user side, once he or she has finished taking a photo, a calendar window appears on the phone with the captured location and time information already filled in. Figure 10(e) shows the app's appearance.
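One common way to realize this behavior on Android, shown below as an illustrative sketch rather than our exact code, is to launch a pre-filled calendar insert screen with an ACTION_INSERT Intent and CalendarContract extras.

```java
// Sketch of Section V.B: open a pre-filled calendar event screen on the device.
// Using CalendarContract with an ACTION_INSERT Intent is an assumption; the report only
// states that a calendar window appears with the extracted time and location filled in.
import android.content.Context;
import android.content.Intent;
import android.provider.CalendarContract;
import java.util.Date;

public class CalendarHelper {
    public static void openPrefilledEvent(Context context, String title, String location,
                                           Date start, Date end) {
        Intent intent = new Intent(Intent.ACTION_INSERT)
                .setData(CalendarContract.Events.CONTENT_URI)
                .putExtra(CalendarContract.Events.TITLE, title)
                .putExtra(CalendarContract.Events.EVENT_LOCATION, location)
                .putExtra(CalendarContract.EXTRA_EVENT_BEGIN_TIME, start.getTime())
                .putExtra(CalendarContract.EXTRA_EVENT_END_TIME, end.getTime());
        context.startActivity(intent);  // the user confirms and saves the event
    }
}
```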
VI. RESULTS

In this section, we present the output of each step for several sample input images. Figure 10 shows a case where the image is taken with perspective distortion and the paper is uneven.

For these test cases, we can see that in the first step our algorithm is able to find the bounding quadrilateral and rectify the image. In the next step, we correctly find the text regions. OCR returns a quite good result when we are able to feed it proper text regions. Finally, our application is able to get the correct time and location from the text and add them to the calendar.

Our algorithm will sometimes fail when the poster background is too complex and we are not able to identify text regions in the poster, such as for the input image in Figure 9. We are able to correctly rectify the image, but since the background is too cluttered, we cannot find the text regions based on variance, edge density or region geometry. Therefore, we get incorrect results in this case.

Figure 9: Sample input image with an incorrect result.

Figure 10: Result images of each step for a sample input image. (a) is the input image, (b) is the rectified image, (c) is the text crops that we identify, (d) is the result text of OCR, and (e) is a screenshot of our app.

VII. FUTURE WORK

In our system, we only extract the time and location from posters. It would be great if we could also extract the main topic of a poster. We could detect the topic by considering the font size of the text in the image, regarding text with a bigger font as the topic. In addition, we could improve the robustness of our system by applying techniques such as the Stroke Width Transform to improve text detection, and by performing better text segmentation on cluttered backgrounds rather than simply using Otsu's method.

VIII. CONCLUSION

In this project, we have implemented an Android app that is capable of capturing a poster image, uploading the image to a server, downloading the processed text from the server, extracting the event information and finally adding it to Google Calendar. Many image processing methods are used to preprocess the image, including Canny edge detection, histogram equalization and rectifying homography. Two heuristic methods and various filters are used to detect the text areas, Tesseract is used to recognize the text, and natural language processing methods are used to extract the event information.

Though our algorithm is a heuristic one, it is observed to be robust to difficult conditions such as partial occlusions, complex backgrounds, different camera perspectives and uneven illumination. Compared to the raw OCR result, our result shows an obvious improvement in accuracy.

REFERENCES

[1] https://stacks.stanford.edu/file/druid:np318ty6250/Sharma_Fujii_Automatic_Contact_Importer.pdf
[2] http://www.stanford.edu/class/ee368/Android/Tutorial-3-Server-Client-Communication-for-Android.pdf
[3] https://code.google.com/p/tesseract-ocr/
[4] http://natty.joestelmach.com
APPENDIX
The work presented in this report and the accompanying source code is the result of the joint work of the following students.
• Yang Zhang proposed the initial project idea and was in charge of implementing the image preprocessing and text block detection and extraction components. He also produced the demo video with the help of the other team members.
• Hao Zhang developed a prototype of the Android app to run the Tesseract OCR engine, integrated the Google Calendar component with the app, and also built the server part of the project.
• Haoran Li implemented the post-processing part and
made the poster.
• Joint efforts: Most of the brainstorming and debugging was done jointly by all members, and all members created the final report together.
