Local Event Retrieval

This document discusses an approach for local event retrieval using social media as a sensor to detect real-world events in real-time. It presents a framework that ranks tuples of location and time based on how likely they represent the starting time and location of a relevant event for a given query. The framework defines two components - one based on topic relevance between tweets from a location and time to the query, and another based on changes in tweeting rate that may indicate an event. Tweets are aggregated using CombSUM voting to estimate topic relevance scores, while tweeting rate changes are quantified to estimate event likelihood scores, which are combined in a linear ranking function.

Uploaded by

jeyalakshmi


Introduction
• It has been suggested that a large proportion of queries submitted
to web search engines have a “local intent” and that these queries
compose the majority of searches submitted from mobile phones.
• Examples of information needs expressed by such queries include
“what is happening near me?” or “finding restaurants in the Covent
Garden district”.
• The prevalence of such queries highlights the importance of
building effective local search tools that serve this type of
information need.
• In this section, an approach for local event retrieval is presented
where social media is considered as a social sensor to detect events
in real-time.
Social Sensors for Local Event Retrieval
• Our motivation stems from the fact that communities of
users on Twitter often share messages about local events as they
progress.
• The plot shows how local events are reflected in social media: it
gives the volume of tweets that were posted within London and contain
the phrase “beach boys” over a period of 12 days, where “beach
boys” is the name of a rock band who held a concert in London’s
Royal Albert Hall during the considered time period.
• We observe that just before and during the concert, tweets
mentioning the “beach boys” within London spiked.
• This is an indication that the concert as a real-world event has
been reflected in the tweeting activities within the city.
Figure: Volume of tweets in London that contain the phrase “beach boys” over time.
Attempts to Harness Social Media for IR
• Previous attempts include (i) identifying social media content relevant to known events, and
• (ii) detecting unknown events using user-generated content in social media.
• In the first case, social media content is identified to provide users with
more information about a planned event (e.g. a festival or a football match).
• Users would be able, for example, to access tweets about ticket prices before
the event, or Flickr photos posted by attendees after the event.
• The second case is more challenging as there is no prior knowledge about
the events.
• While some approaches have focused on detecting news-related
events or simply clustering social media content based on a database of
targeted events, a recent work has devised methods for retrieving global
events from Twitter archives that correspond to an arbitrary query (event
type), a problem which the authors called “structured event retrieval” over
Twitter.
• Unlike that work, which focused on non-local events, we make use of the
opportunities that social media can bring to local search services.
• In particular, we define a new localized IR task that extends the
aforementioned structured event retrieval task.
• The task we propose aims at identifying and ranking local events
based on social media activities in the area where the events
occur.
• In other words, we use social media as a social sensor to detect
local events in real-time.
• The work presented here advances the state of the art in
detecting and locating unknown events in social media and
proposes a new IR task of local event retrieval.
Problem Formulation
• Our overall goal is to identify and rank local events
happening in the real-world as a response to a user query.
• For a formal definition of a local event, we adopt a definition
that has been previously used in the new event detection
broadcast news task of the TDT (Topic Detection and
Tracking) evaluation forum.
• This definition states that an event is something that occurs
in a certain place at a certain time. Formally, we consider a
set of locations L = {l1, l2, . . .} that are of interest to the user.
• The granularity of locations can vary from buildings and
streets to entire cities.
Contd..
• For example, we might consider each location to represent an area
in a city in which the user is located.
• The city in this case is considered to be divided into equally sized
areas specified by polygons of geographical coordinates, or we can
use the divisions defined by the local authority such as postcodes or
boroughs.
• Each location li at a certain time tj is denoted by the tuple <li, tj>.
• We define the problem of local event retrieval as follows.
• For a user interested in local events within locations L (explicitly
defined or implicitly inferred from the current user’s location), the
event retrieval framework aims to score tuples <li, tj> according to
how likely tj represents a starting time of an event within the
location li that matches the user query.
Contd..
• An event is considered relevant if it matches the explicit
query of the user and/or the implicit context of the user (the
time of the query, the location of the user and/or their profile).
• In other words, the event retrieval framework defines a
ranking function that gives a score R(q, li, tj) for each tuple
<li, tj> with regard to the user’s query q.
• Examples of events to retrieve include festivals, football
matches or security incidents.
• When expressed explicitly by a user, a query is assumed to be
in the form of a bag of words (e.g. “live music”,
“conference”).
• When using Twitter as a social sensor, a location li at a certain time tj is
characterised by the tweeting activities observed at that location within a
given timeframe (tj − tj−1).
• The tweeting activities are represented with a set of tweets originating from
that location shared publicly within the given timeframe (tj −tj−1).
• This set of tweets is denoted by Ti,j . Note that the fixed timeframe is defined
using an arbitrary sampling rate θ; ∀j : tj − tj−1 = θ.
• An event happening in the real-world is represented by a tuple <l, ts, tf >;
where l is the location where the event is taking place, ts is the starting time
and tf is the finishing time.
• Our aim is to use the tweeting activities as the main source of evidence to
define the ranking function R(q, <li, tj>).
• More specifically, to define the ranking function we use the set of tweets
Ti,j and a time series of tweet sets <. . ., Ti,j−2, Ti,j−1, Ti,j> observed in the
location li up to the current time tj.
• This allows us to identify sudden changes in the tweeting activities, which
may have been triggered by an occurrence of an event.
• Moreover, the event retrieval framework can identify a subset of the tweet set
Ti,j that matches the query, which may help the user in the event information
seeking process.
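The construction of the tweet sets Ti,j can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Tweet` record, the `grid_cell` and `bucket_tweets` helpers, the cell size in degrees, and the default θ of 60 seconds are all assumptions made for the example. A fixed grid in degrees only approximates "equally sized" areas, and postcodes or boroughs could be substituted for it.

```python
from collections import defaultdict
from typing import NamedTuple


class Tweet(NamedTuple):
    """Hypothetical geotagged tweet record (not a real Twitter API type)."""
    text: str
    lat: float
    lon: float
    timestamp: float  # seconds since an arbitrary epoch


def grid_cell(lat, lon, cell_deg=0.01):
    """Map coordinates to a grid cell that plays the role of a location l_i."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def bucket_tweets(tweets, theta=60.0, cell_deg=0.01):
    """Group tweets into sets T[(l_i, j)].

    Bucket j covers the timeframe [j * theta, (j + 1) * theta),
    so consecutive buckets are exactly theta seconds apart.
    """
    buckets = defaultdict(list)
    for tweet in tweets:
        j = int(tweet.timestamp // theta)
        cell = grid_cell(tweet.lat, tweet.lon, cell_deg)
        buckets[(cell, j)].append(tweet)
    return buckets
```

Each key of the returned dictionary corresponds to one tuple <li, tj>, and the list stored under it is the tweet set Ti,j for that tuple.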
A Framework for Event Retrieval
• The framework aims to define an effective ranking function that scores
tuples of time and location according to how likely they represent the
starting time and the location of a relevant event for a given query.
• Note that with regard to the previous definition of the local event
retrieval problem in Section 3.3.2, as a first step, we are not aiming to
determine the finishing time of an event.
• As discussed in Section 3.3.2, here we aim to use tweets as the main source
of evidence to score the tuples. In particular, we define two components
built on this evidence:
• 1. The first component is based on the intuition that social media may reflect
real-world events, hence when an event occurs somewhere we expect to
find topically related social posts about it originating from the location
where it occurs. To instantiate this component, for each location at a
given time, i.e. for each tuple <li, tj>, we measure how much the tweets
Ti,j corresponding to the tuple are topically related to the query q.
• 2. The second component is based on the intuition that events trigger
an increasing tweeting activity [66], causing peaks of tweeting rates
during the event (bursts). For this component, we aim to quantify the
change in the tweeting rate, i.e. the volume of tweets over time, observed
at <li, tj> when compared to previous observations over time at the
same location. In other words, we aim to measure the unusual tweeting
behaviour that may indicate an occurrence of an event. To compute the
tweeting rate, we can either consider all the tweets posted within the
given timeframe at the given location or only a subset of those which
are relevant to the user query, e.g. tweets which contain terms of the
query.
• Following this, the ranking function can be defined as a linear combination
of the previous two components as follows:
$$R(q, l_i, t_j) \propto (1 - \lambda) \cdot S(q, T_{i,j}) + \lambda \cdot E(q, l_i, t_j) \tag{3.1}$$
• where S(q, Ti,j) is the score of the tweet set Ti,j that quantifies how much
the tweets are topically related to the query q; E(q, li, tj) is a score proportionate
to the change in the tweeting rate with regard to the query q at the given
time tj within the location li; and 0 ≤ λ ≤ 1 is a parameter to control
the contribution of each component in the linear combination in
Equation (3.1). Next, we show how we approach the problem of
quantifying each component.
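The linear combination in Equation (3.1) can be sketched in a few lines. The function names `rank_score` and `rank_tuples` and the input layout are assumptions for illustration. The text does not say whether S is normalised before it is combined with E, so this sketch takes both component scores as given.

```python
def rank_score(s, e, lam=0.5):
    """Combine topical score s and change score e as (1 - lam) * s + lam * e."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * s + lam * e


def rank_tuples(candidates, lam=0.5):
    """Rank (location, time) tuples by their combined score, best first.

    candidates maps each (location, time) key to a pair (S, E) of
    precomputed component scores.
    """
    scored = [(key, rank_score(s, e, lam)) for key, (s, e) in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Setting `lam=0.0` ranks purely by topical relevance and `lam=1.0` purely by the change in tweeting rate, which makes the effect of λ easy to inspect on a small candidate set.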
Aggregating Tweets
• To estimate S(q, Ti,j) in Equation (3.1), we propose to borrow ideas and
techniques originally designed for the IR problem of expert search. In expert
search, a profile of an expert candidate is typically represented by the
documents associated with the candidate [8, 41]. Similarly, the tuple <li, tj>
is associated with a set of tweets.
• Inspired by [41], the score of each tuple (candidate) can be estimated by
aggregating the retrieval scores (votes) for each tweet (document) associated
with it. In [41], several voting techniques were used to aggregate the scores.
• We use the intuitive, yet effective, CombSUM voting technique, which
estimates the final score of the tweet set representing a tuple (candidate)
as follows:
$$S(q, T_{i,j}) = \sum_{t \in Rel(q) \cap T_{i,j}} Score(q, t) \tag{3.2}$$
• where Rel(q) is the subset of tweets that match the query q and Score(q, t)
is the individual retrieval score obtained by a traditional bag-of-words ranking
function, e.g. BM25 [53]. Higher scores represent more topically related
tweets for the considered tuple.
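The CombSUM aggregation in Equation (3.2) can be sketched as below. The per-tweet scorer `term_overlap_score` is a deliberately simple stand-in for a real ranking function such as BM25 and is an assumption of this example, as are the function names. A tweet is treated as belonging to Rel(q) when its score is positive.

```python
def term_overlap_score(query_terms, tweet_text):
    """Stand-in for Score(q, t): number of query-term occurrences in the tweet."""
    tokens = tweet_text.lower().split()
    return float(sum(tokens.count(term) for term in query_terms))


def comb_sum(query, tweet_texts, score_fn=term_overlap_score):
    """CombSUM: sum the retrieval scores of the tweets that match the query."""
    query_terms = query.lower().split()
    total = 0.0
    for text in tweet_texts:
        score = score_fn(query_terms, text)
        if score > 0:  # the tweet matches the query, i.e. it is in Rel(q)
            total += score
    return total
```

Because `score_fn` is a parameter, a proper BM25 scorer backed by an index could be plugged in without changing the aggregation step.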
Change Point Analysis
• The problem of quantifying the score E(q, li, tj) in
Equation (3.1) maps well to change point analysis.
• Change point analysis aims at identifying points in
time series data where the statistical properties
change.
• It has been previously applied to detect events in
continuous streams of data.
• For example, Guralnik et al. developed change point
detection techniques that can accurately detect
events in traffic sensor data.
Contd..
• In our case, the change point analysis can be applied on the tweeting rate in a location
li to quantify the probability that the tweeting rate at a certain time tj represents a
change point when compared retrospectively to previous points in time tj−1, tj−2, . .,
tj−k.
• We apply Grubbs’ test as a change point detection technique as it is
computationally inexpensive and it has been successfully applied in a similar context,
namely first story detection from Twitter and Wikipedia.
• Given a location li and at each point of time, e.g. on minute intervals, we maintain a
moving window of size k points, e.g. k minutes, over the previous observations.
• We apply Grubbs’ test to each moving window to determine if the tweeting rate of
the last point is an outlier that stands out with respect to the tweeting rates of previous
observations.
• With Grubbs’ test, rj is an outlier if
$$v = \frac{r_j - \bar{x}_{j,k}}{\sigma^2} > z$$
• where $\bar{x}_{j,k}$ is the mean tweeting rate in the window (tj−k, tj),
• σ is the standard deviation of the tweeting rates in the window (tj−k, tj), and
• z is a fixed threshold.
• Note that this test gives a binary decision for each point in time.
• We smooth this binary decision into a normalised score and use it for the second
component of Equation (3.1) as follows:
$$E(q, \langle l_i, t_j \rangle) = E_c(t_j) = 1 - e^{-\frac{\ln 2}{z} \cdot v} \tag{3.3}$$
• where 0 ≤ Ec(tj) ≤ 1 represents a score of a change point using Grubbs’ test.
• Note that when v = z, the resulting score in Equation (3.3) is equal to 0.5.
• The tweeting rate rj can be estimated in two different ways:
• (i) By simply using the volume of tweets posted in the given
location within the timeframe corresponding to tj, i.e. rj = |Ti,j|. We call this
a query independent (QI) tweeting rate; and
• (ii) By using the score of the voting technique described above, i.e. rj = S(q, Ti,j).
We call this a query dependent (QD) tweeting rate.
• It should be noted that this framework can operate in a real-time fashion
on top of the SMART architecture, where social feeds are incrementally indexed such
that the retrieval components are able to provide the freshest results.
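The change point scoring described above can be sketched as follows. This is an illustration under several assumptions, not the authors' code: the window holds the k previous observations and excludes the current point, the population variance is used, a flat window is handled by a special case, and negative deviations are clamped to zero so that the score stays in [0, 1]. The statistic divides by σ² as written in the text; since the classical Grubbs statistic divides by the standard deviation, a `use_variance` flag (a name invented here) switches between the two. The rates passed in may be QI rates (tweet counts) or QD rates (CombSUM scores).

```python
import math
from statistics import mean, pstdev, pvariance


def grubbs_statistic(window, current, use_variance=True):
    """Deviation v of the current rate from the mean of the previous rates.

    use_variance=True divides by sigma^2 (as in the text above);
    use_variance=False divides by sigma (classical Grubbs form).
    """
    if len(window) < 2:
        raise ValueError("window needs at least two observations")
    mu = mean(window)
    denom = pvariance(window) if use_variance else pstdev(window)
    if denom == 0:
        # Assumption: in a perfectly flat window any increase is an outlier.
        return math.inf if current > mu else 0.0
    return (current - mu) / denom


def change_score(v, z):
    """Smooth the binary test v > z into a score that equals 0.5 at v = z."""
    v = max(v, 0.0)  # assumption: drops in the rate are not scored as events
    return 1.0 - math.exp(-math.log(2.0) / z * v)
```

For example, with previous QI rates `[10, 12, 11, 9, 13]` and a current rate of 40, `grubbs_statistic` gives a deviation well above a threshold such as z = 3, and `change_score` maps it to a value close to 1.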
