NDR 2007 A New Approach To Information Discovery: Perry Stokes (Mike Odell) July 20, 2006
A New Approach to Information Discovery
Perry Stokes
(Mike Odell)
Agenda
Who is MetaCarta?
What is Natural Language Processing?
Information Discovery Challenge
Approaches to Information Discovery
Geographic Text Search
What is it?
How does it work?
How is it used?
What is its value?
Who is MetaCarta?
The geographic text search company founded in 1999 by MIT researchers
Offices are in Boston, MA, Washington D.C., and Houston, TX
Energy Customers
Partners
Information Discovery Challenges
Internal Sources
EDMS
Shared Drives
Library Catalogs
Technical Databases
Published Technical Documents
Data in File Servers
Internal Network (peers)
GIS Layers
Satellite Images
Knowledge Portals
Corporate Websites
Archives
email
External Sources
News Events
Science and Industry Content Providers
Approaches to Information Discovery: Document Search
Folder count: 36
3 hours later: still searching
Approaches to Information Discovery: Text Search
Thousands of best hits is still too many to read
Filtering on words alone is insufficient
Time consuming to review
Approaches to Information Discovery: Web Search
The right text search answer isn't always the right human answer.
Find IT Geographical Text Search
Fast 1 second
Can search across multiple collections
Brings results into GIS decision space
Approaches to Information Discovery: GIS
Must know:
what you are looking for
that the information is stored in a database
and that the attributes you are using are included
Lease MC 441
Lease Database
Lease Document
How Does it Work?
Geographic Text Search
Puts documents on the map
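To make "puts documents on the map" concrete, here is a deliberately simplified sketch. It is not MetaCarta's method (the deck attributes that to Natural Language Processing); it only shows the basic idea of matching place names in text against a lookup table and returning coordinates. The `GAZETTEER` table, its approximate coordinates, and the `geotag` function are all hypothetical names invented for illustration.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
# A real system would use a far larger gazetteer and resolve ambiguous names.
GAZETTEER = {
    "angola": (-12.5, 18.5),
    "houston": (29.76, -95.37),
    "boston": (42.36, -71.06),
}


def geotag(text):
    """Return (place name, coordinates) pairs for known places found in text."""
    tags = []
    for word in re.findall(r"[A-Za-z]+", text):
        coords = GAZETTEER.get(word.lower())
        if coords is not None and (word, coords) not in tags:
            tags.append((word, coords))
    return tags


if __name__ == "__main__":
    for place, (lat, lon) in geotag("Drilling update from Houston and Angola"):
        print(f"{place}: lat {lat}, lon {lon}")
```

A plain dictionary lookup cannot tell a place from a person or company with the same name, which is presumably where language processing earns its keep.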
Geospatial Knowledge Discovery
Single search across many sources
Shared Drives
Knowledge Portals
BU-Specific Data
Archives
Corporate Websites
Internal Network (Peers)
Corporate Knowledge Base
Published Documents
Library Catalogs
Internal Spatial Data, Map services
External Subscription Data
GIS Layers
External Free Web Data
Competitive Intelligence
Data Fusion
Commercial Data Sources
Mapping & GIS & Visualization
Document Management Systems
What information do we have?
Standard Text Search
The Impossible Search
Customer Uses
Filter geographically
Filter by time
Competitor Analysis: ChevronTexaco activities in/near Angola within the last 5 years
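The combined geographic and time filter in the competitor-analysis example can be sketched as below. This is only an illustration under stated assumptions: the `Doc` record, the sample documents, and the rough bounding box around Angola are hypothetical, and a real system would filter on true geometries rather than a rectangle.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical record for a document that has already been geotagged."""
    title: str
    lat: float
    lon: float
    published: date


def filter_docs(docs, bbox, earliest):
    """Keep documents located inside bbox and published on or after earliest."""
    south, west, north, east = bbox
    return [
        d for d in docs
        if south <= d.lat <= north
        and west <= d.lon <= east
        and d.published >= earliest
    ]


# Rough, illustrative bounding box around Angola: (south, west, north, east).
ANGOLA_BBOX = (-18.1, 11.6, -4.3, 24.1)

# Made-up sample documents, for demonstration only.
docs = [
    Doc("Offshore block report", -8.8, 13.2, date(2005, 3, 1)),
    Doc("Old survey", -12.0, 15.0, date(1998, 6, 1)),
    Doc("Gulf of Mexico lease note", 28.0, -89.0, date(2006, 1, 15)),
]

# "Within the last 5 years", counted back from the presentation date.
recent = filter_docs(docs, ANGOLA_BBOX, earliest=date(2001, 7, 20))
print([d.title for d in recent])
```

Only the first sample document survives both filters: the second is too old and the third lies outside the box.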
User Comments
"This is a search tool that talks the language of the knowledge worker"
"This is Google for the oil industry"
"Every day, all day long, I need this tool"
"What this gives me is a tool that shows me what I do not know."
Thank you for your time
Questions?
Recap
IHS and MetaCarta
The largest collection of E&P spatial data and use of Natural
Language Processing to locate all your documents using a map
Jointly developed and owned Geographic Data Modules
ROI
Initial trial indicated that at least 2% (1 hour/week) of workers' time could be saved using geographic text search
If implemented across the enterprise, this translates to:
(1 hour per week * 50 weeks) * ($150k FTE per annum / 2000 hours) * 10k employees
50 hours per annum * $75/hour * 10k employees¹ =
¹Assumption:
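The arithmetic on this slide can be checked with a short calculation. The inputs below are the ones stated on the slide; the function and variable names are illustrative, and the printed total is simply the product of those inputs, not a figure quoted from the presentation.

```python
# Inputs as stated on the ROI slide.
HOURS_SAVED_PER_WEEK = 1
WORK_WEEKS_PER_YEAR = 50
FTE_COST_PER_YEAR = 150_000   # dollars per full-time employee per annum
WORK_HOURS_PER_YEAR = 2_000
EMPLOYEES = 10_000


def annual_savings(hours_per_week, weeks, fte_cost, hours_per_year, employees):
    """Hours saved per employee per year, priced at the hourly FTE rate."""
    hours_saved = hours_per_week * weeks        # 50 hours per annum
    hourly_rate = fte_cost / hours_per_year     # $75 per hour
    return hours_saved * hourly_rate * employees


total = annual_savings(
    HOURS_SAVED_PER_WEEK,
    WORK_WEEKS_PER_YEAR,
    FTE_COST_PER_YEAR,
    WORK_HOURS_PER_YEAR,
    EMPLOYEES,
)
print(f"${total:,.0f} per annum")  # prints "$37,500,000 per annum"
```

Because the formula is linear in every input, halving the assumed time saved or the headcount halves the total.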
The Solution
Unstructured Content
E-mail
Messages/Cables
Correspondence
Reports
Analyses
Web Pages
News Feeds
Databases
GIS Systems
Maps & Imagery
Complete Overlap
Company Proprietary
The Problem
Unstructured Content
E-mail
Messages/Cables
Correspondence
Reports
Analyses
Web Pages
News Feeds
Databases
GIS Systems
Maps & Imagery
Trivial Overlap
Area of Pain
The MetaCarta and IHS Relationship