Architecture Brief


Uploaded by travii23
Copyright: © Attribution Non-Commercial (BY-NC)

Logical Architecture

[Diagram: layered stack of components. Labels recovered in reading order, top to bottom:]
• Community Profile service; CMS UI; Global Applications (e.g. Ency, dashboards and others)
• App Services
• Community Database; Search/query Interface (VQL); Language services; Content Indexes (IDX2); Vertical Optimization
• I/O Interface
• Data Services
• Library Manager; Service Intelligence & Management (NOC); Messaging Network; Global Repository (sub of Library Manager); Search Appliance

1 © 2007 Openwater, Inc. all rights reserved


Machine Tagging: Community Database

Openwater Index

Search Index

Add information to the index and maintain the link
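
One way to picture "add information to the index and maintain the link" is a pair of maps that stay in sync: tags per document, and documents per tag. The sketch below is only an illustration under that assumption; `TaggedIndex` and its methods are hypothetical names and do not describe the Openwater Index itself.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedIndex:
    """Toy stand-in for a search index enriched with machine tags (hypothetical)."""

    # doc_id -> original text (the "search index" side)
    documents: dict[str, str] = field(default_factory=dict)
    # doc_id -> tags added by machine tagging
    tags_by_doc: dict[str, set[str]] = field(default_factory=dict)
    # tag -> doc_ids: the reverse link used at query time
    docs_by_tag: dict[str, set[str]] = field(default_factory=dict)

    def add_document(self, doc_id: str, text: str) -> None:
        self.documents[doc_id] = text
        self.tags_by_doc.setdefault(doc_id, set())

    def add_tag(self, doc_id: str, tag: str) -> None:
        """Attach a tag and record the link in both directions."""
        if doc_id not in self.documents:
            raise KeyError(f"unknown document: {doc_id}")
        self.tags_by_doc[doc_id].add(tag)
        self.docs_by_tag.setdefault(tag, set()).add(doc_id)

    def remove_document(self, doc_id: str) -> None:
        """Drop a document and every link that pointed at it."""
        for tag in self.tags_by_doc.pop(doc_id, set()):
            self.docs_by_tag[tag].discard(doc_id)
            if not self.docs_by_tag[tag]:
                del self.docs_by_tag[tag]
        self.documents.pop(doc_id, None)

    def find(self, tag: str) -> list[str]:
        return sorted(self.docs_by_tag.get(tag, set()))
```

Keeping both directions means a tag lookup never returns a document that has since been removed from the search index.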



Machine Tagging

• Uses open-source natural language processing (GATE) and machine learning (SVM) to extract information from unstructured information:
   • Most important concepts for the network
   • Acronyms and their relation to terms
   • Text classification (spam; whether a forum post is a question, an answer, a "me too", ...)
• Takes inputs from structured information:
   • CMS provides names of people, email addresses, user names
   • HTML title/heading/... tags indicate a higher chance of a concept
   • Network classes representing the product taxonomy
• Feedback loop through user feedback:
   • How often users view, link to, or promote to topic the extracted concepts
   • How well concepts overlap with search strings
   • Users correct spam/text classification
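
To make "acronyms and their relation to terms" concrete, here is a minimal sketch that uses a plain regular expression in place of the GATE/SVM pipeline named above. It only catches the common "Long Form (LF)" pattern, and the function names `extract_acronyms` and `expand_query` are hypothetical.

```python
import re

# A parenthesised run of 2-6 capital letters, e.g. "(SVM)".
ACRONYM_RE = re.compile(r"\(([A-Z]{2,6})\)")


def extract_acronyms(text: str) -> dict[str, str]:
    """Map each acronym to the preceding words whose initials spell it."""
    found: dict[str, str] = {}
    for match in ACRONYM_RE.finditer(text):
        acronym = match.group(1)
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[: match.start()])
        candidate = words[-len(acronym):]
        initials = "".join(word[0].upper() for word in candidate)
        if len(candidate) == len(acronym) and initials == acronym:
            found[acronym] = " ".join(candidate)
    return found


def expand_query(query: str, acronyms: dict[str, str]) -> str:
    """Append the acronym for any spelled-out term found in the query."""
    lowered = query.lower()
    present = set(query.split())
    extra = [
        acronym
        for acronym, term in acronyms.items()
        if term.lower() in lowered and acronym not in present
    ]
    return " ".join([query, *extra])
```

For example, a post containing "Support Vector Machine (SVM)" yields the pair SVM / Support Vector Machine, and a query that spells the term out gets "SVM" appended so that it can also match posts that only use the acronym.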



Forum Benchmark

• Scope is 800 posts in the JI Forum
• Scoring rewards the correct topic high in the results list, and answers over questions
• Currently understands Jive, Joomla!, and vBulletin forums
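
The slide does not give the scoring formula. As a rough sketch of a score that "rewards the correct topic high in the results list, and answers over questions", the example below discounts each hit by its rank and weights it by post type. The weights, the 1/rank discount, and the names `Result` and `score_query` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights: the slide only says answers rank above questions.
POST_TYPE_WEIGHT = {"answer": 1.0, "question": 0.5}
OTHER_WEIGHT = 0.25  # assumed weight for any other post type


@dataclass(frozen=True)
class Result:
    topic: str      # topic (thread) the post belongs to
    post_type: str  # "answer", "question", or anything else


def score_query(results: list[Result], correct_topic: str) -> float:
    """Best rank-discounted, type-weighted hit on the correct topic (0.0 if none)."""
    return max(
        (
            POST_TYPE_WEIGHT.get(result.post_type, OTHER_WEIGHT) / rank
            for rank, result in enumerate(results, start=1)
            if result.topic == correct_topic
        ),
        default=0.0,
    )
```

Under these assumed weights, an answer from the correct topic in first position scores 1.0, while a question from the correct topic in first position scores 0.5.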

JasperIntelligence Benchmark results (6 posts per page)

Query set / configuration                              Relative score   Comments
Each subject as a query
   Standard OPIC                                       0.685            baseline; would be much worse at 1 post per page
   Forum link structure                                0.942            37% better
   Post type classification                            0.984            44% better
Subjects with acronyms spelled out (novice queries)    0.166            baseline
   Acronyms added back in                              0.753            4.5× baseline (about 354% better)

• Similar trends on Ingres and Oracle forums
• Just the start: rapid improvements in acronym detection & classification
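
The relative figures in the comments column can be recomputed from the scores. The helper below is a small sketch of that arithmetic; `percent_better` is a hypothetical name. Note that the acronym row's ratio is about 4.5×, which is a gain of about 354% when measured the same way as the other rows.

```python
def percent_better(score: float, baseline: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (score - baseline) / baseline * 100.0


# Scores taken from the benchmark table.
link_structure = percent_better(0.942, 0.685)  # about 37.5
post_type = percent_better(0.984, 0.685)       # about 43.6
acronyms = percent_better(0.753, 0.166)        # about 353.6
acronym_ratio = 0.753 / 0.166                  # about 4.5
```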

