02 - Lect2 Biomedical IR

Uploaded by

Mahmoud Nasser


Biomedical Information Retrieval
Search Engine Architecture
Lecture 2

Dr. Ebtsam AbdelHakam

Minia University
Introduction
Retrieving relevant information from biomedical text is a challenging
and relatively new area of research. Thousands of articles are added
to the biomedical literature each year, and this large collection of
publications offers an excellent opportunity for discovering hidden
biomedical knowledge by applying information retrieval (IR) and
natural language processing (NLP) technologies.

Biomedical text processing differs from general text processing. It requires
specialized handling because of its complex medical terminology.
Identifying and normalizing medical entities is itself a research
problem, and the relationships among medical entities affect any
system built on them.

The medical field has various types of queries: short questions, medical
case reports, medical case narratives, verbose medical queries,
community questions, semi-structured queries, etc. The diverse
nature of medical data demands special attention from IR and
NLP.
Requirements of Designing a Search Engine

The two primary requirements of a search engine are:

• Effectiveness (quality): We want to retrieve the most relevant set of documents possible for a query.

• Efficiency (speed): We want to process queries from users as quickly as possible.
Designing a Search Engine

Search engine design balances two factors:

‣ Effectiveness – accuracy of results, presentation of results, absence of spam, good ad selection

‣ Efficiency / Performance – response time, concurrency, disaster mitigation, security issues

These factors deeply impact the architecture of these systems. Often the engineering solutions feed back into research (NoSQL, MapReduce, etc.).
Search Engine Basic Building Blocks

Search engine components support two major functions:

1- The indexing process: builds the structures that enable searching.
The index (inverted index) is an efficient data structure that represents the documents of a corpus and allows fast searching of those documents using the indexed information.

2- The query process: uses those structures (the index) and a person’s query to produce a ranked list of documents.
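The two functions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how a production engine is built: the whitespace tokenizer, the three-document corpus, and the function names `build_inverted_index` and `search` are all hypothetical, and the query step here returns an unranked list of matches.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(corpus):
    """Indexing process: map each term to the sorted ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in corpus.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Query process (unranked): ids of documents that contain every query term."""
    term_sets = [set(index.get(term, [])) for term in tokenize(query)]
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))


# Hypothetical corpus, used only for illustration.
corpus = {
    1: "aspirin reduces fever",
    2: "aspirin and ibuprofen reduce pain",
    3: "fever is a common symptom",
}
index = build_inverted_index(corpus)
print(index["aspirin"])                # [1, 2]
print(search(index, "aspirin fever"))  # [1]
```

The lookup touches only the postings of the query terms, which is why the index, rather than a scan of every document, makes searching fast.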
Query process

1. User interaction
It supports the creation and refinement of the user query and displays the results.

2. Ranking
It uses the query and the indexes to create a ranked list of documents.

3. Evaluation
It monitors and measures effectiveness and efficiency. It is done primarily offline.
Query Process (User Interaction)
• The user interaction component provides the interface between the person doing the searching and the search engine.

Its four tasks are:

1- Accepting the user’s query, written in a defined query language, and transforming it into index terms.

2- Query transformation: The user interface parses the user’s query and converts the search terms into a form that is acceptable as input to the query engine, i.e. into index terms that appear in the index vocabulary.
User Interaction Component

3- Spell checking, query suggestion, and query refinement.
‣ Query expansion adds terms related to the query terms (e.g. synonyms, related entities)
‣ Relevance feedback runs an initial query, then uses the top-ranked documents to expand the query for a second run

4- Taking the ranked list of documents from the search engine and organizing it into the results shown to the user.
‣ Displays the top-ranked results
‣ Generates snippets to show how queries match documents
‣ Highlights important words and passages
‣ Retrieves query-relevant advertising
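As a rough illustration of two of these tasks, the sketch below turns a raw query into index terms and marks the matching words in a result. The stopword list, the `*...*` highlight markers, and the function names `to_index_terms` and `highlight` are assumptions made for this example; real engines typically add steps such as stemming and terminology normalization.

```python
import re

# Assumed stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def to_index_terms(query):
    """Query transformation: lowercase, tokenize, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return [t for t in tokens if t not in STOPWORDS]


def highlight(text, terms):
    """Wrap every whole-word occurrence of a query term in *...* markers."""
    if not terms:
        return text
    pattern = r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.sub(pattern, r"*\1*", text, flags=re.IGNORECASE)


terms = to_index_terms("Treatment of the Fever in children")
print(terms)  # ['treatment', 'fever', 'children']
print(highlight("Aspirin lowers fever in children.", terms))
# Aspirin lowers *fever* in *children*.
```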
What is Query Expansion?

• Query expansion is the term used when a search engine adds search terms to a user’s weighted search.
• The goal is to improve precision and/or recall.

• Example: User query: “car”; expanded query: “car cars automobile automobiles auto”, etc.
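The "car" example can be reproduced with a small lookup table. This is only a sketch: the table `RELATED_TERMS` and the function `expand_query` are hypothetical, and a real system might draw related terms from a thesaurus, a medical vocabulary, or query logs instead of a hand-written dictionary.

```python
# Hypothetical table of related terms, for illustration only.
RELATED_TERMS = {
    "car": ["cars", "automobile", "automobiles", "auto"],
}


def expand_query(terms, related=RELATED_TERMS):
    """Return the original terms plus their related terms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + related.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query(["car"]))
# ['car', 'cars', 'automobile', 'automobiles', 'auto']
```

Adding terms usually raises recall because more documents can match; whether precision also improves depends on how closely the added terms track the user's intent.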
Query Process (Ranking)
Ranking Component

• The ranking component is the core of the search engine.

• It takes the transformed query from the user interaction component and generates a ranked list of documents using scores based on a retrieval model.

• Ranking must be both efficient, since many queries may need to be processed in a short time, and effective, since the quality of the ranking determines whether the search engine accomplishes the goal of finding relevant information.

• The efficiency of ranking depends on the indexes.

• The effectiveness depends on the retrieval model.

Ranking
Document scoring

‣ A score is assigned to each document based on how well it matches the query.

‣ Scoring is a core component of a search engine, and often its most closely guarded secret.

‣ Many approaches and variations have been developed.

‣ The basic form is the dot product of query term weights and corresponding document weights:

$$\mathrm{score}(Q, D) = \sum_i q_i \, d_i$$

where q_i is the query weight and d_i is the document weight of the i-th term.
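A minimal sketch of dot-product scoring follows. The slides do not say how the weights are computed, so the choices here are assumptions: query weights are raw term counts, document weights are term frequency times a simple inverse document frequency, and the names `idf`, `score`, and `rank` and the tiny tokenized corpus are hypothetical.

```python
import math
from collections import Counter


def idf(term, docs):
    """Inverse document frequency; one common weighting choice, not the only one."""
    df = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(len(docs) / df) if df else 0.0


def score(query_weights, doc_weights):
    """Dot product over the terms that the query and the document share."""
    return sum(w * doc_weights.get(term, 0.0) for term, w in query_weights.items())


def rank(query, docs):
    """Return document ids ordered by decreasing score (ties broken by id)."""
    query_weights = Counter(query.lower().split())
    scored = []
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        doc_weights = {t: tf[t] * idf(t, docs) for t in tf}
        scored.append((score(query_weights, doc_weights), doc_id))
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical tokenized corpus, for illustration only.
docs = {
    1: "aspirin reduces fever".split(),
    2: "aspirin aspirin dosage".split(),
    3: "fever is a symptom".split(),
}
print(rank("aspirin dosage", docs))  # [2, 1, 3]
```

Document 2 ranks first because it contains both query terms, and "dosage" is rare in the corpus, so it carries a higher weight.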
Query Process (Evaluation)
Evaluation component
• The task of the evaluation component is to measure and monitor effectiveness and efficiency.

• An important part of that is to record and analyze user behavior using log data.

• The results of evaluation are used to tune and improve the ranking component.

• Most of the evaluation component is not part of the online search engine, apart from logging user and system data.

• Evaluation is primarily an offline activity, but it is a critical part of any search application.
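One hedged way to make the two measurements concrete: effectiveness can be estimated offline as precision and recall against a set of documents judged relevant, and efficiency as the time a query takes. The slides name precision and recall only as goals, so the relevance judgments, the cutoff `k`, and the helpers `precision_recall_at_k` and `timed` below are assumptions for illustration.

```python
import time


def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall of the top-k results against judged-relevant ids."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def timed(search_fn, query):
    """Run a search function and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = search_fn(query)
    return results, time.perf_counter() - start


# Hypothetical ranking and relevance judgments, for illustration only.
ranked = [3, 1, 7, 5]
relevant = {1, 5, 9}
precision, recall = precision_recall_at_k(ranked, relevant, k=4)
print(precision)  # 0.5
```

Here two of the four returned documents are relevant (precision 0.5), and two of the three relevant documents were found (recall 2/3).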
Evaluation component
• Logging

‣ Logging user interaction is an essential tool for measuring performance.

‣ Query logs and clickthrough data are used for query suggestion, spell checking, query caching, ranking, advertising search, …

• Query logs of users’ interactions with the search engine are of paramount importance.

• They can improve the search experience, speed up results, store the results of common queries, and identify sources of new revenue.
Evaluation component

• Pages that are clicked or ignored might be logged to improve the overall quality of the search engine and also to detect patterns in user activity (i.e. data mining).

• Query logs can be used for a variety of other purposes, including:
1. Keeping track of a history of user queries
2. Generating spell-checking logs (instead of running the spellchecker every time)
3. Recording the time spent on a query or a particular document
4. Supporting query suggestion, spell checking, query caching, ranking, and advertising search from query logs and clickthrough data
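A small sketch of how such logs might be mined. The log format (a query paired with the clicked document id, or `None` when nothing was clicked), the sample records, and the function names are all hypothetical; real logs carry far more detail, such as timestamps and session ids.

```python
from collections import Counter

# Hypothetical log records: (query, clicked_doc_id or None if nothing was clicked).
LOG = [
    ("aspirin dosage", 12),
    ("aspirin dosage", 12),
    ("asprin dosage", None),
    ("fever in children", 7),
    ("aspirin dosage", 30),
]


def top_queries(log, n):
    """Most frequent queries: candidates for result caching and query suggestion."""
    return Counter(query for query, _ in log).most_common(n)


def clickthrough_rate(log):
    """Fraction of logged queries that led to a click."""
    if not log:
        return 0.0
    return sum(1 for _, clicked in log if clicked is not None) / len(log)


def clicks_per_document(log, query):
    """Click counts per document for one query: a possible ranking signal."""
    return Counter(clicked for q, clicked in log if q == query and clicked is not None)


print(top_queries(LOG, 1))                         # [('aspirin dosage', 3)]
print(clickthrough_rate(LOG))                      # 0.8
print(clicks_per_document(LOG, "aspirin dosage"))  # Counter({12: 2, 30: 1})
```

The misspelled query with no click ("asprin dosage") is the kind of record that could feed a spell-checking log.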
