Information Retrieval - Lecture 1

Information Retrieval & Search Engines

Instructor: Prof. Shereen Taie

BIS216E

Course: Information Retrieval & Search Engines


Course References

• Textbook
Essential books:
– Adam Clarke, SEO 2018: Learn Search Engine Optimization with Smart Internet Marketing Strategies, Simple Effectiveness Publishing, 2018.
Recommended books:
– Bruce Clay and Kristopher B. Jones, Search Engine Optimization All-in-One For Dummies, 4th Edition, For Dummies (Business & Personal Finance), 2022.

Assessment of Participants
Assessment will be based on the following deliverables:
• Week 7:
• Midterm exam (15 grades)
• Assignments (15 grades)
• Week 12:
• Evaluation (20 grades), covering practical assignments and quizzes
• Participation (10 grades)
• End-of-term exam (40 grades)

To pass:
Achieve at least 50% of the total score and at least 12 out of 40 on the final exam.

Group project:

• The aim of this project is to help students develop a simple search engine.
• Groups of 3 to 5 students.
• The features will be described as separate tasks in the lab.
• Present your project in week 12.
• Presentations should be no longer than 15 minutes.

Introduction to Information Retrieval
Introducing Information Retrieval & Search Engines

Information Retrieval
• Information Retrieval (IR) is finding material (usually documents) of
an unstructured nature (usually text) that satisfies an information
need from within large collections (usually stored on computers).

– These days we frequently think first of web search, but there are many other cases:
• E-mail search
• Searching your laptop
• Corporate knowledge bases
• Legal information retrieval

The problem of IR
• Goal = find documents relevant to an information
need from a large document set
[Figure: the user's information need is expressed as a query; the IR system matches the query against the document collection (retrieval) and returns an answer list.]
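
The flow described above (a query, a document collection, and an answer list) can be sketched in a few lines of Python. This is a minimal illustration and not how a production IR system works: the function names `tokenize` and `retrieve`, the three-document collection, and the term-count score are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, collection):
    """Return (doc_id, score) pairs for documents that share terms with the query.

    The score counts occurrences of query terms in the document. It is a
    deliberately crude stand-in for a real relevance measure.
    """
    query_terms = set(tokenize(query))
    answer_list = []
    for doc_id, text in collection.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            answer_list.append((doc_id, score))
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(answer_list, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy collection, invented for this example.
collection = {
    "d1": "How to trap mice alive without harming them",
    "d2": "Bank scandals reported in western Massachusetts",
    "d3": "Humane mice traps: catch mice and release them",
}

print(retrieve("trap mice", collection))  # prints [('d1', 2), ('d3', 2)]
```

Note that "traps" in d3 does not match the query term "trap"; d3 scores 2 only because "mice" occurs twice. Exact word matching like this is one reason text retrieval is harder than it first appears.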

Example
[Figure: Google web search]

What is a Document?
• Examples:
– web pages, email, books, news stories, scholarly
papers, text messages, Word documents, PowerPoint
presentations, PDFs, forum postings, patents, IM sessions, etc.
• Common properties
– Significant text content
– Some structure (e.g., title, author, date for papers;
subject, sender, destination for email)

Documents vs. Database Records
• Database records (or tuples in relational databases) are typically
made up of well-defined fields (or attributes)
– e.g., bank records with account numbers,
balances, names, addresses, social security
numbers, dates of birth, etc.
• Easy to compare fields with well-defined semantics to queries in
order to find matches
• Text is more difficult

Documents vs. Records
• Example bank database query
– Find records with balance > $50,000 in branches
located in Amherst, MA.
– Matches easily found by comparison with field values
of records
• Example search engine query
– bank scandals in western mass
– This text must be compared to the text of entire news
stories
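
The contrast between the two queries can be made concrete with a small Python sketch. Everything here is assumed for illustration: the account records, the news stories, and the helper names `find_records` and `naive_text_match` are invented, and the text matcher is deliberately naive.

```python
import re

# Hypothetical bank records with well-defined fields.
records = [
    {"account": "A-100", "balance": 72000, "city": "Amherst", "state": "MA"},
    {"account": "A-101", "balance": 12000, "city": "Amherst", "state": "MA"},
    {"account": "A-102", "balance": 95000, "city": "Boston", "state": "MA"},
]


def find_records(records, min_balance, city, state):
    """Structured query: compare each field directly against the condition."""
    return [
        r["account"]
        for r in records
        if r["balance"] > min_balance and r["city"] == city and r["state"] == state
    ]


# Hypothetical news stories: free text with no fields to compare against.
news_stories = {
    "n1": "Regulators investigate a bank scandal in western Massachusetts.",
    "n2": "A new bank branch opens in Amherst.",
    "n3": "Fraud allegations hit a Springfield credit union.",
}


def words(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def naive_text_match(query, stories):
    """Return ids of stories that contain every query word exactly."""
    query_words = words(query)
    return [sid for sid, text in stories.items() if query_words <= words(text)]


print(find_records(records, 50000, "Amherst", "MA"))  # prints ['A-100']
print(naive_text_match("bank scandals in western mass", news_stories))  # prints []
```

The structured query finds its match by simple field comparison. The naive text match returns nothing, even though n1 is clearly on topic: "scandals" does not equal "scandal" and "mass" does not equal "Massachusetts". Handling that kind of mismatch is a large part of what makes text retrieval difficult.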

Unstructured (text) vs. structured (database) data in the mid-nineties

Unstructured (text) vs. structured (database) data today

Sec. 1.1

Basic assumptions of Information Retrieval
• Collection: A set of documents
– Assume it is a static collection for the moment

• Goal: Retrieve documents with information that is relevant to the user’s information need and helps the user complete a task

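The classic search model
• User task: get rid of mice in a politically correct way
– Misconception? (the task may be translated into the wrong information need)
• Info need: info about removing mice without killing them
– Misformulation? (the information need may be translated into a poor query)
• Query: how trap mice alive
• Search engine: runs the query against the collection
• Results: may lead to query refinement, which feeds back into the query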
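
The query refinement loop in the classic search model can be sketched as follows. This is a simplified illustration under stated assumptions: the toy collection is invented, the user's reformulations are simulated by a fixed list of queries, and `boolean_and_search` and `search_with_refinement` are hypothetical helper names, not part of any real search engine.

```python
import re


def words(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def boolean_and_search(query, collection):
    """Return ids of documents that contain every word of the query."""
    query_words = words(query)
    return [doc_id for doc_id, text in collection.items()
            if query_words <= words(text)]


def search_with_refinement(formulations, collection):
    """Try each query formulation in turn until one returns results.

    Returns the query that succeeded together with its results,
    or (None, []) if every formulation came back empty.
    """
    for query in formulations:
        results = boolean_and_search(query, collection)
        if results:
            return query, results
    return None, []


# Hypothetical toy collection, invented for this example.
collection = {
    "d1": "Live-catch traps let you trap mice alive and release them outdoors",
    "d2": "Snap traps kill mice instantly",
    "d3": "Poison bait for mice and rats",
}

# The first formulation fails because no document contains the word "how";
# the refined query drops it.
print(search_with_refinement(["how trap mice alive", "trap mice alive"], collection))
# prints ('trap mice alive', ['d1'])
```

In a real session the refinement comes from the user inspecting the results, not from a predefined list; the loop structure is the part this sketch is meant to show.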
