irs unit-1 modified

An Information Retrieval System (IRS) is designed to store, retrieve, and maintain various types of information, including text and multimedia, while improving user efficiency in finding relevant data. Key processes include item normalization, selective dissemination of information, document database search, and index database search, each contributing to effective information management. The integration of IRS with database management systems enhances data handling, allowing seamless access to both structured and unstructured data.

Uploaded by

Balle Manasa

Definition of Information Retrieval System

An Information Retrieval System (IRS) is a system designed to store, retrieve, and maintain
information. This information can be text, images, audio, video, or other multimedia. Modern
techniques allow for searching across different media types (e.g., EXCALIBUR's Visual
RetrievalWare).

An "item" refers to the smallest complete unit processed by the system, such as a document, video,
or audio program. The system helps users find the information they need, sometimes using
specialized hardware to process non-text data (e.g., converting audio to text). The steps a user
goes through to find it (search composition, search execution, and filtering out non-relevant
items) all count as retrieval overhead.

With advancements in computing and storage, large databases are now accessible to the average
user. The growth of the Internet and advanced search engines like INFOSEEK and EXCITE have
made it easier to access huge amounts of information. Media like images and audio are also
searchable, and organizations like the BBC and Disney use transcription and video indexing for
easier access to content.

Objectives of Information Retrieval Systems

The main objective of an Information Retrieval System is to reduce the time users spend finding
relevant information. This includes query generation, execution, reviewing results, and avoiding
irrelevant items. The goal is to improve user efficiency in locating the needed information.

Key Measures: Precision and Recall

1. Precision: Measures the proportion of relevant items retrieved out of all retrieved items. Precision
drops when many non-relevant items are retrieved.

2. Recall: Measures the proportion of relevant items retrieved out of all relevant items in the database.
Recall does not change with the number of non-relevant items retrieved; it depends only on how
many of the relevant items were found.
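
The two measures can be worked through on a small example. The sketch below is only an illustration: the function name `precision_recall` and the item ids are made up, and it assumes the set of truly relevant items is known in advance, which is rarely the case in a real system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: set of item ids the system returned
    relevant:  set of item ids that are actually relevant
    """
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 items returned, 3 of them relevant, 6 relevant overall.
retrieved = {"d1", "d2", "d3", "d9"}
relevant = {"d1", "d2", "d3", "d4", "d5", "d6"}
print(precision_recall(retrieved, relevant))  # (0.75, 0.5)
```

Retrieving more non-relevant items would lower the first number only; the second changes only when more of the six relevant items are found.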

Natural Language Queries

Modern IR systems, like AltaVista and Infoseek, allow users to enter natural language queries.
This makes searching more intuitive, though most users typically enter only one or two keywords
instead of long queries.

Functional Overview

A total Information Storage and Retrieval System is composed of four major functional
processes:

1) Item Normalization
2) Selective Dissemination of Information (i.e., “Mail”)
3) Archival Document Database Search
4) Index Database Search, along with the Automatic File Build process that supports
Index Files

1) Item Normalization

*Item Normalization* is the process of converting incoming data into a standard format that the
system can understand and work with. This step is essential for ensuring that all data, regardless of
its original format, is compatible with the system’s processing and search capabilities. Here's a
breakdown of its key aspects:

---

### 1. Logical Restructuring and Standardization

- *Purpose*: To ensure all incoming data is in a consistent format.
- *Example*: Foreign language text can be converted to Unicode, a universal encoding system.
Similarly, video files can be converted to formats like MPEG-2, and images to formats like JPEG.
- *Benefits*: Allows the system to process diverse types of data, including multimedia like text,
audio, video, and images.

---

### 2. Processing Tokens

- *What are Tokens?*: Tokens are the smallest meaningful units in the text, such as words or
symbols.
- *Steps in Token Processing*:
1. *Identification*: Recognize and separate the individual words in the text (e.g., “running shoes”
into “running” and “shoes”).
2. *Stemming*: Remove endings from words to reduce them to their base form (e.g., “running” →
“run”).
3. *Characterization*: Understand the role of tokens (e.g., is it a valid word, a separator, or a
special character?).
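
The three steps can be sketched as a tiny pipeline. This is a rough illustration, not a production approach: the suffix list and the helper names (`identify_tokens`, `naive_stem`, `characterize`) are assumptions made for the example, and the stemmer is deliberately naive (it would mangle words such as "class"). Real systems use a proper algorithm such as Porter's.

```python
import re

SUFFIXES = ("ing", "ed", "s")  # illustrative suffix list, not a real stemmer


def identify_tokens(text):
    """Identification: split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token):
    """Stemming: strip one suffix and undo a doubled final consonant."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            if stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]  # "runn" -> "run"
            return stem
    return token


def characterize(token):
    """Characterization: label a token as a word, a number, or mixed."""
    if token.isalpha():
        return "word"
    if token.isdigit():
        return "number"
    return "mixed"


tokens = identify_tokens("Users searched 3 running systems")
print([(naive_stem(t), characterize(t)) for t in tokens])
# [('user', 'word'), ('search', 'word'), ('3', 'number'), ('run', 'word'), ('system', 'word')]
```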

---

### 3. Zoning

- *Definition*: Dividing an item into logical sections like Title, Author, Abstract, Main Text, and
References.
- *Purpose*: Makes searches more precise.
- Example: If searching for "Einstein," the system can avoid matches in the Bibliography zone and
focus on the Main Text.
- *Structure*: Zones can overlap and be hierarchical, which helps users refine searches further.
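
A minimal sketch of a zoned search, assuming each item is stored as a dictionary that maps zone names to text. The zone names, the sample item, and the function `search_zones` are hypothetical.

```python
def search_zones(item, term, zones):
    """Return the names of the requested zones whose text contains the term."""
    term = term.lower()
    return [zone for zone in zones if term in item.get(zone, "").lower()]


# Hypothetical item that has already been split into zones.
item = {
    "title": "A Survey of Relativity",
    "main_text": "Einstein published the general theory in 1915.",
    "references": "Einstein, A. (1916). Annalen der Physik.",
}

print(search_zones(item, "Einstein", ["main_text"]))  # ['main_text']
print(search_zones(item, "Einstein", ["title"]))      # []
```

Restricting the search to `main_text` ignores the match in the references zone, which is the effect described in the Einstein example above.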

---

### 4. Handling Symbols

- *Classes of Symbols*:
1. *Valid Word Symbols*: Alphabetic characters, numbers, etc.
2. *Inter-word Symbols*: Blanks, periods, semicolons, etc., that separate words.
3. *Special Processing Symbols*: Apostrophes, hyphens, or domain-specific symbols.
- *Customization*: Each language or domain has unique requirements for recognizing symbols
(e.g., an apostrophe is important in names like O'Connor).
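
One way to picture the three symbol classes is a word splitter that keeps special processing symbols inside words. The character sets below are assumptions for English text only; a real system would configure them per language or domain.

```python
INTER_WORD = set(" \t\n.,;:!?")  # assumed separators
SPECIAL = set("'-")              # assumed special processing symbols


def classify_symbol(ch):
    """Place a single character into one of the symbol classes."""
    if ch.isalnum():
        return "valid_word"
    if ch in SPECIAL:
        return "special"
    if ch in INTER_WORD:
        return "inter_word"
    return "other"


def split_words(text):
    """Split on anything that is not a valid word or special symbol."""
    words, current = [], []
    for ch in text:
        if classify_symbol(ch) in ("valid_word", "special"):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


print(split_words("O'Connor met Smith; well-known."))
# ["O'Connor", 'met', 'Smith', 'well-known']
```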

---

### 5. Stop Lists and Stop Algorithms

- *Purpose*: Save resources by removing tokens with little value, like common words (“the,”
“and”) or irrelevant numbers.
- *Examples of Stop Algorithms*:
- Ignore numbers greater than 999999 (except dates).
- Discard tokens with mixed letters and numbers if irrelevant.
- *Current Trend*: With advancements in memory and storage, the use of Stop Lists is becoming
less critical.
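
A stop list and a stop algorithm can be combined in one filter, sketched below. The stop words are a made-up sample, and the date exception mentioned above is left out to keep the example short.

```python
STOP_WORDS = {"the", "and", "of", "a", "in"}  # hypothetical stop list


def keep_token(token):
    """Return False for tokens the system should not index."""
    if token.lower() in STOP_WORDS:
        return False  # stop list: common words
    if token.isdecimal() and int(token) > 999999:
        return False  # stop algorithm: very large numbers
    return True


tokens = ["the", "retrieval", "of", "1234567", "items", "in", "1999"]
print([t for t in tokens if keep_token(t)])  # ['retrieval', 'items', '1999']
```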

---

### Why Item Normalization is Important

1. *Standardization*: Ensures all data is processed consistently, no matter its source or type.
2. *Search Optimization*: Helps the system focus on relevant content and discard unnecessary data.
3. *Improved Precision*: By zoning and token processing, the system provides more accurate
search results.
4. *Resource Efficiency*: Saves system resources by removing unimportant data.

2) Selective Dissemination of Information (SDI)

*Selective Dissemination of Information (SDI)* is a system process that automatically delivers new
and relevant information to users based on their specified interests. This method ensures users
receive personalized updates without needing to search manually.

---

### How SDI Works

1. *User Profiles*:
- Each user creates a *profile* containing their topics of interest.
- A profile is a broad search statement (e.g., "Artificial Intelligence in Healthcare").
- It also includes the user’s designated mail files, where matched items will be sent.

2. *New Items Processing*:
- When new information (like documents or articles) enters the system, it is compared against all
user profiles.
- This process is dynamic and happens for every newly received item.

3. *Matching and Dissemination*:
- If an item matches a user’s profile, it is sent to their mail file.
- This ensures that users only receive information relevant to their interests.
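
The flow above can be sketched in a few lines. This is a simplified illustration: a profile is reduced to a set of terms, an item matches when it shares at least one term with the profile, and the user names, profiles, and item ids are invented for the example.

```python
from collections import defaultdict

# Hypothetical profiles: each user lists the terms they care about.
profiles = {
    "alice": {"healthcare", "learning"},
    "bob": {"databases"},
}


def disseminate(item_id, text, profiles, mail_files):
    """Compare one new item against every profile and file it on a match."""
    words = set(text.lower().split())
    for user, terms in profiles.items():
        if words & terms:  # at least one profile term occurs in the item
            mail_files[user].append(item_id)


mail_files = defaultdict(list)
disseminate("doc-1", "Machine learning in healthcare", profiles, mail_files)
disseminate("doc-2", "Tuning relational databases", profiles, mail_files)
print(dict(mail_files))  # {'alice': ['doc-1'], 'bob': ['doc-2']}
```

Note that the search runs once per incoming item against all stored profiles, the reverse of an ad hoc query, which runs once against all stored items.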

---

### Key Components of the SDI Process

1. *Search Process*:
- The system continuously matches new data against user profiles.
- It performs searches dynamically as new items are added to the system.

2. *User Profiles*:
- Profiles define what each user is interested in.
- Broad and flexible search statements allow users to receive diverse but relevant information.

3. *Mail Files*:
- These are storage spaces where matched documents are sent.
- Each user has dedicated mail files to organize the received information.

---

### Advantages of SDI

1. *Personalization*: Users receive only the information that matches their specific needs and
interests.
2. *Time-Saving*: Users don’t need to manually search for updates; they are delivered
automatically.
3. *Efficiency*: The system processes data in real time, ensuring timely delivery of relevant
information.
4. *Scalability*: SDI can handle multiple user profiles and a large influx of new data.

---

### Limitations
1. *Broad Search Statements*: Profiles might lead to receiving some irrelevant items if the search
criteria are too broad.
2. *Lack of Multimedia Support*: Currently, SDI systems focus mainly on text-based information
and do not fully support multimedia data like videos or images.

---

### Example in Real Life

Consider an academic researcher interested in the latest articles on "Machine Learning." They set up
an SDI profile with this topic. When new articles or papers are added to the database, the system
identifies those matching their profile and sends them directly to their mailbox.

3) Document Database Search

*Simplified Explanation:*
The *Document Database Search* allows users to search through all the information stored in the
system. This includes documents, articles, or any data received and saved. The key components of
this process are:

1. *Query-Based Search:*
- Users input queries to find relevant documents.
- These queries are often created as needed (called *ad hoc queries*).

2. *Stored Documents:*
- The system keeps all received documents in a database, known as the *Document Database*.
- Once stored, these documents are not edited or changed.

3. *Search Process:*
- The system scans the database to find items matching the user’s query.

*Significance:*
- This process ensures users can access all the documents the system has received.
- It is useful for retrieving past information quickly and accurately.

*Example:*
If a user wants to find all documents related to “Artificial Intelligence,” they can enter the query,
and the system will search the database to retrieve relevant documents.

4) Index Database Search

*Simplified Explanation:*
The *Index Database Search* allows users to save important documents for future use by filing
them with specific tags or descriptions. This process involves:

1. *Indexing:*
- Users can *file documents* into special categories called *index files*.
- Additional tags or descriptions (index terms) can be added to make searching easier later.

2. *Types of Index Files:*
- *Private Index Files:*
- Created by individual users.
- These files reference only a small subset of the total database.
- Access is limited to specific users.

- *Public Index Files:*
- Created and maintained by library professionals.
- These files often reference every document in the database.
- Access is available to a larger group of users, depending on permissions.

3. *Search Process:*
- Users can search both private and public index files to find saved documents.
- The system helps users generate indexes automatically through *Automatic File Build
(Information Extraction)*.

*Significance:*
- Indexing organizes documents for quick and efficient retrieval.
- Private files enable personalized storage, while public files serve broader organizational needs.

*Example:*
- A researcher can create a private index file for documents on “Machine Learning” and tag them
with specific keywords. Later, they can search this index file to find those documents easily.
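
A rough sketch of the researcher example, assuming an index file is simply a mapping from document ids to the index terms filed with them. The class `IndexFile`, its methods, and the document ids are hypothetical.

```python
class IndexFile:
    """A set of document ids, each filed under user-supplied index terms."""

    def __init__(self, owner=None):
        self.owner = owner  # None stands for a public index file
        self.entries = {}   # doc id -> set of index terms

    def file(self, doc_id, terms):
        """File a document under the given index terms."""
        self.entries.setdefault(doc_id, set()).update(t.lower() for t in terms)

    def search(self, term):
        """Return the ids of documents filed under the term."""
        term = term.lower()
        return sorted(d for d, terms in self.entries.items() if term in terms)


private = IndexFile(owner="researcher")
private.file("doc-17", ["machine learning", "survey"])
private.file("doc-42", ["machine learning", "neural networks"])
print(private.search("Survey"))            # ['doc-17']
print(private.search("machine learning"))  # ['doc-17', 'doc-42']
```

The index file stores references and terms only; the documents themselves stay unchanged in the Document Database.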

Relationship to Database Management Systems

Integrating Database Management Systems (DBMS) with Information Retrieval (IR) systems is
crucial for managing both structured data and unstructured information effectively.

Practical Importance: Combining DBMS and IR systems improves data handling, enabling users to
search and retrieve both structured (tables, records) and unstructured (text, documents) data
seamlessly.

Commercial Integration: Many commercial database systems have integrated these capabilities:

- INQUIRE DBMS: One of the first to combine DBMS and IR systems, available for over 15 years.
- ORACLE DBMS: Offers an embedded IR tool called CONVECTIS, which uses a thesaurus to
generate "themes" for items, improving search accuracy.
- INFORMIX DBMS: Links to RetrievalWare, enabling the integration of structured data with
advanced IR functions.

Key Benefits:

1. Enhanced Search: Users can query structured data alongside related unstructured data for a
complete view.

2. Efficient Retrieval: Features like thesaurus-based searches (e.g., in ORACLE) allow for more
intuitive and comprehensive results.
3. Unified Systems: Integration reduces the need for separate systems, saving time and improving
user experience.

This integration bridges the gap between traditional databases and modern information retrieval
needs, offering a more powerful and flexible approach to managing diverse types of data.

Digital Libraries

*Digital libraries* are online platforms where books, research papers, videos, images, and other
materials are stored digitally. These libraries let users access information from anywhere, anytime,
using devices like smartphones, laptops, or tablets. They aim to make knowledge more accessible
and preserve important materials in a digital format.

#### Significance of Digital Libraries

1. *Convenience*: You can access resources 24/7 without going to a physical location.
2. *Preservation*: Rare and valuable resources are digitized, preventing wear and tear.
3. *Global Access*: Resources can be shared across the world, helping remote learners.
4. *Search Efficiency*: Tools like keywords and filters make finding information quick.
5. *Cost Savings*: No need for physical space or maintenance, reducing costs for institutions.
6. *Eco-Friendly*: Reduces the need for paper and printing, supporting a greener planet.

Data Warehouses

A *data warehouse* is a large storage system used by businesses to collect and manage data from
various sources. Unlike regular storage, it organizes data in a way that makes it easy to analyze and
create reports, helping businesses make informed decisions. It’s like a giant library for business
information.

#### Significance of Data Warehouses

1. *Centralized Data*: Combines information from different departments, like sales, marketing, and
finance.
2. *Improved Decision-Making*: Helps businesses analyze trends and make better plans.
3. *Fast Insights*: Processes massive amounts of data quickly for real-time reporting.
4. *Historical Records*: Stores older data, which can be used for long-term analysis.
5. *Better Performance*: Reduces the workload on regular databases by handling heavy data
analysis separately.
6. *Accuracy*: Provides reliable and consistent data for critical business strategies.

2.2 Browse Capabilities

Browse capabilities help users explore search results efficiently by summarizing and organizing the
information. They allow users to identify and select relevant items for display, making the process
of finding what they need easier and more intuitive.

---

#### Displaying Results

- *Line Item Status:* Results are displayed as a list, where each line represents a summary of an
item.
- *Data Visualization:* Uses visual tools like 2D or 3D graphs to represent search results. Each
point on the graph represents an item, and its position shows how closely it relates to the user’s
query. Clusters of points indicate related topics, making it easier to browse similar items.

---

#### 2.2.1 Ranking

Ranking organizes results based on relevance scores, helping users quickly find the most important
items.
- *Relevance Scores:* Scores range from 0.0 (least relevant) to 1.0 (most relevant), indicating how
well an item matches the query.
- *Collaborative Filtering:* This technique, used by platforms like Amazon, MovieFinder, and
CDNow, suggests results based on what other users with similar interests have selected.
- *Graphical Ranking:* Relevance can also be visualized using color or position on a graph. For
example, clusters of related items can be highlighted to make browsing easier.
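
To make the 0.0 to 1.0 scale concrete, the sketch below scores each item by the fraction of query terms it contains and sorts by that score. The scoring rule is an assumption chosen for simplicity; real systems use weighted measures, and the item texts are invented.

```python
def relevance(query_terms, text):
    """Fraction of query terms present in the text, between 0.0 and 1.0."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in query_terms) / len(query_terms)


def rank(query, items):
    """Return (score, item id) pairs, highest score first."""
    terms = query.lower().split()
    scored = [(relevance(terms, text), item_id) for item_id, text in items.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical search results.
items = {
    "d1": "computer design tools",
    "d2": "garden design",
    "d3": "cooking recipes",
}
print(rank("computer design", items))
# [(1.0, 'd1'), (0.5, 'd2'), (0.0, 'd3')]
```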

---

#### 2.2.2 Zoning

Zoning focuses on specific sections (zones) of a document, like headings, passages, or specific
paragraphs, to display the most relevant parts of an item.
- *Locality-Based Searches:* Ensures users only see the most meaningful sections of a document,
saving time and reducing unnecessary reading.

---

#### 2.2.3 Highlighting

Highlighting emphasizes key words or phrases in search results, making it easier to find relevant
content.
- *Starting Points:* Browsing starts with the first highlighted word or section, and users can jump to
the next highlight as needed.
- *Color and Intensity:* Different colors or shades indicate how important a word or section is to
the search.
- *Paragraph-Based Highlights:* Some systems, like DCARS, allow users to browse results in the
order of paragraphs or words that contributed most to the item’s relevance score.
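
Highlighting of whole-word matches can be sketched with a regular expression. The marker characters and the function name `highlight` are assumptions; an actual interface would use color or intensity rather than text markers.

```python
import re


def highlight(text, terms, marker="**"):
    """Wrap every whole-word occurrence of a search term in markers."""
    if not terms:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


print(highlight("Computer design affects computers.", ["computer", "design"]))
# **Computer** **design** affects computers.
```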

---

### Why Browse Capabilities Matter

1. *Simplifies Search:* Makes large sets of results easier to understand and navigate.
2. *Improves Accuracy:* Tools like ranking, zoning, and highlighting help users focus on what’s
most relevant.
3. *Visual Aid:* Graphical displays and highlighting reduce the effort needed to sift through
unrelated content.
4. *User-Friendly:* Collaborative filtering and clustering provide personalized and intuitive
browsing experiences.

2.3 Miscellaneous Capabilities

Miscellaneous capabilities refer to additional tools that enhance search systems, making them more
user-friendly, efficient, and flexible. These features help users refine searches, explore word
relationships, and save time by reusing previous queries.

#### 2.3.1 Vocabulary Browse

Vocabulary Browse allows users to explore the words in the database in alphabetical order. It
displays unique words, their occurrences, and helps users understand the impact of search variations
like using wildcards (e.g., “compul*” for “compulsion,” “compulsive,” etc.). It also identifies
common errors, like typing “computen” instead of “computer.” This feature simplifies the process
of constructing accurate search queries.
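
A small sketch of what a vocabulary browse might return, assuming the vocabulary is built by counting whitespace-separated words. The sample texts and function names are made up; `fnmatchcase` from the standard library handles the `*` wildcard.

```python
from collections import Counter
from fnmatch import fnmatchcase


def vocabulary(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def browse(counts, pattern):
    """List matching words alphabetically with their occurrence counts."""
    return [(w, counts[w]) for w in sorted(counts) if fnmatchcase(w, pattern)]


counts = vocabulary([
    "compulsion and compulsive behaviour",
    "the computer and the computen",  # "computen" is a deliberate typo
])
print(browse(counts, "compul*"))   # [('compulsion', 1), ('compulsive', 1)]
print(browse(counts, "compute*"))  # [('computen', 1), ('computer', 1)]
```

The second listing shows how a misspelling stands out when it appears next to the correct word in alphabetical order.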

#### 2.3.2 Iterative Search and Search History Log

Iterative search makes refining results easier by applying new conditions to previous search
outcomes. It helps users focus on relevant items without starting over. Relevance feedback lets users
improve results further. The search history log saves all searches from the session, enabling quick
access to modify or revisit previous queries.

#### 2.3.3 Canned Query

Canned queries save time by allowing users to store and reuse frequently used searches. Once
saved, these queries can be refined or expanded with new criteria as needed. Variables can be added
to canned queries, offering flexibility to adjust search parameters during execution. This feature is
especially useful for repetitive tasks or ongoing research.
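
Canned queries with variables behave much like templates. The sketch below uses `string.Template` from the standard library; the query name, the variable names, and the query syntax itself are invented for illustration.

```python
from string import Template

# Hypothetical store of saved queries; $name marks a variable.
canned = {
    "recent_by_topic": Template("$topic AND year >= $year"),
}


def run_canned(name, **variables):
    """Fill in the variables of a saved query and return the query text."""
    return canned[name].substitute(**variables)


print(run_canned("recent_by_topic", topic="retrieval", year=2020))
# retrieval AND year >= 2020
```

Calling `substitute` without a required variable raises `KeyError`, so a missing parameter is caught before the query runs.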

2.1 Search Capabilities

The goal of search capabilities is to map a user's needs to relevant items in a database. Users can
input queries using natural language or Boolean logic. Some systems allow search terms to be
"weighted" based on importance to improve results.

2.1.1 Boolean Logic

Boolean logic connects search terms using operators like AND, OR, and NOT to retrieve relevant
information. These operations work by set intersection (AND), set union (OR), and set difference
(NOT). Special Boolean searches like "M of N" retrieve items that contain any M of the N
listed terms.
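
The set operations map directly onto Python sets. The sketch assumes a tiny inverted index from terms to document ids (the data is invented) and treats "M of N" as "at least M of the N terms".

```python
from collections import Counter

# Hypothetical inverted index: term -> ids of the documents that contain it.
index = {
    "computer": {1, 2, 3},
    "design": {2, 3, 4},
    "garden": {4, 5},
}


def postings(term):
    """Return the set of documents containing the term."""
    return index.get(term, set())


def m_of_n(m, terms):
    """Documents that contain at least m of the given terms."""
    counts = Counter()
    for term in terms:
        counts.update(postings(term))
    return {doc for doc, count in counts.items() if count >= m}


print(sorted(postings("computer") & postings("design")))  # AND -> [2, 3]
print(sorted(postings("computer") | postings("design")))  # OR  -> [1, 2, 3, 4]
print(sorted(postings("design") - postings("garden")))    # NOT -> [2, 3]
print(sorted(m_of_n(2, ["computer", "design", "garden"])))  # [2, 3, 4]
```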

2.1.2 Proximity

Proximity restricts how close search terms must be in a document to be considered related,
improving precision. For example, terms like "COMPUTER" and "DESIGN" close together suggest
relevance.
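
A proximity check can be sketched by comparing word positions. The function name `within` and the choice to measure distance in words are assumptions; some systems measure in sentences or paragraphs instead.

```python
def within(text, a, b, distance):
    """True if terms a and b occur at most `distance` words apart."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


print(within("computer aided design of circuits", "computer", "design", 2))    # True
print(within("computer science and interior design", "computer", "design", 2))  # False
```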

2.1.3 Contiguous Word Phrases

A Contiguous Word Phrase treats multiple words as a single unit, like "United States of America."
This search method is similar to proximity but allows for greater specificity in querying exact
phrases.

2.1.4 Fuzzy Searches

Fuzzy searches help find terms with similar spellings, useful for handling typos. For example, a
search for "computer" might also find "compiter" or "conputer," improving recall but reducing
precision.
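
One common way to define "similar spelling" is edit distance, sketched below. This is an assumption for illustration (the notes do not say which similarity measure a given system uses), and the vocabulary is invented.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def fuzzy_match(term, vocabulary, max_distance=1):
    """Vocabulary words within max_distance edits of the search term."""
    return sorted(w for w in vocabulary if edit_distance(term, w) <= max_distance)


vocabulary = {"computer", "compiter", "conputer", "commuter", "design"}
print(fuzzy_match("computer", vocabulary))
# ['commuter', 'compiter', 'computer', 'conputer']
```

The match on "commuter" shows the trade-off: the typos are recovered (better recall), but an unrelated word comes along with them (lower precision).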

2.1.5 Term Masking

Term masking lets the user leave part of a search term unspecified, expanding the search. For example,
"COMPUTER*" can find terms like "COMPUTER" and "COMPUTERS." Masking can be used for
prefix, suffix, or embedded searches.
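
Masking can be sketched by turning the mask into a regular expression, assuming `*` is the mask character and stands for any run of characters. The vocabulary and helper names are hypothetical.

```python
import re


def mask_to_regex(mask):
    """Turn a mask such as 'computer*' into a case-insensitive pattern."""
    return re.compile(re.escape(mask).replace(r"\*", ".*"), re.IGNORECASE)


def masked_search(mask, vocabulary):
    """Return the vocabulary words that match the whole mask."""
    pattern = mask_to_regex(mask)
    return [w for w in vocabulary if pattern.fullmatch(w)]


vocabulary = ["compute", "computer", "computers", "minicomputer", "microcomputers"]
print(masked_search("computer*", vocabulary))   # ['computer', 'computers']
print(masked_search("*computer", vocabulary))   # ['computer', 'minicomputer']
print(masked_search("*computer*", vocabulary))
# ['computer', 'computers', 'minicomputer', 'microcomputers']
```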

2.1.6 Numeric and Date Ranges

Masking doesn't work for numeric or date ranges. Instead, specialized range queries are used to find
values like numbers greater than 125.
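
A range query compares values rather than character patterns, as in this short sketch. The sample values and the helper `between` are made up for illustration.

```python
from datetime import date


def between(value, low, high):
    """True if low <= value <= high (works for numbers and dates alike)."""
    return low <= value <= high


amounts = [17, 125, 126, 300]
print([a for a in amounts if a > 125])  # [126, 300]

dates = [date(2020, 1, 15), date(2021, 6, 1), date(2022, 3, 9)]
in_2021 = [d.isoformat() for d in dates
           if between(d, date(2021, 1, 1), date(2021, 12, 31))]
print(in_2021)  # ['2021-06-01']
```

A mask such as "12*" would match the strings "12", "120", and "1299" alike, which is why it cannot stand in for "greater than 125".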

2.1.7 Concept/Thesaurus Expansion

Thesaurus and concept class expansion help find terms related in meaning. A thesaurus expands
search terms based on language, while concept classes expand based on related ideas in a structured
tree.
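
Thesaurus expansion can be sketched as a lookup that adds related terms to the query before searching. The thesaurus entries below are invented, and concept class trees are not covered by this example.

```python
# Hypothetical thesaurus: term -> terms with a similar meaning.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "fast": {"quick", "rapid"},
}


def expand(terms):
    """Return the original terms plus any thesaurus entries for them."""
    expanded = set(terms)
    for term in terms:
        expanded |= THESAURUS.get(term, set())
    return expanded


print(sorted(expand(["fast", "car", "rental"])))
# ['automobile', 'car', 'fast', 'quick', 'rapid', 'rental', 'vehicle']
```

The expanded set is then searched as an OR of all its terms, which tends to raise recall at some cost in precision.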

2.1.8 Natural Language Queries

Natural language queries allow users to input questions directly. While they improve recall, they
can reduce precision, especially when using negation.
