TSA@1
The User Task: The user first translates an information need into a query.
In an information retrieval system, the query is a set of words that conveys
the semantics of the information required, whereas in a data retrieval
system, a query expression conveys the constraints that the retrieved objects
must satisfy. Example: A user sets out to search for one thing but ends up
looking at something else. This means that the user is browsing rather than
searching. The above figure shows the interaction of the user through
different tasks.
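The contrast between the two kinds of query can be sketched as follows. This is a minimal illustration, not a description of any real system: the toy collection DOCS and the function names ir_search and data_search are hypothetical, and word overlap stands in for a real ranking function.

```python
# Hypothetical toy collection; the ids, years and texts are made up.
DOCS = [
    {"id": 1, "year": 1999, "text": "modern information retrieval systems"},
    {"id": 2, "year": 2008, "text": "introduction to information retrieval"},
    {"id": 3, "year": 2008, "text": "database systems and query languages"},
]


def ir_search(query):
    """Information retrieval: rank documents by how many query words they share."""
    words = set(query.lower().split())
    scored = []
    for doc in DOCS:
        overlap = len(words & set(doc["text"].split()))
        if overlap:
            scored.append((overlap, doc["id"]))
    # Highest overlap first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def data_search(year):
    """Data retrieval: return exactly the objects that satisfy the constraint."""
    return [doc["id"] for doc in DOCS if doc["year"] == year]
```

Here ir_search returns a ranked list in which partial matches still appear, while data_search returns only the objects that meet the constraint exactly.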
Logical View of the Documents: Historically, documents were
represented by a set of index terms or keywords. Modern computers
make it possible to represent a document by its full set of words, and
the set of representative keywords can then be reduced by
eliminating stopwords, i.e. articles and connectives. These operations are
called text operations. They reduce the complexity of the
document representation from the full text to a set of index terms.
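As a rough sketch of such a text operation, the function below lowercases a text, strips punctuation, and drops stopwords to leave a set of index terms. The STOPWORDS list and the name index_terms are assumptions made for illustration; real systems use much longer stopword lists and usually add steps such as stemming.

```python
# A small hand-picked stopword list (an assumption; real lists are longer).
STOPWORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "to", "is"}


def index_terms(text):
    """Reduce full text to a sorted list of distinct index terms."""
    terms = set()
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word and word not in STOPWORDS:
            terms.add(word)
    return sorted(terms)
```

For example, index_terms("The index is the data structure of a library.") keeps only "data", "index", "library" and "structure".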
Past, Present, and Future of Information Retrieval
1. Early Developments: As the need for information grew, it became
necessary to build data structures that give faster access. The index is the
data structure used for faster retrieval of information. For centuries,
indexes were built by manually categorizing items into hierarchies.
2. Information Retrieval in Libraries: Libraries were among the first to adopt
IR systems. The first generation automated earlier technologies, and search
was based on author name and title. The second generation added searching
by subject heading, keywords, etc. The third generation added graphical
interfaces, electronic forms, hypertext features, etc.
3. The Web and Digital Libraries: The web is cheaper than many other sources
of information, it provides greater access through digital communication
networks, and it lets anyone publish freely to a large audience.
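The index mentioned under Early Developments is commonly realized today as an inverted index, which maps each word to the documents that contain it. The sketch below is one minimal way to build and query such a structure; the names build_index and lookup and the sample documents in the usage note are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def lookup(index, word):
    """Return the ids of documents containing the word (empty if unseen)."""
    return index.get(word.lower(), [])
```

With docs = {1: "search by author and title", 2: "search by subject heading"}, a lookup for "search" finds both documents without scanning their text again, which is the faster access the index is meant to provide.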
Advantages of Information Retrieval
1. Efficient Access: Information retrieval techniques let users quickly locate
and retrieve relevant items from vast amounts of data or information.
2. Personalization of Results: User profiling and personalization techniques
are used in information retrieval models to tailor search results to individual
preferences and behaviors.
3. Scalability: Information retrieval models are capable of handling increasing
data volumes.
4. Precision: These systems can provide highly accurate and relevant search
results, reducing the likelihood of irrelevant information appearing in search
results.
Disadvantages of Information Retrieval
1. Information Overload: When a lot of information is available, users often
face information overload, making it difficult to find the most useful and
relevant material.
2. Lack of Context: Information retrieval systems may fail to understand the
context of a user’s query, potentially leading to inaccurate results.
3. Privacy and Security Concerns: As information retrieval systems often
access sensitive user data, they can raise privacy and security concerns.
4. Maintenance Challenges: Keeping these systems up-to-date and effective
requires ongoing efforts, including regular updates, data cleaning, and
algorithm adjustments.
5. Bias and Fairness: Ensuring that information retrieval systems are free of
bias and provide fair results is a crucial challenge, especially in contexts
like web search engines and recommendation systems.
2. Recurrent Neural Networks (RNN):
RNNs are a type of neural network used to model sequence data. RNNs,
which are derived from feedforward networks, loosely resemble the
human brain in their behaviour. Simply put, recurrent neural
networks can predict sequential data in a way that many other
algorithms can't.
In standard neural networks, all inputs and outputs are independent of one
another. In some circumstances, however, such as predicting the next word of
a phrase, the prior words are necessary, so they must be remembered. RNNs
overcome this problem with a hidden state, which is their most important
component: it acts as a memory that stores information about the previous
calculations. An RNN employs
the same settings for each input since it produces the same outcome by performing
RNN Architecture
RNNs are a type of neural network that has hidden states and allows past outputs to
Input Layer: This layer receives the initial element of the sequence data. For
Hidden Layer: The heart of the RNN, the hidden layer contains a set of
interconnected neurons. Each neuron processes the current input along with the
information from the previous hidden layer’s state. This “state” captures the
context.
enabling it to learn complex patterns. It transforms the combined input from the
current input layer and the previous hidden layer state before passing it on.
Output Layer: The output layer generates the network’s prediction based on the
processed information. In a language model, it might predict the next word in the
sequence.
the hidden layer. This connection allows the network to pass the hidden state
information (the network’s memory) to the next time step. It’s like passing a baton in
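
The way the hidden state carries context from one time step to the next can be sketched with a single scalar cell. This is a minimal illustration under stated assumptions: tanh is assumed as the activation, the weights w_xh, w_hh and b are arbitrary illustrative values rather than trained ones, and the initial hidden state is assumed to be zero. The names rnn_step and run_rnn are hypothetical.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One time step: h_t = tanh(w_xh * x_t + w_hh * h_prev + b), scalar case."""
    return math.tanh(w_xh * x + w_hh * h_prev + b)


def run_rnn(inputs, w_xh=0.5, w_hh=0.8, b=0.0):
    """Feed a sequence through the cell, reusing the same weights at every step."""
    h = 0.0  # initial hidden state (assumed to be zero)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states
```

Running run_rnn([1.0, 0.0]) gives a non-zero second state even though the second input is zero, because the first input is still present in the hidden state that was passed forward.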
RNNs are a type of neural network that has hidden states and allows past outputs to
those with a single input and output to those with many (with variations between).
One To One: There is only one pair here. A one-to-one architecture is used in
outputs. One-to-many networks are used in the production of music, for example.
Many To One: In this scenario, a single output is produced by combining many
inputs from distinct time steps. Sentiment analysis and emotion identification use
Many To Many: For many to many, there are numerous options. Two inputs yield
three outputs. Machine translation systems, such as English to French or vice versa
Advantages of RNNs:
Handle sequential data effectively, including text, speech, and time series.
Disadvantages of RNNs:
Lexical Analysis
Definition: Breaks down language into units (lexemes) such as words, and categorizes
them into parts of speech (POS).
Example: The sentence "The cat is sleeping." can be broken down into words: "The"
(determiner), "cat" (noun), "is" (verb), "sleeping" (verb).
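The example above can be reproduced with a very small sketch. The LEXICON table and the function name lex are hypothetical and cover only this sentence; a real lexical analyzer would rely on a full dictionary or a trained tagger.

```python
# Tiny hand-written lexicon covering only the example sentence (an assumption).
LEXICON = {
    "the": "determiner",
    "cat": "noun",
    "is": "verb",
    "sleeping": "verb",
}


def lex(sentence):
    """Split a sentence into lexemes and attach a part-of-speech label to each."""
    tagged = []
    for raw in sentence.split():
        word = raw.strip(".,;:!?")
        if word:
            tagged.append((word, LEXICON.get(word.lower(), "unknown")))
    return tagged
```

Words missing from the table are labelled "unknown" rather than guessed.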
Syntactic Analysis
Semantic Analysis
Discourse Integration
Pragmatic Analysis
Definition: Understands the intended meaning behind language, beyond the literal
interpretation.
Example: The phrase "It's raining cats and dogs" is understood pragmatically to mean
it’s raining heavily, not literally that animals are falling from the sky.
4. The calculated bi-gram probabilities and the final
probability for the sentence "They play in a big garden" using
a bi-gram language model with Laplace smoothing.
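
A calculation of this kind can be sketched as below. The training corpus is not given in these notes, so CORPUS is a hypothetical stand-in and the resulting numbers are illustrative only. The sketch uses the add-one estimate P(w | prev) = (C(prev, w) + 1) / (C(prev) + V), and it assumes that V counts the boundary markers <s> and </s> as vocabulary items; other conventions exist and change the values.

```python
from collections import Counter

# Hypothetical training corpus: the source does not include one.
CORPUS = [
    "they play in a big garden",
    "they play in a park",
    "children play in a big park",
]


def train(sentences):
    """Count unigrams and bigrams, with <s> and </s> marking sentence boundaries."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Laplace (add-one) estimate: (C(prev, word) + 1) / (C(prev) + V)."""
    vocab_size = len(unigrams)  # assumption: V includes <s> and </s>
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def sentence_prob(sentence, unigrams, bigrams):
    """Multiply the smoothed bigram probabilities across the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams)
    return prob
```

After unigrams, bigrams = train(CORPUS), calling sentence_prob("They play in a big garden", unigrams, bigrams) multiplies seven smoothed bigram probabilities, one for each adjacent pair including the boundary markers. Because of the added one, a bigram that never occurs in the corpus still receives a small non-zero probability.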