Providing Accurate Data: How Does It Work?
Oral squamous cell carcinoma (OSCC) is a type of cancer that develops in the
squamous cells lining the inside of your mouth and throat. Squamous cells are thin,
flat cells that form the surface layer of the skin and many other tissues in the body.
Key risk factors for OSCC include smoking, other forms of tobacco use, and alcohol, all of which damage the DNA of the squamous cells lining the mouth and throat.
SLIDE 11
First, what is an LLM?
Large Language Model (LLM)
Large Language Models (LLMs) are a type of artificial intelligence model that excels at understanding and generating human-like text.
How does it work?
It learns from tons of information on the internet, figuring out how
words and sentences fit together.
Large Language Models (LLMs) work by learning from massive
amounts of text data, using a neural network architecture called
transformers. They understand language patterns, context, and
relationships during pre-training. Afterward, they can be fine-tuned for
specific tasks. During inference, LLMs generate text based on learned
patterns, making them versatile for tasks like translation,
summarization, and question-answering.
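To make the inference step concrete, here is a minimal sketch using the Hugging Face transformers library (the model "gpt2" and the prompt are illustrative choices, not part of our project):

    from transformers import pipeline

    # Load a small pre-trained language model for text generation.
    generator = pipeline("text-generation", model="gpt2")

    # At inference time, the model continues the prompt using patterns
    # it learned during pre-training.
    result = generator("Oral cancer risk factors include", max_new_tokens=30)
    print(result[0]["generated_text"])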
Examples:
GPT-3 is a famous one. It's like having a smart friend who's really
good with words and can help with all sorts of language-related tasks.
LangChain:
LangChain is a framework that makes it easier to create applications using large
language models (LLMs). It provides:
Tools: These are resources that help in building and customizing applications.
Abstractions: These are simplified models or templates that hide complex
details, making development easier.
Developers use LangChain to create and tailor applications for various tasks, such as chatbots, question answering over documents, and summarization.
In essence, LangChain helps streamline and simplify the development process for
applications that leverage the power of large language models.
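As a minimal illustration (assuming a recent LangChain release; the model choice is a placeholder, and any chat model LangChain supports would work), a prompt template and an LLM can be composed into a small application:

    from langchain_core.prompts import PromptTemplate
    from langchain_openai import ChatOpenAI

    # An abstraction: the template hides the details of prompt construction.
    prompt = PromptTemplate.from_template("Explain {topic} in one sentence.")
    llm = ChatOpenAI(model="gpt-3.5-turbo")

    # Compose the pieces into a chain and run it.
    chain = prompt | llm
    print(chain.invoke({"topic": "oral squamous cell carcinoma"}).content)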
SLIDE 12
1. Load: This step involves loading different types of data sources such as JSON
files, URLs, documents, and other file formats.
2. Split: The loaded data is then split into smaller, manageable chunks. This is
often done to ensure the data can be processed more efficiently and to fit
within the constraints of downstream models.
3. Embed: Each of these chunks is then transformed into a numerical
representation (embedding). This involves converting the textual data into
vectors that capture the semantic meaning of the text.
4. Store: The embeddings are stored in a database or some storage system. This
allows for efficient retrieval of relevant chunks based on a query.
5. Retrieve: When a question is posed to the system, it retrieves the most
relevant chunks of text from the stored embeddings. This retrieval step is
based on the similarity between the query and the stored embeddings.
6. Prompt: The retrieved chunks are then used to create a prompt for the
language model.
7. LLM (Large Language Model): The prompt is fed into a large language
model, which processes the information and generates an answer.
8. Answer: The final step is delivering the answer generated by the language
model to the user.
This flow enables the system to answer questions by leveraging stored embeddings
and a large language model, ensuring efficient and relevant retrieval of information
for accurate responses.
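A minimal sketch of this whole flow in LangChain might look like the following (the file name, model choices, and chunking numbers are illustrative, and exact import paths vary between LangChain versions):

    from langchain_community.document_loaders import TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.embeddings import HuggingFaceInstructEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain_openai import ChatOpenAI
    from langchain.chains import RetrievalQA

    docs = TextLoader("oscc_notes.txt").load()                      # 1. Load
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000,
                                              chunk_overlap=150)
    chunks = splitter.split_documents(docs)                         # 2. Split
    embeddings = HuggingFaceInstructEmbeddings(
        model_name="hkunlp/instructor-large")                       # 3. Embed
    vectordb = Chroma.from_documents(chunks, embeddings)            # 4. Store
    retriever = vectordb.as_retriever()                             # 5. Retrieve
    qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(),
                                     retriever=retriever)           # 6-7. Prompt + LLM
    print(qa.invoke("What are the risk factors for OSCC?"))         # 8. Answer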
SLIDE 15
The Text Splitting pre-processing step is crucial for managing large texts and making
them more suitable for processing by downstream components. The
RecursiveCharacterTextSplitter class is designed to handle this task efficiently. It
operates by recursively splitting text based on a list of user-defined characters,
ensuring that related pieces of text remain adjacent to each other, thus preserving their
semantic relationship.
Text Splitting with RecursiveCharacterTextSplitter
Purpose
To break large texts into smaller, manageable pieces that make sense on their
own.
How It Works
1. Start with Large Text:
o Begin with a big text, like a long article or document.
2. Initial Split:
o First, divide the text into big chunks, like paragraphs.
3. Check Chunk Size:
o If any chunk (paragraph) is too big, split it further.
4. Recursive Splitting:
o Split the big chunks into smaller pieces, like sentences.
o If a sentence is too long, split it into smaller parts, like phrases.
5. Keep Going:
o Repeat this process until all pieces are a manageable size.
Benefits
Easier to Handle: Smaller pieces are easier to work with.
Makes Sense: Each piece still makes sense on its own.
Flexible: Adjusts to different text sizes automatically.
Example
You have a long document.
Split it into paragraphs.
If a paragraph is too long, split it into sentences.
If a sentence is too long, split it into shorter parts.
This way, you end up with small, meaningful chunks of text that are easy to process
further.
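A sketch of this splitter in LangChain (the separator list below makes the default strategy explicit, and chunk_overlap=150 is an illustrative value):

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    long_document = open("oscc_notes.txt").read()  # illustrative source file

    splitter = RecursiveCharacterTextSplitter(
        separators=["\n\n", "\n", " ", ""],  # paragraphs, then lines, words, characters
        chunk_size=1000,      # target upper limit per chunk, in characters
        chunk_overlap=150,    # characters shared between neighbouring chunks
    )
    chunks = splitter.split_text(long_document)
    print(len(chunks), max(len(c) for c in chunks))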
SLIDE 16
Chunk Size:
Refers to the maximum desired length of each individual chunk in
characters.
It sets a target upper limit for the number of characters in each segment.
In our code, chunk_size is set to 1000, indicating that, ideally, each chunk
should contain no more than 1000 characters.
Chunk Overlap:
Refers to the number of characters that consecutive chunks share.
Overlapping the end of one chunk with the start of the next preserves context
that would otherwise be cut at a chunk boundary.
In the sketch above, chunk_overlap is set to 150, so neighbouring chunks share
up to 150 characters.
SLIDE 17
Embeddings
• Dimensionality Reduction: Embeddings represent text as dense numerical
vectors of fixed, relatively low dimension, which are far easier to store and
compare than raw text.
• Semantic Similarity: Words or phrases with similar meanings are represented
by vectors that are close to each other in the embedding space. For instance,
the words "king" and "queen" would have vectors that are closer together than
"king" and "car."
SLIDE 18
INSTRUCTOR Embeddings
SLIDE 19
2. Training Objective:
Contrastive loss: This loss function pushes the model to create embeddings
(numerical representations) that group similar texts together and separate
dissimilar texts. This helps the model distinguish between relevant and
irrelevant information during information search.
Variety is Key: MEDI, the training collection used for INSTRUCTOR, contains
300 datasets from Super-NaturalInstructions. These datasets cover many
different tasks, like classification, summarization, and question answering.
Each task comes with specific instructions, guiding INSTRUCTOR on how to
handle it.
Extra Practice: MEDI also includes 30 additional datasets, giving
INSTRUCTOR even more examples to learn from and improve its text
understanding skills.
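In use, INSTRUCTOR pairs each text with a task instruction; here is a minimal sketch with the InstructorEmbedding package (the instruction wording is an illustrative example):

    from InstructorEmbedding import INSTRUCTOR

    model = INSTRUCTOR("hkunlp/instructor-large")
    # Each input is an [instruction, text] pair; the instruction steers the embedding.
    vectors = model.encode([
        ["Represent the Medicine document for retrieval:",
         "Oral squamous cell carcinoma develops in the squamous cells of the mouth."],
    ])
    print(vectors.shape)  # one embedding vector per pair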
SLIDE 20
Retriever Overview
A retriever's primary function is to fetch the most relevant documents or data points
from a vector store based on a user's query. Here's how it accomplishes this:
1. Vector Store Backbone: The vector store serves as the repository where
documents or data points are stored as vectors (numeric representations).
These vectors capture essential features or characteristics of each document or
data point.
2. Query Processing: When a user submits a query to the retriever, it takes this
query and uses it to identify which documents or data points in the vector store
are most relevant to the query.
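In LangChain, a vector store can be wrapped as a retriever in one line; a minimal sketch (the value of k is illustrative, vectordb stands for the vector store built earlier, and retriever.invoke assumes a recent LangChain release):

    # Wrap the vector store as a retriever that returns the 3 most similar chunks.
    retriever = vectordb.as_retriever(search_kwargs={"k": 3})
    docs = retriever.invoke("What are the risk factors for OSCC?")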
Search Methods
SLIDE 24
This slide's code snippet uses a retriever with the maximal marginal
relevance (MMR) search strategy on the vector database (vectordb). This
approach aims to balance the relevance and diversity of the retrieved documents.
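A sketch of what such a retriever setup typically looks like in LangChain (the k and fetch_k values are illustrative, not taken from the project code):

    retriever = vectordb.as_retriever(
        search_type="mmr",                      # maximal marginal relevance
        search_kwargs={"k": 3, "fetch_k": 10},  # pick 3 diverse results from 10 candidates
    )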
SLIDE 25
Large Language Models (LLMs) are powerful AI models trained on massive amounts
of text data to understand and generate human language. Their key features make
them versatile tools for natural language processing tasks across different
domains and applications.
SLIDE 26