4. Information Extraction

Information extraction is the automated retrieval of specific information related to a selected topic from text. It can pull information from documents, databases, websites or multiple sources. Information extraction uses named entity recognition to find targeted information like locations, people or organizations. Once recognized, extraction utilities extract related details to create structured, machine-readable output for further analysis and meaning extraction. Current research also looks at multimedia content annotation and extraction.

Uploaded by

DEVARAJ P
Copyright
© All Rights Reserved

4. INFORMATION EXTRACTION

• Information extraction (IE) is the automated retrieval of specific information related to a selected topic from a body or bodies of text.
• Information extraction tools make it possible to pull information from text documents, databases, websites or multiple sources.
• IE may extract information from unstructured, semi-structured or structured, machine-readable text.
• IE is used in natural language processing (NLP) to extract structured data from unstructured text.
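
As a rough illustration of turning unstructured text into structured data, the following Python sketch pulls two kinds of fields out of a sentence with regular expressions. The patterns, the function name `extract_fields` and the sample sentence are assumptions made for this example only; practical IE systems rely on much richer rules or trained models.

```python
import re

# Hypothetical patterns chosen for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # ISO-style dates such as 2024-03-15


def extract_fields(text: str) -> dict:
    """Turn free text into a structured record of e-mail addresses and dates."""
    return {
        "emails": EMAIL_RE.findall(text),
        "dates": DATE_RE.findall(text),
    }


sample = "Contact alice@example.com before 2024-03-15 about the audit."
record = extract_fields(sample)
print(record)
```

The resulting dictionary is machine-readable, so later processing steps can work on named fields instead of raw text.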

Information extraction depends on:

• Named entity recognition (NER), a sub-tool used to find the targeted information to extract.
• NER first classifies each entity into one of several categories, such as location (LOC), person (PER) or organization (ORG).
• Once the information category is recognized, an information extraction utility extracts the named entity's related information and constructs a machine-readable document from it, which algorithms can further process to extract meaning.
• IE finds meaning by way of other subtasks, including co-reference resolution, relationship extraction, language and vocabulary analysis, and sometimes audio extraction.
• Current efforts in multimedia document processing include automatic annotation and content recognition; extraction from images and video can be seen as IE as well.
• Because of the complexity of language, high-quality IE is a challenging task for artificial intelligence (AI) systems.
