Semantic Graph Reduction Approach for Abstractive Text Summarization

Ibrahim F. Moawad
Information Systems Dept.
Faculty of Computer and Information Sciences
Ain Shams University
Cairo, Egypt
[email protected]

Mostafa Aref
Computer Science Dept.
Faculty of Computer and Information Sciences
Ain Shams University
Cairo, Egypt
[email protected]
Abstract— One of the important Natural Language Processing applications is Text Summarization, which helps users manage the vast amount of available information by condensing documents' content and extracting the most relevant facts or topics. Text Summarization can be classified according to the type of summary: extractive and abstractive. An extractive summary is produced by identifying important sections of the text and reproducing them verbatim, while an abstractive summary aims to convey the important material in a new, generalized form. In this paper, a novel approach is presented to create an abstractive summary for a single document using a rich semantic graph reducing technique. The approach summarizes the input document by creating a rich semantic graph for the original document, reducing the generated graph, and then generating the abstractive summary from the reduced graph. Besides, a simulated case study is presented to show how the original text was minimized to fifty percent.

Keywords- Text Summarization; Abstractive Summary; Semantic Representation; Rich Semantic Graph; Semantic Graph

In this paper, a novel approach is presented to generate an abstractive summary automatically for the input text using a semantic graph reducing technique. This approach exploits a new semantic graph called the Rich Semantic Graph (RSG) [3, 4]. The RSG is an ontology-based representation developed to be used as an intermediate representation for Natural Language Processing (NLP) applications. The new approach consists of three phases: creating a rich semantic graph for the source document, reducing the generated rich semantic graph to a more abstracted graph, and finally generating the abstractive summary from the abstracted rich semantic graph.

The paper is organized as follows. A brief background and related work are presented in Section II. Section III presents the proposed approach architecture, while Section IV describes and explains the approach phases. To illustrate how the approach works, and what its expected utility is, a simulated case study called "Graduate Students" is presented in Section V. Finally, Section VI concludes the paper.
semantic graph and then using the document and graph features … sentences) that have been retrieved from the text. For each generated triplet, they assign a set of features comprising linguistic, document, and graph attributes. They then train a linear Support Vector Machine classifier to determine the triplets that are useful for extracting the sentences which later compose the summary.

In their approach, Leskovec et al. aimed to create an extractive summary from the source document only, and hence they did not consider the abstractive summary. Besides, they use the semantic graph in its ordinary form to represent the input document; therefore, the generated graph will be very large, because the graph granularity level is high.

In this paper, a novel approach is presented to create an abstractive text summary automatically. This approach summarizes the input document by creating a rich semantic graph for the original document. The generated rich semantic graph enriches the traditional semantic graph by associating attributes with the graph nodes. After that, the approach reduces the generated rich semantic graph to a more abstracted graph, and then it generates the abstractive summary from the abstracted rich semantic graph.

[Figure: the proposed approach architecture, comprising Rich Semantic Graph Creation, Rich Semantic Graph Reduction (using the Domain Ontology and WordNet), and Summarized Text Generation.]
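The three-phase flow described above (Creation, Reduction, Generation) can be sketched as a toy pipeline. The triple-based graph, the single merge rule, and all names below are illustrative simplifications for exposition, not the authors' implementation:

```python
# Toy sketch of the three-phase pipeline (Creation -> Reduction -> Generation).
# The triple-based graph, the merge rule, and all names below are illustrative
# simplifications, not the authors' implementation.

def create_rsg(sentences):
    """Phase 1: represent each sentence as a (subject, verb, object) node triple."""
    return [tuple(s) for s in sentences]

def reduce_rsg(graph):
    """Phase 2: merge triples that share the same verb and object
    (a stand-in for the paper's heuristic reduction rules)."""
    merged = {}
    for subj, verb, obj in graph:
        merged.setdefault((verb, obj), []).append(subj)
    return [(" and ".join(subjs), verb, obj)
            for (verb, obj), subjs in merged.items()]

def generate_summary(graph):
    """Phase 3: realize each reduced triple as a surface sentence."""
    return " ".join(f"{s} {v} {o}." for s, v, o in graph)

rsg = create_rsg([("Angle Chris", "is", "a graduate student"),
                  ("John Michel", "is", "a graduate student")])
summary = generate_summary(reduce_rsg(rsg))
# summary: "Angle Chris and John Michel is a graduate student."
# (no subject-verb agreement fixing in this toy; in the approach itself,
#  surface realization handles grammar)
```

The point of the sketch is only the shape of the data flow: two sentence-level structures collapse into one abstracted structure, which is then realized as text.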
[Figure: a sample news passage about the U.N. trade embargo on Iraq, shown as example input text.]
module is responsible for accepting the input text and converting it to preprocessed sentences. The Rich Semantic Sub-graphs Generation module is responsible for transforming each preprocessed sentence into a set of ranked rich semantic sub-graphs. Finally, the Rich Semantic Graph Generation module is responsible for generating a set of ranked RSGs from the ranked semantic sub-graphs of the input. These RSGs represent different semantic representations of the whole document, of which the top-ranked RSG is considered.

1) The Preprocessing module: It consists of four main processes: named entity recognition, morphological and syntactic analysis, co-reference resolution, and pronominal resolution. The named entity recognition process locates atomic elements and assigns them to predefined categories such as person names, organizations, etc. In morphological analysis, each word is divided into morphemes and its grammatical categories are determined; the syntactic analysis parses the whole sentence to describe each word's syntactic function and build the parse tree; and typed dependencies express syntactic knowledge in terms of direct relationships between words. Finally, the co-reference and pronominal resolution processes identify co-referent named entities and resolve pronominal references in the whole input text. The preprocessing module has two main objectives: resolving the syntactic ambiguity and then retrieving both the set of tags (syntactic and morphological) and the typed dependency relations between words for the input text. For example, Fig. 3 shows the syntactic and morphological tags, and the typed dependency relations, for the "Sara is a graduate student." sentence using a syntactic analyzer tool called "lingsoft" [15] and a parser tool built by the Stanford University Natural Language Group [16].

2) The Rich Semantic Sub-graphs Generation module: The main objective of the Rich Semantic Sub-graphs Generation module is to generate multiple rich semantic sub-graphs for each input preprocessed sentence. Each preprocessed sentence is composed of a sequence of words: Si = [Wi1, Wi2, … Win], where Wij is word j belonging to sentence i. Each word is represented as a triple Wij = [St, T, D], where St represents the word stem, T represents the set of tags (morphological and syntactic), and D represents the set of typed dependency relations. This module includes three processes: Word Senses Instantiation, Concepts Validation, and Semantic Sentences Ranking.
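The per-word representation Wij = [St, T, D] can be sketched with a small data class. The dataclass shape is an illustrative assumption; the tag and dependency values are taken from the paper's Fig. 3 example for "Sara is a graduate student.":

```python
from dataclasses import dataclass, field

# Sketch of the per-word representation W_ij = [St, T, D]. The dataclass shape
# is an illustrative assumption; the tag and dependency values come from the
# paper's Fig. 3 example for "Sara is a graduate student."

@dataclass
class Word:
    """W_ij = [St, T, D]: word stem, tag set, and typed-dependency set."""
    stem: str
    tags: set = field(default_factory=set)   # morphological + syntactic tags
    deps: set = field(default_factory=set)   # typed dependencies involving the word

sentence = [  # S_i = [W_i1, ..., W_in]
    Word("sara",     {"N", "NOM", "SG", "@SUBJ"},     {"nsubj(student, Sara)"}),
    Word("be",       {"V", "PRES", "SG3", "VFIN"},    {"cop(student, is)"}),
    Word("a",        {"DET", "ART", "SG"},            {"det(student, a)"}),
    Word("graduate", {"A", "ABS"},                    {"amod(student, graduate)"}),
    Word("student",  {"N", "NOM", "SG", "@PCOMPL-S"}, set()),
]
```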
Morphological and syntactic tags:
• "sara" <*> <Proper> N NOM SG @SUBJ
• "be" <SV> <SVC/N> <SVC/A> V PRES SG3 VFIN @+FMAINV
• "a" <Indef> DET CENTRAL ART SG @DN>
• "graduate" A ABS @AN>
• "student" N NOM SG @PCOMPL-S

Typed dependencies:
• nsubj(student, Sara)
• cop(student, is)
• det(student, a)
• amod(student, graduate)

Figure 3. Example of syntactic and morphological tags, and typed dependency relations

• Semantic Sentences Ranking process: It aims to rank the rich semantic sub-graphs of each sentence and to threshold the highest-ranked ones. To generate a single rich semantic graph and to keep the semantic consistency of the whole sentence, the process considers only the first-ranked rich semantic sub-graph. The ranking method derives the average weight of each concept (word sense) and the average weight of the whole sentence's concepts using (1) and (2) respectively. The weight of a word concept is derived according to its usage popularity (WordNet usage popularity) [17]. In (1), n represents the WordNet usage popularity of the concept C, and N is the total number of senses of the concept's word. In (2), M represents the total number of concepts in a sentence. For example, in the "Sally is specialized in computer-science." sentence, the word "Sally" has only one concept (one sense), so its weight equals 10. The word "specialized" has three concepts (senses), whose weights equal 10, 7, and 6. The word "computer-science" has only one concept, and its weight equals 10. Based on these values, the output rank values of the sentence's rich semantic sub-graphs are 10, 9, and 8.6.

C_weight = 10 − 5 × ((n − 1) / N)        (1)

S_weight = (Σ_{m=1}^{M} C_weight_m) / M        (2)

3) The Rich Semantic Graph Generation module: Finally, the Rich Semantic Graph Generation module is responsible for generating the final rich semantic graphs of the whole input document from the highest-ranked rich semantic sub-graphs of the document sentences. The semantic sub-graphs of the input document are merged to form the final rich semantic graph.

B. The Rich Semantic Graph Reduction Phase

This phase aims to reduce the generated rich semantic graph of the original document to a more abstracted graph. In this phase, a set of heuristic rules is applied on the generated rich semantic graph to reduce it by merging, deleting, or consolidating the graph nodes. These rules exploit the WordNet semantic relations: hypernym, holonym, and entailment. Many rules can be derived based on many factors: the semantic relation, the graph node type (noun or verb), the similarity or dissimilarity between graph nodes, etc. Table I presents a set of heuristic rule examples that can be applied on the graph nodes of two simple sentences: Sen1 = [SN1, MV1, ON1] and Sen2 = [SN2, MV2, ON2]. Each sentence is composed of three nodes: a Subject Noun (SN) node, a Main Verb (MV) node, and an Object Noun (ON) node. For example, in rule 1, both main verbs (MV1 and MV2) are merged and both sentence objects (ON1 and ON2) are merged if the two sentence subjects are instances of the same noun (N), both sentence verbs are similar, and both sentence objects are similar.

TABLE I. REDUCTION HEURISTIC RULE EXAMPLES
[The rule conditions and actions in the table body are not recoverable from the extraction.]

C. The Summarized Text Generation Phase

This phase aims to generate the abstractive summary from the reduced Rich Semantic Graph (RSG) [18]. To achieve its task, the phase accesses the domain ontology, which contains the information needed, in the same domain as the RSG, to generate the final texts.
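The sentence-level ranking of equation (2) can be sketched directly: a sub-graph's weight is the average of its concept weights. The concept weights below are the ones reported in the paper's "Sally is specialized in computer-science." example; equation (1), which maps WordNet usage popularity to a concept weight, is applied upstream, so here the resulting weights are taken as given:

```python
# Sentence ranking per equation (2): a sub-graph's weight is the average of
# its concept weights. Concept weights are taken from the paper's
# "Sally is specialized in computer-science." example.

def sentence_weight(concept_weights):
    """S_weight = (sum of C_weight over M concepts) / M."""
    return sum(concept_weights) / len(concept_weights)

sally = 10.0                     # "Sally": a single sense
computer_science = 10.0          # "computer-science": a single sense
specialized = [10.0, 7.0, 6.0]   # "specialized": three senses

# One candidate sub-graph per sense of "specialized":
ranks = [sentence_weight([sally, w, computer_science]) for w in specialized]
# ranks[0] == 10.0, ranks[1] == 9.0, ranks[2] ≈ 8.67 (reported as 8.6 in the paper)
```

Only the top-ranked sub-graph (here the first, with weight 10) is kept for the sentence, which preserves semantic consistency across the document graph.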
Besides, the WordNet ontology is accessed to generate multiple texts according to the word synonyms. The generated multiple texts are evaluated and ranked, and the top-ranked text is considered. The text evaluation is performed according to two criteria: the most frequently used words and the discourse relations between sentences.

Fig. 4 shows the main modules composing the Summarized Text Generation phase, namely the Text Planning, the Sentence Planning, the Surface Realization, and the Evaluation modules. Firstly, the Text Planning module selects the appropriate content material to be expressed in the final text. Secondly, the Sentence Planning module specifies the sentence boundaries, and generates and orders intermediate paragraphs. Thirdly, the Surface Realization module generates grammatically corrected paragraphs. Finally, because multiple texts are generated, the Evaluation module evaluates the final texts based on the most frequently used words (using the WordNet ontology) and the relations between sentences.

1) The Text Planning module: It decides what information should be included in the generated text. In our approach, to preserve all the semantic information embedded in the input semantic representation (the Rich Semantic Graph), all graph objects (noun and verb objects) are passed to the Sentence Planning module.

2) The Sentence Planning module: It improves the fluency and understandability of the text. To achieve this objective, the words of the text should be related to each other, the clauses should exhibit no unintentional redundancy, and the different sentences with the same subject should be aggregated. The Sentence Planning module receives noun and verb objects and generates semi-paragraphs. The sentence planning consists of four main processes: Lexicalization, Discourse Structuring, Aggregation, and Referring Expression.

• Lexicalization Process: In this process, for each verb/noun object, its synonyms are selected by accessing the WordNet ontology to generate the target content. To select the most appropriate synonyms, a weight W is assigned to every synonym. This weight is calculated using (3), where E is the existence probability of the synonym in the input rich semantic graph, NR represents the synonym's WordNet rank, RT represents the total value of all synonym ranks, NGS represents the WordNet group-by-similarity of the synonym, and TG represents the total number of groups by similarity for all synonyms. According to experimental tests, the best weight values start from 8, so this value has been considered as a threshold: only word synonyms with a weight greater than or equal to 8 are selected.

W = ((E + (1 − NR/RT) + NGS/TG) / 3) × 10        (3)
[Figure 4: the Summarized Text Generation phase. Text Planning → (Set of Objects) → Sentence Planning → (Semi-Paragraphs) → Surface Realization (using the Domain Ontology and WordNet) → (Paragraphs) → Evaluation.]

• Discourse Structuring Process: It builds a suitable structure to contain the selected object synonyms in the form of pseudo-sentences (the first form of the generated sentences). Initially, the noun objects are sorted in descending order according to their number of attributes. For each noun object, a pseudo-sentence is composed for each attribute, and a pseudo-sentence is composed for each verb related to that object.

• Aggregation Process: It decides how pseudo-sentences are combined into semi-paragraphs. Two processes are applied: subject grouping and predicate grouping [19]. The subject grouping process is responsible for grouping clauses with common elements that share the same subject, while the predicate grouping process is responsible for grouping clauses with the same predicate. Using the domain ontology, the discourse relations are retrieved. The module uses the PDTB (Penn Discourse TreeBank) relations [20]. The discourse role of an object is defined in the input semantic graph as the discourse relation type and the argument span in which the object is located. Then, the relations retrieved from the domain ontology connect the pseudo-sentences with each other. The details of this process are very application dependent.
• Referring Expression Process: … into a list, and the process starts replacing the subject with the appropriate pronoun after leaving the first pseudo-sentence subject unchanged. The process restricts the replacement of the subject to every three pseudo-sentences, and then it starts again.

3) The Surface Realization module: This module aims to transform the enhanced semi-paragraphs into paragraphs by correcting them grammatically (inflecting words for tense, etc.) and adding the required punctuation (capitalization, adding semicolons, etc.). In the proposed approach, the techniques of SimpleNLG (simple natural language generation) [21] can be exploited to achieve these objectives.

4) The Evaluation module: The main objective of this module is to evaluate and then rank the paragraphs according to two factors: the coherence between paragraph sentences, and the most frequently used paragraph word synonyms. According to experimental tests, we have found that the coherence measure generates very close results, so the most frequently used paragraph word synonyms are used as an additional evaluation factor. Firstly, text coherence evaluation is applied to assess whether the paragraphs are coherent or not [22]. Therefore, each paragraph is evaluated and ranked according to the number of coherence relations between its sentences. Secondly, the most frequently used paragraph word synonyms are aggregated by accessing the WordNet rank. Finally, the final paragraphs can be sorted according to the coherence evaluation rank and then by the most frequently used paragraph word synonyms rank.

V. GRADUATE STUDENTS CASE STUDY

To show how the proposed approach works, a simulated case study called "Graduate Students" is presented in detail. Fig. 5 shows the input text, which consists of a single paragraph talking about two graduate students (Angle Chris and John Michel). It consists of 7 sentences and contains 53 words. After applying the Preprocessing, Rich Semantic Sub-graphs Generation, and Rich Semantic Graph Generation modules of the Rich Semantic Graph Creation phase, a rich semantic graph is created as shown in Fig. 6. The rich semantic graph nodes represent the instantiated objects of the domain ontology classes for the input text nouns and verbs. It contains 8 noun nodes representing the sentence subjects and objects, and 5 verb nodes (with gray background color) representing the sentence main verbs. For example, the "Mrs. Chris is specialized in Machine learning field." sentence is represented with the "Student 1", "Specialize 1", and "Field 1" nodes.

By applying the reduction heuristic rules on the rich semantic graph generated from the input text, the reduced graph shown in Fig. 7 is obtained. Initially, rule number 1 was fired and applied on the original semantic graph, where both "Student 1" and "Student 2" are instances of the Student noun class, both "Publish 1" and "Publish 2" are similar, and both "Research 1" and "Research 2" are similar. Therefore, both "Publish 1" and "Publish 2" were merged into "Publish 1", and both "Research 1" and "Research 2" were merged into "Research 1". After that, rule number 4 was fired and applied, where both "Student 1" and "Student 2" are instances of the Student noun class, both "Specialize 1" and "Specialize 2" verbs are similar, and both "Field 1" and "Field 2" objects are instances of subclasses of the same super-class. Therefore, both "Specialize 1" and "Specialize 2" verbs were merged into "Specialize 1", and both "Field 1" and "Field 2" objects were replaced and merged into "Field 3", which has the more abstracted value ("Artificial Intelligence").

[Figure 5 input text:]
"Angle Chris is a graduate student. Mrs. Chris is specialized in Machine learning field. John Michel is a graduate student. He is specialized in Intelligent Agents field. During his study, Mr. Michel passed the preparatory courses. Angle Chris published two papers in international conferences. Also, John Michel published two papers in international conferences."

Figure 5. The original text of the graduate students example

[Figure 6 nodes:]
• Student 1 (Name: Angle Chris; Level: Graduate; Type: Singular)
• Specialize 1 (Tense: Present; Agent: Student 1; Object: Field 1)
• Field 1 (Value: Machine learning; Type: Singular)
• Publish 1 (Tense: Past; Agent: Student 1; Object: Research 1; Location: international conferences)
• Research 1 (Value: Scientific Paper; Adjective: Two; Type: Plural)
• Student 2 (Name: John Michel; Level: Graduate; Type: Singular)
• Specialize 2 (Tense: Present; Agent: Student 2; Object: Field 2)
• Field 2 (Value: Intelligent Agents; Type: Singular)
• Pass 1 (Tense: Past; Agent: Student 2; Object: Course 2; Time: during study)
• Course 2 (Value: Preparatory; Type: Plural)
• Publish 2 (Tense: Past; Agent: Student 2; Object: Research 2; Location: international conference)
• Research 2 (Value: Scientific Paper; Adjective: Two; Type: Plural)

Figure 6. The rich semantic graph of the graduate students original text
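The rule firing described in this case study, where similar verbs are merged and their objects are generalized to a shared super-class, can be sketched as follows. The mini-ontology and the triple shapes below are illustrative assumptions, not the paper's data structures:

```python
# Toy sketch of reduction rule 4 as fired in the case study: when two subjects
# are instances of the same noun class, the verbs are similar, and the objects
# fall under a common super-class, merge the verbs and replace the objects with
# the abstracted value. The mini-ontology below is an illustrative assumption.

SUPER_CLASS = {
    "Machine learning": "Artificial Intelligence",
    "Intelligent Agents": "Artificial Intelligence",
}

def rule4(sen1, sen2, noun_class_of):
    (sn1, mv1, on1), (sn2, mv2, on2) = sen1, sen2
    sup1, sup2 = SUPER_CLASS.get(on1), SUPER_CLASS.get(on2)
    if (noun_class_of[sn1] == noun_class_of[sn2]    # same noun class
            and mv1 == mv2                          # similar verbs
            and sup1 is not None and sup1 == sup2): # common super-class
        return (sn1 + ", " + sn2, mv1, sup1)        # merged, abstracted triple
    return None

noun_class = {"Student 1": "Student", "Student 2": "Student"}
merged = rule4(("Student 1", "Specialize", "Machine learning"),
               ("Student 2", "Specialize", "Intelligent Agents"),
               noun_class)
# merged -> ("Student 1, Student 2", "Specialize", "Artificial Intelligence")
```

Rule 1 from the case study (merging similar Publish/Research nodes) has the same shape, except that the objects are required to be similar rather than siblings under a super-class.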
Finally, the Text Planning, Sentence Planning, Surface Realization, and Evaluation modules of the Summarized Text Generation phase are applied on the reduced Rich Semantic Graph to generate the abstractive summary shown in Fig. 8. As shown in the summary, the "Angle Chris and John Michel are graduate students." sentence was composed from both the "Student 1" and "Student 2" nodes, and the "They are specialized in Artificial Intelligence field." sentence was composed from the "Student 1", "Student 2", "Specialize 1", and "Field 3" nodes. The final summary consists of 4 sentences and contains 29 words. The final abstractive text represents about 50% of the original text.

[Figure 7 nodes:]
• Student 1 (Name: Angle Chris; Level: Graduate; Type: Singular)
• Specialize 1 (Tense: Present; Agent: Student 1, 2; Object: Field 3)
• Field 3 (Value: Artificial Intelligence; Type: Singular)
• Publish 1 (Tense: Past; Agent: Student 1, 2; Object: Research 1; Location: international conferences)
• Research 1 (Value: Scientific Paper; Adjective: Two; Type: Plural)
• Student 2 (Name: John Michel; Level: Graduate; Type: Singular)
• Pass 1 (Tense: Past; Agent: Student 2; Object: Course 2; Time: during study)
• Course 2 (Value: Preparatory; Type: Plural)

Figure 7. The reduced rich semantic graph

[Figure 8 summary text:]
"Angle Chris and John Michel are graduate students. They are specialized in Artificial Intelligence field. They published two papers in international conferences. During study, John Michel passed Preparatory courses."

Figure 8. The graduate students summarized text

VI. CONCLUSION

In conclusion, a novel approach to create an abstractive summary for a single document using a semantic graph reducing technique was presented in this paper. The approach summarizes the source document by creating a semantic graph, called the Rich Semantic Graph, for the original document, reducing the generated semantic graph to a more abstracted graph, and generating the abstractive summary from the reduced graph. A case study showed that the proposed approach succeeded in minimizing the original text to fifty percent. In future work, we are going to develop a prototype to conduct several more case studies using documents of different sizes, and hence assess the results of our work properly.

REFERENCES
[1] E. Lloret, M. Palomar, "Text summarisation in progress: a literature review", Artificial Intelligence Review, Vol. 37, No. 1, pp. 1-41, 2012.
[2] D. Das, A. Martins, "A Survey on Automatic Text Summarization", unpublished literature survey for Language and Statistics II, Carnegie Mellon University, 2007.
[3] M. Aref, I. Moawad, S. Ibrahim, "Rich Semantic Graph Generation System Prototype", The Tenth Conference on Language Engineering, Cairo, Egypt, 2010.
[4] I. Moawad, M. Aref, S. Ibrahim, "Ontology-based Model for Generating Text Semantic Representation", International Journal of Intelligent Computing and Information Sciences (IJICIS), Vol. 11, No. 1, pp. 117-128, January 2011.
[5] D. Radev, E. Hovy, K. McKeown, "Introduction to the Special Issue on Summarization", Computational Linguistics, Vol. 28, No. 4, pp. 399-408, 2002.
[6] K. Svore, L. Vanderwende, C. Burges, "Enhancing single-document summarization by combining RankNet and third-party sources", in Proceedings of EMNLP-CoNLL, pp. 448-457, 2007.
[7] D. Evans, K. McKeown, J. Klavans, "Similarity-based Multilingual Multi-Document Summarization", Technical Report CUCS-014-05, Department of Computer Science, Columbia University, April 2005.
[8] A. Stergos, K. Vangelis, S. Panagiotis, "Summarization from medical documents: a survey", Artificial Intelligence in Medicine, Vol. 33, No. 2, pp. 157-177, 2005.
[9] J. Leskovec, M. Grobelnik, N. Milic-Frayling, "Learning Sub-structures of Document Semantic Graphs for Document Summarization", in KDD 2004 Workshop on Link Analysis, 2004.
[10] J. Leskovec, M. Grobelnik, N. Milic-Frayling, "Learning Semantic Graph Mapping for Document Summarization", 2000.
[11] J. Leskovec, M. Grobelnik, N. Milic-Frayling, "Extracting Summary Sentences Based on the Document Semantic Graph", Microsoft Research, 2005.
[12] D. Rusu, B. Fortuna, M. Grobelnik, D. Mladenić, "Semantic Graphs Derived From Triplets With Application In Document Summarization", International Journal of Computing and Informatics, Vol. 33, No. 3, 2009.
[13] C. Fellbaum, "WordNet: An Electronic Lexical Database", MIT Press, 1998.
[14] G. Miller, R. Beckwith, C. Fellbaum, D. Gross, K. Miller, "Five Papers on WordNet", Cognitive Science Laboratory, Princeton University, Princeton, 1990.
[15] ENGCG: Constraint Grammar Parser of English, http://www2.lingsoft.fi/cgi-bin/engcg, accessed June 15, 2012.
[16] Stanford Parser, http://nlp.stanford.edu:8080/parser/index.jsp, accessed June 15, 2012.
[17] A. Sharaf, "An Object-Oriented Model for Semantic Analysis of Natural Languages", Master's Thesis, Information and Computer Science Dept., King Fahd University of Petroleum and Minerals, Saudi Arabia, January 2001.
[18] I. Fathy, D. Fadl, M. Aref, "Rich Semantic Representation Based Approach for Text Generation", The 8th International Conference on Informatics and Systems (INFOS 2012), Egypt, 2012.
[19] H. Dalianis, E. Hovy, "Aggregation in Natural Language Generation", in Proceedings of the 4th European Workshop on Natural Language Generation (EWNLG-93), Pisa, Italy, 1993.
[20] R. Prasad, N. Dinesh, A. Lee, E. Miltsakaki, L. Robaldo, A. Joshi, B. Webber, "The Penn Discourse TreeBank 2.0", in Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Morocco, 2008.
[21] A. Gangemi, R. Navigli, P. Velardi, "The OntoWordNet Project: Extension and Axiomatization of Conceptual Relations in WordNet", in Proc. of the International Conference on Ontologies, Databases and Applications of Semantics (ODBASE 2003), Catania, Italy, pp. 820-838, 2003.
[22] Z. Lin, H. Ng, M. Kan, "Automatically Evaluating Text Coherence Using Discourse Relations", in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT 2011), Portland, Oregon, USA, 2011.