FGGAN: Feature-Guiding Generative Adversarial Networks for Text Generation
ABSTRACT Text generation is a fundamental task in natural language processing and plays an important role in dialogue systems and machine translation. As a deep learning framework, Generative Adversarial Networks (GAN) have been widely used in text generation. In combination with reinforcement learning, GAN uses the output of the discriminator as the reward signal of reinforcement learning to guide generator training, but this reward signal is a scalar and its guidance is weak. This paper proposes a text generation model named Feature-Guiding Generative Adversarial Networks (FGGAN). To solve the problem of insufficient feedback guidance from the discriminator network, FGGAN uses a feature guidance module to extract text features from the discriminator network, convert them into feature guidance vectors, and feed them into the generator network for guidance. In addition, during text generation, sampling is required to complete a partial sequence before it can be fed into the discriminator to obtain a feedback signal. However, the randomness and insufficiency of the sampling method lead to poor quality of the generated text. This paper formulates text semantic rules to restrict the tokens allowed at the next time step of sequence generation and to remove semantically unreasonable tokens, improving the quality of the generated text. Finally, text generation experiments are performed on different datasets, and the results verify the effectiveness and superiority of FGGAN.
INDEX TERMS Generative adversarial networks, text generation, deep learning, reinforcement learning.
I. INTRODUCTION
Generative Adversarial Networks (GAN) [1] have gradually developed into a hot research topic in deep learning since they were proposed. As a generative model, their main application field is image generation, but GAN also has great research potential in text generation. Poetry writing, dialogue systems and machine translation are all based on text generation. Although some progress has been made, the quality of the generated text is often poor, or limited to specific domains and lacking generality.

The combination of GAN and reinforcement learning (RL) for text sequence generation has become one of the hotspots of research. Sequence GAN (SeqGAN), proposed by Yu et al. [2], was the first to feed the discriminator output to the generator as the reward signal of reinforcement learning, guiding the decisions made when generating sequences. GAN combined with reinforcement learning has achieved remarkable results in the field of text generation, but this structure has the following drawbacks:
1) The probability output by the discriminator of a sample being positive (real) or negative (generated) is used as the reward signal in reinforcement learning. This feedback signal is a scalar and cannot preserve the high-level semantic information of the text, so the generator lacks a clear training direction.
2) During text sequence generation, sampling is required to complete the sequence and obtain a feedback signal from the discriminator. Due to the limited number of samples, the sampling process is highly random and inadequate, and may result in semantically unreasonable sequences, with problems such as subject repetition and verb deletion.
Based on these problems of text generation, we propose a text generation algorithm based on Feature-Guiding Generative Adversarial Networks (FGGAN). The improved algorithm effectively addresses the shortcomings of existing text generation models. The main contributions of this paper are as follows:
1) Because the output probability of the discriminator is a scalar, the training direction of the generator is not clear. In this paper, a feature guidance module is introduced to transform the high-order text features extracted from the discriminator and feed them into the generator network as feedback guidance. The feedback signal is transformed into a guidance vector carrying more guidance information, which improves the sequences produced by the generator.
2) Because of the randomness and inadequacy of sequence sampling, semantically invalid tokens may be generated during sequence generation. This paper proposes a method to create a vocabulary mask based on semantic rules, which restricts the tokens that can be generated at the next time step. Candidate tokens with low correlation to the generated prefix sequence are removed, making the generated sequence more realistic.

II. RELATED WORK
Text generation is the task of simulating and generating sequence data. Most text generation approaches are based on the Recurrent Neural Network (RNN). The Long Short-Term Memory (LSTM) network proposed by Hochreiter and Schmidhuber [3] has been widely used, and Wen et al. [4] used LSTM to build a natural language dialogue system. In 2014, Goodfellow proposed the generative adversarial network. Compared with a single generative model, GAN is markedly more effective for data generation. GAN trains and optimizes a generator and a discriminator in an adversarial way and finally reaches a Nash equilibrium, effectively learning the data distribution.

GAN has been successfully applied in the image domain to generate realistic images. Chang et al. [5] proposed KG-GAN, which sets up multiple generators, one of which is responsible for learning information from a prior knowledge domain and directing the learned knowledge to another generator so as to generate diverse image data. Its disadvantages are that prior domain knowledge is needed and that the learned diversity is not easily accepted by the discriminator. Lian et al. [6] proposed FG-SRGAN for high-resolution image generation, mainly by setting up a guidance module that learns the mapping from low-resolution to high-resolution images, improving the quality of the images produced by the generator.

However, due to the discreteness of text data, the original GAN cannot optimize the generator parameters through gradient backpropagation. Arjovsky and Bottou [7] analyzed GAN training methods, such as using the Wasserstein distance instead of the traditional JS divergence, to train GAN in the field of text generation. In addition, Che et al. [8] proposed the maximum-likelihood augmented discrete GAN (MaliGAN) and designed training techniques that directly estimate the difference between the generated data distribution and the real data distribution. Zhang et al. [9] completed training by adjusting the generator so that the generated samples have the same characteristics as the real samples. Su et al. [10] made some achievements in dialogue generation using GAN combined with a hierarchical recurrent encoder-decoder. Fedus et al. [11] introduced an actor-critic conditional GAN and produced more realistic text samples.

GAN can also be combined with reinforcement learning. The generator can be seen as a decision maker, and the network can be optimized using a policy gradient. Reinforcement learning requires reward signals as feedback; at each time step, complete sequences can be sampled using Monte Carlo Tree Search (MC Search) [12] and fed into the discriminator to obtain a reward signal. Yu et al. proposed SeqGAN, which for the first time used the discriminator output as the reward for reinforcement learning. Nie et al. [13] proposed Relational GAN (RelGAN), which uses Gumbel-Softmax to train GAN on discrete data and multiple embedded representations in the discriminator to provide a more informative signal. Lin et al. [14] proposed RankGAN, which replaces the original binary classifier with a ranking model based on cosine similarity to make the feedback of the discriminator more continuous. However, these models still share the disadvantage that the output of the discriminator used as the feedback signal is a scalar with weak guidance. In addition, the randomness of the sampling process may prevent the network from learning the implicit semantic information, resulting in unrealistic generated data. In this paper, FGGAN is proposed to address the shortcomings of existing models, and the effectiveness of the improved modules is verified by experiments.

III. ALGORITHM
This paper adopts the overall framework of GAN and puts forward improvements aimed at the existing defects. Firstly, in related algorithms the output scalar of the discriminator is used as the feedback signal, which leads to an unclear training direction for the generator. In this paper, a feature guidance module is introduced to transform the high-order text features extracted from the discriminator and feed them into the generator network as feedback guidance. In addition, this paper proposes a method to create a vocabulary mask based on semantic rules, which restricts the tokens generated at the next time step during sequence generation. The improved model is named Feature-Guiding GAN, and the overall network structure is shown in Fig. 1.

The overall structure of the GAN is divided into a generator Gθ and a discriminator Dφ. The objective of the generator is to find the parameters of the optimal distribution probability of the data. However, the parameter update does not originate from the data samples, but from the back-propagated gradient of the discriminator. The model is trained via the adversarial strategy, and the generator and the discriminator are alternately optimized.

The generator, in the left activity box of Fig. 1, can be regarded as performing a text generation task: the objective of the generator G is to predict the next token based on the tokens generated at previous time steps.
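To make the alternating optimization described above more concrete, the following minimal Python skeleton shows the control flow of such adversarial training. All components here (random token sequences, stub scoring and update functions) are illustrative placeholders, not the paper's actual generator and discriminator.

```python
# Minimal sketch of the alternating adversarial training scheme described above.
# All components are toy placeholders; only the control flow is illustrated.
import random

def generator_sample(batch_size, seq_len, vocab_size):
    """Placeholder generator: samples random token sequences."""
    return [[random.randrange(vocab_size) for _ in range(seq_len)]
            for _ in range(batch_size)]

def discriminator_score(sequence):
    """Placeholder discriminator: probability that a sequence is real."""
    return random.random()

def update_discriminator(real_batch, fake_batch):
    """Placeholder update of D: learn to separate real from generated sequences."""
    pass

def update_generator(fake_batch, rewards):
    """Placeholder policy-gradient update of G using the rewards."""
    pass

vocab_size, seq_len, batch_size = 1000, 20, 8
real_batch = generator_sample(batch_size, seq_len, vocab_size)  # stand-in for corpus data

for step in range(3):
    # Generator step: sample sequences, score them with D, use the scores as rewards.
    fake_batch = generator_sample(batch_size, seq_len, vocab_size)
    rewards = [discriminator_score(seq) for seq in fake_batch]
    update_generator(fake_batch, rewards)
    # Discriminator step: train D on real vs. generated sequences.
    update_discriminator(real_batch, fake_batch)
```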
In the feature guidance module, the feature-guiding vector w_t is combined with the current time-step vector X_t of the text generation module and sent to the softmax layer to determine the next token:

$$z_t = \mathrm{softmax}(X_t\, w_t) \quad (14)$$
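As a rough illustration of equation (14), the sketch below combines a generator output vector with a feature-guiding vector obtained by a linear transformation of discriminator features (the Discussion section mentions such a linear transformation of the discriminator's CNN features). The elementwise product, the dimensions, and the random values are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

vocab_size, feat_dim = 1000, 64

# X_t: the generator's current time-step vector over the vocabulary (placeholder values).
X_t = np.random.randn(vocab_size)

# f_t: text features extracted from the discriminator (e.g. pooled CNN feature maps),
# mapped by an assumed linear transformation W into the feature-guiding vector w_t.
f_t = np.random.randn(feat_dim)
W = np.random.randn(vocab_size, feat_dim) * 0.01
w_t = W @ f_t

# Equation (14), assuming an elementwise combination of X_t and w_t before the softmax.
z_t = softmax(X_t * w_t)
next_token = int(np.argmax(z_t))
print(next_token, z_t.sum())
```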
B. TEXT SEMANTIC RULES FOR RESTRICTING GENERATION
Sequence generation proceeds token by token. At an intermediate time step the sequence is not yet complete, but the reward of the token generated at this step still needs to be evaluated. Because the discriminator can only accept complete sequences as input and Monte Carlo sampling can complete an unfinished sequence, a large number of sampling operations are needed to fill in the incomplete sequence at intermediate time steps. The completed sequences are then fed into the discriminator to determine the reward of the current token, and subsequent generation is carried out according to this feedback guidance. Due to the limited number of samples, Monte Carlo sampling cannot fully traverse the vocabulary space, so it is highly random. As a result, sequences with unreasonable semantics, such as subject repetition and verb deletion, may be generated.
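To make the rollout procedure concrete, the sketch below estimates the reward of a partially generated prefix by sampling several Monte Carlo completions and averaging the discriminator scores. The random completion policy and the stub discriminator are placeholders; in the actual model the generator itself would produce the completions and a trained discriminator would score them.

```python
import random

def complete_sequence(prefix, seq_len, vocab_size):
    """Placeholder rollout policy: fill the rest of the sequence with random tokens."""
    return prefix + [random.randrange(vocab_size) for _ in range(seq_len - len(prefix))]

def discriminator_score(sequence):
    """Placeholder discriminator: probability that the sequence is real."""
    return random.random()

def rollout_reward(prefix, seq_len, vocab_size, num_rollouts=16):
    """Average discriminator score over several sampled completions of the prefix."""
    scores = [discriminator_score(complete_sequence(prefix, seq_len, vocab_size))
              for _ in range(num_rollouts)]
    return sum(scores) / num_rollouts

# Reward signal for the token chosen at time step 3 of a length-20 sequence.
prefix = [17, 52, 8]
print(rollout_reward(prefix, seq_len=20, vocab_size=1000))
```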
This paper proposes a method to create a vocabulary mask based on semantic rules, which restricts the tokens that can be generated at the next time step. The specific method is to preprocess the real dataset according to the semantic rules to obtain a mask vector for each token in the vocabulary. The mask vector represents relationships between tokens, such as part of speech and similarity; it encodes which tokens should be restricted in subsequent generation when the given token appears at the current step. The dimension of the mask vector is equal to the vocabulary size. If the value in a dimension is 0, the token corresponding to that dimension has very low correlation with the generated prefix sequence and should be masked in subsequent generation. The purpose of the mask vector is to eliminate candidate tokens that do not conform to the current context, so that the generation process can learn the implicit semantic structure and the quality of the final generated sequence improves.

Assuming that the vocabulary size is n and the sequence length is m, the sampling space is of size n^m. At the current time step, given the already generated prefix sequence, the sampling space is still of size n (the vocabulary size). However, many of these candidates do not conform to the current semantics, and such tokens should be excluded from subsequent generation. Although a text sequence has a complex semantic structure, conditioned on the generated prefix sequence, the frequency of occurrence of subsequent tokens follows a non-random probability distribution. We define rules in terms of word similarity. Each token in the text sequence can be encoded into a word vector in the word embedding layer. Word2vec [15] is a model used to generate the word vectors; it is a shallow two-layer neural network trained to reconstruct tokens. After training, the word2vec model maps each token to a vector, which can be used to represent the relationships between tokens. This paper uses the cosine distance to characterize the similarity between two tokens. Each token in the vocabulary is traversed to find the tokens with high subsequent relevance in the real dataset, so as to restrict the subsequent generation process. For the current token word_i, the cosine distances are calculated between word_next, which denotes the k tokens that appear next in the real data sequences, and all tokens in the vocabulary. For every token in the vocabulary, if all k computed similarities are less than the threshold Th_sim, the corresponding value in the mask vector is set to 0; otherwise, it is set to 1. Thus the vector Mask_i that corresponds to each token is obtained, as shown in Fig. 3 and equations (15)-(16).

FIGURE 3. The mask vector that corresponds to each token.

$$W_k^{sim_n} = \begin{cases} 0, & \mathrm{Sim}(W_k, Voca_n) < Th_{sim} \\ 1, & \text{otherwise} \end{cases} \quad (15)$$

$$Mask_i = \bigcup_{j=1}^{k} Mask_{next_j} \quad (16)$$

Through experimental testing, the parameter k is set to 5 and Th_sim is set to 0.6. The vector similarity formula is shown in (17):

$$\mathrm{similarity} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\, \sqrt{\sum_{i=1}^{n} B_i^2}} \quad (17)$$

where n is the word vector dimension.
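The sketch below shows one way the per-token mask vectors of equations (15) and (16) could be built from word vectors and cosine similarity. The toy vocabulary, the random vectors standing in for trained word2vec embeddings, and the way the k next-tokens are collected are all assumptions for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    # Equation (17): cosine similarity between two word vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

vocab = ["the", "cat", "sat", "dog", "ran"]        # toy vocabulary
dim = 8
vecs = {w: np.random.randn(dim) for w in vocab}    # stand-in for word2vec embeddings
k, th_sim = 5, 0.6                                 # parameter values reported in the paper

def build_mask(next_tokens):
    """Mask vector for one vocabulary token (equations (15)-(16)).
    A candidate keeps value 1 if it is similar enough (>= th_sim) to at least one of
    the k tokens observed to follow the current token in the real data; otherwise 0."""
    next_tokens = next_tokens[:k]
    mask = np.zeros(len(vocab))
    for j, cand in enumerate(vocab):
        if any(cosine_sim(vecs[cand], vecs[nt]) >= th_sim for nt in next_tokens):
            mask[j] = 1.0
    return mask

# Tokens observed after "cat" in a (toy) real corpus.
mask_cat = build_mask(next_tokens=["sat", "ran"])
print(mask_cat)
```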
When generating at each time step, the latest m tokens of the generated prefix sequence are considered; their masks are Mask_i, i ∈ [1, m]. The final mask is obtained by ORing the mask vectors of the latest m tokens at the current time step, as shown in (18):

$$Mask_T = \bigcup_{i=1}^{m} Mask_{T-i} \quad (18)$$

Through experimental testing, the parameter m is set to 4. Finally, Mask_T is used to mask the token selection probability at the current time step. Combining the mask vector Mask_T and the feature-guiding vector w_t from the feature guidance module, the current time-step vector X_t of the text generation module is sent to the softmax layer to determine the next token. The probability of the next token is given by equation (19) and illustrated in Fig. 4:

$$z_t = \mathrm{softmax}(X_t\, w_t\, Mask_T) \quad (19)$$
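As a small illustration of equations (18) and (19), the following sketch ORs the mask vectors of the latest m generated tokens and applies the result, together with the feature-guiding vector, to the next-token softmax. The random placeholder vectors and the elementwise products are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

vocab_size, m = 1000, 4                               # m = 4 as reported in the paper
# Precomputed per-token mask vectors (placeholder random 0/1 values).
token_masks = np.random.randint(0, 2, size=(vocab_size, vocab_size)).astype(float)

def next_token_distribution(X_t, w_t, generated_tokens):
    # Equation (18): OR the mask vectors of the latest m generated tokens.
    mask_T = np.zeros(vocab_size)
    for tok in generated_tokens[-m:]:
        mask_T = np.maximum(mask_T, token_masks[tok])
    # Equation (19): z_t = softmax(X_t * w_t * Mask_T), assuming elementwise products.
    return softmax(X_t * w_t * mask_T)

X_t = np.random.randn(vocab_size)   # current time-step vector (placeholder)
w_t = np.random.randn(vocab_size)   # feature-guiding vector (placeholder)
z_t = next_token_distribution(X_t, w_t, generated_tokens=[3, 17, 52, 8])
print(int(np.argmax(z_t)))
```

Note that equation (19), as written, multiplies the mask inside the softmax argument; a common alternative in practice is to add a large negative value to masked logits so that masked tokens receive essentially zero probability.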
TABLE 2. Experimental results of NLL on synthetic data.

TABLE 3. COCO dataset BLEU score comparison.

Each poem contains five or seven Chinese words per sentence. Among them, 4000 poems are used as the training set, 4000 are used as the test set, and BLEU-2 is used as the evaluation metric.
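For reference, the BLEU-2 score used in these experiments can be computed, for example, with NLTK's sentence-level BLEU; the sentences below are made-up toy examples, and the smoothing function is an implementation detail not discussed in the paper.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]   # tokenized reference text(s)
candidate = ["the", "cat", "sat", "on", "a", "mat"]       # tokenized generated text

# BLEU-2: uniform weights over 1-grams and 2-grams.
score = sentence_bleu(reference, candidate, weights=(0.5, 0.5),
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 4))
```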
V. CONCLUSION
For text sequence generation, this paper proposes an improved framework, FGGAN. To solve the problem that the feedback signal of the discriminator is not very instructive, this paper proposes a feature guidance module which feeds text semantic features back to the generator with more guidance information. In addition, this paper proposes a method to create a vocabulary mask based on semantic rules, which restricts the candidate tokens during generation to make the sequences more realistic. The superiority of the improved modules is evaluated experimentally. In the synthetic experiment, the negative log-likelihood is used for evaluation, and the FGGAN proposed in this paper fits the data distribution better. In the experiments on real data, BLEU is used for evaluation; compared with other models, FGGAN achieves higher evaluation scores and generates more realistic text data.
VI. DISCUSSION
Compared with the baseline algorithms, the FGGAN proposed in this paper shows some improvements on the relevant datasets, but several problems still need further work. Firstly, the feature guidance module extracts text features from the discriminator and, after transformation, sends them to the text generation module for guidance. However, because the linear transformation may not be able to adapt to the rapidly changing feature space of the discriminator CNN, the text generation module may not learn advanced semantic features. Further research on the extraction and transformation of text features is needed. Secondly, in the process of using semantic rules to restrict sampling, the mask vector is obtained by preprocessing the dataset according to the semantic rules. However, an overly complex rule restriction can cause the neural network to mode-collapse, resulting in duplicate text sequences. Subsequent work on semantic rule optimization needs to be carried out.
REFERENCES
[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Proc. Adv. Neural Inf. Process. Syst., New York, NY, USA, 2014, pp. 2672–2680.
[2] L. Yu, W. Zhang, J. Wang, and Y. Yu, "SeqGAN: Sequence generative adversarial nets with policy gradient," in Proc. AAAI, San Francisco, CA, USA, 2017, pp. 2852–2858.
[3] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[4] T. H. Wen, M. Gasic, and N. Mrksic, "Semantically conditioned LSTM-based natural language generation for spoken dialogue systems," Comput. Sci., vol. 3, no. 17, pp. 144–152, 2015.
[5] C.-H. Chang, C.-H. Yu, S.-Y. Chen, and E. Y. Chang, "KG-GAN: Knowledge-guided generative adversarial networks," 2019, arXiv:1905.12261. [Online]. Available: http://arxiv.org/abs/1905.12261
[6] S. Lian, H. Zhou, and Y. Sun, "FG-SRGAN: A feature-guided super-resolution generative adversarial network for unpaired image super-resolution," in Proc. ISNN, Moscow, Russia, 2019, pp. 151–161.
[7] M. Arjovsky and L. Bottou, "Towards principled methods for training generative adversarial networks," in Proc. ICLR, Toulon, France, 2017, pp. 124–131.
[8] T. Che, Y. Li, R. Zhang, R. Devon Hjelm, W. Li, Y. Song, and Y. Bengio, "Maximum-likelihood augmented discrete generative adversarial networks," 2017, arXiv:1702.07983. [Online]. Available: http://arxiv.org/abs/1702.07983
[9] Y. Zhang, Z. Gan, K. Fan, Z. Chen, R. Henao, D. Shen, and L. Carin, "Adversarial feature matching for text generation," in Proc. ICML, Sydney, NSW, Australia, 2017, pp. 4006–4015.
[10] H. Su, X. Shen, P. Hu, W. Li, and Y. Chen, "Dialogue generation with GAN," in Proc. AAAI, New Orleans, LA, USA, 2018, pp. 8163–8164.
[11] W. Fedus, I. Goodfellow, and A. M. Dai, "MaskGAN: Better text generation via filling in the ______," in Proc. ICLR, Vancouver, BC, Canada, 2018, pp. 1–17.
[12] C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, "A survey of Monte Carlo tree search methods," IEEE Trans. Comput. Intell. AI Games, vol. 4, no. 1, pp. 1–43, Mar. 2012.
[13] C. Zhang, C. Xiong, and L. Wang, "A research on generative adversarial networks applied to text generation," in Proc. 14th Int. Conf. Comput. Sci. Edu. (ICCSE), New Orleans, LA, USA, Aug. 2019, pp. 1268–1288.
[14] K. Lin, D. Li, X. He, Z. Zhang, and M.-T. Sun, "Adversarial ranking for language generation," in Proc. NIPS, Long Beach, CA, USA, 2017, pp. 3155–3165.
[15] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," 2013, arXiv:1301.3781. [Online]. Available: http://arxiv.org/abs/1301.3781
[16] X. Chen, H. Fang, T.-Y. Lin, R. Vedantam, S. Gupta, P. Dollar, and C. L. Zitnick, "Microsoft COCO captions: Data collection and evaluation server," 2015, arXiv:1504.00325. [Online]. Available: http://arxiv.org/abs/1504.00325
[17] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "BLEU: A method for automatic evaluation of machine translation," in Proc. 40th Annu. Meeting Assoc. Comput. Linguistics (ACL), Philadelphia, PA, USA, 2002, pp. 311–318.

YANG YANG was born in 1981. She received the Ph.D. degree from the Beijing University of Posts and Telecommunications, in 2011. She is currently an Associate Professor. She has published more than 30 articles in SCI/EI-indexed journals and has applied for one international standard. She is also a reviewer for the journals Sensors and the International Journal of Distributed Sensor Networks and for the ICC conference. Her current research interests are big data analysis and trend depth prediction. She is the Session Chair of the CENET2018 International Conference on Internet of Things and Big Data Analysis.

XIAODONG DAN was born in 1994. He received the bachelor's degree from Xidian University, in 2017. He is currently pursuing the master's degree in computer science with the Institute of Network Technology, Beijing University of Posts and Telecommunications. His main research interests include big data analysis and machine learning.

XUESONG QIU was born in 1973. He received the Ph.D. degree from the Beijing University of Posts and Telecommunications, Beijing, China, in 2000. He is currently a Professor and a Ph.D. Supervisor. He has authored about 100 SCI/EI-indexed articles. He presides over a series of key research projects on network and service management, including projects supported by the National Natural Science Foundation and the National High-Tech Research and Development Program of China. He has received 13 national and provincial scientific and technical awards, including the national scientific and technical award (second class) twice.

ZHIPENG GAO was born in 1980. He received the Ph.D. degree from the Beijing University of Posts and Telecommunications, Beijing, China, in 2007. He is currently a Professor and a Ph.D. Supervisor. He presides over a series of key research projects on network and service management, including projects supported by the National Natural Science Foundation and the National High-Tech Research and Development Program of China. He has received eight provincial scientific and technical awards.