Human Computer Interaction and Robotics
In the field of artificial intelligence, human–computer interaction (HCI) technology and its related intelligent robot technologies are essential and interesting research topics. From the perspectives of software algorithms and hardware systems, these technologies study and attempt to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. It highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these same technologies, it introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges of researching HCI and intelligent robots. The authors hope that this work will help researchers in the field to acquire the necessary information and technologies for further, more advanced research.
1. Introduction
Artificial intelligence (AI) technology is a technical science that studies and develops theories, methods, technologies, and application systems for the simulation, extension, and expansion of human intelligence. It has been one of the most popular and rapidly growing technologies in recent years and has already achieved significant success in many areas, such as robotics, speech recognition, computer vision, and natural language processing.1–4 AI is regarded as a highly valuable technology that holds great potential for breakthroughs. It attempts to understand the essence of intelligence and to produce intelligent machines that can respond in a form similar to human intelligence. This signifies that intelligent machines (or robots, agents) with human-like intelligence are the ultimate goal and carrier of AI technology.
Human intelligence is the intellectual prowess of humans, which is marked by the performance of complex cognitive feats and includes high levels of motivation and self-awareness.5 Intelligence enables humans to learn, apply logic, reason, recognize patterns, make decisions, solve problems, and think. Through their intelligence, humans possess the cognitive abilities to perceive the world, comprehend truth and goodness, and interact with surrounding people and environments through perception, understanding, reasoning, and expression. Humans can hear beautiful voices and melodies, read classic literature and masterpieces, gaze at refreshing scenery and artwork, and feel a rich world of affection. They can then tell the differences, know what those things mean, and express themselves by making dialogues, writing articles, painting, and any other possible means of expression. Through these processes, humans are able to apply their intelligence and interact in different ways.
The expeditious development in the fields of AI, deep learning technology, intelligent robots, and human–computer interaction (HCI) has achieved substantial progress in recent years. Intelligent robots now possess increasingly human-like intelligence and abilities, such as listening, speaking, reading, writing, vision, feeling, and consciousness.6–9
The purpose of this paper is to provide a comprehensive overview of the technologies involved in intelligent robots and HCI, including natural language understanding (NLU), computer vision, deep neural networks, and wearable devices. A huge amount of innovative work has been conducted on intelligent robots and HCI in the AI literature. However, this paper focuses only on the latest research advancements oriented toward the interaction between the two. This interaction is vital and closely related to intelligence. Through this work, the authors hope to provide the intelligent robot and HCI community with useful reference resources.
The rest of the review is organized as follows. In the next section, we deliver a general review of intelligent robots and HCI, including their history, definition, and categorization. In Sec. 3, we summarize the research on affective computing, which is considered one of the most important challenges in the field of intelligence. In Sec. 4, we review current research works on intelligent robots and HCI from the perspective of the abilities that intelligent robots should master to achieve a natural and harmonious HCI environment and experience. In Sec. 5, we introduce some successful applications of this topic. Finally, we discuss some key scientific problems in these fields in Sec. 6 and conclude the paper with a formulation of future works (including recommendations) in Sec. 7.
2. Overview
Generally speaking, the technologies of HCI and intelligent robots encompass a huge number of research fields. Figure 1 summarizes the functions that an intelligent robot should possess for man–machine interaction, and our review covers the literature in most of the fields in this figure.

Fig. 1. Subtasks of HCI and intelligent robot system.
HCI involves thorough research of the scientific implications and practices of the interfaces between people and computers or intelligent agents. There are two levels of meaning associated with the related research works (see Fig. 2). On the primary level, it includes the research of ways and designs of new technologies to (better) promote computers as useful tools, whereas on the higher level, it includes the research of intelligent technologies that adopt natural ways of interaction between humans and computers, thereby making computers more harmonious partners to get along with. The term HCI was first used in 1976,10 and it was popularized by the book The Psychology of Human–Computer Interaction, published in 1983.11 In 1992, an HCI curriculum was developed by Hewett and other leading HCI educators to serve the needs of the HCI community.12 At CES 2008, Bill Gates emphasized the role of the natural user interface and predicted that it would bring a radical change to HCI in the next few years. Thereafter, HCI researchers expounded the definition of natural HCI by employing different approaches.13–15
As far as we know, the development of HCI has gone through five major stages: the manual stage, the interactive command language stage, the graphical user interface (GUI) stage, the network user interface stage, and natural HCI. As their names imply, these stages reflect the interaction technologies developed in this field. It seems that natural HCI technologies will lead the next generation of interactive technologies. In fact, natural HCI is not a new concept; it has existed for a considerable amount of time and has been developing constantly since the emergence of the computer. People want to use the simplest and most effective way to control computers to complete tasks. Moreover, in the present era, people want to use the most direct and natural way to obtain more services from the computer; that is, the computer is expected to be more intelligent.
According to their application fields, intelligent robots can be divided into industrial robots, domestic robots, medical robots, military robots, education robots, entertainment robots, etc. Additionally, independent interaction ability and the abilities of emotion cognition and expression are also essential characteristics of intelligent robots.22,23
3. Affective Computing
In many respects, the abilities of intelligent robots have gone far beyond the reach of humans, but it still cannot be confirmed that these robots possess human-like intelligence. The key issue is that these robots are not similar to humans from the perspective of emotions; their EQs have been nil compared to humans. It is well known that emotion is a necessary factor for communication and interaction between humans. Therefore, people naturally expect intelligent robots to also have the ability of emotional interaction in the HCI process; that is, an intelligent robot should have EQ along with IQ.
In 1985, Marvin Minsky, one of the founders of AI, put forward in the book The Society of Mind that "The question is not whether intelligent machines can have emotions, but whether machines can be intelligent without any emotions."24 Thereafter, it gradually became the consensus of AI professionals that emotion is an important part of intelligence. Research on endowing intelligent machines with the abilities of understanding, expressing, and reproducing emotions has been widely carried out, mainly including affective computing, Kansei engineering, and artificial psychology.
Affective computing will improve the naturalness of the HCI environment and expand the application scenarios of intelligent robots. Researchers have carried out extensive innovative works focusing on affective computing and achieved a wealth of research findings. A relatively complete processing procedure and some theoretical systems have also been established, covering the mechanisms and theoretical modeling of emotions, emotional information acquisition, emotion recognition, emotion understanding, and emotion expression.
In 1997, Picard at MIT put forward the concept of affective computing for the first time. She pointed out that affective computing relates to, arises from, or deliberately influences emotions or other affective phenomena.25 The purpose of affective computing is to promote the EQ of intelligent robots and equip them with an emotional "heart" so that they can develop the human-like capacities of perceiving, understanding, and generating a variety of emotional characteristics and, then, create a natural and harmonious HCI system. Affective computing is the theoretical basis of realizing natural and harmonious HCI and is also an extremely challenging research topic in the field of AI.
Nagamachi created Kansei engineering based on research in affective engineering in 1988.26 In 1999–2000, researchers put forward the theory of "Artificial Psychology".27–29 Studies in neuroscience have shown that emotion is associated with multiple brain regions, including the prefrontal cortex, hypothalamus, and cingulate cortex, and the amygdala serves as the center of all emotions.30,31
Researchers generally use the discrete emotional states model and the dimensional model to construct and understand the emotional space. The discrete emotional states model divides emotions into a variety of discrete states (e.g., happiness or disgust).32 The most common classification scheme divides emotion into six states: happiness, sadness, anger, fear, surprise, and disgust. Human emotional states are continuous and dynamic in a natural interaction scene, so the discrete emotional states model is unable to accurately represent the change of human emotions.33 The dimensional model considers the emotional space as a continuous space composed of different dimensions, which can better characterize and simulate human emotions.34 There are the two-dimensional valence-arousal model35 and activation-evaluation model36, besides the three-dimensional pleasure-arousal-dominance model37 and arousal-valence-stance model38. Reference 39 proposed a new academic system called "Enriching Mental Engineering", which aims to deal with the mental system of human beings; it measures and enriches mental richness by employing engineering methods. Reference 40 carried out research on affective computing from the view of psychology and proposed a mental state transition network model to dynamically detect human emotions. After that, these researchers conducted a series of experiments involving basic theories, emotional data resource construction, and their applications.41–44 Table 1 summarizes the above reviewed models. In addition, there is a large amount of literature on the applications of affective computing in the field of HCI and intelligent robots, which will be reviewed in the following sections.
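Returning to the two families of emotion-space models above, the following sketch (a minimal illustration, not drawn from the reviewed literature; the valence–arousal prototype coordinates are invented for demonstration) contrasts a discrete emotion label with a point in a continuous two-dimensional valence–arousal space:

```python
from dataclasses import dataclass

# Discrete emotional states model: six basic categories.
BASIC_EMOTIONS = {"happiness", "sadness", "anger", "fear", "surprise", "disgust"}

@dataclass
class DimensionalEmotion:
    """Dimensional model: a point in continuous valence-arousal space."""
    valence: float  # negative (-1.0) to positive (+1.0)
    arousal: float  # calm (-1.0) to excited (+1.0)

    def nearest_discrete_label(self) -> str:
        # Toy mapping from a VA point back to a discrete label, using
        # hand-picked prototype coordinates (illustrative only).
        prototypes = {
            "happiness": (0.8, 0.5), "sadness": (-0.7, -0.4),
            "anger": (-0.6, 0.7), "fear": (-0.7, 0.6),
            "surprise": (0.2, 0.8), "disgust": (-0.6, 0.2),
        }

        def dist(p):
            return (p[0] - self.valence) ** 2 + (p[1] - self.arousal) ** 2

        return min(prototypes, key=lambda k: dist(prototypes[k]))

print(DimensionalEmotion(valence=0.7, arousal=0.4).nearest_discrete_label())
# -> happiness: a continuous point maps smoothly onto the discrete scheme
```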
4. Human–Computer Interaction
In this section, we will review the relevant literature in the field of HCI from the aspect of interactional abilities possessed by humans, such as listening, speaking, reading, writing, visual sense, and other senses. The same abilities are desired in an intelligent robot.
Auditory sense is one of the most important senses of the human body. It is used for mutual interaction among humans, and its main forms include listening and speaking. Listening is used to receive the voices of the outside world, and speaking is used to express one's own ideas and opinions to the outside world. The robot's abilities of listening and speaking aim to imitate the auditory ability of humans in the interaction process, and these two abilities are realized via the spoken dialogue system in intelligent robots. Figure 3 shows the framework of a spoken dialogue system. Generally speaking, the spoken dialogue system comprises five modules: automatic speech recognition (ASR), NLU, dialogue management (DM), natural language generation (NLG), and automatic speech synthesis (ASS).
The primary responsibility of ASR is to transform the continuous time signal of a user's speech into a series of discrete syllable units or words. The primary responsibility of NLU is to analyze the result of speech recognition and transform the user's dialogue information, via syntactic and semantic analysis, into a representation that can be utilized by the dialogue system. DM makes a comprehensive analysis based on the result of language understanding, the context of the dialogue, the historical information of the dialogue, etc., to determine the current intention of the user; thereafter, the response or response strategy is adopted by the system. Then, NLG organizes the appropriate response statement and converts the system's response into natural language that users can understand. The primary responsibility of ASS is to synthesize the text generated by NLG into the final answering voice and feed it back to the user. A large number of extensive efforts have been actualized in the field of dialogue systems, which can be divided into two categories: acoustic-based and text-based.
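As a minimal sketch of how these five modules chain together in one dialogue turn (the module implementations below are toy stand-ins invented for illustration; a real system would back each with a trained model or rule base):

```python
# Toy stand-ins for the five spoken-dialogue-system modules.
def asr(audio: bytes) -> str:
    return audio.decode("utf-8")            # pretend recognition: audio -> text

def nlu(text: str) -> dict:
    intent = "greet" if "hello" in text.lower() else "unknown"
    return {"intent": intent, "utterance": text}

def dm(frame: dict, history: list) -> dict:
    # Choose a response strategy from the understood frame and the history.
    return {"act": "greet_back" if frame["intent"] == "greet" else "clarify"}

def nlg(action: dict) -> str:
    return {"greet_back": "Hello! How can I help?",
            "clarify": "Sorry, could you rephrase that?"}[action["act"]]

def ass(text: str) -> bytes:
    return text.encode("utf-8")             # pretend synthesis: text -> audio

def dialogue_turn(audio: bytes, history: list) -> bytes:
    """One turn through the ASR -> NLU -> DM -> NLG -> ASS pipeline."""
    frame = nlu(asr(audio))
    action = dm(frame, history)
    history.append((frame, action))         # keep context for later turns
    return ass(nlg(action))

print(dialogue_turn(b"hello robot", []).decode())  # Hello! How can I help?
```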
One of the key terminals of the auditory module is ASR, which has changed the way we interact with intelligent agents and systems. The development of ASR benefits from both academic research and industry: Google, Microsoft, IBM, Baidu, Amazon, iFLYTEK, etc., have all developed speech recognition engines. In traditional solutions, hidden Markov models (HMMs) are widely used, and most modern general-purpose speech recognition systems are based on HMMs.45 HMMs suit speech recognition because a speech signal can be viewed as a piecewise stationary or short-time stationary signal; with hidden state variables, speech as the observed values, and transfers among states, such a signal conforms to the hypotheses of the HMM. Reference 46 introduced the application of the theory of probabilistic functions of a (hidden) Markov chain to actualize ASR for an isolated word.
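To ground the HMM view of speech, the sketch below (a toy example whose states, observation symbols, and probabilities are all invented; a real recognizer works over acoustic feature vectors) runs Viterbi decoding to recover the most likely hidden state sequence from an observation sequence:

```python
import numpy as np

# Toy HMM: 2 hidden "phone" states, 3 discrete acoustic observation symbols.
start = np.array([0.6, 0.4])                      # initial state distribution
trans = np.array([[0.7, 0.3],                     # state transition matrix
                  [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1],                 # emission probabilities
                 [0.1, 0.3, 0.6]])

def viterbi(obs):
    """Most likely hidden state path for an observation sequence (log domain)."""
    delta = np.log(start) + np.log(emit[:, obs[0]])
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + np.log(trans)   # best score into each state
        back.append(scores.argmax(axis=0))        # remember best predecessors
        delta = scores.max(axis=0) + np.log(emit[:, o])
    path = [int(delta.argmax())]                  # backtrack from best final state
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return path[::-1]

print(viterbi([0, 1, 2, 2]))  # e.g. [0, 0, 1, 1]
```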
Building such a speech corpus consumes a lot of resources and requires a sophisticated design; HMM- and STRAIGHT-based methods overcame this barrier and are suitable for mobile-embedded platforms.72–74 When the deep learning network was first applied in the field of speech recognition, the recognition rate increased by more than 10%, which greatly attracted the attention of researchers. There are also abundant research achievements in the field of speech synthesis with the use of deep neural networks.75–83 In conventional neural network-based approaches, text analysis and acoustic modeling are processed separately; however, Ref. 75 attempted to integrate them and proposed a novel end-to-end framework for speech synthesis. By combining memory-less modules and stateful recurrent neural networks, unconditional audio generation in the raw acoustic domain was researched in Refs. 75 and 76. Reference 77 introduced WaveNet to generate raw audio waveforms and yielded state-of-the-art performance when applied to speech synthesis. References 78–80 carried out a series of meaningful works in this field based on deep neural networks. References 81 and 82 focused on vocoder-based speech synthesis systems to improve the sound quality and real-time performance of speech synthesis. Some research works also aim to synthesize the speech of a specific type or person.74,83 Reference 83 introduced an emotional speech synthesizer based on an end-to-end neural model, which can generate speech for given emotion labels. Reference 84 used the Variational AutoEncoder (VAE) to synthesize speech and control it in an unsupervised manner. Certain speech synthesis tasks, especially emotional speech synthesis, are of great significance and value because they affect the content and effect being expressed: the effect will be greatly different when the same content is expressed with different emotional semantics.
Although the quality of speech synthesis has steadily improved over the past decades, especially with the rapid development of deep neural network technology, speech synthesis systems remain clearly distinguishable from natural human speech. The challenges of emotional speech synthesis, and of the natural language processing that accompanies speech synthesis, are still in urgent need of being addressed and solved.
Voiceprint recognition (VPR) comprises speaker identification and speaker verification: the former determines which registered speaker an utterance belongs to, whereas the latter confirms whether a speech is spoken by a specified person. In 1995, Reynolds successfully applied the Gaussian mixture model (GMM) to the text-independent VPR task for the first time85 and established the foundation for subsequent GMM-based speaker recognition research.
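As a rough illustration of GMM-based speaker modeling (a simplified sketch; real systems model MFCC features and typically use a universal background model, and the feature arrays here are random placeholders), one GMM can be fit per enrolled speaker and a test utterance scored against each:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Placeholder "MFCC" frames per enrolled speaker: (n_frames, n_coeffs).
enrollment = {
    "alice": rng.normal(0.0, 1.0, size=(500, 13)),
    "bob":   rng.normal(0.8, 1.2, size=(500, 13)),
}

# One GMM per speaker, as in GMM-based voiceprint recognition.
models = {name: GaussianMixture(n_components=8, covariance_type="diag",
                                random_state=0).fit(feats)
          for name, feats in enrollment.items()}

def identify(frames: np.ndarray) -> str:
    """Pick the speaker whose GMM gives the test frames the highest likelihood."""
    return max(models, key=lambda name: models[name].score(frames))

test = rng.normal(0.8, 1.2, size=(200, 13))   # unseen "bob"-like utterance
print(identify(test))                          # bob
```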
Response generation is used to generate dialogue responses under the guidance of the dialogue strategy; the generation approaches include generative model-based methods106,107 and retrieval-based methods.108,109 References 110–112 researched affective DM, which is one of the cores of the dialogue system. Unlike pipeline methods, end-to-end methods treat dialogue system learning as the problem of learning a mapping from dialogue histories to system responses and apply an encoder–decoder model to train the whole system.113,114
Non-task-oriented dialogue systems are often called chatbots, whose main purpose is to provide the ability to chat with people in an open domain; there are rule-based methods, retrieval-based methods, and generation-based methods. In fact, we can think of this as the joint modeling of all modules in the pipeline-based methods. In recent years, the research works in this field have mainly concentrated on deep neural network-based generation models. Referring to the Seq2Seq model of machine translation, multiple end-to-end response systems based on deep neural network models emerged in 2015.115–117 Thereafter, the attention mechanism was introduced to generate context-sensitive dialogue responses.118 In addition, the research in this field also includes deep reinforcement learning-based dialogue generation,119 dialogue generation based on the VAE and CVAE,120,121 and dialogue generation based on the GAN.122,123
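Of the three families above, the retrieval-based one is the easiest to make concrete. The sketch below (a minimal illustration with an invented three-pair response pool; real systems index millions of pairs and use learned matching models) returns the stored response whose context is closest to the user's message under TF-IDF cosine similarity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny invented context -> response pool; a real chatbot indexes a large corpus.
pool = [
    ("hello there", "Hi! Nice to meet you."),
    ("what is your name", "I'm a demo chatbot."),
    ("tell me a joke", "Why did the robot cross the road? It was programmed to."),
]

contexts = [c for c, _ in pool]
vectorizer = TfidfVectorizer().fit(contexts)
index = vectorizer.transform(contexts)        # TF-IDF vectors of stored contexts

def reply(message: str) -> str:
    """Retrieve the response whose stored context best matches the message."""
    sims = cosine_similarity(vectorizer.transform([message]), index)[0]
    return pool[int(sims.argmax())][1]

print(reply("hi, what's your name?"))   # I'm a demo chatbot.
```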
Affective computing and dialogue systems are two emerging and interesting research directions in the field of AI. Many scholars have conducted research on each of these two aspects; however, the researched content has been basically independent, with the two rarely related to each other. With the gradual perfection of dialogue systems and the comprehensive deepening of affective computing, some scholars have begun to explore a new cross-research topic, that is, how to integrate emotion into the dialogue system to build an emotional dialogue system.124 Reference 125 combined affective computing theory with the spoken language dialogue system and proposed to use the spoken language dialogue system as the carrier for the integration of multi-modal emotion recognition, effective emotional interaction, and the emotion generation and expression of intelligent robots. The generation of emotional dialogue responses is mainly achieved by learning emotional labels.22,126
The question answering system, which focuses more on factual questions, can be regarded as a special case of the dialogue system; it can answer the questions posed by humans with more accurate and concise natural language. There is also plenty of good work in this field. Reference 127 proposed a distantly supervised open-domain question answering (DS-QA) system, which retrieves relevant text from Wikipedia and extracts the answer by reading comprehension. Reference 128 proposed a denoising DS-QA, which contains a paragraph selector and a paragraph reader to make full use of all informative paragraphs and alleviate the wrong-labeling problem in DS-QA. Reference 129 proposed a method of answer extraction for long documents, which separated the answer generation in DS-QA into selecting a target paragraph in the document and extracting the correct answer from the target paragraph.
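A bare-bones version of this select-then-extract idea (a toy sketch with an invented two-paragraph document; real selectors and readers are trained neural models) can be phrased as word-overlap paragraph selection followed by sentence-level answer extraction:

```python
def overlap(a: str, b: str) -> int:
    """Crude relevance score: number of shared lowercase word types."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def answer(question: str, document: list[str]) -> str:
    # Stage 1: paragraph selector picks the most question-like paragraph.
    paragraph = max(document, key=lambda p: overlap(question, p))
    # Stage 2: reader extracts the best-matching sentence as the "answer span".
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    return max(sentences, key=lambda s: overlap(question, s))

doc = [
    "The Eiffel Tower is in Paris. It was completed in 1889.",
    "The Colosseum is in Rome. It dates from the first century.",
]
print(answer("Where is the Eiffel Tower", doc))
# -> The Eiffel Tower is in Paris  (toy extraction, not a trained reader)
```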
Humans record information and express their thoughts in written characters; readers can then understand the meaning and thoughts present in the characters by reading them. Endowing intelligent robots with the abilities of reading and writing still falls within the category of natural language processing, whose purposes are to enable robots to read human characters and understand human thoughts, and to express their own thoughts and ideas by generating a specific character sequence. In the following content, several tasks will be introduced that reflect the robots' abilities of reading and writing (see Table 2 for summaries), including part-of-speech (POS) tagging, named entity recognition (NER), text classification, text sentiment analysis, machine translation, machine reading comprehension (MRC), machine writing, etc.
To handle the problems of machine reading and writing, the representation of text in the computer first needs to be solved, although in some languages, such as Chinese, word segmentation is needed before word representation. In traditional statistical natural language processing tasks, text representation is mostly based on discrete feature vectors, which rely heavily on handcrafted feature engineering (e.g., the vector space model (VSM)).131 Feature engineering is always time-consuming and incomplete, and it also suffers from the problem of dimensional explosion. With the rise and development of deep learning methods and computing hardware, deep learning methods have been employed and have produced state-of-the-art results in many domains, ranging from computer vision to speech processing. References 132–134 put forward neural network-based methods to embed words into low-dimensional distributional vectors known as Word Embeddings. Word embedding is also a statistical method, which follows the distributional hypothesis that words occurring in similar contexts tend to have similar meanings. Thus, we can consider that word embeddings contain syntactic and semantic information, and their major advantage is that they can capture the similarity between words by measuring the similarity between vectors. Distributed representations have been the basis of deep learning-based NLP tasks and have helped achieve encouraging results in a wide range of NLP tasks.135–137
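Measuring word similarity as vector similarity usually means cosine similarity. The sketch below (with tiny made-up 4-dimensional vectors; real embeddings have hundreds of dimensions and are learned from corpora) shows the computation:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity: near 1.0 for similar directions, near 0 for unrelated."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up 4-d embeddings; trained vectors (e.g., word2vec/GloVe) behave alike.
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.0]),
    "queen": np.array([0.7, 0.7, 0.2, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

print(cosine(emb["king"], emb["queen"]))  # high: related words (~0.98)
print(cosine(emb["king"], emb["apple"]))  # low: unrelated words (~0.12)
```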
Table 2. Traditional methods and deep learning methods for the reading and writing tasks (excerpt).

Task                 Traditional methods                      Deep learning methods
Word representation  Feature engineering & VSM131             Word Embeddings & neural network-based language models132–134
POS tagging          Rule-based & statistics-based methods    Convolutional neural networks;137 adversarial neural networks;138 transfer learning;139 CRF, LSTM, Bi-LSTM, and their CRF-layer variants;140 RNN, GRU, LSTM, and Bi-LSTM141

POS tagging is the process of marking up a word in a text with a particular part of speech based on both its definition and its context. The difficulty of this problem is that the same word can take different parts of speech in different contexts. Rule-based and statistics-based methods are the main approaches in traditional POS tagging, and most machine learning methods have achieved accuracy above 95%, whereas recent research works focusing on deep learning-based methods have been achieving even better accuracy. Reference 137 proposed a deep neural network that learns the character-level representation of words and associates it with usual word representations to perform POS tagging. Reference 138 presented a method employing adversarial neural networks to deal with the POS tagging problem for Twitter text. Transfer learning was introduced to automatically induce a POS tagger for languages that have no labeled training corpus.139 In Ref. 140, several models were utilized to address Uyghur POS tagging, including CRF, LSTM, Bi-LSTM, LSTM with a CRF layer, and Bi-LSTM networks with a CRF layer. Reference 141 evaluated several sequential deep learning methods, including RNN, GRU, LSTM, and Bi-LSTM, for Malayalam tweets under different experimental parameters.
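As a baseline illustrating why context makes POS tagging hard (a toy most-frequent-tag baseline over an invented mini-corpus; it is exactly the kind of method the neural taggers above outperform), note how it cannot disambiguate "book" as noun versus verb:

```python
from collections import Counter, defaultdict

# Invented tagged mini-corpus: (word, tag) pairs.
corpus = [("I", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN"),
          ("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("good", "ADJ"),
          ("a", "DET"), ("book", "NOUN")]

counts = defaultdict(Counter)
for word, tag in corpus:
    counts[word.lower()][tag] += 1

def tag(sentence: list[str]) -> list[str]:
    """Most-frequent-tag baseline: ignores context entirely."""
    return [counts[w.lower()].most_common(1)[0][0] if w.lower() in counts
            else "NOUN"                    # back off to NOUN for unknown words
            for w in sentence]

# "book" is tagged NOUN both times, even where it is a verb:
print(tag(["I", "book", "a", "flight"]))  # ['PRON', 'NOUN', 'DET', 'NOUN']
```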
NER, one of the most important bases of NLP tasks, refers to the task of recognizing entities with specific meanings in text, mainly including names of people, places, and institutions, as well as proper nouns. It comprises two subtasks: entity boundary recognition and entity type determination. CRF is a traditional discriminant probability model recognized as a good algorithm for solving NER problems, and many methods fusing CRFs and neural networks have emerged in this field. Reference 142 is one of the representative works in which neural networks were used for NER, with a CRF fused into a CNN. On the basis of similar ideas, Refs. 143–145 combined RNNs and CRFs to deal with NER. Other papers proposed novel architectures for combining word embeddings with character-level representations, in which an attention mechanism was introduced to dynamically extract information from both word- and character-level components.146,147 Generally speaking, deep learning relies on a large number of annotated samples as training data. To overcome the limitation caused by the need for massive annotated data, much of the literature has studied NER methods based on a small amount of annotated data, such as transfer learning,148 semi-supervised methods,149 and active learning.150
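Whatever model produces the per-token labels, entity boundaries and types are usually read off a BIO tag sequence. The helper below (a generic sketch; the example sentence and tags are invented model output) turns such a sequence into typed spans:

```python
def bio_to_spans(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Turn per-token BIO tags into (entity text, entity type) spans."""
    spans, current, etype = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):               # a new entity begins
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [token], tag[2:]
        elif tag.startswith("I-") and current:  # current entity continues
            current.append(token)
        else:                                   # "O": outside any entity
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        spans.append((" ".join(current), etype))
    return spans

tokens = ["Marvin", "Minsky", "worked", "at", "MIT"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG"]    # invented model output
print(bio_to_spans(tokens, tags))  # [('Marvin Minsky', 'PER'), ('MIT', 'ORG')]
```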
Text classification is the technology of automatically marking text with labels according to a certain standard classification system. The research on text classification has gone through several stages, including keyword matching-based methods, rule-based knowledge engineering, statistical machine learning-based methods (e.g., SVM, KNN), and deep learning-based methods. Recently, Ref. 151 provided an excellent overview of the state-of-the-art elements of text classification. Reference 152 explored a simple but efficient baseline for text classification, fastText, which demonstrates that some tasks can be solved by extremely simple models. Reference 153 researched convolutional neural networks for text sequences and carried out experiments on sentence-level classification, achieving compelling results. Following this route, character-level convolutional neural networks were studied for text classification.154 Reference 155 divided text into three levels, word, sentence, and document, and constructed a hierarchical model for long text classification by using a hierarchical attention mechanism. Reference 156 proposed deep average networks (DAN) and attentional DAN to actualize conversational topic classification for the evaluation of conversational bots. Lai et al. introduced recurrent convolutional neural networks for this task, applying a recurrent structure to capture contextual information, while a convolutional neural network was used to construct the representation of the text.157 Much of the success that transfer learning has achieved in computer vision cannot yet be fully transplanted into NLP: text categorization still requires task-specific modifications and training from scratch. Howard et al. proposed an effective transfer learning method for text classification, known as universal language model fine-tuning, and introduced some key techniques for model fine-tuning.158
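In the spirit of the simple-baselines point above (a minimal sketch using scikit-learn on an invented four-document corpus; fastText itself averages word embeddings, whereas this uses bag-of-words TF-IDF), a linear classifier over TF-IDF features is often a strong starting point:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy corpus; real experiments use standard benchmark datasets.
texts = ["the match ended in a draw", "the team scored two goals",
         "the election results were announced", "parliament passed the bill"]
labels = ["sports", "sports", "politics", "politics"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["the goals decided the match"]))   # ['sports']
```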
Text sentiment analysis, also known as opinion mining or inclination analysis, is the process of analyzing the emotions present in text. Reference 159 gave a macroscopic introduction to the field of sentiment analysis, such as its research objects and venues. Reference 160 summarized several major deep learning models and comprehensively introduced their applications to sentiment analysis; additionally, they reviewed the three levels of granularity of sentiment analysis research and their subtasks. Reference 161 researched the use of autoencoders in modeling textual data and sentiment analysis, and tried to address the problems of scalability with the high dimensionality of the vocabulary and of task-irrelevant words by introducing a loss function for autoencoders. There are also many other leading research works focused on text sentiment analysis and its applications.162–170 Although sentiment analysis is often treated as a classification problem, it is actually a suitcase research problem that requires dealing with many NLP tasks.171 Reference 172 proposed a novel tagging scheme to jointly extract entities and relations, which can be seen as subtasks of sentiment analysis, by using several end-to-end models.
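The simplest instance of sentiment-as-classification is a lexicon-based scorer (a toy sketch; the word scores and the negation rule are invented, and real systems learn them from data or use curated lexicons such as those in the surveyed works):

```python
# Invented polarity lexicon; curated lexicons assign such scores at scale.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}
NEGATORS = {"not", "never", "no"}

def polarity(text: str) -> float:
    """Sum word polarities, flipping the sign after a negator."""
    score, flip = 0.0, 1
    for word in text.lower().split():
        if word in NEGATORS:
            flip = -1                  # negate the next sentiment word
        elif word in LEXICON:
            score += flip * LEXICON[word]
            flip = 1
    return score

print(polarity("I love this movie"))        #  2.0
print(polarity("not a good experience"))    # -1.0
```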
Machine translation is a cross-language capability that automatically translates a source language into a target language. Machine translation research has progressed from rule-based methods to statistical machine translation, and in recent years, research has mainly focused on neural machine translation (NMT). Reference 173 summarized the successful usage of neural networks in machine translation systems. Cho et al. proposed a novel neural network model called the RNN encoder–decoder for statistical machine translation and showed that the proposed model has the capacity to learn semantically and syntactically meaningful representations of linguistic phrases.174 This research also involved an empirical evaluation of a novel hidden gated unit. Reference 175 presented a general end-to-end approach to sequence learning for machine translation and suggested that NMT can achieve results similar to the traditional techniques. Reference 176 proposed the attention mechanism, which achieved state-of-the-art results for statistical machine translation. Following this research, Ref. 177 explored attention-based NMT architectures, including a global approach and a local one, to improve NMT performance, and achieved remarkable results. Reference 178 introduced a sequence-to-sequence architecture that, unlike the usual RNN deployments, was based entirely on CNNs, and achieved better accuracy and time efficiency. Departing from the previous encoder–decoder architectures, Ref. 179 proposed a neural network architecture that uses only the attention mechanism, and experimental results on machine translation tasks showed that the architecture performs well in both quality and training speed. The Google team presented Google's NMT system to address relevant problems such as robustness, accuracy, and speed.180 Thereafter, the team tried to solve the problem of multilingual translation by using a single NMT model.181 GANs have also been applied to NMT: Ref. 182 introduced a conditional sequence GAN in which the generator aims to translate sentences while the discriminator tries to distinguish the generator's outputs from sentences translated by a human being. Reference 183 proposed a novel model that produces translation outputs in parallel instead of one after another, so as to reduce the latency occurring during inference.
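The attention mechanism at the heart of these architectures can be written in a few lines. The sketch below (a generic scaled dot-product attention in NumPy with random toy tensors, in the form popularized by the attention-only architecture of Ref. 179) computes a weighted sum of values, with weights given by query–key similarity:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))   # (n_query, n_key) alignment
    return weights @ V                        # weighted sum of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # 3 target positions, dimension 8
K = rng.normal(size=(5, 8))   # 5 source positions
V = rng.normal(size=(5, 8))   # one value vector per source position

print(attention(Q, K, V).shape)  # (3, 8): one context vector per query
```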
MRC, also researched as open-domain QA, is the ability of intelligent robots to comprehend a given context and answer questions related to it. Information retrieval can also be considered an MRC issue.184 Many MRC datasets have been released to train and evaluate MRC models, such as the Machine Comprehension Test, the Children's Book Test, CNN/Daily Mail, and the Stanford Question Answering Dataset (SQuAD).

Machine writing covers tasks such as text summarization, automatic article writing, image description, etc. Reference 204 first proposed the technology of text summarization, which refers to analyzing background documents, summarizing their main points, and extracting or generating short summaries.
Machine vision, or computer vision, is a science that studies how to make a machine "see" like humans: a camera replaces the human eye to obtain images, and the computer replaces the human brain to process them, so that machines can gain a high-level understanding of images and simulate the functions of the human visual system. In the process of interpersonal communication, human beings recognize and judge an object's identity, expression, physical behavior, etc., through vision, and consider this the basis of interaction. In the following sections, we will briefly summarize these contents (see Table 3 for an outline).
Identification is a technique used in computer vision to determine one's identity. In recent years, deep neural network methods have been introduced into this field to seek better recognition performance.
The first step of face recognition is face detection, which aims to determine whether faces exist in a given image; if they do, the locations and sizes of the faces are also determined. A number of studies have focused on this area.235,236
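As a concrete illustration of the detection step (a minimal sketch using OpenCV's classical Haar-cascade detector, which predates the deep detectors surveyed here; "photo.jpg" is a placeholder input path), detection returns the location and size of each face:

```python
import cv2

# Classical Haar-cascade face detector shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

image = cv2.imread("photo.jpg")                      # placeholder input path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Each detection is (x, y, width, height): location and size of one face.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detected.jpg", image)
print(f"found {len(faces)} face(s)")
```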
Reference 237 provided a review of face detection for low-quality images. Face alignment is the process of marking out the important organs, such as the eyes, nose, and mouth, in the image with feature points; Refs. 238 and 239 systematically reviewed the research progress in this field. There is a lot of research in the field of face recognition itself: Refs. 240 and 241 summarized the previous works based on shallow representations, whereas Ref. 242 focused on the literature of deep learning-based face recognition. In addition to a face recognition function like that of humans, intelligent robots can also have identification abilities that human beings do not possess. Because of the stability of biometrics such as the iris and fingerprints, these biometric features are often used for identification. Reference 243 surveyed the iris recognition literature based on machine learning methods, whereas Ref. 244 focused on long-range iris recognition, which can extend the application range of this technology. Jain et al. researched the fingerprint recognition of young children, a question that had not previously received enough attention.245 An automated latent fingerprint recognition algorithm was proposed for the comparison of latents found at crime scenes.246 References 247 and 248 are the latest reviews conducted in this field. Besides, there are other identification-related works that focus on age and gender recognition.249,250
Facial expression recognition refers to recognizing the expression states contained in a given static image or dynamic video sequence, so as to determine the psychological emotions of the identified object.251 Reference 252 proposed a neural network-based expression recognition method to improve the generalizability of the model, which consisted of two convolutional layers, each followed by max pooling, and then four inception layers. Reference 253 proposed another CNN-based expression recognition scheme, combined with specific image pre-processing steps, to address the problems of limited training samples and sampling uncertainty during training. A multi-modality feature fusion-based framework was proposed for face recognition in videos to improve the system's robustness.254 Expression recognition based on static images was also researched: Ref. 255 proposed a novel method to train an expression recognition network based on static images. Micro-expression recognition, which is regarded as a harder problem, has also been addressed by a large number of research works.256–258
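To make the convolutional front end described for Ref. 252 concrete (a simplified PyTorch sketch; the channel sizes, the 48x48 grayscale input, and the plain linear head standing in for the four inception layers are assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class ExpressionCNN(nn.Module):
    """Two conv layers, each followed by max pooling, then a classifier head."""
    def __init__(self, n_classes: int = 6):   # six basic emotions
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                   # 48x48 -> 24x24
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                   # 24x24 -> 12x12
        )
        # Stand-in for the four inception layers of the model in Ref. 252.
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Linear(64 * 12 * 12, n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

logits = ExpressionCNN()(torch.randn(8, 1, 48, 48))  # batch of 8 face crops
print(logits.shape)  # torch.Size([8, 6])
```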
Corresponding to facial expression recognition, there is also research on the automatic generation of facial expressions, which generates various emotional expressions from a given facial image or a specific text. This research is considered important, as it can be seen as feedback in human–machine interaction. In Ref. 259, a chaotic feature-extracting associative memory was proposed to simulate the human brain in generating facial expressions. One study developed free software and an API that can generate dynamic facial expressions for three-dimensional speaking characters. Reference 264 investigated the novel problem of generating images from natural descriptions and proposed a corresponding approach.
OCR performance on low-quality images was studied in Ref. 278. Reference 279 proposed a CNN-based method to learn the features of Chinese characters, and then addressed the problem of Chinese characters in the completely automated public Turing test to tell computers and humans apart (CAPTCHA), which is increasingly used in many web applications for security reasons. Another aspect of OCR is to train the computer to automatically write characters or generate images with characters, which is also challenging and interesting. Reference 280 proposed an RNN-based framework to train a model for character generation.
In Ref. 294, high-density surface EMG signals from the forearm muscles were decomposed in non-isometric wrist motor tasks of normally limbed and limb-deficient individuals; the decoded neural information could be used for prosthesis control. Reference 295 proposed an EMG-based optimal control framework for the design of physical human–robot interaction in rehabilitation applications. In Ref. 296, EMG signals were collected in a natural manner by introducing a physical haptic feedback mechanism, and an interface was designed for human adaptive impedance extracted from the transfer of EMG signals. An algorithmic framework was proposed in Ref. 297 for EMG-based gesture recognition, and a prototype system along with an application program was developed to realize this framework.
5. Intelligent Robots
Intelligent robots are an upgraded version of traditional robots in both software and hardware. By upgrading the software, intelligent robots obtain higher-level brains, which bestow a comprehensive improvement in perception, reasoning, and decision-making. With the hardware upgrade, intelligent robots obtain more capable bodies so that they can better imitate human behavior in completing both delicate and toilsome works. Combining both improvements, intelligent robots can execute human commands or think independently to complete certain tasks, and can learn and improve themselves autonomously. They can also interact with human beings in a friendly manner.
Motion elements are the centralized embodiment of robot positioning, obstacle recognition, navigation, and other functions in an unstructured environment. A representative legged robot, for example, can keep its balance even after slipping on snow, or can get up if it is pushed down deliberately.
Another important element of an intelligent robot is the control element, which can perceive a human's control intention in various ways and execute relevant actions according to commands. It is often used to assist the control of prostheses for patients with paralysis. Spinal cord injuries, stroke in the brain stem, and other diseases make it impossible for patients with paralysis to control their limbs autonomously. A prosthesis with controllable capability can detect and execute the patient's intention via signal sources such as neural interfaces and physiological signals, so as to realize the patient's control of the prosthesis. Reference 317 exhibited the abilities of people with chronic tetraplegia to perform three-dimensional reaching and grasping motions by using a robotic arm controlled through a neural interface. This literature also showed that it is possible for tetraplegic patients to reconstruct useful multi-dimensional neural control of complex devices directly, even years after central nervous system injury. References 318 and 319 researched the control of a robotic arm by modeling multi-channel EEG signals and motion states together. Using pneumatic artificial muscles and inflatable sleeves, Ref. 320 developed a robotic arm with seven degrees of freedom (DOFs), which combined elements and positive qualities of rigid and soft robotics. Brain–computer interfaces (BCIs) were employed in Ref. 321 to stimulate the muscles and control a robotic arm for reaching and grasping movements in people with tetraplegia.
The above-mentioned robots generally have solid bodies, complex structures, and limited DOFs, whereas soft robots can achieve continuous deformation and, therefore, have infinite DOFs. References 322 and 323 conducted research on soft robots. The development of 3D printing technology and materials science has greatly benefited research on soft robots, owing to which they have shown significant progress and achieved tasks such as grabbing and human–robot collaboration.
The interactive elements of intelligent robots are studied and practiced by a large number of researchers, and Sec. 4 introduces a great number of research works and technologies focused on these interactions. In fact, scientists have developed several intelligent platforms and robots with a rudimentary ability of natural HCI. For example, the MIT affective computing research team launched the Tega and Jibo platforms successively in 2016, which have certain capabilities of emotional computing and fusion.
The Ren team at Hefei University of Technology in China studied an emotion computing system on a humanoid robot platform: they constructed a heart-state transfer network, which combines universality and individuality, for mental health problems; developed a multi-modal emotional response model based on the established heart-state transfer network; and established an evaluation system for coping strategies. The emotional robot platform and its cloud system developed by the team mainly provide the functions of character identity and emotion recognition, gesture and voice interaction, intelligent emotional conversation and chat, and emotional interaction. Emotional robots can be used at home and in medical settings for people of different ages (especially the elderly) and for the assisted rehabilitation of specific conditions (autism and depression).
The above content roughly belongs to the intelligent (software) system of intelligent robots. In fact, more research works focus on the hardware system of robots, such as actuators, driving devices, sensing devices, control systems, etc. However, these contents are not within the scope of this review. For more literature about intelligent robot systems, see Ref. 1, whose authors reviewed the current research on intelligent robot systems and prospected the future development trend of this field.
6. Challenges

Human-like perception involves multiple modalities, among them physiological signals such as blood pressure and heartbeat. Most existing perception methods focus on a single modality, whereas the correlation between multiple modalities is ignored. Therefore, multimodal databases and, based on them, multimodal hierarchical data-fusion perception and human-like intelligent perception technologies will become an important research direction. The existing mainstream approach to multimodal fusion perception depends on large-scale neural networks and big data. In addition, group decision making and multiple criteria decision making in management can provide good references for studying the decision-making process in multimodal fusion perception.324–328
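As a minimal sketch of what fusing modality-level evidence can look like (an invented late-fusion example; the modality weights and scores are placeholders, and real systems learn the fusion jointly, often hierarchically, inside a neural network):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Per-modality emotion scores over (happiness, sadness, anger); invented
# numbers standing in for text, speech, and facial-expression model outputs.
modality_logits = {
    "text":   np.array([0.2, 1.5, 0.1]),
    "speech": np.array([0.1, 2.0, 0.4]),
    "face":   np.array([1.0, 0.8, 0.2]),
}
weights = {"text": 0.3, "speech": 0.4, "face": 0.3}  # placeholder reliabilities

# Late fusion: weighted average of per-modality posteriors.
fused = sum(w * softmax(modality_logits[m]) for m, w in weights.items())
labels = ["happiness", "sadness", "anger"]
print(labels[int(np.argmax(fused))])   # sadness
```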
A single signal is often ambiguous: ordinary voice text alone may suggest normal mental health. However, if the voice is low and the expression is painful at the same time, and negative emotional content is included in the ordinary voice text, the patient's mental health can be judged to be in a bad state that needs to be adjusted and dealt with. It is a great challenge to accurately perceive and compute people's mental health state from multi-source signals. Additionally, how to guide and improve people's psychological state in the process of interaction is another great challenge.
Deep understanding of natural language and personalized interaction are also difficult challenges faced by HCI. First, interaction that combines the scene, historical interactive information, pragmatics, and even emotions with a deep understanding of semantics could be natural and efficient. In personalized interaction, the intelligent robot should adjust its interaction methods and strategies flexibly according to the scene, the interaction object, the interaction state, etc.
7. Conclusions
Ever since computers were born, there have been various interactions between people and computers in order to make computers more responsive to human needs. The continuous growth of human demands and curiosity drives the development of HCI technology and intelligent robot technologies. A large amount of research has been carried out to make HCI more natural and harmonious and, at the same time, to make robots more intelligent and adaptable. The rapid development of AI technology in recent years provides unprecedented opportunities for research on these two technologies. This paper summarized the development status of HCI from the aspect of interaction abilities and introduced the related technologies of intelligent robots. Thereafter, the challenges for the future development of these two fields and possible research approaches were expounded.
Acknowledgments
This research has been partially supported by the National Natural Science Foundation of China under Grant Nos. 61432004 and U1613217.
References
1. T. M. Wang, Y. Tao and H. Liu, Current researches and future development trend of intelligent robot: A review, International Journal of Automation and Computing 15(5) (2018) 525–546.
22. H. Zhou, M. Huang, T. Zhang et al., Emotional chatting machine: Emotional conversation generation with internal and external memory (2017), arXiv preprint arXiv:1704.01074.
23. F. Ren and Z. Huang, Automatic facial expression learning method based on humanoid
robot XIN-REN, IEEE Transactions on Human-Machine Systems 46(6) (2016)
810–821.
24. M. L. Minsky, The Society of Mind (Simon & Schuster Press, 1988).
25. R. W. Picard, Affective Computing (MIT Press, 1997).
26. M. Nagamachi, Kansei/Affective Engineering (CRC Press, 2011).
27. X. Tu, Artificial emotion, The Paper Assembly of the 10th Annual CAAI (Guangzhou, China, 2000).
28. Z. Wang and L. Xie, Artificial psychology: An attainable scientific research on the human brain, IPMM, Vol. 2 (Honolulu, 1999), pp. 1067–1072.
29. Z. Wang, Artificial psychology and artificial emotion, CAAI Transactions on Intelligent Systems 1(1) (2006) 38–43.
30. J. E. Ledoux, Emotion circuits in the brain, Annual Review of Neuroscience 23(23)
(1999) 155–184.
31. R. N. Cardinal, J. A. Parkinson, J. Hall et al., Emotion and motivation: The role of the
amygdala, ventral striatum, and prefrontal cortex, Neuroscience & Biobehavioral
Reviews 26(3) (2002) 321–352.
32. M. S. Hossain and G. Muhammad, Audio-visual emotion recognition using multi-
directional regression and Ridgelet transform, Journal on Multimodal User Interfaces
10(4) (2016) 325–333.
33. P. Shaver, J. Schwartz, D. Kirson et al., Emotion knowledge: further exploration of a
prototype approach, Journal of Personality & Social Psychology 52(6) (1987) 1061.
34. M. A. Nicolaou, S. Zafeiriou and M. Pantic, Correlated-spaces regression for learning
continuous emotion dimensions, ACM International Conference on Multimedia (2013),
pp. 773–776.
35. L. F. Barrett, Discrete emotions or dimensions? The role of valence focus and arousal
focus, Cognition & Emotion 12(4) (1998) 579–599.
36. S. K. A. Kamarol, M. H. Jaward, H. Kälviäinen et al., Joint facial expression recognition
and intensity estimation based on weighted votes of image sequences, Pattern Recog-
nition Letters 92(C) (2017) 25–32.
37. A. Mehrabian, Pleasure-arousal-dominance: A general framework for describing and
measuring individual differences in temperament, Current Psychology 14(4) (1996)
261–292.
38. C. Breazeal, Function meets style: Insights from emotion theory applied to HRI, IEEE
Transactions on Systems Man & Cybernetics Part C 34(2) (2004) 187–194.
39. F. Ren, C. Quan and K. Matsumoto, Enriching mental engineering, International
Journal of Innovative Computing, Information and Control 9(8) (2013) 3271–3284.
40. H. Xiang, P. Jiang, S. Xiao et al., A model of mental state transition network, IEEJ
Transactions on Electronics Information & Systems 127(3) (2007) 434–442.
41. F. Ren, Affective information processing and recognizing human emotion, Electronic
Notes in Theoretical Computer Science 225 (2009) 39–50.
42. C. Quan and F. Ren, A blog emotion corpus for emotional expression analysis in
Chinese, Computer Speech and Language 24(4) (2010) 726–749.
43. F. Ren and K. Matsumoto, Semi-automatic creation of youth slang corpus and its
application to affective computing, IEEE Transactions on Affective Computing 7(2)
(2016) 176–189.
44. F. Ren and N. Liu, Emotion computing using Word Mover's Distance features based on
Ren CECps, PLoS ONE 13(4) (2018), doi: 10.1371/journal.pone.0194136.
45. L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77(2) (1989) 257–286.
64. N. Carlini and D. Wagner, Audio adversarial examples: Targeted attacks on speech-to-
text (2018), arXiv preprint arXiv:1801.01944.
65. V. Chernykh and P. Prikhodko, Emotion recognition from speech with recurrent neural
networks (2017), arXiv preprint arXiv:1701.08071.
66. R. Xia and Y. Liu, A multi-task learning framework for emotion recognition using 2D continuous space, IEEE Transactions on Affective Computing 8 (2017) 3–14.
67. F. Tao and G. Liu, Advanced LSTM: A study about better time dependency modeling in
emotion recognition, in ICASSP 2018 (Calgary, Canada, April 2018), pp. 1–6.
68. R. Sproat, Multilingual text-to-speech synthesis: The Bell Labs approach, Computa-
tional Linguistics 3(4) (1997) 761–764.
69. Y. Tabet and M. Boughazi, Speech synthesis techniques. A survey, International
Workshop on Systems, Signal Processing and Their Applications (2011), pp. 67–70.
70. C. Khorinphan, S. Phansamdaeng and S. Saiyod, Thai speech synthesis with emotional
tone: Based on Formant synthesis for Home Robot, Student Project Conference (2014),
pp. 111–114.
71. D. Schwarz, Current research in concatenative sound synthesis, International Computer
Music Conference (2005), pp. 42–45.
72. H. Banno, H. Hata, M. Morise et al., Implementation of realtime STRAIGHT speech
manipulation system: Report on its first implementation (Applied Systems), Acoustical
Science & Technology 28(3) (2007) 140–146.
73. X. Gonzalvo, S. Tazari, C. A. Chan et al., Recent advances in Google real-time HMM-
driven unit selection synthesizer, in 17th Annual Conference of the International Speech
Communication Association (2016), pp. 1–5.
74. K. Tokuda, The HMM-based speech synthesis system (HTS), IEICE Technical
Report Natural Language Understanding & Models of Communication 107(406) (2007)
301–306.
75. W. Wang, S. Xu and B. Xu, First step towards end-to-end parametric TTS synthesis:
Generating spectral parameters with neural attention, Interspeech (2016), pp. 2243–
2247.
76. S. Mehri, K. Kumar, I. Gulrajani et al., SampleRNN: An unconditional end-to-end
neural audio generation model (2016), arXiv preprint arXiv:1612.07837.
77. A. V. D. Oord, S. Dieleman, H. Zen et al., WaveNet: A generative model for raw audio,
in SSW (September 2016), pp. 1–15.
78. S. O. Arik, M. Chrzanowski, A. Coates et al., Deep voice: Real-time neural text-to-
speech (2017), arXiv preprint arXiv:1702.07825.
79. S. Arik, G. Diamos, A. Gibiansky et al., Deep voice 2: Multi-speaker neural text-to-
speech (2017), arXiv preprint arXiv:1705.08947.
80. W. Ping, K. Peng, A. Gibiansky et al., Deep voice 3: Scaling text-to-speech with con-
volutional sequence learning (2018), arXiv:1710.07654.
81. Y. Agiomyrgiannakis, Vocaine the vocoder and applications in speech synthesis,
IEEE International Conference on Acoustics, Speech and Signal Processing (2015),
pp. 4230–4234.
87. J. Zhang and X. M. Chen, A research of improved algorithm for GMM voiceprint
recognition model, Control and Decision Conference (2016), pp. 5560–5564.
88. P. Kenny, Joint factor analysis of speaker and session variability: Theory and algo-
rithms, CRIM (2005).
89. Y. Liu, Y. Qian, N. Chen et al., Deep feature for text-dependent speaker verification,
Speech Communication 73 (2015) 1–13.
90. N. Dehak, P. J. Kenny, R. Dehak et al., Front-end factor analysis for speaker verifica-
tion, IEEE Transactions on Audio Speech & Language Processing 19(4) (2011)
788–798.
91. O. Ghahabi and J. Hernando, Deep belief networks for i-vector based speaker recogni-
tion, IEEE International Conference on Acoustics, Speech and Signal Processing (2014),
pp. 1700–1704.
92. P. Haffner, G. Tur and J. H. Wright, Optimizing SVMs for complex call classification,
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003.
Proceedings, Vol. 1. (2003), pp. I-632–I-635.
93. R. Sarikaya, G. E. Hinton and A. Deoras, Application of deep belief networks for natural
language understanding, IEEE/ACM Transactions on Audio Speech & Language
Processing 22(4) (2014) 778–784.
94. A. Ezen-Can and K. E. Boyer, Unsupervised classification of student dialogue acts with
query-likelihood clustering, in Educational Data Mining (July 2013).
95. F. Jiang, X. Chu, X. U. Sheng et al., A macro discourse primary and secondary relation
recognition method based on topic similarity, Journal of Chinese Information Proces-
sing, 32(1) (2018) 43–50.
96. Y. Du, W. Zhang and T. Liu, Topic augmented convolutional neural network for
user interest recognition, Journal of Computer Research & Development, 55(1) (2018)
188–197.
97. F. Ren and H. Yu, Role-explicit query extraction and utilization for quantifying user
intents, Information Sciences, 329(C) (2016) 568–580.
98. K. Yao, G. Zweig, M. Y. Hwang et al., Recurrent neural networks for language under-
standing, Interspeech (2013) 2524–2528.
99. G. Mesnil, Y. Dauphin, K. Yao et al., Using recurrent neural networks for slot filling
in spoken language understanding, IEEE/ACM Transactions on Audio Speech &
Language Processing 23(3) (2015) 530–539.
100. N. Mrkšić, D. O. Seaghdha, B. Thomson et al., Multi-domain dialog state tracking using
recurrent neural networks (2015), arXiv preprint arXiv:1506.07190.
101. H. Shi, T. Ushio, M. Endo et al., A multichannel convolutional neural network for
cross-language dialog state tracking, Spoken Language Technology Workshop (SLT),
December 2016, pp. 559–564.
102. A. Rastogi, D. Hakkani-Tür and L. Heck, Scalable multi-domain dialogue state tracking,
Automatic Speech Recognition and Understanding Workshop (2018), pp. 561–568.
103. G. Weisz, P. Budzianowski, P. H. Su et al., Sample efficient deep reinforcement learning
for dialogue systems with large action spaces, IEEE/ACM Transactions on Audio
Speech & Language Processing (2018), https://arxiv.org/abs/1802.03753.
104. B. Peng, X. Li, J. Gao et al., Deep Dyna-Q: Integrating planning for task-completion
dialogue policy learning (2018), arXiv:1801.06176.
105. Z. Zhang, M. Huang, Z. Zhao et al., Memory-augmented dialogue management for task-oriented dialogue systems (2018).
107. L. Yu, W. Zhang, J. Wang et al., SeqGAN: Sequence generative adversarial nets with
policy gradient, AAAI Conference on Artificial Intelligence (San Francisco, California,
USA, 2017), pp. 2851–2858.
108. R. Yan, Y. Song and H. Wu, Learning to respond with deep neural networks for re-
trieval-based human-computer conversation system, in Proc. 39th International ACM
SIGIR Conf. Research and Development in Information Retrieval (2016), pp. 55–64.
109. M. Wang, Z. Lu, H. Li and Q. Liu, Syntax-based deep matching of short texts (2015),
arXiv preprint arXiv:1503.02427.
110. F. Ren, Y. Wang and C. Quan, A novel factored POMDP model for affective dialogue
management, Journal of Intelligent & Fuzzy Systems 31(1) (2016) 127–136.
111. F. Ren, Y. Wang and C. Quan, TFSM-based dialogue management model framework for
affective dialogue systems, IEEJ Transactions on Electrical and Electronic Engineering
10(4) (2015) 404–410.
112. Y. Wang, F. Ren and C. Quan, A new factored POMDP model framework for affective
tutoring systems, IEEJ Transactions on Electrical and Electronic Engineering 13(11)
(2018) 1603–1611.
113. T. H. Wen, D. Vandyke, N. Mrksic et al., A network-based end-to-end trainable task-
oriented dialogue system (2016), arXiv preprint arXiv:1604.04562.
114. B. Liu, G. Tur, D. Hakkani-Tur et al., Dialogue learning with human teaching and
feedback in end-to-end trainable task-oriented dialogue systems (2018),
arXiv:1804.06512.
115. O. Vinyals and Q. Le, A neural conversational model (2015), arXiv preprint
arXiv:1506.05869.
116. A. Sordoni, M. Galley, M. Auli et al., A neural network approach to context-sensitive
generation of conversational responses (2015), arXiv preprint arXiv:1506.06714.
117. L. Shang, Z. Lu and H. Li, Neural responding machine for short-text conversation, in
Proc. ACL-IJCNLP (2015), pp. 1577–1586.
118. V. K. Tran and L. M. Nguyen, Neural-based natural language generation in dialogue
using RNN encoder-decoder with semantic aggregation (2017), arXiv:1706.06714v2.
119. J. Li, W. Monroe, A. Ritter et al., Deep reinforcement learning for dialogue generation
(2016), arXiv:1606.01541.
120. T. Zhao, R. Zhao and M. Eskenazi, Learning discourse-level diversity for neural dialog
models using conditional variational autoencoders (2017), arXiv:1703.10960.
121. I. V. Serban, A. Sordoni, R. Lowe et al., A hierarchical latent variable encoder-decoder
model for generating dialogues (2016), arXiv:1605.06069.
122. J. Li, W. Monroe, T. Shi et al., Adversarial learning for neural dialogue generation
(2017), arXiv:1701.06547v1.
123. J. Guo, S. Lu, H. Cai et al., Long text generation via adversarial training with leaked
information (2017), arXiv:1709.08624v1.
124. E. Andre, L. Dybkjær, W. Minker et al., Affective Dialogue Systems, Tutorial and
Research Workshop, ADS 2004 (Kloster Irsee, Germany, 2004).
125. D. Ma, S. Li, X. Zhang et al., Interactive attention networks for aspect-level sentiment
classification, Twenty-Sixth International Joint Conf. Artificial Intelligence (2017),
pp. 4068–4074.
126. X. Sun, X. Peng and S. Ding, Emotional human-machine conversation generation based on long short-term memory (2018).
143. G. Lample, M. Ballesteros, S. Subramanian et al., Neural architectures for named entity
recognition (2016), arXiv:1603.01360.
144. X. Ma and E. Hovy, End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF
(2016), arXiv:1603.01354.
145. J. P. Chiu and E. Nichols, Named entity recognition with bidirectional LSTM-CNNs,
Computer Science (2015), arXiv:1511.08308.
146. M. Rei, G. K. Crichton and S. Pyysalo, Attending to characters in neural sequence
labeling models (2016), arXiv:1611.04361.
147. A. Bharadwaj, D. Mortensen, C. Dyer et al., Phonologically aware neural model for
named entity recognition in low resource transfer settings, Conf. Empirical Methods in
Natural Language Processing (2016), pp. 1462–1472.
148. Z. Yang, R. Salakhutdinov and W. W. Cohen, Transfer learning for sequence tagging with hierarchical recurrent networks (2017), arXiv:1703.06345.
166. F. Ren and J. Deng, Background knowledge based multi-stream neural network for text
classification, Applied Sciences 8(12) (2018), https://doi.org/10.3390/app8122472.
167. F. Ren, X. Kang and C. Quan, Examining accumulated emotional traits in suicide blogs
with an emotion topic model, IEEE Journal of Biomedical and Health Informatics 20(5)
(2016) 1384–1396.
168. X. Kang, F. Ren and Y. Wu, Exploring latent semantic information for textual emotion
recognition in blog articles, IEEE/CAA Journal of Automatica Sinica 5(1) (2018)
204–216.
169. F. Ren and Y. Wu, Predicting user-topic opinions in Twitter with social and topical
context, IEEE Transactions on Affective Computing 4(4) (2013) 412–424.
170. F. Ren and X. Kang, Employing hierarchical Bayesian networks in simple and complex
emotion topic analysis, Computer Speech and Language 27(4) (2013) 943–968.
171. E. Cambria, S. Poria, A. Gelbukh et al., Sentiment analysis is a big suitcase, IEEE
Intelligent Systems 32(6) (2018) 74–80.
172. S. Zheng, F. Wang, H. Bao et al., Joint extraction of entities and relations based on a
novel tagging scheme (2017), arXiv:1706.05075.
173. H. Schwenk, Continuous space translation models for phrase-based statistical machine
translation, COLING 2012: Posters (2012), pp. 1071–1080.
174. K. Cho, B. V. Merrienboer, C. Gulcehre et al., Learning phrase representations using
RNN encoder-decoder for statistical machine translation, in EMNLP (2014), pp. 1–15.
175. I. Sutskever, O. Vinyals and Q. V. Le, Sequence to sequence learning with neural net-
works, in Advances in Neural Information Processing Systems, Vol. 4 (2014), pp. 3104–
3112.
176. D. Bahdanau, K. Cho and Y. Bengio, Neural machine translation by jointly learning to
align and translate, in ICLR (2015), pp. 1–15.
177. M. T. Luong, H. Pham and C. D. Manning, Effective approaches to attention-based
neural machine translation, in EMNLP (2015), pp. 1–11.
178. J. Gehring, M. Auli, D. Grangier, D. Yarats and Y. Dauphin, Convolutional sequence to
sequence learning (2017), arXiv preprint arXiv:1705.03122v2.
179. A. Vaswani, N. Shazeer, N. Parmar et al., Attention is all you need, in Advances in
Neural Information Processing Systems, Vol. 10 (2017), pp. 5998–6008.
180. Y. Wu, M. Schuster, Z. Chen et al., Google's neural machine translation system:
Bridging the gap between human and machine translation (2016), arXiv:1609.08144.
181. M. Johnson, M. Schuster, Q. V. Le et al., Google's multilingual neural machine trans-
lation system: Enabling zero-shot translation (2017), arXiv:1611.04558.
182. Z. Yang, W. Chen, F. Wang et al., Improving neural machine translation with condi-
tional sequence generative adversarial nets, in NAACL 2018, pp. 1356–1365.
183. J. Gu, J. Bradbury, C. Xiong et al., Non-autoregressive neural machine translation,
ICLR (2018), pp. 1–13.
184. F. Ren and D. B. Bracewell, Advanced information retrieval, Electronic Notes in The-
oretical Computer Science 225(1) (2009) 303–317.
185. M. Richardson, C. J. Burges and E. Renshaw, MCTest: A challenge dataset for the open-
domain machine comprehension of text, in Proc. 2013 Conference on Empirical Methods
in Natural Language Processing (2013), pp. 193–203.
186. F. Hill, A. Bordes, S. Chopra et al., The Goldilocks Principle: Reading children's books
with explicit memory representations, in ICLR (2016), pp. 1–13.
187. K. M. Hermann, T. Kocisky, E. Grefenstette et al., Teaching machines to read and
comprehend, in Advances in Neural Information Processing Systems, Vol. 4 (2015),
pp. 1693–1701.
188. P. Rajpurkar, J. Zhang, K. Lopyrev et al., SQuAD: 100,000+ questions for machine
comprehension of text (2016), arXiv:1606.05250.
189. A. Miller, A. Fisch, J. Dodge et al., Key-value memory networks for directly reading
documents (2016), arXiv:1606.03126.
190. T. Nguyen, M. Rosenberg, X. Song et al., MS MARCO: A human generated machine
reading comprehension dataset (2016), arXiv:1611.09268.
191. W. He, K. Liu, J. Liu et al., DuReader: A Chinese machine reading comprehension
dataset from real-world applications, ACL (2018), pp. 1–10.
192. W. Yin, S. Ebert and H. Schütze, Attention-based convolutional neural network for
machine comprehension (2016), arXiv preprint arXiv:1602.04341.
193. S. Wang and J. Jiang, Learning natural language inference with LSTM (2016),
arXiv:1512.08849.
194. S. Wang and J. Jiang, Machine comprehension using Match-LSTM and answer pointer,
in ICLR (2017), pp. 1–15.
195. Microsoft Asia Natural Language Computing Group, R-NET: Machine reading compre-
hension with self-matching networks (2017).
196. M. Seo, A. Kembhavi, A. Farhadi et al., Bidirectional attention flow for machine
comprehension, ICLR (2017), pp. 1–13.
197. C. Xiong, V. Zhong and R. Socher, DCN+: Mixed objective and deep residual coat-
tention for question answering (2017), arXiv:1711.00106.
198. M. Hu, Y. Peng, Z. Huang et al., Reinforced mnemonic reader for machine reading
comprehension (2017), arXiv preprint arXiv:1705.02798.
199. Y. Cui, Z. Chen, S. Wei et al., Attention-over-attention neural networks for reading
comprehension (2017), arXiv:1607.04423.
200. D. Golub, P. Huang, X. He et al., Two-stage synthesis networks for transfer learning in
machine comprehension (2017), arXiv:1706.09789.
201. Y. Xu, J. Liu, J. Gao et al., Dynamic fusion networks for machine reading compre-
hension (2018), arXiv:1711.04964v2.
202. C. Clark and M. Gardner, Simple and effective multi-paragraph reading comprehension
(2017), arXiv:1710.10723.
203. J. Welbl, P. Stenetorp and S. Riedel, Constructing datasets for multi-hop reading
comprehension across documents (2018), arXiv:1710.06481.
204. H. P. Luhn, The automatic creation of literature abstracts, IBM Journal of Research &
Development 2(2) (1958) 159–165.
205. D. Shen, J. Sun, H. Li et al., Document summarization using conditional random fields,
IJCAI, Vol. 7 (2007), pp. 2862–2867.
206. Y. Ouyang, W. Li, S. Li and Q. Lu, Applying regression models to query-focused
multi-document summarization, Information Processing & Management 47(2) (2011)
227–237.
207. Z. Cao, F. Wei, L. Dong et al., Ranking with recursive neural networks and its appli-
cation to multi-document summarization, in Twenty-Ninth AAAI Conf. Artificial In-
telligence (2015), pp. 2153–2159.
208. D. Bollegala, N. Okazaki and M. Ishizuka, A bottom-up approach to sentence ordering
for multi-document summarization, Information Processing & Management 46(1)
(2010) 89–109.
209. C. Li, X. Qian and Y. Liu, Using supervised bigram-based ILP for extractive summa-
rization, ACL (2013), pp. 1004–1013.
210. H. Lin and J. Bilmes, A class of submodular functions for document summarization, in
Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human
Language Technologies, Vol. 1 (2011), pp. 510–520.
211. X. Qian and Y. Liu, Fast joint compression and summarization via graph cuts, EMNLP
(2013), pp. 1492–1502.
212. C. Li, Y. Liu, F. Liu et al., Improving multi-documents summarization by sentence
compression based on expanded constituent parse trees, in EMNLP (2014), pp. 691–701.
213. L. Bing, P. Li, Y. Liao et al., Abstractive multi-document summarization via phrase
selection and merging, Computational Linguistics 31(4) (2015) 505–530.
214. S. Dohare, H. Karnick and V. Gupta, Text summarization using abstract meaning
representation (2017), arXiv:1706.01678.
215. Y. Xia, F. Tian, L. Wu et al., Deliberation networks: Sequence generation beyond one-
pass decoding, NIPS (2017), pp. 1–11.
216. J. Gehring, M. Auli, D. Grangier et al., A convolutional encoder model for neural ma-
chine translation (2016), arXiv:1611.02344.
217. K. Lin, D. Li, X. He et al., Adversarial ranking for language generation (2018),
arXiv:1705.11001.
218. R. Paulus, C. Xiong and R. Socher, A deep reinforced model for abstractive summari-
zation (2017), arXiv preprint arXiv:1705.04304.
219. A. Abu-Jbara and D. Radev, Coherent citation-based summarization of scientific
papers, in Proc. 49th Annual Meeting of the Association for Computational Linguistics:
Human Language Technologies, Vol. 1 (2011), pp. 500–509.
220. S. Wang, X. Wan and S. Du, Phrase-based presentation slides generation for academic
papers, AAAI (2017), pp. 196–202.
221. W. Luo, F. Liu, Z. Liu et al., Automatic summarization of student course feedback,
Conf. North American Chapter of the Association for Computational Linguistics:
Human Language Technologies (2018), pp. 80–85.
222. E. Goldberg, R. Kittredge and A. Polguere, Computer generation of marine weather
forecast text, Journal of Atmospheric and Oceanic Technology 5(4) (2009) 473–483.
223. J. Zhang, J. Yao and X. Wan, Toward constructing sports news from live text com-
mentary, in ACL (2016), pp. 1361–1371.
224. R. Lebret, D. Grangier and M. Auli, Neural text generation from structured data with
application to the biography domain, EMNLP (2016), pp. 1–11.
225. Z. Wang, W. He, H. Wu et al., Chinese poetry generation with planning based neural
network, COLING (2016), pp. 1051–1060.
226. J. Zhang, Y. Feng, D. Wang et al., Flexible and creative Chinese poetry generation using
neural memory, ACL (2017), pp. 1364–1373.
227. L. Xu, L. Jiang, C. Qin et al., How images inspire poems: Generating classical Chinese
poetry from images with memory networks (2018), arXiv:1803.02994.
228. X. Chen and C. L. Zitnick, Mind's eye: A recurrent visual representation for image
caption generation, CVPR (2015), pp. 2422–2431.
229. P. Anderson, X. He, C. Buehler et al., Bottom-up and top-down attention for image
captioning and VQA (2017), arXiv preprint arXiv:1707.07998.
230. S. Liu, Z. Zhu, N. Ye et al., Improved image captioning via policy gradient optimization
of SPIDEr, in Proc. IEEE Int. Conf. Comp. Vis, Vol. 3 (2017), pp. 873–881.
231. S. Lee, U. Hwang, S. Min et al., Polyphonic music generation with sequence generative
adversarial networks (2018), arXiv:1710.11418.
232. H. Dong and Y. Yang, Convolutional generative adversarial networks with binary
neurons for polyphonic music generation, ISMIR (2018), pp. 1–13.
233. J. C. García and E. Serrano, Automatic music generation by deep learning, Int. Symp.
Distributed Computing and Artificial Intelligence (Springer, 2018), pp. 284–291.
234. M. Wang and W. Deng, Deep face recognition: A survey (2018), arXiv:1804.06655.
235. H. Jiang and E. Learned-Miller, Face detection with the faster R-CNN, IEEE Int. Conf.
Automatic Face & Gesture Recognition (2017), pp. 650–657.
236. S. Yang, P. Luo, C. Loy et al., Faceness-Net: Face detection through deep facial part
responses, IEEE Transactions on Pattern Analysis & Machine Intelligence 40(8) (2017)
1845–1859.
237. Y. Zhou, D. Liu and T. Huang, Survey of face detection on low-quality images, IEEE Int.
Conf. Automatic Face & Gesture Recognition (2018), pp. 769–773.
238. X. Jin and X. Tan, Face alignment in-the-wild: A survey, Computer Vision and Image Understanding 162 (2017) 1–22.
256. X. Ben, X. Jia, R. Yan et al., Learning effective binary descriptors for micro-expression
recognition transferred by macro-information, Pattern Recognition Letters 107 (2018)
50–58.
257. Y. J. Liu, J. K. Zhang, W. J. Yan et al., A main directional mean optical flow feature for
spontaneous micro-expression recognition, IEEE Transactions on Affective Computing
7(4) (2016) 299–310.
258. X. Huang, S. J. Wang, G. Zhao et al., Facial micro-expression recognition using spa-
tiotemporal local binary pattern with integral projection, The Workshop on Computer
Vision for Affective Computing at ICCV (IEEE Computer Society, 2015), pp. 1–13.
259. I. Nejadgholi, S. A. Seyyedsalehi and S. Chartier, A brain-inspired method of facial
expression generation using chaotic feature extracting bidirectional associative memory,
Neural Processing Letters 46(3) (2017) 943–960.
275. X. Ding, Z. Lv, C. Zhang et al., A robust online saccadic eye movement recognition
method combining electrooculography and video, IEEE Access 5 (2017) 17997–18003.
276. A. Chaudhuri, K. Mandaviya, P. Badelia et al., Optical Character Recognition Systems
for Different Languages with Soft Computing, Vol. 352 (Springer, 2017) ISBN 978-3-319-
50251-9.
277. A. Marial and J. Jos, Feature extraction of optical character recognition: Survey,
International Journal of Applied Engineering Research 12(7) (2017) 1129–1137.
278. M. Brisinello, R. Grbic, M. Pul et al., Improving optical character recognition perfor-
mance for low quality images, in Proc. ELMAR-2017 (IEEE, 2017), pp. 167–171.
279. D. Lin, F. Lin, Y. Lv et al., Chinese character CAPTCHA recognition and performance
estimation via deep neural network, Neurocomputing 288 (2018) 11–19.
280. X. Y. Zhang, F. Yin, Y. M. Zhang et al., Drawing and recognizing Chinese characters
with recurrent neural network, IEEE Transactions on Pattern Analysis and Machine
Intelligence 40(4) (2018) 849–862.
281. K. Han and H. H. Chang, Digits generation and recognition using RNNPB (2018).
282. H. D. Critchley and S. N. Garfinkel, The influence of physiological signals on cognition,
Current Opinion in Behavioral Sciences 19 (2018) 13–18.
283. D. Zhang, L. Yao, X. Zhang et al., Cascade and parallel convolutional recurrent neural
networks on EEG-based intention recognition for brain computer interface, AAAI Conf.
Artificial Intelligence (2018), pp. 1–8.
284. S. Wang, J. Gwizdka and W. A. Chaovalitwongse, Using wireless EEG signals to assess
memory workload in the n-back task, IEEE Transactions on Human-Machine Systems
46(3) (2016) 424–435.
285. T. Chen, S. Ju, X. Yuan, M. Elhoseny, F. Ren and M. Fan, Emotion recognition using
empirical mode decomposition and approximation entropy, Computers and Electrical
Engineering 72 (2018) 383–392.
286. F. Ren, Y. Dong and W. Wang, Emotion recognition based on physiological signals
using brain asymmetry index and echo state network, Neural Computing and Applica-
tions (2018), https://doi.org/10.1007/s00521-018-3831-4.
287. M. Soleymani, S. Asghari-Esfeden, Y. Fu et al., Analysis of EEG signals and facial
expressions for continuous emotion detection, IEEE Transactions on Affective Com-
puting 7(1) (2016) 17–28.
288. L. Shu, J. Xie, M. Yang et al., A review of emotion recognition using physiological
signals, Sensors 18(7) (2018), doi.org/10.3390/s18072074.
289. M. Mahmud, M. S. Kaiser, A. Hussain et al., Applications of deep learning and rein-
forcement learning to biological data, IEEE Transactions on Neural Networks &
Learning Systems 29(6) (2018) 2063–2079.
290. I. D. Castro, C. Varon, T. Torfs et al., Evaluation of a multichannel non-contact ECG
system and signal quality algorithms for sleep apnea detection and monitoring, Sensors
18(2) (2018), doi.org/10.3390/s18020577.
291. N. Dey, A. S. Ashour, F. Shi et al., Developing residential wireless sensor networks for
ECG healthcare monitoring, IEEE Transactions on Consumer Electronics 63(4) (2018)
442–449.
292. L. Liu, X. Chen, Z. Lu et al., Development of an EMG-ACC-based upper limb reha-
bilitation training system, IEEE Transactions on Neural Systems and Rehabilitation
Engineering 25(3) (2017) 244–253.
293. Y. Hu, Z. Li, G. Li et al., Development of sensory-motor fusion-based manipulation and
grasping control for a robotic hand-eye system, IEEE Transactions on Systems, Man,
and Cybernetics: Systems 47(7) (2017) 1169–1180.
294. T. Kapelner, F. Negro, O. C. Aszmann and D. Farina, Decoding motor unit activity
from forearm muscles: Perspectives for myoelectric control, IEEE Transactions on
Neural Systems and Rehabilitation Engineering 26(1) (2018) 244–251.
295. T. Teramae, T. Noda and J. Morimoto, EMG-based model predictive control for
physical human–robot interaction: Application for assist-as-needed control, IEEE
Robotics and Automation Letters 3(1) (2018) 210–217.
296. C. Yang, C. Zeng, P. Liang et al., Interface design of a physical human-robot interaction
system for human impedance adaptive skill transfer, IEEE Transactions on Automation Science and Engineering 15(1) (2018) 329–340.
313. U. Saranli, M. Buehler and D. E. Koditschek, RHex: A simple and highly mobile
hexapod robot, The International Journal of Robotics Research 20(7) (2001) 616–631.
314. M. Raibert, K. Blankespoor, G. Nelson et al., BigDog, the rough-terrain quadruped
robot, IFAC Proceedings Volumes 41(2) (2008) 10822–10825.
315. M. P. Murphy, A. Saunders, C. Moreira et al., The LittleDog robot, The International
Journal of Robotics Research 30(2) (2011) 145–149.
316. S. Kuindersma, R. Deits, M. Fallon et al., Optimization-based locomotion planning,
estimation, and control design for the atlas humanoid robot, Autonomous Robots 40(3)
(2016) 429–455.
317. L. R. Hochberg, D. Bacher, B. Jarosiewicz et al., Reach and grasp by people with
tetraplegia using a neurally controlled robotic arm, Nature 485 (2012) 372–375.
318. G. Santhanam, S. I. Ryu, B. M. Yu et al., A high-performance brain–computer interface,
Nature 442 (2006) 195–198.