A Review of Physics Simulators For Robotic Applications
April 8, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3068769
ABSTRACT The use of simulators in robotics research is widespread, underpinning the majority of recent
advances in the field. There are now more options available to researchers than ever before; however,
navigating through the plethora of choices in search of the right simulator is often non-trivial. Depending
on the field of research and the scenario to be simulated there will often be a range of suitable physics
simulators from which it is difficult to ascertain the most relevant one. We have compiled a broad review of
physics simulators for use within the major fields of robotics research. More specifically, we navigate through
key sub-domains and discuss the features, benefits, applications and use-cases of the different simulators
categorised by the respective research communities. Our review provides an extensive index of the leading
physics simulators applicable to robotics researchers and aims to assist them in choosing the best simulator
for their use case.
INDEX TERMS Simulation, review, robotics, field robotics, soft robotics, aerial robotics, marine robotics,
manipulation, robotic learning, surgical robotics.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
51416 VOLUME 9, 2021
J. Collins et al.: Review of Physics Simulators for Robotic Applications
FIGURE 1. Diversity of simulation scenes and environments throughout robotics (a) soft robotics in Simulation Open Framework Architecture [3],
(b) medical robotics in Asynchronous Multi-Body Framework [4], (c) manipulation in PyBullet [5], (d) dexterous manipulation in MuJoCo [6], (e) legged
locomotion in RaiSim [7] and (f) underwater vehicles in URSim [8].
phenomenon, (ii) collision detection and friction models, (iii) Graphical User Interface (GUI), (iv) import capability for scenes and meshes, (v) API especially for programming language used by the robotics community (C++/Python), and (vi) models for an array of joints, actuators and sensors readily available. This review covers only simulators which are actively being developed, used, or maintained. It is our understanding that anything otherwise would be of limited importance to the research community in the long-term. Additionally, this review focuses on robotics simulators and not physics engines. Physics engines are integral to every robotics simulator, but a physics engine alone does not constitute a robotics simulator unless it satisfies all criteria we use to define what a robotics simulator is.
This review acts as a guide to assist researchers in shortlisting the most relevant simulators for a given application to aid their decision making. We map out the landscape of current robotics simulators by categorising simulators that are actively maintained, and provide exemplar tasks and functionality that makes a simulator useful for a particular field or sub-domain. For each category we also provide a standardised summary table for concise communication of simulators and features. Figure 1 demonstrates the diversity of simulation environments required within robotics. The popularity of the robotics simulators discussed in sections II-VIII is portrayed visually in Figure 2.
We recommend researchers use this review as a guide for selecting a simulator for their particular research endeavour. We suggest first that a user has some idea of the particular robot platform(s) they wish to simulate (e.g. UR10, Husky, etc.), the method of actuation of the robotic platform(s), the sensors they intend to use and the physical operating environment (e.g. air, underwater, sand, city streets, etc.). The user then infers the relevant robotics research community from the denoted platform, so that, e.g., robot arms are found under manipulation. From here, we identify a subset of simulators that are likely to provide support for the required methods of actuation, sensors, and operating environment. If the planned endeavour is a crossover between fields, e.g., underwater manipulation, we suggest that all relevant sections be considered.
II. MOBILE GROUND ROBOTICS
Autonomous ground vehicle research – including legged, wheeled and tracked robots – is one of the largest studied domains in robotics. There are many fields which are incorporated in this sub-domain, including navigation, locomotion, cognition, control, perception, Simultaneous Localisation and Mapping (SLAM), and many others [9]. To motivate the use cases for simulators within mobile robotics we begin by investigating current challenges and competitions being run that represent some applications of mobile robotics research. The most prominent challenges are
the Defense Advanced Research Projects Agency (DARPA) organised robotics challenges, starting in 2003 with an autonomous driving challenge, continuing in 2012 with a search and rescue challenge and now a subterranean competition running since 2018 [10]–[12]. The most current of these requires a team of robots to collaboratively navigate and map underground GPS-denied environments to find human survivors; this event even hosts a virtual competition that is to be completed in the Gazebo simulation environment.
DJI has been running a robot competition featuring mobile robots since 2015. The task is to employ a team of robots in an arena to battle an opposing team using projectiles [13].
RoboCup is another prominent challenge that sees teams of mobile robots compete in games of football. Depending on the league, robots can either be legged or wheeled [14]. Each of these challenges requires multiple robots to: coexist in the same environment and potentially interact with one another; navigate through and interact with terrain that may be geometrically uneven; and perceive the state of the robot in its environment with a suite of onboard sensors such as LiDAR, stereo-camera, GPS, or IMU.
FIGURE 2. Citation count from 2016 to 2020 for reviewed simulators. Citations were gathered from Google Scholar using one or more of a simulator's research paper, reference manual or other citation type, and then filtered by robotics keyword.
Gazebo is a popular robotics simulator used in a wide range of mobile ground robot state-of-the-art (SOTA) research, for both legged [15], [16] and wheeled [17] robots. The Robot Operating System (ROS) interface provided by Gazebo contributes to the simulator’s popularity, and simplifies the
process of testing control software in simulation and transferring it onto the physical system. Gazebo also offers a model library for many commonly used sensors such as camera, GPS, and IMU. Gazebo provides capability to import environments from digital elevation models, SDF meshes, and OpenStreetMap. It is also possible to import robot models from Unified Robot Description Format (URDF) files. Being a rigid-body simulator, Gazebo runs quickly and can simulate multiple robots in real-time. Although Gazebo itself doesn’t provide motion planning functionality, its tight integration with ROS allows ROS path planners to be used.
CoppeliaSim (previously V-REP) is another popular choice for simulating ground-based mobile robots. SOTA research uses CoppeliaSim for navigation planning of bipedal robots [18], differential-drive robots [19], and visual trajectory tracking of differential-drive robots [20]. Similar to Gazebo, CoppeliaSim is a rigid-body simulator which is able to simulate multiple robots in real-time. It also supplies a large model library of common mobile robot platforms and sensors including 2D/3D laser, accelerometer, stereo-camera, camera, event camera, GPS, and gyro. CoppeliaSim offers path planning functionality through the commonly used OMPL library and supports height-fields for terrain specification.
Webots is another popular alternative, and is used in SOTA research to investigate, e.g., the performance of non-holonomic robot trajectory tracking [21], and to evolve bipedal robot gaits [22]. Webots has a large model database of mobile robots, environments and sensors. Sensors include accelerometer, camera, compass, GPS, inertial measurement unit, LIDAR, and radar. Webots also supports importing maps using the OpenDRIVE file format.
RaiSim [7] is a rigid-body physics simulator developed by ETH Zurich, used in research into learning dynamic policies for legged platforms [23]. RaiSim allows uneven terrain to be imported using height-map images. RaiSim is not as full-featured as other reviewed mobile robot simulators but instead is developed to provide high-fidelity contact dynamics models, which are needed for transferring controllers from simulation to reality.
SOTA work into transferring locomotion policies for quadrupeds from simulation to real-world platforms was conducted using the PyBullet simulator [24], along with research into visual navigation on the TurtleBot platform [25]. PyBullet is applicable to most mobile robotics applications as it supports rigid-body simulation with the possibility of faster than real-time simulation of multiple robots. It also supports the import of height-fields for terrain geometry specification. The sensors supported by PyBullet are quite minimal compared to other more specific mobile robot simulators, with support for cameras and several less relevant sensors.
The recent large-scale investment into self-driving cars has yielded a comparably large amount of research into the field of autonomous cars, with the CARLA simulator as a by-product. Several SOTA papers use CARLA for learning driving policies [26] and transferring policies trained in CARLA to the real world [27]. CARLA is a simulator targeted at self-driving car research and therefore has features aligned with this goal. It is scalable to allow for large scenes through a distributed architecture and includes ROS integration. CARLA provides support to import environments through the OpenDRIVE file format. A large number of sensors are available for use, including GPS, IMU, LIDAR, radar and camera. CARLA uses PhysX to compute the vehicle physics, although the settings available to the user are restricted.
Table 2 provides a comparison between the capabilities of the discussed robotics simulators in areas that are identified to be critical to the domain of mobile ground robots. Critical features identified include the ability to model sensors commonly used in the field, as well as common forms of locomotion, ability to import various environments, and inbuilt support for ROS. If the user wishes to realistically model complex environments, including sand, water, and gravel, Project Chrono provides this capability through an inbuilt Discrete Element Method (DEM) model [28].
III. MANIPULATION
The manipulation community within robotics is large and diverse, exploring everything from physical design of arms and grippers to algorithms for motion planning and control. Recent competitions and benchmarks provide insight into the current research avenues within the field and, by extension, the use cases for simulators within the manipulation research community.
One of the most prolific annual competitions is RoboCup, which has a league called RoboCup@Home [29] for assistive robots that compete in a domestic setting. Tasks that must be completed as part of the challenge rely upon manipulating rigid and deformable objects. The Tidy Up My Room Challenge from the International Conference on Robotics and Automation (ICRA) 2018 [30] and Fetch it! The Mobile Manipulation Challenge [31] are two additional challenges that are similar in that they all require multiple mobile manipulation platforms to complete multi-step tasks which include manipulation, often of rigid objects.
Other relevant challenges include the Amazon Picking Challenge, held between 2015 and 2017, which required robots to complete pick-and-place style tasks of seen and unseen objects, some of which were deformable and others which were transparent [32]. Most recently, a challenge called the Real Robot Challenge looks at the dexterous manipulation of objects in both simulated and real-world environments. There are a number of benchmarks proposed within the manipulation community for benchmarking physical as well as algorithmic advances; benchmarks relevant to our review are task-based and include pick-and-place [33], assembly [34], peg-in-hole [35] and deformable objects [36].
From here we see that the field is at a state of simulating multi-task or multi-step scenarios, requiring the fine movement and contact modelling of rigid bodies. This, combined with the self-evident need for stable physics that robustly handles contacts, reduces the field of viable simulators. To be
useful for manipulation research, simulators must have actuator models for position control, velocity control, and torque control, as these are the most commonly used modes of control for physical arms. The simulator needs to support torque sensors as well as visual sensors, namely RGB and RGB-D. Finally, built-in features that are relevant specifically for manipulators are Inverse and Forward Kinematics solvers, and path planning. Less common – but becoming more relevant as computation becomes cheaper – is the need for modelling deformable objects, as the underlying assumption that the robotics world is entirely rigid does not hold true.
TABLE 2. Feature comparison between popular robotics simulators used for Mobile Ground Robotics.
TABLE 3. Feature comparison for popular robotics simulators used for Manipulation.
Recent SOTA research has utilised a range of simulators for producing results. The capabilities of these simulators in areas relevant to the domain of robotic manipulation are summarised in Table 3. MuJoCo [37] is a simulator commonly employed within research, with notable contact stability being a reason for its popularity [38]. It was applied in an in-hand manipulation context for solving a Rubik's cube with a 24-DoF robotic hand actuated with tendons [39]. SOTA research uses MuJoCo to train policies for robotic manipulators in simulation, whether just for proof of concept [40] or for later transferring onto real-world systems [41]–[43]. From the list of features a good manipulation simulator should have, MuJoCo supports most but lacks support for inverse kinematics and path planning.
PyBullet is used in studies with object collisions [44], pick and grasp dynamics [5] and for deformable object manipulation (i.e. cloth) [45]. PyBullet has a strong robotics focus, with functionality specifically implemented for those researching robotics. Functionality that may assist manipulation researchers includes: forward/inverse kinematics; Reinforcement Learning (RL) environments; Virtual Reality (VR) integration (for task demonstration); and deformable object and cloth simulation (Finite Element Method).
Gazebo [46] is used for robotic manipulation research [47], [48]. Although neither of these investigations relies on Gazebo to conduct dexterous manipulation, one of the studies did explicitly augment the simulation with external algorithms to deal with non-rigid bodies. Gazebo provides a simulation environment with the necessary actuators and sensors for robotic manipulation. It also provides support for ROS, which provides packages for forward and inverse kinematics, as well as path and motion planning.
CoppeliaSim is a robotics simulator with a range of user-centric features including sensor and actuator models, as well as motion planning, and forward and inverse kinematics support. PyRep was recently introduced as a Python toolkit for robot learning built on CoppeliaSim, and has been shown to be capable of being used for manipulation, explicitly picking and placing cubes using a Kinova robot arm [49].
SimGrasp is used in a study for the design and simulation of tendon-driven underwater robotic hands [50]. It is a simulation package built on top of the Klamp’t simulator, which markets itself as having better collision handling than the previously-mentioned manipulation simulators [51]. Klamp’t lends itself to fast deployment of dexterous manipulation in robotics through the simulation of actuators and sensors with kinematics, dynamics and path planning.
Several works simulate deformable object manipulation using physics simulators that this review does not classify as robotics simulators. One such work uses Blender for cloth simulations [52]; however, this investigation only simulates cloth and the displacement of picked cloth coordinates. Another study uses Nvidia Flex to simulate fluids and deformable objects but abstracts away robot interactions [53]. Additionally, another work with Nvidia Flex simulates a robot completing a swinging peg-in-hole task with a 7-DoF Yumi robot. Flex is available through the ISAAC simulator [54].
IV. MEDICAL ROBOTICS
Medical robotics is a sub-domain of robotics research applying automation and robotics to, e.g., surgery, therapy, rehabilitation and hospital automation [55]. Unlike competitions in other sub-fields of robotics, medical robotics does not have competitions focused on solving specific tasks; instead
competitions are often judged based on the innovation of submissions. Examples of medical robotics competitions include the United Kingdom Robotics and Autonomous System Medical Robotics for Contagious Diseases Challenge [56] and the Kuka Medical Robotics Challenge [57]. The wide range of applications for medical robotics limits the number of medical robotics benchmarks on which to base the requirements of a medical robotics simulator. We therefore base the requirements of medical robotics simulators on the needs of recent research being conducted in this field.
Due to the nature of therapeutic and rehabilitation interventions, studies are typically conducted in reality only. Instead, we focus on simulation for robotic surgery, including training and practice for real surgeries with a robotic surgical system, and training autonomous agents to attempt surgery in a safe environment.
The most prolific robotic surgical system is the da Vinci by Intuitive Inc., which consists of multiple arms with both rotational joints and tendon-driven joints [58]. There are research robotic platforms that have been developed with similar hardware to the da Vinci, including the Raven II [59]. To realistically simulate these platforms, a simulator must be capable of simulating rotational as well as prismatic joints and should ideally simulate the tendons in both these robots for accuracy.
Surgeries performed with a robotic surgical system are often teleoperated by the surgeon. State-of-the-art teleoperation controls have haptic feedback which gives the user force/torque feedback directly from the robot [60]. Robotics simulators for research into medical robotics benefit from having the tools to provide simulated force/torque feedback for haptic devices. Simulators which take in user input for teleoperation must run in real-time, otherwise user input would result in a delayed action in the environment. Robotic simulators for medical robotics also require deformable object simulation. As humans consist primarily of non-rigid tissue, realistic simulations require the ability to simulate deformable objects.
There are several simulators that offer some or all of the features required of a simulator in medical robotics. SOTA research in the domain of medical robotics uses robotics simulators such as the Simulation Open Framework Architecture (SOFA), CHAI3D [61], Asynchronous Multi-Body Framework (AMBF), CoppeliaSim, and UnityFlexML. The capabilities of the identified simulators in areas relevant to the domain of surgical robotics are summarised in Table 4.
TABLE 4. Feature comparison of popular robotics simulators used for Medical Robotics.
SOFA is a medical simulator used to study force and thermal feedback methods for minimally invasive surgery [62]. SOFA offers a plugin for teleoperation and haptic control, supports real-time deformable object simulation and, through a robotics plugin, supports tendon-driven joints. These features, paired with the strong focus on medical applications, make SOFA a good candidate for medical robotics research requiring simulation.
CHAI3D is another simulation framework used in medical robotics research to learn a neural network for autonomous tissue manipulation in simulation [63]. CHAI3D supports multi-DoF teleoperation controllers and haptic feedback systems, real-time simulation, and deformable object simulation.
AMBF is a simulator for medical applications developed at the Automation and Interventional Medicine Robotics Research Laboratory at Worcester Polytechnic Institute, built around CHAI3D and the Bullet physics engine [4], [64]. AMBF enables fast-running simulation which uses CHAI3D for teleoperation support and haptic feedback, and Bullet for soft body simulation and prismatic joints.
UnityFlexML is a simulator developed for use in machine learning applications [65]. It is based on Unity, which has a large network of supported plugins including Nvidia Flex for deformable object simulation and teleoperation control with haptics.
The remaining simulators used for research in medical robotics offer a limited subset of the necessary features identified. For example, CoppeliaSim is used as a deep reinforcement learning environment to train both pick and reach policies on a surgical robot where deformable objects are not required [66]. CoppeliaSim does, however, support teleoperation and haptic feedback through a CHAI3D plugin, and supports prismatic joints for realistic simulation of surgical robot joints.
Another simulation environment that was used for medical research was a custom implementation using Open Dynamics Engine (ODE). ODE was used in the presentation of a framework for training users for robotic surgery [67]. Although ODE is a physics engine and not a robotics simulator, the users added additional support for teleoperation and haptic feedback for use with the Raven II surgical robot.
V. MARINE ROBOTICS
Simulators for marine robotics can be divided into two categories, namely those which are designed for underwater vehicles (AUVs, ROVs, etc.) and those which are suitable
for surface vehicles (USVs, ships, boats, etc.). Competitions such as the Singapore AUV Challenge (SAUVC) [68], RoboSub [69], RobotX [70] and MATE ROV [71] focus on practical, challenging missions. The interested reader is directed towards [72], which provides an excellent review of such competitions in the field of marine robotics, and to Table 5, which summarises the capabilities of the reviewed simulators. The design of marine robots is greatly aided by high-fidelity simulation in areas such as navigation, waypoint following, seabed mapping, and sensor-based control [73]. A good simulator should support different types of controllable vehicles, manipulators, sensors, and complex environments with accurate representation of hydrodynamic/hydrostatic forces. UWSim [74] and UUV Simulator [75] are the two most widely used options for underwater simulation.
Unmanned Underwater Vehicle (UUV) Simulator [75] is an extension for Gazebo which supports multiple underwater vehicles (ROVs and AUVs) and robotic manipulators with high-fidelity representation of hydrostatic and hydrodynamic forces. A number of commonly used sensors are included, e.g. underwater camera, pressure sensor, IMU, magnetometer, Doppler Velocity Log (DVL), etc. Models for fins and thrusters are also included for actuation. UUV Simulator allows researchers to create complex underwater environments with models already included for seabeds, lakes, shipwrecks, etc. UUV Simulator has been used for applications such as mapping [76] and path following [77].
Other notable Gazebo extensions/packages include Rock-Gazebo [78] and freefloating-gazebo [73]. Rock-Gazebo is the integration between Gazebo and the Robot Construction Kit (ROCK) framework to allow for real-time simulation. This involved extending the ROCK visualisation tool using OpenSceneGraph (OSG) for rendering underwater environments while Gazebo was used for physics simulation [75]. Rock-Gazebo has a number of limitations including not supporting multi-robot simulations [75]. freefloating-gazebo combines the dynamic simulation capabilities of Gazebo with the realistic underwater rendering of UWSim [74]. This allows it to model hydrodynamic forces. A shortcoming of freefloating-gazebo is that, due to stability concerns, it does not include the computation of added-mass forces [75]. Rock-Gazebo and freefloating-gazebo both have limited sensor support.
UWSim [74] is another open source option, which was developed at the Interactive and Robotic Systems Lab at the Jaume-I University. It utilises Bullet and OpenSceneGraph (OSG) for contact physics and supports a wide range of simulated sensors such as pressure sensor, force sensor, GPS, range camera/sensor, IMU, DVL, and much more. Multiple vehicles can be loaded and managed simultaneously while complex environments can be modelled using OSG and other third-party tools such as Blender. The underwater rendering is highly realistic and it already includes a default model for the Girona 500 vehicle and the ARM5E manipulator. UWSim has been used in a number of applications including controller design [79], path planning [80], 3D mapping [81], etc. It does, however, lag behind in terms of accuracy of simulation for dynamics and hydrodynamics of vehicles [82]. It also does not support simulation of manipulator dynamics (only kinematics) [82].
The most credible alternative to UWSim and UUV Simulator is the newly proposed StoneFish Library [82], a wrapper for Bullet which supports standard sensors such as camera, pressure, DVL, multi-beam, etc. All hydrodynamic computations are based on the actual geometry of the body, which allows for better approximation of hydrodynamic forces. The simulation effects include added mass, buoyancy and drag. Underwater thrusters and vehicle manipulator systems are available for modelling more complex setups. The simulator supports advanced rendering of underwater scenes, including scattering and light absorption. The advanced rendering is computationally expensive though and requires a recent GPU.
Unity ROS Simulator (URSim) [8] uses ROS and the Unity 3D game engine. It has sensor models for camera, IMU and pressure together with noise models for sensor input. The simulator is capable of modelling environments used in competitions such as SAUVC and RoboSub. Unity allows for the modelling of hydrodynamic forces such as buoyancy and drag. ROS provides functionality needed for control, communication, vision and sensing, with target applications in sensing, mapping, path planning, localization, obstacle avoidance and target acquisition. URSim is being actively developed, with new sensors (DVL, side scan SONAR, etc.) and a robotic manipulator planned.
Surface vehicle simulation is relatively rare [83], primarily due to the complexity associated with modelling environmental factors such as waves, wind and water currents [83], [84]. Unmanned Surface Vehicle simulator (USVSim) [83] is a dedicated simulator for this application, which is an extension of Gazebo [46]. The freefloating plugin [73] supports USV simulations by improving the hydrodynamics and buoyancy effects. The lift-drag plugin was used for calculating foil dynamics. UWSim [74] offers accurate modelling of wave and water visual effects. Re-using and improving elements from the above tools allowed the authors to come up with a robust simulator, which has been used for path planning [85].
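As a rough illustration of the hydrodynamic effects named in this section, the following Python sketch computes buoyancy and quadratic drag for a fully submerged body. It is not taken from any of the reviewed simulators: the function names, the fresh-water density, and the drag coefficient and reference area in the usage example are all assumptions chosen for illustration, and added-mass forces are deliberately left out.

```python
"""Illustrative sketch (not any simulator's API): buoyancy and quadratic drag."""

RHO_WATER = 1000.0  # kg/m^3, fresh water (assumed; seawater is slightly denser)
GRAVITY = 9.81      # m/s^2


def buoyancy_force(displaced_volume_m3, rho=RHO_WATER, g=GRAVITY):
    """Upward hydrostatic force in newtons on a fully submerged body: rho * g * V."""
    return rho * g * displaced_volume_m3


def drag_force(velocity_m_s, drag_coefficient, reference_area_m2, rho=RHO_WATER):
    """Quadratic drag in newtons along one axis: -0.5 * rho * Cd * A * v * |v|.

    The v * |v| form makes the force always oppose the direction of motion.
    """
    return (-0.5 * rho * drag_coefficient * reference_area_m2
            * velocity_m_s * abs(velocity_m_s))


def net_vertical_force(mass_kg, displaced_volume_m3, vertical_velocity_m_s,
                       drag_coefficient, reference_area_m2):
    """Sum of weight, buoyancy and vertical drag (positive is upwards)."""
    weight = -mass_kg * GRAVITY
    return (weight
            + buoyancy_force(displaced_volume_m3)
            + drag_force(vertical_velocity_m_s, drag_coefficient,
                         reference_area_m2))


if __name__ == "__main__":
    # Hypothetical vehicle: 10 kg, displacing 0.0105 m^3, sinking at 0.2 m/s.
    force = net_vertical_force(10.0, 0.0105, -0.2, 1.0, 0.1)
    print(f"net vertical force: {force:.2f} N")
```

A simulator that derives these forces from the actual body geometry would replace the fixed volume, coefficient and area used here with values computed from the mesh.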
VI. AERIAL ROBOTICS
In this section we focus on unmanned aerial vehicles (UAVs), which are the most popular area of research within aerial robotics.
Competitions such as the UAV challenge [86] and the International Aerial Robotics Competition (IARC) [87] are open competitions in the field. In the UAV challenge, the goal is to demonstrate the utility of UAVs on real-world missions such as medical rescue or delivery of essential items. IARC is the longest running collegiate competition for aerial robots, focusing on missions relating to human-robot interactions, robot-robot interactions and interactions of robots with complex environments. The NASA SAND (Safeguard with Autonomous Navigation Demonstration) [88] competition aims to address safety-critical risks associated with flying UAVs in US airspace.
Modern UAV simulators allow researchers to replicate complex real-world environments by modelling turbulence, air density, wind shear, clouds, precipitation and other fluid mechanics constraints [89]. They also support various sensors, e.g. LIDARs, GPS and cameras. Digital elevation models or height maps are also used to simulate the terrain underneath the UAV. Aerial robotics simulators include Gazebo, AirSim, Flightmare, jMAVSim, and Webots, all of which are included in Table 6 along with a comparison of important features required for aerial robotic research.
Gazebo [46] is a popular simulator for both indoor and outdoor applications [90]. Gazebo relies on the LiftDrag Plugin to simulate aerodynamic properties, and supports many common sensors such as stereo-cameras and LIDAR. The Hector plugin [91] adds UAV-specific sensors such as barometers, GPS receivers and sonar rangers. Gazebo supports a comprehensive list of UAV models [46], and open-source hardware controllers such as ArduPilot and PX4 which can be integrated for hardware-in-the-loop simulations. Gazebo, however, features limited rendering capability compared to Unity and Unreal Engine [92]. Gazebo has found application in, e.g., autonomous navigation [93], landing on moving platforms [94], multi-UAV simulation [95], and visual servoing [96].
Microsoft’s AirSim [92] is based on Unreal Engine, and supports IMU, magnetometer, GPS, barometer, and camera sensors. AirSim provides a built-in controller called simple_flight, and also supports open-source controllers such as PX4. AirSim is resource-intensive and hence requires large computing power to run when compared with other simulators [89]. It has been used in drone racing [97], wildlife conservation [98], and depth perception from visual images [99].
Flightmare [100] combines a flexible physics engine with the Unity rendering engine into a powerful simulator. Flightmare simulates high-fidelity environments including warehouses and forests. Sensor models are available for IMU and RGB cameras with ground-truth depth and semantic segmentation. The simulator is well suited for applications in deep/reinforcement learning.
jMAVSim [101] is another widely used simulator, mainly due to its tight coupling with the open-source PX4 controller owing to the initial goal of testing PX4 firmware and devices [92]. jMAVSim supports basic sensing and rendering [92].
Webots [102] is an open source simulator with an extensive set of supported sensors, including cameras, LIDARs, GPS, etc. Users can add custom physics to simulate things such as wind and integrate data from OpenStreetMap to create more realistic environments. Integration with the ArduPilot flight controller is supported. Webots has been used in multi-agent simulations [103], mitigation of bird strikes [104] and landing applications [105].
VII. SOFT ROBOTICS
Soft robotics is generally a harder simulation problem than other robotics domains, which often assume that the robot and world it operates in are mechanically rigid. Soft robotics requires simulating deformable objects and support for unconventional modes of actuation, including tendon or cable, pneumatic, and heat transference. Simulators must also support contact dynamics between the soft robot and soft/solid materials or fluids. As soft robotics is an emerging research field, competitions are relatively recent in their inception. The 2016 RoboSoft Grand Challenge consisted of three team challenges: manipulation, terrestrial locomotion, and underwater locomotion [106]. The Annual Soft Robotics Competition ran annually between 2015 and 2018. It consisted of several categories, with a panel of judges awarding prizes based on contribution and design [107]. Table 7 provides a comparison between the capabilities of different robotics simulators used to simulate soft robots in areas relevant to the domain.
Soft robotics typically employs multiphysics packages such as COMSOL, ANSYS, and Abaqus, which solve using the Finite Element Method (FEM) and simulate aspects including heat transfer, electric conduction, magnetism, and fluid flow. They typically lack sensing; however, they are fully capable of modelling soft actuation mechanisms and are used for more fundamental studies (e.g., not including environmental modelling). They have a range of modules
available to support different physics, but tend to be expensive to purchase.
Abaqus, for example, is used to model laminar jamming structures [108], 3D locomotion of soft robots with electrostatic actuators [109], deflection of a soft robot produced by thermal conduction [110] and soft robotic grippers [111]. Abaqus models non-linear behaviour well, and supports a large range of material properties with a material model library. The Abaqus FEM simulation is considered to be the industry standard. ANSYS is another modelling package used in soft robotics research, which simulates electrical, thermal and structural properties [112]. ANSYS Fluent is a well-developed package for fluid simulation, which is popular for, e.g., underwater soft robotic modelling [113]. COMSOL has more user-definable material properties than Abaqus, lending itself to research methods which use materials with unique properties. It is also considered more user-friendly than ANSYS. In SOTA research, COMSOL is used to simulate a flexible inchworm with actuation through magnetic fields [114] and a caterpillar-like robot actuated by light [115].
SOFA is a popular open-source simulator that has been used to simulate cable-driven soft robots [3], and for FEM simulation of four-legged soft robots [116]. SOFA has several useful features for robotics, including a ROS bridge and a Soft Robot plugin for modelling and actuation. Actuators from the Soft Robot plugin include tendon-driven and pneumatic actuators. SOFA is supported by an active open-source community that regularly adds new modules and features alongside its internal development.
Evosoro is a soft robot simulator based on the Voxelyze physics engine. It uses spring-mass modelling for simulation of voxel-based soft robots and includes variable-volume actuation but no sensing. Evosoro is a comparatively fast soft robot simulator, which has been coupled with evolutionary algorithms to design robot morphologies [117], and used as a design tool for real deployments of soft robots [118]. However, the simulator was found to have a significant reality gap when solutions were transferred to reality, owing to the fast but relatively inaccurate spring-mass modelling.
VIII. LEARNING FOR ROBOTICS
Learning for robotics has been an important topic of research over the last decade. Due to the sample inefficiency of current Reinforcement Learning (RL) algorithms, as well as the need to explore the state-action space, which may lead to robot failure or damage during training, the majority of works on deep RL first learn policies in simulation before deploying them on hardware. Due to the relative recency of robotic deep learning, there are relatively few competitions in the domain of learning for robotics. The Real Robot Challenge [119] is a notable exception in which participants learn dexterous manipulation of objects with a parallel manipulator, and their learned policies are compared both in simulation – in phase 1 of the competition – and on hardware in a later phase. Tasks that must be completed as part of this challenge include pushing an object to a target location, lifting it to a specified height, and moving it to a target position and orientation. Though there are few challenges targeted towards learning for robotics, learning methods have been applied in a number of other robotics challenges. A learned locomotion controller for a quadrupedal robot was recently deployed in the DARPA Subterranean challenge [123], for example.
Learning for robotics differs from the other sections covered in this work because it is concerned with implementation on a robot rather than the type of robot or the environment that a robot is deployed in. Due to the emerging popularity of robotic learning, a guide for simulator selection is included in this work. Learning methods can be applied in a wide range of robotics fields, and so the features pertinent to those fields should be considered alongside the features required for learning itself. For instance, applying learning methods to soft robotics requires support for soft contacts and materials as well as the ability to perform rapid iterative policy learning.
OpenAI Gym is a popular toolkit for training and evaluating RL algorithms, and provides environments in MuJoCo (Fig. 3) that are commonly used as baselines to evaluate new RL algorithms and methods in the literature [124]–[127]. OpenAI Gym is used in a number of simulators to train and evaluate learned policies, demonstrating the efficacy of these simulators for learning methods. Simulators used in conjunction with OpenAI Gym include: PyBullet (Fig. 3) [128], Webots [129], Nvidia Flex [130], [131], Nvidia Isaac [132], CARLA [133], Project Chrono [134], RaiSim [7], and Gazebo (Fig. 3) [135].
FIGURE 3. Robotics research within reinforcement learning relies heavily upon simulation environments, with those pictured in A) MuJoCo [120], B) PyBullet [121], and C) Gazebo [122] being popular choices.
One common application of deep learning is to learn manipulation and grasping policies. For these tasks, the fidelity of rigid or soft body contact dynamics is important, as well as having sensors to support policies for such tasks. Another application is in path planning or locomotion over rough terrain with a mobile robot, and so researchers may require complex terrains to be modelled. Many
simulators such as Gazebo, RaiSim, MuJoCo, and PyBullet allow non-flat rigid terrains to be imported from heightmap images or mesh files, but do not model soft or granular materials at a large enough scale to mimic soils, gravels, or fluid terrains. Project Chrono is one alternative with in-built support for deformable terrain, granular terrain, and fluid simulations, as well as being parallelisable. Locomotion and path-planning policies for aerial robots such as quadrotors have been learned in RaiSim [136], Flightmare [100], and Gazebo [137], [138].
In each of the described applications, sensor support is an important consideration for researchers when selecting a suitable simulator. Force-torque sensors and vision sensors are common requirements and are supported in simulators such as Gazebo, PyBullet and CoppeliaSim. Gazebo provides support for noise models which can be applied to sensor outputs. Because simulation is an abstraction of real-world conditions, policies learned in simulation typically degrade when transferred onto hardware. Overcoming this reality gap is one of the most important considerations for researchers in the learning community to address when selecting a simulator. It is also important that simulation environments vary between episodes, commonly using a technique called domain randomisation [42], to diversify the training data and allow the robot to properly explore the shared state-action space. Many simulators have in-built support for this: the ability to reset a simulation environment without shutting down the entire simulator, and to vary the initial positions and orientations of robots, cameras, and objects within the simulation, is common to many robotic simulators. The ability to randomise the textures of rendered objects in simulation and characteristics of the camera used to render them is built into MuJoCo and demonstrated in [42]. This functionality is not natively supported in Gazebo, though an external plugin has been created to do so [139]. Randomising object mass and inertia, as well as friction coefficients, is another method of performing domain randomisation common to many of the simulators considered. Applying small random forces to robots also aids in overcoming the reality gap but is not possible in all simulators. Supporting multiple physics engines is another domain randomisation technique that prevents learned policies over-fitting to the simulation environment [140]. Gazebo, V-REP, and PyBullet support multiple back-end physics engines, whereas some other simulators such as RaiSim and MuJoCo do not. The quality of rendering is also an important factor for learned policies that rely on visual data.
Due to the large amount of data needed to train neural network parameters and properly explore the state-action space, it is also important that the chosen simulator facilitates collecting this data in a timely manner. A number of simulator features can facilitate this, including: support for parallel simulation – either through simulating multiple robots in one environment or running multiple simulations in parallel with multi-threading or multiprocessing – the ability to run in headless mode, and rapid dynamics solvers that allow simulations to run faster than real-time. Due to the GPU-based physics engine of Nvidia Flex – which is available to use as a physics engine in the Nvidia Isaac robotic simulator – a walking policy for humanoid robots could be learned in 16 minutes on a single CPU and GPU. Flex also supports distributed GPU simulations, which can further reduce training times by up to eight times on some tasks [130]. Flightmare is able to maintain 200,000 steps per second while simulating 150 quadrotors in parallel, allowing it to train locomotion policies for the quadrotors much faster than in real-time [100]. Running similar simulations of a humanoid robot in Gazebo, V-REP, and Webots [141] showed that Gazebo was more CPU-intensive than the other two simulators and Webots was the least intensive of the three. Computational load is relevant to researchers considering simulators for learning because it is important to perform either as many simulations in parallel as possible, or to run a simulation as quickly as possible so that training time can be reduced.
Evolutionary robotics is a subset of learning that is distinct from the majority of deep RL methods, though many of the challenges with simulating environments for deep RL are shared in the field of evolutionary robotics. Reducing the reality gap is just as important for locomotion policies and part designs developed through evolutionary techniques as for those developed with deep RL, and evolutionary methods
also require a large number of time steps or simulations to be run. Simulators that have been used for evolutionary methods in the literature include ODE [142], Nvidia PhysX [143], Bullet [144], [145], V-REP [146], Gazebo [147], Webots [148], VoxCAD [117], and Project Chrono [149].
Table 8 compares relevant features of common simulators used for learning in robotics. Features important to this field include: those that enable domain randomisation, such as the ability to apply random external forces to the robot and employ multiple back-end physics engines; common sensors required by learned policies such as RGBD, LiDAR, and force sensors; and realistic rendering capabilities for learned policies that rely on visual data.
FIGURE 4. The future of robotic simulators is predicted to see advancements with A) widespread use of differentiable physics [151], B) increased stability and speed of simulation [155], and C) increased rendering capabilities within simulation [156].
IX. FUTURE
Physical simulation is tightly intertwined with continued advances in robotics research. It is increasingly important, especially in fields such as robotic deep learning.
In a recent debate-style workshop on sim-to-real, debaters proposed the progression of simulator accuracy as an important step in advancing simulator technology (Fig. 4) [150]. Improved accuracy can be pursued in many ways, as simulators abstract away real phenomena, yielding a coarser representation of the world. The most prevalent phenomenon to model well is contact, with large improvements likely to come from improved methods for collision detection and resolution. Collision detection is very resource-intensive and is often a source of instability within simulators. One option is to replace phenomena that are difficult to model in simulation with a neural network that can be trained to replicate the properties of those phenomena with high accuracy and then be integrated into the simulator [151].
Differentiable simulators are a fast-growing area of research which is tightly coupled with robotics. The availability of automatic differentiation libraries contributes to the large number of new publications in this domain. The primary benefit of differentiable simulation – the ability to use gradient-based rather than black-box optimisation approaches – promises a leap in efficiency and opens up previously intractable problems to learning-based optimisation. Several papers present examples which show the applicability of such simulators for system ID [152], policy creation [153] and embedding physics in neural networks (Fig. 4) [154].
Plugins and tools which are currently supported in some simulators will likely become even more prevalent and ubiquitous. Features that are most likely to be adopted by a wider range of simulators include support for the ROS middleware, and integration with external renderers such as Unity or Unreal Engine for more realistic camera streams (Fig. 4). It is likely we will see more robotic simulators also integrate baseline tools for domain randomisation, system ID, and black-box optimisation.
We are also likely to see further integration in benchmarking and algorithmic frameworks. Examples of this are RL frameworks like OpenAI Gym [157], Spinning Up [158], and robosuite [159]. Benchmarks and algorithmic implementations will likely become embedded within the simulator framework much like path planners and kinematic solvers already are. This will make it easier to benchmark algorithms against the SOTA.
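The contrast drawn in this section between gradient-based and black-box optimisation can be illustrated with a toy example. The sketch below is not taken from any of the cited simulators: it assumes a hypothetical one-dimensional point mass under gravity and linear drag, and all names and constants (`simulate`, `gradient_descent`, `random_search`, `DT`, `DRAG`, and so on) are invented for illustration. The derivative of the final height with respect to the launch velocity is propagated by hand alongside the state, which is what an automatic differentiation library would do for a real differentiable simulator.

```python
import random

# Hypothetical toy constants, chosen only for illustration.
DT, STEPS, DRAG, GRAVITY = 0.01, 100, 0.5, -9.81


def simulate(v0):
    """Roll out a 1-D point mass launched upwards with velocity v0.

    Returns the final height and its derivative with respect to v0.
    The derivative is carried through every integration step next to
    the state (forward-mode differentiation written out by hand).
    """
    x, v = 0.0, v0
    dx, dv = 0.0, 1.0                      # d(x)/d(v0) and d(v)/d(v0)
    for _ in range(STEPS):
        a = GRAVITY - DRAG * v             # acceleration with linear drag
        da = -DRAG * dv                    # its sensitivity to v0
        v, dv = v + DT * a, dv + DT * da   # semi-implicit Euler: velocity first
        x, dx = x + DT * v, dx + DT * dv   # then position, using the new velocity
    return x, dx


def loss_and_grad(v0, target):
    """Squared error of the final height and its gradient w.r.t. v0."""
    x, dx = simulate(v0)
    err = x - target
    return err * err, 2.0 * err * dx


def gradient_descent(target, v0=0.0, lr=0.5, tol=1e-10, max_iters=1000):
    """Gradient-based search; returns (v0, number of rollouts used)."""
    for i in range(1, max_iters + 1):
        loss, grad = loss_and_grad(v0, target)
        if loss < tol:
            return v0, i
        v0 -= lr * grad
    return v0, max_iters


def random_search(target, v0=0.0, sigma=1.0, tol=1e-10, max_iters=1000, seed=0):
    """Black-box baseline: keep a random perturbation only if it lowers the loss."""
    rng = random.Random(seed)
    best = loss_and_grad(v0, target)[0]    # the gradient is ignored here
    for i in range(1, max_iters + 1):
        if best < tol:
            return v0, i
        candidate = v0 + rng.gauss(0.0, sigma)
        loss = loss_and_grad(candidate, target)[0]
        if loss < best:
            v0, best = candidate, loss
    return v0, max_iters


if __name__ == "__main__":
    for name, solver in (("gradient", gradient_descent), ("black-box", random_search)):
        v, rollouts = solver(target=1.0)
        print(name, "v0 =", v, "rollouts =", rollouts,
              "loss =", loss_and_grad(v, 1.0)[0])
```

Both solvers minimise the same loss through the same rollout function; only the gradient-based one uses the sensitivity returned by `simulate`. On a problem this small the difference is modest, and the sketch says nothing about how the approaches compare in full robotic simulators with contact.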
Finally, we predict that we will see further research into estimating and modelling the uncertainty of simulators. Having a metric that captures when a simulator is accurately representing the real world would be immensely advantageous. It would provide researchers with an estimate of how likely a solution created in simulation is to transfer to the real world, and of where additional modelling is required [160].
X. CONCLUSION
Simulators aid robotics research in a multitude of ways. The benefits include reduction in cost, better management of time, and an added level of safety when dealing with complex environments. This review article provides a detailed summary of the types of simulators available to researchers in seven prominent domains of robotics research. Each section covers a range of aspects including competitions, simulator support for features needed in each domain – sensors, actuators, environments – and the current SOTA. Section IX also provides a discussion of developments that we can expect to see in the not-so-distant future.
To the best of our knowledge, this is the first review article on robotic simulators covering such a diverse range of domains of robotics research. It is an excellent starting point for new researchers and a useful reference guide for experienced researchers. We hope that more studies like this one are published over the coming years as new simulators enter the field and some seasoned ones become obsolete.
ACKNOWLEDGMENT
(Jack Collins and Shelvin Chand contributed equally to the work.)
REFERENCES
[1] T. Erez, Y. Tassa, and E. Todorov, “Simulation tools for model-based robotics: Comparison of bullet, Havok, MuJoCo, ODE and PhysX,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2015, pp. 4397–4404.
[2] L. Pitonakova, M. Giuliani, A. Pipe, and A. Winfield, Feature and Performance Comparison of the V-REP, Gazebo and ARGoS Robot Simulators (Lecture Notes in Computer Science: Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics: Lecture Notes on Artificial Intelligence), vol. 10965. Cham, Switzerland: Springer, Jul. 2018, pp. 357–368.
[3] E. Coevoet, A. Escande, and C. Duriez, “Optimization-based inverse model of soft robots with contact handling,” IEEE Robot. Autom. Lett., vol. 2, no. 3, pp. 1413–1419, Jul. 2017.
[4] A. Munawar, Y. Wang, R. Gondokaryono, and G. S. Fischer, “A real-time dynamic simulator and an associated front-end representation format for simulating complex robots and environments,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Nov. 2019, pp. 1875–1882.
[5] A. Zeng, S. Song, J. Lee, A. Rodriguez, and T. Funkhouser, “TossingBot: Learning to throw arbitrary objects with residual physics,” IEEE Trans. Robot., vol. 36, no. 4, pp. 1307–1319, Aug. 2020.
[6] O. M. Andrychowicz, B. Baker, M. Chociej, R. Józefowicz, B. McGrew, J. Pachocki, A. Petron, M. Plappert, G. Powell, A. Ray, J. Schneider, S. Sidor, J. Tobin, P. Welinder, L. Weng, and W. Zaremba, “Learning dexterous in-hand manipulation,” Int. J. Robot. Res., vol. 39, no. 1, pp. 3–20, Jan. 2020, doi: 10.1177/0278364919887447.
[7] J. Hwangbo, J. Lee, and M. Hutter, “Per-contact iteration method for solving contact dynamics,” IEEE Robot. Autom. Lett., vol. 3, no. 2, pp. 895–902, Apr. 2018. [Online]. Available: www.raisim.com
[8] P. Katara, M. Khanna, H. Nagar, and A. Panaiyappan, “Open source simulator for unmanned underwater vehicles using ROS and Unity3D,” in Proc. IEEE Underwater Technol. (UT), Apr. 2019, pp. 1–7.
[9] F. Rubio, F. Valero, and C. Llopis-Albert, “A review of mobile robots: Concepts, methods, theoretical framework, and applications,” Int. J. Adv. Robotic Syst., vol. 16, no. 2, Mar. 2019, Art. no. 1729881419839596, doi: 10.1177/1729881419839596.
[10] S. Thrun et al., “Stanley: The robot that won the DARPA grand challenge,” J. Field Robot., vol. 23, no. 9, pp. 661–692, Sep. 2006, doi: 10.1002/rob.20147.
[11] E. Krotkov, D. Hackett, L. Jackel, M. Perschbacher, J. Pippine, J. Strauss, G. Pratt, and C. Orlowski, “The DARPA robotics challenge finals: Results and perspectives,” J. Field Robot., vol. 34, no. 2, pp. 229–240, Mar. 2017, doi: 10.1002/rob.21683.
[12] I. D. Miller, F. Cladera, A. Cowley, S. S. Shivakumar, E. S. Lee, L. Jarin-Lipschitz, A. Bhat, N. Rodrigues, A. Zhou, A. Cohen, A. Kulkarni, J. Laney, C. J. Taylor, and V. Kumar, “Mine tunnel exploration using multiple quadrupedal robots,” IEEE Robot. Autom. Lett., vol. 5, no. 2, pp. 2840–2847, Apr. 2020.
[13] Shenzhen DJI Sciences and Technologies Ltd. RoboMaster Robotics Competition | Overview. Accessed: Dec. 9, 2020. [Online]. Available: https://www.robomaster.com/en-US/robo/overview
[14] H. Kitano, M. Asada, I. Noda, and H. Matsubara, “RoboCup: Robot world cup,” IEEE Robot. Autom. Mag., vol. 5, no. 3, pp. 30–36, Sep. 1998.
[15] A. W. Winkler, C. D. Bellicoso, M. Hutter, and J. Buchli, “Gait and trajectory optimization for legged systems through phase-based end-effector parameterization,” IEEE Robot. Autom. Lett., vol. 3, no. 3, pp. 1560–1567, Jul. 2018.
[16] C. D. Bellicoso, F. Jenelten, C. Gehring, and M. Hutter, “Dynamic locomotion through online nonlinear motion optimization for quadrupedal robots,” IEEE Robot. Autom. Lett., vol. 3, no. 3, pp. 2261–2268, Jul. 2018.
[17] K. Takaya, T. Asai, V. Kroumov, and F. Smarandache, “Simulation environment for mobile robots testing using ROS and Gazebo,” in Proc. 20th Int. Conf. Syst. Theory, Control Comput. (ICSTCC), Oct. 2016, pp. 96–101.
[18] A. K. Rath, D. R. Parhi, H. C. Das, M. K. Muni, and P. B. Kumar, “Analysis and use of fuzzy intelligent technique for navigation of humanoid robot in obstacle prone zone,” Defence Technol., vol. 14, no. 6, pp. 677–682, Dec. 2018. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2214914718300229
[19] L. Tai, G. Paolo, and M. Liu, “Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Sep. 2017, pp. 31–36.
[20] J. Chen, B. Jia, and K. Zhang, “Trifocal tensor-based adaptive visual trajectory tracking control of mobile robots,” IEEE Trans. Cybern., vol. 47, no. 11, pp. 3784–3798, Nov. 2017.
[21] X. Wang, G. Zhang, F. Neri, T. Jiang, J. Zhao, M. Gheorghe, F. Ipate, and R. Lefticaru, “Design and implementation of membrane controllers for trajectory tracking of nonholonomic wheeled mobile robots,” Integr. Computer-Aided Eng., vol. 23, no. 1, pp. 15–30, Dec. 2015.
[22] C.-F. Juang and Y.-T. Yeh, “Multiobjective evolution of biped robot gaits using advanced continuous ant-colony optimized recurrent neural networks,” IEEE Trans. Cybern., vol. 48, no. 6, pp. 1910–1922, Jun. 2018.
[23] J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso, V. Tsounis, V. Koltun, and M. Hutter, “Learning agile and dynamic motor skills for legged robots,” Sci. Robot., vol. 4, no. 26, Jan. 2019, Art. no. eaau5872. [Online]. Available: http://robotics.sciencemag.org/content/4/26/eaau5872.abstract
[24] J. Tan, T. Zhang, E. Coumans, A. Iscen, Y. Bai, D. Hafner, S. Bohez, and V. Vanhoucke, “Sim-to-real: Learning agile locomotion for quadruped robots,” Apr. 2018, arXiv:1804.10332. [Online]. Available: http://arxiv.org/abs/1804.10332
[25] K. Chen, J. P. de Vicente, G. Sepulveda, F. Xia, A. Soto, M. Vazquez, and S. Savarese, “A behavioral approach to visual navigation with graph localization networks,” 2019, arXiv:1903.00445. [Online]. Available: https://arxiv.org/abs/1903.00445
[26] F. Codevilla, M. Müller, A. López, V. Koltun, and A. Dosovitskiy, “End-to-end driving via conditional imitation learning,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2018, pp. 4693–4700.
[27] J. Zhang, L. Tai, P. Yun, Y. Xiong, M. Liu, J. Boedecker, and W. Burgard, “VR-goggles for robots: Real-to-sim domain adaptation for visual control,” IEEE Robot. Autom. Lett., vol. 4, no. 2, pp. 1148–1155, Apr. 2019.
[28] A. Tasora, R. Serban, H. Mazhar, A. Pazouki, D. Melanz, J. Fleischmann, M. Taylor, H. Sugiyama, and D. Negrut, “Chrono: An open source multi-physics dynamics engine,” in High Performance Computing in Science and Engineering (Lecture Notes in Computer Science), T. Kozubek, R. Blaheta, J. Šístek, M. Rozložník, and M. Čermák, Eds. Cham, Switzerland: Springer, 2016, pp. 19–49.
[29] F. Jumel, “Advancing research at the RoboCupHome competition [competitions],” IEEE Robot. Autom. Mag., vol. 26, no. 2, pp. 7–9, Jun. 2019.
[30] J. Leitner. Tidy Up My Room Challenge. Accessed: Aug. 10, 2020. [Online]. Available: http://juxi.net/challenge/tidy-up-my-room/
[31] Fetchit! The Mobile Manipulation Challenge. Accessed: Aug. 11, 2020. [Online]. Available: https://opensource.fetchrobotics.com/competition
[32] D. Morrison et al., “Cartman: The low-cost Cartesian manipulator that won the Amazon robotics challenge,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2018, pp. 7757–7764.
[33] A. S. Morgan, K. Hang, W. G. Bircher, F. M. Alladkani, A. Gandhi, B. Calli, and A. M. Dollar, “Benchmarking cluttered robot pick-and-place manipulation with the box and blocks test,” IEEE Robot. Autom. Lett., vol. 5, no. 2, pp. 454–461, Apr. 2020.
[34] K. Chatzilygeroudis, B. Fichera, I. Lauzana, F. Bu, K. Yao, F. Khadivar, and A. Billard, “Benchmark for bimanual robotic manipulation of semi-deformable objects,” IEEE Robot. Autom. Lett., vol. 5, no. 2, pp. 2443–2450, Apr. 2020.
[35] B. Calli, A. Walsman, A. Singh, S. Srinivasa, P. Abbeel, and A. M. Dollar, “Benchmarking in manipulation research: Using the Yale-CMU-Berkeley object and model set,” IEEE Robot. Autom. Mag., vol. 22, no. 3, pp. 36–52, Sep. 2015.
[36] I. Garcia-Camacho, M. Lippi, M. C. Welle, H. Yin, R. Antonova, A. Varava, J. Borras, C. Torras, A. Marino, G. Alenyà, and D. Kragic, “Benchmarking bimanual cloth manipulation,” IEEE Robot. Autom. Lett., vol. 5, no. 2, pp. 1111–1118, Apr. 2020.
[37] E. Todorov, T. Erez, and Y. Tassa, “MuJoCo: A physics engine for model-based control,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., Oct. 2012, pp. 5026–5033. [Online]. Available: http://ieeexplore.ieee.org/document/6386109/
[38] A. Rajeswaran, V. Kumar, A. Gupta, G. Vezzani, J. Schulman, E. Todorov, and S. Levine, “Learning complex dexterous manipulation with deep reinforcement learning and demonstrations,” in Proc. Robot., Sci. Syst., Pittsburgh, PA, USA, Jun. 2018. [Online]. Available: http://www.roboticsproceedings.org/rss14/index.html
[39] I. Akkaya, M. Andrychowicz, M. Chociej, M. Litwin, B. McGrew, A. Petron, A. Paino, M. Plappert, G. Powell, R. Ribas, J. Schneider, N. Tezak, J. Tworek, P. Welinder, L. Weng, Q. Yuan, W. Zaremba, and L. Zhang, “Solving Rubik’s cube with a robot hand,” Oct. 2019, arXiv:1910.07113. [Online]. Available: http://arxiv.org/abs/1910.07113
[40] D. Pathak, D. Gandhi, and A. Gupta, “Self-supervised exploration via disagreement,” in Proceedings of Machine Learning Research, vol. 97, K. Chaudhuri and R. Salakhutdinov, Eds., Long Beach, CA, USA, Aug. 2019, pp. 5062–5071. [Online]. Available: http://proceedings.mlr.press/v97/pathak19a.html
[41] P. Christiano, Z. Shah, I. Mordatch, J. Schneider, T. Blackwell, J. Tobin, P. Abbeel, and W. Zaremba, “Transfer from simulation to real world through learning deep inverse dynamics model,” Oct. 2016, arXiv:1610.03518. [Online]. Available: http://arxiv.org/abs/1610.03518
[42] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Sep. 2017, pp. 23–30.
[43] A. A. Rusu, M. Večerík, T. Rothörl, N. Heess, R. Pascanu, and R. Hadsell, “Sim-to-real robot learning from pixels with progressive nets,” in Proceedings of Machine Learning Research, vol. 78, S. Levine, V. Vanhoucke, and K. Goldberg, Eds., Aug. 2017, pp. 262–270. [Online]. Available: http://proceedings.mlr.press/v78/rusu17a.html
[44] J. Mahler and K. Goldberg, “Learning deep policies for robot bin picking by simulating robust grasping sequences,” in Proceedings of Machine Learning Research, vol. 78, S. Levine, V. Vanhoucke, and K. Goldberg, Eds., Aug. 2017, pp. 515–524. [Online]. Available: http://proceedings.mlr.press/v78/mahler17a.html
[45] J. Matas, S. James, and A. J. Davison, “Sim-to-real reinforcement learning for deformable object manipulation,” pp. 734–743, Aug. 2018, arXiv:1806.07851. [Online]. Available: http://arxiv.org/abs/1806.07851
[46] N. Koenig and A. Howard, “Design and use paradigms for Gazebo, an open-source multi-robot simulator,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), vol. 3, Sep. 2004, pp. 2149–2154. [Online]. Available: http://ieeexplore.ieee.org/document/1389727/
[47] L. Kunze and M. Beetz, “Envisioning the qualitative effects of robot manipulation actions using simulation-based projections,” Artif. Intell., vol. 247, pp. 352–380, Jun. 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0004370214001544
[48] H. Jin, Q. Chen, Z. Chen, Y. Hu, and J. Zhang, “Multi-LeapMotion sensor based demonstration for robotic refine tabletop object manipulation task,” CAAI Trans. Intell. Technol., vol. 1, no. 1, pp. 104–113, Jan. 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2468232216000111
[49] S. James, A. J. Davison, and E. Johns, “Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task,” in Proceedings of Machine Learning Research, S. Levine, V. Vanhoucke, and K. Goldberg, Eds., vol. 78, Aug. 2017, pp. 334–343. [Online]. Available: http://proceedings.mlr.press/v78/james17a.html
[50] H. Stuart, S. Wang, O. Khatib, and M. R. Cutkosky, “The ocean one hands: An adaptive design for robust marine manipulation,” Int. J. Robot. Res., vol. 36, no. 2, pp. 150–166, Feb. 2017, doi: 10.1177/0278364917694723.
[51] K. Hauser, “Robust contact generation for robot simulation with unstructured meshes,” in Proc. 16th Int. Symp. Robot. Res. (ISRR), M. Inaba and P. Corke, Eds. Cham, Switzerland: Springer, 2016, pp. 357–373.
[52] D. Tanaka, S. Arnold, and K. Yamazaki, “EMD net: An encode–manipulate–decode network for cloth manipulation,” IEEE Robot. Autom. Lett., vol. 3, no. 3, pp. 1771–1778, Jan. 2018.
[53] Y. Li, J. Wu, R. Tedrake, J. B. Tenenbaum, and A. Torralba, “Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids,” in Proc. Int. Conf. Learn. Represent., 2019. [Online]. Available: https://openreview.net/forum?id=rJgbSn09Ym
[54] Y. Chebotar, A. Handa, V. Makoviychuk, M. Macklin, J. Issac, N. Ratliff, and D. Fox, “Closing the sim-to-real loop: Adapting simulation randomization with real world experience,” in Proc. Int. Conf. Robot. Autom. (ICRA), May 2019, pp. 8973–8979.
[55] J. Troccaz, G. Dagnino, and G.-Z. Yang, “Frontiers of medical robotics: From concept to systems to clinical translation,” Annu. Rev. Biomed. Eng., vol. 21, no. 1, pp. 193–218, Jun. 2019, doi: 10.1146/annurev-bioeng-060418-052502.
[56] Medical Robotics for Contagious Diseases Challenge 2020 – UK-RAS Network. Accessed: Dec. 9, 2020. [Online]. Available: https://www.ukras.org/robotics-week/challenges/medical-robotics-for-contagious-diseases-challenge-2020/
[57] KUKA Innovation Award 2020 | KUKA AG. Accessed: Dec. 9, 2020. [Online]. Available: https://www.kuka.com/en-au/future-production/research-and-development/kuka-innovation-award/kuka-innovation-award-2020
[58] C. Freschi, V. Ferrari, F. Melfi, M. Ferrari, F. Mosca, and A. Cuschieri, “Technical review of the da vinci surgical telemanipulator,” Int. J. Med. Robot. Comput. Assist. Surg., vol. 9, no. 4, pp. 396–406, Dec. 2013, doi: 10.1002/rcs.1468.
[59] B. Hannaford, J. Rosen, D. W. Friedman, H. King, P. Roan, L. Cheng, D. Glozman, J. Ma, S. N. Kosari, and L. White, “Raven-II: An open platform for surgical robotics research,” IEEE Trans. Biomed. Eng., vol. 60, no. 4, pp. 954–959, Apr. 2013.
[60] N. Enayati, E. De Momi, and G. Ferrigno, “Haptics in robot-assisted surgery: Challenges and benefits,” IEEE Rev. Biomed. Eng., vol. 9, pp. 49–65, 2016.
[61] F. Conti, F. Barbagli, R. Balaniuk, M. Halg, C. Lu, D. Morris, L. Sentis, J. Warren, O. Khatib, and K. Salisbury, “The CHAI libraries,” in Proc. Eurohaptics, Dublin, Ireland, 2003, pp. 496–500.
[62] M. Guiatni, V. Riboulet, C. Duriez, A. Kheddar, and S. Cotin, “A combined force and thermal feedback interface for minimally invasive procedures simulation,” IEEE/ASME Trans. Mechatronics, vol. 18, no. 3, pp. 1170–1181, Jun. 2013.
[63] C. Shin, P. W. Ferguson, S. A. Pedram, J. Ma, E. P. Dutson, and J. Rosen, “Autonomous tissue manipulation via surgical robot using learning based model predictive control,” in Proc. Int. Conf. Robot. Autom. (ICRA), May 2019, pp. 3875–3881.
[64] E. Coumans, “Bullet physics simulation,” in Proc. ACM SIGGRAPH Courses (SIGGRAPH), New York, NY, USA, 2015. Accessed: Aug. 24, 2020, doi: 10.1145/2776880.2792704.
[65] E. Tagliabue, A. Pore, D. Alba, E. Magnabosco, M. Piccinelli, and P. Fiorini, “Soft tissue simulation environment to learn manipulation tasks in autonomous robotic surgery,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Las Vegas, NV, USA, Jan. 2020, pp. 3261–3266. Accessed: Nov. 26, 2020.
[66] F. Richter, R. K. Orosco, and M. C. Yip, ‘‘Open-sourced reinforcement [90] M. Schmittle, A. Lukina, L. Vacek, J. Das, C. P. Buskirk, S. Rees,
learning environments for surgical robotics,’’ 2019, arXiv:1903.02090. J. Sztipanovits, R. Grosu, and V. Kumar, ‘‘OpenUAV: A UAV testbed
[Online]. Available: http://arxiv.org/abs/1903.02090 for the CPS and robotics community,’’ in Proc. ACM/IEEE 9th Int. Conf.
[67] X. Li, H. Alemzadeh, D. Chen, Z. Kalbarczyk, R. K. Iyer, and Cyber-Phys. Syst. (ICCPS), Apr. 2018, pp. 130–139.
T. Kesavadas, ‘‘Surgeon training in telerobotic surgery via a hardware- [91] J. Meyer, A. Sendobry, S. Kohlbrecher, U. Klingauf, and O. von Stryk,
in-the-loop simulator,’’ J. Healthcare Eng., vol. 2017, Jan. 2017, ‘‘Comprehensive simulation of quadrotor UAVs using ROS and Gazebo,’’
Art. no. 6702919, doi: 10.1155/2017/6702919. in Simulation, Modeling, and Programming for Autonomous Robots,
[68] IEEE-OES. Singapore AUV Challenge. Accessed: Dec. 15, 2020. I. Noda, N. Ando, D. Brugali, and J. J. Kuffner, Eds. Berlin, Germany:
[Online]. Available: https://sauvc.org/ Springer, 2012, pp. 400–411.
[69] RoboNation. RoboSub. Accessed: Dec. 15, 2020. [Online]. Available: [92] S. Shah, D. Dey, C. Lovett, and A. Kapoor, ‘‘AirSim: High-fidelity
https://robosub.org/ visual and physical simulation for autonomous vehicles,’’ Jul. 2017,
[70] RobotX. Accessed: Dec. 15, 2020. [Online]. Available: https://robotx.org/ arXiv:1705.05065. [Online]. Available: http://arxiv.org/abs/1705.05065
[71] MATE ROV Competition. Accessed: Dec. 15, 2020. [Online]. Available:
[93] N. Imanberdiyev, C. Fu, E. Kayacan, and I.-M. Chen, ‘‘Autonomous
https://materovcompetition.org/
navigation of UAV by using real-time model-based reinforcement learn-
[72] F. Ferreira and G. Ferri, ‘‘Marine robotics competitions: A survey,’’
ing,’’ in Proc. 14th Int. Conf. Control, Autom., Robot. Vis. (ICARCV),
Current Robot. Rep., vol. 1, no. 4, pp. 169–178, Dec. 2020, doi:
Nov. 2016, pp. 1–6.
10.1007/s43154-020-00022-5.
[73] O. Kermorgant, ‘‘A dynamic simulator for underwater vehicle- [94] A. Rodriguez-Ramos, C. Sampedro, H. Bavle, P. de la Puente,
manipulators,’’ in Simulation, Modeling, and Programming for and P. Campoy, ‘‘A deep reinforcement learning strategy for UAV
Autonomous Robots (Lecture Notes in Computer Science), D. Brugali, autonomous landing on a moving platform,’’ J. Intell. Robotic
J. F. Broenink, T. Kroeger, and B. A. MacDonald, Eds. Cham, Syst., vol. 93, nos. 1–2, pp. 351–366, Feb. 2019. [Online]. Avail-
Switzerland: Springer, 2014, pp. 25–36. able: https://link.springer.com/article/10.1007/s10846-018-0891-8, doi:
[74] M. Prats, J. Perez, J. J. Fernandez, and P. J. Sanz, ‘‘An open source tool 10.1007/s10846-018-0891-8.
for simulation and supervision of underwater intervention missions,’’ in [95] N. Mahdoui, V. Frémont, and E. Natalizio, ‘‘Communicating
Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., Oct. 2012, pp. 2577–2582. multi-UAV system for cooperative SLAM-based exploration,’’
[75] M. M. M. Manhaes, S. A. Scherer, M. Voss, L. R. Douat, and J. Intell. Robotic Syst., vol. 98, no. 2, pp. 325–343, May 2020,
T. Rauschenbach, ‘‘UUV simulator: A Gazebo-based package for under- doi: 10.1007/s10846-019-01062-6.
water intervention and multi-robot simulation,’’ in Proc. OCEANS [96] H. Shi, X. Li, K.-S. Hwang, W. Pan, and G. Xu, ‘‘Decoupled visual
MTS/IEEE Monterey, Sep. 2016, pp. 1–8. servoing with fuzzy Q-learning,’’ IEEE Trans. Ind. Informat., vol. 14,
[76] B.-J. Ho, P. Sodhi, P. Teixeira, M. Hsiao, T. Kusnur, and M. Kaess, no. 1, pp. 241–252, Jan. 2018.
‘‘Virtual occupancy grid map for submap-based pose graph SLAM and [97] R. Madaan, N. Gyde, S. Vemprala, M. Brown, K. Nagami, T. Taubner,
planning in 3D environments,’’ in Proc. IEEE/RSJ Int. Conf. Intell. Robots E. Cristofalo, D. Scaramuzza, M. Schwager, and A. Kapoor, ‘‘Air-
Syst. (IROS), Oct. 2018, pp. 2175–2182. Sim drone racing lab,’’ 2020, arXiv:2003.05654. [Online]. Available:
[77] Q. Zhang, J. Lin, Q. Sha, B. He, and G. Li, ‘‘Deep interactive reinforce- http://arxiv.org/abs/2003.05654
ment learning for path following of autonomous underwater vehicle,’’ [98] E. Bondi, D. Dey, A. Kapoor, J. Piavis, S. Shah, F. Fang, B. Dilkina,
IEEE Access, vol. 8, pp. 24258–24268, 2020. R. Hannaford, A. Iyer, L. Joppa, and M. Tambe, ‘‘AirSim-W: A simu-
[78] T. Watanabe, G. Neves, R. Cerqueira, T. Trocoli, M. Reis, S. Joyeux, and lation environment for wildlife conservation with UAVs,’’ in Proc. 1st
J. Albiez, ‘‘The rock-Gazebo integration and a real-time AUV simula- ACM SIGCAS Conf. Comput. Sustain. Societies, New York, NY, USA,
tion,’’ in Proc. 12th Latin Amer. Robot. Symp. 3rd Brazilian Symp. Robot. Jun. 2018, pp. 1–12, doi: 10.1145/3209811.3209880.
(LARS-SBR), Oct. 2015, pp. 132–138. [99] K. Julian, J. Mern, and R. Tompa, ‘‘UAV depth perception from visual
[79] P. Kormushev and D. G. Caldwell, ‘‘Towards improved AUV control images using a deep convolutional neural network,’’ Stanford Univ.,
through learning of periodic signals,’’ in Proc. OCEANS, San Diego, CA, Stanford, CA, USA, Tech. Rep., 2017.
USA, Sep. 2013, pp. 1–4. [100] Y. Song, S. Naji, E. Kaufmann, A. Loquercio, and D. Scaramuzza,
[80] M. Carreras, J. D. Hernández, E. Vidal, N. Palomeras, D. Ribas, and ‘‘Flightmare: A flexible quadrotor simulator,’’ 2020, arXiv:2009.00563.
P. Ridao, ‘‘Sparus II AUV—A hovering vehicle for seabed inspection,’’ [Online]. Available: http://arxiv.org/abs/2009.00563
IEEE J. Ocean. Eng., vol. 43, no. 2, pp. 344–355, Apr. 2018. [101] A. Babushkin. jMavSim. Accessed: Dec. 15, 2020. [Online]. Available:
[81] Z. Shen, J. Song, K. Mittal, and S. Gupta, ‘‘Autonomous 3-D mapping and https://github.com/PX4/jMAVSim
safe-path planning for underwater terrain reconstruction using multi-level [102] O. Michel, ‘‘Cyberbotics Ltd. Webots: Professional mobile robot simu-
coverage trees,’’ in Proc. OCEANS, Anchorage, AK, USA, Sep. 2017, lation,’’ Int. J. Adv. Robotic Syst., vol. 1, no. 1, pp. 39–42, Mar. 2004.
pp. 1–6. [Online]. Available: http://journals.sagepub.com/doi/10.5772/5618
[82] P. Cieślak, ‘‘Stonefish: An advanced open-source simulation tool [103] Z. Obdržálek, ‘‘Software environment for simulation of UAV multi-
designed for marine robotics, with a ROS interface,’’ in Proc. OCEANS, agent system,’’ in Proc. 21st Int. Conf. Methods Models Autom. Robot.
Marseille, France, Jun. 2019, pp. 1–6. (MMAR), Aug. 2016, pp. 720–725.
[83] M. Paravisi, D. H. Santos, V. Jorge, G. Heck, L. Gonçalves, and A. Amory,
[104] P. Aliasghari, K. Dautenhahn, and C. L. Nehaniv, ‘‘Simulations on herd-
‘‘Unmanned surface vehicle simulator with realistic environmental distur-
ing a flock of birds away from an aircraft using an unmanned aerial
bances,’’ Sensors, vol. 19, no. 5, p. 1068, Mar. 2019. [Online]. Available:
vehicle,’’ in Proc. Conf. Artif. Life, 2020, pp. 626–635.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6427536/
[105] V. Stary, R. Doskocil, V. Krivanek, P. Kutilek, and A. Stefek, ‘‘Missile
[84] Y. Shi, C. Shen, H. Fang, and H. Li, ‘‘Advanced control in marine mecha-
guidance systems for UAS landing application,’’ in Proc. 17th Int. Conf.
tronic systems: A survey,’’ IEEE/ASME Trans. Mechatronics, vol. 22,
Mechatronics-Mechatronika (ME), Dec. 2016, pp. 1–5.
no. 3, pp. 1121–1131, Jun. 2017.
[85] A. G. D. S. Silva Junior, D. H. D. Santos, A. P. F. D. Negreiros, [106] M. Calisti, M. Cianchetti, M. Manti, F. Corucci, and C. Laschi.
J. M. V. B. D. S. Silva, and L. M. G. Gonçalves, ‘‘High-level path (2016). Contest-Driven Soft-Robotics Boost: The RoboSoft Grand
planning for an autonomous sailboat robot using Q-Learning,’’ Challenge. [Online]. Available: https://www.frontiersin.org/article/
Sensors, vol. 20, no. 6, p. 1550, Mar. 2020. [Online]. Available: 10.3389/frobt.2016.00055
https://www.mdpi.com/1424-8220/20/6/1550 [107] D. P. Holland, C. Abah, M. Velasco-Enriquez, M. Herman, G. J. Bennett,
[86] UAV Challenge. Accessed: Dec. 15, 2020. [Online]. Available: https:// E. A. Vela, and C. J. Walsh, ‘‘The soft robotics toolkit: Strategies for over-
uavchallenge.org/ coming obstacles to the wide dissemination of soft-robotic hardware,’’
[87] AUVSI-Foundation. International Aerial Robotics Competition IEEE Robot. Autom. Mag., vol. 24, no. 1, pp. 57–64, Mar. 2017.
(IARC). Accessed: Dec. 15, 2020. [Online]. Available: http://www. [108] Y. S. Narang, J. J. Vlassak, and R. D. Howe, ‘‘Mechanically versatile soft
aerialroboticscompetition.org/ machines through laminar jamming,’’ Adv. Funct. Mater., vol. 28, no. 17,
[88] NASA. NASA SAND Challenge. Accessed: Dec. 15, 2020. [Online]. Apr. 2018, Art. no. 1707136, doi: 10.1002/adfm.201707136.
Available: https://www.nasa.gov/sand [109] J. Cao, L. Qin, J. Liu, Q. Ren, C. C. Foo, H. Wang, H. P. Lee,
[89] A. I. Hentati, L. Krichen, M. Fourati, and L. C. Fourati, ‘‘Simulation tools, and J. Zhu, ‘‘Untethered soft robot capable of stable locomotion
environments and frameworks for UAV systems performance analysis,’’ using soft electrostatic actuators,’’ Extreme Mech. Lett., vol. 21,
in Proc. 14th Int. Wireless Commun. Mobile Comput. Conf. (IWCMC), pp. 9–16, May 2018. [Online]. Available: http://www.sciencedirect.com/
Jun. 2018, pp. 1495–1500. science/article/pii/S2352431617302250
[110] C. Ahn, X. Liang, and S. Cai, ‘‘Bioinspired design of light- [130] J. Liang, V. Makoviychuk, A. Handa, N. Chentanez, M. Macklin,
powered crawling, squeezing, and jumping untethered soft robot,’’ and D. Fox, ‘‘GPU-accelerated robotic simulation for distributed rein-
Adv. Mater. Technol., vol. 4, no. 7, Jul. 2019, Art. no. 1900185, doi: forcement learning,’’ 2018, arXiv:1810.05762. [Online]. Available:
10.1002/admt.201900185. http://arxiv.org/abs/1810.05762
[111] Z. Wang, R. Kanegae, and S. Hirai, ‘‘Circular shell gripper for handling [131] X. Lin, Y. Wang, J. Olkin, and D. Held, ‘‘SoftGym: Benchmarking
food products,’’ Soft Robot., Aug. 2020, doi: 10.1089/soro.2019.0140. deep reinforcement learning for deformable object manipulation,’’ 2020,
[112] M. Al-Rubaiai, T. Pinto, C. Qian, and X. Tan, ‘‘Soft actuators with arXiv:2011.07215. [Online]. Available: http://arxiv.org/abs/2011.07215
stiffness and shape modulation using 3D-printed conductive polylactic [132] P. Klink, C. D’Eramo, J. Peters, and J. Pajarinen, ‘‘Self-paced deep
acid material,’’ Soft Robot., vol. 6, no. 3, pp. 318–332, Jun. 2019, doi: reinforcement learning,’’ 2020, arXiv:2004.11812. [Online]. Available:
10.1089/soro.2018.0056. http://arxiv.org/abs/2004.11812
[113] C. Yue, S. Guo, and M. Li, ‘‘ANSYS Fluent-based modeling and hydro- [133] J. Chen, S. E. Li, and M. Tomizuka, ‘‘Interpretable end-to-end urban
dynamic analysis for a spherical underwater robot,’’ in Proc. IEEE Int. autonomous driving with latent deep reinforcement learning,’’ 2020,
Conf. Mechatronics Autom., Aug. 2013, pp. 1577–1581. arXiv:2001.08726. [Online]. Available: http://arxiv.org/abs/2001.08726
[114] E. B. Joyee and Y. Pan, ‘‘A fully three-dimensional printed inchworm- [134] A. Tasora, R. Serban, H. Mazhar, A. Pazouki, D. Melanz, J. Fleischmann,
inspired soft robot with magnetic actuation,’’ Soft Robot., vol. 6, no. 3, M. Taylor, H. Sugiyama, and D. Negrut, ‘‘Chrono: An open source multi-
pp. 333–345, Jun. 2019, doi: 10.1089/soro.2018.0082. physics dynamics engine,’’ in Proc. Int. Conf. High Perform. Comput. Sci.
[115] M. Rogóż, H. Zeng, C. Xuan, D. S. Wiersma, and P. Wasylczyk, Eng., 2015, pp. 19–49.
‘‘Light-driven soft robot mimics caterpillar locomotion in natural scale,’’ [135] N. G. Lopez, Y. L. E. Nuin, E. B. Moral, L. U. S. Juan, A. S. Rueda,
Adv. Opt. Mater., vol. 4, no. 11, pp. 1689–1694, Nov. 2016, doi: V. M. Vilches, and R. Kojcev, ‘‘Gym-Gazebo2, a toolkit for reinforce-
10.1002/adom.201600503. ment learning using ROS 2 and Gazebo,’’ 2019, arXiv:1903.06278.
[116] Z. Zhang, J. Dequidt, A. Kruszewski, F. Largilliere, and C. Duriez, [Online]. Available: http://arxiv.org/abs/1903.06278
‘‘Kinematic modeling and observer based control of soft robot using real- [136] J. Hwangbo, I. Sa, R. Siegwart, and M. Hutter, ‘‘Control of a quadrotor
time finite element method,’’ in Proc. IEEE/RSJ Int. Conf. Intell. Robots with reinforcement learning,’’ IEEE Robot. Autom. Lett., vol. 2, no. 4,
Syst. (IROS), Oct. 2016, pp. 5509–5514. pp. 2096–2103, Oct. 2017.
[117] N. Cheney, R. MacCurdy, J. Clune, and H. Lipson, ‘‘Unshackling [137] I. Zamora, N. G. Lopez, V. M. Vilches, and A. H. Cordero, ‘‘Extending
evolution: Evolving soft robots with multiple materials and a pow- the OpenAI gym for robotics: A toolkit for reinforcement learning using
erful generative encoding,’’ in Proc. 15th Annu. Conf. Genet. Evol. ROS and Gazebo,’’ Aug. 2016, arXiv:1608.05742. [Online]. Available:
Comput. (GECCO), Amsterdam, The Netherlands. New York, NY, http://arxiv.org/abs/1608.05742
USA: Association for Computing Machinery, 2013, pp. 167–174, [138] W. Koch, R. Mancuso, R. West, and A. Bestavros, ‘‘Reinforcement
doi: 10.1145/2463372.2463404. learning for UAV attitude control,’’ ACM Trans. Cyber-Phys. Syst., vol. 3,
[118] S. Kriegman, A. M. Nasab, D. Shah, H. Steele, G. Branin, M. Levin, no. 2, pp. 1–21, Mar. 2019, doi: 10.1145/3301273.
J. Bongard, and R. Kramer-Bottiglio, ‘‘Scalable sim-to-real transfer of [139] J. Borrego, R. Figueiredo, A. Dehban, P. Moreno, A. Bernardino, and
soft robot designs,’’ in Proc. 3rd IEEE Int. Conf. Soft Robot. (RoboSoft), J. Santos-Victor, ‘‘A generic visual perception domain randomisation
May 2020, pp. 359–366. framework for Gazebo,’’ [Online]. Available: http://www.ros.org/
[119] Max Planck Institute for Intelligent Systems. Real Robot Chal- [140] D. Ferigo, S. Traversaro, G. Metta, and D. Pucci, ‘‘Gym-ignition: Repro-
lenge. Accessed: Oct. 30, 2020. [Online]. Available: https://real-robot- ducible robotic simulations for reinforcement learning,’’ Nov. 2019,
challenge.com arXiv:1911.01715. [Online]. Available: http://arxiv.org/abs/1911.01715
[120] M. Plappert, M. Andrychowicz, A. Ray, B. McGrew, B. Baker, G. Powell, [141] A. Ayala, F. Cruz, D. Campos, R. Rubio, B. Fernandes, and
J. Schneider, J. Tobin, M. Chociej, P. Welinder, V. Kumar, and R. Dazeley, ‘‘A comparison of humanoid robot simulators:
W. Zaremba, ‘‘Multi-goal reinforcement learning: Challenging robotics A quantitative approach,’’ 2020, arXiv:2008.04627. [Online]. Available:
environments and request for research,’’ 2018, arXiv:1802.09464. http://arxiv.org/abs/2008.04627
[Online]. Available: http://arxiv.org/abs/1802.09464 [142] J. E. Auerbach and J. C. Bongard, ‘‘Environmental influence on the
[121] B. Delhaisse, L. Rozo, and D. G. Caldwell, ‘‘PyRoboLearn: evolution of morphological complexity in machines,’’ PLoS Comput.
A python framework for robot learning practitioners,’’ in Proc. Biol., vol. 10, no. 1, Jan. 2014, Art. no. e1003399.
Conf. Robot Learn. (PMLR), L. P. Kaelbling, D. Kragic, and [143] J. Rieffel, D. Knox, S. Smith, and B. Trimmer, ‘‘Growing and evolving
K. Sugiura, Eds. vol. 100, 2020, pp. 1348–1358. [Online]. Available: soft robots,’’ Artif. Life, vol. 20, no. 1, pp. 143–162, Jan. 2014.
http://proceedings.mlr.press/v100/delhaisse20a.html [144] C. J. Pretorius, M. C. du Plessis, and J. W. Gonsalves, ‘‘Evolutionary
[122] N. G. Lopez, Y. L. E. Nuin, E. B. Moral, L. U. S. Juan, A. S. Rueda, robotics applied to hexapod locomotion: A comparative study of simula-
V. M. Vilches, and R. Kojcev, ‘‘Gym-Gazebo2, a toolkit for reinforce- tion techniques,’’ J. Intell. Robotic Syst., vol. 96, nos. 3–4, pp. 363–385,
ment learning using ROS 2 and Gazebo,’’ 2019, arXiv:1903.06278. Dec. 2019.
[Online]. Available: http://arxiv.org/abs/1903.06278 [145] M. van Diepen and K. Shea, ‘‘A spatial grammar method for the computa-
[123] J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter. (2020). tional design synthesis of virtual soft locomotion robots,’’ J. Mech. Des.,
Animal Robots Learning Quadrupedal Locomotion Over Challenging vol. 141, no. 10, Oct. 2019, Art. no. 101402.
Terrain. [Online]. Available: http://robotics.sciencemag.org/ [146] M. Duarte, J. Gomes, S. M. Oliveira, and A. L. Christensen, ‘‘Evolution
[124] J. Schulman, A. Gupta, S. Venkatesan, M. Tayson-Frederick, and of repertoire-based control for robots with complex locomotor systems,’’
P. Abbeel, ‘‘A case study of trajectory transfer through non-rigid regis- IEEE Trans. Evol. Comput., vol. 22, no. 2, pp. 314–328, Apr. 2018.
tration for a simplified suturing scenario,’’ in Proc. IEEE/RSJ Int. Conf. [147] J. Nordmoen, K. O. Ellefsen, and K. Glette, ‘‘Combining MAP-elites
Intell. Robots Syst., Nov. 2013, pp. 4111–4117. and incremental evolution to generate gaits for a mammalian quadruped
[125] T. Salimans, J. Ho, X. Chen, S. Sidor, and I. Sutskever, ‘‘Evolution robot,’’ in Applications of Evolutionary Computation, K. Sim and
strategies as a scalable alternative to reinforcement learning,’’ Mar. 2017, P. Kaufmann, Eds. Cham, Switzerland: Springer, 2018, pp. 719–733.
arXiv:1703.03864. [Online]. Available: http://arxiv.org/abs/1703.03864 [148] C.-F. Juang, Y.-H. Jhan, Y.-M. Chen, and C.-M. Hsu, ‘‘Evolutionary wall-
[126] Y. Wu, E. Mansimov, S. Liao, R. Grosse, and J. Ba, ‘‘Scalable trust- following hexapod robot using advanced multiobjective continuous ant
region method for deep reinforcement learning using kronecker-factored colony optimized fuzzy controller,’’ IEEE Trans. Cognit. Develop. Syst.,
approximation,’’ in Proc. 31st Int. Conf. Neural Inf. Process. Syst. (NIPS), vol. 10, no. 3, pp. 585–594, Sep. 2018.
Long Beach, Ca, USA. Red Hook, NY, USA: Curran Associates, 2017, [149] J. Collins, W. Geles, D. Howard, and F. Maire, ‘‘Towards the targeted
pp. 5285–5294. environment-specific evolution of robot components,’’ in Proc. Genetic
[127] H. Zheng, P. Wei, J. Jiang, G. Long, and C. Zhang, ‘‘Cooperative hetero- Evol. Comput. Conf., Jul. 2018, pp. 61–68.
geneous deep reinforcement learning,’’ Tech. Rep. [Online]. Available: [150] S. Höfer, K. Bekris, A. Handa, J. Camilo Gamboa, F. Golemo,
https://arxiv.org/pdf/2011.00791.pdf M. Mozifian, C. Atkeson, D. Fox, K. Goldberg, J. Leonard, C. K. Liu,
[128] E. Coumans and Y. Bai. (2017). Pybullet, a Python Module for Physics J. Peters, S. Song, P. Welinder, and M. White, ‘‘Perspectives on Sim2Real
Simulation in Robotics, Games and Machine Learning. [Online]. Avail- transfer for robotics: A summary of the R:SS 2020 workshop,’’ Dec. 2020,
able: https://pybullet.org arXiv:2012.03806. [Online]. Available: http://arxiv.org/abs/2012.03806
[129] M. Kirtas, K. Tsampazis, N. Passalis, and A. Tefas, ‘‘Deepbots: A webots- [151] E. Heiden, D. Millard, E. Coumans, Y. Sheng, and G. S. Sukhatme,
based deep reinforcement learning framework for robotics,’’ in Proc. IFIP ‘‘NeuralSim: Augmenting differentiable simulators with neural
Int. Conf. Artif. Intell. Appl. Innov. Cham, Switzerland: Springer, 2020, networks,’’ Nov. 2020, arXiv:2011.04217. [Online]. Available:
pp. 64–75. http://arxiv.org/abs/2011.04217
[152] C. Song and A. Boularias, ‘‘Learning to slide unknown objects with SHELVIN CHAND was born in Lautoka, Fiji,
differentiable physics simulations,’’ 2020, arXiv:2005.05456. [Online]. in 1991. He received the B.Sc. and M.Sc. degrees
Available: https://arxiv.org/abs/2005.05456 from the University of the South Pacific, Suva,
[153] J. Degrave, M. Hermans, J. Dambre, and F. Wyffels, ‘‘A dif- Fiji, in 2013 and 2014, respectively, and the Ph.D.
ferentiable physics engine for deep learning in robotics,’’ Fron- degree in computer science from the University of
tiers Neurorobot., vol. 13, p. 9, Mar. 2019. [Online]. Available:
New South Wales, Canberra, Australia, in 2018.
https://www.frontiersin.org/article/10.3389/fnbot.2019.00006
[154] E. Heiden, D. Millard, H. Zhang, and G. S. Sukhatme, ‘‘Interactive He is currently a Postdoctoral Fellow with
differentiable simulation,’’ 2019, arXiv:1905.10706. [Online]. Available: the Robotics and Autonomous Systems Group,
http://arxiv.org/abs/1905.10706 Commonwealth Scientific and Industrial Research
[155] M. Müller, M. Macklin, N. Chentanez, S. Jeschke, and T. Kim, ‘‘Detailed Organization (CSIRO). His current research inter-
rigid body simulation with extended position based dynamics,’’ Com- ests include evolutionary robotics and computational creativity.
put. Graph. Forum, vol. 39, no. 8, pp. 101–112, Dec. 2020, doi:
10.1111/cgf.14105.
[156] Robotic Simulation | Unity. Accessed: Feb. 24, 2021. [Online]. Available:
https://unity.com/solutions/automotive-transportation-manufacturing/
robotics ANTHONY VANDERKOP received the B.Sc.
[157] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman,
degree in physics and the B.E. degree in mecha-
J. Tang, and W. Zaremba, ‘‘OpenAI gym,’’ 2016, arXiv:1606.01540.
[Online]. Available: https://arxiv.org/abs/1606.01540 tronics engineering from the University of Queens-
[158] J. Achiam. (2018). Spinning Up in Deep Reinforcement Learning. land, in 2019. He is currently pursuing the Ph.D.
[Online]. Available: https://github.com/openai/spinningu degree in robotics with the Queensland University
[159] L. Fan, Y. Zhu, J. Zhu, Z. Liu, O. Zeng, A. Gupta, J. Creus-Costa, of Technology, Brisbane, Australia.
S. Savarese, and L. Fei-Fei, ‘‘SURREAL: Open-source reinforcement He has been a Student with the Common-
learning framework and robot manipulation benchmark,’’ in Proceedings wealth Scientific and Industrial Research Organi-
of Machine Learning Research, vol. 87, A. Billard, A. Dragan, J. Peters, zation, Brisbane, since 2018. His research inter-
and J. Morimoto, Eds., Aug. 2018, pp. 767–782. [Online]. Available: ests include improving the capabilities of mobile
http://proceedings.mlr.press/v87/fan18a.html legged robots in rugged natural environments, particular those involving
[160] J. B. Mouret and K. Chatzilygeroudis, ‘‘20 Years of reality gap: A few
difficult terrain such as sand and snow.
thoughts about simulators in evolutionary robotics,’’ in Proc. GECCO
Genetic Evol. Comput. Conf. Companion, 2017, pp. 1121–1124.