An Introduction To Swarm Robotics - A.martinoli - Tutorial - Slides
Alcherio Martinoli
SNSF Professor in Computer and Communication Sciences, EPFL; Part-Time Visiting Associate in Mechanical Engineering, Caltech
Swarm-Intelligent Systems Group, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
http://swis.epfl.ch/ [email protected]
Tutorial at ANTS-06, Brussels, September 4, 2006
Outline
Background
- Mobile robotics
- Swarm Intelligence
- Swarm Robotics
[Photo: miniature mobile robot, 5.5 cm; microcontrollers, batteries]
Perception-to-Action Loop
Controller types in the loop:
- Reactive (e.g., linear or nonlinear transform)
- Reactive + memory (e.g., filter, state variable)
- Deliberative (e.g., planning)
[Diagram: Perception (sensors) -> Computation -> Action (actuators), closed through the Environment]
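The reactive case above is a direct sensor-to-actuator transform; a minimal sketch follows. The sensor layout, gains, and speed ranges are illustrative assumptions, not values from the tutorial.

```python
def reactive_step(left_prox, right_prox, base=0.5, gain=0.4):
    """One pass of a reactive perception-to-action loop: proximity
    readings (0 = free, 1 = obstacle close) are mapped by a fixed
    linear transform onto the two wheel speeds."""
    # An obstacle on the left speeds up the left wheel and slows the
    # right one, turning the robot away from it (and vice versa).
    left_speed = base + gain * left_prox - gain * right_prox
    right_speed = base + gain * right_prox - gain * left_prox
    return left_speed, right_speed

# Obstacle sensed on the left only: the robot veers to the right.
left, right = reactive_step(1.0, 0.0)
```

A reactive + memory controller would add state (e.g., a filtered proximity estimate) between perception and action; a deliberative one would plan over an internal world model.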
Swarm Robotics
Research <-> Industry
Beni (2004)
- Intelligent swarm = a group of non-intelligent robots (machines) capable of universal computation
- Usual difficulties in defining the intelligence concept (non-predictable order from disorder, creativity)
Communication
- direct local communication (peer-to-peer)
- indirect communication through signs in the environment (stigmergy)
Current Tendencies
IEEE SIS-05
- Invariants: self-organization, distributedness, parallelism, local communication mechanisms, individual simplicity
- More interdisciplinarity, more engineering; biology is not the only reservoir for ideas
ANTS-06, IEEE SIS-06 follow the tendency; IEEE SIS-07 even more so
[Figure: robot size comparison, 24 cm]
Model-Based Approach
(main focus: analysis)
[Diagram: abstraction axis from physical reality through realistic simulation and microscopic models (individual robot states Ss, Sa) up to the macroscopic model; levels compared via common metrics over experimental time]
dNn(t)/dt = Σn' W(n|n'; t) Nn'(t) - Σn' W(n'|n; t) Nn(t)
Microscopic Level
[Diagram: microscopic level; one probabilistic finite state machine per robot R1 … Rnm (states Ss, Sa, Se, Sd, Si), object types O11 … O1p and Oq1 … Oqr, robots grouped into Caste 1 … Caste n, castes coupled through shared states Sa, Sb]
Macroscopic level:
- average quantities, central tendency prediction (1 run)
- continuous quantities: +1 ODE per state, for all robotic castes and object types (metric/task dependent!)
- -1 ODE if substituted with a conservation equation (e.g., total # of robots, total # of objects of type q, …)
Rate Equation
(time-continuous)
inflow
outflow
n, n' = states of the agents
Nn = average # of robots in state n at time t
W = transition rates (linear or nonlinear)
Nn((k+1)T) = Nn(kT) + T Σn' W(n|n'; kT) Nn'(kT) - T Σn' W(n'|n; kT) Nn(kT)
Calibration procedures:
- Idea 1: run orthogonal experiments on a priori known local interactions (robot-to-robot, robot-to-environment); use these values for all types of interactions occurring
- Idea 2: use all a priori known information (e.g., geometry) without running experiments to get initial guesses; fine-tune the parameters automatically on the target experiment with as cheap a calibration as possible (e.g., a fitting algorithm using a subset of the system)
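Idea 2 can be sketched as follows. The two-state model, the steady-state observable, and the coarse local search below are illustrative assumptions standing in for the real fitting algorithm.

```python
def avoiding_fraction(pa, ps, steps=200):
    """Fraction of robots in obstacle avoidance predicted by a
    two-state macroscopic model, iterated to (near) steady state."""
    ns = 1.0                         # fraction searching, Ns/N0
    for _ in range(steps):
        ns = ns - pa * ns + ps * (1.0 - ns)
    return 1.0 - ns

def fit_pa(observed, ps, guess):
    """Fine-tune pa around a geometry-based initial guess by picking
    the candidate whose prediction best matches the target experiment."""
    candidates = [guess * f for f in (0.25, 0.5, 1.0, 2.0, 4.0)]
    return min(candidates,
               key=lambda pa: abs(avoiding_fraction(pa, ps) - observed))
```

A real calibration would replace the coarse grid with a proper fitting algorithm, but the structure is the same: geometry-based initial guess, then cheap evaluations against the target experiment.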
A Simple Example
Nonspatiality & microscopic characterization
[Diagram: per-robot FSM; from Start to Search (Ss); on "Obstacle?" yes, move to Avoidance (Sa) with probability pa (stay in search with 1 - pa); resume Search with probability ps]
Ns(k+1) = Ns(k) - pa Ns(k) + ps Na(k)
Na(k+1) = N0 - Ns(k+1)
Ns(0) = N0 ; Na(0) = 0
Ta = mean obstacle avoidance duration
pa = probability of moving to obstacle avoidance
ps = probability of resuming search
Ns = average # of robots in search
Na = average # of robots in obstacle avoidance
N0 = # of robots used in the experiment
k = 0, 1, … (iteration index)
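The two difference equations above can be iterated directly; this is a minimal transcription, with illustrative values for pa, ps, and N0 (not measured ones).

```python
def simulate(pa, ps, n0, steps):
    """Iterate Ns(k+1) = Ns(k) - pa*Ns(k) + ps*Na(k) with the
    conservation equation Na(k) = N0 - Ns(k)."""
    ns = float(n0)                        # Ns(0) = N0, Na(0) = 0
    trajectory = [ns]
    for _ in range(steps):
        na = n0 - ns                      # conservation: Ns + Na = N0
        ns = ns - pa * ns + ps * na       # outflow to avoidance, inflow back
        trajectory.append(ns)
    return trajectory

# With pa = 0.1 and ps = 0.4, the searching population settles at the
# steady state Ns* = N0 * ps / (pa + ps) = 8 robots out of 10.
ns_history = simulate(pa=0.1, ps=0.4, n0=10, steps=100)
```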
Micro-to-macro comparison (same robot density, but the wall surface becomes relatively smaller with bigger arenas)
[Photo: experimental setup; IR reflective band, proximity sensors]
Systematic Experiments
Real robots
[Martinoli and Mondada, ISER, 1995] [Ijspeert et al., AR, 2001]
Realistic simulation
[Plot: real robots (3 runs) and realistic simulations (10 runs); system bifurcation as a function of Nrobots/Nsticks]
Geometric Probabilities
Aa = surface of the whole arena
ps = As / Aa
pr = Ar / Aa
pR = pr (N0 - 1)
pw = Aw / Aa
pg1 = ps
pg2 = Rg ps
Π j = k - Tg/TSL [1 - pg2 Ns(j)] (product over the delayed gripping interval)
6 states: 5 DEs + 1 conservation equation; Ti, Ta, Td, Tc ≥ 0; Txyz = Tx + Ty + Tz; TSL = shift-left duration [Martinoli et al., IJRR, 2004]
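The geometric probabilities above follow directly from the detection areas; a small sketch, where the numerical area values are illustrative assumptions.

```python
def geometric_probabilities(a_s, a_r, a_w, a_a, n0, r_g):
    """Encounter probabilities per time step from detection areas:
    sticks (a_s), another robot (a_r), walls (a_w), whole arena (a_a)."""
    p_s = a_s / a_a              # encountering a stick
    p_r = a_r / a_a              # encountering one given other robot
    p_R = p_r * (n0 - 1)         # encountering any of the N0 - 1 others
    p_w = a_w / a_a              # encountering a wall
    p_g1 = p_s                   # grip attempt on a free stick
    p_g2 = r_g * p_s             # collaborative grip (Rg: approach angle)
    return {"p_s": p_s, "p_r": p_r, "p_R": p_R,
            "p_w": p_w, "p_g1": p_g1, "p_g2": p_g2}

probs = geometric_probabilities(a_s=0.02, a_r=0.03, a_w=0.10,
                                a_a=1.00, n0=4, r_g=0.5)
```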
Collaboration rate: Ct = (1/Te) Σ k=0..Te C(k), with C(k) = # of collaborations at iteration k and Te = maximal number of iterations
Sample Results
Webots (10 runs), microscopic (100 runs), macroscopic model (1 run)
[Diagram/equations: Search <-> Grip transitions (successful vs. unsuccessful collaboration); the Ns(k+1) update includes a collaboration inflow term + pg2 Ng(k) Ns(k) and delayed product terms Π j [1 - pg2 Ns(j)]; conservation: Ng(k+1) = N0 - Ns(k+1)]
Initial conditions and causality: Ns(0) = N0, Ng(0) = 0; Ns(k) = Ng(k) = 0 for all k < 0
Ns = average # of robots in searching mode
Ng = average # of robots in gripping mode
N0 = # of robots used in the experiment
M0 = # of sticks used in the experiment
= fraction of robots that abandon pulling
Te = maximal number of iterations
k = 0, 1, …, Te (iteration index)
Optimal gripping time:
Tg(opt) = ∞ for N0 ≥ 2 M0 / (1 + Rg)
with N0 = number of robots, M0 = number of sticks, Rg = parameter characterizing the approaching angle for collaboration
Counterintuitive conclusion: an optimal Tg can exist also in scenarios with more robots than sticks if the collaboration is very difficult (i.e., Rg very small)!
Example: Rg = … ; critical ratio c = 2 / (1 + Rg)
Tg(opt) can be computed numerically by integrating the full model ODEs or solving the full model steady-state equations
Journal Publications
Stick Pulling
- Li, Martinoli, Abu-Mostafa, Adaptive Behavior, 2004 -> learning + micro
- Martinoli, Easton, Agassounon, Int. J. of Robotics Res., 2004 -> real + realistic + micro + macro
- Lerman, Galstyan, Martinoli, Ijspeert, Artificial Life, 2001 -> realistic + macro
- Ijspeert, Martinoli, Billard, Gambardella, Auton. Robots, 2001 -> real + realistic + micro
Object Aggregation
- Agassounon, Martinoli, Easton, Autonomous Robots, 2004 -> realistic + macro + activity regulation
- Martinoli, Ijspeert, Mondada, Robotics and Autonomous Systems -> real + realistic + micro
[Plot: time to completion (≈500-1500) vs. number of robots (10-20); collision time also shown]
Machine-Learning-Based Approach
(main focus: synthesis)
Learning to Avoid Obstacles by Shaping a Neural Network Controller using Genetic Algorithms
Oi = f(xi)
f(x) = 2 / (1 + e^(-x)) - 1
xi = Σ j=1..m wij Ij + I0
Ij = inputs (proximity sensors S1 … S8)
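The neuron above, written out directly: Oi = f(Σj wij Ij + I0) with f(x) = 2/(1 + e^-x) - 1 squashing to (-1, 1). The weight, input, and bias values are illustrative.

```python
import math

def transfer(x):
    """f(x) = 2 / (1 + e^-x) - 1: a sigmoid squashing to (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def neuron_output(weights, inputs, bias):
    """O_i = f(sum_j w_ij * I_j + I_0)."""
    x = sum(w * i for w, i in zip(weights, inputs)) + bias
    return transfer(x)

# Zero net input keeps the neuron silent; large input saturates near 1.
silent = neuron_output([0.5, -0.5], [1.0, 1.0], 0.0)
```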
Note: in our case we evolve the synaptic weights, but Hebbian rules for dynamic change of the weights, transfer function parameters, etc. can also be evolved.
Φ = V (1 - √Δv) (1 - i)
V = mean speed of the wheels, 0 ≤ V ≤ 1
Δv = absolute value of the algebraic difference between the wheel speeds, 0 ≤ Δv ≤ 1
i = activation value of the sensor with the highest activity, 0 ≤ i ≤ 1
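The obstacle-avoidance fitness Φ = V (1 - √Δv)(1 - i) reads directly as code; the sample wheel/sensor values below are illustrative.

```python
import math

def fitness(v_mean, dv, i_max):
    """phi = V * (1 - sqrt(dv)) * (1 - i): rewards fast (high V),
    straight (small dv) motion far from obstacles (small i)."""
    return v_mean * (1.0 - math.sqrt(dv)) * (1.0 - i_max)

# Full speed, perfectly straight, no obstacle nearby: maximal fitness.
best = fitness(1.0, 0.0, 0.0)
# Spinning in place (dv = 1) scores zero no matter the speed.
spin = fitness(1.0, 1.0, 0.0)
```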
Note: Fitness accumulated during evaluation span, normalized over number of control loops (actions).
Note: the controller architecture can be of any type, but GA/PSO are worth using when the number of parameters to be tuned is large.
Fitness evolution
Note: the direction of motion is NOT encoded in the fitness function: the GA automatically discovers the asymmetry in the sensory system configuration (6 proximity sensors in the front, 2 in the back).
Collective: the fitness becomes noisy due to partial perception and independent parallel actions.
Noisy Optimization
- Multiple evaluations at the same point in the search space yield different results
- Depending on the optimization problem, the evaluation of a candidate solution can be more or less expensive in terms of time
- Noise causes decreased convergence speed and residual error
- Little exploration of noisy optimization in evolutionary algorithms, and very little in PSO
Key Ideas
- Better information about a candidate solution can be obtained by combining multiple noisy evaluations
- We could systematically evaluate each candidate solution a fixed number of times -> not smart from a computational point of view
- In particular for long evaluation spans, we want to dedicate more computational power/time to evaluating promising solutions and to eliminating "lucky" ones as quickly as possible -> each candidate solution might have been evaluated a different number of times when compared
- In GA, good and robust candidate solutions survive over generations; in PSO, they survive in the individual memory
- Use aggregation functions for multiple evaluations: e.g., minimum and average
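The aggregation idea above can be sketched as follows; the bounded-uniform noise model and the candidate's true fitness are illustrative assumptions.

```python
import random

def noisy_eval(true_fitness, noise, rng):
    """One evaluation corrupted by bounded uniform noise (assumed model)."""
    return true_fitness + rng.uniform(-noise, noise)

def aggregate(true_fitness, n_evals, rng, noise=0.2, how=min):
    """Combine several re-evaluations of the same candidate; `min` is
    conservative: a candidate that was lucky once rarely stays lucky
    across every re-evaluation."""
    return how(noisy_eval(true_fitness, noise, rng) for _ in range(n_evals))

rng = random.Random(42)
# More re-evaluations pull the aggregated (min) estimate toward the
# pessimistic end of the noise band around the true fitness 0.5.
estimate = aggregate(0.5, n_evals=10, rng=rng)
```

In a full optimizer the number of re-evaluations per candidate would be adaptive, concentrating effort on promising solutions rather than fixed for all.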
[Plots: GA vs. PSO results, Scenario 3]
Three orthogonal axes to consider (extremities or balanced solutions are possible):
- individual vs. group fitness
- private (no parameter sharing) vs. public (parameter sharing) policies
- homogeneous vs. heterogeneous systems
Big Artillery such as GA/PSO Is Not Always the Most Appropriate Solution
- Simple individual learning rules combined with collective flexibility can achieve extremely interesting results
- Simplicity and low computational cost mean possible embedding on simple, real robots
- Differences with basic in-line learning: adaptive step size, faster convergence, and more stability at convergence
Enforcing Homogeneity
[Plots: performance over learning for homogeneous groups (5 robots) and split groups (3 robots + 2 robots); x-axis ≈100-600]
Allowing Heterogeneity
[Plot: performance ratio (≈1.0-1.2) vs. number of robots (1-6); specialized teams highlighted]
Performance ratio between heterogeneous (full and two-caste) and homogeneous groups AFTER learning
Diversity Metrics
(Balch 1998)
- Entropy-based diversity measures introduced in AB-04 could be used for analyzing threshold distributions
- Simple entropy and social entropy
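A sketch of the simple (Shannon) entropy over caste proportions; Balch's social entropy additionally clusters agents hierarchically by behavioral similarity, which is omitted here. The caste labels are illustrative.

```python
import math
from collections import Counter

def simple_entropy(caste_labels):
    """H = -sum_c p_c * log2(p_c), with p_c the fraction of agents in
    caste c: 0 for a homogeneous swarm, maximal for equal-size castes."""
    n = len(caste_labels)
    h = -sum((c / n) * math.log2(c / n)
             for c in Counter(caste_labels).values())
    return h + 0.0   # normalize -0.0 to 0.0 for the homogeneous case

homogeneous = simple_entropy(["puller"] * 5)                          # 0 bits
balanced = simple_entropy(["puller", "puller", "helper", "helper"])   # 1 bit
```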
Specialization Metric
The specialization metric introduced in AB-04 could be used for analyzing the specialization arising from a variable-threshold division-of-labor algorithm.
S = specialization; D = social entropy; R = swarm performance
Note: this would be particularly useful when the number of tasks to be solved is not well-defined or it is difficult to assess the task granularity a priori. In such cases the mapping between task granularity and caste granularity might not be trivial (one-to-one mapping? how many sub-tasks for a given main task? etc.; see the limited performance of a caste-based solution in the stick-pulling experiment).
Relative Performance
- Specialization more important for small teams
- Local p > global p
- Enforced caste: pays the price for odd team sizes
Diversity
Flat curves; difficult to tell whether diversity brings performance.
Specialization
- Specialization is higher with global when needed, and drops more quickly when not needed
- Enforcing a caste acts as a low-pass filter
Networks of S&A
- Vertebrates
- Pedestrians
- Multi-robot systems
- Realistic: intra-node details and the communication channel reproduced faithfully (Webots with an OMNeT++ plugin)
- Physical reality: detailed info on sensor nodes available
[Diagram: abstraction levels (states Ss, Sa) vs. experimental time; common metrics shared across levels]
- Macroscopic: rate equations, mean-field approach, whole swarm
- Microscopic: multi-agent models, 1 agent = 1 robot or cockroach; similar description for all nodes
- Realistic: intra-robot details, environment (e.g., shelter, arena) details reproduced faithfully; cockroaches: body volume + animation
- Physical reality: detailed info on robots; limited info on the physiology of cockroaches, individual behavior measurable externally
Macroscopic 1: chemical equilibrium is completely defined by the equilibrium constant K of each reaction (law of mass action)
Macroscopic 2: reaction kinetics describes how a reaction occurs and at which speed (differential equations)
- Microscopic 2: agent-based model, 2D and 3D geometry of the molecules captured, 1 agent = 1 aggregate
- Physical reality: microscopic (e.g., crystallography) and macroscopic (chemical reaction) measurements
- Realistic: intra-robot details, environment simplified (no realistic fluid dynamics yet)
- Physical reality: detailed info on robots
- Microscopic: multi-agent models, 1 agent = 1 blimp; trajectory maintained, visualization with Webots
- Macroscopic: TBD
Conclusions
Books:
- E. Bonabeau, M. Dorigo, and G. Theraulaz, "Swarm Intelligence: From Natural to Artificial Systems", Santa Fe Studies in the Sciences of Complexity, Oxford University Press, 1999.
- T. Balch and L. E. Parker (Eds.), "Robot Teams: From Diversity to Polymorphism", Natick, MA: A K Peters, 2002.