From Michael Branicky -- Thanks Michael!
The term Behavioral Programming is meant to convey a body of techniques
for "predictably composing lower-level behaviors into solutions that
satisfy higher-level goals." Typically, the lower-level behaviors are
given by sensorimotor loops, or controllers, and operate in a continuous
domain. Typically, also, higher-level goals are encoded symbolically.
Behavioral Programming techniques are predominantly necessary for solving
procedural learning problems, such as juggling, riding a bike, playing a
piano, or running an obstacle course.
Control theory has been successful in the analysis and design of robust
low-level controllers. AI has been successful in planning to achieve
higher-level goals when the underlying state space and move operators
are discrete. The emerging area of hybrid systems has attempted to
formalize systems that combine both discrete and continuous state spaces
and dynamics. Together, we think these three areas offer a chance to
solve Behavioral Programming problems efficiently, leading to new
capabilities for engineered intelligent systems.
The Behavioral Programming session consisted of an introductory lecture
and four invited talks, as follows:
- M.S. Branicky (Case Western Reserve U.): Behavioral Programming
- M. Huber and R. A. Grupen (U. Massachusetts, Amherst): A Hybrid
Architecture for Learning Robot Control Tasks
- R. Grzeszczuk, D. Terzopoulos and G. Hinton (U. Toronto, Intel):
Fast Neural Network Emulation and Control of Dynamical Systems
- R.W. Ghrist and D.E. Koditschek (U. Michigan): Safe Cooperative
Robot Dynamics on Graphs
- S.V. Shastri (SRI): On use of Hybrid Control for Legged Locomotion
Branicky introduced the problem of Behavioral Programming (BP) and set it
in a context he called a "Middle-Out" Approach to AI (to complement the
Top-Down, GOFAI and the Bottom-Up, Emergent approaches). Examples of
behaviors and their composition (in time, in chains, and in parallel)
were given. The key to effective BP is that it is PROGRAMMING, not
emergence or global search. He showed how his theory of Optimal Hybrid
Control can solve BP problems (even if the behaviors can be continuously
modulated) by considering them to be dynamic programming problems with
both continuous and discrete actions. He presented an idea of combining
a fast marching method for time optimal hybrid control problems with A*
search. Also of particular relevance was some recent work by Precup,
Sutton, et al. More work combining control theory, AI, and hybrid
systems was felt necessary to make further progress.
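The idea of treating behavior composition as a dynamic programming problem
over both discrete and continuous actions can be illustrated with a minimal
sketch. This is not Branicky's Optimal Hybrid Control algorithm, and it uses
neither the fast marching method nor A*. It is plain value iteration in which
the discrete choice is which behavior to run and the continuous modulation u
in [0, 1] is handled by sampling a few values. The state space, the two
behaviors ("creep" and "stride"), and their costs are all hypothetical.

```python
import math

def hybrid_value_iteration(n_states, goal, behaviors, samples=5, sweeps=100):
    """Cost-to-go over integer states with hybrid (discrete + continuous) actions.

    behaviors maps a name to (step, cost), where step(state, u) returns the
    next state and cost(u) returns the price of running that behavior with
    modulation u. The continuous parameter u is sampled on a uniform grid,
    which is an approximation made only for this sketch.
    """
    value = [math.inf] * n_states
    value[goal] = 0.0
    policy = [None] * n_states
    grid = [i / (samples - 1) for i in range(samples)]
    for _ in range(sweeps):
        changed = False
        for s in range(n_states):
            if s == goal:
                continue
            for name, (step, cost) in behaviors.items():   # discrete choice
                for u in grid:                              # continuous choice
                    nxt = step(s, u)
                    if not 0 <= nxt < n_states:
                        continue
                    candidate = cost(u) + value[nxt]
                    if candidate < value[s] - 1e-12:
                        value[s] = candidate
                        policy[s] = (name, u)
                        changed = True
        if not changed:   # no state improved in this sweep: converged
            break
    return value, policy

# Hypothetical behaviors on a line of 7 cells with the goal at cell 6.
BEHAVIORS = {
    # "creep" always advances one cell at unit cost and ignores u.
    "creep": (lambda s, u: s + 1, lambda u: 1.0),
    # "stride" advances 1 to 3 cells; longer strides cost more in total.
    "stride": (lambda s, u: s + 1 + int(2 * u), lambda u: 1.2 + 0.8 * u),
}

value, policy = hybrid_value_iteration(7, 6, BEHAVIORS)
```

With these made-up costs, the cheapest plan from cell 0 is two full-length
strides (u = 1.0) for a total cost of 4.0, while near the goal the shorter
behaviors win because a full stride would overshoot.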
Huber and Shastri presented examples of actual walking robots where
reinforcement learning had been applied at the level of behaviors, like
turning, instead of at the joint-torque level. Huber stressed parallel
compositions, where one controller acts in the "null space" of another;
Shastri stressed the idea of using "reference models" as behaviors.
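One common reading of a controller acting in the "null space" of another is
that the secondary command is projected so that it cannot change the primary
task's output. The sketch below assumes a scalar primary task with a single
Jacobian row J and joint-velocity commands given as plain lists. The function
name and the example numbers are hypothetical and are not taken from Huber's
talk.

```python
def null_space_compose(jacobian_row, primary_cmd, secondary_cmd):
    """Add secondary_cmd to primary_cmd without disturbing the primary task.

    For a single-row Jacobian J, the null-space projector is
    N = I - J^T J / (J J^T). Applying N to the secondary command removes
    its component along J, so J . (primary + N secondary) == J . primary.
    """
    jj = sum(j * j for j in jacobian_row)
    if jj == 0.0:
        raise ValueError("Jacobian row must be non-zero")
    # Component of the secondary command along the task direction.
    along = sum(j * q for j, q in zip(jacobian_row, secondary_cmd)) / jj
    projected = [q - along * j for j, q in zip(jacobian_row, secondary_cmd)]
    return [p + n for p, n in zip(primary_cmd, projected)]

def task_rate(jacobian_row, cmd):
    """Rate of change of the primary task variable under a joint command."""
    return sum(j * q for j, q in zip(jacobian_row, cmd))

J = [1.0, 1.0, 0.0]            # hypothetical task Jacobian
primary = [0.5, 0.5, 0.0]      # command from the higher-priority controller
secondary = [1.0, 0.0, 2.0]    # command from the lower-priority controller
combined = null_space_compose(J, primary, secondary)
```

Here task_rate(J, combined) equals task_rate(J, primary), so the
lower-priority controller only uses the freedom the higher-priority one
leaves over.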
Terzopoulos gave an overview of his framework for the learning of complex
motions (e.g., Sea World tricks) by Artificial Fishes. He used
stochastic search over actions at a hierarchy of levels (motor commands
to local controllers to compositions of these) to accomplish such feats.
In particular, at each level, simulated annealing is performed to
minimize an objective function, which is evaluated on a simulation run.
As this can be costly, his student, Grzeszczuk, presented a means of
learning a (differentiable!) neural network prediction of simulation
"super steps" to speed evaluations by orders of magnitude. Koditschek
and Ghrist's student, Eric Klavins, presented their work on extending
potential fields to topological spaces like graphs. Some examples of
how these can be used to enforce cooperation by multiple robots in a
manufacturing setting were given.
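The annealing step described above for the Artificial Fishes work, in which
each candidate set of actions is scored by running a simulation, can be
sketched as follows. This is a generic simulated annealing loop, not the
authors' implementation. The toy point-mass simulate function, the geometric
cooling schedule, and every parameter value are assumptions made for
illustration.

```python
import math
import random

def simulate(controls, dt=0.1, target=1.0):
    """Toy stand-in for a costly simulation run (hypothetical).

    Drives a 1-D point mass with piecewise-constant forces and returns an
    objective that is small when the mass ends at the target and at rest.
    """
    pos, vel = 0.0, 0.0
    for u in controls:
        vel += u * dt
        pos += vel * dt
    return (pos - target) ** 2 + vel ** 2

def anneal(objective, x0, steps=2000, t0=1.0, t_end=1e-3, sigma=0.5, seed=0):
    """Minimize objective(x) by simulated annealing over a list of floats."""
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)
    best, fbest = list(x), fx
    for k in range(steps):
        # Geometric cooling from t0 down to t_end.
        t = t0 * (t_end / t0) ** (k / max(1, steps - 1))
        cand = list(x)
        i = rng.randrange(len(cand))
        cand[i] += rng.gauss(0.0, sigma)      # perturb one control value
        fc = objective(cand)                  # one simulation run per candidate
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
    return best, fbest

best_controls, best_cost = anneal(simulate, [0.0] * 10)
```

Every iteration calls the objective once, which is why a cheap learned
emulator of the simulation, as in Grzeszczuk's talk, can speed up the search
considerably.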
A short discussion session looked at questions raised by the session.
These included the following. Q: How are behaviors chosen? A: By
engineering insight, to be orthogonal, for completeness. Two other major
questions remained open: Can new behaviors be learned or discovered? In
what other ways can behaviors be orchestrated to achieve overall goals?
Other outcomes included a precise definition of "behavior" (in response
to Prof. Peter Caines of McGill U.) and a proposal to add rewards to
continuous actions in a planned extension to Hybridcc (in response to
Dr. Vineet Gupta of NASA Ames).
Finally, our session was related to many other papers presented at
the workshop. Of particular note was the work of Feng Zhao, Xerox
PARC (who performs searches on graphs gleaned from behaviors where
the control parameters are set to different, constant values); Todd
Neller, Stanford U. (who extended alpha-beta pruning to hybrid games);
Peter Caines (who has developed a hierarchical hybrid systems theory and
a hierarchically-accelerated dynamic programming method for discrete
systems); and Shankar Sastry, UC Berkeley and Claire Tomlin, Stanford U.
(who use game theoretic approaches for solving hybrid control problems).