
Thursday, June 1, 2017

Preparation for RISS 2017

Technical: directed at those in CS, though a general reader may still keep up.

In preparation for the RISS internship, I have spent roughly a month exploring literature from my mentor Dr. Sycara and others from the Advanced Agent-Robotics Technology Lab. This post is a summary of my initial review.

Obviously I can't absorb all of the literature in a month. As such, I put special focus on the thesis work of Sasanka Nagavalli, the graduate student I will be working closely under this summer. Over the course of his PhD, he has formulated novel concepts involving swarm behaviour, proven that these concepts exist, and used them to develop algorithms and new benchmarks for human-swarm interaction experiments.

What is Swarm Robotics?

As opposed to the multi-robot systems I have worked with (via arc_ros), where teams of robots work together using their unique traits and roles, swarm robotics consists of a larger number of simple robots that coordinate to accomplish tasks. Each robot in a swarm runs its own local control laws, but through their interactions emergent behaviour arises. This is similar to a flock of birds forming a V, or ant behaviour in a colony.

Comparing the behaviour of a robot swarm to that of an ant colony and a flock of birds.
Flock photo courtesy of Huffington Post; robot swarm photo courtesy of Harvard.

Just like an ant colony will survive if you remove a couple of ants, a robot swarm can remain stable even if some robots in the swarm are damaged or lost. Another benefit is that swarms are relatively cheap. Since behaviour is emergent, you don't need to invest big dollars into a couple of high-tech robots that will do everything themselves. The complexity comes from putting lots of cheap ones together.
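
To make "local control law" a bit more concrete, here is a toy sketch of my own (not taken from any of the papers; the sensing radius and gain are made up): each robot only moves toward the average position of the neighbours it can sense, yet repeating this purely local rule makes the group gather together, i.e. a rendezvous behaviour emerges.

import numpy as np

def consensus_step(positions, sensing_radius=3.0, gain=0.1):
    # Purely local rule: each robot moves toward the average position
    # of the neighbours within its sensing radius.
    updated = positions.copy()
    for i, p in enumerate(positions):
        dists = np.linalg.norm(positions - p, axis=1)
        neighbours = positions[(dists > 0) & (dists < sensing_radius)]
        if len(neighbours) > 0:
            updated[i] = p + gain * (neighbours.mean(axis=0) - p)
    return updated

# Scatter 20 robots, then repeat the local rule; the group contracting
# toward a gathering point is the emergent (rendezvous) behaviour.
swarm = np.random.uniform(0.0, 5.0, size=(20, 2))
for _ in range(300):
    swarm = consensus_step(swarm)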

Challenges and Open Questions

A primary challenge in swarm robotics comes from this emergent behaviour. For example, if one wished to have a swarm complete a package delivery task, it would be extremely tedious to craft specific local control laws such that, when run across the swarm, the emergent behaviour would have the swarm pick up the package and navigate to the destination.

Instead, a collection of simple behaviours can be created, such as flocking and rendezvous. These independent behaviours are then toggled on and off in order to complete parts of a task.

Using a sequence of behaviours to guide a swarm.
Figure courtesy of Sasanka Nagavalli et al., (2017) Original Source
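
As a rough sketch of this idea (my own simplification, not the lab's actual implementation), you can think of a small library of behaviours, each mapping the swarm state to per-robot velocities, with a schedule deciding which one is active at any given time:

import numpy as np

# Hypothetical behaviour library: each behaviour maps robot positions
# to per-robot velocity commands.
def flock(positions):
    # Toy stand-in for flocking: move every robot toward the centroid.
    return 0.1 * (positions.mean(axis=0) - positions)

def disperse(positions):
    # Move every robot away from the centroid.
    return -0.1 * (positions.mean(axis=0) - positions)

behaviours = {"flock": flock, "disperse": disperse}

def run(positions, schedule, dt=0.1):
    # schedule is a list of (behaviour_name, steps) pairs, i.e. the
    # behaviours toggled on and off over time.
    for name, steps in schedule:
        for _ in range(steps):
            positions = positions + dt * behaviours[name](positions)
    return positions

swarm = np.random.uniform(0.0, 5.0, size=(20, 2))
swarm = run(swarm, [("flock", 100), ("disperse", 50)])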

Despite providing an operator with a collection of these simple behaviours that they may trigger in the swarm, it is still difficult to predict how their inputs will affect the swarm's behaviour. For example, if a swarm is in the middle of performing some behaviour, but the operator interrupts this by activating a different one, the swarm may fail to complete the second behaviour.

Showing how different input times can alter a swarm's behaviour. Figure courtesy of Sasanka Nagavalli et al., (2014). Original Source (DOI: 10.1109/ICRA.2014.6907750)

Maybe the operator should have waited a couple more seconds before activating the new behaviour? In this case, let's say the swarm then successfully switched behaviours in 10 seconds. Could changing when we give the input even improve how long the switch takes? Maybe if we waited another 2 seconds, the transition would take only 5 seconds instead of 10. The main question is how one may determine the best time to interrupt a swarm with a new behaviour input such that overall performance is maximized. From this, how can an optimal sequence of behaviours and activation times be algorithmically constructed and executed on the swarm in order to complete a full task? And can a human learn to approximate these optimal input times as well? These are some of the primary questions approached by Sasanka's work.

I'll give a brief outline of three of his papers below. These were my personal favourites, and they made the research goals and direction very clear.

Literature Summary
1: Neglect Benevolence in Human Control of Robotic Swarms
This paper introduces the concept of "Neglect Benevolence", which states that swarm performance may be increased by purposely neglecting the swarm for a period of time.
This means that, contrary to human intuition (most operators will give input as soon as possible), better performance can be achieved by waiting a period of time before giving input.

After introducing the concept, the paper formalizes it and proves that Neglect Benevolence exists for a given class of swarms. Next, it outlines an analysis showing how one could calculate the optimal input time to a swarm to maximize performance. Lastly, simulations are run to show how Neglect Benevolence affects swarm control, and how the analysis allows for calculating the optimal input time.
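
As a toy illustration of that analysis (my own, much-simplified version; the paper works with a particular class of swarm dynamics and derives the optimum analytically rather than by brute force), you can model each behaviour as a set of linear dynamics and simply sweep the switch time; Neglect Benevolence is exactly the case where the best switch time is not zero.

import numpy as np

def convergence_time(A1, A2, x0, t_switch, tol=0.05, dt=0.01, t_max=60.0):
    # Integrate x' = A1 x until t_switch, then x' = A2 x, and report how
    # long the state takes to settle within tol of the origin (the goal).
    x, t = x0.astype(float), 0.0
    while t < t_max:
        A = A1 if t < t_switch else A2
        x = x + dt * (A @ x)
        t += dt
        if np.linalg.norm(x) < tol:
            return t
    return t_max

A1 = np.array([[-1.0, 0.5], [-0.5, -1.0]])   # made-up dynamics for behaviour 1
A2 = np.array([[-0.2, 2.0], [-2.0, -0.2]])   # made-up dynamics for behaviour 2
x0 = np.array([4.0, 0.0])

# Brute-force sweep over candidate switch times; Neglect Benevolence occurs
# when the minimiser is not t_switch = 0.
results = {t_s: convergence_time(A1, A2, x0, t_s) for t_s in np.arange(0.0, 3.0, 0.25)}
best_switch_time = min(results, key=results.get)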

2: Bounds of Neglect Benevolence in Input Timing for Human Interaction with Robotic Swarms
This work follows up on Neglect Benevolence, the novel concept outlined in the paper above. Since the aforementioned research allows for calculating the optimal input time for a behaviour in a certain class of swarms, researchers could now (in theory) create a benchmark for comparing human versus optimal performance, as opposed to just comparing humans to other agents or heuristics as done in most AI work.
This paper set up an experiment to compare two different displays, to see which one better allowed operators to learn to approximate the optimal input time.
The operator must give input at some time to switch the swarm behaviour from Formation 1 to Formation 2. Depending on when they click "Activate New Formation", the convergence time may vary. Figure courtesy of Sasanka Nagavalli et al., (2015). Original Source (DOI: 10.1145/2696454.2696470)
The first display is unaided, shown below. 
Each green dot is a robot in the swarm. Figure courtesy of Sasanka Nagavalli et al., (2015) Original Source (DOI 10.1145/2696454.2696470) 
The second display, however, is aided:
Each robot now has a line from its current position to its potential position in the second formation. Figure courtesy of Sasanka Nagavalli et al., (2015). Original Source (DOI: 10.1145/2696454.2696470)

They found that despite operators having better performance (i.e. giving input significantly closer to the optimal time) with the aided display, they didn't improve as much over time as those using the unaided display. The paper suggests this may be because those using the unaided display developed their own internal model of the swarm dynamics rather than relying on the aid; regardless, further experimentation is needed to explore these questions.
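
In practice, the benchmark boils down to something quite simple (my own framing, not the paper's exact metric, and the numbers below are made up): score each trial by how far the operator's input time was from the computed optimal time.

# Hypothetical trial data: (operator input time, computed optimal time) in seconds.
unaided_trials = [(5.2, 3.0), (4.6, 3.1), (3.9, 2.8)]
aided_trials = [(3.4, 3.0), (3.3, 3.1), (3.0, 2.8)]

def mean_timing_error(trials):
    # Average absolute deviation from the optimal input time.
    return sum(abs(t_input - t_opt) for t_input, t_opt in trials) / len(trials)

print("unaided:", mean_timing_error(unaided_trials))
print("aided:", mean_timing_error(aided_trials))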

3: Automated Sequencing of Swarm Behaviours for Supervisory Control of Robotic Swarms
As mentioned earlier, it is far too difficult to come up with control laws that, when combined in a swarm, create elaborate emergent behaviour for reliably performing complex tasks. The solution is to use a library of simple behaviours and put them together to complete work. Given a task to complete and a library of behaviours at your disposal, it would be useful to have a complete algorithm that generates an optimal sequence of behaviours, and the times to activate them, in order to complete the task. This paper does just that, but in a simplified problem formulation where the behaviours must be selected at predefined discrete time points. The algorithm is similar to A*, traversing the search space using estimated remaining cost.
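
Here is a very stripped-down sketch of that search idea (my own simplification; the paper's cost functions, heuristics, and optimality guarantees are more involved): candidate plans are expanded in order of cost-so-far plus estimated cost-to-go, and each expansion appends one behaviour for the next discrete time slot.

import heapq
import itertools

def sequence_behaviours(start, behaviours, is_goal, step_cost, heuristic, max_steps=50):
    # A*-style best-first search over which behaviour to run in each
    # discrete time slot. `behaviours` maps a name to a state-transition
    # function; `heuristic` estimates the remaining cost to the goal.
    tie_break = itertools.count()   # avoids comparing states when priorities tie
    frontier = [(heuristic(start), 0.0, next(tie_break), start, [])]
    while frontier:
        _, cost, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan             # e.g. ["flock", "flock", "disperse"]
        if len(plan) >= max_steps:
            continue
        for name, step in behaviours.items():
            nxt = step(state)
            new_cost = cost + step_cost(state, nxt)
            heapq.heappush(frontier,
                           (new_cost + heuristic(nxt), new_cost, next(tie_break), nxt, plan + [name]))
    return None

With a behaviour library like the toy one sketched earlier, is_goal and heuristic might, for instance, be based on the distance of the swarm centroid from a target location.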

Summary of Literature
This was the first time that, upon reading research, I was able to obtain a sufficient grasp of the research problems, goals, and importance. The three publications above flowed together particularly well, which helped in building my understanding. I encourage interested readers to check out these publications (first, second, third).


Application Prep: Gazebo and ROS
Aside from the theoretical preparation, my work term this summer will involve implementing swarm behaviours in simulation and on real robots. The lab is using the Gazebo simulator integrated with the ROS platform. This news was delightful considering my prior experience with ROS and Stage, a 2D simulator.
Gazebo is a major step up with 3D rendering, physics support, and much more intricate modelling.
An example of a Gazebo simulation. 
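
To get my feet wet on the ROS side, here is roughly the kind of minimal node one would write to drive a single simulated robot around in Gazebo (a sketch only; the topic name /robot_0/cmd_vel is an assumption and depends on how the robot model is spawned):

#!/usr/bin/env python
# Minimal ROS node that drives one simulated robot in a circle by
# publishing velocity commands at 10 Hz.
import rospy
from geometry_msgs.msg import Twist

def main():
    rospy.init_node('circle_driver')
    pub = rospy.Publisher('/robot_0/cmd_vel', Twist, queue_size=10)
    rate = rospy.Rate(10)
    cmd = Twist()
    cmd.linear.x = 0.3        # forward speed (m/s)
    cmd.angular.z = 0.5       # turn rate (rad/s)
    while not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()

if __name__ == '__main__':
    main()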

Summary
I'm super excited for this internship. Through the online orientation sessions we had, many wonderful opportunities were presented and I can't wait to grab hold of them. The official orientation is in a few hours, and things are already becoming quite busy.
I'll do my best to maintain blog updates. It's going to be quite the summer!
