PhD in Aerospace Engineering, 2015
University of Bristol
Master of Arts with Honours in Pure Mathematics, 2011
University of Edinburgh
Through the use of autonomy, Unmanned Aerial Vehicles (UAVs) can be used to solve a range of real-world multi-agent problems, for example search and rescue or surveillance. Within these scenarios the global objective might often be better achieved if aspects of the problem can be optimally shared amongst the agents. However, in uncertain, dynamic and often partially observable environments, centralised global-optimisation techniques are not achievable. Instead, agents may have to act on their own belief of the world, making the best decisions independently and potentially myopically. With multiple agents acting in a decentralised manner, how can we discourage competitive behaviour and instead facilitate cooperation? This paper focuses on the specific problem of multiple UAVs simultaneously searching for tasks in an environment whilst efficiently routing between them and ultimately visiting them. It is motivated by the idea that collaboration can be simple and achieved without the need for a dialogue, but instead through the design of the individual agent’s behaviour. By focusing on what is communicated, we expand the use of a single agent behaviour which, through minor modifications, can produce distinct agents demonstrating independent, collaborative and competitive behaviour. In particular, by investigating the role of sensor and communication ranges, this paper will show that increased sensor ranges can be detrimental to system performance, and that simple modelling of nearby agents’ intent is a far better approach.
Formation flight has the potential to significantly reduce aircraft fuel consumption by allowing “follower” aircraft to fly in the aerodynamic wake of “leader” aircraft. However, this requirement for aircraft to be in close proximity for large parts of their journey raises questions about the suitability of flying in formation given the diverse range of existing flights and geographical regions. This paper demonstrates the potential for two-aircraft formation flight for three distinct case studies: long-haul airline, transatlantic airline, and low-cost airline, encompassing a range of typical airline regions and characteristics. The results indicate that, even with only minor scheduling alterations, formation flight could save hundreds of millions of dollars in fuel costs and reduce carbon dioxide emissions by millions of tonnes. An analytical geometric method for calculating all possible combinations of optimal routes is presented. This is coupled with a mixed integer linear program that assigns aircraft to formation pairs. A number of different key metrics, correlations, and predictive indicators help to determine which flights, airlines, and regions show “good” formation potential. Importantly, this paper also demonstrates these results for a wide range of drag-reduction possibilities and the impact this has on achievable fuel saving.
Modelling and planning, as well as Machine Learning techniques such as Reinforcement Learning, are often difficult in multi-agent problems. With increasing numbers of agents the decision space grows rapidly and is made increasingly complex by interactions between agents. This paper is motivated by the question of whether it is possible to train single-agent policies in isolation and, without the need for explicit cooperation or coordination, still successfully deploy them to multi-agent scenarios. In particular, we look at the Multi-Agent Persistent Surveillance Problem (MAPSP), which is the problem of using a number of agents to continually visit and re-visit areas of a map to maximise a metric of surveillance. We outline five distinct single-agent policies to solve the MAPSP: Reinforcement Learning (DDPG); Neuro-Evolution (NEAT); a Gradient Descent (GD) heuristic; a random heuristic; and a pre-defined ‘ploughing pattern’ (Trail). We will compare the performance and scalability of these single-agent policies when applied to the MAPSP. Importantly, in doing so we will demonstrate an emergent property which we call the Homogeneous-Policy Convergence Cycle (HPCC), whereby agents following homogeneous policies can get stuck together, continuously repeating the same action as other agents, significantly impacting performance. This paper will show that just a small amount of noise, at the state or action level, is sufficient to solve the problem, essentially creating artificially heterogeneous policies for the agents.
VENTURER was one of the first three UK government funded research and innovation projects on Connected Autonomous Vehicles (CAVs) and was conducted predominantly in the South West region of the country. A series of increasingly complex scenarios conducted in an urban setting were used to: (i) evaluate the technology created as a part of the project; (ii) systematically assess participant responses to CAVs; and (iii) inform the development of potential insurance models and legal frameworks. Developing this understanding contributed key steps towards facilitating the deployment of CAVs on UK roads. This paper aims to describe the VENTURER Project trials and their objectives, and to detail some of the key technologies used. Importantly, we aim to introduce some informative challenges that were overcome and the subsequent project and technological lessons learned, in the hope of helping others plan and execute future CAV research. The project successfully integrated several technologies crucial to CAV development. These included: a Decision Making System using behaviour trees to make high-level decisions; a pilot-control system to smoothly and comfortably turn plans into throttle and steering actuation; sensing and perception systems to make sense of raw sensor data; and inter-CAV wireless communication capable of demonstrating vehicle-to-vehicle communication of potential hazards. The closely coupled technology integration, testing and participant-focused trial schedule led to a greatly improved understanding of the engineering and societal barriers that CAV development faces. From a behavioural standpoint, the importance of reliability and repeatability far outweighs a need for novel trajectories. While the sensor-to-perception capabilities are critical, the process of verification and validation is extremely time consuming.
Some of my previous projects