Automatic generation of training datasets
for machine learning-based visual relative localization
of micro-scale UAVs
By leveraging our relative Micro-scale Unmanned Aerial Vehicle localization sensor UVDAR, we generated an automatically annotated dataset MIDGARD, which the community is invited to use for training and testing their machine learning systems for the detection and localization of Micro-scale Unmanned Aerial Vehicles (MAVs) by other MAVs. Furthermore, we provide our system as a mechanism for rapidly generating custom annotated datasets specifically tailored to the needs of a given application. The recent literature is rich in applications of machine learning methods in automation and robotics. One particular subset of these methods is visual object detection and localization, using means such as Convolutional Neural Networks, which nowadays enable objects to be detected and classified with previously inconceivable precision and reliability. Most of these applications, however, rely on a carefully crafted training dataset of annotated camera footage. Such footage must contain the objects of interest in environments similar to those where the detector is expected to operate. Notably, the positions of the objects must be provided in annotations. For non-laboratory settings, the construction of such datasets requires many man-hours of manual annotation, especially when the footage is captured onboard MAVs. In this paper, we provide the community with a practical alternative to this approach.
The dataset and the automatic annotation method using UVDAR are described in this paper.
More on the UVDAR system can be found here: [1][2][3]
The current version of the dataset can be found here:
The dataset is separated into folders containing different types of target environments, and each of these has subfolders with specific footage sources. Each footage source contains a folder with .png images and a folder with the same number of .csv files, one for each image. The .csv annotation files contain one line per detected MAV target, and each line is a comma-separated list in the format N,X,Y,W,H,D,R, where N is the ID number of the MAV and X, Y are the coordinates of the top-left corner of the axis-aligned bounding box containing the given MAV, expressed in pixels. The horizontal axis of the image is X, and the vertical axis is Y. The origin [0,0] is at the top-left corner, and since the bounding boxes can extend slightly outside of the image area, their coordinates can be negative. W, H are the width and height of this bounding box in pixels. D is the estimated distance of the given MAV from the camera, and R is the estimated standard deviation of that distance. For more details, see the README.txt files included in the dataset.
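The annotation format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset tooling: the names `Annotation`, `parse_annotations`, and `clip_to_image` are hypothetical, the sample values and the 752x480 image size are made up for illustration, and the README.txt files in the dataset remain the authoritative description of the fields.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Annotation:
    mav_id: int          # N: ID number of the MAV
    x: float             # X: left edge of the bounding box in pixels (may be negative)
    y: float             # Y: top edge of the bounding box in pixels (may be negative)
    w: float             # W: bounding box width in pixels
    h: float             # H: bounding box height in pixels
    distance: float      # D: estimated distance of the MAV from the camera
    distance_std: float  # R: estimated standard deviation of that distance


def parse_annotations(text):
    """Parse the contents of one .csv annotation file (one line per MAV)."""
    annotations = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        n, x, y, w, h, d, r = (field.strip() for field in row)
        annotations.append(
            Annotation(int(float(n)), float(x), float(y),
                       float(w), float(h), float(d), float(r))
        )
    return annotations


def clip_to_image(a, image_width, image_height):
    """Return (left, top, right, bottom) of the box clipped to the image area."""
    left = max(0.0, a.x)
    top = max(0.0, a.y)
    right = min(float(image_width), a.x + a.w)
    bottom = min(float(image_height), a.y + a.h)
    return left, top, right, bottom


# Made-up sample content in the N,X,Y,W,H,D,R format.
sample = "1,-5,10,40,30,7.5,0.4\n2,100,120,20,15,12.0,1.1\n"
for ann in parse_annotations(sample):
    # 752x480 is an assumed image size used only for this illustration.
    print(ann.mav_id, clip_to_image(ann, 752, 480), ann.distance)
```

Clipping is kept as a separate step because some training pipelines reject boxes with negative coordinates, while others may prefer the unclipped boxes; which one is appropriate depends on the detector being trained.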
| Background | Lighting | FoV | Frames |
|---|---|---|---|
| Fields, hills | Direct sunlight | 180° | 780 |
| Fields, hills | Direct sunlight | 96° | 554 |
| Coniferous forest | Direct sunlight | 180° | 763 |
| Coniferous forest | Direct sunlight | 96° | 769 |
| Semi-urban | Direct sunlight | 96° | 475 |
| Stands | Direct sunlight | 96° | 586 |
| Modern architecture | Direct sunlight | 96° | 534 |
| Historical stairwell | Low light | 96° | 319 |
| Church interior | Very low mixed | 96° | 984 |
| Church exterior | Late evening sky | 96° | 697 |
| Warehouse interior | Low fluorescent | 96° | 564 |
| Warehouse exit | Changes halfway | 96° | 272 |
| Apartment building | Overcast sky | 96° | 300 |
| Sports hall | Fluorescent | 96° | 1179 |
If you use this dataset, please cite the paper:
V. Walter, M. Vrba and M. Saska. "On training datasets for machine learning-based visual relative localization of micro-scale UAVs". In 2020 IEEE International Conference on Robotics and Automation (ICRA). August 2020, 10674-10680.
@inproceedings{walter_icra2020,
author = "V. {Walter} and M. {Vrba} and M. {Saska}",
booktitle = "2020 IEEE International Conference on Robotics and Automation (ICRA)",
title = "On training datasets for machine learning-based visual relative localization of micro-scale {UAVs}",
year = 2020,
month = "Aug",
pages = "10674-10680"
}






