- Researchers built a deep convolutional neural network that can map every building in the US using satellite imagery.
- To do this, they combined signed-distance labels with SegNet (a deep learning architecture).
- The system processes an area of 56 square kilometers in less than one minute.
In the last couple of years, high-resolution remote sensing imagery has become less expensive and more accessible. Several studies have been devoted to automated building extraction, but establishing a large-scale database of reliable building footprints has remained a challenge to date.
Although Volunteered Geographic Information (VGI) from OpenStreetMap could be used to obtain a large building map, its uneven quality would require a tremendous amount of effort to refine the data. An automated, scalable framework for creating large-scale building maps has therefore yet to be built.
Now, developers at Oak Ridge National Laboratory have built a deep convolutional neural network (CNN) that is capable of mapping every building in the United States using satellite imagery. It could help in emergency planning before and after a disaster strikes.
How Did They Do This?
To extract building footprints across the entire continental U.S., researchers conducted an in-depth analysis of four state-of-the-art convolutional neural networks:
- Fully convolutional neural network
- Branch-out convolutional neural network
- Conditional random field as recurrent neural network, and
- SegNet
All four deep learning techniques were tested on one-meter-resolution aerial imagery from the National Agriculture Imagery Program (NAIP). After thoroughly evaluating all four methods, the researchers chose the SegNet CNN architecture. They merged signed-distance labels with SegNet to push building extraction results to the instance level.
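The article does not detail how the signed-distance labels are produced. A minimal sketch of one common construction, assuming a binary building mask and using SciPy's Euclidean distance transform (the truncation value here is an illustrative choice, not taken from the article):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_labels(mask, truncate=20):
    """Convert a binary building mask (1 = building) into truncated
    signed-distance labels: positive inside buildings, negative outside."""
    inside = distance_transform_edt(mask)       # distance to nearest background pixel
    outside = distance_transform_edt(1 - mask)  # distance to nearest building pixel
    sdist = inside - outside
    return np.clip(sdist, -truncate, truncate)

# Toy 6x6 mask with one 2x2 "building"
mask = np.zeros((6, 6), dtype=np.uint8)
mask[2:4, 2:4] = 1
labels = signed_distance_labels(mask)
```

Regressing such labels (rather than hard 0/1 classes) gives the network a smooth target that encodes how far each pixel is from a building boundary, which helps separate adjacent buildings into individual instances.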
The main goal was to accurately extract the spatial extent of individual buildings. To achieve this, they combined two CNN models (trained with additional spectral bands) while using the pre-trained model's learned parameter values for initialization.
Building extraction for the state of Pennsylvania (left). The red lines (right) show extracted buildings in Philadelphia (blue box).
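The article does not say how the two models' outputs are combined. One simple, hypothetical fusion is a weighted average of their per-pixel building probabilities (the function name, equal weighting, and toy values below are all assumptions for illustration):

```python
import numpy as np

def fuse_predictions(prob_a, prob_b, w=0.5):
    """Weighted average of per-pixel building probabilities from two CNNs
    (e.g., one trained on RGB and one with an extra spectral band).
    The equal weighting is an assumption, not taken from the article."""
    return w * prob_a + (1.0 - w) * prob_b

# Toy per-pixel probabilities from two hypothetical models
p_rgb  = np.array([[0.9, 0.2], [0.4, 0.8]])
p_rgbn = np.array([[0.7, 0.1], [0.6, 0.9]])

fused = fuse_predictions(p_rgb, p_rgbn)
footprint = fused > 0.5  # threshold into a binary building map
```

Averaging complementary models is a standard way to stabilize per-pixel predictions before thresholding them into footprints.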
Researchers created seamless building maps on a GPU cluster, using a single optimal convolutional neural network model selected through the validation process. They then performed quality checks on the results and discovered some key sources of commission errors. The results were further refined with a short retraining step.
The system was trained on NVIDIA Tesla GPUs with the cuDNN-accelerated Caffe deep learning framework. Fully training the model took more than 120,000 iterations, and a total of eight Tesla GPUs were used for inference.
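The article does not publish the training configuration. A hypothetical Caffe solver definition consistent with the reported iteration count might look like this (only `max_iter` comes from the article; every other value is an illustrative placeholder):

```
# solver.prototxt (hypothetical; only max_iter is taken from the article)
net: "segnet_train.prototxt"
base_lr: 0.001
lr_policy: "step"
stepsize: 40000
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
max_iter: 120000
snapshot: 10000
snapshot_prefix: "snapshots/segnet"
solver_mode: GPU
```

In Caffe, the solver file controls the optimization schedule while the network architecture lives in a separate prototxt referenced by `net`.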
The final results demonstrate that a deep learning model can perform robust building extraction at scale: the model processes an area of 56 square kilometers in less than one minute.
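To put that throughput in context: at NAIP's one-meter ground resolution, 56 square kilometers corresponds to 56 million pixels, so sub-minute processing implies classifying roughly a million pixels per second. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope throughput from the reported figures
area_km2 = 56
resolution_m = 1.0  # NAIP ground resolution
pixels = int(area_km2 * 1e6 / resolution_m**2)  # 1e6 m^2 per km^2
pixels_per_second = pixels / 60                 # "less than one minute"
print(pixels, round(pixels_per_second))
```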
According to the researchers, the insights provided by this study will benefit future similar projects based on remote sensing imagery.
In the coming years, the researchers will test the same strategy on high-performance computing resources with multi-GPU training to further enhance the mapping system. This will also enable testing of more sophisticated and complex network architectures.
At present, GPU memory capacity and latency between GPU nodes limit the batch size and the size of the convolutional neural network architectures. With more GPU memory, developers will be able to analyze model performance using more than three spectral bands in a single CNN for building extraction.