2.3.1 Problem statement
Remote sensing and conservation scientists have access to more earth observation data than ever before, but feature data is only one piece of the puzzle in realising the full potential of this vast resource. As with any model of a physical system, meaningful and robust ground truth data is essential for reliable predictions. Robust datasets (target-feature pairs) are readily available for common targets in common contexts, such as street trees in urban environments, and deep learning models trained on them demonstrate remarkable performance. However, developing a high-quality dataset of sufficient size that catalogues the entire diversity of plant species is a monumental task, even before considering the many aspects of variation within each species.
A skilled botanist can rapidly learn to visually identify a previously unseen species from only a handful of examples by combining existing knowledge of categories with contextual information. Moreover, the probability of a species occurring in a particular location is strongly influenced by environmental factors such as elevation, aspect, soil type, and climate variables (Porfirio et al., 2014). However, most current deep learning segmentation approaches treat vegetation mapping as a purely spectral-spatial problem, without considering contextual information about the surrounding landscape or the target taxa. Such context is particularly important where training data is limited (Safonova et al., 2023; Sumbul et al., 2018).
2.3.2 Proposed methods
We will develop a multimodal deep neural network (DNN) architecture with a separate encoding branch for each mode of information.
Candidate input variables include:
- Terrain: elevation, slope, aspect
- Climate: temperature, precipitation, humidity, solar radiation
- Geology: soil, watercourse proximity
- Contextual: coordinates, systematics, date
- Physical: spectral signatures
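To make the idea of separate encoding branches concrete, the following is a minimal NumPy sketch, not the proposed architecture itself. It assumes each modality arrives as a flat numeric vector per sample and uses a single linear layer with ReLU per branch. The dimensions in `MODALITY_DIMS`, the width `EMBED_DIM`, and the helper names `make_branch`, `encode`, and `encode_all` are all hypothetical. In practice the spectral branch would more likely be a convolutional or transformer encoder over image patches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality input sizes for one sample (assumed values).
MODALITY_DIMS = {
    "terrain": 3,   # elevation, slope, aspect
    "climate": 4,   # temperature, precipitation, humidity, solar radiation
    "geology": 2,   # soil class code, watercourse proximity
    "context": 4,   # x, y, taxon id, day of year
    "spectral": 8,  # band reflectances
}
EMBED_DIM = 16  # shared embedding width (assumed)


def make_branch(in_dim, out_dim):
    """Create randomly initialised weights for a one-layer encoder branch."""
    return {
        "W": rng.normal(0.0, 0.1, size=(in_dim, out_dim)),
        "b": np.zeros(out_dim),
    }


def encode(branch, x):
    """Apply a linear layer followed by ReLU."""
    return np.maximum(x @ branch["W"] + branch["b"], 0.0)


# One independent set of weights per modality.
branches = {name: make_branch(dim, EMBED_DIM) for name, dim in MODALITY_DIMS.items()}


def encode_all(inputs):
    """Encode every modality with its own branch.

    inputs: dict mapping modality name -> array of shape (batch, dim)
    Returns an array of shape (batch, n_modalities, EMBED_DIM),
    i.e. one embedding "token" per modality per sample.
    """
    tokens = [encode(branches[name], inputs[name]) for name in MODALITY_DIMS]
    return np.stack(tokens, axis=1)


batch = 5
inputs = {name: rng.normal(size=(batch, dim)) for name, dim in MODALITY_DIMS.items()}
tokens = encode_all(inputs)
print(tokens.shape)  # (5, 5, 16)
```

Because every branch maps into the same embedding width, the per-modality tokens can later be combined by a fusion module without any modality having to share encoder weights with another.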
We will implement cross-attention mechanisms between spectral and environmental features so that the model can learn inter-modal relationships. We will then conduct ablation studies that remove individual environmental variables to quantify the contribution of each. Finally, we will compare the results against baseline models using stacked preprocessing and single-encoder approaches (Audebert et al., 2018).
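As an illustration of the cross-attention and ablation steps, the sketch below implements single-head scaled dot-product cross-attention in NumPy, with spectral embeddings as queries and one embedding per environmental variable as keys and values. It is a sketch under stated assumptions rather than the planned implementation: the weights are random, the variable names in `env_names` are examples, and the `keep` mask and the `effect` score are hypothetical stand-ins. A real ablation study would re-evaluate (or retrain) the full model and compare segmentation metrics.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(spectral, env, Wq, Wk, Wv, keep=None):
    """Spectral tokens attend to environmental tokens.

    spectral: (n_pixels, d) spectral embeddings (queries)
    env:      (n_env, d) one embedding per environmental variable (keys/values)
    keep:     optional boolean mask of shape (n_env,); False entries are
              ablated. At least one entry must remain True.
    Returns (fused, weights) with shapes (n_pixels, d) and (n_pixels, n_env).
    """
    q = spectral @ Wq
    k = env @ Wk
    v = env @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if keep is not None:
        # Masked variables receive zero attention weight after the softmax.
        scores = np.where(keep[None, :], scores, -np.inf)
    weights = softmax(scores, axis=-1)
    # Residual connection keeps the original spectral information.
    return spectral + weights @ v, weights


rng = np.random.default_rng(0)
d = 16
env_names = ["elevation", "slope", "aspect", "temperature", "precipitation"]
spectral = rng.normal(size=(10, d))           # 10 pixel embeddings (assumed)
env = rng.normal(size=(len(env_names), d))    # one token per variable
Wq, Wk, Wv = (rng.normal(0.0, 0.1, size=(d, d)) for _ in range(3))

fused, weights = cross_attention(spectral, env, Wq, Wk, Wv)

# Leave-one-variable-out loop: how much do the fused features change
# when a single environmental variable is masked?
effect = {}
for i, name in enumerate(env_names):
    keep = np.ones(len(env_names), dtype=bool)
    keep[i] = False
    ablated, _ = cross_attention(spectral, env, Wq, Wk, Wv, keep=keep)
    effect[name] = float(np.abs(fused - ablated).mean())

for name, value in effect.items():
    print(f"{name}: {value:.4f}")
```

The attention weights returned alongside the fused features give a per-pixel view of which environmental variables the model draws on, which complements the leave-one-out scores.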
2.3.2.1 Key innovations
This research will pioneer true multimodal vegetation segmentation models, moving beyond simple data stacking. It will also quantify the importance of environmental context relative to spectral information for species detection. Lastly, it will determine how models integrate ecological knowledge, potentially revealing new ecological insights about species-environment relationships.
- Porfirio, L. L., Harris, R. M. B., Lefroy, E. C., Hugh, S., Gould, S. F., Lee, G., Bindoff, N. L., & Mackey, B. (2014). Improving the Use of Species Distribution Models in Conservation Planning and Management under Climate Change. PLoS ONE, 9(11), e113749. 10.1371/journal.pone.0113749
- Safonova, A., Ghazaryan, G., Stiller, S., Main-Knorn, M., Nendel, C., & Ryo, M. (2023). Ten Deep Learning Techniques to Address Small Data Problems with Remote Sensing. International Journal of Applied Earth Observation and Geoinformation, 125, 103569. 10.1016/j.jag.2023.103569
- Sumbul, G., Cinbis, R. G., & Aksoy, S. (2018). Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery. IEEE Transactions on Geoscience and Remote Sensing, 56(2), 770–779. 10.1109/TGRS.2017.2754648
- Audebert, N., Le Saux, B., & Lefèvre, S. (2018). Beyond RGB: Very High Resolution Urban Remote Sensing with Multimodal Deep Networks. ISPRS Journal of Photogrammetry and Remote Sensing, 140, 20–32. 10.1016/j.isprsjprs.2017.11.011