How did we estimate species distributions?

First, we compiled all known occurrence records for each species from the Global Biodiversity Information Facility (GBIF) database. We then cleaned the raw data to remove duplicate, erroneous, and uncertain records. Erroneous records are those that do not correspond to the real location of the species (e.g. points falling in oceans or on country centroids); we identified them with the R package CoordinateCleaner (Zizka et al. 2019). Uncertain records are those assigned to botanical countries where the species is not expected to occur naturally according to Plants of the World Online (POWO). While we value citizen science, records from these initiatives were not included in our database.
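
The filtering logic described above can be sketched in a few lines of Python. This is only an illustration, not the routine we ran: the actual cleaning used the R package CoordinateCleaner, and the field names (`species`, `lat`, `lon`, `country`, `source`), the country codes, and the `clean_records` helper are hypothetical. The ocean and centroid tests need reference geometries and are left out; a check for invalid and (0, 0) coordinates stands in for them.

```python
def clean_records(records, native_countries, excluded_sources=frozenset({"citizen_science"})):
    """Return the records that pass a simplified cleaning routine.

    records: iterable of dicts with hypothetical keys
             'species', 'lat', 'lon', 'country', 'source'.
    native_countries: dict mapping species name -> set of botanical
             country codes where the species occurs naturally.
    """
    seen = set()
    kept = []
    for rec in records:
        lat, lon = rec["lat"], rec["lon"]
        # Erroneous: coordinates outside the valid range or exactly (0, 0).
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        if lat == 0 and lon == 0:
            continue
        # Uncertain: country not in the species' expected native range.
        if rec["country"] not in native_countries.get(rec["species"], set()):
            continue
        # Records from excluded initiatives (assumed to be flagged in 'source').
        if rec["source"] in excluded_sources:
            continue
        # Duplicates: same species at the same (rounded) coordinates.
        key = (rec["species"], round(lat, 4), round(lon, 4))
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Illustrative data only; species name and country codes are made up.
example_records = [
    {"species": "Sp. a", "lat": -20.1, "lon": -43.5, "country": "BZL", "source": "herbarium"},
    {"species": "Sp. a", "lat": -20.1, "lon": -43.5, "country": "BZL", "source": "herbarium"},
    {"species": "Sp. a", "lat": 0.0, "lon": 0.0, "country": "BZL", "source": "herbarium"},
    {"species": "Sp. a", "lat": 48.1, "lon": 11.6, "country": "GER", "source": "herbarium"},
    {"species": "Sp. a", "lat": -19.9, "lon": -43.9, "country": "BZL", "source": "citizen_science"},
]
example_native = {"Sp. a": {"BZL"}}
cleaned = clean_records(example_records, example_native)
```

In this toy example only the first record survives: the others are a duplicate, a (0, 0) coordinate, a record outside the expected native range, and a citizen-science record.
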
Whenever possible, we predicted species occurrence with Maxent (Phillips et al. 2004), calibrating the models with climatic datasets provided by CHELSA (Karger et al. 2017). We then clipped each species’ predicted distribution to a spatial buffer around each known occurrence point. The buffer size was determined by the Inverse Distance Weighting technique (Shepard 1968), used here to obtain a conservative estimate of each species’ spatial envelope. Only models that performed well in a cross-validation routine were retained, which required a minimum of 5 occurrence records. For species with fewer than 5 occurrence records, distribution hypotheses were generated with the Circular Area method (Hijmans & Spooner 2001), in which we drew a circle with a 50 km radius around every known occurrence record.
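
As a rough illustration of the two-track rule above, the following Python sketch chooses a method from the number of records and tests whether a location falls inside the Circular Area of a species. It is a simplification under stated assumptions: the function names are hypothetical, distances use a spherical-Earth (haversine) approximation rather than any particular GIS routine, and the Maxent fitting, cross-validation, and buffer clipping steps are not reproduced.

```python
import math

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius, spherical approximation
MIN_RECORDS_FOR_MODEL = 5     # minimum number of records for a Maxent model
CIRCLE_RADIUS_KM = 50.0       # radius used by the Circular Area method


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def choose_method(n_records):
    """Pick the method by record count (model quality is not checked here)."""
    return "maxent" if n_records >= MIN_RECORDS_FOR_MODEL else "circular_area"


def in_circular_area(lat, lon, occurrences, radius_km=CIRCLE_RADIUS_KM):
    """True if (lat, lon) lies within radius_km of any occurrence record."""
    return any(
        haversine_km(lat, lon, occ_lat, occ_lon) <= radius_km
        for occ_lat, occ_lon in occurrences
    )


# Illustrative species with two records on the equator, 2 degrees apart.
occurrences = [(0.0, 0.0), (0.0, 2.0)]
method = choose_method(len(occurrences))
near = in_circular_area(0.0, 0.3, occurrences)   # about 33 km from the first record
far = in_circular_area(0.0, 1.0, occurrences)    # about 111 km from both records
```

With only two records, this species would be handled by the Circular Area method, and its distribution hypothesis is the union of the 50 km circles around the two points.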

References
Bondi L, Prado-Monteiro B, de Paula LFA, Rosado BHP, Porembski S. 2025. Blind spots in traditional approaches to conservation prioritization in a climate change context. bioRxiv 2025-10.
Hijmans RJ, Spooner DM. 2001. Geographic distribution of wild potato species. American Journal of Botany 88(11):2101–2112. https://doi.org/10.2307/3558435
Karger DN, Conrad O, Böhner J, Kawohl T, Kreft H, Soria-Auza RW, et al. 2017. Climatologies at high resolution for the earth’s land surface areas. Scientific Data 4(1):1–20. https://doi.org/10.1038/sdata.2017.122
Phillips SJ, Dudík M, Schapire RE. 2004. A maximum entropy approach to species distribution modeling. Proceedings, Twenty-First International Conference on Machine Learning 2004:655–662. https://doi.org/10.1145/1015330.1015412
Shepard D. 1968. A two-dimensional interpolation function for irregularly-spaced data. Proceedings of the 1968 23rd ACM National Conference (pp. 517–524). https://doi.org/10.1145/800186.81061
Zizka A, Silvestro D, Andermann T, Azevedo J, Duarte Ritter C, Edler D, et al. 2019. CoordinateCleaner: Standardized cleaning of occurrence records from biological collection databases. Methods in Ecology and Evolution 10(5):744–751. https://doi.org/10.1111/2041-210X.13152
