15. Cloud-Native Data

This page is in production, but in the meantime, the Cloud Native Geo group hosts what is probably the definitive guide to cloud-optimised datasets.

15.1. “Cloud Native”?

Don’t be scared off by the phrase ‘cloud-native’ - this isn’t some complex arcane magic like creating and hosting a dynamic STAC database. For the most part, it simply means choosing a different file format. When a file in such a format is hosted at a web address (e.g. https://data_website.com/cloudoptimisedfile.tif), software (not only Python packages but even QGIS!) can stream it directly over the internet. Because of the way the file’s metadata is organised, we don’t have to download the entire file: instead, the software downloads only the small portion of data we need. We made use of this in the ice velocity tutorial, when we pointed rioxarray at a very large ITS_LIVE mosaic of the whole of Greenland but downloaded only the section of Kangerlussuaq we wanted, in a matter of seconds!
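
To make the “only download the bit we need” idea concrete, here is a toy sketch of the principle. It is not the real GeoTIFF layout, and none of the names in it (`write_tiled`, `RangeReader`, `read_window`) come from any geospatial library: they are hypothetical helpers invented for illustration. The assumption is simply that a cloud-optimised file stores its data in tiles and keeps an index of each tile’s byte offset near the start of the file, so that a reader can request just the byte ranges it needs (over the web, this would be done with HTTP range requests).

```python
import struct

TILE = 16  # tile edge length in pixels (toy value, chosen for illustration)


def write_tiled(grid, tile=TILE):
    """Serialise a 2D list of 0-255 values as: header, tile index, tile data."""
    rows, cols = len(grid), len(grid[0])
    blobs = []
    for ty in range(rows // tile):
        for tx in range(cols // tile):
            # Each tile is stored as one contiguous run of bytes.
            blobs.append(bytes(
                grid[ty * tile + r][tx * tile + c]
                for r in range(tile) for c in range(tile)
            ))
    header = struct.pack("<III", rows, cols, tile)
    # The index holds (offset, length) for every tile, as two uint32 values.
    offset = len(header) + 8 * len(blobs)
    index = b""
    for blob in blobs:
        index += struct.pack("<II", offset, len(blob))
        offset += len(blob)
    return header + index + b"".join(blobs)


class RangeReader:
    """Stand-in for a remote file: serves byte ranges and counts bytes served."""

    def __init__(self, data):
        self._data = data
        self.bytes_fetched = 0

    def fetch(self, start, length):
        self.bytes_fetched += length
        return self._data[start:start + length]


def read_window(reader, row0, row1, col0, col1):
    """Return rows row0:row1 and cols col0:col1, fetching only the tiles needed."""
    rows, cols, tile = struct.unpack("<III", reader.fetch(0, 12))
    tiles_across = cols // tile
    n_tiles = (rows // tile) * tiles_across
    raw_index = reader.fetch(12, 8 * n_tiles)
    index = [struct.unpack_from("<II", raw_index, 8 * i) for i in range(n_tiles)]

    window = [[0] * (col1 - col0) for _ in range(row1 - row0)]
    for ty in range(row0 // tile, (row1 - 1) // tile + 1):
        for tx in range(col0 // tile, (col1 - 1) // tile + 1):
            offset, length = index[ty * tiles_across + tx]
            blob = reader.fetch(offset, length)  # one "range request" per tile
            for r in range(tile):
                for c in range(tile):
                    y, x = ty * tile + r, tx * tile + c
                    if row0 <= y < row1 and col0 <= x < col1:
                        window[y - row0][x - col0] = blob[r * tile + c]
    return window


# A 64 x 64 "raster" with a predictable pattern, so the result is easy to check.
grid = [[(y * 64 + x) % 256 for x in range(64)] for y in range(64)]
data = write_tiled(grid)

reader = RangeReader(data)
window = read_window(reader, 20, 24, 40, 44)
print(f"fetched {reader.bytes_fetched} of {len(data)} bytes")
```

In this sketch the small window falls inside a single tile, so the reader fetches only the header, the index, and that one tile rather than the whole file. Real cloud-optimised formats are considerably more sophisticated (compression, overviews, multiple bands), but tools such as rioxarray and QGIS rely on the same basic idea when they stream part of a remote file.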

If you plan to share your research data, this is of course a fantastic thing - immediate cloud-accessibility with little-to-no effort on your end. However, even if you don’t, this is still a great move, especially if you have relatively large file sizes: cloud-native datasets generally offer better compression and faster loading than legacy datasets.