Geospatial Python at Durham
Tom Chudley
thomas.r.chudley@durham.ac.uk
Department of Geography, Durham University
Target Audience
This informal set of documents is aimed at colleagues in the Department of Geography at Durham University, particularly within the Glaciology Group. It is designed to get beginners up and running with Python for geospatial programming tasks.
However, it is generally applicable to geospatial data users and glaciologists outside of Durham: only some of the specifics of the HPC interface will be different.
Purpose
Because Python is open-source software, there are often multiple, conflicting options and pieces of advice to choose from when setting it up. Figuring out the setup that works for you can be a large drain on time, and can put people off taking advantage of everything that Python has to offer.
Here, I aim to give you the (inevitably opinionated) ‘best’ options for installing and using Python for geospatial analysis in the current ecosystem, as well as a broad introduction to the tools. Where there are multiple options for performing a task (e.g. multiple ways to install something, or multiple packages that can perform a task) I aim to focus on, in order:
1. The most widely-used option in the Python community.
2. Where this isn’t suitable, the option most suitable for our application (geospatial analysis).
3. Where two equally valid options remain, the option that can be used across the broadest range of tasks (i.e. whatever allows you to learn the fewest new things to do the most possible stuff).
This guide aims to be specific to the Durham IT environment where relevant (e.g. HPC setup). It is not designed to be a comprehensive tutorial - it instead aims to give you the ‘best’ method of doing something, leaving you to fill in the gaps with your own research (e.g. googling and LLMs). However, there will also be links to useful sources of information on installs/downloads as we go.
This material can be viewed as an online Jupyter Book. The source files can be found at the GitHub repository - this could be useful as many of the webpages are in fact Jupyter Notebooks, which you can download and run/modify yourself. In the GitHub repository, individual Markdown and Jupyter Notebook files can be found within the website directory in the root folder, organised into directories and named according to the web address of individual pages.
Tips
Section 1 (Installing Python) is not the most enthralling start - setting up a rigorous Python environment can be challenging, and was one of the main motivations for beginning to create this documentation! If you’d like to just get a feel for what Python can do for you, I would recommend browsing sections 2 (Using Python) or 3 (Getting Cryosphere Data) instead. If you would like to play with Python and Jupyter Notebooks without having to install it on your machine, you could always run an instance of Google Colab, which allows you to run a Jupyter Notebook online. You can still load your own data into this if you want.
NB - when stuck, generative AI is actually pretty good for coding purposes (although I like to minimise using it to ensure I have my own understanding of how my code works, as well as for environmental reasons). It’s a good use of AI primarily because, unlike using it for writing prose, it is immediately falsifiable: if it starts hallucinating, then the code/solution it proposes simply won’t work. You can go back and provide the error message, and it may even be able to fix its mistakes.
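As a rough illustration of "providing the error message", the sketch below captures the full traceback of a failing function as text, ready to paste into a search engine or chatbot. It uses only the standard-library traceback module; the helper run_and_capture and the example function mean_elevation are hypothetical names invented for this illustration, not part of any package used in this guide.

```python
import traceback


def run_and_capture(func, *args, **kwargs):
    """Call func; return (result, None) on success or (None, traceback text) on failure."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() returns the traceback of the exception currently
        # being handled as a single string.
        return None, traceback.format_exc()


def mean_elevation(values):
    # Hypothetical example function: raises ZeroDivisionError on an empty list.
    return sum(values) / len(values)


result, error_text = run_and_capture(mean_elevation, [])
if error_text is not None:
    # Copy this whole block of text, not just the last line: the earlier
    # lines show where in your code the problem occurred.
    print(error_text)
```

In a Jupyter Notebook you rarely need a helper like this, since the traceback is displayed directly beneath the failing cell and can simply be copied from there.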