I’ve had an idea in my head for the past few months to build myself a nice, neat, self-contained data processing set-up using Docker.
If you don’t know, Docker is a containerisation system that allows software to be run in a sandboxed environment, i.e. without affecting your operating system’s configuration. The great thing about Docker, and systems like it, is that the software you set up (as containers) can easily be shared and run on any other computer that has Docker installed. This means that all users and machines have exactly the same configuration, which in turn helps in delivering reproducible or comparative analyses.
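To give a flavour of what this looks like in practice, here is a minimal, illustrative Dockerfile – a sketch rather than my actual configuration – that builds an image with geospatial Python libraries installed inside the container, leaving the host machine untouched:

```dockerfile
# Illustrative only: start from the public Miniconda base image
FROM continuumio/miniconda3

# Install geospatial packages into the container's environment;
# the host machine's Python installation is not affected
RUN conda install -y -c conda-forge gdal rasterio geopandas

# Work in a directory that can be mounted from the host at run time
WORKDIR /data
CMD ["python"]
```

Anyone with Docker can build and run this and get exactly the same library versions, which is the reproducibility point made above.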
So far I have two components configured and set up: an Anaconda container with a fully stocked set of geospatial libraries for my data analytics, and an ARCSI atmospheric correction container (for which I have utilised the system created by the JNCC). I have plans to add web mapping, metadata and data server containers so that the entire data processing workflow can be implemented if need be.
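Containers like these can be tied together with Docker Compose so the whole set-up is described in a single file. The sketch below shows the idea – the service and image names are hypothetical placeholders, not the actual images I use:

```yaml
# docker-compose.yml (illustrative; image names are placeholders)
services:
  anaconda:
    image: my-geospatial-anaconda:latest   # hypothetical analytics image
    volumes:
      - ./data:/data                       # share input/output data with the host
  arcsi:
    image: my-arcsi:latest                 # placeholder for an ARCSI image
    volumes:
      - ./data:/data
```

With a file like this, `docker compose up` brings the whole stack up and `docker compose down` removes it again, which is what makes the set-up-and-tear-down workflow in the next paragraph practical.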
The great thing about this is that I can set it up and rip it down on any machine. If I need to process data on a client network then I can use this system on one of their machines and remove it when the contract is over. I can also (once I’ve done a bit of reading around the practicalities) stand the system up in the cloud to allow for the processing and dissemination of large datasets.
The whole ‘DevOps for Earth Observation’ concept is one that I am currently very interested in. If you want to talk about systems such as Docker, Vagrant or cloud platforms then please get in touch – I’d love to see how this sort of technology could help you.