# GAIA Data and the Plan for Azure

First things first: I've never learned how to display code in a blog post before. I found a few tutorials suggesting the <pre> HTML tags. Let's see how they look with the obligatory "Hello World" program:

#include <iostream>

int main()
{
    std::cout << "Hello World!" << std::endl;
    return 0;
}

Excellent, that looks good. I was approached (via Reddit, of all places) to take part in a project involving data from the GAIA telescope. I heard a bit about the recent data release whilst I was in Trieste towards the end of last year but, being predominantly a simulations guy, I've never really worked on telescope data before, so I thought it might be a nice way to cut my teeth on "real" data. I'm currently downloading the whole GAIA catalogue using a simple little Python script, which crawls their 'direct download' page and downloads all the data files it finds.
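For the curious, a crawler of this kind can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than my actual script: the `.csv.gz` suffix, the function names and the idea that all files are linked from a single index page are assumptions, and the index URL is left for the caller to supply.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def data_links(html, base_url, suffixes=(".csv.gz",)):
    """Return absolute URLs for links that look like data files.

    The default suffix is an assumption about how the files are named.
    """
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, h) for h in parser.hrefs if h.endswith(suffixes)]


def download_all(index_url, dest_dir):
    """Download every data file linked from one index page.

    Files that already exist locally are skipped, so an interrupted
    run can simply be restarted. Subdirectories are not followed.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with urlopen(index_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in data_links(html, index_url):
        target = dest / url.rsplit("/", 1)[-1]
        if not target.exists():
            urlretrieve(url, str(target))
```

Calling `download_all(index_url, "gaia_data")` with the address of the index page would then fetch everything it links to. Skipping files that already exist matters for a catalogue this size, since the download is unlikely to finish in one sitting.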

The other plan for the next week or so is to develop a framework for running our COMPAS code in a scalable way on Microsoft Azure, so that I can start making real use of my grant. Python seems like a pretty natural choice for writing the framework. The rough outline of what I want to achieve is the following:

• Download and compile COMPAS on a simple virtual machine. Preliminary investigations suggest that I can probably get away with the simplest (A0) instance, which has 750 MB of RAM and a single core.
• Find a way of cloning this virtual machine, ideally coupled with a configuration file so that each node can do something slightly different.
• Collate the outputs from all of these nodes in some central place and then, when they're all done, download them to Birmingham.
• Automatically shut down all of my virtual machines, so that I'm not spending my allowance unnecessarily.
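
The Azure-specific steps (creating, cloning and shutting down machines) are still unknowns, but the bookkeeping around them can be sketched already. The following is a rough illustration only, assuming each node reads a small JSON file telling it what to do. The keys `node_id`, `n_systems` and `random_seed` are made up for this example and are not real COMPAS options, and the collation step assumes plain text outputs that can simply be concatenated.

```python
import json
from pathlib import Path


def make_node_configs(n_nodes, n_systems_total, base_seed=0):
    """Split a run of n_systems_total systems evenly across n_nodes.

    The dictionary keys are hypothetical placeholders, not COMPAS options.
    """
    per_node, remainder = divmod(n_systems_total, n_nodes)
    configs = []
    for node_id in range(n_nodes):
        configs.append({
            "node_id": node_id,
            # The first `remainder` nodes take one extra system each.
            "n_systems": per_node + (1 if node_id < remainder else 0),
            # A distinct seed per node so that no two nodes repeat work.
            "random_seed": base_seed + node_id,
        })
    return configs


def write_configs(configs, directory):
    """Write one JSON file per node and return the list of paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for config in configs:
        path = directory / f"node_{config['node_id']:03d}.json"
        path.write_text(json.dumps(config, indent=2))
        paths.append(path)
    return paths


def collate_outputs(output_paths, combined_path):
    """Concatenate per-node output files into one file, in the order given."""
    with open(combined_path, "w") as combined:
        for path in output_paths:
            combined.write(Path(path).read_text())
```

For example, `write_configs(make_node_configs(100, 1_000_000), "configs")` would produce one file per node for a hundred-node run. Giving each node its own seed is the simplest way I can think of to make sure cloned machines don't all compute the same thing.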

I'm not sure how much of this is possible. If I'm happy with the framework I end up with, I'll publish it on GitHub, but we'll see.