Slashdot Mirror


Ask Slashdot: Scientific Computing Workflow For the Cloud?

diab0lic writes "I recently found myself needing on-demand cloud computing for my research. Amazon's EC2 Spot Instances are an ideal platform for this, as I can requisition an appropriate instance type for a given experiment (high CPU, high memory, or GPU) depending on its needs. However, I currently spin up the instance manually, set it up, run the experiment, and then terminate it manually. Monitoring experiments for completion gets tedious, and I incur unnecessary costs if a job finishes while I'm sleeping, for example. The whole thing really should be automated. I'm looking for a workflow somewhat similar to this:
  1. Manually create an Amazon Machine Image (AMI) for the experiment.
  2. Issue a command to launch the AMI on a specified spot instance type.
  3. Automatically attach an EBS volume to the instance for result storage.
  4. Automatically run the specified experiment, bonus if this can be parameterized.
  5. Have the instance automatically terminate itself upon experiment completion.

Something like Docker that spins up on-demand spot instances of a specified type for each run and terminates the instance when the run completes would be absolutely perfect. I also know HTCondor can back onto EC2 spot instances, but I haven't been able to find any concise information on how to set up a personal cloud, and I think it is slight overkill anyway. Do any other Slashdot users have similar problems? How did you solve them? What is your workflow? Thanks!"
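The five-step workflow in the submission can be sketched roughly as follows. This is a minimal sketch, not a tested deployment: it assumes boto3 (the successor to the boto library mentioned in the comments), an AMI that runs user data through cloud-init, and an existing EBS volume in the same availability zone as the instance. The helper names (`build_user_data`, `build_launch_params`, `launch`), the mount point, and the device names are all hypothetical choices made for illustration.

```python
import shlex


def build_user_data(command, params, device="/dev/xvdf", mount_point="/results"):
    """Return a boot script: mount the results volume, run the job, power off."""
    # Quote each parameter so values containing spaces survive the shell.
    args = " ".join(shlex.quote(str(p)) for p in params)
    lines = [
        "#!/bin/bash",
        # The volume is attached after boot, so wait for the block device.
        f"while [ ! -b {device} ]; do sleep 5; done",
        f"mkdir -p {mount_point}",
        f"mount {device} {mount_point}",
        # No 'set -e': the script continues to the shutdown even if the job fails.
        f"{command} {args} > {mount_point}/run.log 2>&1",
        f"umount {mount_point}",
        # Combined with shutdown behavior 'terminate', this ends the instance.
        "shutdown -h now",
    ]
    return "\n".join(lines) + "\n"


def build_launch_params(ami_id, instance_type, zone, user_data):
    """Build keyword arguments for the boto3 EC2 client's run_instances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Must match the zone of the EBS volume that will be attached.
        "Placement": {"AvailabilityZone": zone},
        "InstanceMarketOptions": {"MarketType": "spot"},
        "InstanceInitiatedShutdownBehavior": "terminate",
        "UserData": user_data,
    }


def launch(ami_id, instance_type, zone, volume_id, command, params,
           attach_device="/dev/sdf"):
    """Start the spot instance and attach the results volume (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2")
    user_data = build_user_data(command, params)
    resp = ec2.run_instances(
        **build_launch_params(ami_id, instance_type, zone, user_data)
    )
    instance_id = resp["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    # Assumption: /dev/sdf shows up as /dev/xvdf inside the instance. Some
    # instance types expose the volume under an NVMe device name instead.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id,
                      Device=attach_device)
    return instance_id
```

A call might look like `launch("ami-EXAMPLE", "c5.large", "us-east-1a", "vol-EXAMPLE", "python run.py", ["--seed", 7])`, where every ID is a placeholder. Putting the shutdown in the boot script rather than in a monitoring process on your own machine means the instance stops billing as soon as the job exits, even if nobody is watching.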

2 of 80 comments

  1. Re:EC2 is scriptable by diab0lic · · Score: 5, Insightful

    I'm aware that EC2 is inherently scriptable, though the documentation is incredibly poor in some areas and heavily favours those interested in long-running instances. This post is about asking others what their workflow for short-term spot instances is, and generating some collaboration and sharing of ideas on the subject. Looking through the other comments, there is a PhD who wrote some of his own scripts using boto (and complains about its docs; a trend here?) and someone working on a product to do this (I wonder why he sees a business case for it?). The comments in this thread are evidence enough that there is hardly any consensus on how to do this easily and elegantly. To all those shouting RTFM: you've clearly never read the EC2 docs or tried to use them for this use case. They are hardly adequate; just take a look at their scientific computing page (http://aws.amazon.com/ec2/spot-and-science/). Not a single person here has said something along the lines of "RTFM -- I did and it allowed me to easily do something similar." Just saying RTFM because you can doesn't help, nor does it mean anything if the docs are inadequate for the use case in question.

  2. Re:EC2 is scriptable by dotancohen · · Score: 4, Insightful

    EC2 is inherently scriptable. There's nothing stopping you from using the command-line tools to fire up an instance, and let it run, and store its results to S3, and then decommission the instance.

    You are correct that what you propose is easy and well documented. However, that is not what the OP needs.

    The OP needs lower-priced spot instances, which are intermittently available and designed exactly for this workflow. When the AWS datacenter has spare capacity, these spot instances turn on for those who requested them (usually to crunch data that is not time-sensitive). The use and configuration of these instances is not so well documented, probably because you cannot run a webserver on them and that seems to be the focus of much AWS documentation. However, it is exactly these 'spot instances' which are, in my opinion, the genius of the cloud: they let the heavy, non-time-critical work (e.g. scientific computing) be done when the webservers and mailservers aren't so busy, thus flattening out the daily CPU demand curve.
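Since spot prices vary over time and between availability zones, one way to act on this is to look up recent prices before launching. The following is a rough sketch under stated assumptions: it uses boto3's `describe_spot_price_history`, reads only the first page of results, and the helper names (`latest_price_per_zone`, `cheapest_zone`, `fetch_records`) are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def latest_price_per_zone(records):
    """Map each availability zone to its most recent spot price."""
    latest = {}
    for rec in records:
        zone = rec["AvailabilityZone"]
        if zone not in latest or rec["Timestamp"] > latest[zone]["Timestamp"]:
            latest[zone] = rec
    # The API reports prices as strings, so convert them for comparison.
    return {zone: float(rec["SpotPrice"]) for zone, rec in latest.items()}


def cheapest_zone(records):
    """Return (zone, price) for the zone with the lowest current price."""
    prices = latest_price_per_zone(records)
    zone = min(prices, key=prices.get)
    return zone, prices[zone]


def fetch_records(instance_type, hours=1):
    """Fetch recent price records from AWS (needs credentials and boto3)."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2")
    resp = ec2.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.now(timezone.utc) - timedelta(hours=hours),
    )
    # Only the first page is used here; longer windows may need pagination.
    return resp["SpotPriceHistory"]
```

With credentials configured, `cheapest_zone(fetch_records("c5.large"))` would give a zone to pass to the launch request. A low price does not guarantee capacity, so the request may still wait or be interrupted.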

    The OP should start here:
    http://aws.amazon.com/ec2/spot-tutorials/

    And end here:
    http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/tutorial-spot-adv-java.html

    --
    It is dangerous to be right when the government is wrong.