Data Wrangling Blog: On-Demand MPI Cluster with Python and EC2 (part 1 of 3)

  • Michael Fairchild · 2 years ago

    Awesome! Thanks for writing this up for the rest of us. I am looking forward to benchmarking some MPI jobs on EC2 and comparing them to my own Beowulf cluster.
    Do you have a version of your MPI-enabled image you could make public? You have laid out all the steps to make one, but a public image we could boot into would be great.
    Thanks a lot, looking forward to parts 2 and 3 :-)

  • Peter Skomoroch · 2 years ago

    I'll try to bundle a public image this week; I just need to clean out my working directories first. I think this basic approach will be good for benchmarking MPI, but I'm looking forward to someone making an image with one of the real cluster distributions as well.

  • Stu Gott · 2 years ago

    Great writeup! You might want to check out rBuilder Online. AMI Images you create are automatically uploaded to Amazon's S3 and can be booted on Amazon's EC2--saving developers the trouble of deploying appliances by hand. All images created on rBuilder Online are freely available. The MPI tools you mention haven't been packaged in Conary by anybody yet, but that should be a SMOP.

  • Michael Creel · 2 years ago

    Pretty interesting stuff. I'll try to get ParallelKnoppix working with this. Looks like a great way to do some sporadic embarrassingly parallel work.

  • Peter Skomoroch · 2 years ago

    Michael,


    Let me know how it goes; that would simplify things a lot. Right now I use some client-side Python scripts to configure the cluster based on the list of EC2 instances I start from my laptop (I will be posting that code along with an AMI later this week).


    I started off on my MPI kick with a small Parallel Knoppix cluster at home and would like to eventually have the same system on EC2. There are already some EC2 debian base images in the public AMI section so it should be possible to get up and running.


    As a relative newbie, I wanted to avoid digging into the PK build and just get something running quickly, but I think the ideal setup would be to get the PK node auto-discovery working and do a network launch of the MPI cluster within a single security group on EC2. I suspect there is a bit of work in getting the iptables configuration right, since EC2 uses its own custom setup instead of the standard iptables config.


    -Pete


    Debian iptables thread:

    http://developer.amazonwebservices.com/connect/thread.jspa?messageID=44592&

    Debian AMIs:

    http://www.ioncannon.net/system-administration/118/debian-ec2-ami/
    http://developer.amazonwebservices.com/connect/entry.jspa?externalID=639&categoryID=101
    http://developer.amazonwebservices.com/connect/entry.jspa?externalID=638&categoryID=101

  • Mo · 2 years ago

    Nice post dude. Make your comments font one size larger. :)

  • Michael Creel · 2 years ago

    I'm on the wait list for EC2, so I don't know when I'll be trying this out. I suspect that this will not be hard to get working. I think that virtual clusters like this are going to be pretty important tools in the near future, or maybe they already are in private businesses.

  • Mark J. · 2 years ago

    Hi Peter,


    I have a question regarding your MPI setup. I benchmarked a simple application on a single CPU and found that the elapsed (wall-clock) time varied widely, by more than 40%, even though the CPU time was the same. My belief is that Xen does not guarantee the virtual machine a fixed slice of CPU cycles. Given this, if a parallel application communicates frequently between multiple instances during its solution, the overall performance could be very unpredictable. Not only that, but since the user is charged based on the elapsed time for each instance, the total charges for a project are also hard to estimate.


    Do you have any insight into the above issue, or any experiences to share? Thanks.
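
    The kind of measurement described in this comment can be sketched as follows. This is a minimal illustration in Python 3, not the benchmark that was actually run: the `busy` workload and the `timed` helper are hypothetical names, and the gap between wall-clock and CPU time is only a rough indicator of time the process spent not running.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = fn(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu


def busy(n):
    """Hypothetical CPU-bound workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # Repeat the same job; on a shared host the wall-clock figure may
    # drift from run to run while the CPU figure stays roughly constant.
    for run in range(5):
        _, wall, cpu = timed(busy, 200_000)
        print(f"run {run}: wall={wall:.4f}s cpu={cpu:.4f}s")
```

    Comparing the two columns over many runs gives a distribution of the overhead rather than a single number, which is the more useful figure when estimating charges.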

  • Peter Skomoroch · 2 years ago

    I haven't looked into the Xen/CPU time issue, but I definitely expect latency to be an issue given the unpredictable nature of which nodes you are assigned, their proximity to each other, and the usage of bandwidth on the shared boxes. I'm planning on running some statistics this week on the distributions of job run times; hopefully they will be somewhat predictable.

  • Mark J. · 2 years ago

    Another issue that would probably merit a detailed analysis is the cost structure of using EC2, in its current form, versus a fully owned cluster. For a small consulting shop running simulations on 8 EC2 instances, it comes out to $0.80 per hour, or approximately $1,600 per year assuming a typical 8-hour simulation per day investigating various designs, etc. However, each instance is only the equivalent of a 1.7GHz Xeon (SPECfp 700). Compare that with a dual-core Intel Core2 E6700, which has a single-core SPECfp rating of 2700 and so amounts to roughly the same total compute power as the 8-instance EC2 cluster. Such a machine can be purchased outright for something like $2,000 with 4GB of memory.


    I think EC2 makes sense for memory-bound applications: each VM has 1.7GB of RAM, so with 8 instances the total RAM available becomes almost 14GB (8 x 1.7GB = 13.6GB). From a transaction processing or database-driven application point of view, EC2 may exhibit excellent cost-effectiveness. For a compute-intensive application, however, it does not seem to be a very compelling argument.


    While my simplistic comparison does not account for maintenance, power, backup infrastructure, etc. for the fully owned machine, I would not expect a dramatic difference.
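
    The arithmetic in this comment can be reproduced with a short Python sketch. The per-instance rate is inferred from the quoted $0.80 per hour for 8 instances, and the 250 working days per year is an assumption chosen because it matches the quoted yearly figure. Multiplying a single-core SPECfp rating by the core count is the comment's own simplification, not a real benchmark result.

```python
# Figures quoted in the comment; values marked "assumption" are not in the original.
INSTANCES = 8
RATE_PER_INSTANCE_HOUR = 0.10   # USD, inferred from $0.80/hour for 8 instances
HOURS_PER_DAY = 8
WORKDAYS_PER_YEAR = 250         # assumption: roughly 50 five-day weeks


def yearly_ec2_cost(instances, rate, hours_per_day, days_per_year):
    """Total yearly charge for running `instances` machines every working day."""
    return instances * rate * hours_per_day * days_per_year


def aggregate_specfp(units, specfp_per_unit):
    """Crude aggregate: per-unit rating times the number of units."""
    return units * specfp_per_unit


ec2_cost = yearly_ec2_cost(INSTANCES, RATE_PER_INSTANCE_HOUR,
                           HOURS_PER_DAY, WORKDAYS_PER_YEAR)
ec2_specfp = aggregate_specfp(INSTANCES, 700)   # 8 instances at SPECfp 700
desktop_specfp = aggregate_specfp(2, 2700)      # 2 cores at SPECfp 2700

print(f"EC2 yearly cost: ${ec2_cost:,.0f}")
print(f"Aggregate SPECfp, EC2: {ec2_specfp}, dual-core desktop: {desktop_specfp}")
```

    Under these assumptions the yearly EC2 charge is about $1,600 and the two aggregate ratings are within a few percent of each other (5600 versus 5400).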

  • Peter Skomoroch · 2 years ago

    Good point. I will have to run the numbers on that comparison, but I expect EC2 to come out on top for large clusters which are only used intermittently (unless the latency kills it). Also, we might be underestimating the power, cabling, and cooling costs, especially for larger clusters. All that aside, it looks like your estimate is pretty close: Jeff Layton at ClusterMonkey has a post from January, "Kronos Pricing Redux", which gives numbers for a 4-node cluster similar to the one you describe, and he puts the price tag at $2,505.44.*


    * Correction: I originally quoted the price of the 8-node, 16-core system, $4,563.72.


    I think the sweet-spot for EC2 will be for shoestring 2-3 person analytical or bioinformatics startups where they need to run occasional large jobs (50-100 nodes), but can't afford to build a large permanent cluster without additional funding.


    For instance, I'd rather not spend $30K right now for a 100-core cluster to run a few large jobs a week, not to mention heating/cooling bills and construction time. If I could get comparable performance on Amazon, it would run me around $1K per month to get past the proof-of-concept stage (assuming three eight-hour jobs per week). Once I had the capital and space, I could transition to my own large cluster.
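
    A quick Python sketch of where the "around $1K per month" figure comes from. It assumes 100 instances at $0.10 per instance-hour (the rate implied earlier in the thread) and an average of 52/12 weeks per month. The break-even calculation ignores power, cooling, and maintenance on the owned cluster, so it is only a rough lower bound on how long renting stays cheaper.

```python
WEEKS_PER_MONTH = 52 / 12   # assumption: average month length in weeks


def monthly_cost(instances, rate_per_hour, hours_per_job, jobs_per_week):
    """Estimated monthly charge for a fixed weekly schedule of jobs."""
    return (instances * rate_per_hour * hours_per_job
            * jobs_per_week * WEEKS_PER_MONTH)


def months_to_match(purchase_price, cost_per_month):
    """Months of rental needed to spend as much as buying outright."""
    return purchase_price / cost_per_month


ec2_monthly = monthly_cost(instances=100, rate_per_hour=0.10,
                           hours_per_job=8, jobs_per_week=3)
break_even = months_to_match(30_000, ec2_monthly)

print(f"Monthly EC2 cost: ${ec2_monthly:,.0f}")
print(f"Months of rental to reach $30K: {break_even:.1f}")
```

    With these inputs the monthly charge comes to about $1,040, and it takes a little over two years of rental to reach the $30K purchase price.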

  • Mark J. · 2 years ago

    Any update on the test? Would be interesting to see if something more substantial actually performs well on EC2.

  • Peter Skomoroch · 2 years ago

    Mark, I've just wrapped up some projects this week and should have time to check this out now. I'll update the blog when I have an analysis ready.

  • Mark J. · 1 year ago

    I wonder if the benchmarking exercise was successful or not? It would be an interesting datapoint. Mine do not show much advantage to using EC2 for scientific computations, and it seems to be geared more towards hosting web services rather than scalable computing.

  • Raghav · 1 year ago

    I am just curious. Where did you specify the maximum number of nodes? You said 20, can that be increased? If so, how?

  • Peter Skomoroch · 1 year ago

    I found the secret to avoiding a lot of MPI errors on EC2, but haven't found time to do an additional post...


    The secret seems to be that Amazon reporting an instance as "running" doesn't mean that the ssh daemon is accepting connections yet. This caused all kinds of intermittent problems setting up the hosts, and my old scripts would fail silently.


    In my current codebase, I do some checks like the following:


    <pre lang="python">
    # Python 2 excerpt from inside a build function (hence the final `return`).
    # BOOTING_INSTANCE and SSH_OPTS are assumed to be defined earlier.
    import commands
    import time

    print "Instance is %s" % BOOTING_INSTANCE
    ec2_launch_failed = False  # set to True below if sshd never answers

    # wait for instance description to return "running" and grab HOSTNAME variable
    print "Polling server status (ec2-describe-instances %s)" % BOOTING_INSTANCE
    while 1:
        print "waiting for instance to boot..."
        HOSTNAME = commands.getoutput("ec2-describe-instances %s | grep running | awk '{print $4}'" % BOOTING_INSTANCE)
        if len(HOSTNAME) > 1:
            print "-------Instance booted, The server is available at %s" % HOSTNAME
            DOM_NAME = commands.getoutput("ec2-describe-instances %s | grep running | awk '{print $5}'" % BOOTING_INSTANCE).split('.')[0]
            break
        time.sleep(1)

    # sometimes it takes a while for the ssh service to start, even when the ec2 api describes an instance as running.
    # A machine in the "running" state may not have finished booting. Try executing a no-op command until a valid response is found
    print "verifying ssh daemon has started..."
    counter = 0
    while 1:
        print "Waiting for ssh daemon to start..."
        counter += 1
        REPLY = commands.getoutput('''ssh %s "root@%s" 'echo "hello"' ''' % (SSH_OPTS, HOSTNAME))
        if REPLY == 'hello':
            print "-------ssh has started, proceeding with AMI build"
            break
        if counter > 24:
            print "Instance not responding to SSH hails, aborting..."
            ## sshd should not take more than 2 minutes to launch (24 tries x 5 seconds)
            terminate_status = commands.getoutput('ec2-terminate-instances %s' % BOOTING_INSTANCE)
            ec2_launch_failed = True
            print "Base Instance terminated"
            break
        time.sleep(5)

    if ec2_launch_failed:
        print "Aborting build"
        return
    </pre>
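
    The same wait-for-ssh idea can be sketched in Python 3 with the standard `subprocess` module. This is an illustrative rewrite under stated assumptions, not the code from the author's codebase: `wait_until` and `ssh_is_up` are hypothetical helper names, and the default of 24 attempts at 5-second intervals simply mirrors the two-minute limit used above.

```python
import subprocess
import time


def wait_until(check, attempts=24, delay=5.0, sleep=time.sleep):
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries.

    Returns True as soon as check() is truthy, False if every attempt fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


def ssh_is_up(host, user="root", timeout=10):
    """Return True if a trivial remote command round-trips over ssh."""
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",      # never stop to prompt for a password
        "-o", "ConnectTimeout=5",   # give up quickly if the port is closed
        f"{user}@{host}",
        "echo hello",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0 and result.stdout.strip() == "hello"
```

    A caller would write something like `wait_until(lambda: ssh_is_up(hostname))` and terminate the instance if it returns False. Keeping the retry loop separate from the ssh probe makes the loop testable without a network, and it returns an explicit result instead of failing silently.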
  • Iwein Fuld · 1 year ago

    "Now remove the keys and delete the bash history:" comes after the bundle-vol command. Surely that won't matter anymore at that point. Deleting the keys from /mnt/ is an unneeded step afaics.

  • Peter Skomoroch · 1 year ago

    Iwein,


    That is correct, that last line is unnecessary as /mnt is excluded from bundling.


    -Pete