Slashdot Mirror


Science Grid Genesis

Cranial Dome writes "According to this Cnet.com story, the Department of Energy (DOE) is working to interconnect the first two computers which will form the genesis of the DOE Science Grid, a virtual supercomputing system which will eventually encompass many more systems at several locations. The larger of the two machines: DOE National Energy Research Science Center's (NERSC) IBM SP RS/6000, a distributed memory machine with 2,944 compute processors. This machine, together with a smaller 160 processor Intel system, will make up a combined 3,328 processor Unix system with 1.3 petabytes(!) of storage space. And this is only the beginning..."

4 of 166 comments

  1. (Slightly OT) 1.3 Petabytes? by Ecyrd · · Score: 3, Informative

    According to this paper, an entire human life takes roughly a petabyte of storage.

    At current prices, this amounts to roughly 150.000. It's no longer impossible to store your entire life on a single computer. These guys show that such a thing can be built.

  2. The scheme of it all by fruey · · Score: 5, Informative
    Go to the link about the actual project. Look at the PDF. It explains things quite well, it's a wicked thang that is happening...

    Here, for the lazy, are some of the objectives:

    • Computational modeling, multi-disciplinary simulation, and scientific data analysis with a world-wide scope of participants and the use of computing and data resources at many sites.
    • High Energy Physics data analysis that involves hundreds of collaborators, and tens of institutions providing data and computing resources
    • Observational cosmology that involves data collection from a world-wide collection of instruments, analysis of that data to re-target the instruments, and subsequent comparison of the observational data with simulation results
    • Climate modeling that involves coupling simulations running on different supercomputers
    • Real-time data analysis and collaboration involving on-line instruments, especially those that are unique national resources
    • Generation, management, and use of very large, complex data archives that are shared across global science communities, e.g. high energy physics data, earth environment data, human genome data
    • Collaborative, interactive analysis and visualization of massive datasets, e.g. DOE's Combustion Corridor project
    • Multi-disciplinary R&D that integrates the computing and data aspects of the different scientific disciplines.

    Thus, the applications are enormous. Not that you couldn't do it distributed across desktops à la SETI, but here we're talking data integrity, and let's not forget that even SETI has a kick-ass centralised server setup or the whole thing wouldn't work anyway.

    But especially interesting is the document filename:-

    DOE_Science_Grid_Collaboratory_Pilot_Proposal_03_14.nobudget.pdf

    Now, who can get me the version WITH the budget? I want it. Hehe.

    --
    Conversion Rate Optimisation French / English consultant
  3. Re:petabytes by Compulawyer · · Score: 2, Informative
    Of course, the "standard" 2^(10n) system of measuring bytes means nothing if you are a disk manufacturer. There, you just redefine (in VERY small print, of course) a megabyte (or other flavorbyte) as one million bytes.

    This gives us:

    • Disk megabyte = 1,000,000 bytes
    • REAL Megabyte = 1,048,576 bytes
    Difference = 48,576 bytes, or about 4.6% of the space per MB. With GB-sized disks, the difference is 73,741,824 bytes per GB, or about 50 floppies' worth (at 1.44 MB each). Definition is everything.
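
    To check the arithmetic, here is a minimal Python sketch. The `unit_gap` helper and the floppy size (1,474,560 bytes for a "1.44 MB" floppy) are assumptions added for illustration, not part of the original comment.

```python
# Decimal (disk-label) units vs. binary units.
# Assumption: a "1.44 MB" floppy holds 1,474,560 bytes.
FLOPPY_BYTES = 1_474_560


def unit_gap(n):
    """Bytes missing per unit when 10^(3n) bytes are sold as 2^(10n) bytes."""
    return 2 ** (10 * n) - 10 ** (3 * n)


for name, n in [("MB", 2), ("GB", 3), ("TB", 4), ("PB", 5)]:
    gap = unit_gap(n)
    share = gap / 2 ** (10 * n)
    print(f"{name}: {gap:,} bytes short ({share:.1%}), "
          f"{gap / FLOPPY_BYTES:,.1f} floppies")
```

    The gap grows with each prefix, so it matters more for a petabyte store than for a megabyte disk.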
    --

    Laws affecting technology will always be bad until enough techies become lawyers.

  4. A little more information by pridkett · · Score: 3, Informative

    It's a little surprising that this got posted, because it's not all that earth-shattering news, but I'll provide some additional information about grids in general.

    There are a wide variety of systems like this that are either currently available or are being developed. Among them are Particle Physics Data Grid, NEESGrid and various European and Asian counterparts.

    The basic premise is to allow access to various resources you don't have at your desktop. This is not to be confused with putting all these computers together and forking a process a billion times and having it run all over the globe. It's more like saying: I have a process that requires 128 processors and 4GB of RAM, go find a machine and run it for me.
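
    As a rough illustration of that "go find it" idea, here is a toy Python sketch. This is not how Globus actually does it; the `Resource` class, the `find_resource` function and the site names and capacities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    free_ram_gb: int


def find_resource(resources, cpus, ram_gb):
    """Return the first resource that can satisfy the whole request, else None."""
    for r in resources:
        if r.free_cpus >= cpus and r.free_ram_gb >= ram_gb:
            return r
    return None


# Hypothetical sites and capacities, for illustration only.
grid = [
    Resource("site-a", free_cpus=64, free_ram_gb=16),
    Resource("site-b", free_cpus=160, free_ram_gb=2),
    Resource("site-c", free_cpus=2944, free_ram_gb=512),
]

match = find_resource(grid, cpus=128, ram_gb=4)
print(match.name if match else "no single resource fits")
```

    The point is that the job goes whole to one machine that is big enough, rather than being scattered across every machine on the grid.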

    Most of the systems use Globus, which is pretty much the de facto standard. There are other systems out there, such as Legion and Condor, which serve slightly different purposes.

    I've also seen some issues about security raised, so I'll mention them quickly. Globus is built upon an API called GSS-API (Generic Security Service API), which I believe already has an RFC published. This is a layer on top of various other security systems that may be local to the server running it. It can use Kerberos or PKI to do encryption across the network (don't flame me if it's wrong, I'm no security expert).

    When I wish to start using the grid, I start up my proxy that takes care of all authentication for me. Then my proxy connects to the gatekeeper on the remote machine which authenticates me based on my private key and then authorizes me via a mapping (usually just a text file). The task is then executed by the gatekeeper via the mapping on the remote machine. Input and output can be redirected over a secure layer if you so desire.
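
    The authorization step (mapping an authenticated identity to a local account via a text file) can be sketched in Python as follows. The file layout shown, a quoted certificate subject followed by a local user name, is an assumption for illustration, and `parse_gridmap`, `authorize` and the sample subjects are hypothetical names, not the Globus API.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local user>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, local_user = shlex.split(line)
        mapping[subject] = local_user
    return mapping


def authorize(mapping, subject):
    """Return the local account for an authenticated subject, or None."""
    return mapping.get(subject)


# Hypothetical mapping file contents.
SAMPLE = '''
# certificate subject              local account
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=John Roe" jroe
'''

gridmap = parse_gridmap(SAMPLE)
print(authorize(gridmap, "/O=Grid/OU=Example/CN=Jane Doe"))
```

    Authentication (proving who you are) happens before this lookup; the mapping only decides which local account, if any, the task runs under.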

    My certificate is issued by an authority, in this case the Globus CA. The nice thing is that if you want to set up a grid of your own computers, you can get a cert from them too. Install Globus and it will tell you how.

    Certificates also allow you to get access to data. This allows me as user A to run program B at site C, providing results to user D at site E for a period of time F.

    It's all terribly neat and remarkably easy to install on your favorite Linux or Solaris box. It's also fairly easy to write programs to utilize the Grid thanks to the various CoG Kits for Python, Java and Perl.

    --
    My Slashdot account is old enough to drink...