Beowulf In Business

Re:Real-world applications for clusters by Anonymous Coward · 1999-08-06 04:29 · Score: 0

My main skill, you see, isn't a technical or administrative one - it's my creative ability to come up with new ideas, to look at things in a different way to others, to learn about things I don't know about, and to solve problems.

Looks to me like your main skill is self promotion. Luckily that is what venture capitalists seem to look for. Have fun!

MOSIX.. by Zurk · 1999-08-06 06:51 · Score: 1

MOSIX can do what beowulf cannot (transparent process migration) and it's GPLised too... they should have at least commented on it...

Re:PVM and Beowulf by Anonymous Coward · 1999-08-06 06:59 · Score: 0

PVM creates a virtual machine out of hosts tied together over TCP/IP. You run jobs on the virtual machine and it spawns off processes as it sees fit on the physical machines in its cluster. Beowulf is just basically a name for a network of workstations. Look at the NOW project at Berkeley for a similar sort of design. But PVM sucks. Don't use it, it's too slow. MPI is a much better solution.

Re:Real-world applications for clusters by The+Dodger · 1999-08-05 20:37 · Score: 1

Investors look for experience in both technology and business. What is yours?
My background includes a couple of years at college spent studying management science, several years working with unix systems and TCP/IP networks, both LAN and WAN, and stints as a technical sales manager and, later, project manager, at a telco. I'm currently working as a systems engineer with mid-range unix servers (e.g. Sun e450 up to e4500's, SGI Origin 2k's, etc.) including relevant technologies such as FCAL and clustering.
BTW, there is plenty of direction talk to Paralogic or Alta Tech.
Both are cool companies, in my opinion, but they're not in quite the same market I want to be in. As regards the technology, obviously, I'm not privy to the R&D that's going on at these companies - it's very possible that they're working on ideas which are similar, or indeed identical to mine, in which case, I may well have to tear up my plan and wait for the next idea to pop into my head.
My main skill, you see, isn't a technical or administrative one - it's my creative ability to come up with new ideas, to look at things in a different way to others, to learn about things I don't know about, and to solve problems.
D.
..is for defunct.

Re:Clustering by coreman · 1999-08-05 20:39 · Score: 1

Yeah, that was one of the suggestions but we need a chess source program to recompile for the project and then, as you point out, we need someone to actually challenge it.

This was to be an ad hoc "Stone Soupercomputer" style configuration built out of machines brought to the conference by attendees.

Aren't you clever by Anonymous Coward · 1999-08-05 21:45 · Score: 0

Just remember, you now have many points of failure not just one. Those many points are much more difficult to mirror and restore than the single one.

Re:Aren't you clever by The+Dodger · 1999-08-05 22:25 · Score: 1

I can only assume that you're referring to the fact that there are many nodes in a cluster, as opposed to one large server.
If this is the case, and you're saying that this increases the chances of the entire system crashing, then you're wrong and I would even go so far as to say that you don't have a clue what you're talking about.
Small wonder you're posting as an AC. I'd be embarrassed too if I was that stupid. :-)
D.
..is for dunce.

Re:Real-world applications for clusters by Mignon · 1999-08-05 21:59 · Score: 2

...I don't think it has all that many applications outside of scientific research...

One big one is simulations for financial calculations. One such is roughly this: the price of some class of security is sensitive to interest rates. So you want to see what happens if, at several time steps, the interest rate goes up or down some small amount. Evaluating the different 'paths' of interest rates over time lends itself to parallel processing.
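A minimal sketch of the interest-rate path idea, in Python. Everything concrete here is an assumption made for illustration and is not from the post: the security is a simple payment of 1.0 at maturity, the rate moves up or down by a fixed `step` at each time step, all paths are treated as equally likely, and the function names are hypothetical.

```python
from itertools import product

def enumerate_paths(r0, step, n_steps):
    """Yield every up/down rate path as a list of n_steps per-period rates."""
    for moves in product((+1, -1), repeat=n_steps - 1):
        rate, path = r0, [r0]
        for m in moves:
            rate += m * step       # rate goes up or down for the next period
            path.append(rate)
        yield path

def price_on_path(path, face=1.0):
    """Discount a payment of `face` back along one rate path."""
    value = face
    for rate in reversed(path):
        value /= (1.0 + rate)
    return value

def expected_price(r0, step, n_steps):
    """Average the path prices (assumption: all paths equally likely)."""
    # Each price_on_path call is independent of the others, so on a cluster
    # every path (or batch of paths) could be handed to a different node.
    prices = [price_on_path(p) for p in enumerate_paths(r0, step, n_steps)]
    return sum(prices) / len(prices)
```

Because no path needs anything from any other path, the list comprehension in `expected_price` is the part that parallelizes. The number of paths doubles with every time step, so a real system would probably sample paths rather than enumerate them all.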

Re:PVM and Beowulf by Zurk · 1999-08-06 07:07 · Score: 1

MOSIX.

Why Beowulf? by Anonymous Coward · 1999-08-06 07:09 · Score: 0

I'm puzzled as to why Slashdot keeps talking about Beowulf clusters. I mean, sure, they're great tech, but they aren't really more special than any other cluster of workstations grouped together into a virtual machine.

PVM, NOW, DEC farm, Sun's clustering technology, globus and legion are all solutions that do basically the same thing. Sure, Beowulf has to be open source, but so is OpenMPI and some of the other systems.

It's kinda irritating to see Malda continually say that Beowulf clusters will be the open-source replacement for big iron. As if because the software is free, the cluster suddenly becomes a completely cheap replacement with "better performance."

Bah, the whole key to supercomputing is the network speed, not the computational power. People may puff up their chests and talk about peak MFLOPS all they want, but if you're running it on a slow 100Mb/s line, your real world performance will go to hell.

That's why systems like the Origin 2k, the Sun E10k, IBM SP-2, etc cost so much. They have the high performance networking required to get good speed on *real* applications. Beowulf is nifty, and great for applications where you can break the problem into large chunks, but for any fine-grained calculations, you are better off running a single processor PIII.

So, I really think /. should stop reporting so proudly about beowulf, since it will never be able to deliver good performance as long as it uses slow networking.

And as long as you're covering clusters of SMPs, talk about NOW at Berkeley. Their tech is very good. -Shaka

Looking for a proprietary solution by Anonymous Coward · 1999-08-05 22:31 · Score: 0

Maybe they are looking for a proprietary solution, to get a customer lock-in. Anonymous

wrong by Anonymous Coward · 1999-08-06 07:26 · Score: 0

Hate to burst your bubble, but Beowulf systems have been shown to perform as well as or better than the Origin 2k, the Sun E10k, IBM SP-2. Ever hear of Myrinet?

And yes, network speed is important, but latency is often more important.

What you pay for is the cost of writing specialized software for these beasts (not to mention specialized hardware).

Keep reading those glossy brochures for the big iron machines.

Re:wrong by Anonymous Coward · 1999-08-09 03:50 · Score: 0

Sure, NOW uses Myrinet. My question is why focus on Beowulf? Why no coverage of other products? And sorry, but anyone can make up peak MFLOPS numbers. 1000 PCs can easily defeat any system running embarrassingly parallel code. When it comes down to real performance, it's hard to beat a system running an expensive interconnect, such as the one the Tera MTA or an Origin 2000 has.

Re:wrong by Anonymous Coward · 1999-08-18 01:30 · Score: 0

Real world app that performs as well as a Cray - the Stone SouperComputer in Tennessee... It was talked about here on slashdot previously... And it has been done at ZERO cost.

Re:Not good for large parallels? by Luke+B.+Bishop · 1999-08-08 21:39 · Score: 1

True, but this is merely a design hurdle. The thing to note here is that transaction speed is not really mission-critical; if it takes half a second to complete a transaction, then that is fine. A bunch of dedicated file servers with partial databases set up to use a network hash table (certain database keys on certain systems) on a very high bandwidth backbone could drive quite a lot of transactions. While it would be a "modified Beowulf" or some such, a scheme like this would work quite well.

As long as there was not one huge shared database, of course. And to dump old databases to semi-offline storage, a second backbone could be installed in these file servers to push the data onto backup servers, which would merge the databases again, and write them out. A lot of investment in hardware, but much less than a similar proprietary system.
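One way to picture the "certain database keys on certain systems" idea is a stable hash from key to server. This is only a sketch under assumptions: the server names, the choice of SHA-1, and the function names are all hypothetical, and nothing here is a scheme the post actually specifies.

```python
import hashlib

def node_for_key(key, nodes):
    """Map a database key to one node using a stable hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

def group_by_node(keys, nodes):
    """Group keys by owning node so each file server gets a single batch."""
    batches = {node: [] for node in nodes}
    for key in keys:
        batches[node_for_key(key, nodes)].append(key)
    return batches

# Hypothetical usage: three file servers, each holding a partial database.
servers = ["fs0", "fs1", "fs2"]
batches = group_by_node(["acct:1001", "acct:1002", "acct:1003"], servers)
```

Every client computes the same owner for a given key without asking a central directory, which is what keeps the servers from sharing one huge database. The catch with plain modulo hashing is that changing the number of servers moves most keys to a different owner.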

--

-- For large values of one, one equals two, for small values of two.

Re:Not good for large parallels? by H-Monk · 1999-08-05 22:45 · Score: 1

There are linux clusters that, if not technically a Beowulf, are at least very close to being a Beowulf and employ a shared-disk scheme. Sadly, these schemes are just as impractical for many large data applications as the shared-nothing approaches.

There's going to be some interesting developments in this area for Linux soon, even if it means I'm going to have to start them myself.

--

Re:Story sort of missed the boat by CigarBuff · 1999-08-08 22:18 · Score: 1

Besides being very poorly worded, your argument that "The definition of a Beowulf requires and "open source" OS" is simply incorrect. If you look back at why/when Beowulf was created (http://www.beowulf.org/intro.html), you'll find that it was from "their [the creators of Beowulf] idea of providing COTS (Commodity off the shelf) base systems to satisfy specific computational requirements."

Whether or not they used an open-source operating system is not the point. The goal was to provide an MPP system for as little cost as possible. If a collection of Tru64 UNIX workstations operating in a Beowulf cluster provides more computing performance at less cost than a similar system from a major MPP vendor, then it seems to meet the criteria for why Beowulf began. Sure, it might cost more than the same Alpha workstations running Linux, but it is also likely to perform better. Life's little tradeoffs are everywhere, aren't they?

Cheers,
David Hull
david.hull@england.com

Re:Not good for large parallels? by Bill+Henning · 1999-08-05 22:51 · Score: 1

With new distributed computing software (Mosix anyone?) more and more people are going to write software for clusters. Definitely there are issues with db coherency, record locking, etc., but solutions will be implemented; after all a cluster is pretty much the only way to increase throughput if an SMP box is not fast enough for you...

--
--------- Webmaster, http://www.cpureview.com and

If only there were a transparent VM... by Amoeba+Protozoa · 1999-08-05 22:58 · Score: 1

I have set up a few Beowulf machines for S&G. I used PVM, RH Linux 6.0/5.2, a 10/100 switch, and about 6 boxen. It worked quite well, except that it took a few days to get operating how I wanted. I wrote a couple applications to crunch numbers across the cluster, tested throughput, etc. For even more S&G I used MP3PVM to RIP a few CD's real fast. Fun!

Now this is all well and good, but wouldn't it be great if we could have a transparent virtual machine that runs across all the nodes? Something which you could use "/bin/bash" on as your command shell.

Now, I am not sure how this would be accomplished -- for instance, how you would efficiently share memory across machines or decide how to break up tasks (break on thread would be one way); this is just to open up conversation.

Imagine: Lower your SETI@Home WU time to mere seconds :) (is it fair to run a distributed computer under a distributed computer?)

-AP

Re:If only there were a transparent VM... by Anonymous Coward · 1999-08-05 23:07 · Score: 0

Isn't this the sort of thing that MOSIX does? As far as I understood, MOSIX created a VM over a cluster, so the layer would be Linux --> MOSIX --> application. Do I suffer from rectal-cranial inversion or isn't that the point?

oh yeah by Anonymous Coward · 1999-08-05 17:12 · Score: 0

I'm going to have a cluster in my house tonight.

Re:oh yeah by Anonymous Coward · 1999-08-05 19:52 · Score: 1

Seti is a good example of the type of data which lends itself well to being handled by a cluster. However, in this case you could just run the seti@home client on each of the 386's and get the same result. It's not a cluster, but seti@home is a great example of distributed computing.

Re:oh yeah by skittz · 1999-08-05 18:36 · Score: 1

hehe.. no joke.. I have a stack of 5 386's w/ 8mb (I think) that i've been threatening to turn into a cluster. But what would i use it for? Maybe get a seti@home client running? :)

Not good for large parallels? by Luke+B.+Bishop · 1999-08-05 17:16 · Score: 1

Strange, they said that it wouldn't be good for large volume transactions. Isn't this exactly the sort of task that works really well concurrently? It seems to be the perfect candidate, lots of non-interconnected tasks, perfect for multiple execution.

--

-- For large values of one, one equals two, for small values of two.

Re:Not good for large parallels? by bunyip · 1999-08-05 17:42 · Score: 2

Large volume OLTP has a huge amount of inter-connection at the data level. You have to lock and unlock all the records to maintain ACID properties. Beowulf is a shared-nothing approach and doesn't have facilities for sharing all the disks with appropriate concurrency control.

The largest airline systems all use IBM's Transaction Processing Facility (TPF), which is a specialized real-time OS for mainframes. TPF shares disks amongst all processors in the cluster and pushes the locking down to the individual disk controllers (specialized microcode). Where I work, we get about 200,000 physical I/Os per second to the disk farm, using TPF on eight mainframes and several TB of disk.

Still, there's nothing about large-volume OLTP that Linux couldn't do, it's just a matter of programming. I, for one, would like to see it happen.

Re:Not good for large parallels? by coffii · 1999-08-05 17:51 · Score: 1

What does it matter, I thought most airlines overbooked seats anyway. ;)

--
Bitter and twisted, DON'T ever FORGET the TWISTED

Re:Not good for large parallels? by alonso · 1999-08-05 17:53 · Score: 1

I think that the problem is the performance in distributed DB, like the one you need in this kind of environment... but a distributed FS may solve the problem:))

database locking by delmoi · 1999-08-05 17:43 · Score: 1

It may have something to do with the fact that databases need to remain "consistent", such that only one operation is performed on a record at once. The problem might be in making sure that only one box "owns" a record at once. You would need to make that record unavailable to all the other nodes, or let them know not to use it. If the network is high latency, it could be a problem.

of course, with gigabit ethernet... :)
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"

--

ReadThe ReflectionEngine, a cyberpunk style n

Re:database locking by jpc · 1999-08-05 17:49 · Score: 1

gigabit ethernet just increases the bandwidth, doesn't do much for the latency. Most of the latency is in software (TCP/IP etc) that shared memory systems avoid.

Re:database locking by The+Dodger · 1999-08-05 19:41 · Score: 1

The problem might be in making sure that only one box "owns" a record at once.
Not a problem. Whilst I don't intend to tell you how I solved this one, I will give you a hint - don't think of it as a programming issue; think of it as an architectural issue.
D.
..is for Devious.

Re:database locking by Gumber · 1999-08-06 01:50 · Score: 1

gigabit ethernet increases bandwidth, which has a minimal impact on latency because much network latency is the result of processing overhead, with a notable contribution from context switching as the data moves from hardware driver to network stack to usermode process. Cutting out these middlemen will help a lot.
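A rough back-of-the-envelope model of the bandwidth versus latency point, in Python. The 100 microsecond per-message latency and the two message sizes are assumed figures chosen only to illustrate the shape of the effect; they are not measurements from anyone in this thread.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Time to deliver one message: fixed latency plus time on the wire."""
    return latency_s + (size_bytes * 8) / bandwidth_bps

def gigabit_speedup(size_bytes, latency_s=100e-6):
    """How much faster a message gets going from 100 Mb/s to 1 Gb/s.

    latency_s is an assumed per-message software + wire latency that stays
    the same on both networks, which is the point being made above.
    """
    fast = transfer_time(size_bytes, latency_s, 100e6)   # Fast Ethernet
    giga = transfer_time(size_bytes, latency_s, 1e9)     # Gigabit Ethernet
    return fast / giga

small_msg = gigabit_speedup(100)           # e.g. a record lock request
bulk_copy = gigabit_speedup(10_000_000)    # e.g. a bulk data transfer
```

Under these assumptions the 100-byte message gets only about 7% faster on gigabit, while the 10 MB transfer gets close to the full 10x. Record locking traffic looks like the first case, so it is the fixed latency term that has to shrink.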

Clustering by coreman · 1999-08-05 18:57 · Score: 2

I think the problem with the transaction systems is that they top out on a different bottleneck. CPU isn't the major gating factor. Multithreaded applications will take great advantage of this type of system. One application that I worked on in a previous life was a credit card limit verification system for a major player. They had a 1 second transaction turn around specification. We ended up setting it up with discrete machines with a failure rollover mechanism involved. Much of the coordination we had to design would have been far easier in a coupled system like Beowulf.

There's a movement on to put together a large Beowulf cluster for the Boston Geekfest in October. One of the things we're trying to come up with is a good demo that actually shows something to the crowd. We've had ideas from the realtime rendering of POV scenes to decryption (yeah, right, watch it hum for 20 hours and then spit out the true key) but haven't come up with a "killer demo app". Email if you have any ideas.

Re:Clustering by The+Creator · 1999-08-05 20:34 · Score: 1

Maybe you could set it up as a chess engine? Really don't need to be beowulf but...

The problem is that there will probably not be anyone there that could even beat a single computer |).

LINUX stands for: Linux Inux Nux Ux X

--

FRA: STFU GTFO

Real-world applications for clusters by The+Dodger · 1999-08-05 19:36 · Score: 1

"Cluster", "Linux" and "Beowulf" are popular buzzwords at the moment, and I can see a bandwagon developing. However, it's a bandwagon which isn't going anywhere yet, because it doesn't have direction.
Beowulf is an interesting technology, but I don't think it has all that many applications outside of scientific research. For Linux clustering to achieve credibility as a viable means of replacing mainframes and high-end servers, a more balanced architecture, providing for high availability and ease of administration needs to be developed.
Luckily, I have the answer, am currently preparing a business plan and intend to begin seeking out interested parties in a month or two. If there are any venture capitalists out there who are interested in investing in a venture with more than hype and PR behind it, let me know. :-)
Meanwhile, some guy at Dell UK is inviting people to participate in building a large Beowulf at Dell's "Proof of Concept Lab" in Limerick, Ireland, at the beginning of September.
By an amazing coincidence, I'd already booked those two weeks off to go home and catch up with the family. A trip down to Limerick is on the cards, methinks...
D.
..is for Dastardly.

Re:Real-world applications for clusters by The+Dodger · 1999-08-05 23:09 · Score: 1

One big one is simulations for financial calculations.
You're absolutely correct - my oversight, and quite a large one, considering I've done this sort of modelling in the past - multiple regression analysis is one of the things I picked up whilst studying operational research.
It actually comes in kinda useful for dealing with performance issues on large systems with lots of users, and I've spec'd systems and laid out upgrade paths based on the results of MRA...
But, I digress....
D.
..is for Deviant.

Re:Real-world applications for clusters by Anonymous Coward · 1999-08-05 20:13 · Score: 0

Seems your business plan wants to jump on the same bandwagon. Investors look for experience in both technology and business. What is yours? (other than reading buzz words on slashdot)

BTW, there is plenty of direction talk to Paralogic or Alta Tech.

This would rock by Spazmoid · 1999-08-05 19:42 · Score: 1

Now all I need to do is get ahold of about 100 of those Power4 IBM chips when they are released, build 25 quad-processor 1GHz machines with 500MHz bus and 1GB mem, and throw em all in a cluster. Add 2-10 Terabytes of secondary storage, multiple OC-3 or faster connections and start leasing space on the fastest machine in the world. Handles 156E+10^8 hits/sec while doing recursive database lookups.

heh... If Only...

--
www.mp3.com/Undocumented

Re:This would rock by Anonymous Coward · 1999-08-06 03:05 · Score: 0

Yeah, that would really rock. What would you use as the interconnect BETWEEN the machines, though? Fast Ethernet? Unless you can figure out how to get the machines to quickly talk amongst themselves (like the nodes in a supercomputer such as the Origin 2000 do), all that amazing processing power put together into a Beowulf is nice but might as well be working in the context of a project like Distributed.Net.

Story sort of missed the boat by Anonymous Coward · 1999-08-05 18:18 · Score: 2

I just read the article. As a manufacturer of "turn-key" Beowulf systems, here was my reply to the author:

Stephen,

I just read your story about Beowulf systems. While the story was well written and informative, there are some points that you have missed.

1) The definition of a Beowulf requires and "open source" OS (See "How to Build a Beowulf" by Sterling, Becker, Salmon, Savarese) Therefore, systems built from True 64 are NOT Beowulf systems.

2) You missed my company, Paralogic Inc. We sell turnkey Beowulf systems. In fact rather than "several" as reported by IBM, we have several dozens of installed production systems at companies like Lucent, Amerada Hess, Conoco, Procter and Gamble, government sites like NASA, NRL, and the Air Force, and many Universities. (see www.xtreme-machines.com)

3) There is a rather huge barrier to entry because of the technical nature of these machines. As far as I know, we are the only company who will offer support for Beowulf clusters. Without support the market can never enter the mainstream.

4) There have been quite a few other people who contributed quite a lot of effort to the Beowulf technology other than IBM and VA Linux. Although all contributions are welcome, these guys are a little late to the party and we hope they stay.

Sincerely

Douglas Eadline, Ph.D.
President

Paralogic, Inc.
PEAK PARALLEL PERFORMANCE

3 p120s by male · 1999-08-05 19:55 · Score: 1

I just got three p120's with an unknown amount of ram and harddisk space, but it will probably be at least 8megs and a gig...
Other than learning a new technology, i don't have a real use for a parallel processing machine, i just do basic php -> mysql stuff with small dbs...
however, it sounds like a really cool thing to set up, and i want to learn. A lot of these sites talk about beowulf a lot, but don't give an explanation on how to set one up! Either i'm looking at the wrong sites or i just don't know how to do it..
Can anyone point me in the right direction or give me some tips on doing this? And responses like *just give me those three pcs* are appreciated but will be ignored :)

Thanks for the help
jc

--yep. i'm a NEWBIE!!!

Re:3 p120s by Leareth · 1999-08-05 23:17 · Score: 1

In follow up you may want to look at the following books:

How to Build A Beowulf : A Guide to the Implementation and Application of PC Clusters;
Sterling, Thomas L. / Becker, Donald J. / et al.

http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=026269218X

High Performance Cluster Computing : Architectures and Systems, Volume I; Buyya, Rajkumar

http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=0130137847

High Performance Cluster Computing : Volume 2, Programming and Applications

http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=0130137855

--
*A)bort, R)etry, I)nfluence with large hammer.*

Re:3 p120s by The+Dodger · 1999-08-05 20:15 · Score: 2

Can anyone point me in the right direction...
The Beowulf Underground site is a good starting place.
D.
..is for Deranged.

What beowulf is and isn't good for. by Gumber · 1999-08-06 01:34 · Score: 1

Beowulf is best for CPU intensive tasks which can be broken up easily, don't require a lot of internode communication, can deal with relatively high latency on the internode communication, and can deal with single node failures easily.

This is a relatively large domain of problems, but it doesn't work for everything. A lot of business applications require high reliability and availability. If you use beowulf, you have to implement these features for your application on your own.

The simulations that businesses are running on these things aren't really in the same league. For the most part, they aren't time critical and if a failure occurs that invalidates a test run, they can usually be rolled back to some midpoint and started again without a significant loss of time.

Beowulf isn't just useful for CPU intensive tasks though. All those processors also provide significant amounts of memory bandwidth and all those machines provide potentially large amounts of disk storage and bandwidth, but again, you need memory or disk intensive tasks that can easily be split out to many loosely coupled nodes.

PVM and Beowulf by Anonymous Coward · 1999-08-05 20:21 · Score: 0

I'm a bit confused as to *exactly* what Beowulf refers to. Is PVM a type of Beowulf? Anyway, here is my experience and take on distributed processing.

I wrote a program that distributed various parts of the Mandelbrot set to N parallel virtual machines (actually 23 networked linux boxes running a PVM daemon). This was 2 years ago I guess, so it was on midrange P133s. Naturally, the time it took to render a portion of the Mandelbrot set (it could be done with any fractal quite easily) was linear in relation to the number of computers it was simultaneously running on, probably with a slope of .7 or something (N of processors on the y axis). It was interesting to watch, because it drew the pixels as it received them (each as an individual packet) from the parallel machines; sometimes one entire machine would finish before another had started (granted, that's only a few seconds' lag). This was a fun little project; it took maybe a week or two during class (~4 hours a week) to get it finished, from scratch, but I didn't have to handle getting PVM running on the machines, the admins worried about that.

It's simply a matter of deciding how to send the packet and how it should be received. On a low speed network (like distributed.net or SETI@home) it is desirable to send the work packets in as large chunks as possible and send the completed packets as infrequently as possible. In my program, on 10Mb ethernet, it was very possible, and quite interesting, to just send one pixel at a time.

Anyway, from what little experience I have, and what little else I know, clustering is really only useful if you have a processor intensive app such as a ray-tracer, and it requires a rewrite of the rendering engine to distribute the processing. Its net effect is more processing horsepower for such apps, by using multiple cheap boxes. Creating a cluster out of 5-10 486 boxes in your basement is a fun little experiment, but will yield little in terms of performance. Your new PII/K6/PowerPC can probably outperform it, especially when you take into account the overhead required for the distribution process.

If you work in a situation where you need something like a renderfarm, you could set this up quite nicely if you have custom apps and can afford 10 boxes of alphas running linux. What is great about linux in this case is that you can really cut down the cost by scaling down each machine to what you need (i.e., a small slow hard drive is fine if you need caching to run your process), video etc. is unimportant, and you can limit RAM to what your process needs, recompile your kernel, take out all unimportant services, and config it so just telnet is running, so you have a lean distributed computing machine :-) All you really need is a processor, motherboard and RAM; you can probably hack up something to boot off a network, esp. if the machines are identical :-) For the few of us who *actually* can use a system like this, linux is a great way of cutting costs, and not paying a fortune for an SGI O2 or something (though, if I had the money :-)

At any rate, it is something I wish I had a real use for, because it would be *so* much fun to tinker with. I was just thinking, as I end my rambling: is there any PVM type open source program which can distribute threaded apps across multiple machines? Then it wouldn't require custom rewrites of everything; the real trick of it would be the network lag. Anyway, enough rambling, back to work
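For anyone wanting to try the Mandelbrot experiment, here is a minimal single-process sketch in Python of how the work might be split. It is not the program described above: it batches whole rows per worker (the "large chunks" approach suited to slow networks) instead of one pixel per packet, the view window and function names are assumptions, and the loop over batches stands in for sending each batch to a PVM or MPI worker.

```python
def escape_count(c, max_iter=50):
    """Iterations before z = z*z + c leaves the radius-2 disc."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter

def render_rows(rows, width, height, max_iter=50):
    """Work unit for one node: compute a batch of whole rows.

    Assumes width and height are both at least 2, and an assumed view
    window of [-2, 1] on the real axis and [-1.5, 1.5] on the imaginary.
    """
    out = {}
    for y in rows:
        im = -1.5 + 3.0 * y / (height - 1)
        out[y] = [escape_count(complex(-2.0 + 3.0 * x / (width - 1), im), max_iter)
                  for x in range(width)]
    return out

def split_rows(height, n_workers):
    """Deal rows round-robin so expensive regions are spread across workers."""
    return [list(range(w, height, n_workers)) for w in range(n_workers)]

def render(width, height, n_workers):
    """Collect every worker's rows back into one image, top row first."""
    image = {}
    for rows in split_rows(height, n_workers):   # each batch could go to a remote worker
        image.update(render_rows(rows, width, height))
    return [image[y] for y in range(height)]
```

Rows are dealt round-robin because rows through the middle of the set cost far more than rows near the edge; contiguous bands would leave some workers idle early, which matches the observation that one machine sometimes finished before another had started.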

Spyky

Anonymous because yesterdays bugs broke my login

Mfg Design? Depends on your tool... by Snarfvs+Maximvs · 1999-08-06 03:24 · Score: 1

Quite a lot of mfg design utilizes DES (discrete event simulation) models, and in my experience (which is fairly extensive in the semiconductor industry) these models contain vast amounts of detail. As a result they run REALLY REALLY SLOWLY. We're talking DAYS for a single replication of an experiment (you have to vary your random seeds and average across replications to eliminate the effects of "pseudorandom" numbers). What's worse, since these models are strictly time-based they're not parallelizable to any degree. So a Beowulf cluster buys you....nothing. You just need many CPUs and TONS of RAM to run the experiments in parallel...but you can't parallelize a single run.
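To make the distinction concrete, here is a toy sketch in Python. The single-machine queue, its rates, and the function names are assumptions invented for illustration and are nowhere near a real fab model; the point is only the structure: each run is an inherently sequential chain of events, while separate replications with different seeds share nothing.

```python
import random
from statistics import mean

def one_replication(seed, n_jobs=1000, arrival_rate=0.9, service_rate=1.0):
    """One sequential run of a toy single-machine queue; returns mean wait."""
    rng = random.Random(seed)          # independent random stream per replication
    clock = 0.0                        # arrival time of the current job
    free_at = 0.0                      # time the machine next becomes idle
    total_wait = 0.0
    for _ in range(n_jobs):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, free_at)    # depends on the previous job: not splittable
        total_wait += start - clock
        free_at = start + rng.expovariate(service_rate)
    return total_wait / n_jobs

def run_experiment(seeds):
    """Average across replications; each call could run on a separate CPU."""
    return mean(one_replication(s) for s in seeds)
```

Inside `one_replication` every job's start time depends on when the previous job finished, so extra nodes cannot speed up a single run. The calls made by `run_experiment` are independent, so many CPUs with enough RAM each can run the replications side by side.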

OTOH, if you're doing linear or integer programming, those are parallelizable if you're doing branch & bound stuff. But LP/IP doesn't always give you the granularity you need to make decisions as accurately as DES does (and can't account for the stochastic and dynamic nature of manufacturing processes).

--
-----------------------

To understand recursion, one must first understand recursion.

46 comments