Slashdot Mirror


Ask Donald Becker

This is a "needs no introduction" introduction, because Donald Becker is one of the people who have been most influential in making GNU/Linux a usable operating system, and is also one of the "fathers" of Beowulf and commodity supercomputing clusters in general. Usual Slashdot interview rules apply, plus a special one for this interview only: "What if we made a Beowulf cluster of these?" is not an appropriate question.


  1. One question... by Noryungi · · Score: 5, Interesting


    (And this is a serious one!)

    Why did you choose Linux, instead of *BSD, to create a Beowulf?

    This is a serious question, not a flame: why choose Linux over, say, FreeBSD? Is it just because your employer already used Linux? Because you had used Linux before and had more experience working with it? Because you had tested both, and found Linux better than BSD? Or because Linux had tools the *BSDs did not have?

    Just a question...

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    1. Re:One question... by kiolbasa · · Score: 4, Informative

      If I recall, the definition of a Beowulf cluster does not specify Linux specifically, only a free operating system.

      Look it up

      --

      Beer wants to be free
  2. What one thing would you like to see added... by CSG_SurferDude · · Score: 5, Interesting

    What one thing would you like to see added to the Linux Kernel? Why hasn't anyone done that already? And how would that "One Thing" be better than somebody else's suggestion?

  3. How to bring this up with your boss?? by Bob+Abooey · · Score: 4, Interesting

    This reminds me of when I was working at Apple in the secret (heh... my NDA ran out and they did away with the division so it's no longer a secret...) two button mouse division. Basically we used open source tools, like Linux/Emacs and Linux/gcc because they were fast and very functional, but we could never get any of the team leaders to permit them company wide due to the fact that they didn't come shrink wrapped and thus were not officially supported. Now I know that you can get great support from Usenet but that's not good enough for the pinheads who are in upper management at Apple.

    So, my question would be, what's the best way for an engineer at a large company to address this issue with the people they report to?

    --

    All the best,
    --Bob

  4. hardware insights? by rambham · · Score: 5, Interesting

    With your experience creating so many ethernet
    drivers do you have any opinions or suggestions
    for hardware makers? Aside from good documentation
    what makes a given hardware device easy to work
    with and what makes a device hard to work with?

  5. MPI OS X and the future by eadint · · Score: 4, Interesting

    Where do you see the Beowulf project going in the future? Also, I hope this isn't a redundant question, but will you be adding MPI to your clusters to create a kind of PVM/MPI hybrid? How about really good documentation? And finally, have you considered porting your software over to the OS X platform? If so, how can the Apple community help?

  6. Enterprise Computing by llamalicious · · Score: 5, Interesting

    What is - in your opinion - the single most important, necessary evolution of GNU/Linux systems to help them become a commodity in the enterprise arena?

  7. What's the future of distributed computing? by theBraindonor · · Score: 5, Interesting

    What do you see as the future of distributed computing? Will it be massive P2P distributed networks for the masses? Or will it be large commercial distributed networks?

    What tools exist that will be used to create this future? What tools still need to be invented?

  8. Dear Don, does it suck not to be rich? by Hairy_Potter · · Score: 5, Interesting

    You've written code that's used by millions of people, just about anyone who's ever networked a Linux box has used your driver. Yet, you're not rich. Would you like to see Linux people chip in a few bucks out of gratitude?

  9. The Future.. by Anonymous Coward · · Score: 5, Interesting

    What do you see the future holding for:
    (a) Beowulf technology
    (b) Different uses for Beowulf

  10. Processor/Architecture by Bonker · · Score: 5, Interesting

    If you could add features to the x86 processor or architecture to make clustering work better, what features would you add?

    --
    The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
  11. Why by idontneedanickname · · Score: 4, Interesting

    Why did you name it after the epic "Beowulf"?

  12. Two questions by Theodore+Logan · · Score: 5, Interesting

    First one I really think should be in your FAQ, but that I haven't been able to find there: why did you choose the name of a millennia-old epos about a Scandinavian warrior for something that does not even seem distantly related?

    Secondly, do you read Slashdot, and if so, what do you think about all the troll jokes about Beowulfs? Was it at least funny in the beginning to hear about people "imagining" clusters of just about anything?

    Ok, so it was more than two questions. Sue me.

    --

    "If you think education is expensive, try ignorance" - Derek Bok

  13. OS X by paradesign · · Score: 4, Interesting
    What are your thoughts on Mac OS X?

    It seems to have all of the polish and usability Linux/BSD people dream about, while still maintaining a fully open source BSD core (Darwin). Have you ever been tempted away from Linux like so many others?

    --
    I want 2D games back.
  14. Message Passing vs. Single System Image by turgid · · Score: 5, Interesting

    Why do you think that message passing clusters are more popular than single system image clusters, and do you see the balance changing eventually? In other words, is there no compelling reason to choose single system image for most problems? Also, when do you think that the 32-bit addressing limitations of x86 hardware will become a problem for doing Big Science on clusters?

    1. Re:Message Passing vs. Single System Image by joib · · Score: 4, Insightful

      Programming MPI (i.e. message-passing) is slow, difficult and error-prone. But I'd say making the hardware and especially the operating system for a single system image computer with thousands of processors is even more difficult. Or hey, why stop at thousands of processors? IBM is designing their Blue Gene computer, with 1 million processors. How do you make a single kernel scale on a system like that?

      The traditional approach is to use fine grained locking in the kernel, but this tends to lead to unmaintainable code and low performance on lower end systems. For an example of this see Solaris, or most other big iron unix kernels.

      Another approach is the OS cluster idea championed by Larry McVoy (the Bitkeeper guy). The idea is that you run many kernels on the same computer, one kernel takes care of something like 4-8 cpu:s. And then they cooperate somehow so they can give the impression of SSI.

      A third approach seems to be the K42 exokernel project by IBM. They claim very good scalability without complicated lock hierarchies. The basic design idea seems to be to avoid global data whenever possible. Perhaps someone more knowledgeable might shed more light on this...

      But anyway, until someone comes up with a kernel that scales to zillions of cpu:s, message passing is about the only way to go. Libraries that give you the illusion of using threads but are actually using message passing underneath might ease the pain somewhat, but for some reason they have not become popular. Perhaps there is too much overhead. And some people claim that giving the programmer the illusion that all memory access is equal speed leads to slow code. The same argument also applies to NUMA systems.

      And on the system administration side of things, projects like mosix and bproc already today give you the impression of a single system image. Of course your application still has to use message passing, but administration and maintenance of a cluster is greatly simplified.
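
A minimal sketch of the message-passing style discussed in this reply, offered as an illustration rather than as how MPI or any real cluster library works. It assumes Python threads and `queue.Queue` objects as stand-ins for cluster nodes and the interconnect; the names `worker` and `sum_of_squares` are hypothetical. The point is the shape of the program: each worker only sees data that arrives in its inbox, and the coordinator has to scatter the input and gather the partial results explicitly.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Stand-in for one node: receive a chunk, compute, send the result back."""
    chunk = inbox.get()                      # blocking receive
    partial = sum(x * x for x in chunk)      # purely local computation
    outbox.put((rank, partial))              # explicit send


def sum_of_squares(data, n_workers=4):
    """Scatter data to workers, then gather and combine their partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Scatter: worker r gets every n_workers-th element starting at index r.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])

    # Gather: results arrive in whatever order the workers finish.
    total = 0
    for _ in range(n_workers):
        _rank, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(sum_of_squares(list(range(10))))
```

Even in this toy form, the bookkeeping (who gets which slice, how many replies to wait for) is the programmer's job, which is the burden a single system image would hide.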

  15. Donald Becker is a world-class guy by johnnyb · · Score: 4, Interesting

    In addition to being extremely smart, Donald Becker is a world-class guy. When I was new to Linux, I had trouble with one of his drivers. I emailed him, and within a day he emailed me back. It was a pretty stupid issue - I needed to download the latest driver :) However, he was very nice about it, didn't send me an RTFM - in fact he included instructions for building and installing it.

    Anyway, Donald - thanks for helping me out when I was a stupid newbie, you are truly a world-class fellow.

  16. Linux kernel 2.6/3.0 ? by unixmaster · · Score: 5, Interesting

    What do you think about the effect of the next Linux kernel (2.6/3.0) on clustering, taking into consideration the new O(1) scheduler, the new VM, and the many other new features?

    --
    Never learn by your mistakes, if you do you may never dare to try again
  17. Memory-Oriented Logic by Effugas · · Score: 5, Interesting

    Dr. Becker,

    As I'm sure you've noticed, the price of memory has been driven into the ground -- indeed, it's so inexpensive, the economics seem to have rendered the usage of virtual memory nearly obsolete. Need another 256MB? Spend the $20 and buy it. It's just that simple.

    Now, memory makers can't let their goods be absolutely commodified forever, and I'm unconvinced that further speed increases, either in latency or bandwidth, will remain permanently relevant. So I'm curious about your opinion of embedding highly localized simple logical operators amongst the core memory circuitry itself. I've heard a slight amount about work in this direction, and it seems fascinating -- instead of requesting the raw contents of a block of memory, request the contents run through a highly local but massively parallelizable operation -- bit/byte/word interleaved XOR/ADD/MUL, for example. Obviously semiconductors can do more than store and forward; do you believe we a) will and b) should see memory implement trivial operations directly? What about non-turing complete instruction sets?

    Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com

    P.S. Please forgive me if this entire post reads like "What about a beowulf cluster of DIMMs?"
    P.P.S. Be honest: Do you ever find it ironic that the Internet Gold Standard for Ethernet cards ended up being called Tulip?
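
A toy model of the memory-side operation idea raised in the post above, purely as an illustration. Everything here is hypothetical: `ToyMemory`, `read_block`, and `read_reduced` are invented names, and no real memory part is claimed to expose such an interface. The sketch only counts how many words would cross the bus when the reduction runs next to the storage instead of in the CPU.

```python
import operator
from functools import reduce


class ToyMemory:
    """Hypothetical memory that can either return raw words or reduce them locally."""

    def __init__(self, words):
        self.words = list(words)
        self.words_transferred = 0   # words that crossed the (imaginary) bus

    def read_block(self, start, length):
        """Conventional read: every word in the block crosses the bus."""
        block = self.words[start:start + length]
        self.words_transferred += len(block)
        return block

    def read_reduced(self, start, length, op):
        """Memory-side reduction: only the single result crosses the bus.

        Assumes length >= 1; op is a binary function such as operator.xor.
        """
        block = self.words[start:start + length]
        self.words_transferred += 1
        return reduce(op, block)


if __name__ == "__main__":
    cpu_side = ToyMemory([1, 2, 3, 4])
    print(reduce(operator.xor, cpu_side.read_block(0, 4)), cpu_side.words_transferred)

    mem_side = ToyMemory([1, 2, 3, 4])
    print(mem_side.read_reduced(0, 4, operator.xor), mem_side.words_transferred)
```

Both paths compute the same XOR, but the second moves one word instead of four; whether real hardware could deliver that saving is exactly the open question being asked.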

  18. Considering you're on the board at scyld.. by Havokmon · · Score: 5, Funny
    Did you ever have anything to do with crynwr?
    And why don't you people like vowels? :)

    (Thanks for the ne2000 driver!)

    --
    "I can't give you a brain, so I'll give you a diploma" - The Great Oz (blatantly stolen sig)
  19. probably way too many but what the hey... by jahjeremy · · Score: 5, Interesting

    Please describe the general process you follow for writing and testing ethernet drivers on linux.

    A couple more specific questions...

    1) What approach do you take in creating drivers for cards which have inaccurate or insufficient documentation?

    2) What tools do you use for debugging and/or "discovering" the workings of old/obscure/poorly documented hardware?

    3) What skillset, i.e. languages, knowledge & tools, do you consider necessary to perform the kind of coding you routinely do (outside of hacker wizardry and C mastery)?

    I am also wondering how you got started writing ethernet drivers and clustering software for linux. What led you down this specific path rather than other aspects of kernel/OS development?

    JM

  20. Java by Anonymous Coward · · Score: 5, Interesting

    What do you think about Java and its role in distributed computing? Do you have much experience with Java, and what are your opinions of it?

  21. Which Network gear manufacturer? by iamsure · · Score: 5, Interesting

    As the man responsible for writing multiple network card drivers, you are in a unique position to answer this..

    What (FastEthernet/100mb) Network gear manufacturer do you prefer and recommend to others?

    Whether it's servers or home use, it's an important question, as some are as buggy as all get out, and others are to die for.

    And if it's a different answer, which manufacturer do YOU use?

    1. Re:Which Network gear manufacturer? by steveha · · Score: 5, Interesting

      I blush to admit this, but I have already asked him this question. Last January I was having trouble with a network card, and I sent email to Mr. Becker asking his advice.

      Here is my quick summary of what he told me:

      Some network cards are really pathetic and/or broken. As long as you don't buy one of those, it doesn't really matter very much which one you buy.

      The 3Com 3c905 cards are a little bit better than other cards.

      I found this web page:

      http://www.fefe.de/linuxeth/

      Based on that web page and Mr. Becker's comments, I bought myself some 3Com 3c905c network cards, and I have been very happy with them.

      P.S. I used to buy my net cards by brand name. Bad idea! You must look beyond the brand name and see what chipset the net card uses. I bought a Linksys LNE100TX card and liked it, so I kept buying that card. But Linksys started making different versions of the card, using completely different chipsets, so the last time I bought that card it turned out to be really broken under Linux. Older LNE100TX cards work well with the "Tulip" driver under Linux, but newer ones are really broken.

      steveha

      --
      lf(1): it's like ls(1) but sorts filenames by extension, tersely
  22. OpenMosix by GigsVT · · Score: 5, Interesting

    As someone who has made small contributions to the OpenMosix project, while I'm amazed at what clustering can do, I'm disappointed at the same time at what it cannot.

    Distributed shared memory is a big hurdle facing the OpenMosix project over the next couple years. Right now any program that allocates shared memory cannot migrate. What do you think of projects like OpenMosix? Do you think we will reach a point where parallel programming is a thing of the past, discarded in favor of tools like OpenMosix that require no special programming considerations except implementing clean threading?

    --
    I've had enough abrasive sigs. Kittens are cute and fuzzy.
  23. What have you done for me lately? :-) by gosand · · Score: 5, Interesting

    Donald, as the founder and CTO of Scyld, as well as a member of the board of directors, do you still get to hack, or is your time all taken up with business? Do you ever get the itch to get back to hacking code? If so, what are you working on?

    --

    My beliefs do not require that you agree with them.

  24. NASA, Government, Linux, Open Source by 4of12 · · Score: 5, Interesting

    Would you care to comment on your experience in NASA working on an Open Source project? (I understand you've left NASA for Scyld, maybe that partially answers my questions, but I still want to know...)

    It seems as if your work on Beowulf clusters had a nice spin-off in terms of providing not only low cost supercomputing for academic, government and industrial users, but also in terms of Ethernet support for all sorts of Linux users.

    1. Are further spin-offs in the works, be it for advanced network interfaces or anything else?
    2. Are the program managers in government aware of the beneficial impact they have on a wider scale by funding work like yours?
    3. Do they even care?
    --
    "Provided by the management for your protection."
  25. limits of clusters by flaming-opus · · Score: 5, Interesting

    Beowulf and similar clusters have hugely lowered the cost of super-computing for a great number of scientific problems. Due to the great interdependence of data and the relatively high latency of cluster interconnects, some problems are not easily worked on using clusters. What are the evolving areas of clustered computing? Where are the advances: Are new algorithms being developed for these difficult problems, or are clusters becoming more capable?

    - Also -

    What tools are seriously lacking in linux clusters? Are open source (or low cost) cluster filesystems necessary to expand the use of beowulf clusters? - Are better libraries needed? Where is research needed?

  26. What comes next? by Matt_Bennett · · Score: 5, Interesting

    Ethernet seems to be reaching the end of its usable capacity- a gigabit ethernet card running at full bore (wire speed) can max out many machines both on bus bandwidth and CPU utilization. Infiniband appears to be the best alternative, but acceptance is so slow, it may never make it. There is a linux effort with Infiniband, but due to the slow acceptance and development of Infiniband, it seems we may never see the combination of good working hardware and a complete software implementation of the standard.

    If Ethernet consumes too many resources, and Infiniband is stillborn, what's the next communications medium for networking and clustering?
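
A back-of-the-envelope check of the bus-bandwidth point in this post. The post does not say which bus it has in mind; this sketch assumes a conventional 32-bit, 33 MHz PCI slot, and the function names are hypothetical. It compares gigabit wire speed in one direction against that bus's theoretical peak, ignoring protocol and arbitration overhead on both sides.

```python
def ethernet_mbytes_per_s(gbit_per_s):
    """Wire speed in MB/s for one direction (1 Gbit/s = 1000 Mbit/s, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


def pci_peak_mbytes_per_s(bus_bits, clock_mhz):
    """Theoretical peak of a parallel bus: bytes per transfer times transfers per microsecond."""
    return bus_bits / 8 * clock_mhz


if __name__ == "__main__":
    nic = ethernet_mbytes_per_s(1.0)            # gigabit Ethernet, one direction
    bus = pci_peak_mbytes_per_s(32, 33.0)       # assumed 32-bit / 33 MHz PCI
    print(f"NIC wire speed: {nic:.0f} MB/s")
    print(f"PCI peak:       {bus:.0f} MB/s")
    print(f"Bus share used: {nic / bus:.0%}")
```

Under that assumption a single direction of gigabit traffic already needs most of the bus's theoretical peak, and full-duplex traffic would exceed it, which is consistent with the complaint above.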

  27. Network driver fiasco by Eric+Seppanen · · Score: 5, Interesting
    You wrote and maintain a lot of Linux network drivers. Unfortunately, these drivers stopped being included in Linus' kernel because he dislikes the backwards-compatibility code in them (throwing out the baby with the bathwater, if you ask me, but this is Slashdot and I dare not criticize the Great Leader too much). Sadly, the end-users are the ones that really suffer.

    Is this still the case and is there any hope of this deadlock ending? I know some folks have stepped up to maintain what's left of your code in the kernel; are they doing an adequate job?

    --
    314-15-9265
  28. Time to burn some karma... by Loki_1929 · · Score: 5, Funny

    Donald Becker,

    With all that you've accomplished to date, how much do you think a Beowulf cluster of Donald Beckers could accomplish?

    --
    -- "Government is the great fiction through which everybody endeavors to live at the expense of everybody else."
  29. Total Cost of Ownership by guygee · · Score: 5, Interesting

    Given the decreasing ratio of power efficiency per transistor for newer generations of commodity CPUs, what suggestions do you have to reduce the total cost of ownership (including the necessary electrical power and cooling infrastructure upgrades) over the lifetime of large computing clusters?

  30. 10 computer, Teraflops GPU based Beowulf systems. by registro · · Score: 5, Interesting

    Well, if the goal is teraflops-league clusters, what about using other commodity chips, like nvidia/3dlabs/ATI GPUs? Some groups are already working on ways to use GPUs as math coprocessors, using OpenGL to express numerical vector operations as graphics operations on images.
    This is not just academic. GPUs are real vector processors, some of them capable of 200+ GFLOPS, using up to 128-bit floating-point precision.

    That's about 100 times faster than Intel-based CPUs.

    Extending math libs, and adapting MPI to use the cluster's GPUs as vector-oriented math coprocessors, could potentially lead to 10-computer, teraflops-level Beowulf clusters.
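
The arithmetic behind the "10 computers" estimate can be sketched as follows, taking the post's figures at face value. They are the poster's claims, not measurements, and the sketch additionally assumes one GPU per computer and perfect scaling, neither of which the post states. The function name is hypothetical.

```python
GPU_PEAK_GFLOPS = 200.0    # per-GPU figure claimed in the post
SPEEDUP_VS_CPU = 100.0     # "about 100 times faster" claimed in the post
NODES = 10                 # cluster size proposed in the post


def cluster_peak_tflops(nodes, gflops_per_node):
    """Aggregate peak in TFLOPS, assuming perfect scaling and no communication cost."""
    return nodes * gflops_per_node / 1000.0


# CPU throughput implied by the post's two claims taken together.
implied_cpu_gflops = GPU_PEAK_GFLOPS / SPEEDUP_VS_CPU

if __name__ == "__main__":
    print(f"Implied CPU peak: {implied_cpu_gflops:.1f} GFLOPS")
    print(f"Cluster peak:     {cluster_peak_tflops(NODES, GPU_PEAK_GFLOPS):.1f} TFLOPS")
```

On those assumptions ten nodes reach the teraflops range on paper; how much of that peak survives data transfer to and from the GPU is the part the question leaves open.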

  31. Thanks for all the Ethernet drivers, Don! by 0x0d0a · · Score: 5, Insightful

    Thanks for all the drivers. There are a *lot* of people (including me, with two cards that use your drivers) that really appreciate what you've done.

  32. Free time at NASA??? by ICA · · Score: 5, Interesting

    Here goes:

    What drives a guy working at NASA to develop a plethora of Ethernet drivers and architect a distributed computing system?

    Was this based on a need for better tools at work? Spare time?

  33. Export restrictions by Call+Me+Black+Cloud · · Score: 4, Interesting

    Currently high-performance computers (supercomputers) are subject to export restrictions. Don't want the bad guys simulating their nuclear explosions in software or decrypting our secrets of course. This is an example of technology that can do a lot of good or a lot of bad depending on who's using it.

    Though it's certainly impossible at this point, do you think similar restrictions should apply to projects like Beowulf? At what point does the potential for bad things outweigh the potential for good things?

  34. changes in the beowulf environment + community by painehope · · Score: 5, Interesting

    Donald,
    As a member of the beowulf@beowulf.org list, I have noticed that your posts generally seem to be of a technical, "yes/no, this is how you do it", etc. nature ( which is quite good actually ), and I've never really seen much stating your opinion on the way things are. I've got a few questions :
    1) how do you feel about high-speed interfaces, and the parallel code ( i.e. various flavors of MPI ) to take advantage of them? I noticed that every time benchmarks come up for Myrinet or SCI interfaces, we get a minor flamewar between said parties, and no one ever really mentions Infiniband ( and Gigabit ethernet to ea. node is still prohibitively expensive in terms of price/performance at the switch level ). This also brings up issues of free vs. proprietary interfaces and software. What do you think are the futures of these technologies, and which model do you prefer : open source or Whatever Gets The Job Done(TM)?
    2) why did you pick Linux, as opposed to, say, one of the BSDs? At the time when you started doing Beowulfs, GNU/Linux wasn't the beloved child of the community that it is now, so what prompted the choice?
    3) also, what do you see the next wave of clustering to be? We saw mainframes ( Shared Memory Processors ), then high-powered clusters ( ala SP2 + SP3, SMP on ea. node, but no contiguous RAM across all nodes natively ), then the introduction of COTS ( Commodity-Off-The-Shelf ) Beowulfs, then next-generation Beowulfs ( higher-end dual ( sometimes quad or even now some Xeon NUMA boxen ) processor, large amounts of RAM, high-speed SCSI disks, 64 bit PCI or PCI-X, etc. ), which argues that the community goes w/ the next bright idea ( which is dependent on hardware ), and companies go w/ whatever gives them the most bang for their buck. Where do you think we're going now ( as far as the major trend, since there is no 1 answer to the various problems that MPPs are used to address )? Low power consumption, low-heat large farms? I'm all ears...
    Anyways, whether these questions get answered or not, thanks for the hard work you've done and all you've given to the community.

    --
    PC moderators can suck my White pierced, tattooed dick. If you think pride == hate, s/dick/Aryan meat mallet/g.
  35. What prompted you to leave NASA ? by Koos · · Score: 5, Interesting
    That is the one thing I'd like to know. You had (at least from the looks of it) a good career at NASA where the work on the clustering and the high-performance network drivers was sort of an added bonus to help you do your research work.

    You changed to Scyld where the main objective is to earn money from the application of high-performance computing. You still make all those drivers available and update them (many thanks for that) but the company also has to make money, you need to pay your meals and your home.

    What made you change, and how do you feel about that change now it's been a few years?

  36. Do you play any musical instruments? by CresentCityRon · · Score: 4, Interesting

    I read from Dijkstra and Knuth that they both noted how many programmers also played musical instruments - more than the standard population.

    This will not further the clustering field but do you play any musical instruments?

  37. Device drivers - where to begin ? by minaguib · · Score: 5, Interesting

    Hello Donald,

    I'm a perl hacker (with a bit of C knowledge) and have made a good career out of it so far.

    However, lately I've found myself getting interested in the linux kernel and specifically, device drivers.

    My question is.. Where to begin ? I've seen your name in several drivers in the linux kernel (specifically to my case, the Intel EtherExpress Pro 10/100 card) and have spoken to you on usenet on occasion.

    What should a complete beginner like me learn to get into this area ? Specifically, kernel modules in general, hardware drivers in general, researching how to deal with a specific piece of hardware...

    Thanks for any tips :)