64GB RAM Under 64-bit Linux?
gary.flake asks: "My group at NECI is in need of a machine that can address 64GB of RAM in a single process. This means we need 64-bit addressing. We'd prefer to go with a Linux solution because all of our development is under Linux. We've spec'ed out some reasonable machines (Dell can do 32GB, and Compaq can do 64GB) but they seem to be lacking in that they can only be loaded up with 4 x 800MHz Itaniums. We would really like to have more processing power (2GHz x 4 would be a dream). Does anyone know of any monster Itanium machines that will meet our needs? (Please, no Alphas). Finding such a beast is harder than you'd think."
Won't a 2.4 kernel with highmem support address 64GB in a single process?
Finding a machine with 4 x 2GHz Itaniums isn't hard at all - they simply don't exist. If you don't want Alphas, 800MHz is as fast as the IA-64 chips go.
It would help if you'd explain a little more about what you're trying to do, or post your Q to the LKML.
You don't have to be subscribed to post to it.
Also, have you considered NUMA or PVM/MPI architectures? Linux supports the ex-Sequent-now-IBM NUMA-Q architecture.
Indie rock lives! b-side!
Of course, Intel sacrificed a lot - particularly actual performance - for a high clock rate and thus the perception of speed from unsophisticated buyers.
The closest alternative is IBM's recently announced POWER4 systems, available in 1.3 and 1.6 GHz options. Extremely nice machines, which will blow the doors off a P4 with twice the clock rate, but if you have to ask about the price, the question is "merely profane, or really, really rude?"
Sun make plenty of machines that large, but their processors have always tended to trail in performance. They keep improving, but stay well behind the bleeding edge.
Alpha is the cheapest option, I think, but it's definitely end-of-life.
Additional options are SGI and HP, but you'd want to go with the vendor Unixes, because the Linux ports aren't ready for serious use yet.
Itanic is, to put it politely, not recommended for serious use. The Merced is a dog, and gcc doesn't help.
This article from The Register has some info on the Intel server processor roadmap, although not much about processor speeds. Last I heard, McKinley was going to ramp up to 1.5GHz. Seeing that Madison will start at 1GHz, that one is probably the best bet for getting to 2GHz, and according to the Register article it's not set to be released until early 2003.
"Karma can only be portioned out by the cosmos." -Homer Simpson
So I'll say what nobody has said until this point. (If you find this in Metamod, and it is marked 'redundant', please SPANK the moderator for me.)
Either you've got something REALLY REALLY strange going on, or you've got an incredibly strange niche you are programming in. Your best bet is to spend a few brain cycles and figure out a way to do it WITHOUT 64GB of RAM.
However, if you're hell-bent on getting some CPUs that run Linux, and your budget is unlimited, go for a Sun Ultra Enterprise 10000. Other downside? You'll need a minimum of 16 CPUs to go with that 64GB of RAM (unless Sun released higher density RAM chips that were certified for the E10k while I wasn't looking, in which case you'll only need 8 CPUs). Expect to pay $1-2M.
And then if Linux has been ported to the UltraSPARC III processor, you might be able to get away with something smaller. But the Sun V880 will only go to 8 CPUs and 32GB. You'll need at least a Sun Fire 3800 to do the job of 64GB. It will probably take 4 processors to make the 64GB available.
The biggest memory modules being put out for PCs right now are 1GB (that I've seen). I don't think you're going to find a motherboard with room for 64 memory modules. Or even 32.
Time to rethink what you are doing, or throw lots of money at it. That money, though, would probably be better spent at buying the brainpower to rethink it.
Sorry.
Okay. You need 64GB of RAM. I can understand that. Explain to me why, exactly, you need 64-bit addressing? You're aware that recent Intel chips (at least the PIII and P4 Xeons) use 36-bit addressing, right? And that 36-bit addressing allows you to access... you guessed it, 64GB of RAM! Now, admittedly, you may have a hard time finding a system board to run it all (my current, erm, desktop, an HP LXr 8500, only supports 32GB), but I've seen them out there.
The remaining issue is that, under most OSes, a single process is still limited to 32-bit pointers; that is, 4GB. Even so, you may find that a more commodity system fits your needs quite well, especially as you can pick up eight or ten of these systems at low, low prices in fire sales (where do you think I got mine?).
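For anyone who wants to check the 36-bit and 32-bit figures above, the arithmetic is quick to verify. A small sketch in Python; the helper name is made up for illustration, and the 36-bit mode is the physical address extension on those Xeons:

```python
def addressable_gib(address_bits: int) -> float:
    """Size in GiB of an address space with the given number of address bits."""
    return 2 ** address_bits / 2 ** 30

# 32-bit pointers: the most a single process can address on most OSes
print(addressable_gib(32))  # 4.0

# 36-bit physical addressing on the PIII/P4 Xeons
print(addressable_gib(36))  # 64.0
```

So the box can hold 64GB, but any one process still sees a 4GB window at most, which is the catch for the original question.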
I've had this sig for three days.
If you want a stable, performant, cost-effective platform with 64-bit addressing, you should be looking at SPARCv9. Duh.
-I like my women like I like my tea: green-
Perhaps use a cluster to access a 'virtual RAM' across all the machines? 32 PCs would easily give you 64GB, which leads to the question of the day... What needs 64GB of RAM?
Can it not be split into a distributed process?
You're running Linux and yet you don't want Alphas? Have a look at the times between the last few kernel releases, and ask yourself if this software is up to beta standard.
You don't have too many options if you need 64 GB of ram in a single image (not a cluster)...
SunFire 3800
According to the techpub, you'll need both CPU trays (but perhaps with each only half populated with CPUs -- 2 + 2 = 4 CPUs) to house enough memory modules for 64 GB. Base price for a 3800 is $160,000 for two boards. Plus unique RAM (currently $1700 per GB). Keep in mind that the 3800 cannot be upgraded beyond 64 GB or 8 CPUs.
AlphaServer GS80
Looks like you'll need at least 4 CPUs to handle the 64 GB. Base price seems to be around $140,000. Plus unique RAM. Keep in mind that the GS80 cannot be upgraded beyond 64 GB or 8 CPUs.
IBM P660
2 - 8 CPUs. Up to 64 GB RAM. Starts at $66,000. Plus unique RAM. Runs the Linux-Friendly IBM AIX 5L. Keep in mind that the P660 cannot be upgraded beyond 64 GB or 8 CPUs.
SGI Origin 3000
Looking at about $220,000 for one that can accommodate 64 GB. Plus unique RAM (about $900 per GB). The 3000 can be upgraded to 1024 GB (1 TB) of RAM and 512 CPUs as a supported configuration. 1 TB with 1024 CPUs is possible, but it is unsupported and requires unsupported OS patches.
Origin 300 would be *much* cheaper, but it only supports 32 GB right now (it will support 64 GB when SGI ships their high density memory modules) and it's nowhere near as expandable or upgradable as the 3000. Origin 300 cannot be upgraded beyond 32 CPUs and will most likely never support more than 64 GB of RAM.
IBM is your best bet.
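To put rough totals on the two systems above that have a quoted per-GB RAM price, here is a back-of-the-envelope sketch in Python. It assumes the base prices include no memory at all, which the list does not actually say, and the function name is only illustrative:

```python
# Figures quoted in the list above; per-GB RAM prices were only given for Sun and SGI.
RAM_GB = 64

systems = {
    "SunFire 3800": {"base": 160_000, "ram_per_gb": 1_700},
    "SGI Origin 3000": {"base": 220_000, "ram_per_gb": 900},
}

def total_cost(base: int, ram_per_gb: int, ram_gb: int = RAM_GB) -> int:
    """Base price plus RAM, ASSUMING the base price includes no memory."""
    return base + ram_per_gb * ram_gb

for name, s in systems.items():
    print(f"{name}: ${total_cost(s['base'], s['ram_per_gb']):,}")
# SunFire 3800: $268,800
# SGI Origin 3000: $277,600
```

Under that assumption the Sun and SGI end up within about $10,000 of each other once the RAM is in, so the base price alone is a poor guide.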
While Pricewatch may not be the best place for reliable vendors, Mushkin RAM certainly is! Their memory is extremely well made, stable, and relatively competitive in price. (And no, I do not work for them.)
Also, the 4GB of RAM listed there is NOT a single 4GB DIMM, but rather 4 x 1GB PC100 ECC memory.
-OZ
Typical big iron uses very custom components. For example, the Origin 3000 uses DDR-SDRAM chips, but on a memory module that also includes "directory ram" and some other oddball components. The O3K has also been using this ram since about 6 months before desktop PCs began using DDR, so they didn't really have a chance to embrace now-popular standards.
Sun's ram is similar. Sun uses slightly slower chips, but has many in parallel with some ungodly wide bit paths to and from the memory modules. Again, it's much different than what we have in our desktop PCs.
Keep in mind that these monster machines were designed with some insane requirements and low tolerance for error. As such, they often require much different components to keep everything working. Recall that most DDR-SDRAM based PC mobos up until the past few months shipped with two or three DIMM slots rather than four or eight because of timing and signal issues that hadn't been resolved yet. Big iron systems from Sun, IBM, and Compaq support over 100 CPUs, over 200 GB RAM, and do so without (many) problems. SGI's Origin 3000 supports up to 1 TB of RAM and 512 CPUs per single system (and several such huge machines can be clustered via multiple 8 gigabit GSN networking connections if a task requires insane compute power).
What surprised me is the "please, no alphas" comment. A bunch of alphas will have much higher performance than 4 itaniums. Yes, it is end of life'd, but when you are spending hundreds of thousands of dollars on your computer, worrying about the cost of porting to a new architecture when you replace this computer seems penny wise and pound foolish. Besides, as long as you are using the same OS and compiler on both old and new platforms, the "porting" should be mostly just a matter of recompiling.
If somebody had bothered to check, the man is a research scientist in computer science (look up Gary Flake) at the NEC Research Institute, with only 70 other researchers in the whole place, and let me guess that he has a lot of money to spend...
Give him a little credit that what he is doing is probably worth the time and effort he is putting into it, and be thankful that you are here to hear about such a project (if he'll only tell us what it's about...)
A surprisingly large number of responders have asked why anyone would need 64GB, and have speculated that we are either lazy, stupid, or have money to burn. While I can't reveal too much, I'll try to give you an approximate idea of the sort of problems that we are working on.
Basically, we are doing two things: graph theoretic algorithms on graphs with hundreds of millions of vertices and billions of edges, and robust classification systems trained on training sets of a similar size.
For our purposes, we need to access these data structures in a deterministic order that cannot be predicted in advance. We could cache all of our data structures on disk, but the algorithms would require years to complete because of disk seek time. Instead, we are going to try to keep a compressed version of the data in memory. The difference is that the in-memory approach will take minutes as opposed to years.
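The minutes-versus-years gap is easy to reproduce on the back of an envelope. The Python sketch below uses made-up round numbers: 5 billion random accesses (one reading of "billions of edges"), 10 ms per disk seek, and 100 ns per random memory access. None of these are figures from the actual project.

```python
# All three inputs are ASSUMED round numbers for illustration only.
RANDOM_ACCESSES = 5_000_000_000   # hypothetical: one access per edge
DISK_SEEK_S = 0.010               # assumed ~10 ms per random disk seek
MEMORY_ACCESS_S = 100e-9          # assumed ~100 ns per random memory access

SECONDS_PER_YEAR = 365 * 24 * 3600

disk_years = RANDOM_ACCESSES * DISK_SEEK_S / SECONDS_PER_YEAR
memory_minutes = RANDOM_ACCESSES * MEMORY_ACCESS_S / 60

print(f"on disk:   {disk_years:.1f} years")       # on disk:   1.6 years
print(f"in memory: {memory_minutes:.1f} minutes")  # in memory: 8.3 minutes
```

And that is for a single pass over the data; an algorithm that makes many passes multiplies both numbers by the same factor.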
And if you are thinking that we need a new algorithm, trust me on this, we are using the faster algorithm. Most of the other candidates run in exponential time.
If you want a real clue as to what we're doing, then read this. If you like what you've read and think you can help, then contact me. I am hiring.
The Computational Beauty of Nature
Perhaps an important question is *why* do you need 64GB of addressable memory? Would it be possible to rewrite some of the code such that you could get by on a few gigabytes? Is it possible to parallelize the task and spread it over several smaller systems (a cluster)?
I worked for a company which was hitting the limit on Intel processors (1GB addressable per process). In their case, they were reading in huge GIS terrain databases. A better solution was for them to read from disk as needed, rather than load the entire database into RAM. While this may not be possible in your case, it is worth considering.
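As a rough idea of what "read from disk as needed" can look like, here is a minimal Python sketch using a memory-mapped file. The file layout (one little-endian float32 elevation per cell) and the names `RECORD` and `read_cell` are invented for the example; this is not how that company's GIS code worked.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record format: one little-endian float32 elevation per cell.
RECORD = struct.Struct("<f")

def read_cell(path: str, index: int) -> float:
    """Read one record from a large file without loading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Only the pages actually touched are read from disk.
            (value,) = RECORD.unpack_from(m, index * RECORD.size)
            return value

# Tiny stand-in for a terrain database: 1000 cells, cell i holds elevation i.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(1000):
        tmp.write(RECORD.pack(float(i)))
    demo_path = tmp.name

elevation = read_cell(demo_path, 742)
print(elevation)  # 742.0
os.remove(demo_path)
```

On a 32-bit machine the mapping itself is still bounded by the process address space, so a really large file would have to be mapped a window at a time (the `length` and `offset` arguments of `mmap.mmap`). And as the original poster points out, this only helps when access is local enough that you aren't paying a seek for every read.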
MOSIX clusters handle parallelizing without extra coding, whereas a Beowulf needs specialized apps. You could run it on MOSIX just like on an ordinary machine. With a gigabit LAN, a fast SAN, and a MOSIX cluster, you are in business. If each machine is itself an Itanium, all the better.
Quidquid latine dictum sit, altum videtur