cgori · Slashdot Mirror

Re:Buying other items with small performance incre on P4 3.2GHz Reviews · 2003-06-24 05:52 · Score: 1

I'd like to know what tool they are using if they are actually multiprocessing a single job. Otherwise, they are doing what we do: using multiprocessors one CPU at a time to handle many different jobs in parallel (there is a big difference). When you do this, your total run time is dictated by the run time of the longest single job. Getting this time down requires a faster single-CPU result.
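
A rough sketch of that point, with made-up job times: when each job is single-threaded, the wall-clock time of the whole batch is bounded by the longest job, no matter how many CPUs you throw at it. The function name and the greedy longest-first schedule are illustrative assumptions, not a description of any real job scheduler.

```python
def farm_wall_clock(job_hours, cpus):
    """Wall-clock time when every job runs single-threaded on one CPU.

    Jobs are handed out longest-first to whichever CPU frees up earliest
    (a simple greedy schedule, used here only for illustration).
    """
    finish = [0.0] * cpus
    for hours in sorted(job_hours, reverse=True):
        idx = finish.index(min(finish))  # earliest-free CPU
        finish[idx] += hours
    return max(finish)

jobs = [1.0, 2.0, 3.0, 10.0]            # hypothetical run times in hours

print(farm_wall_clock(jobs, cpus=4))    # one CPU per job: bounded by the longest job
print(farm_wall_clock(jobs, cpus=64))   # extra CPUs sit idle and change nothing

faster = [h / 1.05 for h in jobs]       # a 5% faster CPU shortens every job
print(farm_wall_clock(faster, cpus=4))
```

Only the last call moves the finish line, which is the argument for paying for single-CPU speed.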

"Why not get a big Sun machine" -- obviously you have not looked at the SPECint performance of Sparc CPUs lately. Fairly old Athlons can trash 900MHz Sparcs. After long delays there are now available 1.2GHz Sparc CPUs but they get demolished by new-ish P4-Xeon boxes (and don't even get me started on Opteron, good lord!). If you don't believe me, look at www.spec.org and check out the CPU2k results. If integer performance is your thing, nothing beats x86.

Re:Buying other items with small performance incre on P4 3.2GHz Reviews · 2003-06-23 10:58 · Score: 1

"how many apps really critically need that 2% parformance [sic] increase, but do not benefit from a dual or quad-cpu machine..."

Almost everything in the field of chip design fits your description. Our software can cost $100k per year, per copy. If you can get 2% (in reality it's more like 5% for 3.06 -> 3.2 GHz) speedup, well hey, you've saved the equivalent of $2000. If the delta-price of the chip is a few hundred dollars, you saved money! That discounts any time-to-market advantage you might get (which could be enormous) -- if your schedule is a year and you can shave a few days off, that's worth money too.
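
The arithmetic above, as a minimal sketch. The $100k license and the 2% and 5% speedups are the figures from the post; the $300 chip premium is an assumed stand-in for "a few hundred dollars", and the function name is made up.

```python
def yearly_value_of_speedup(license_cost_per_year, speedup_fraction):
    """Value of a speedup, treating saved run time as saved license time."""
    return license_cost_per_year * speedup_fraction

LICENSE = 100_000     # dollars per year, per copy (from the post)
CHIP_PREMIUM = 300    # assumed value for "a few hundred dollars"

for speedup in (0.02, 0.05):
    value = yearly_value_of_speedup(LICENSE, speedup)
    net = value - CHIP_PREMIUM
    print(f"{speedup:.0%} speedup: worth ${value:,.0f}/yr, net ${net:,.0f} after the chip premium")
```

This counts a single license only and leaves out any time-to-market gain.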

And yes, our jobs do not multiprocess/multithread well. Many, many jobs are like this. That is why people buy machines like these...

Re:The Linux *kernal*??? on SCO Berates Linus' Approach To Kernel Contributions · 2003-06-18 05:51 · Score: 1

Thank you!

I was thinking that the NYT making a basic spelling error like this pretty much ends the geekdom monopoly on atrocious spelling errors.

At least ours don't get propagated to several million readers daily....

Re:IBM to SCO on SCO Amends Suit, Clarifies "Violations", Triples Damages · 2003-06-17 05:53 · Score: 1

[poker nerd]
Unfortunately that is called a "string bet" and technically not allowed if you really are playing poker. The reason being, splitting the statement in half allows you to look at your opponent's face while you are calling the original bet, look for any tell, and then dynamically resize your raise (either up or down).

Regardless of what Hollywood has told you, the correct (and allowed) way to do that is just to say "I raise 2 billion".
[/poker nerd]

Now, to find 2 billion in chips to play with...

Re:pager on fvwm Turns Ten · 2003-05-30 12:58 · Score: 1

Wow you were lucky, you had fvwm on your first Sun. I groveled through twm for years (stopping briefly with mwm, ick, and tvtwm, at least passable). By the time I installed Slackware 1.1.1 on a 486-33 and got the X series of disks loaded, I thought fvwm was the second coming of the Messiah.

And now I have it compiled on "Solaris 8" and use it instead of CDE. Yes, I'm a luddite.

Re:The inevitable '2 good songs' thread on RIAA vs The Economy · 2003-05-22 10:40 · Score: 1

Hmm, I have to respectfully disagree with the School of Fish suggestion -- Rose Colored Glasses is quite a solid song, and Speechless is excellent, probably my fave of the whole disc. Euphoria is a decent track too.

All that said, I liked _School of Fish_ so much that I bought _Human Cannonball_ without a listen -- ugh. Not a single decent track. They changed style, probably influenced by the wave of grunge in 1993, and it was not a change for the better.

An interesting side note is that School of Fish has a disc called "Back to Back Hits" with Dada for 6.98 new / 3.99 used on Amazon. It has 3 Strange Days and Euphoria (plus some other random stuff that's not bad). I know that most people would be interested at that kind of a price.

I'll nominate as one/two-hit wonders:

Semisonic _Feeling Strangely Fine_ (Closing Time)
Pet Shop Boys _Nightlife_ (Closer to heaven)
Train _Train_ (Meet Virginia)
Train _Drops of Jupiter_ (Drops of Jupiter)
Five for Fighting _Message for Albert_ (Bella's Birthday Cake)

I'm sure I could find a half-dozen more if I had my stacks of CDs in front of me.

Re:Performance: Itanium 2 vs. UltraSPARC III on Intel Reveals Itanium 2 Glitch · 2003-05-12 12:54 · Score: 2, Informative

Er, you do seem to be trolling just a bit. The US-III @ 1.2GHz achieves a base SPECint of 637, and the 1.0GHz Itanium-2 is 807. Yeah, it beats it, but trounces it? Err, well, not really.
And it's a far cry from the "order of magnitude" better performance that the grandparent post claims.

What's really funny about this post is that normally I am the one bashing Sun's CPUs... *boggle*

Obligatory AMD note: the new SPEC update today shows that a 1.8GHz Opteron SPECint base is 1081.
On a price/performance basis, I would consider that to be the trouncing chip -- maybe even in the order-of-magnitude range.

Re:Ironic? on Intel Reveals Itanium 2 Glitch · 2003-05-12 11:14 · Score: 4, Informative

I love posts that are COMPLETELY TOTALLY WRONG.

The number of states is 2 to the power of the numbers you were talking about. Even if I take the lowest number ("a couple dozen Kbytes") that you mentioned, it's 2^(2*12*1024*8) = 2^196608.

Guess what?

That's a HUGE number -- way bigger than the "billions of petabytes" you were saying is impossible to recreate for software testing. It's roughly equivalent to 10^59000 (if that somehow makes things easier for you). Of course, the "couple dozen Kbytes" is a massive underestimation of the total state of a modern CPU (100 million transistors, even just making flip-flops will give 2.5M bits of state, and for 6T SRAM more like 16M bits).
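
For anyone who wants to check the state-count arithmetic, here is a small sketch. It assumes "a couple dozen Kbytes" means 24 Kbytes of 1024 bytes each; the helper names are made up. It works with logarithms because the number itself is far too large to be worth printing.

```python
import math

def state_bits(kbytes):
    """Bits of state held in a block of the given size (1 Kbyte = 1024 bytes)."""
    return kbytes * 1024 * 8

def decimal_exponent(bits):
    """Approximate n such that 2**bits is about 10**n."""
    return math.floor(bits * math.log10(2))

bits = state_bits(2 * 12)   # "a couple dozen Kbytes"
print(f"{bits} bits of state -> about 10^{decimal_exponent(bits)} distinct states")

# The rough whole-chip estimates from the post (flip-flops vs. 6T SRAM)
for label, chip_bits in (("flip-flops", 2_500_000), ("6T SRAM", 16_000_000)):
    print(f"{label}: about 10^{decimal_exponent(chip_bits)} states")
```

Either way the exponent alone has five or more digits, so exhaustive enumeration is off the table.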

And then you have the nice problem that physics and electrical phenomena play havoc with hardware testing simulations, as opposed to software, which only has to worry about bad boolean logic.

Come talk to me next time you have to worry about alpha-particle hits changing the state of any of your code or when you care about any event with picosecond granularity (which is just about every day in hardware).

Yes, software testing has even more states to worry about, but trust me when I tell you that the hardware problem is plenty big enough to prevent exhaustive testing from being applicable. Hardware testing uses a lot of brute-force regression and detailed test planning to find and remove bugs. Software folks would do well to use such methodologies.

Re:Overheard at Intel... on Red Hat Releases x86_64 Technology Preview, GinGin · 2003-05-02 09:33 · Score: 1

except that opteron specint is an estimated 1202 for a 2.0ghz chip (which isn't shipping yet). it goes like this:

opteron 2.0ghz specint 1202 aka opteron 244
opteron 1.8ghz specint ~1100 aka opteron 243
opteron 1.6ghz specint ~950 aka opteron 242
opteron 1.4ghz specint ~820 aka opteron 241
itanium 1.0ghz specint 807
itanium 1.5ghz specint ??? (~1200 likely)

you can't buy a 244 (or even a 243) yet, so comparing it to a shipping 1.0ghz itanium is not terribly valid. comparisons to the 1.5 are probably more accurate.

please don't get me wrong, the opteron is incredible bang for the buck, we're probably going to get some dual 244s when they become available. but please get your specints right when you quote numbers.

and i bet the "8-way with off the shelf" machines using 84x chips are gonna be much, much, much more expensive than the 24x-based machines. probably getting closer to itanium prices.

Re:No - he's focused on Machine Learning and MP3s · 2003-04-21 07:12 · Score: 2, Funny

Something just inspired me to reserve grammer/grammar.

Re:"Open Systems for Open Minds" ... on The Economist on The Rise of Linux · 2003-04-13 08:01 · Score: 1

Wow. I don't know what fantasy world you live in, but it sure must be fun. You did start with a valid point, that original Suns used all-commodity everything (Motorola/BSD/VME/Ethernet), but then you faded fast.

Let's get some things straight. Andy Bechtolsheim (note correct spelling) designed the first Motorola-based Sun1 as a side project to his PhD thesis at Stanford, before he met Scott and Vinod. So I think he knew a little about the tradeoffs they were making in the systems. He's generally considered the most technically savvy of the 3/4 founders (Bill Joy would be the "unofficial 4th," and he would still be 2nd behind Andy in technical ability).

Andy also had the idea for the first SPARC-based system chip, but didn't design the processor himself. At the time it was probably a reasonable project to green-light. This was an era of 16-20-25MHz 80386s. RISC performance was potentially considerably better than commodity CISC (Moto/Intel), so while it would be expensive and difficult, it wasn't such a bad idea. The first SPARC chips were considerably better than what you could buy commodity-wise. On the IO front, SBus blew the doors off commercially available IO buses at the time, including VME (this is the ugly era of EISA and VLB, remember -- 33MHz PCI is maybe just coming onto the scene in 1992).

The switch to Solaris (as someone else points out) had nothing to do with the switch to SPARC. SunOS 4.1.3 was on every Sparc1+ and Sparc2 I used in all my time at school. Solaris really got rolling with real momentum sometime in 1994-5, and that only because Sun knew they had to end-of-life SunOS4 to force everyone to switch. Everyone really liked the BSD-isms, but SVR4 clearly has scaled up damn well.

And as far as Andy being "long gone" -- he left in 1996 after ~14 years with Sun, having had a hand in every top-selling product Sun made in that timeframe. I hate those short-timers who only spend a decade and a half at a company.

Re:Can someone answer this? on Intel's Anti-Overclocking Technology Simplified · 2003-04-11 10:44 · Score: 2, Interesting

Because the article is slightly wrong: See HannibalArs' post about it. I would trust ArsTechnica more.

Plus, if you read the patent (and I did), they are talking about using a 32.768 kHz reference from the RTC. This is a _lot_ easier to build than a stable ring-oscillator at 200.000MHz +/- 200ppm (or whatever the current reference spec is these days). The high-speed ones are nearly impossible across the range of operating points.

As the power supply voltage drifts around Vdd (either 1.8, 1.5 or 1.3V these days), and the temperature changes on the die (which can be a lot, more than 30 degC), the oscillator will give different speeds. Plus there will be manufacturing variability in the stable frequency of the oscillator across parts, even if you could somehow hold the voltage and temperature perfectly constant. That would mean that some chips would be 2.8GHz, some would be 2.795GHz, some would be 2.87GHz, etc. Actually, since they are multiplying up a reference the error will probably be much larger (I think 6x is pretty common these days, depending on the input reference; some designs have 12-13x multipliers from 133/166MHz).
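
To put rough numbers on the tolerance talk, here is a sketch that turns a ppm spec into a frequency window and back. The 200MHz +/- 200ppm reference is the example figure from above; the 14x multiplier is an assumption picked only because it lands on 2.8GHz, and the function names are made up. In this simple model the multiplier scales the absolute error in Hz while the relative error in ppm stays the same.

```python
def freq_window_hz(nominal_hz, tolerance_ppm, multiplier=1):
    """(low, high) output frequency for a reference with a +/- ppm tolerance."""
    delta = nominal_hz * tolerance_ppm / 1e6
    return ((nominal_hz - delta) * multiplier, (nominal_hz + delta) * multiplier)

def implied_ppm(actual_hz, nominal_hz):
    """Relative error of an observed frequency, in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6

# Reference held to +/- 200ppm, multiplied up by an assumed 14x
low, high = freq_window_hz(200e6, 200, multiplier=14)
print(f"reference-limited window: {low / 1e9:.5f} .. {high / 1e9:.5f} GHz")

# The part-to-part spread described above, expressed in ppm
for ghz in (2.795, 2.87):
    print(f"{ghz} GHz is {implied_ppm(ghz * 1e9, 2.8e9):+.0f} ppm off 2.8 GHz")
```

The spread described for a free-running on-die oscillator comes out orders of magnitude wider than a +/- 200ppm reference would allow.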

All this stuff combined is why typically a quartz-crystal oscillator is used.

Re:heat or capacitance? on Intel's Anti-Overclocking Technology Simplified · 2003-04-11 10:33 · Score: 2, Interesting

Erm, parasitic capacitance is inherent in silicon, it's not produced by bad etching. The reason people talk about parasitic cap so much these days is that it has come to dominate the delay equation for logic paths. This equation basically says (Sum of gate delays + Sum of wire delays + Required Setup Time + Clock Skew = Fastest cycle time).
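
The delay equation above, as a minimal sketch. All the path numbers are invented for illustration; only the form of the sum comes from the post.

```python
def min_cycle_time_ps(gate_delays_ps, wire_delays_ps, setup_ps, skew_ps):
    """Fastest cycle time = gate delays + wire delays + setup time + clock skew."""
    return sum(gate_delays_ps) + sum(wire_delays_ps) + setup_ps + skew_ps

def max_clock_ghz(cycle_ps):
    """Clock frequency in GHz for a given cycle time in picoseconds."""
    return 1000.0 / cycle_ps

# Hypothetical critical path: four gates and the wires between them
gates = [40, 35, 50, 45]   # ps
wires = [60, 80, 70, 30]   # ps
cycle = min_cycle_time_ps(gates, wires, setup_ps=50, skew_ps=40)

print(f"cycle time {cycle} ps -> {max_clock_ghz(cycle):.2f} GHz")
print(f"wire share of the cycle: {sum(wires) / cycle:.0%}")
```

Shrinking the gate terms toward zero leaves the wire, setup and skew terms setting the clock.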

As technology shrinks (0.25um -> 0.18um -> 0.13um, etc), the gate delay essentially goes to 0 (not exactly, but I'll simplify). The wire delay keeps getting larger and larger. Why? Because as the geometry decreases the width (and spacing) of the wires decreases. Unfortunately the height of the wires is mostly unchanged. As the width and spacing go down the height effect starts to dominate.

Picture the difference between two skyscrapers in a downtown city block, and two suburban estates on 2-acre plots of land. The suburbs are the older process technologies: aspect ratios are around 1:1, and the wires are very far apart. The skyscraper has 10:1 or higher aspect ratio and the spacing is far less than the height. As the previous poster points out, the capacitance depends on the surface and thickness of the layer. It also depends on the area. The two skyscrapers can "see" a lot more of each other -- this causes the parasitic cap to go up, a lot.
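
A back-of-envelope version of the skyscraper picture, using the plain parallel-plate formula for the sidewall between two neighboring wires. This ignores fringing fields and everything above and below the wires, so treat it as a trend, not a real extraction. The wire dimensions are invented, the function name is made up, and 3.9 is the relative permittivity of silicon dioxide.

```python
EPS0_FF_PER_UM = 8.854e-3   # vacuum permittivity, in fF per micron

def sidewall_cap_ff_per_um(height_um, spacing_um, eps_r=3.9):
    """Parallel-plate coupling capacitance between two wires, per micron of run length."""
    return EPS0_FF_PER_UM * eps_r * height_um / spacing_um

# Hypothetical geometries: same wire height, very different spacing
suburbs = sidewall_cap_ff_per_um(height_um=0.5, spacing_um=2.0)
downtown = sidewall_cap_ff_per_um(height_um=0.5, spacing_um=0.2)

print(f"wide spacing:  {suburbs:.4f} fF/um")
print(f"tight spacing: {downtown:.4f} fF/um")
print(f"ratio: {downtown / suburbs:.1f}x")
```

With the height held fixed, the coupling term grows in direct proportion as the spacing shrinks.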

Bad etching can make this worse, but in a well-controlled manufacturing process this variation is on the order of +/-10-20%. Really bad problems are due to actual defects (tiny bits of dust) that cause shorts or opens in the circuits, and then the part just fails completely.

Re:64-bit? Why? on Microsoft Commits to Using Opteron · 2003-04-09 10:41 · Score: 4, Informative

Everyone rails on the x86 instruction set. Yeah it's not pretty, it's not fun, hell it's downright ugly. But what are the top SPECint machines these days? Wanna guess? That means something is ok with x86. Yeah it might be hack-on-hack-on-hack but this collection of hacks seems to be working. (They'd be pretty near the top SPECfp's except for Itanium, everyone else's favorite Intel punching bag -- give me a break it has stellar FP, which is what it was made for!)

More seriously, there are some academic studies around that show that variable-length instructions of the x86 ISA actually are improving performance over fixed-length RISC-style ISAs. Why? Because the instruction density in the cache can be higher, and therefore the I-Cache fill rate doesn't need to be as high. Sure, the I-Decode is a b*tch to design and build, but apparently Intel and AMD are able to run it in about 500ps (~2GHz, or better) in 0.13u and below technology. Not bad, not bad.

Re:Copy & Paste behavior is the BEST thing abo on Significant Interactivity Boost in Linux Kernel · 2003-03-08 12:25 · Score: 1

Try the equivalent of "the other way" -- use some keyboard accelerators.

1) Select URL
2) Click somewhere in location bar, not highlighting anything.
3) Ctrl-A
4) Ctrl-K
5) Middle-click in location bar, press enter

Or, if you are using Mozilla:

1) Select URL
2) Middle click somewhere in the browser pane
3) There is no 3

(You might have to enable this under preferences somewhere -- I've had it this way for so long I've forgotten how)

Seems pretty fast to me.

Re:backwards compatibility on China's 64bit Homegrown CPU · 2003-03-06 10:37 · Score: 1

Dude, he also wrote "super-scaler" -- presumably referring to a superscalar architecture.

That should tell you how familiar with CPUs the reporter is.

Re:Who cares? on Intel To Redesign PC With "Grantsdale" Chip · 2003-02-28 07:36 · Score: 1

The key point is that PCI is a bus -- PCI Express is point-to-point. This is roughly analogous to a hubs versus switches discussion in Ethernet network design. In the point-to-point system every channel/device has its own dedicated connection -- the electrical signalling requirements are vastly simpler for such a device.

PCI Express is similar to what AMD does with LDT (HyperTransport), and what will allow them to have highly scalable MPs with Opteron/Athlon64. The LDT hubs can just be scaled up for larger systems. Contrast this with the Intel method (ia32 -- Itanium is totally different), which pretty much craps out at 4-way SMP (yeah, there are 8-way x86 boxes but they are total hacks, basically dual 4-way boxes highly integrated). In general, point-to-point is where it's at for all future system designs that need to generate any kind of performance.

Last, building peripherals in chipsets is not bad at all from a cost perspective -- the total system cost will almost assuredly go down. Highly-integrated chipsets are gonna make your motherboard dirt-cheap in the long run. All Intel is doing is gluing together lots of discrete components into one "kitchen sink" chipset. Trust me, it's less expensive to hook that stuff up on-die than on-board (in 0.18/0.13u and below, the incremental die area is near-free, plus you pay for only one IC package instead of multiple, have less impact from assembly fallout, etc). If one of these parts isn't high-perf enough for you, it should be trivial to drop-in a card to replace it. That is flexibility.

Re:Lower cost overall? on Sun To Use AMD Mobile Processor In Blade Servers · 2003-02-25 20:09 · Score: 1

"SPEC benchmarks are not the best way to show system performance. They only measure raw CPU performance, which is not very important for most applications."

But there's the rub -- for my apps CPU performance really is the most important thing. #2, as you allude to, is memory bandwidth, but it's a distant second (and the L2->L1 cache fill bandwidth on any Intel processor is still pretty damn fast). You also will not see anywhere close to the numbers (9.6GB, 33.6GB/sec) you quote from a V480. No way -- those are aggregate marketing numbers (almost as bad as when people add MHz numbers together on dual-proc systems). Another guy later on set you straight on the PCI-bus issues on PC's so I won't do that here, but suffice it to say you are off the mark.

I have done all the app-level benchmarks, measured the real wall-clock times for apples-to-apples comparisons, trust me (when you are spending multi-$100k per year on servers, and millions per year on software licensing, you better get it right). The x86 boxen blow the doors off the Suns -- at 1/3 to 1/5 of the price.

As far as a lack of education on the complexities of enterprise systems goes, I think you are barking up the wrong tree. I do chip design work, used to do it at Cisco on high-end switches, now work on SAN switches (tangent: you will never get your claimed 512MByte/sec out of a Fibre Channel port -- it's good to 1 or 2 Gbit/sec, i.e. 125 or 250 MByte/sec, assuming no protocol overhead -- crappy old PCI33/32-bit can almost drive a 1G FC HBA, and PCI66/64 or PCI-X certainly can handle a 2G HBA).
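
The bandwidth tangent, as a quick sketch. These are peak figures from clock times bus width, and raw line rate divided by 8, with no encoding or protocol overhead, which is the same simplification the paragraph above makes. Function names are made up, and 33 and 66 are the nominal PCI clocks in MHz.

```python
def pci_peak_mbytes(clock_mhz, width_bits):
    """Peak PCI transfer rate in MByte/sec: one bus-width transfer per clock."""
    return clock_mhz * width_bits / 8

def fc_raw_mbytes(gbit_per_sec):
    """Raw Fibre Channel rate in MByte/sec, ignoring all overhead."""
    return gbit_per_sec * 1000 / 8

buses = {
    "PCI33/32": pci_peak_mbytes(33, 32),
    "PCI66/64": pci_peak_mbytes(66, 64),
}
ports = {"1G FC": fc_raw_mbytes(1), "2G FC": fc_raw_mbytes(2)}

for bus, bus_rate in buses.items():
    for port, port_rate in ports.items():
        print(f"{bus} ({bus_rate:.0f} MB/s) vs {port} ({port_rate:.0f} MB/s): "
              f"headroom {bus_rate - port_rate:+.0f} MB/s")
```

On paper the 33MHz/32-bit bus has only a sliver of headroom over a 1G port, and sustained PCI throughput is lower than peak, hence "almost".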

A big part of what I do is design the simulation and synthesis compute clusters, so system-level issues are all I think about. Further, I work with the guy who designed the Ultra60 SDRAM/system controller ASIC, a couple guys from the SGI high-end servers group, and one of the guys who ran the UltraSparc IIIi program at Sun. Our systems guys are from Sun, they're the ones who designed the Sun 280R, 480 and Blade machines. They all agree (from feeling the performance) the x86 hardware is best, both on bang for the buck and on absolute performance. Our modeling guys also tell us that Athlons beat the snot out of UltraSparc IIIs for hspice circuit simulation (again, from head-to-head comparo).

As I mentioned, all of my points are true until you crack 4GB processes, then the 64-bit-ness of the Sun gives it an unbeatable advantage. Thankfully for us, that's a very small percentage of our workload, so we only need the one V480, and yeah it's a nice box. It's just overpriced and slow. An Itanium box is faster overall (even though everyone bitches how slow it is), and costs about the same right now in ultra-low volumes -- just wait till Intel turns on the faucet.

I appreciate your attempt to defend Sun's cost structure, but I think you'll find that it's not justified. Sure, they are solid boxes, but if you are willing to spend 5-6k on a decent rack-mount PC running Linux, it's going to be pretty damn solid as well. We have had many 100+ day uptimes on our x86 boxes, mostly terminated by building power episodes, or data center cooling failures. In our business (chip design), Linux on x86 will completely supplant Sun in the next 3-5 years, I suspect.

One last thing -- the part about ECC. We design in ECC all over the place in the system we are building too, because people ask for it, and it's a check-box type of item for the buyers. The only place you ever actually see bit-flips is inside memories, at the bit cell -- either SDRAM or on-chip SRAM. So protecting those is covering 90+% of the problems. The rest is probably not worth it, since the probability of failure is so low.

Thanks for reading...

Re:Lower cost overall? on Sun To Use AMD Mobile Processor In Blade Servers · 2003-02-25 07:56 · Score: 1

"I'm sorry, but your wimpy little Xeon will not keep up with these processors"

um, bzzzt, wrong answer.

obviously it depends on workload characteristics, but we find that 1.0 GHz P3's are equivalent to 900MHz US-III's (Yes, I have a Fire V480 to try this on). Those 2+ GHz P4's beat the pants off everything Sun makes. Don't believe me? Check SPEC, our workload correlates nicely to SPECint results.

If you don't want to click-through, the Sun 1.015GHz US-III benchmarks at SPECint = 516 and a Dell 6650 (w/ 2.0 GHz Xeon) runs 816. Don't even try it with a reasonably current (i.e. 2.8GHz, 533 FSB) Xeon (SPECint = 1017), the Sun will just turn into a black hole.
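
The same SPECint figures as ratios against the Sun box, for those who prefer "how many times faster". The scores are the ones quoted above; the dictionary keys and the function name are just labels made up for this sketch.

```python
SPECINT_BASE = {
    "Sun US-III 1.015GHz": 516,
    "Dell 6650, Xeon 2.0GHz": 816,
    "Xeon 2.8GHz, 533 FSB": 1017,
}

def relative_speed(score, baseline):
    """How many times faster a score is than the baseline score."""
    return score / baseline

baseline = SPECINT_BASE["Sun US-III 1.015GHz"]
for name, score in SPECINT_BASE.items():
    print(f"{name}: SPECint {score}, {relative_speed(score, baseline):.2f}x the US-III")
```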

Of course, the reason we have a V480 is because it is 64-bit -- it's for our largest-footprint computational tasks. But it costs a sh*tload, and is dog-slow. I think we can normally buy 5 dual-proc 2650's for the price of one V480. And these are fully tricked out 2650's. Just do the math and realize how screwed Sun is.

Bring on the freaking Itanium/Opteron solutions!! now!!

Re:Makes a lot of sense... on Forget Moore's Law? · 2003-02-11 08:17 · Score: 1

The problem lies in exactly what you said. Existing technology is sufficient for 99% of all needs. What do the other 1% do then? Sit around and wait?

I work in chip design -- our problems do not decompose into neat parallelizable chunks, and I'm sure there are lots of other computational problems like ours. We need big, fast machines to do our work. Most datasets are >4GB in size these days. If you try to break it up, it's a hack, and it inevitably causes much pain later in the process.

I read the article in the dead-tree edition of Red Herring 2 weeks ago and thought the author was full of hot air then. I still do.

Re:Amen. on Forget Moore's Law? · 2003-02-11 08:10 · Score: 1

That's because your dataset is smaller than 4GB.

If it was bigger, you'd see the compelling advantage rather quickly, trust me.

Re:New Hampshire? on California Considering More Internet Taxes · 2003-02-08 12:51 · Score: 1

Simple -- high property taxes. It's the way all low/no-sales-tax states make money.

Tax on Downloads on California Considering More Internet Taxes · 2003-02-08 11:51 · Score: 5, Informative

They are simply closing a (well-known) loophole.

If you buy expensive software (i.e. chip-design tools at >$100k per user) and you take delivery via FTP instead of physical media (CD/tape), you do not owe sales tax. On a big purchase (multi-million $$) the 8% is a BIG deal. It happens a lot in the Valley.

I'm surprised that it took the bureaucrats in Sacramento this long to find a revenue "source" this big.

Re:Help me! on Superbowl XXXVII · 2003-01-26 22:04 · Score: 1

Sure enough, Oakland is burning tonight. 12 cars set on fire on International Ave, one McDonalds lit up, many trash fires. Mob of ~400 people, only 32 arrests so far. Lots of police in the streets and bricks thru car/house windows.

Raiders fans are truly incomprehensible to me.

Re:This alliance should work .... on AMD and IBM Working Together on Future Chips · 2003-01-09 08:31 · Score: 1

Why "buh-buy"? [sic] Hammer is 64-bit, woo! show me the apps where this matters -- I actually use some (in the process of designing chips, no less) and they only constitute 10% (or less) of what we do. i.e. 90%+ of my job can be done in 4GB of ram or less.

I can see large databases, and certain specialized scientific apps. But that's all. Mass-market 64-bit apps? To do what? Populate the metaverse? No way, jose.

Lost in all of this debate is the fact that Intel historically has had the best-performing fab in the world (probably for the last 20 years). The yields and process control that they have are way better than TSMC or even IBM at equivalent geometries. So, if you think about the design of the P4 in those terms, it makes a ton of sense. Build an architecture that is easy to scale up as you improve your process by breaking down your pipeline into lots of really small stages. It's called "Playing to your strength" -- and it's what the winners do.

-c

PS: We use mixed Athlon/P3/P4 clusters to do our chip design. The Athlons are great, don't get me wrong. I just think that Intel will win the longer-term game because they will have it easier doing what they do in process than AMD will in design.

User: cgori

Comments · 123