Domain: hypertransport.org
Stories and comments across the archive that link to hypertransport.org.
Comments · 36
-
Re:Am I the only one who is confused...
> AMD also has HyperTransport. Maybe this was why there were rumours about Nvidia making a CPU.
Uh, seriously? NVIDIA is a founding member of the HyperTransport consortium, they should have no problem connecting their GPUs via that if necessary.
http://www.hypertransport.org/default.cfm?page=ConsortiumAboutUs -
Re:Being an innovator not always smart?
The innovator took all the costs,
Not hardly. There were a lot of other companies involved in developing Hypertransport, and Intel spent their own money to develop their alternative.
-jcr
-
Re:Intel still playing the Chuck Norris of vendors
Royalty free membership must be a bad thing?
see - http://www.hypertransport.org/consortium/index.cfm -
Re: Intel Opens Its Front-Side Bus
Didn't Asus or somebody have one of those for a P4 socket which let you put in a Pentium M (if my memory serves)? So the P4 socket was open to Asus for some reason? This is definitely competing with AMD in the HPC market where HyperTransport is aiming, making FPGAs act as co-processors. But HyperTransport's bandwidth is ~20GB/s, and the last time I checked Intel's FSB speed was still 1333MHz, which makes it at best about half as fast (~10GB/s?), if not slower. Why would I want to make something for a much slower bus if I can use a faster bus standard instead and both cost the same?
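The ~10GB/s guess for the FSB can be checked with quick arithmetic. A rough sketch, assuming a 64-bit (8-byte) wide front-side bus and reading the quoted 1333MHz as 1333 million transfers per second; the ~20GB/s HyperTransport figure is taken from the comment as-is:

```python
# Back-of-the-envelope peak bandwidth for a parallel bus.
# Assumptions (not stated in the comment): the FSB is 64 bits wide and
# "1333MHz" means 1333 million transfers per second.

def peak_bandwidth_gb_s(transfers_per_sec: float, width_bits: int) -> float:
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return transfers_per_sec * (width_bits / 8) / 1e9

fsb_gb_s = peak_bandwidth_gb_s(1333e6, 64)
ht_gb_s = 20.0  # figure quoted in the comment above

print(f"FSB peak: {fsb_gb_s:.1f} GB/s")
print(f"FSB / HT ratio: {fsb_gb_s / ht_gb_s:.2f}")
```

Under those assumptions the FSB lands a little over half of the quoted HyperTransport number, which is in line with the guess.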
-
Re:Four Cores and Seven Years Ago
It's more a case of "Not Invented Here" syndrome than AMD not being willing to license HyperTransport. AMD spun HyperTransport into its own consortium so Intel would have had no problem getting on board if they had wanted to.
-
Sorry slow to respond links below
http://www.hypertransport.org/products/productdetail.cfm?RecordID=65
PathScale InfiniBand card. Lowest latency InfiniBand networking card in existence (1.5 microseconds).
http://www.supercomputingonline.com/article.php?sid=11429
Xilinx card
Articles about HTX and 4x4 (Torrenza) tie in:
http://www.xbitlabs.com/news/cpu/display/20060607074412.html
http://www.amd.com/us-en/Weblets/0,,7832_8366_5730~109409,00.html
http://www.tgdaily.com/2006/06/02/qanda_amd_vp_randy_allen/
There are many more, but this is a start. -
Re:A port?
Looking at the doc... Wires are the same as version 2, just with a clock speed bump: 16 command/address/data links, 2 clocks, one control, two system (PWROK and RESET). Clocks to 2.6GHz for bandwidth of 20.8GB/s. It doesn't say how many ground wires are in the cable. Another slide says up to 32 signal wires and up to 41.6GB/s.
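The 20.8GB/s and 41.6GB/s figures follow from the link width and clock. A minimal sketch, assuming double-data-rate signalling (two transfers per clock) and that the quoted totals count both directions of the link; the function name is made up for illustration:

```python
def ht_bandwidth_gb_s(clock_hz: float, width_bits: int) -> float:
    """Aggregate (both directions) peak bandwidth of one link in GB/s."""
    transfers_per_sec = clock_hz * 2                     # assumed DDR: two transfers per clock
    per_direction = transfers_per_sec * width_bits / 8   # bytes per second, one way
    return 2 * per_direction / 1e9                       # count both directions

narrow = ht_bandwidth_gb_s(2.6e9, 16)  # 16-bit link
wide = ht_bandwidth_gb_s(2.6e9, 32)    # 32-bit link
print(f"{narrow:.1f} GB/s, {wide:.1f} GB/s")
```

Per direction that is half of each total, so a 16-bit link would move roughly 10.4GB/s each way if these assumptions hold.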
Here's the pdf from another post: http://www.hypertransport.org/docs/tech/ht30pres.pdf -
NOT anything like USB at all.
Whoever submitted the article doesn't understand what the external HT links are for. They are _NOT_ a replacement for USB or any other similar technology. External HT is used to link multiple chassis together to form a large SMP box. This is similar to InfiniBand, etc. This is NOT designed to be a way to just plug in a CPU to an external port. Read the pdf:
http://www.hypertransport.org/docs/tech/ht30pres.pdf -
Not to be a jerk...but this is old news....
dated from June 16, 2005
Check out the article here...
http://www.hypertransport.org/consortium/cons_pressrelease.cfm?RecordID=79/ -
Re:The story of SGI is the greatest story of...
Either your memory is clouded or your friend is an idiot. Hypertransport is 100% AMD technology. Sure, they may have worked with Nvidia for the purpose of Nvidia's particular implementation of it (AMD is very good at working with their technology partners), but Nvidia did not have a thing to do with the creation of Hypertransport. AMD based their entire K8 architecture around Hypertransport. You seem to imply that Hypertransport depends on IP rights that Nvidia owns, which is laughable. Hypertransport is an open standard managed by the Hypertransport Consortium. If Nvidia owned the rights to the technology then how come ATI, their biggest competitor, is also using it?
But since I don't expect you to take my word for it, here are some links:
"HyperTransport technology was invented at AMD"
-
Re:AMD's dual cores are great
The problem is that the socket only has enough memory bandwidth for one cpu's worth of work.
This is exactly right. It is really surprising that Intel has focussed so completely, almost obsessively, and for so long, on the problem of supplying the maximum number of work-cycles per unit of time (GHz, pipelining, Itanium's EPIC design) while seemingly paying so little attention to supply-of-work-to-do (FSB speed and architecture).
AMD has paid quite a bit of attention to the work-supply and has a much more efficiently balanced work-cycle-supply/ data-for-work design. http://www.hypertransport.org/ gives AMD a big leg-up over Intel.
If Intel fails to do something spectacular to FSB speeds, AMD is sure to continue to pull away from Intel. The more cores and threads per CPU, the greater AMD's lead over Intel will become (at least from a performance point of view), until Intel addresses this problem.
-
Re:Is this for real?
The G5 is competitive in that it matches or outperforms the AMD Opteron (that frontside bus helps)
BerryTruFax: The HyperTransport bus used in the Opteron is the same thing used in the G5 Macs and XServes. Both AMD and Apple are in the HyperTransport Consortium. -
Re:It's The Software Stupid!
Go to an Apple Store and ask people buying Macs there "Why are you buying a Mac instead of a Dell?" Do you really think that *anybody* will mention "PowerPC" vs. "x86"? People will only mention the G5 processor because it's as much a brand/trademark as it is a processor. Nobody knows/cares about the CPU architecture.
OK, you're right, 99 percent of the people who buy Macs buy them for the experience and the applications available on the Macintosh platform. But there are a number of people out there who want the G5 for its industry leading efficiency; google for the interviews with institutions who've built clusters out of G5s, like UIUC, Virginia Tech, or the US Army. The number one reason why? Because for the price, you can't build a cluster that will be faster, and run cooler, and use less wattage than the Xserve G5. These folks sure as hell care about the CPU.
You have to wonder if they talked to AMD at all. As the PC Mag article mentioned, there's never been much talk between Apple and AMD.
The G5 Power Macs and Xserves use AMD's baby, HyperTransport:
http://www.hypertransport.org/products/productdetail.cfm?RecordID=51
And all sorts of Apple products have contained AMD and Intel chips for various functions over the years. I'm sure they talk to Intel and AMD all the time. I don't really think this whole thing is as big a deal as everyone's making it out to be.
Plus AMD has been the 64 bit pioneer in the x86 world, though Apple has never really shown a real conviction about going 64 bit for their software.
Apple released their 64-bit Power Mac G5s 4 months after AMD released their Opteron. It's a year later now and where's Windows XP 64-bit edition? Or even Longhorn for that matter? Apple has a 64-bit capable OS now, and 64-bit development tools in 10.4. What makes you think they're not interested in 64-bit?
http://www.apple.com/macosx/features/64bit/ -
Re:for $1500 you can get 32GB
ermm, not quite. http://www.hypertransport.org/ - read the spec. get a clue.
-
Re:Old idea, but there are many ways to implement
I can't find a direct link. Fortunately, I've archived the web pages and will be able to put them up somewhere. The idea seems to have been to have a generic high-speed bus into which you plugged PCI, SCSI, VME or other regular busses.
The closest I can see to that on AMD's website is some vague referencing to Hyper Transport, which is a high-speed chip-to-chip communication system. My guess is that there wasn't enough interest for them to build an entire bus architecture, but that high-speed chip communication proved sellable. -
Re:Speed isn't the main reason for PCI-Express
I gotta say what you really want is HyperTransport.
But I guess an improvement is still an improvement. -
Re:New 32-way Opterons coming soon...
The new hypertransport-2 has 64 bit addresses. So Opterons will clearly be going to 64-bit addresses at the system level, even if they each only have 40-bits of live address pins for their local memory. But if people get close to putting 1 TB per CPU, you can be sure they will put power on more address pins in future chips.
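The 1 TB figure comes straight from the address width. A small sketch, assuming byte-addressable memory (the helper name is made up for illustration):

```python
TIB = 2 ** 40  # bytes in one tebibyte

def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

local_tib = addressable_bytes(40) // TIB    # 40 live address pins per CPU
system_tib = addressable_bytes(64) // TIB   # 64-bit system-level addresses
print(local_tib, system_tib)
```

So 40 address bits top out at exactly one tebibyte per CPU, while 64-bit addresses leave a very large margin at the system level.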
-
Re:Why not quad core?
apparently because of reduced bus conflicts with their individual memory spaces.
Ah but with multi-core chips they can transduce their flux capacitors with the onboard trans-mogrification controllers. Seriously "reduced bus conflicts with their memory space", what does that mean?? That's gibberish.
P4, presumably, like the P6 GTL+ host bus is a shared bus (like most buses are). Only one CPU can use the bus at any one time. If the bus does x GB/s, that's only to one CPU at any given time - effectively it is shared. Further, P6 and P4 do not have integrated memory controllers, and must access RAM via the (shared) GTL+ bus, if it is not in cache. Eg, a 4 CPU machine looks like:
P = CPU
MC = Memory Controller (part of the "northbridge" chip, also provides PCI host bus controller, etc.)
 P P P P
 | | | |
--------- GTL+ bus
    |
    MC--RAM
Also GTL+ is limited to 4 CPUs and one controller. To get 8 CPUs some controller vendors have invented a GTL+ 'bridge' to stitch 2 GTL+ buses together, but that just makes things worse really from a scalability POV I'd imagine.
The K8 on the other hand uses a point-to-point (PtP) serialish, packet based transport, HyperTransport, to interconnect CPUs and has onboard memory controller(s) (connected internally via HyperTransport links). A 4 CPU K8 machine looks like:
K = K8 CPU
HT = HyperTransport link
RAM--MC-\    /-MC--RAM
RAM--MC--K--K--MC--RAM
         |  |
         |  |
RAM--MC--K--K--MC--RAM
RAM--MC-/    \-MC--RAM
Each of the lines out of a K is a HyperTransport link. Each MC is integrated into the die itself.
Each CPU has 4 HT links, two to other CPUs, two to its (integrated on die) memory controller. For dual CPU setups, each CPU needs only one link to another CPU obviously. Indeed the difference between 2xx, 4xx and 8xx AMD Opteron CPUs is the number of HyperTransport links. Indeed in large multi-CPU (ie 8+) SMP setups one need not attach a memory controller to each CPU, one might choose to have a central "cross-bar" of fully-meshed K8s which then connect to peripheral K8s which have memory controllers and hence RAM. Tis all down to the board designers I guess. And a bit of a fun computer science problem too in terms of designing optimal 'networks' of interconnected nodes with the best compromise of maximum node to node distance for lowest number of required interconnects.
The K8 is actually a ccNUMA (cache coherent, Non-Uniform Memory Architecture) machine, in SMP configurations. Ie, different memory is at different distances to different CPUs, or to put it another way, some memory is local, other memory is distant, some memory may be more distant than other memory. Eg, for the top-left CPU to access RAM on its "local" MC is obviously potentially far quicker, in terms of latency, than to access "distant" RAM on another node, and to access memory on an adjacent K8's memory controller will have lower latency than to access memory allocated in the bottom-right CPU's RAM. A good OS aware of the issues can try to keep processes on the CPUs to which that process's memory is "local" and hence maximise performance, but it's quite a juggling act (Linux has some NUMA support).
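The local / adjacent / distant distinction can be made concrete by counting link hops. A toy sketch, assuming the four CPUs are wired in a square as in the diagram (each K linked to two neighbours); the node names are made up for illustration, and real latency depends on much more than hop count:

```python
from collections import deque

# CPU-to-CPU HyperTransport links in the assumed 4-way square layout.
LINKS = {
    "top_left": ["top_right", "bottom_left"],
    "top_right": ["top_left", "bottom_right"],
    "bottom_left": ["top_left", "bottom_right"],
    "bottom_right": ["top_right", "bottom_left"],
}

def hops_from(start, links):
    """Breadth-first search: number of links crossed to reach each node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def diameter(links):
    """Worst-case hop count between any two nodes."""
    return max(max(hops_from(n, links).values()) for n in links)

print(hops_from("top_left", LINKS))
print("worst case hops:", diameter(LINKS))
```

Swapping in a different `LINKS` table (for example an 8-node layout) gives a quick way to compare the worst-case distance against the number of interconnects used, which is the trade-off described above.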
What AMD will do for multi-core we don't know. For certain the individual cores will be connected by HyperTransport. Most likely AMD will give each core its own dedicated memory controller, which would simply make a multi-core SMP the exact same in terms of architecture as the current dual K8 architecture (ie 2xx Opteron), and hence no different in terms of bandwidth contention than existing SMP Opterons.
It will make large SMP machines a lot easier to build though. Eg -
Re:Oh this is silly
Hm. Apple PowerMac G5:
* HyperTransport
* PCI-X / AGP
* DDR SDRAM
* S-ATA
* Gigabit Ethernet
* IEEE 1394b a.k.a. Firewire 800
* USB 2.0
So, tell me, which of these, which will be the only interfaces that you can sanely use, is proprietary?
In the PC world, anything other than an Opteron machine can compare in specs. -
One can only hope
PCI-Ex
Win64
PCI Express is an Intel design bus. Win64 is an Athlon64 OS. It could be a while before we see AMD processors on PCI-Express boards.
Of course, the specs are out for HyperTransport 2.0, which is supposed to be compatible with PCI-Express. But we still need AMD to make a next generation processor with HT2. It hasn't been announced, but 2H04 maybe.
-
Re:IANAEE (I am not an electrical engineer)
HyperTransport is more than AMD. In fact, it includes Sun!
from the HyperTransport FAQ
"6. What is the current specification release?
The current HyperTransport Technology Specification is Release 1.05. It is backward compatible to previous releases (1.01, 1.03, and 1.04) and adds 64-bit addressing, defines the HyperTransport switch function, increases the number of outstanding concurrent transactions, and enhances support for PCI-X 2.0 internetworking." -
Wrong Wrong bus = RIGHT BUS!!!
Wrong bus...Ok, HyperTransport is a CPU bus to the Northbridge of the Chipset, to prove my point - can you come up with a URL to a HyperTransport NIC ?
HyperTransport(TM) Technology - Overview
Don't have to give you a URL - the hypertransport bus is backwards compatible with the old PCI bus, so PCI cards and their drivers [knock on wood] should be able to plug right in!
Best of all, HyperTransport is software and operating system compatible with the popular Peripheral Component Interconnect (PCI) interface that is commonly used in most systems today. Unlike the older multi-drop architecture of PCI that reduces throughput when more devices are attached, HyperTransport is a point-to-point line that maintains full throttle performance at all times. Point-to-point attachment also means that there is no bus arbitration overhead, keeping actual I/O bus throughput near the theoretical maximum.
And, guess what? It's had chipsets for almost a year now:
http://www.google.com/search?safe=off&q=AMD-8111+southbridge -
Re:Here we go again:
AMD's Hypertransport is basically (exactly?) the same thing.
Calling it "AMD's HyperTransport" isn't quite accurate. AMD is part of the HyperTransport executive committee, however. From The HyperTransport Consortium's About Us page:
Advanced Micro Devices, Alliance Semiconductors, Apple Computers, Broadcom Corporation, Cisco Systems, NVIDIA, PMC-Sierra, Sun Microsystems, and Transmeta are charter members and comprise the Executive Committee of the HyperTransport Technology Consortium.
That said, Apple does make use of HyperTransport technology in the Power Mac G5, as is stated on the Architecture page of the G5 site: ...as well as the HyperTransport interface that connects the PCI-X controller and the I/O subsystems to the system controller... -
Maybe, maybe not.
It's not a G5, it's a PPC970, completely different beasts.
Newsflash, kiddo: neither Motorola nor IBM sell a CPU called the "G4". "G4" was a "marchitechture" term coined by Apple in the spirit of Motorola's internal "G3" codename for the PPC750. The chip inside any "PowerMac G4" is some flavor of a Motorola PowerPC 7400, no matter what Apple calls it.
You can pretty much bet the farm that Apple will call every variant of the PPC970 they ship a "PowerPC G5".
1GHz bus? gimme a break. Intel hasn't yet reached this. Two points impossible.
Ahem. ("1ghz" is probably apple marketing-speak, but it's always been known that the PPC970 will have a stupidly fast FSB -- Intel isn't the only company that can innovate in this field, eh?)
Almost believable, but for the moment Apple are phasing out the use of NVIDIA cards in their machines.
Simply and 100% wrong. Apple has been doing pretty much exactly the same thing for the last three years on this front: providing whichever of the two offered them the best OEM pricing as the default configuration, and offering the other as a build-to-order option. They will continue to do this.
Also, Apple have a long standing habit of using Firewire instead of USB 2.0
Here, you may be correct, but there are two issues that may force them to start shipping "USB 2.0" connectors: first, the USB consortium has recently declared that all USB ports are "USB 2.0" (yes, this is weird and stupid), and secondly it's actually getting a bit difficult to source USB controllers that only support the 1.0/1.1 specs.
Once again use of the verbal "One" instead of the numeric. Only one FW800 port? Why would Apple stick with FireWire 400 anyway? I mark this impossible
FW400 and FW800 use different connectors, and there are not yet many FW800 products on the market. This is called "covering your bets" and "not pissing off your customers". BTW, 1x FW800 and 2x FW400 is also the configuration on the 17" AlBook, so they've already shipped one machine in exactly this "impossible" configuration.
optical audio in a graphics machine? I'm sorry but this sounds like wishful thinking.
No, it sounds like you have no idea what you're talking about. Do you have any idea how many macs are used in audio production? Are you aware that Apple sells their own high-end audio composition program? The only surprise about a PowerMac with optical TOSlink is that they didn't do it years ago. -
Re:Switch?
So looking at Hyper Transport, at this stage, I'm a tad leery of it because it didn't come from Apple.
Apple is a member of the HyperTransport Consortium. They have a hand in the development of the technology.
-
Re:No Gigabit Ethernet ? Yes
Not quite. Hypertransport is a high speed bus used to connect the CPU to the peripheral chips. There will still be an ethernet chip, and a firewire chip, but they will live on a Hypertransport bus rather than a PCI or PCI-X bus.
HyperTransport technology transfers data at 12.8 Gigabytes per second. It is designed to be approximately 48 times faster than PCI, 12 times faster than PCI X and 10 times faster than 4-channel Infiniband.
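The "48 times", "12 times" and "10 times" ratios can be roughly reproduced from peak rates. A sketch under assumptions: the quote does not say which baselines it used, so the figures below (32-bit/66MHz PCI, 64-bit/133MHz PCI-X, and a 10 Gbit/s 4x InfiniBand link) are guesses picked because they match the stated ratios:

```python
HT_GB_S = 12.8  # figure from the quote above

# Assumed baselines (not given in the quote), peak rates in GB/s.
baselines = {
    "PCI (32-bit, 66 MHz)": 4 * 66.67e6 / 1e9,
    "PCI-X (64-bit, 133 MHz)": 8 * 133.33e6 / 1e9,
    "InfiniBand 4x (10 Gbit/s)": 10e9 / 8 / 1e9,
}

ratios = {name: HT_GB_S / rate for name, rate in baselines.items()}
for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}x")
```

These are all theoretical peaks; what a real device sustains on any of these buses will be lower.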
The current G4 suffers from a severe bus bandwidth bottleneck. This is an on-chip problem, so no fancy peripheral chips can rectify it. This is why the current DDR PowerMacs don't see the significant benefit that DDR technology should provide. In most current P4/Athlon/G4 performance comparisons, the G4's laggard performance can be attributed much more to its poor memory bandwidth than to its core clock speed.
Although initial 970 core clock speeds don't seem to be significantly greater than the current G4, its peripheral interface bandwidth is lightyears ahead. Hypertransport would help the 970 sing, significantly improving its throughput. Hypertransport would be wasted on a G4. It would be like having a superhighway run by your city, but your on/off ramps are potholed dirt tracks with metering lights. -
Except that...
...Apple and AMD are members of the HyperTransport consortium (not to mention have long been known to have been collaborating on this), and Intel is not.
-
I wonder...
...did Intel come up with that name in response to AMD's Hypertransport bus architecture, or did they independently decide that the Xeon needed something hyper?
-
Daddy, can I be an analyst too?
...IDC analyst Roger Kay said Monday. Kay is unimpressed with the promise of Firewire2. "I don't know what we need it for. FireWire is really fast already, and data is only as fast as your slowest link -- your PC or your modem or cable line."
Is this for real? "Umm... faster disks, no, nobody needs faster disks." Followed by, "and 640K is plenty!"?
Firewire2 is only as fast as ATA/100, which is already being superseded by Serial-ATA. That it can guarantee those transfers on a long-cable daisy-chain or star bus is why it's amazing. If PCI has trouble feeding it, we have a 6.4 GB/s bus on the horizon for this summer. This spec is going to have to last at least 4 years.
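The ATA/100 comparison is a simple unit conversion. A quick sketch, assuming "Firewire2" means the 800 Mbit/s signalling rate and ignoring protocol overhead on both sides:

```python
def mbit_to_mbyte_per_sec(mbit_per_sec: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbit_per_sec / 8

firewire2_mb_s = mbit_to_mbyte_per_sec(800)  # assumed 800 Mbit/s
ata100_mb_s = 100.0                          # ATA/100 peak rate
bus_mb_s = 6.4 * 1000                        # the 6.4 GB/s bus, in MB/s

print(firewire2_mb_s, ata100_mb_s)
print(f"bus headroom: about {bus_mb_s / firewire2_mb_s:.0f}x")
```

On those numbers the two peak rates are the same, and the 6.4 GB/s bus would have plenty of room to feed it.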
And who's implementing virtual memory over modem lines these days? What is this guy talking about? -
Hypertransport
You may be interested to read about the HyperTransport capabilities of the chip at http://www.hypertransport.org
One thing I found particularly interesting was the SMP abilities of the AMD, through the use of Hypertransport. It allows multiple chips to be used on the same board without all the glue logic normally associated with SMP setups, so you can have arrangements like the Power4 and suchlike, without enormous amounts of additional circuitry.
Funky stuff -
Re:from the horse's mouth
You also see hints of HyperTransport, the 6.4 GB/s system bus that Apple, AMD, Nvidia and others have been working on together. The article states that the chip can handle bus speeds up to 900 MHz and I imagine it can handle DDR properly, unlike the current Motorola chips Apple is using.
I am excited about these chips but will be buying a new Mac soon as I can't put up with my Beige G3 any longer to wait for these :(
But on the bright side, I can upgrade in another 2 years after they work all the kinks out of this new chip! -
*Full* article text follows (part 3 of 3)
(Sorry, the benchmarks pages are pretty much worthless without the graphs. If it makes you feel any better, I didn't see them either:-)
After Effects Pt. 1
Adobe After Effects 5.5 software delivers a set of tools to produce motion graphics and visual effects for film, video, multimedia, and the Web whether working in a 2D or 3D compositing environment. After Effects is a main creative program and works in concert with Adobe Photoshop, Illustrator, Softimage and a Media100 non-linear edit suite.
A user interacts with Adobe After Effects through the GUI and produces finished work by rendering a project to the hard drive. The list of effects and elements that After Effects can produce is far too long to summarize accurately, but its palette of tools is certainly extensive. Therefore After Effects can make a myriad of simultaneous different demands on the CPU/GPU/RAM systems. To demonstrate the benefits of different hardware components, a real world After Effects project consisting of compressed and uncompressed video, EPS, internally generated and PICT text elements, transitions, size scaling, shadows, and treatments was chosen as the test project. Benchmark programs may examine individual demands on a system but in the real world this may not be the case and it is important to measure the results of simultaneous varied demands as well as one specific measurement task. It may not be a standardized test but it shows what to expect from a project that encompasses a lot of different tasks simultaneously.
After Effects primarily uses the processor, video card and ram while a user is working within a composition window and timeline. Adjustments to a project are displayed in real time in the composition window. The faster each of the individual hardware subsystems are the smoother the interaction and the faster the composition window will be redrawn.
When After Effects renders or builds the finished timeline it is the processor, RAM and hard drive that determine speed as the video card is more or less bypassed. After Effects will call to the disk for information for the processor to calculate a finished frame and then return that frame to the disk for storage. This process repeats for as many frames as are within the timeline. Remember that any video is a series of still frames and After Effects builds each single frame and glues it to the next to finally end up with a playable movie or, conversely, a sequence of files.
Ram is an important consideration with Adobe After Effects. It will function effectively with only 512 MB of ram but more ram is better. Adobe recommends using the following formula to calculate the amount of ram it needs to preview a composition.
[(height x width x (bit depth / 8) x frame rate x (resolution)^2) / 1024] / 1024 = MB/sec.
The variables for height, width, frame rate, and resolution depend on the composition setting. Always use the maximum expectations to determine a base of RAM requirement. For example a preview of 10 seconds in NTSC broadcast format would plug into the equation thusly:
[(640 x 480 x (32 / 8) x 30 x (1)^2) / 1024] / 1024 = 35 MB/sec.
One then should come to the conclusion that 350 MB of available RAM would be needed to preview 10 seconds worth of an After Effects timeline.
That couldn't be more wrong.
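For reference, the rule-of-thumb formula quoted above is easy to reproduce. A minimal sketch (the function and parameter names are made up for illustration); as the article goes on to argue, this is only the naive uncompressed estimate and not a prediction of what After Effects will actually use:

```python
def preview_mb_per_sec(width, height, bit_depth, frame_rate, resolution=1):
    """Uncompressed data rate in MB/sec, per the formula quoted above."""
    bytes_per_sec = height * width * (bit_depth / 8) * frame_rate * resolution ** 2
    return bytes_per_sec / 1024 / 1024

rate = preview_mb_per_sec(640, 480, 32, 30)
print(f"{rate:.1f} MB/sec")
print(f"{rate * 10:.0f} MB for a 10 second preview")
```

The result lands within a couple of MB of the article's rounded 35 MB/sec and 350 MB figures.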
After Effects Pt. 2
How much RAM is needed is dictated by the information in the picture and the compression codec used. For example; 30 seconds of a 640x480 white page will take up much less RAM than a video of a stock car race. That's because more information must be stored about the color changes during moving video. A white page is just that...white...and the program will figure out quite quickly that it can save time by repeating the same information about pixel color instead of storing unique information about each one. How much required RAM depends on the variety of color, how often each pixel changes and the particular compression codec used. This will be the only time where a strong recommendation is made. Get at least 1 GB of RAM to make the After Effects experience more enjoyable. Get more RAM if it is expected that there will be a need for longer previews or work in D1, HDTV or widescreen format.
There will be two benchmark measurements to identify the benefits of different processor components. The bars on the left feature a small jump in AMD processor speed to demonstrate how more CPU horsepower will speed up the CPU/GPU/RAM dependent RAM PREVIEW. The bars on the right demonstrate a small increase of CPU horsepower and its effect on rendering speed. This should help determine if the increase you are considering is worth it.
The results do show the greatest impact in each of the two major functions of After Effects. A small increase in CPU does have a big effect on rendering speed but not on real time RAM preview. When designing a system on a budget it is important to identify what is expected and, if budget restrictions require a choice, then the desired balance between expectations must be sought to favor either user interactivity or speed of rendering. Also anticipate that longer RAM previews or larger format previews require more RAM. You can literally watch the RAM fill. Just leave task manager open and watch the page file usage creep up. The goal in making purchasing choices is to work backwards from what you expect in the end result.
SoftImage XSI
SOFTIMAGE|XSI v.2.0 is an incredibly powerful 3D tool that has the capacity to bring virtually any system to its knees especially if raytracing, radiosity or photon-mapping is used to a large extent. If this is the case then there definitely will be a loud scream of anguish coming from a solitary PC system. Softimage projects can become so system intensive that 100 finished frames can take an insane amount of time to render. In order to increase rendering speed many computers are equipped with specialty hardware and are tied into render farms in the single-minded task of rendering a single scene.
That's enough of the fire and brimstone about complex 3D rendering. Softimage works on a somewhat similar principle to After Effects. A faster and more powerful video card will translate to a smoother interface where complex scenes can be manipulated in real time. Note that, unlike After Effects, Softimage does not have an interface to preview a finished frame in real time. Users can manipulate objects in a choice of views from wire frame mode to simulated real-time shading mode. In order to look at a finished frame a user must render the frame to disk, which bypasses the GPU. A faster processor will result in a faster render. The amount of RAM is not as great an issue as the user is working frame by frame and the graphics card is doing the bulk of the work while working within the GUI.
This is a most basic overview and there are specialty hardware components that can enhance the speed and interactivity of complex 3D scenes and programs. The designers working on the test system use Softimage on a less complex level to provide enhancements and elements to commercials, promos and station ID elements. Though their work is quite complex to some, it is a far cry from that of special effects in major film productions.
Speeding up Softimage requires thinking on the same two levels as After Effects. The Softimage GUI can display very complex and varied effects but it does so in simulated mode. Displaying the finished rendered product in real time is beyond the capacity of most video cards. But there needs to be the hardware features on the video card to accommodate for smooth manipulation of 3D objects and the proper display of simulated effects. Softimage, AutoCAD and various other 3D programs need to access those hardware features in order to function and display the image properly. Don't think that a fast gaming card comes with these physical hardware features or have those that are onboard...unlocked. Remember that last word as it comes to make sense later.
It is quite true that a fast gaming card will be a poor performer in Softimage if it works at all. Conversely a workstation class video card may make for an enjoyable user experience in a complex 3D application but will deliver lower frame rates in games. It is safe to say that different applications require different hardware tools. Matrox provides some insight to the locking and unlocking of features on video cards.
The Parhelia workstation solution differs in no way shape or form to the retail Parhelia, which does go against the grain. Competitors tend to artificially inflate prices for their workstation products by unlocking features even though the chip may be identical to or based upon the same technology as their retail offerings. This so-called feature locking doesn't occur with Parhelia when compared to our retail solution. This is the key point here, so whether you are a prospective Parhelia client, purchased a retail board or have one integrated in your system, all Parhelia boards have access to the same workstation functionality and Surround Design support.
Softimage, by default, is designed for a single monitor interface yet the layout can be customized for dual and even triple monitors. It was most interesting to hear the comments made when the designers started to spread their workspace out to the second monitor and then to the third. Since Softimage bypasses the video card in the render process there was no performance loss.
A simple animation of 100 frames in length was rendered out with two different processors. As rendering in Softimage relies upon the processor most, a faster processor should result in a faster render. The animation data is as follows:
You may ask if this is any good? Just for laughs I let the art director take the project to his dual Xeon 1.8 GHz nVidia Quadro driven power box and he did beat the time by a full 10 minutes. He also beat the price by a full $3000 (cost of purchase of art director's system vs. article test system). Somehow I'll wait the 10 minutes and keep the 3 grand in my pocket. A single Xeon 450 with a Quadro card takes over 3 hours. Those numbers are completely unofficial but it lets you see the range of performance.
Benchmarks Pt. 1
Before the benchmark
Benchmarks are a yardstick we use to measure performance. Not one benchmark stands above the rest as the de facto tool. Benchmarks are useful to identify major performance problems in a system. They can also be used to identify the impact of hardware changes on overall system performance. This is very useful especially when combined with the software expectations. A faster processor may deliver faster renders but not help with a smooth GUI. A better video card may deliver a smoother interface but won't help if long ram previews are required. The performance enthusiast and overclocking crowd are edging each other by a handful of points or frames. Remember this as you look at graphs and charts. Don't look at just who's in front but also by how much, both in points/frames and cost.
3D Mark 2001 SE
The granddaddy of benchmarking tools, measuring how effectively a system runs 3D graphic applications. Moving from the 1900+ to the 2100+ showed only a small increase in performance. This isn't critical for workstation applications but may be the goal of gamers to squeeze every frame per second gain from their systems.
Sisoft Sandra
Small increases in processor speed appear to have the greatest impact in Sandra's multimedia benchmark.
Benchmarks Pt. 2
GLExcess
Quake IIIArena
Serious Sam the Second Encounter
Business Winstone and Content Creation
Benchmarks Pt. 3
Code Creatures
Comanche 4
DroneZ high quality.
SpecviewPerf 7.0
This benchmark really tests OPENGL performance and it is important to note that there is a large discrepancy between our results and the results from Matrox on their test system. We are investigating this. (Our system scored much lower.)
PSBench
We added a new benchmark to our tests. PSBench looks at 21 individual tests in Photoshop 7.0 and the results can be looked at individually or as a cumulative score. There are three levels to PSBench; basic, intermediate and advanced. This test shows the results of the intermediate tests.
Media Cleaner Pro
Three tests were conducted to compress a 651 MB 640x480 NTSC Quicktime file. The larger the file, the more a faster processor is going to help.
In the Driver's Seat
Who's in the driver's seat?
If you were to be put into the driver's seat of a race car would you be able to win a race against a professional driver in the exact same car? Probably not given the fact you don't know how to properly drive a race car.
Computer hardware is just that...hardware...and it can't do anything without being told how to do it. While hardware itself does go through advancement cycles as new technology emerges into mainstream it isn't worth much if it doesn't work or work well. Driving the consumer PC market forward are games. Gaming video cards have fallen into 3-month product cycles with new versions being announced before the prior has even hit store shelves.
A comment from Mark Randall of Serious Magic in a TechReport article piqued interest to look beyond the hardware for performance.
The problem isn't the hardware, it's the software drivers. In fact, the speed could be dramatically increased with revised software drivers. However, no manufacturer has presently made this aspect of driver performance a priority. The first card manufacturer to address this issue would deliver the following benefits to their users:
Mr. Randall goes on to state that software drivers, properly addressed, could reduce render time, record game play in real time, capture motion images off the desktop or even stream video out to the internet directly from the video card.
Drivers can indeed be a problem. Ask anyone who's experienced a Blue Screen of Death (BSOD). The $64 question is about the driver itself. Are we, the persistent purchaser of PC parts, being cheated out of performance that could be ours without a hardware upgrade?
A graphics card is built on the power of the Graphics Processing Unit (GPU). This is a processor chip and it doesn't make financial sense to reinvent the chip each time a new video card is released. This is the same for CPUs. The AMD Thunderbird chip scaled all the way up to 1.4 GHz before the Palomino core took over till 1.77 GHz and now the Thoroughbred core extends the range past 2 GHz. The same can be said for INTEL PII, PIII and PIV architecture.
The point is that features are either locked or unlocked on some graphic cards and the differences between adjacent levels of product may be very subtle; as subtle as a fresh set of tires and a tweak to a spoiler setting may make the difference between winning and losing the race.
If all the hardware is available then how visually enjoyable or complex that game may be, how fast a render is or if the card can support the software itself may come down to what features are hidden. Case in point; in the early stages of this article Matrox was developing and refining drivers for the Parhelia with such software applications like Softimage. Use the official 2.31 drivers with the Parhelia and Softimage won't recognize the card as an OPENGL card and won't access those OPENGL features and not perform as expected. One or two driver revisions later and Softimage is happy.
But it isn't as easy as that. Between the software and associated drivers and the hardware is the Application Programming Interface (API) layer. Hardware and software speak two different languages and they need some way to properly communicate with each other. Explaining the API is a fairly complex matter but think of computer hardware as your body. The software resides in your head as a desire to do something like walk, talk, run, jump, or eat. Between that software thought of wanting to walk across a room to pick up an apple and take a bite out if it and the mechanical act of actually doing it is a series of hidden instructions that just happen. You don't really think about activating individual muscles to tighten and loosen on that incredibly precarious journey of balance as you stride across a room. You don't actively plan and coordinate in 3D space the relation of the apple to you or to your hand and then calculate placement and pressure required to take a bite. These things you just...do.
The API acts in a similar fashion, taking what the software wants to do and translating it to the hardware to do it and then returning the result back to the software to display. The most recognizable examples of API layers would be Microsoft's DirectX and OPENGL but other software can have its own proprietary API layer, as is the case with ADOBE and their programs. DirectX and OPENGL take interesting approaches to 3D graphics and each has its inherent advantages and disadvantages, and they can be more than just coding issues...they can be political. For a far superior explanation I suggest a visit to www.jakeworld.org to read an article by guru game programmer Jake Simpson on Graphical API History.
Drivers are much more complicated than one might think. They can be a proverbial house of cards. Each game, application, tool, player and so on interacts with the video card in a subtly different way. Drivers are initially designed to work with everything but may not work to their fullest potential. That's where optimization begins and the people who build drivers begin the task of figuring out what enhancements or tweaks can be made to their drivers in order to gain performance and stability. This must be done one program at a time and there is an extensive list of programs. Just think of how many games there are then begin the task of trial and error to get the best performance out of each individual game.
It isn't as simple as taking those individual driver enhancements and putting them into one set of drivers. A tweak in one enhancement can cause another tweak to turn into a problem and fixing that problem can create four others. Drivers are a balancing act between performance, stability and cost. It's almost an unobtainable triangle. Achieving performance and stability takes an unlimited pot of R&D money. Achieving great performance may cause instability. Achieving stability may cost performance.
And around it goes.
So hardware manufacturers strive to achieve balance by designing their product to fit a niche purpose. It would take too much time, effort and money to build the fastest, most stable gaming/workstation/single monitor/dual monitor/triple monitor/multimedia/digitize/output video card. It can be done but the cost of the product would be 10 times an unacceptably high price.
Manufacturers in the video card market choose where their priorities are based on what market they want to capture. Gaming cards and their drivers are optimized for games with lesser emphasis on workstation applications. Workstation video cards are optimized for the reverse. Let's face it. There's more money to be made in gaming cards than the workstation cards.
If, for the most part, the hardware can support significant performance improvements then is it the fault of the API, software or drivers and are we being cheated? This brings us back around to Stephan Schaem, Chief Technology Officer of Serious Magic.
In some cases, card manufacturers have chosen to differentiate their 'consumer' vs. 'professional' cards by introducing essentially identical cards with different firmware and software drivers. The manufacturers state that the additional cost of the pro product goes to fund development of advanced driver features that are particularly useful in production environments. The issue that Serious Magic has focused on is a different one. It's a significant issue in PC graphics card performance but we don't believe it was an intentional omission.
In a nutshell, here's the issue. While today's graphics cards can render images very quickly, the software drivers are painfully slow at getting rendered output back over the AGP bus and into the PC where it could be saved and put to work by users. Current generation software drivers achieve only a fraction of the theoretical download transfer speed that the hardware you've already paid for is capable of. It's remarkable that a graphics card with a video input and some video recorder software can record TV-quality images to the PC hard disk in real-time, yet the same card can't record its own renderings at even 1/10th this speed. Serious Magic has made a benchmark which demonstrates this problem freely available on our website:
www.seriousmagic.com/3D-Dloadbenchmark.zip
The problem isn't the hardware, it appears to be the software drivers. This is supported by the fact that the external video input to a VIVO-enabled graphics card can be moved over the AGP bus very quickly. Also, some software drivers under Windows 98 are able to move the rendered output very quickly. However, in all cases under Windows 2000 and XP the speed of transferring the 3D rendered results of the same card is very, very slow. It seems that the speed could be dramatically increased simply with revised software drivers. While this is a significant issue for many business, educational, production and scientific tasks, it is not a feature that gamers are clamoring for (although it would make capturing movies of game output faster, this is not as coveted as a higher frame rate). We believe that this is why no manufacturer has yet made this aspect of driver performance a priority. Even the more expensive cards with drivers targeted at the professional market are equally poor at this task. Hopefully, with the game market rapidly reaching saturation, manufacturers will realize that the growing business, educational, production and scientific markets can be substantial. Although each of these markets may be small when compared against the game market, when combined they can add up to meaningful numbers.
And don't tell me there's a difference between drivers. Here is an example of the same system benchmarked the same way except for the change in video card drivers.
Speed! I need more speed Scotty!
What does the future hold? Processors, graphic cards and RAM are edging upwards in speed and bandwidth. The 3GHz mark is within reach for both AMD and Intel. Matrox opens up a huge 17.6 GB/s pipe with the Parhelia and DDR ram is bumping up the performance ladder as seen in the table.
Memory name   Type name   Clock speed   Voltage   DDR clock speed   Data bus & bandwidth
PC100         -           100MHz        3.3v      -                 64-bit, 0.8GB/s
PC133         -           133MHz        3.3v      -                 64-bit, 1.05GB/s
PC1600        DDR200      100MHz        2.5v      200MHz            64-bit, 1.6GB/s
PC2100        DDR266      133MHz        2.5v      266MHz            64-bit, 2.1GB/s
PC2700        DDR333      166MHz        2.5v      333MHz            64-bit, 2.7GB/s
PC3200        DDR400      200MHz        2.5v      400MHz            64-bit, 3.2GB/s
PC4200        DDR533      266MHz        2.5v      533MHz            64-bit, 4.2GB/s
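The bandwidth column in the table follows from the bus width and the effective transfer rate. A minimal sketch of that arithmetic, assuming decimal gigabytes and a 64-bit bus throughout (the function and dictionary names are illustrative, not from the article):

```python
def peak_bandwidth_gb_s(effective_mhz, bus_bits=64):
    """Theoretical peak in GB/s: bytes per transfer times million transfers per second."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * effective_mhz / 1000

# Effective transfer rates in MHz, taken from the table above.
# SDRAM moves data once per clock; DDR moves data twice per clock.
EFFECTIVE_MHZ = {
    "PC100": 100,
    "PC133": 133,
    "PC1600": 200,
    "PC2100": 266,
    "PC2700": 333,
    "PC3200": 400,
    "PC4200": 533,
}

for name, mhz in EFFECTIVE_MHZ.items():
    print(f"{name}: {peak_bandwidth_gb_s(mhz):.2f} GB/s")
```

These are theoretical peaks; the figures in the table are rounded, and real-world throughput is lower.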
Today's ultra-powerful CPUs, GPUs and RAM are tied to a proverbial boat anchor. It's the motherboard with its inherent latency and bottleneck problems. Further to that is the I/O rate of the hard disk or how fast data can be lifted from or stored to the platters.
A way to increase After Effects render speed is to increase disk speed and this is accomplished by moving to a SCSI disk array. Unfortunately, within the restrictions of a home buyer's budget it would push the cost above an acceptable level. SCSI disks have a greater throughput of data than IDE disks. ULTRA160 SCSI disks deliver a maximum 160 MB/s and the newer ULTRA320 SCSI deliver 320 MB/s. The less expensive IDE drives can move data at a maximum of 100 MB/s (ATA100) or 133 MB/s (ATA133). We all know that actual performance with either SCSI or IDE is significantly less than theoretical boasts. Any of these disks in an array can further enhance performance with SCSI arrays reaching upwards of a theoretical 500 MB/s. Processors can handle a greater amount of data in After Effects but must wait around for the data to exchange with the hard drive.
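To put the interface figures in perspective, here is a rough sketch of how long it would take to move a file at each theoretical maximum. The 651 MB size is borrowed from the Media Cleaner Pro test file purely as an example, the names are illustrative, and real drives deliver well under these rates:

```python
# Theoretical interface maximums in MB/s, as quoted in the paragraph above.
PEAK_MB_S = {
    "ATA100": 100,
    "ATA133": 133,
    "Ultra160 SCSI": 160,
    "Ultra320 SCSI": 320,
}

def transfer_seconds(size_mb, rate_mb_s):
    """Best-case time to move size_mb megabytes at rate_mb_s megabytes per second."""
    return size_mb / rate_mb_s

FILE_MB = 651  # example size only

for name, rate in PEAK_MB_S.items():
    print(f"{name}: {transfer_seconds(FILE_MB, rate):.2f} s")
```

Because a render reads and writes frames continuously, even small per-frame savings add up across a long timeline.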
CPU, GPU, RAM and hard drives work through the motherboard and therein lies the bottleneck. CPU, GPU and RAM may be able to accept and shovel out information with great speed and in huge gulps but the problem is that the pathway between components is relatively small and not nearly as fast. It's like trying to drain or fill a swimming pool with a garden hose. A solution is to get a heck of a lot more garden hoses or a bigger hose.
Both AMD and INTEL are backing solutions and each in their own way. AMD brings HyperTransport with a bigger hose and INTEL counters with the many hose analogy for PCI-Express, formerly known as 3GIO. INTEL also is deep into it with Infiniband. Infiniband is more of an outside of the box solution providing for reliability, availability, scalability and performance gains between data centers, such as server disk arrays. It isn't paramount to this article but worth mentioning as it will have an impact on how fast two systems can talk to each other. Both AMD and INTEL have the same goal to increase the amount and speed at which data moves through a system or device.
HyperTransport
Chipset? Who's got the chipset?
AMD HyperTransport Technology-Based System Architecture should be thought of on two levels; within the specific component and between components. In other words HyperTransport technology, when applied to a component such as a processor, can raise the bar on how fast it can complete an operation or how much it can process at any given time. HyperTransport, when applied to the pathway between components, increases the amount of data (bandwidth) and reduces the time for it to get around (latency). HyperTransport allows for the pool to drain or fill faster due to a very much larger hose.
HyperTransport promises some pretty hefty improvements to loosen the noose on bottleneck I/O problems. HyperTransport technology is used to provide high-performance interconnects between integrated circuits that comprise the system's core. Peripheral device interconnect is provided by existing industry standard busses such as USB, IEEE-1394, IDE, SCSI, Serial ATA, etc. In other words AMD is aiming to provide a large bandwidth, high speed platform. AMD makes the HyperTransport technology available and leaves the rest up to the other manufacturers. This may mean a bigger, better, badder motherboard.
HyperTransport Technology
HyperTransport technology is an advanced high-speed, high-performance, point-to-point link for integrated circuits. HyperTransport provides a universal connection that is designed to reduce the number of buses within the system, provide a high-performance link for embedded applications, and enable highly scalable multiprocessing systems. It was developed to enable the chips inside of PCs, networking and communications devices to communicate with each other up to 48 times faster than with existing technologies.
Compared with existing system interconnects that provide bandwidth up to 266MB/sec, HyperTransport technology's peak bandwidth of 12.8GB/sec represents better than a 40-fold increase in potential data throughput. HyperTransport technology provides an extremely fast connection that complements externally visible bus standards like the Peripheral Component Interconnect (PCI), as well as emerging technologies like InfiniBand. HyperTransport technology is the connection that is designed to provide the bandwidth that the new InfiniBand standard requires to communicate with memory and system components inside of next-generation servers and devices that may power the backbone infrastructure of the telecom industry. HyperTransport technology is targeted at the networking, telecommunications, computer and high performance embedded applications and any application in which high speed, low latency and scalability is necessary.
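The "up to 48 times faster" and "better than a 40-fold" claims can be checked against the two quoted figures. A small sketch, assuming decimal units (1 GB/s = 1000 MB/s); the names are illustrative:

```python
BASELINE_MB_S = 266         # existing interconnect bandwidth quoted above
HYPERTRANSPORT_GB_S = 12.8  # HyperTransport peak bandwidth quoted above

def speedup(new_gb_s, old_mb_s):
    """Ratio of a GB/s figure to an MB/s figure, using 1 GB = 1000 MB."""
    return new_gb_s * 1000 / old_mb_s

ratio = speedup(HYPERTRANSPORT_GB_S, BASELINE_MB_S)
print(f"HyperTransport peak is about {ratio:.1f}x the baseline")
```

Both figures are peak numbers, so the ratio describes potential throughput rather than measured performance.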
The AMD-8000 (HyperTransport) series of chipset components stack up to some large numbers promising a peak throughput of 12.8 GB/s.
AGP 8X doubles the bandwidth moving peak transfer rate up to the 2.1 GB/s notch.
PCI-X (not to be confused with PCI-Express) significantly improves data transfer rates from 100 and 133 MB/s all the way up to nearly 1 GB/s peak data transfer.
USB 2.0 allows exterior USB peripherals to access the system via a 480 Mb/s (60 MB/s) pipeline.
It's a very simplified explanation but it means that PC systems have the potential to make rather large performance jumps in the relatively near future. HyperTransport technology is a reality as evident by nVidia's nForce chip but don't expect full featured HyperTransport motherboards to find their way onto store shelves for some time to come.
More on HyperTransport technology can be found at the website and in an AMD white paper.
PCI-Express
All aboard the Express!
INTEL stands behind PCI-Express and Infiniband. The performance gains have been staked even higher than HyperTransport with an initial offering of 2.5 GB/s/direction up to a projected advance to 10 GB/s/direction and beyond. It appears that PCI-Express is initially designed to fit into the existing box and Infiniband is designed for improved connectivity out of the box such as connecting server data centers.
PCI Express architecture is described as a high-speed, general purpose serial I/O interconnect that provides the bandwidth for current and future applications. After reading about PCI-Express it is almost impossibly difficult to sum up this technology into a single sentence but the PR team managed to do so with a collection of words that commits to nothing yet sounds exciting. Nonetheless, PCI-Express has the same goal as AMD with one major difference. PCI-Express has been designed to fit with present technology. It also partners well with Infiniband.
HyperTransport is a new chipset entirely; thus, as an example, a brand new motherboard would be required. It is up to motherboard manufacturers but in order to satisfy consumer demand there may come a time when motherboards feature a PCI-Express port as an option to add PCI-Express components. This may happen at roughly the same time that HyperTransport motherboards enter the marketplace. It's debatable which is the best approach. Is bolting new technology onto the current platform the better route or is it best to start from an entirely next-gen platform?
A further question arises about data transfer to and from the hard drive platters. To get faster data transfer the disk needs to spin faster or the data algorithm has to be more compact or a combination of both. There comes a limit to how small the data can be made. Seagate explains;
Today, as the magnetic particles that make up recorded data on a hard disk drive become ever smaller, we are approaching a point where the data bearing particles are so small that random atomic level vibrations present in all materials at room temperature can cause the bits to spontaneously flip their magnetic orientation, effectively erasing the recorded data. Magnetic recording scientists and engineers have calculated that this so called superparamagnetic effect may become a serious technology issue for new products in only two or three years.
But as soon as it is said that it can't be done...
Seagate has decided to use a HAMR to cram more and more bits of information per square inch into hard disc drives, pushing the limits of magnetic recording even further beyond what was ever thought possible. The Company today demonstrated its revolutionary Heat Assisted Magnetic Recording (HAMR) technology, which records data magnetically on high-stability media using laser thermal assistance.
HAMR, combined with self-ordered magnetic arrays of iron-platinum particles, is expected to break through the so-called superparamagnetic limit of magnetic recording by more than a factor of 100 to ultimately deliver storage densities as great as 50 terabits per square inch. This will provide the capability for people to store the entire printed contents of the Library of Congress on a single disc drive in their notebook computers.
Hard drive space has increased at a phenomenal rate over the last 5 years. It used to be that 270 MB was considered a big disk and now 80, 100, and 120 GB drives are commonplace. (270 MB is less than one percent the size of a 120 GB hard drive.) Space increases and falling prices keep the consumer happy but what happens when the consumer turns their attention away from processor speed and disk space?
PCI Express and HyperTransport bring the promise of faster productivity on the computers that we work with today. This will buy time until hard drives become something more than they are and perhaps less integral to the real time operation of a system. Fitting the multitude of software and hardware architecture together into a coherent working solution may take time but it is on the horizon and we'll witness some form of its arrival sooner than later.
And where will it stop? Will we expect real time renders or projects rendered faster than real time? In whatever form it happens to finally evolve into next generation technology could make today's super fast PC the 486 of tomorrow.
More on PCISIG can be found at their website and this FAQ. Also look to the other white paper on 3GIO. Infiniband information can be found at the website and in the FAQ.
Conclusion
Workstation class PCs were always thought of as very expensive and powerful beasts affordable to only those with deep pockets. Everyday a new piece of hardware comes onto store shelves and if properly picked can make for some formidable computing power at very affordable prices. You don't need the best of the best hardware to do the work. Perhaps that diamond tipped, gold plated shovel isn't needed in the garden when a plain old spade will do the job just as well.
I commend those who waded through this. PC configuration is like a jigsaw puzzle; you need a few pieces of information to begin to see the big picture. After this you may be left with the question of what would we recommend? Our test system tackled the workload of a professional broadcast design department and performed well and even better than some existing systems. We thoroughly enjoyed the extra display that the Parhelia brought to the work environment. Remember that a workstation is not designed to be a competitive gaming computer even though the designers had to be told on several occasions to do work instead of playing Quake. The AMD processors made a few INTEL loyalists reconsider. All of them were like curious children when we broke from the beige box syndrome. Those that knew the price of professional 2D/3D workstations said...it cost what? If you are building from the ground up or just adding on...determine what you want first. If it is workstation graphic power then balance the GPU-CPU equation as a little more money invested in one or the other may deliver better results in the end.
Begin with the end. Getting more from a workstation, gaming or home multimedia PC is a matter of answering the questions of what is expected from the computer. Define your goals and get your hands dirty with a little research then you'll end up with a PC that is better suited to your tasks and, perhaps, your pocketbook. We built a system that made many users very happy. It also made my budget very happy as well. It is amazing the creative power that's available in computer hardware today.
In closing I'm reminded of an old saying. Give a man a fish and he'll eat for a day. Teach a man to fish and he'll eat for life. In other words; if I tell you what's best now you'll have the best for a day but if I teach you how to choose what's best for you then you'll have the best for life.
Icrontic extends their appreciation to the good people at ABIT, AMD, Matrox, GlobalWin and an ever-faithful AMKComputers for their assistance and involvement with this article.
Personal Opinion
The use of benchmarks, charts, graphs and a lot of technical talk are valuable in the price vs. performance equation but it all comes down to how a computer system feels. Marketing surveys may show results such as 9 out of 10 users thought it was fast but what happens if you are the 1 out of 10?
Our test system surprised us. Perhaps we were rooted in a MAC design world for too long or caught with our pants down for not keeping up with technology. The home PC enthusiast most likely upgrades more times in a year than an office does in 5 years. In unofficial comparisons our test system beat our single and dual processor G4's and nipped at the heels of a dual XEON Quadro system.
We didn't set out to build a gaming machine but we were able to play games and not worry about being blown up when our computer couldn't keep up. Softimage and After Effects are what interested us the most. Fast renders and an easy interface would make our head spin. The Matrox Parhelia brought great amounts of real estate and a great image quality but a few problems. Softimage is not the most well-behaved program at the best of times. It was cranky to begin with and within a few driver tweaks Matrox engineers had it under control. There are still a couple of bugs but they are getting harder to find and most wouldn't stumble across them. Softimage did have some very minor display problems with the second and third display but these should be gone with the release of the 1.01 drivers. The other problem wasn't the fault of Matrox but more us. Our cabinetry was configured for dual monitors and not for three. Nevertheless the Parhelia functions extremely well in single, dual or triple head mode. A lot of people hadn't heard of AMD, ABIT or GlobalWin and didn't know there were so many choices and options. They definitely marvelled at the AMK case.
We thought it couldn't be done on a budget. It's simply amazing the sheer computing power available at our fingertips. Immediately half the computers that were twice the price...were made obsolete.
Sure there were the doubtful who mocked and stood firmly by their convictions...as the familiar sound of the MACs crashing echoed down the hallway.
-
*Full* article text follows (part 3 of 3)
(Sorry, the benchmarks pages are pretty much worthless without the graphs. If it makes you feel any better, I didn't see them either:-)
After Effects Pt. 1
Adobe After Effects 5.5 software delivers a set of tools to produce motion graphics and visual effects for film, video, multimedia, and the Web whether working in a 2D or 3D compositing environment. After Effects is a main creative program and works in concert with Adobe Photoshop, Illustrator, Softimage and a Media100 non-linear edit suite.
A user interacts with Adobe After Effects through the GUI and produces finished work by rendering a project to the hard drive. The list of effects and elements that After Effects can handle is far too lengthy to summarize accurately but rest assured its palette of tools is extensive. Therefore After Effects can make a myriad of different simultaneous demands on the CPU/GPU/RAM systems. To demonstrate the benefits of different hardware components a real world After Effects project consisting of compressed and uncompressed video, EPS, internally generated and PICT text elements, transitions, size scaling, shadows, and treatments was chosen as the test project. Benchmark programs may examine individual demands on a system but in the real world this may not be the case and it is important to measure the results of simultaneous varied demands as well as one specific measurement task. It may not be a standardized test but it shows what to expect from a project that encompasses a lot of different tasks simultaneously.
After Effects primarily uses the processor, video card and ram while a user is working within a composition window and timeline. Adjustments to a project are displayed in real time in the composition window. The faster each of the individual hardware subsystems are the smoother the interaction and the faster the composition window will be redrawn.
When After Effects renders or builds the finished timeline it is the processor, ram and hard drive that determine speed as the video card is more or less bypassed. After Effects will call to the disk for information for the processor to calculate a finished frame and then return that frame to the disk for storage. This process repeats for as many frames as are within the timeline. Remember that any video is a series of still frames and After Effects builds each single frame and glues it to the next to finally end up with a playable movie or, conversely, a sequence of files.
Ram is an important consideration with Adobe After Effects. It will function effectively with only 512 MB of ram but more ram is better. Adobe recommends using the following formula to calculate the amount of ram it needs to preview a composition.
[ (height x width x (bit depth / 8) x frame rate x (resolution)^2) / 1024 ] / 1024 = MB/sec.
The variables for height, width, frame rate, and resolution depend on the composition setting. Always use the maximum expectations to determine a base of RAM requirement. For example a preview of 10 seconds in NTSC broadcast format would plug into the equation thusly:
[ (640 x 480 x (32 / 8) x 30 x (1)^2) / 1024 ] / 1024 = 35 MB/sec.
One then should come to the conclusion that 350 MB of available RAM would be needed to preview 10 seconds worth of an After Effects timeline.
That couldn't be more wrong.
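For reference, the naive upper-bound arithmetic from the formula can be reproduced in a few lines. This is only a sketch of the calculation as quoted; the function name and the default resolution of 1 (full resolution) are illustrative assumptions:

```python
def preview_rate_mb_per_sec(width, height, bit_depth, frame_rate, resolution=1.0):
    """Uncompressed preview data rate in MB/sec, following the quoted formula."""
    bytes_per_sec = height * width * (bit_depth / 8) * frame_rate * resolution ** 2
    return bytes_per_sec / 1024 / 1024

# 640x480, 32-bit, 30 frames per second, full resolution.
rate = preview_rate_mb_per_sec(640, 480, 32, 30)
seconds = 10
print(f"{rate:.1f} MB/sec, about {rate * seconds:.0f} MB for {seconds} seconds")
```

This figure is the uncompressed worst case, which is why it overstates what a typical preview actually consumes.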
After Effects Pt. 2
How much RAM is needed is dictated by the information in the picture and the compression codec used. For example; 30 seconds of a 640x480 white page will take up much less RAM than a video of a stock car race. That's because more information must be stored about the color changes during moving video. A white page is just that...white...and the program will figure out quite quickly that it can save time by repeating the same information about pixel color instead of storing unique information about each one. How much required RAM depends on the variety of color, how often each pixel changes and the particular compression codec used. This will be the only time where a strong recommendation is made. Get at least 1 GB of RAM to make the After Effects experience more enjoyable. Get more RAM if it is expected that there will be a need for longer previews or work in D1, HDTV or widescreen format.
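The white page versus stock car race comparison can be illustrated with a general-purpose compressor. This sketch uses Python's zlib as a stand-in; it is not the codec After Effects uses, and random bytes are only a rough proxy for busy video:

```python
import os
import zlib

WIDTH, HEIGHT, BYTES_PER_PIXEL = 640, 480, 4
FRAME_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL

# A pure white frame: every byte is identical, so it is highly redundant.
white_frame = bytes([255]) * FRAME_BYTES

# A frame of random bytes: a worst-case stand-in for constantly changing color.
noisy_frame = os.urandom(FRAME_BYTES)

white_size = len(zlib.compress(white_frame))
noisy_size = len(zlib.compress(noisy_frame))

print(f"raw frame:        {FRAME_BYTES} bytes")
print(f"white compressed: {white_size} bytes")
print(f"noisy compressed: {noisy_size} bytes")
```

The white frame shrinks to a tiny fraction of its raw size while the noisy frame barely shrinks at all, which is the same reason picture content and codec choice drive the real RAM requirement.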
There will be two benchmark measurements to identify the benefits of different processor components. The the bars on the left feature a small jump in AMD processor speed to demonstrate how more CPU horsepower will speed up the CPU/GPU/RAM dependent RAM PREVIEW. THe bars on the right demonstrate a small increase of CPUhorsepower and its effect on rendering speed. This should help determine if the increase you are considering is worth it.
The results do show the greatest impact in each of the two major functions of After Effects. A small increase in CPU does have a big affect on rendering speed but not in real time ram preview. When designing a system on a budget it is important to identify what is expected and, if budget restrictions require a choice, then the desired balance between expectations must be sought to satisfy favor user interactivity or speed of rendering. Also anticipate that longer RAM previews or larger format previews require more ram. You can literally watch the ram fill. Just leave task manager open and watch the page file usage creep up. The goal in making purchasing choices is to work backwards from what you expect in the end result.
SoftImage XSI
SOFTIMAGE|XSI v.2.0 is an incredibly powerful 3D tool that has the capacity to bring virtually any system to its knees especially if raytracing, radiosity or photon-mapping is used to a large extent. If this is the case then there definitely will be a loud scream of anguish coming from a solitary PC system. Softimage projects can become so system intensive that 100 finished frames can take an insane amount of time to render. In order to increase rendering speed many computers are equipped with specialty hardware and are tied into render farms in the single-minded task of rendering a single scene.
That's enough of the fire and brimstone about complex 3D rendering. Softimage works on a somewhat similar principle to After Effects. A faster and more powerful video card will translate to a smoother interface where complex scenes can be manipulated in real time. Note that, unlike After Effects, Softimage does not have an interface to preview a finished frame in real time. Users can manipulate objects in a choice of views from wire frame mode to simulated real-time shading mode. In order to look at a finished frame a user must render the frame to disk, which bypasses the GPU. A faster processor will result in a faster render. The amount of RAM is not as great an issue, as the user is working frame by frame and the graphics card is doing the bulk of the work while working within the GUI.
This is a most basic overview and there are specialty hardware components that can enhance the speed and interactivity of complex 3D scenes and programs. The designers working on the test system use Softimage on a less complex level to provide enhancements and elements to commercials, promos and station ID elements. Though their work is quite complex to some, it is a far cry from that of special effects in major film productions.
Speeding up Softimage requires thinking on the same two levels as After Effects. The Softimage GUI can display very complex and varied effects but it does so in simulated mode. Displaying the finished rendered product in real time is beyond the capacity of most video cards. But the video card needs the hardware features to allow smooth manipulation of 3D objects and the proper display of simulated effects. Softimage, AutoCAD and various other 3D programs need to access those hardware features in order to function and display the image properly. Don't think that a fast gaming card comes with these physical hardware features or has those that are onboard...unlocked. Remember that last word as it comes to make sense later.
It is quite true that a fast gaming card will be a poor performer in Softimage if it works at all. Conversely a workstation class video card may make for an enjoyable user experience in a complex 3D application but will deliver lower frame rates in games. It is safe to say that different applications require different hardware tools. Matrox provides some insight to the locking and unlocking of features on video cards.
The Parhelia workstation solution differs in no way shape or form to the retail Parhelia, which does go against the grain. Competitors tend to artificially inflate prices for their workstation products by unlocking features even though the chip may be identical to or based upon the same technology as their retail offerings. This so-called feature locking doesn't occur with Parhelia when compared to our retail solution. This is the key point here, so whether you are a prospective Parhelia client, purchased a retail board or have one integrated in your system, all Parhelia boards have access to the same workstation functionality and Surround Design support.
Softimage, by default, is designed for a single monitor interface yet the layout can be customized for dual and even triple monitors. It was most interesting to hear the comments made when the designers started to spread their workspace out to the second monitor and then to the third. Since Softimage bypasses the video card in the render process there was no performance loss.
A simple animation of 100 frames in length was rendered out with two different processors. As rendering in Softimage relies upon the processor most...a faster processor should result in a faster render. The animation data is as follows:
You may ask if this is any good? Just for laughs I let the art director take the project to his dual Xeon 1.8 GHz nVidia Quadro driven power box and he did beat the time by a full 10 minutes. He also beat the price by a full $3000 (cost of purchase of art director's system vs. article test system). Somehow I'll wait the 10 minutes and keep the 3 grand in my pocket. A single Xeon 450 with a Quadro card takes over 3 hours. Those numbers are completely unofficial but it lets you see the range of performance.
Benchmarks Pt. 1
Before the benchmark
Benchmarks are a yardstick we use to measure performance. Not one benchmark stands above the rest as the de facto tool. Benchmarks are useful to identify major performance problems in a system. They can also be used to identify the impact of hardware changes on overall system performance. This is very useful especially when combined with the software expectations. A faster processor may deliver faster renders but not help with a smooth GUI. A better video card may deliver a smoother interface but won't help if long RAM previews are required. The performance enthusiast and overclocking crowd are edging each other by a handful of points or frames. Remember this as you look at graphs and charts. Don't look at just who's in front but also by how much, both in points/frames and cost.
3D Mark 2001 SE
The granddaddy of benchmarking tools, measuring how effectively a system runs 3D graphic applications. Moving from the 1900+ to the 2100+ showed only a small increase in performance. This isn't critical for workstation applications but may be the goal of gamers looking to squeeze every frame per second gain from their systems.
Sisoft Sandra
Small increases in processor speed appear to have the greatest impact in Sandra's multimedia benchmark.
Benchmarks Pt. 2
GLExcess
Quake III Arena
Serious Sam: The Second Encounter
Business Winstone and Content Creation
Benchmarks Pt. 3
Code Creatures
Comanche 4
DroneZ high quality.
SPECviewperf 7.0
This benchmark really tests OPENGL performance and it is important to note that there is a large discrepancy between our results and the results from Matrox on their test system. We are investigating this. (Our system scored much lower.)
PSBench
We added a new benchmark to our tests. PSBench looks at 21 individual tests in Photoshop 7.0 and the results can be looked at individually or as a cumulative score. There are three levels to PSBench; basic, intermediate and advanced. This test shows the results of the intermediate tests.
Media Cleaner Pro
Three tests were conducted to compress a 651 MB 640x480 NTSC Quicktime file. The larger the file the more good a faster processor is going to do you.
In the Driver's Seat
Who's in the driver's seat?
If you were to be put into the driver's seat of a race car would you be able to win a race against a professional driver in the exact same car? Probably not given the fact you don't know how to properly drive a race car.
Computer hardware is just that...hardware...and it can't do anything without being told how to do it. While hardware itself does go through advancement cycles as new technology emerges into mainstream it isn't worth much if it doesn't work or work well. Driving the consumer PC market forward are games. Gaming video cards have fallen into 3-month product cycles with new versions being announced before the prior has even hit store shelves.
A comment from Mark Randall of Serious Magic in a TechReport article piqued interest to look beyond the hardware for performance.
The problem isn't the hardware, it's the software drivers. In fact, the speed could be dramatically increased with revised software drivers. However, no manufacturer has presently made this aspect of driver performance a priority. The first card manufacturer to address this issue would deliver the following benefits to their users:
Mr. Randall goes on to state that software drivers, properly addressed, could increase render speed, record game play in real time, capture motion images off the desktop or even stream video out to the internet directly from the video card.
Drivers can indeed be a problem. Ask anyone who's experienced a Blue Screen of Death (BSOD). The $64 question is about the driver itself. Are we, the persistent purchaser of PC parts, being cheated out of performance that could be ours without a hardware upgrade?
A graphics card is built on the power of the Graphics Processing Unit (GPU). This is a processor chip and it doesn't make financial sense to reinvent the chip each time a new video card is released. This is the same for CPUs. The AMD Thunderbird chip scaled all the way up to 1.4 GHz before the Palomino core took over until 1.77 GHz, and now the Thoroughbred core extends the range past 2 GHz. The same can be said for INTEL PII, PIII and PIV architecture.
The point is that features are either locked or unlocked on some graphic cards and the differences between adjacent levels of product may be very subtle; as subtle as a fresh set of tires and a tweak to a spoiler setting may make the difference between winning and losing the race.
If all the hardware is available then how visually enjoyable or complex that game may be, how fast a render is or if the card can support the software itself may come down to what features are hidden. Case in point: in the early stages of this article Matrox was developing and refining drivers for the Parhelia for software applications such as Softimage. Use the official 2.31 drivers with the Parhelia and Softimage won't recognize the card as an OPENGL card, won't access those OPENGL features and won't perform as expected. One or two driver revisions later and Softimage is happy.
But it isn't as easy as that. Between the software and associated drivers and the hardware is the Application Programming Interface (API) layer. Hardware and software speak two different languages and they need some way to properly communicate with each other. Explaining the API is a fairly complex matter but think of computer hardware as your body. The software resides in your head as a desire to do something like walk, talk, run, jump, or eat. Between that software thought of wanting to walk across a room to pick up an apple and take a bite out of it and the mechanical act of actually doing it is a series of hidden instructions that just happen. You don't really think about activating individual muscles to tighten and loosen on that incredibly precarious journey of balance as you stride across a room. You don't actively plan and coordinate in 3D space the relation of the apple to you or to your hand and then calculate placement and pressure required to take a bite. These things you just...do.
The API acts in a similar fashion, taking what the software wants to do and translating it to the hardware to do it and then returning the result back to the software to display. The most recognizable examples of API layers would be Microsoft's DirectX and OPENGL, but other software can have its own proprietary API layer, as is the case with ADOBE and their programs. DirectX and OPENGL take interesting approaches to 3D graphics and each has its inherent advantages and disadvantages, and they can be more than just coding issues....they can be political. For a far superior explanation I suggest a visit to www.jakeworld.org to read guru game programmer Jake Simpson's article on Graphical API History.
Drivers are much more complicated than one might think. They can be a proverbial house of cards. Each game, application, tool, player and so on interacts with the video card in a subtly different way. Drivers are initially designed to work with everything but may not work to their fullest potential. That's where optimization begins and the people who build drivers begin the task of figuring out what enhancements or tweaks can be made to their drivers in order to gain performance and stability. This must be done one program at a time and there is an extensive list of programs. Just think of how many games there are, then begin the task of trial and error to get the best performance out of each individual game.
It isn't as simple as taking those individual driver enhancements and putting them into one set of drivers. A tweak in one enhancement can cause another tweak to turn into a problem and fixing that problem can create four others. Drivers are a balancing act between performance, stability and cost. It's almost an unobtainable triangle. Achieving performance and stability takes an unlimited pot of R&D money. Achieving great performance may cause instability. Achieving stability may cost performance.
And around it goes.
So hardware manufacturers strive to achieve balance by designing their product to fit a niche purpose. It would take too much time, effort and money to build the fastest, most stable gaming/workstation/single monitor/dual monitor/triple monitor/multimedia/digitize/output video card. It can be done but the cost of the product would be 10 times an unacceptably high price.
Manufacturers in the video card market choose where their priorities are based on what market they want to capture. Gaming cards and their drivers are optimized for games with lesser emphasis on workstation applications. Workstation video cards are optimized for the reverse. Let's face it. There's more money to be made in gaming cards than in workstation cards.
If, for the most part, the hardware can support significant performance improvements then is it the fault of the API, software or drivers and are we being cheated? This brings us back around to Stephan Schaem, Chief Technology Officer of Serious Magic.
In some cases, card manufacturers have chosen to differentiate their 'consumer' vs. 'professional' cards by introducing essentially identical cards with different firmware and software drivers. The manufacturer's state that the additional cost of the pro product goes to fund development of advanced driver features that are particularly useful in production environments. The issue that Serious Magic has focused on is a different one. It's a significant issue in PC graphics card performance but we don't believe it was an intentional omission.
In a nutshell, here's the issue. While today's graphics cards can render images very quickly, the software drivers are painfully slow at getting rendered output back over the AGP bus and into the PC where it could be saved and put to work by users. Current generation software drivers achieve only a fraction of the theoretical download transfer speed that the hardware you've already paid for is capable of. It's remarkable that a graphics card with a video input and some video recorder software can record TV-quality images to the PC hard disk in real-time, yet the same card can't record it's own renderings at even 1/10th this speed. Serious Magic has made a benchmark which demonstrates this problem freely available on our website:
www.seriousmagic.com/3D-Dloadbenchmark.zip
The problem isn't the hardware, it appears to be the software drivers. This is supported by the fact that the external video input to a VIVO-enabled graphics card can be moved over the AGP bus very quickly. Also, some software drivers under Windows 98 are able to move the rendered output very quickly. However, in all cases under Windows 2000 and XP the speed of transferring the 3D rendered results of the same card is very, very slow. It seems that the speed could be dramatically increased simply with revised software drivers. While this is a significant issue for many business, educational, production and scientific tasks, it is not a feature that gamers are clamoring for (although it would make capturing movies of game output faster, this is not as coveted as a higher frame rate). We believe that this is why no manufacturer has yet made this aspect of driver performance a priority. Even the more expensive cards with drivers targeted at the professional market are equally poor at this task. Hopefully, with the game market rapidly reaching saturation, manufacturers will realize that the growing business, educational, production and scientific markets can be substantial. Although each of these markets may be small when compared against the game market, when combined they can add up to meaningful numbers.
And don't tell me there's no difference between drivers. Here is an example of the same system benchmarked the same way except for the change in video card drivers.
Speed! I need more speed Scotty!
What does the future hold? Processors, graphic cards and RAM are edging upwards in speed and bandwidth. The 3GHz mark is within reach for both AMD and Intel. Matrox opens up a huge 17.6 GB/s pipe with the Parhelia and DDR ram is bumping up the performance ladder as seen in the table.
Memory name   Type name   Clock speed   Voltage   DDR clock speed   Data Bus & Bandwidth
PC100         -           100MHz        3.3v      -                 64-bit, 0.8GB/s
PC133         -           133MHz        3.3v      -                 64-bit, 1.05GB/s
PC1600        DDR200      100MHz        2.5v      200MHz            64-bit, 1.6GB/s
PC2100        DDR266      133MHz        2.5v      266MHz            64-bit, 2.1GB/s
PC2700        DDR333      166MHz        2.5v      333MHz            64-bit, 2.7GB/s
PC3200        DDR400      200MHz        2.5v      400MHz            64-bit, 3.2GB/s
PC4200        DDR533      266MHz        2.5v      533MHz            64-bit, 4.2GB/s
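The bandwidth column follows from the bus width and the effective clock: a 64-bit bus moves 8 bytes per transfer. Here is a rough sketch of that arithmetic. The function name is made up for illustration and decimal units (1 GB = 1,000,000,000 bytes) are assumed. The nominal clock figures are themselves rounded, so a result may differ slightly from the table in the second decimal place.

```python
def peak_bandwidth_gb_per_sec(effective_mhz, bus_bits=64):
    """Theoretical peak: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_bits / 8
    return effective_mhz * 1_000_000 * bytes_per_transfer / 1_000_000_000

# Effective clock in MHz (the DDR clock for DDR modules), taken from the table
modules = {
    "PC100": 100, "PC133": 133,
    "PC1600": 200, "PC2100": 266, "PC2700": 333,
    "PC3200": 400, "PC4200": 533,
}

for name, mhz in modules.items():
    print(f"{name:7s} {peak_bandwidth_gb_per_sec(mhz):.2f} GB/s")
```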
Today's ultra-powerful CPUs, GPUs and RAM are tied to a proverbial boat anchor. It's the motherboard with its inherent latency and bottleneck problems. Further to that is the I/O rate of the hard disk or how fast data can be lifted from or stored to the platters.
A way to increase After Effects render speed is to increase disk speed, and this is accomplished by moving to a SCSI disk array. Unfortunately, within the restrictions of a home buyer's budget, it would push the cost above an acceptable level. SCSI disks have a greater throughput of data than IDE disks. ULTRA160 SCSI disks deliver a maximum 160 MB/s and the newer ULTRA320 SCSI disks deliver 320 MB/s. The less expensive IDE drives can move data at a maximum of 100 MB/s (ATA100) or 133 MB/s (ATA133). We all know that actual performance with either SCSI or IDE is significantly less than theoretical boasts. Any of these disks in an array can further enhance performance, with SCSI arrays reaching upwards of a theoretical 500 MB/s. Processors can handle a greater amount of data in After Effects but must wait around for the data to exchange with the hard drive.
CPU, GPU, RAM and hard drives work through the motherboard and therein lies the bottleneck. CPU, GPU and RAM may be able to accept and shovel out information with great speed and in huge gulps but the problem is that the pathway between components is relatively small and not nearly as fast. It's like trying to drain or fill a swimming pool with a garden hose. A solution is to get a heck of a lot more garden hoses or a bigger hose.
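To put those interface figures in terms of frames, here is a back-of-the-envelope sketch. It assumes an uncompressed 640x480, 32-bit frame and decimal megabytes. Both are assumptions for illustration, and the results are theoretical ceilings only since, as noted above, real drives fall well short of them.

```python
FRAME_BYTES = 640 * 480 * 4  # assumed uncompressed 32-bit frame

# Theoretical interface maximums quoted in the text, in MB/s
INTERFACE_MB_PER_SEC = {
    "ATA100": 100,
    "ATA133": 133,
    "ULTRA160 SCSI": 160,
    "ULTRA320 SCSI": 320,
}

def max_frames_per_sec(mb_per_sec, frame_bytes=FRAME_BYTES):
    """Upper bound on frames moved per second at a given transfer rate."""
    return mb_per_sec * 1_000_000 / frame_bytes

for name, rate in INTERFACE_MB_PER_SEC.items():
    print(f"{name:14s} {max_frames_per_sec(rate):6.1f} frames/sec at most")
```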
Both AMD and INTEL are backing solutions and each in their own way. AMD brings HyperTransport with a bigger hose and INTEL counters with the many hose analogy for PCI-Express, formerly known as 3GIO. INTEL also is deep into it with Infiniband. Infiniband is more of an outside of the box solution providing for reliability, availability, scalability and performance gains between data centers, such as server disk arrays. It isn't paramount to this article but worth mentioning as it will have an impact on how fast two systems can talk to each other. Both AMD and INTEL have the same goal to increase the amount and speed at which data moves through a system or device.
HyperTransport
Chipset? Who's got the chipset?
AMD HyperTransport Technology-Based System Architecture should be thought of on two levels; within the specific component and between components. In other words HyperTransport technology, when applied to a component such as a processor, can raise the bar on how fast it can complete an operation or how much it can process at any given time. HyperTransport, when applied to the pathway between components, increases the amount of data (bandwidth) and reduces the time for it to get around (latency). HyperTransport allows for the pool to drain or fill faster due to a very much larger hose.
HyperTransport promises some pretty hefty improvements to loosen the noose on bottleneck I/O problems. HyperTransport technology is used to provide high-performance interconnects between integrated circuits that comprise the system's core. Peripheral device interconnect is provided by existing industry standard busses such as USB, IEEE-1394, IDE, SCSI, Serial ATA, etc. In other words AMD is aiming to provide a large bandwidth, high speed platform. AMD makes the HyperTransport technology available and leaves the rest up to the other manufacturers. This may mean a bigger, better, badder motherboard.
HyperTransport Technology
HyperTransport technology is an advanced high-speed, high-performance, point-to-point link for integrated circuits. HyperTransport provides a universal connection that is designed to reduce the number of buses within the system, provide a high-performance link for embedded applications, and enable highly scalable multiprocessing systems. It was developed to enable the chips inside of PCs, networking and communications devices to communicate with each other up to 48 times faster than with existing technologies.
Compared with existing system interconnects that provide bandwidth up to 266MB/sec, HyperTransport technology's peak bandwidth of 12.8GB/sec represents better than a 40-fold increase in potential data throughput. HyperTransport technology provides an extremely fast connection that complements externally visible bus standards like the Peripheral Component Interconnect (PCI), as well as emerging technologies like InfiniBand. HyperTransport technology is the connection that is designed to provide the bandwidth that the new InfiniBand standard requires to communicate with memory and system components inside of next-generation servers and devices that may power the backbone infrastructure of the telecom industry. HyperTransport technology is targeted at the networking, telecommunications, computer and high performance embedded applications and any application in which high speed, low latency and scalability is necessary.
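The "48 times" and "better than 40-fold" figures both come from the same pair of numbers. A quick sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB):

```python
LEGACY_MB_PER_SEC = 266           # existing interconnect, from the text
HYPERTRANSPORT_GB_PER_SEC = 12.8  # quoted HyperTransport peak

ratio = HYPERTRANSPORT_GB_PER_SEC * 1000 / LEGACY_MB_PER_SEC
print(f"HyperTransport peak is about {ratio:.1f}x the older figure")
```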
The AMD-8000 (HyperTransport) series of chipset components stack up to some large numbers promising a peak throughput of 12.8 GB/s.
AGP 8X doubles the bandwidth moving peak transfer rate up to the 2.1 GB/s notch.
PCI-X (not to be confused with PCI-Express) significantly improves data transfer rates from 100 and 133 MB/s all the way up to nearly 1 GB/s peak data transfer.
USB 2.0 allows for connecting exterior USB peripherals to access the system via a 480 Mb/s (60 MB/s) pipeline.
It's a very simplified explanation but it means that PC systems have the potential to make rather large performance jumps in the relatively near future. HyperTransport technology is a reality as evident by nVidia's nForce chip but don't expect full featured HyperTransport motherboards to find their way onto store shelves for some time to come.
More on HyperTransport technology can be found at the website and in an AMD white paper.
PCI-Express
All aboard the Express!
INTEL stands behind PCI-Express and Infiniband. The performance targets are ambitious, with an initial offering of 2.5 Gb/s per lane in each direction and a projected advance to 10 Gb/s per lane in each direction and beyond. It appears that PCI-Express is initially designed to fit into the existing box and Infiniband is designed for improved connectivity out of the box, such as connecting server data centers.
PCI Express architecture is described as a high-speed, general purpose serial I/O interconnect that provides the bandwidth for current and future applications. After reading about PCI-Express it is almost impossibly difficult to sum up this technology into a single sentence but the PR team managed to do so with a collection of words that commits to nothing yet sounds exciting. Nonetheless, PCI-Express has the same goal as AMD with one major difference. PCI-Express has been designed to fit with present technology. It also partners well with Infiniband.
HyperTransport is a new chipset entirely; thus, as an example, a brand new motherboard would be required. It is up to motherboard manufacturers, but in order to satisfy consumer demand there may come a time when motherboards feature a PCI-Express port as an option to add PCI-Express components. This may happen at roughly the same time that HyperTransport motherboards enter the marketplace. It's debatable which is the best approach. Is bolting on new technology to enhance current technology the better route, or is it best to start from an entirely next-gen platform?
A further question arises about data transfer to and from the hard drive platters. To get faster data transfer the disk needs to spin faster or the data algorithm has to be more compact or a combination of both. There comes a limit to how small the data can be made. Seagate explains;
Today, as the magnetic particles that make up recorded data on a hard disk drive become ever smaller, we are approaching a point where the data bearing particles are so small that random atomic level vibrations present in all materials at room temperature can cause the bits to spontaneously flip their magnetic orientation, effectively erasing the recorded data. Magnetic recording scientists and engineers have calculated that this so called superparamagnetic effect may become a serious technology issue for new products in only two or three years.
But as soon as it is said that it can't be done...
Seagate has decided to use a HAMR to cram more and more bits of information per square inch into hard disc drives, pushing the limits of magnetic recording even further beyond what was ever thought possible. The Company today demonstrated its revolutionary Heat Assisted Magnetic Recording (HAMR) technology, which records data magnetically on high-stability media using laser thermal assistance.
HAMR, combined with self-ordered magnetic arrays of iron-platinum particles, is expected to break through the so-called superparamagnetic limit of magnetic recording by more than a factor of 100 to ultimately deliver storage densities as great as 50 terabits per square inch. This will provide the capability for people to store the entire printed contents of the Library of Congress on a single disc drive in their notebook computers.
Hard drive space has increased at a phenomenal rate over the last 5 years. It used to be that 270 MB was considered a big disk and now 80, 100, and 120 GB drives are commonplace. (270 MB is less than one percent the size of a 120 GB hard drive.) Space increases and falling prices keep the consumer happy but what happens when the consumer turns their attention away from processor speed and disk space?
PCI Express and HyperTransport bring the promise of faster productivity on the computers that we work with today. This will buy time until hard drives become something more than they are and perhaps less integral to the real time operation of a system. Fitting the multitude of software and hardware architecture together into a coherent working solution may take time but it is on the horizon and we'll witness some form of its arrival sooner than later.
And where will it stop? Will we expect real time renders or projects rendered faster than real time? In whatever form it happens to finally evolve into next generation technology could make today's super fast PC the 486 of tomorrow.
More on PCISIG can be found at their website and this FAQ. Also look to the other white paper on 3GIO. Infiniband information can be found at the website and in the FAQ.
Conclusion
Workstation class PCs were always thought of as very expensive and powerful beasts affordable to only those with deep pockets. Everyday a new piece of hardware comes onto store shelves and if properly picked can make for some formidable computing power at very affordable prices. You don't need the best of the best hardware to do the work. Perhaps that diamond tipped, gold plated shovel isn't needed in the garden when a plain old spade will do the job just as well.
I commend those who waded through this. PC configuration is like a jigsaw puzzle; you need a few pieces of information to begin to see the big picture. After this you may be left with the question of what we would recommend. Our test system tackled the workload of a professional broadcast design department and performed well and even better than some existing systems. We thoroughly enjoyed the extra display that the Parhelia brought to the work environment. Remember that a workstation is not designed to be a competitive gaming computer even though the designers had to be told on several occasions to do work instead of playing Quake. The AMD processors made a few INTEL loyalists reconsider. All of them were like curious children when we broke from the beige box syndrome. Those that knew the price of professional 2D/3D workstations said...it cost what? If you are building from the ground up or just adding on...determine what you want first. If it is workstation graphic power then balance the GPU-CPU equation, as a little more money invested in one or the other may deliver better results in the end.
Begin with the end. Getting more from a workstation, gaming or home multimedia PC is a matter of answering the questions of what is expected from the computer. Define your goals and get your hands dirty with a little research then you'll end up with a PC that is better suited to your tasks and, perhaps, your pocketbook. We built a system that made many users very happy. It also made my budget very happy as well. It is amazing the creative power that's available in computer hardware today.
In closing I'm reminded of an old saying. Give a man a fish and he'll eat for a day. Teach a man to fish and he'll eat for life. In other words; if I tell you what's best now you'll have the best for a day but if I teach you how to choose what's best for you then you'll have the best for life.
Icrontic extends their appreciation to the good people at ABIT, AMD, Matrox, GlobalWin and an ever-faithful AMKComputers for their assistance and involvement with this article.
Personal Opinion
The use of benchmarks, charts, graphs and a lot of technical talk are valuable in the price vs. performance equation but it all comes down to how a computer system feels. Marketing surveys may show results such as 9 out of 10 users thought it was fast but what happens if you are the 1 out of 10?
Our test system surprised us. Perhaps we were rooted in a MAC design world for too long or caught with our pants down for not keeping up with technology. The home PC enthusiast most likely upgrades more times in a year than an office does in 5 years. In unofficial comparisons our test system beat our single and dual processor G4s and nipped at the heels of a dual XEON Quadro system.
We didn't set out to build a gaming machine but we were able to play games and not worry about being blown up when our computer couldn't keep up. Softimage and After Effects are what interested us the most. Fast renders and an easy interface would make our head spin. The Matrox Parhelia brought great amounts of real estate and a great image quality but a few problems. Softimage is not the most well-behaved program at the best of times. It was cranky to begin with and within a few driver tweaks Matrox engineers had it under control. There are still a couple of bugs but they are getting harder to find and most wouldn't stumble across them. Softimage did have some very minor display problems with the second and third display but these should be gone with the release of the 1.01 drivers. The other problem wasn't the fault of Matrox but more us. Our cabinetry was configured for dual monitors and not for three. Nevertheless the Parhelia functions extremely well in single, dual or triple head mode. A lot of people hadn't heard of AMD, ABIT or GlobalWin and didn't know there were so many choices and options. They definitely marvelled at the AMK case.
We thought it couldn't be done on a budget. It's simply amazing the sheer computing power available at our fingertips. Immediately half the computers that were twice the price...were made obsolete.
Sure, there were the doubtful who mocked and stood firmly by their convictions... as the familiar sound of the Macs crashing echoed down the hallway.
-
Re:Serial Faster?
Serial can be made faster because the logic (de-muxing) that has to occur in parallel transfer systems slows down the clock rate. Thus, by simplifying the de-muxing process via serial transfers, you can crank the transfer rates up enough to more than compensate. We are seeing good examples of other moves in this direction, like Intel's 3GIO and AMD's HyperTransport.
The simplified interfaces allow more flexibility (why SATA is hot-swappable) and are cheaper to produce.
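To put rough numbers on the "more than compensate" point, here is a minimal sketch comparing a wide, slower-clocked link with a one-bit, faster-clocked one. The figures are not from the comment above; they are the commonly quoted ones for parallel ATA/133 (16-bit bus, roughly 66.5 million transfers per second) and first-generation SATA (1.5 Gbit/s line rate with 8b/10b encoding), and the function names are made up for illustration.

```python
def parallel_mb_per_s(width_bits, transfers_per_s):
    """Throughput of a parallel bus: bytes per transfer times transfers per second."""
    return width_bits / 8 * transfers_per_s / 1e6


def serial_mb_per_s(line_rate_bits_per_s, payload_bits=8, coded_bits=10):
    """Throughput of a serial link after line-coding overhead (8b/10b by default)."""
    payload_rate = line_rate_bits_per_s * payload_bits / coded_bits
    return payload_rate / 8 / 1e6


# Assumed figures, see the note above: ATA/133 versus SATA 1.5 Gbit/s.
pata = parallel_mb_per_s(width_bits=16, transfers_per_s=66.5e6)
sata = serial_mb_per_s(line_rate_bits_per_s=1.5e9)

print(f"parallel ATA/133: {pata:.1f} MB/s")
print(f"serial ATA 1.5G:  {sata:.1f} MB/s")
```

Even with one data lane instead of sixteen and 20% of the line rate spent on encoding, the serial link comes out ahead because its clock is so much higher.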
-
My Ultimate HandHeld
As far as ideal portables go, here's my take:
Power:
Methane powered fuel cell, which provides for at least a month's worth of continuous use.
The Screen:
Light emitting polymer screen is good here. Nice choice. The screen should also have some mechanism for eliminating finger grease automatically.
Form Factor:
A6, there should be no border, so that the screen takes up the entire front. Perhaps with fanned screens like the Psion prototypes. (can't find a link)
Communications:
An array of Software Defined Radios allows the device to keep in touch with the outside world. Depending on your current usage, they may be configured for Wireless Ethernet, Bluetooth, 3G+, TV, Radio etc. No need for multiple cards and slots. When a bug or security risk is found in any of the protocols, a simple software patch will fix the problem.
When data is huge, perhaps something like Infiniband over fibre optic would be useful.
Input:
Touch screen will be supported, along with a slide-out or otherwise concealed keyboard for when you actually want to enter some data. Voice recognition would also be nice, but only when you're on your own.
CPU
Since we're obviously way off into the future here, I would like a micro-distributed memory architecture, with approx 32 CPUs, each with at least 64 MB of memory. The CPU should probably be something like a 64bit ARM, running at whatever clockspeed is fashionable at the time. See this for similar stuff. The interconnects between CPU modules should use something like AMD's HyperTransport.
OS
For Linux fans, the CPU architecture would support a micro-Beowulf style mode of operation.
For me, I'll roll my own Actor Model based system, running on a microkernel, like L4 but with better real-time response. Built in cryptography will keep ALL comms secure.