Where Have All The Cycles Gone?
Mai writes "Computers are getting faster all the time, or so they tell us. But, in fact, the user experience of performance hasn't improved much over the past 15 years. This article takes a look at where all the precious processor time and memory are going."
Computers are getting faster all the time, or so they tell us. But, in fact, the user experience of performance hasn't improved much over the past 15 years. Peter looks at where all the processor time and memory are going.
Ummmmm..... No.
A number of years ago, I had a project that required three days for each calculation. Just for kicks, when I got my dual G5, I ran the same calculation with the same parameters and it was complete almost instantaneously. Yes, yes....I know..memory bound performance versus disk swapping of memory space, but at the time, the memory on that system was maxed out (128 MB for $5000).
I also know that one of the games I helped work through beta (Halo) would absolutely not run on much hardware older than a few years ago.
Visit Jonesblog and say hello.
That would be the number one way to waste cycles on any low end system nowadays. I swear, I've seen P4's with Intel graphics run slower than PIII's with even a mediocre card in it.
Nobody except embedded programmers. My biggest project of late runs on an 8-bit, 8 MHz CPU with about 7k of Flash and 192 BYTES of RAM. Not megs, not kilobytes, but bytes. That's equivalent to less than three lines worth of text. And the code's written in C, rather than assembly, so while it's easier to maintain, it takes more effort to make sure it stays efficient.
I think all programming students should have to code for a system like this. It gives you a MUCH greater appreciation for what the compiler is doing for you, and what the consequences of simple changes can be.
I started out on an 8088 processor 11-12 years ago. Now I am using a dual-processor G5 at work, which is so fast I can no longer blame the computer for my coffee breaks. It takes a good bit of video rendering to keep it busy long enough for me to get a coffee refill.
This has been a Public Service Announcement. Not to be confused with anything useful.
Now I could barely fit a word document.
And how much of that bloat in Word is useful information?
If I open word, type the letter 'a', and save the document, it's a 20K document.
If you type 'a' 2000 times, it's still a 22K document.
What the heck is in that other 99.9% of the document?
94% of Repubs and 21% of Dems voted to renew the Patriot Act
Apple II (1 MHz 6502) did animated graphics with sound and controlled floppy access while polling the keyboard (The Bard's Tale)
Amiga (14 MHz 68000) had complete GUI, multi-tasking, on 256K RAM.
The old saying that "Intel giveth, Microsoft taketh" is about right. The CPU's have gotten faster, with the Microsoft O/S taking more and more cycles to do the same thing.
My friends ask me why I haven't upgraded my 400 MHz machine in years. "Look at all this new stuff," they say, "look at all this eye candy. Look at these great new games."
And then I load up my MUD client, with simple, 16 color text in a 12 point font. This is my favorite game.
And then I load up my word processor, AbiWord, which renders as fast as I can type and has a nice spell-checker. This is my favorite word processor.
And then I load up Kmail, Mozilla, and all the other "normal applications" which have never had a problem with virii or worms.
And after all this they realize, the problem with my computer is THEIR expectations, not my software and hardware.
(And then they ask me when I'm going to replace my rotary phone... I can't win them all.)
Trying to use sarcasm in text-based forums does not work.
I haven't had to set an IRQ or DMA setting in years. I've not had to mess with himem or any other arcane memory configs and boot disks, restarting my entire system each time I want to run a different game.
Each time I plug in a new joystick and it just works, each time I plug in a new digital camera and it's just there as another drive, each time I alt-tab out of a game, check a walkthrough website, then alt-tab back, I think back to the old days where code was really efficient and didn't do any wasteful background tasks like that.
I remember helping a friend with a C++ assignment, via the net. Each time, she'd have to exit her telnet program, run Borland's C++ compiler from the command line, check the output, quit the compiler, reopen telnet, reconnect to the MUD we were talking over, then describe what had happened. Now... She'd just show me what's on her desktop via Messenger while we kept chatting.
And if some cycles get used up doing weird UI gimmicks that I'll never use, like making the UI scalable so the partially sighted can use it, I'm willing to trade that.
For all those reasons, I'm more than happy that my 2^(years / 1.5) faster PC "wastes" all of those extra cycles. And that's before we get on to things like built in spell checkers and real time code debugging as I write it.
I don't want a 2^(years / 1.5) faster experience. I want all those cycles put into making things work closer and closer to how I just expect them to work.
I don't know about anyone else but I can't code 2^(years / 1.5) faster so I wouldn't be able to keep up with that damn responsive text based compiler. On the other hand, I am that much faster overall as I now call an API that adds all that "bloatware" instead of having to code my own damn mouse drivers, my code is largely debugged on the fly and I can't remember the last time I lost several days just trying to format a newsletter in to columns.
So, before saying the cycles are wasted:
Pick an everyday but semi-complex task that people do now. For example: for a homework project, go online, grab half a dozen graphics and ten blocks of text from those websites, and put them all into a stylishly laid out newsletter format. Do that on a P4, then do it on a DOS PC from 15 years ago.
See if matching the same quality of work doesn't take you 2^10 times as long on that old PC, assuming you can even do it at all.
Those cycles aren't wasted. Sure, we do the same basic tasks, but we do them with vastly more flexibility and don't have to waste days of our lives wrestling with configs to do what we now consider simple tasks. That's where the speed is.
The desktop files & folders paradigm is fine if marketing dweebs stop designing wizards that hide simplicity in a layer of complexity. What if I had a maid who said "I see you just set a piece of paper on your desk? Do you want me to file it for you? Great, I'll just shred this original while I'm at it, and you can conveniently ask me to find it whenever you need it!"
Example 1:
My dad plugs in his digital camera, and it displays a camera wizard. Great! It asks for the album name and places it in a convenient album with a nice slide-show.
The next day, he wants to edit one of the pictures, or copy it, or rename it. Too bad. Because it's now in a proprietary format in an album management program. The wizard was completely unnecessary. It would have been easier for him to create a folder and drag the files into it. It would have functioned in the normal way files and folders work. He would know where they are, and could open, email, rename, delete, etc.
Another example: My mom inserts a CD and Media Player asks her if she wants to rip the files to the media library. It even does a CDDB lookup and names the albums accordingly. Great! So where's that .MP3 file? No? Maybe it's a .WMV file? .OGG? .WAV? No... it's in the media library. And there it lies forever. You can't play it with anything else. Now I show her how to use CDEX, and click the CDDB button, then the RIP button, then whoa! And she can do whatever she wants with it.
Now I want to email that file. But I can't. Because it's not in a file on the file system, it's hiding in some "convenient" media library for me. And I want to view the pictures in the order the camera took them.
The article is trying to analyze what the bloat comes from. And no, it's not only about "unnecessary features". Maybe you should give it a read too. :-)
Beware: In C++, your friends can see your privates!
Yes and yes. Apple got religion a couple years ago and got on the profiling tools bus internally.
Mac users are demanding and impatient. All that typical slowness you see logging in, opening apps, closing windows, etc., with no feedback on XP makes Mac users want to pluck their eyes out.
You can come out with something quite elegant, like iPhoto 2, but if the performance isn't there that's all you're going to hear about. Mac users will whine incessantly until it's fixed.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Back in 2000 or so, I was trying to figure out why Word docs were so bloated, so I looked at them in a hex editor. I noticed a ton of NULLs in the document. So I wrote a simple C program to count the NULLs.
:)
Believe it or not, something like 60% of the document was NULLs.
So it's not really bloated, it's just full of nothing
"Bloat" is used by a bunch of people who don't know what it means. If the code is bloated (as in a bunch of code for features that aren't used), then the code is never accessed by the CPU. That means it doesn't take up cache and it doesn't get executed. At most, it will take up some of main memory and cause longer load times but after the application is running, code that isn't executed eventually gets removed from even main memory. Simply having more code for more features doesn't do much unless the code is actually getting executed.
I agree completely. I've done some programming for OS-9, and when we were creating some software libraries, we had to worry about things like program footprint size and memory allocation/deallocation. We were using a cross-compiler and doing development in C and C++. Something as simple as the order in which you declare the variables could make a noticeable difference in program size. Memory allocation and deallocation had to be done by the top level of the program. The support libraries had to be written to accept a memory block to use and how large it was. The last thing we wanted to do was use up the 4MB of RAM we had (which had to hold the OS, plus any programs you were running) by making large chunks of it unusable because it was first malloc()'ed, then free()'ed. We didn't want to rely on whatever garbage collection scheme existed being able to operate properly... assuming there even was one. (This was 1997.)
Of course, if you want speed, you have to learn to take advantage of the "short circuit" of && and ||. While nobody's really going to notice the several nanoseconds you might use up by doing !strncmp(str1, str2, n), when you process millions of rows from a database, it can make a big difference by not forcing a program pointer jump by saying
if (str1[0] == 'a' && !strncmp(str1, str2, n))...
The mindset we have now is a direct result of the prevailing attitude that memory is cheap and processors get faster. A friend of mine is no longer asked to interview prospective candidates because he would always ask questions about optimizing code and making it run faster. The candidates nearly always had the look of a deer caught in headlights, and these supposedly knowledgeable programmers (interviewees AND interviewers) couldn't answer these questions.
OCO is Loco
It would be interesting to graph system startup times year-by-year with then-standard distros running on then-standard hardware. I suspect start-up times haven't changed significantly since the 70's.
Does anyone here recall the famous if not accurate "Whoa, Win95 boots in under 3 seconds!!!" usenet thread?
Startup time is currently an area where the likes of Windows XP excels over Linux. On an Athlon 2600+, XP takes 6 seconds to boot (and become usable) whilst Fedora Core 3 takes closer to 90 seconds.
Yes, both use prelinking (or prefetch if you like), but Linux distros still don't load independent services in parallel, and I suspect Fedora's prelinking is far from optimized.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
Not as simple as that, which implies there's more code in there than necessary. What has happened is feature bloat. As an example, there's a well-known author who still does his work in WordPerfect 4.2. Why? Because it already does everything he needs.
Whenever I've been confronted with the latest install of Office (at work, I use notepad and wordpad at home) I usually spend an hour or so turning off all the damn irritating automatic features. The author makes an excellent point, too, that he has bent the computer to do what he needs, rather than having so much crap to deal with bending him to meet its requirements.
There's a lot of happy people out there with Pentiums and 486's who really, really don't want to upgrade, since everything is already the way they want it. With all the current vulnerabilities and threats, why change?
A feeling of having made the same mistake before: Deja Foobar
### Applications back then fly today, but seem like a small insect when it comes to functionality.
Well, a 10-year-old NeXTSTEP computer can do a lot of stuff that I still can't do with Gnome and KDE, and yet it is still faster at some things than today's computers. Same with Inkscape: I tried to play around with some of the stuff I did on a P90 with 24 MB RAM in CorelDraw years ago, and Inkscape turned out to have huge problems rendering it on a 1 GHz Athlon with 768 MB RAM; it was almost unusable. It's true that there are some things I can do today that would have been impossible some years ago (fullscreen video, today's games, etc.), yet there are many, many things that are basically exactly the same as years ago, only they haven't sped up at all or have even gotten slower, even though CPU speed itself has massively increased.
A little tip from when I wrote 8-bit embedded code...
Our C compiler had an output format that would list the C code and resulting assembly language intermixed. I wrote a quick little program that would read this, count the bytes of code per line, strip the assembly, and then just print out each line of C with the byte count at the beginning of the line.
This was easier to look over, and you could see if some C expression was really bloated. I'd then go and simplify the code.
For example, I've been disassembling this little project and I can tell that the source firmware has this mistake:
U8 a;
U32 b;
bad: b = b & a;
good: b = (U32) ((U8) b & a);
The bad way hard codes in a lot of "& 00" instructions for this little 8-bit processor, while more selective casting (while ugly) can overcome the problem. Repeat 10 times, and you'll save 200 bytes, or almost 3% of memory.
HIV Crosses Species Barrier... into Muppets
If you have the entire contents in memory, you can be assured of not skipping if there is contention for the disk. iTunes on the Mac is famous for not skipping no matter the system load. Guess why?
--- I do not moderate.
The parent was understandably modded a troll, but I have to say that I agree with the sentiment, if not the exact words. My recent experience with OSX and the iLife apps is exactly that: Apple writes software that is very slick and nice for the average user, but really limited and arguably even broken for the user with atypical or demanding needs.
My wife's new iBook is pretty, but if it were my iBook, it'd be running Linux by now.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
I could use resizable and rotatable vector fonts, create and import both bitmap and vector graphics, and load both a spellchecker and thesaurus in my copy of GeoWrite that ran under GeoWorks Ensemble 2.1 and MS-DOS 3.3 in 1991.
Not only that, but the program was well-designed enough to provide four different levels of UI complexity (allowing new users to use it without getting lost while expert users could enable all the features and even customize the toolbars), and the PC/GEOS environment itself provided multiple threads per process and preemptive multitasking but was fast enough to be considered "fast" on my 286 with 1MB of RAM and a VGA card.
The PC/GEOS folks got around the bloat because they were interested in doing so, and they were successful in almost all respects.
Modern coders seem a lot less interested in doing so, perhaps because so many of them take the bloat for granted. It wasn't always so, as many of us remember...
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
(1) with assembly language and a good macro-assembler, one can actually write fairly programmer-friendly code,
(2) non-PNP cards are often a lot easier to support when something goes wrong during hardware resource allocation (at least with IRQ and address jumpers you KNOW where the device thinks it should be), and
(3) text-mode is more portable than a GUI while still being easy to use if a good text-mode UI is also present (remember that drop-down menus and mouse support are not the exclusive domain of bitmapped environments).
My copy of OS/2 Warp 4 running at home on my 192MB PPro/200 box is capable of doing most of what you cite, and yet it seems at least as fast as my 512MB 2.4GHz P4 box at work running Windows XP. Both multitask, both run Firefox well, both play or rip MP3s in the background, and both can work with graphics. The OS/2 box is actually capable of running in higher res given my work monitor's limitations.
So where is the productivity gain, exactly...?
As for myself, I can tell you:
Computers really aren't that slow anymore. I used to stay on the bleeding edge in performance, and ever since the first Pentium III and Celerons arrived I have been happy running stuff that isn't the latest generation.
The key is running with enough RAM and a reasonably fast hard drive. Programs consume a lot more RAM these days (and operating systems subsequently spend more cycles managing it), and hard drives are still horribly slow relative to everything else.
That said, until recently I owned a PowerBook G4 1GHz/512MB, and on that box I usually had to wait for anything. Apple did a great job on the user interface, but there's a ~50ms latency pervading the entire GUI, and whoever implemented Finder and the SMB file browser deserves to be hacked to death by flocks of rabid zombie pigeons.
When I bought a Centris 650 in the early 90's, it was noticeably faster, so much faster that I brought it to work to show my boss, as I was sure he would not believe my stories of how fast it was.
This same thing has happened to me with every generation of PCs, too...it's not just a Mac thing. I buy a new machine, and marvel at how much faster it is.
Furthermore, I can go the other way to verify this. I still have my Centris 650 in storage, and booted it up a couple years ago. It was so slow that I could not believe that I ever found such a slow machine usable.
What is really going on is that it doesn't take us long to get used to a fast machine, and since we normally never go back, we don't realize just how much faster things are now.
Thanks for the link. Here's one from the jargon file about real programmers.
The Story of Mel
This was posted to Usenet by its author, Ed Nather
(nather@astro.as.utexas.edu), on May 21, 1983.
A recent article devoted to the macho side of programming
made the bald and unvarnished statement:
Real Programmers write in FORTRAN.
Maybe they do now,
in this decadent era of
Lite beer, hand calculators, and "user-friendly" software
but back in the Good Old Days,
when the term "software" sounded funny
and Real Computers were made out of drums and vacuum tubes,
Real Programmers wrote in machine code.
Not FORTRAN. Not RATFOR. Not, even, assembly language.
Machine Code.
Raw, unadorned, inscrutable hexadecimal numbers.
Directly.
Lest a whole new generation of programmers
grow up in ignorance of this glorious past,
I feel duty-bound to describe,
as best I can through the generation gap,
how a Real Programmer wrote code.
I'll call him Mel,
because that was his name.
I first met Mel when I went to work for Royal McBee Computer Corp.,
a now-defunct subsidiary of the typewriter company.
The firm manufactured the LGP-30,
a small, cheap (by the standards of the day)
drum-memory computer,
and had just started to manufacture
the RPC-4000, a much-improved,
bigger, better, faster -- drum-memory computer.
Cores cost too much,
and weren't here to stay, anyway.
(That's why you haven't heard of the company,
or the computer.)
I had been hired to write a FORTRAN compiler
for this new marvel and Mel was my guide to its wonders.
Mel didn't approve of compilers.
"If a program can't rewrite its own code",
he asked, "what good is it?"
Mel had written,
in hexadecimal,
the most popular computer program the company owned.
It ran on the LGP-30
and played blackjack with potential customers
at computer shows.
Its effect was always dramatic.
The LGP-30 booth was packed at every show,
and the IBM salesmen stood around
talking to each other.
Whether or not this actually sold computers
was a question we never discussed.
Mel's job was to re-write
the blackjack program for the RPC-4000.
(Port? What does that mean?)
The new computer had a one-plus-one
addressing scheme,
in which each machine instruction,
in addition to the operation code
and the address of the needed operand,
had a second address that indicated where, on the revolving drum,
the next instruction was located.
In modern parlance,
every single instruction was followed by a GO TO!
Put that in Pascal's pipe and smoke it.
Mel loved the RPC-4000
because he could optimize his code:
that is, locate instructions on the drum
so that just as one finished its job,
the next would be just arriving at the "read head"
and available for immediate execution.
There was a program to do that job,
an "optimizing assembler",
but Mel refused to use it.
"You never know where it's going to put things",
he explained, "so you'd have to use separate constants".
It was a long time before I understood that remark.
Since Mel knew the numerical value
of every operation code,
and assigned his own drum addresses,
every instruction he wrote could also be considered
a numerical constant.
He could pick up an earlier "add" instruction, say,
and multiply by it,
if it had the right numeric value.
His code was not easy for someone else to modify.
I compared Mel's hand-optimized programs
with the same code massaged by the optimizing assembler program,
and Mel's always ran faster.
That was beca
I want a new world. I think this one is broken.
Consider this...
I have a Kaypro II dual diskette computer. This machine sports a 1 or 2 MHz processor and a single-sided, single-density floppy.
When you turn the computer on... it takes 15 seconds and you are ready to type in WordStar. 15 seconds from the time you turn the computer on and you are ready to type.
Another machine... a 200 MHz laptop with Win 98...
Still loads up and is ready to run applications faster than my 2 GHz-plus machine that runs Win XP...
BLOAT BLOAT BLOAT
Features and functions that are never used etc...
Sign me programmer for 35 years...
That's one way of looking at it.
My former boss had another way of looking at it, after I talked him through some of the stuff I had done to optimize a program.
"If it takes someone other than you more time to figure out what you're doing, than a user will save in a day - then it's not worth doing. Because something will break, and you might not be around to pick up the pieces."
And I have come to think that he's right. I have on occasion made some fairly ingenious code optimizations, which took me way too long to figure out when I looked at them six months later. I knew that what I'd done was smart, really smart. I knew that on that portion of the program I'd shaved something like 15% off the run time (some weird ass calculations). I just wasn't sure what the hell it was that I was actually doing.
Sure, it ran faster than what I changed it to, but what I ended up revising it into would probably take me five minutes to figure out as opposed to more than an eight hour work day. The new version is about 5% slower than what I had earlier, which means about 2 minutes on a regular run through.
Yes, that's bloat compared to what I can do, but I don't really care.
We do not live in the 21st century. We live in the 20 second century.
- Renderer for scalable PostScript (Type 1) fonts
- Renderer for scalable TrueType fonts
- PDF file viewer or import filter
- PostScript file viewer or import filter
- Flash file viewer/filter
- SVG file viewer/filter
- Viewer/filter for a vector graphics format such as CGM or WMF
And yet, most of us use different, stand-alone apps for each of these functions (t1lib, FreeType, xpdf, GhostScript/GhostView, etc.), each written by a different person or group of people, each with their own 2D graphics engine, each with their own set of shared libraries or DLLs. I suppose some über-hacker could write an entire desktop application suite himself, achieving a level of code reuse that no multi-person team could, but that's not very likely...and desktop software will thus remain bloated.
Download these two programs, one C and the other C++, which supposedly do the same thing. Compare how long it takes to compile each one. Compare the sizes of the object files and executables.
Using GCC 3.2, the C++ code took about 10 times longer to compile than the C code, and section .text of the C++ object file was FIVE TIMES larger than section .text of the C object file.
And don't tell me memory is cheap, because CACHE memory is not cheap, and static RAM (used in some embedded systems) is not cheap, and $10 of extra memory per unit times 10,000,000 units shipped adds up to real money.
Uh, yeah. Have a look at that code again. The C version can only handle strings up to 99 chars long. Rewrite it to handle arbitrarily long input strings and then you'll have a real comparison.
I am tired of seeing this rolled out naively again and again. I like smooth fonts, multimedia support, device management, hires icons, a little bit of eye candy etc etc.
...
But it's still bloat! So, OK, you might like it, but you don't need it to get the job done, do you? You could manage files with a file manager that doesn't have SVG icons, and you could code or word process with an application that doesn't have sub-pixel smoothed fonts. Eye candy is eye candy: it appeals to the masses, but if your computer is only a means to an end (and hell, I know mine isn't half the time, but that's beside the point) then all that eye candy doesn't make the end job any better and it certainly doesn't make it get done any faster.
Personally, I can't stand bloated code - just because you've got the cycles, doesn't mean you have to waste them on inefficient crap. And that's probably why I use IceWM (which has more than enough eye candy for me) instead of KDE or GNOME, Nedit instead of emacs or vim, rxvt instead of Eterm, etc, etc. Mind you, I also reckon the article is based on looking at the past through rose-tinted glasses: I remember waiting minutes for a word processor to start up ten-fifteen years ago, and things seem to be a lot more snappy now even with all the modern bloat