Slashdot Mirror


Explaining Disappointing XScale Performance In Pocket PCs

JYD writes: "I found this new article on a Pocket PC web site where Microsoft talks about why XScale Pocket PCs aren't as fast as people thought they would be. Is it the OS? The CPU not supporting ARM4 properly? I wonder if the Linux port would run faster on 400 MHz ... or did Intel screw up the CPU?"

82 of 133 comments

  1. You think thats slow by brejc8 · · Score: 3, Interesting

    My group has been working on a synthesizable secure G3 card CPU and it will probably be the slowest ARM ever made.

    The CPU will be fully delay insensitive and asynchronous to stop power and clock glitch attacks.

    We are currently looking at 4 MHz on a 0.18 process.

  2. Cant find the link but by MrBandersnatch · · Score: 3, Interesting

    A review I read showed a 400MHz XScale performing at 50%-75% of the speed of a 206MHz StrongARM chip. I would be really interested in some non-OS-specific tests that showed whether or not the XScale offers any performance benefit whatsoever. I know that it is supposed to scale to 1GHz and has better battery life than the 206MHz ARMs, but if it NEEDS to run at 800MHz just to perform at the same level as its older sibling then it is a waste of space.

    1. Re:Cant find the link but by MrBandersnatch · · Score: 1

      Typical. found it

      I was off in my above assessment BTW. Snippet from the French :-

      A small video test under Pocket TV as a proof: Ipaq @ 206: 23 Fps Xscale @ 400: 19 Fps! Xscale @ 200: 14 Fps!!
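
      For anyone who wants to put those three numbers side by side, here is a rough sketch that only does arithmetic on the frame rates quoted above. The labels and the fps_per_mhz helper are made up for illustration. It is one video test on one device, so it says nothing definitive about the CPU itself.

```python
# Frame rates quoted above from the Pocket TV test: label -> (clock in MHz, fps).
results = {
    "iPAQ StrongARM @ 206": (206, 23),
    "XScale @ 400": (400, 19),
    "XScale @ 200": (200, 14),
}


def fps_per_mhz(mhz, fps):
    """Frames per second delivered per MHz of CPU clock."""
    return fps / mhz


baseline_mhz, baseline_fps = results["iPAQ StrongARM @ 206"]
for name, (mhz, fps) in results.items():
    relative = fps / baseline_fps
    print(f"{name:22s} {fps:3d} fps  {fps_per_mhz(mhz, fps):.3f} fps/MHz  "
          f"{relative:.0%} of the iPAQ")
```

      Worth noticing: going from 200 to 400 MHz on the XScale only moves it from 14 to 19 fps, which hints that something other than the core clock is the limit in this test.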

    2. Re:Cant find the link but by Chanc_Gorkon · · Score: 5, Interesting

      Well I found it and the performance is NOT 50-75 percent slower than an iPaq. From the numbers on pocketnow.com, the Toshiba e740 is actually ahead in most categories with the exception of graphics. There's the real kicker. I don't think it's the Xscale so much as it's the ATI Imageon graphics chip in it. This is also a new chip, and as the benchmarks prove, its driver has a problem, or so it would seem. I actually heard that it's kind of operating in an emulation mode of sorts (kind of like standard SVGA on a desktop). ATI should provide driver code to Toshiba and it can then be fixed in a flash. I have an e740 and love it so much. The Xscale is a nice chip and will indeed improve in performance as it's flashed up, but in my book, the other features are worth more. The wireless works well, the dual slots are a godsend (WE DON'T NEED NO STEEEKIN SLED! ;) ) and the price is GREAT for what you get. All in all, I would buy another one or an updated one (like the Toshiba e550 coming out soon!). One thing I am looking for is the availability of the 3000 mAh high cap battery. The standard is fine for day to day use, but when you use the wireless a lot you hear a giant sucking sound coming from the battery. The other accessory I would look for is the 99 buck adapter that goes on the bottom. You add that and you can attach a USB keyboard and also drive an SVGA monitor or a projector with it and have your handheld run your PowerPoint stuff on the road.

      --

      Gorkman

    3. Re:Cant find the link but by MrBandersnatch · · Score: 1

      Thanks for the link - you're 100% correct. I'm MUCH better informed now :) I really hope Tosh improve the graphics performance - it's such a nice machine apart from that.

    4. Re:Cant find the link but by ceallaigh · · Score: 1

      I have also seen poor performance by XScale compared to: 1- StrongARM 2- ARM920T ( Samsung ) 3- ARM7TDMI ( Samsung, ATMEL, etc ) It would appear that in implementing the XScale design they have broken rules with standard ARM design and made their product perform very poorly with software written for other ARM processors. Sounds like Itanium running Win32 code.

    5. Re:Cant find the link but by cbcbcb · · Score: 2, Informative
      The Xscale has a 10 stage pipeline, compared to the 3 stage pipeline in an ARM7TDMI, and 5 stage pipelines in the StrongARM and ARM920T. The main problem is the much greater load result delay on the Xscale pipeline (3 cycles(?), compared to 1 cycle on the ARM920T), which means that the instruction ordering needs to be significantly different to generate optimal Xscale code. A short code segment should clarify this:
      # cycles            ARM7TDMI   ARM920T   Xscale(guess)
      LDR r0, label1         3          1          1
      LDR r1, label2         3          1          1
      ... interlocks         0          1          3
      ADD r2,r0,r1           1          1          1
      total                  7          4          6
      On this trivial piece of code the Xscale needs 50% more cycles than the ARM920T (6 versus 4), so it is that much slower at the same clock speed. However, this effect would not make a 400MHz Xscale slower than a 206MHz StrongARM by itself.
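      To make that last point concrete, here is a small sketch that totals the columns of the table and converts cycles to wall-clock time. The cycle counts are the ones given above (the Xscale column is still a guess), and the helper names are invented for illustration. One loud assumption: the 206MHz StrongARM is modelled with the ARM920T column, purely because both are 5 stage pipelines.

```python
# Cycle counts copied from the table above; the Xscale column is a guess,
# not a measured figure.
CYCLES = {
    "ARM7TDMI": {"LDR r0": 3, "LDR r1": 3, "interlock": 0, "ADD": 1},
    "ARM920T":  {"LDR r0": 1, "LDR r1": 1, "interlock": 1, "ADD": 1},
    "Xscale":   {"LDR r0": 1, "LDR r1": 1, "interlock": 3, "ADD": 1},
}


def total_cycles(core):
    """Sum the per-instruction cycle counts for one core."""
    return sum(CYCLES[core].values())


def elapsed_ns(core, mhz):
    """Wall-clock time for the snippet at the given clock, in nanoseconds."""
    return total_cycles(core) * 1000.0 / mhz


# ASSUMPTION: the StrongARM is approximated by the ARM920T column.
xscale_ns = elapsed_ns("Xscale", 400)
strongarm_ns = elapsed_ns("ARM920T", 206)

print(f"Xscale @ 400MHz:               {xscale_ns:.1f} ns")
print(f"StrongARM (as ARM920T) @ 206:  {strongarm_ns:.1f} ns")
print(f"Cycle ratio Xscale/ARM920T:    "
      f"{total_cycles('Xscale') / total_cycles('ARM920T'):.2f}")
```

      Under those assumptions the 400MHz part still finishes the snippet first, so the load delay alone cannot explain the benchmark results.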
  3. That's not talk, that's regurgitation by marxmarv · · Score: 2
    The tool from MS has marketspeak in place of information except for "We decided to support V4, we didn't bother to retarget our compilers to V5 for our traitorous former buddies at Intel, and if XScale's V4 compat is weak, that's their problem." It is, in fact, but that doesn't make MSFT's laziness any less lame.

    On the other hand, Intel often gives little thought to enhancing performance of old code on new processors. If memory serves me right, Intel's Pentium Pro ran 16-bit code embarrassingly slowly.

    -jhp

    --
    /. -- the Free Republic of technology.
    1. Re:That's not talk, that's regurgitation by imroy · · Score: 1
      On the other hand, Intel often gives little thought to enhancing performance of old code on new processors. If memory serves me right, Intel's Pentium Pro ran 16-bit code embarrassingly slowly.

      No, Intel gave a lot of thought to that. It takes several years to develop a complex CPU like the Pentium family. They just thought that Micro$oft would have a 32-bit operating system out by the time the PPro was released. Oops! Windows 95 wasn't completely 32-bit despite all the "32-bit" marketing and hype. And it was so new that everyone was still running a lot of 16-bit Windows 3.x software. Of course, performance was much better if you were running a Real Operating System ;)

    2. Re:That's not talk, that's regurgitation by marxmarv · · Score: 2
      No, Intel gave a lot of thought to that. It takes several years to develop a complex CPU like the pentium family.
      It only takes a year or two to develop a relatively simple CPU like ARM or MIPS. RISC designs tend to be far more straightforward and simple. Many computer engineering students implement the MIPS architecture as an exercise. See the Hennessy book (Computer Architecture: A Quantitative Approach, 2nd ed.) to get an idea of just how simple a processor like ARM should be.

      Besides, the much-vaunted new feature of the PPro was the CISC->RISC translator, and it shouldn't take much to rejig that to handle 16-bit mode more effectively if the market (asses that they are) demands it.

      -jhp

      --
      /. -- the Free Republic of technology.
    3. Re:That's not talk, that's regurgitation by imroy · · Score: 1
      RISC designs tend to be far more straightforward and simple. Many computer engineering students implement the MIPS architecture as an exercise.

      Yep. Been there, done that. Well, almost. Computer Architecture was probably my favourite class in Uni. We didn't implement it ourselves, but that was about 90% of the class lectures and notes. Starting with simple logic gates, we went through how to build registers, latches, ALUs, register files, all the way to pipelining. Fascinating stuff if you can stick through it all and have a great lecturer! It really gives you an appreciation of how the stuff works.

      Besides, the much-vaunted new feature of the PPro was the CISC->RISC translator, and it shouldn't take much to rejig that to handle 16-bit mode more effectively if the market (asses that they are) demands it.

      I didn't know that. All I knew was that Intel was betting on M$ migrating everyone over to 32-bit software by the time it was released to market. Considering their close deals in the past, I'm sure this was based on information that M$ had given them.

      To end this post on a non-anti-MS note, the CISC-RISC converter is software upgradeable. Recent Linux kernels provide a /dev/microcode device so that you can feed it a file (presumably) supplied by Intel. See http://www.urbanmyth.org/microcode/ for more information.

  4. Re:Judging by modern Linux DEs.... by torndorff · · Score: 2, Insightful

    Actually, you're right. Sorta.

    If you take Linux (source based, optimized for the CPU) and a modern window manager like Enlightenment (if you think it's not modern, prove it in a reply) with a preemptive kernel and put it on a Celeron ~500MHz with 128MB of PC100 SDRAM, it WILL BEAT Windows 98 in speed, although it is different.

    I just don't see how people assume KDE and Gnome are "modern" because they resemble Windows. Is that the trend?

  5. Amulet cores by brejc8 · · Score: 5, Interesting

    The Amulet group has been working for years to make low power yet high speed asynchronous ARM processors.
    The Amulet 3 runs at 120 MHz and consumes very little power. Most of all it's asynchronous, so when you don't have much processing to do it just sits there consuming "no" power.

    They take a hell of a beating and still run. I connected one to a hamster wheel and you can see it here running despite the power fluctuating madly.

    The only reason it only goes at 120MHz is because the memory isn't fast enough.

    It's a little strange that only three ARM production licenses were given out. One to Intel, one to Motorola and one to the Amulet group.

    1. Re:Amulet cores by dpp · · Score: 1
      The Amulet 3 runs at 120 MHz and consumes very little power. Most of all it's asynchronous, so when you don't have much processing to do it just sits there consuming "no" power.

      When you say 120 MHz do you mean that it has the equivalent performance of a 120 MHz ARM? I'd thought that an asynchronous chip didn't have a clock speed as such.

      --
      This post is strictly my own opinion and not necessarily that of my employer.
    2. Re:Amulet cores by brejc8 · · Score: 2

      Yeah you're correct. I meant to say MIPS.
      Which is more than a non superscalar part at 120MHz could do.

    3. Re:Amulet cores by UranusReallyHertz · · Score: 1

      Um, I thought the whole point of asynch chips was to eliminate the need for a power hungry clock. If this chip is truly asynch, how can you say it runs at 120MHz?

      --
      Smoking is an expensive, slow, and unreliable method of suicide.
    4. Re:Amulet cores by dpp · · Score: 2

      Aha... thanks :-)

      --
      This post is strictly my own opinion and not necessarily that of my employer.
  6. Before prime-time by l33t-gu3lph1t3 · · Score: 1

    Intel's Xscale architecture -should- have impressive performance. However, it seems that circumstances have conspired to keep it from showing its potential. In order to retain binary backwards compatibility, Microsoft has kept their software compiler rather basic, ensuring that it will work for multiple architectures. This lack of optimization also means that any architectural improvements Xscale has over Arm V4 or whatever it's called will not mean anything. Hopefully the history of Xscale will work like netburst architecture's history, where about 1 year after inception, software that makes use of its architecture efficiently (like SSE2 with the P4) will start to appear.

    --
    ------- "From bored to fanboy in 3.8 asian girls" ----------
    1. Re:Before prime-time by cruelworld · · Score: 1

      I wonder if this is Microsoft's punishment of Intel in response to Intel promoting Linux.

    2. Re:Before prime-time by l33t-gu3lph1t3 · · Score: 1

      What punishment? Microsoft can _not_ "punish" Intel. Who else can they use that has fabrication capacity even close to Intel's? No one.

      --
      ------- "From bored to fanboy in 3.8 asian girls" ----------
  7. Stranding Users... by Anonymous Coward · · Score: 5, Interesting
    From the article: "We're not prepared to strand an installed base of over 2 million iPAQ users."

    Umm... right, that's why my PocketPC 2000 Cassiopeia E115 is now as useful as a doorstop as it has a MIPS chip in it.

    When I got my PocketPC, MS touted that 'software matters' - even in their publicity. Suddenly, they ditch all the SH3 and MIPS users and just support ARM in PocketPC 2002. Not only that, but they won't release applications like Terminal Services and Messenger for the older machines. I see a lot of people saying that this is because PocketPC 2002 is based on CE.NET - that's not correct. PocketPC 2002 is just another revamp of PocketPC 2000, and both are based on CE 3.0. So when it all boils down, it's just Microsoft playing marketing tricks. Net result of their decision - my £450 PDA became obsolete in 18 months.

    I now own a Palm.

    1. Re:Stranding Users... by Cardbox · · Score: 2, Interesting

      And that's why, as a software developer, I daren't target PocketPC. When I ported my application from PalmOS 3 to PalmOS 5 (the ARM one), I had to change ONE LINE of code (and the program remained backwards-compatible).

      If you ask the users, the current installed base of PocketPC systems is as follows:
      PocketPC 2002 - 1%
      PocketPC 2000 - 0.5%
      PocketPC "I don't know what version" - 98.5%

      We can target either of the first two quite easily, but the last operating system in the list has no programs that are compatible with it.

    2. Re:Stranding Users... by Xtifr · · Score: 3, Funny

      From the article: "We're not prepared to strand an installed base of over 2 million iPAQ users."

      Umm... right, that's why my PocketPC 2000 Cassiopeia E115 is now as useful as a doorstop as it has a MIPS chip in it.

      Sorry, there were only 1,999,999 users of that specific system, so it was below our threshold. :)

  8. Pocket PC hw spec lockdown by Qrlx · · Score: 3, Insightful

    Pocket PCs aren't as fast as people thought they would be. Is it the OS?

    It could be the OS, which is the obvious answer since it's a Microsoft OS, and this is Slashdot. But I don't know. I've never tried running anything other than PocketPC OS on the iPaq, and probably never will. (It's a work thing.)

    How did Microsoft become so popular? It was DOS, wasn't it? The program that ran on any x86 computer. Well, Microsoft should take a page from their previous success and allow a little more flexibility in PocketPC design. The main gripe that I and everyone else has about these gizmos is that they're locked into a 240 by 320 by 16-bit color display. That's lame, especially if one of the highlights of PocketPC is how easy it is to port your Win32 app. If you have to redesign all the screens to fit in a tiny-ass space, it's easy on the coders but hell on the systems analysts.

    It looks to me like Palm have a much more open approach, they are using the same tactic that established Microsoft's dominance with DOS back in the 80s. You can get that new Sony Clie' with TWICE the screen real estate (as in pixels) of ANY PocketPC available. Kind of a no-brainer if you ask me.

    Off to the solstice parade!

    1. Re:Pocket PC hw spec lockdown by RevAaron · · Score: 1, Troll

      I believe the spec for PocketPC calls for a 320x240 screen. However, you can get WinCE machines that run at any resolution, far beyond any silly PalmOS device.

      Sure, you can get a Sony Clie with a 480x320 screen. But why would you want to, you'd have to put up with that sad excuse of an OS. Makes a pretty kickass (and damned expensive) organizer, I'm sure. What good is a 480x320 screen if it's about the same size (in cm by cm) as the other options *and* there's no real handwriting recognition?

      Personally, I still carry around a 5 year old Newton 2100u most of the time. I have an iPAQ as well, for development mostly. It has a 480x320 (lower DPI than the Clie, which means more space to write!) screen and a 162 MHz StrongARM. And a real OS, with the facilities to develop first-class apps while never touching a desktop. Seems like a no-brainer to me too. But we're on Slashdot, and it doesn't run Linux, so I'll just kick back and wait for the onslaught of "h3y j00 st00p1d mac lover fsck u && UR pee-pee-pda!!!1"

      Have fun at your parade! :)

      --

      Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
    2. Re:Pocket PC hw spec lockdown by Chanc_Gorkon · · Score: 2

      Screw you man! I don't want them any bigger. Sure a nice big screen would be nice, but in something that's supposed to fit in your pocket, then the current specs work. The only way I would want a bigger screen would be if it was virtual (project on your wall, eye, whatever....). Now I WOULD like a 15 inch wireless web pad for at home. But I guess that'll wait for the mira. Also, that clie with twice the real estate is a heck of a lot bigger too. I like the 240x320, although I would not mind more pixels per inch. That would make things much sharper and clearer.

      --

      Gorkman

    3. Re:Pocket PC hw spec lockdown by aussersterne · · Score: 5, Interesting

      No onslaught here.

      The Newton 2100 kicks ass. I used Palm and Windows CE before finally trying out a Newton 2x00 series. The Newton made me swoon.

      It's the best damn computing device out there, PC, PDA, or otherwise. I used to do my e-mail, my diary-keeping, my word processing, etc. on my PC in Linux, but now I even write my books and do 90% of my e-mailing on my Newton 2100 directly over ethernet. I read news on it, make travel plans on it, I have my household inventory on it (in Notion)... and I read BBC World News and Slashdot on it in Newt's Cape.

      The PC only gets touched every few days. The Palm and CE devices are long gone. I only regret that Apple killed the Newton, so there won't be a color version. :(

      --
      STOP . AMERICA . NOW
    4. Re:Pocket PC hw spec lockdown by highvista63 · · Score: 1

      > Sure, you can get a Sony Clie with a 480x320 screen. But why would you want to, you'd have to put up with that sad excuse of an OS. Makes a pretty kickass (and damned expensive) organizer, I'm sure. What good is a 480x320 screen if it's about the same size (in cm by cm) as the other options *and* there's no real handwriting recognition?

      Um, maybe for viewing extremely clear and vibrant JPG's, watching widescreen movies, reading eBooks that don't strain the eyes, etc, etc. The Clie NR70 is just lovely, IMHO.

    5. Re:Pocket PC hw spec lockdown by RevAaron · · Score: 2

      Again, that's what you may use a PDA for. I don't really need a PDA for looking at clear and vibrant JPGs. Porn in the toilet is of the utmost quality, I'm sure. But it's not my bag. Never have eye strain with the Newton or my iPAQ, the black on green is of high enough contrast for me. I'm sure the screen is nice, but it doesn't provide anything for me that isn't good enough elsewhere.

      The keyboard is sure badass, though. A remarkable piece of industrial design.

      --

      Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
  9. Synopsis of "interview' by brooks_talley · · Score: 5, Funny

    Q: What could possibly have gone wrong?

    A: While we acknowledge that some people's perception is of something having gone wrong, we believe that any wrongness is unavoidable.

    Q: Well, some analysts say it's Intel's fault

    A: We have implemented what we could implement, and don't believe there is any implementable implementation that would implement significant gains.

    Q: Analysts also say it will be 2004 before the issue is fixed

    A: It is too early to talk about 2004. That said, we are committed to delivering a good product.

    Q: This is really bad news for the Pocket PC platform

    A: Yes, it is. However, fortunately the issue is so small that this really isn't bad news for the Pocket PC platform.

    Cheers
    -b

    1. Re:Synopsis of "interview' by Moosifer · · Score: 3, Funny

      I'm the VP of Marketing for a large Internet company whose name I cannot disclose in a public forum. I'd like to offer you a director's position in our marketing department. Name your price. Can you start on Monday?

  10. Bet they're focussing on battery life by NigelJohnstone · · Score: 1

    They've probably used aggressive power saving on the chip to save every electron but at the expense of performance.

    That's not such a bad thing, most of these things run address books and sync to email. The battery is the real problem with them, not the fact that they can't encode video streams!

    Sure they'll get a few complaints, but nothing like the slating they've been getting for the battery life problem.

  11. Re:Judging by modern Linux DEs.... by Anonymous Coward · · Score: 1, Informative

    I can fully attest to that. I happen to have a 128 MB Celeron 500MHz box sitting right next to me (no lie). I had Windows 98SE on it for a few days, and it was dog slow, unusably so. I put Mandrake on there (with KDE 2.something) and it flew. I was very surprised how fast it was. It's even faster now that I moved to WindowMaker.

  12. that deserves a... by g4dget · · Score: 2
    SUWANJINDAR (of Microsoft): "Our software remains the same. This is the same Pocket PC 2002 software that performs fabulously across other ARM processors (StrongARM 1110, OMAP710, etc).

    Well, that statement clearly deserves a +5 Funny.

  13. Markedroid Gobbledygook by timeOday · · Score: 1
    THOUGHTS: Early reports based on those who own the Toshiba e740 Pocket PC 2002 device are telling us that XScale at 400 MHz performs slower than a StrongARM at 206 MHz on some tasks. This came as a surprise to many people.
    SUWANJINDAR: "We are aware that PXA250 (XScale)-based devices are not demonstrating the huge performance gains that were anticipated. That said, Pocket PCs continue to offer the best performance and the richest functionality vs. other handhelds on the market today."
    THOUGHTS: Some of those same analysts have said it will be 2004 until there's an OS that can use the XScale CPU properly. Is that an accurate estimate?
    SUWANJINDAR: "It's too early to talk about the next version of our software. That said, we're committed to delivering best-in-class functionality and performance while providing a foundation that enables our developer community to continue to innovate and build successful businesses on our platform.
    I find this sort of talk simply insulting. Couldn't he even find some irrelevant facts to spin in support of his cause? How stupid do they think we are?
    1. Re:Markedroid Gobbledygook by morgajel · · Score: 2

      That depends on who you mean by "How stupid do they think we are?"

      Are you talking about your average consumer, or are you talking about the average Slashdot reader?

      (After all, how many Slashdotters would buy this product anyway?)

      --
      Looking for Book Reviews? Check out Literary Escapism.
  14. Re:Strange by Verizon+Guy · · Score: 1

    That may be true, but I'm just telling it like it is. KDE on BSD and Linux was sooo much slower than either Windows 98 or 2000. Trying to run Konqueror was like pulling teeth. I actually got better performance running XDM and a virtual console through Exceed than if I sat there at the actual X console and tried to start up KDE.

    --

    Aw, fuck it. Let's go bowling. - The Big Lebowski

  15. It's the OS by waytoomuchcoffee · · Score: 2

    MS admits in the linked article that the OS is not "optimized". It fails to use the new ARM instruction set, and worse, does not seem to use the power-management capabilities of the XScale. Supposedly the Xscale uses half the power of the StrongARM, but battery tests on the new PPCs do not show this savings. This fix will be a while coming, as the next version of the OS does not appear to be optimized either.

    Interestingly, Asus in their upcoming Xscale PPC is coming up with workarounds, such as on the fly automatic clock and voltage throttling. So while the Xscale supports capabilities that MS is not using, the vendors are not waiting until next year for MS to get their act together.

    Hopefully the vendors will also figure out a way to speed up the terrible benchmarks of the Xscale PPCs.

    1. Re:It's the OS by Chanc_Gorkon · · Score: 2

      Terrible? It beats the iPaq in every area except for memory moves and graphics. Memory moves are not that much slower either. Are we quibbling about mere microseconds?? I believe the problem could be the OS but may be more likely to be with the ATI chip. Rumor has it the driver has not truly been released yet.

      --

      Gorkman

    2. Re:It's the OS by waytoomuchcoffee · · Score: 1

      Considering that it's running at 400MHz compared to the 206 of the iPaq, then yes, those benchmarks are terrible.

      Bonus *off* for replies.

    3. Re:It's the OS by Chanc_Gorkon · · Score: 1

      For a category that has NEW silicon running it! Driver problems can create lots of issues. The Imageon 100 graphics processor they are using in the Toshiba is brand new. My point is, are we expecting too much out of a device that could have been rushed? I mean how fast do you need your contact list to come up?? As long as it plays MP3's fine, it works good for me. Also, this is nothing like an x86 chip. It has all kinds of optimizations for power concerns and is also a RISC processor I believe. It's too soon to create any verdict on the Xscale. There's only ONE model out now (in the US....Japan has the really cool new Genio model that won't be here for a while). Also, HP iPaqs are due out soon as well. Palm is also going to Xscale too. The Xscale is a good processor, it's just too soon to issue a sucks/doesn't suck verdict on it right now. Personally, I like my e740 very much and would NEVER step DOWN to an iPaq. Besides I hate them sleds anyway (why can't they include a CF sled instead of the useless basic sled).

      --

      Gorkman

    4. Re:It's the OS by waytoomuchcoffee · · Score: 2

      *** For a category that has NEW silicon running it!***

      Um, yes, it is new silicon. That doesn't make the benchmarks any better. It's new silicon with terrible benchmarks. What is your point? Saying it will get better later doesn't make it good now.

      And while you only care about MPEGs, some people care about performance and battery life. Some people run apps that use a bit more processing power than "contact lists", even if you don't.

      Bonus *off* for replies.

    5. Re:It's the OS by xswl0931 · · Score: 2, Interesting

      I believe the point is that the benchmarks were not *terrible*. Terrible implies that the new processor is = the old 206 MHz one, which certainly isn't the case. The problem is that people expect a 400MHz processor to give about a 2x performance increase over the old 206MHz one, but for a bunch of geeks, it's amazing how easily you people forget that 1MHz != 1MHz when you're comparing different processors (even if they both use the same instruction set), such as AMD vs. Intel. Speaking of Intel, remember that Intel realized that common folk are easily sold on a high MHz number rather than instructions processed per clock cycle, so Intel obviously designed their processors (Pentium family and StrongARM family) to scale up to high MHz.

    6. Re:It's the OS by Chanc_Gorkon · · Score: 2

      THANK YOU! Exactly my point. I do believe it SHOULD be better, but there are SO many new things in the Toshiba e740 that it just was not possible to get everything right software wise. The Xscale, the Imageon, the integrated wireless all make for one really really small and very complex device. The firmware is BOUND to have problems. This much should be a given. All things said, I have yet to see the problems people report with things like wireless (been pretty flawless here) and video. There are annoying things like the buttons taking a vacation after using the wireless sometimes, a weird block in the messenger program command bar, weirdness with the notes app when storing notes on the SD card instead of memory, the ANNOYING "feature" of enabling the radio when doing a soft reset (I WANT TO LEAVE IT OFF IF IT WAS OFF DAMMIT!) and others. I have only had it for a week so it's hard to say if that's all the bugs, but those are the ones I notice. Personally, for a desktop, I would rather have a 450 MHz SPARC or RS/6000 instead of what I have. At least both Sun and IBM run more QA stuff than the PC manufacturers do! That 450 MHz RISC machine would run rings around any PC I could get (with the exception of an Itanium, properly configured...). RISC machines are proof that doubling MHz will not necessarily double performance.

      --

      Gorkman

  16. Good Application for PR Rating by timeOday · · Score: 2, Funny
    Introducing the Toshiba e740 Pocket PC 2002, now with a Genuine Intel XScale PR 166+* processor!

    *Actual clock speed 400 MHz

  17. Re:Seems obvious, bus speed & not enough cache by Anonymous Coward · · Score: 2, Informative

    I didn't mean the 3MHz drop hurts you a lot, I meant that if you compare the clock to bus speed ratio you're looking at a 50% reduction in bus speed compared to CPU clock rate.

    If there is not enough cache memory, increasing processor clock speed will not have a positive effect on performance because the real effective clock rate will be bound by how fast the processor can fetch data from main memory.
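
    A toy model of that argument, for what it's worth. Everything numeric here is hypothetical (the 100MHz bus, the 10 bus cycles per miss, the miss rates) and the function names are made up. It only illustrates why a fixed bus caps the gain from a faster core, and it is not a model of the actual PXA250 memory system.

```python
def time_per_instruction_ns(cpu_mhz, bus_mhz, miss_rate,
                            core_cycles=1.0, bus_cycles_per_miss=10.0):
    """Average time per instruction: core time plus time stalled on memory.

    core_cycles and bus_cycles_per_miss are hypothetical round numbers.
    """
    core_ns = core_cycles * 1000.0 / cpu_mhz
    memory_ns = miss_rate * bus_cycles_per_miss * 1000.0 / bus_mhz
    return core_ns + memory_ns


def speedup(old_cpu_mhz, new_cpu_mhz, bus_mhz, miss_rate):
    """How much faster the new core clock is when the bus stays the same."""
    return (time_per_instruction_ns(old_cpu_mhz, bus_mhz, miss_rate)
            / time_per_instruction_ns(new_cpu_mhz, bus_mhz, miss_rate))


# Doubling the core clock from 200 to 400 MHz on an unchanged 100 MHz bus.
for miss_rate in (0.0, 0.05, 0.20):
    print(f"miss rate {miss_rate:.0%}: "
          f"{speedup(200, 400, 100, miss_rate):.2f}x faster")
```

    With no cache misses the doubled clock gives the full 2x. As the miss rate climbs, the memory term dominates and the speedup drifts toward 1x.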

  18. Backward compatibility == poor excuse by starling · · Score: 1

    MS : "Moving to ARM V5 would break upgrade compatibility."

    Translation : "We can't or won't write portable code."

    There's absolutely no technical reason they can't take advantage of the V5 enhancements while still retaining support for ARM V4 and a common code base. This must have been a business decision, but I can't fathom the thought processes which led to it.

    1. Re:Backward compatibility == poor excuse by starling · · Score: 1

      Did you ever consider MS's other responses at face value, i.e., they did not recompile for V5 because it would render incompatible all their partners' and clients' applications?

      Yes I did, and they don't hold any water. I suggest you read your response above and consider why it doesn't make any sense (hint - the OS is not an application). Like I said, this has to have been a business decision.

      Having said that, your point about MS not liking multiplatform support is spot on. MS has some very competent programmers - that's not at issue. The problem is at the corporate decision making level.

      As to solving all their problems with a tarted up p-code system, well, if that's going to work at any sort of acceptable speed then they'll *really* need to optimise for the processor.

  19. Re:Judging by modern Linux DEs.... by RevAaron · · Score: 1

    You've not noticed that yet? GNOME and KDE are not supposed to be "modern," they're supposed to be like Windows, so people used to Windows can switch. They're being largely designed and coded by ex-Windows users. And that's fine; if you like the way Windows works, but want a free version of it, more power to you.

    It's hard to build a modern computing environment on top of a non-modern operating system (Unix, Linux). Again, GNU/Linux *not* being modern isn't a dis, it's just the way things are. It works. I use it on my machine, largely as a host for a more modern OS (but far more spartan at this point). And no, I'm not talking about Mac OS X being some epitome of "modern," it's based on all the same things. Unix is so.... 50s in its ideology.

    And before a bunch of escaped slashtards flame me, I repeat, THIS IS NOT A BAD THING. If you like Linux, GNOME or KDE, use it. Some of us, however, are not satisfied with the limits they impose on the way we work and are trying to forge something new.

    Enlightenment isn't any more modern than GNOME or KDE. But it may be a little more fun. :)

    --

    Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
  20. I might add..... by Chanc_Gorkon · · Score: 4, Interesting

    This complaint was also based on the FIRST XScale PDA to EVER be released. Sure there's GOING to be problems. The iPaq started off with similar issues, but you don't hear anyone talking about it now do ya? There are a lot of reasons that add up to create the total performance picture. Maybe Toshiba used cheaper internal RAM? Maybe they need more memory for video (I think it has like 256 K maybe?? I don't know but I know it has dedicated video RAM). The point is the performance of ONE XScale based PocketPC does not predict how the others will perform. Also, as these are flashable, we can expect even the Toshiba to get better performance as flash updates are made available.

    --

    Gorkman

  21. I'm not suprised by af_robot · · Score: 1

    This fits perfectly in Intel's "Megahertz sells" paradigm.
    Just push clockspeed up at any cost - who cares about performance? It's already running windows - so what can you expect?!

    1. Re:I'm not suprised by cheezedawg · · Score: 1

      Just push clockspeed up at any cost - who cares about performance?

      What? It's true that clock speed alone isn't a valid measure, but it is certainly a very important part of the equation. Intel's "Megahertz sells" paradigm is turning out to be a pretty effective strategy. They have been able to jack up their clock speeds almost at will (2.533 GHz on the P4 now with a clear path to 3.3 GHz by the end of the year). On the other hand, AMD has been struggling all year to speed up their processors. The result is that Intel is leaving AMD in the dust. As Tom's Hardware puts it, "the Athlon design is already a bit outdated and is now reaching its limits."

      The consensus is that performance problems with the new XScale platform are because of poor software - not because of flaws in the hardware.

      --
      "The defense of freedom requires the advance of freedom" - George W Bush
  22. Re:Another case of inflated MHz not paying off? by _Marvin_ · · Score: 2, Interesting

    OK, so I should have read the story before posting...
    It's simply Intel moving to a new instruction set (ARM V5) and building a (slow) emulation of the old one (ARM V4), and Microsoft says it would be horribly difficult to support two different instruction sets, so the choice was to either live with the new CPU performing slower than the old one or cut off support for the old hardware.

    Hmmmmm, yet another thing (like the OS modularity) that MS seems to be unable to do, while my Gentoo Linux is doing it by default. The source code to their products has to be a complete and utter mess if they can't even get it to take advantage of new instruction sets without dropping compatibility.

    --
    "We won't use guns, we won't use bombs, we'll use the one thing we've got more of and that's our minds" - Pulp
  23. Re:Windows CE, ugh. by nesthigh · · Score: 1
    I don't know about the mobilon, but you can install NetBSD on the workpad z50.

    next

  24. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  25. Comment removed by account_deleted · · Score: 3, Informative

    Comment removed based on user account deletion

  26. Re:Windows CE, ugh. by JebusIsLord · · Score: 2, Informative

    *sigh* No, Pocket PC 2002 is NOT the new name for WinCE; Pocket PC 2002 and 2000 are implementations of Windows CE 3.0. Windows CE is an embedded realtime OS that can be used for all sorts of things, PDAs being one of them. Think of Pocket PC 2002 as a "distro" of CE 3.0 with a special Pocket PC GUI installed.

    --
    Jeremy
  27. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  28. Re:It's the OS and the Compiler by ceallaigh · · Score: 2, Informative

    Without a compiler that has optimizations for the XScale, you will still get poor performance. So all the tweaks in the world to your existing code base will be for nought without a corresponding change in the compiler which is targeted for ARM7/9 cores and has only basic support for XScale.

  29. Is this a PDA? by Felinoid · · Score: 1

    When shopping for a PDA I don't remember speed meaning anything but "costs too much".

    People buy PDAs for cheap laptops or simple organisers. Neither needs speed.

    Pocket PCs can get faster and faster while Palm OS PDAs outsell them.

    The Palm OS devices are cheaper and use less power.
    This is because they are slower.

    It's not speed... it's memory....
    Handspring Visors have memory cartridges and the Palm m500 uses media cards, so while Pocket PC devices chase added speed and don't get it, Palm OS devices get added memory.

    These things are just portable databanks; they aren't for processing information, just storing it.

    Want to play MP3s? Slap on an MP3 player... a sound chip that has an MP3 encoder built in and some added RAM.

    Want to do presentations? Slap on a presentation device.

    Go on the Internet? Snap on a wireless... (Unless it's built in)

    Play Quake? Compile data?

    Hotsync with desktop...

    I'm looking for a keyboard and a wireless for my Visor (the i705 can't handle telnet) so I can use a shell account from my PDA...
    I'm not going to have any real computing power on a PDA. That's not what a PDA is for.

    --
    I don't actually exist.
    1. Re:Is this a PDA? by steveha · · Score: 2
      People buy PDAs for cheap laptops or simple organisers. Neither needs speed.

      Wrong.

      More right than wrong. I know that the PocketPC cannot currently do anything I need that a Palm PDA cannot do.

      Once PDAs are available with much much faster processors and tons of RAM, people will find new uses for them. But as things are today, given a choice between the battery life of a Palm or the power of a PocketPC, most people choose the Palm.

      Do you fully understand how EVIL you are? People are DYING in hospitals due to medical errors and timing issues that could be essentially eliminated by a sufficiently advanced portable computing system.

      Oh, rubbish, and shame on you. I don't believe for a second that PocketPCs (or any other single gadget) can magically solve the problems of hospitals. And I'm dubious about PocketPCs at all in hospitals; they do crash.

      You are actively preventing those technologies from being developed as fast as they otherwise would be.

      Wow, he sure has a lot of power to affect technological development in the world. That or else you are being insanely over-the-top.

      steveha
      --
      lf(1): it's like ls(1) but sorts filenames by extension, tersely
    2. Re:Is this a PDA? by steveha · · Score: 2

      Once again, the words of an idiot.

      If you are trolling, then grow up and go do something else.

      If you are not trolling, I suggest you take a course in how to effectively communicate your ideas without being a jerk.

      I figure you have to be a troll; is anyone this abrasive and annoying without working at it?

      Have a nice day.

      steveha

      --
      lf(1): it's like ls(1) but sorts filenames by extension, tersely
  30. Re:Judging by modern Linux DEs.... by IamTheRealMike · · Score: 3, Informative
    Hmm, well .....

    Linux with KDE is slower than Windows 98 basically for two reasons. The first is that Linux does more stuff. For instance, it runs various daemons in the background to allow for remote access, it journals filesystem logs, it implements proper crash protection, it has a usable command line with virtual terminals etc. Windows 98 doesn't have these things, so it can be faster.

    The second reason is that KDE is written largely in C++, and the Linux C++ linker is inefficient (it is much faster at C). The programs run fine, but they take longer to start up, which is what makes it "feel" slow. Gnome should in theory be faster, but they kill any speed increase they'd otherwise get by having a slower (well, in v1.4) graphics library and by using incredibly heavy things such as CORBA for ipc, and a daemon for configuration etc.

    The reason other window managers (not just ancient ones, others such as WindowMaker or E) are faster is because a) they are simpler and b) they tend to be written in C.

    The speed of GTK is improving, though CORBA/ORBit will always be slow on the Gnome side imho. The Linux linker issues with C++ are known about and are being resolved, which will lead to much better performance.

    Another problem is that some modern distros are quite bloated. My SuSE 7.3 box loads all sorts of stuff at startup that I don't actually need, but I never got around to switching it off. Combined with the slow start of KDE and the fact it loads after login (which windows does before login), and it begins to feel slow.

    Performance is improving, however it's still largely in the hands of the GNU folks and the distro companies.

    thanks -mike

  31. It's All a Question of Cache by emulac · · Score: 2, Informative

    The Intel PXA250 has only 32K/32K of cache, which means that any real application will experience an extremely high cache miss rate. The memory bus is 16 or 32 bits, and has a maximum clock rate of 100 MHz. So, if you're running the maximum width bus at its maximum speed, you're likely to see an instruction dispatch rate of about 50~100 million ops/second. That's slow, and there's really nothing to be done short of adding much more cache.

  32. Re:It's the OS and the Compiler by quarter · · Score: 1

    If you are doing anything that requires performance, you shouldn't rely on the compiler. Intel has their "Integrated Performance Primitives" which give you a nice abstraction layer so you can wring the most power out of each processor without having to hand code asm for each one.

  33. Re:Judging by modern Linux DEs.... by glitchvern · · Score: 1
    Unix is so.... 50s in its ideology.

    Uhm, it was first invented in '69 and improved to what we would recognize as Unix in the early '70s.
  34. Re:Judging by modern Linux DEs.... by RevAaron · · Score: 2

    *sigh*

    But most of the core ideas were developed in the 50s. Just because Win98 came out in 1998 (or was it 99?) it doesn't mean that its technology is of the 90s. That's pretty much from the 50s too.

    I should've known that comment would confuse many.

    --

    Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
  35. And the Linux answer to Dotnet is...? by alext · · Score: 2

    the next version of Pocket PC - that will surely have a .NET runtime in it. That will make all the problems go away, won't it? Just compile the apps to IL, target, and distribute.

    Let's hope your skepticism is justified. Because if it isn't, Linux as a platform will be in very serious trouble.

    Linux has no answer to cross-platform code, the one exception being Gnome with Mono. If that remains the only effort, and continues to attract hype and developer support, one day soon we'll wake up and find that the single viable open source platform to write to is under the technical direction of Microsoft.

    However did this happen?

  36. Re:Seems obvious, bus speed & not enough cache by KlausB · · Score: 3, Interesting

    There are no new ARMv5 instructions that affect performance in any noticeable way for general purpose computing (i.e. using an optimizing C compiler with your old code).

    The main new instructions are:

    - a "find first one bit in word" instruction, which helps software division and Huffman encoding

    - some DSP instructions like 16x16-bit multiplication/40-bit add for filters (audio encoding, etc.)

    Both these enhancements more or less require assembly coding.

    The other major architectural enhancements are branch prediction (offset by higher penalties on branch misses) and larger caches (32K dcache versus 8K and 32K icache versus 16K, if I remember correctly).

    However, the cache latency has increased from 1 to 3 cycles.

    It means that when you load a value from memory and hit the cache, the compiler needs to find 3 unrelated instructions you can execute before you can use the result in the fourth instruction after the load.

    This is a severe blow if your compiler does not figure it in, and even if it tries, or if you use assembly, you often cannot find three such instructions (table walks, or under register pressure).

    In the worst case (table walks, LUTs), this effectively halves your processor speed.
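    A back-of-envelope model of that worst case, not a measurement: assume a table walk is a chain of loads where each load's address comes from the previous load, and treat "latency" as the number of instruction slots that must pass before the loaded value is usable (1 on the SA1110, 3 on the XScale, per the figures above). The function name and the in-order, one-instruction-per-clock model are assumptions made for illustration.

```python
def cycles_per_dependent_load(load_latency, filler_instrs=0):
    """Cycles spent per load in a chain of dependent loads on an in-order core.

    load_latency:  slots that must pass after a load before its result is usable.
    filler_instrs: unrelated one-cycle instructions scheduled into those slots.
    """
    # One issue cycle for the load, one per filler instruction,
    # plus a stall for every latency slot the fillers did not cover.
    stalls = max(0, load_latency - filler_instrs)
    return 1 + filler_instrs + stalls


strongarm_cycles = cycles_per_dependent_load(load_latency=1)  # 2 cycles per load
xscale_cycles = cycles_per_dependent_load(load_latency=3)     # 4 cycles per load

# Work done per clock on the XScale relative to the StrongARM for this loop.
per_clock_ratio = strongarm_cycles / xscale_cycles

print(strongarm_cycles, xscale_cycles, per_clock_ratio)
```

    Under these assumptions the chain takes 4 cycles per load instead of 2, so a 400MHz part does about the same work on this loop as a 200MHz part with the old latency.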

    As far as I know, the bus interface has not improved from the SA1110, and this was not too efficient to start with (does not exploit accessing a preloaded bank, a cache line has to be completely filled before execution, etc.).

    Apart from that, there are some issues in the PXA silicon, which I think force some time-consuming workarounds (extra cache flushes, write-back cache does not work, slow bus cycles). I would guess that these affect performance even more than the 100MHz SDRAM clock - after all that's about what you find in your 1GHz+ P-III design.

    However, this is only what I gathered from the datasheets; I have not yet used a PXA system, as it does not yet seem to be an improvement over the SA1110 that justifies a new design.

  37. Before blaming Intel... by zealot · · Score: 2, Interesting

    Before blaming Intel for going with an Arm 5 core with "slow" (slow being relative, as the benchmarks vary) Arm 4 emulation, remember that all they did was produce a CPU for the embedded market that can run on batteries. The MS Pocket PC market is just one market for these processors. They want them used in cell phones and powering all kinds of devices, just like the StrongARM did.

    Obviously, they felt that the majority of their customers would want an Arm5 based device. Wait a few months, and you might see some pretty impressive cell phones or Linux based devices that use Arm5.

    The complaint against Intel is only legitimate if their Arm5 scores are terrible. Otherwise it is the fault of the device maker for using a chip that doesn't perform well for the task at hand, or MS for not optimising.

    --
    He said, "You'll be able to tell your grandchildren that you helped assemble the first NT supercomputer," and I cringed.
  38. This is not correct by KlausB · · Score: 1

    Marvin wrote:
    > It's simply Intel moving to a new instruction set (ARM V5)
    > and building a (slow) emulation of the old one (ARM V4),
    > and Microsoft says it would be horribly difficult to
    > support two different instruction sets, so the choice was to either
    > live with the new CPU performing slower than the old one or
    > cut off support for the old hardware.

    This is not correct.

    All ARMv4 instructions are implemented natively in the XScale core.

    The XScale core, just like the SA1110, executes almost all ARMv4 instructions in one clock and, as far as I remember, uses more clocks for only a few instructions:

    - shift register by register (2 instead of 1)
    - mul / mul-acc (extra latency cycle in some cases)
    - branch miss in the added BPU
    - maybe some coprocessor accesses

    Except for an assembly rewrite of some inner loops in the kernel, there is not much MS can do about the memory interface, which hasn't scaled with the CPU clock.

    I do not think that compiler tweaking will gain much more than 10% in performance.

  39. I would MOD this up... by benjamindees · · Score: 1

    but I haven't any points. I'll just throw out the Compaq Aero series as another MIPS arch that was very quickly made useless. Not to say it was ever very useful. We bought five but the sync process was so shitty that we never used them.

    --
    "I assumed blithely that there were no elves out there in the darkness"
  40. ARMv5 versus ARMv4 and why Intel sucks by jeffmock · · Score: 5, Insightful

    It's important to differentiate between architecture optimizations
    and CPU specific optimizations. The ARMv5 instruction set is a
    relatively minor architectural tweak to the ARMv4 instruction set.
    The names give you the impression that it's some grand change between
    v4 and v5, if a technical guy did the naming it would be ARMv4 and
    ARMv4.01. ARM is playing some games with architecture naming
    to protect their business position with patents in a silly way.

    ARMv5 adds a couple of new instructions over v4, an instruction to count
    leading zeros in a register (which a compiler would likely never
    use), and a better method of switching between the ARM instruction
    set and the 16-bit Thumb instruction set. The latter isn't
    relevant for PocketPC since Thumb mode isn't supported. I think
    v5 might have a new debugging hook as well.

    The new XScale parts are ARMv5te, the T is for the 16-bit Thumb
    instruction set, which no one seems to care about. The "E" adds
    some DSP oriented instructions that are pretty interesting for
    media codecs and such. They are the MMX equivalent for the ARM
    world. They likely won't improve performance of the general
    purpose aspects of the platform.

    I think it's a red herring to chase Microsoft for not optimizing for
    the ARMv5, the changes are really small and I don't see any
    performance impact, certainly not if you have to maintain another
    version for all of the StrongARM based products.

    Now, as far as CPU specific optimizations for the PXA250 (XScale)
    implementation of the ARM architecture. IMHO Intel chased
    MHz and left behind a lot of good sense about system performance.
    The high order bit is bus performance as others have already
    pointed out.

    In addition to the bus performance, Intel made many tradeoffs
    to optimize for clock speed: The 7-stage pipe has a 4-clock penalty
    for a mis-predicted branch. This is compared to the circuit
    design heroics in the StrongARM that implements "all branches
    are 2-cycles". The Xscale approach is much more complicated, it
    probably doesn't perform any better, but you get a high clock speed.

    Intel adds clock cycles to all load/store-multiple instructions
    in Xscale. This is a pretty big deal in ARM since they are
    used in the entry and exit of most C functions, in memcpy(),
    and any time you are moving chunks bigger than a register.

    The load-use penalty is bigger in Xscale. This is a pretty big
    deal in ARM. The ARM instruction set is pretty compact. It is a
    RISC processor, but the combination of shifting operations
    combined with ALU operations makes it possible for a good compiler
    to generate reasonably compact code. As a result, it's harder
    for a compiler to put instructions between a load and instructions
    that use the destination of the load. This is another trade-off
    in Xscale that allows a higher clock speed but hurts performance
    otherwise.

    I go on too long, but the DEC designed StrongARM used in the SA1100
    is a tour-de-force of clean implementation and balanced system
    performance. It's amazing that core was designed in 1993 (I think,
    someone please correct me) and is still the leader for handheld
    apps. The Intel guys went after clock speed at the expense of
    everything else in Xscale and it will probably never optimize well
    for a platform like PocketPC.

    jeff

    1. Re:ARMv5 versus ARMv4 and why Intel sucks by FrankDrebin · · Score: 2, Informative

      Somebody, please mod this up because jeff is right damn it!!!

      I've worked with both the SA-11x0 (StrongARM) and the PXA250 "Cotulla" (Xscale) CPUs and everything jeff says is pretty much on the money (except the CLZ instruction is far from useless, it's *awesome* for fixed-point logarithms, dude).
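      To make the CLZ point concrete, here is a minimal sketch of a fixed-point log2 built on a count-leading-zeros step. It is written in Python for readability; `clz32` stands in for the ARMv5 CLZ instruction, and `log2_fixed`, the 8 fractional bits, and the crude linear approximation log2(1+m) ≈ m are all assumptions chosen for illustration, not anything taken from a real codec.

```python
def clz32(x):
    """Count leading zeros of a 32-bit unsigned value (32 for x == 0)."""
    x &= 0xFFFFFFFF
    return 32 - x.bit_length()


def log2_fixed(x, frac_bits=8):
    """Approximate log2(x) as a fixed-point number with frac_bits fraction bits.

    Valid for 1 <= x <= 0xFFFFFFFF. The fractional part uses the linear
    approximation log2(1 + m) ~= m, so it is only accurate to a few percent.
    """
    if not 1 <= x <= 0xFFFFFFFF:
        raise ValueError("x must be in 1..0xFFFFFFFF")
    zeros = clz32(x)
    int_part = 31 - zeros                      # index of the highest set bit
    # Shift the top set bit up to bit 31; the 31 bits below it are the
    # mantissa fraction m in [0, 1).
    mantissa = (x << zeros) & 0x7FFFFFFF
    frac = mantissa >> (31 - frac_bits)
    return (int_part << frac_bits) | frac


print(log2_fixed(8) / 256)    # 3.0
print(log2_fixed(12) / 256)   # 3.5 (true value is about 3.585)
```

      Without CLZ the `zeros` step is a loop or a table lookup; with it, the integer part and the normalising shift both fall out of one instruction.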

      Also, the DSP coprocessor in the XScale is about as useful as tits on a bull for codecs with 16-bit data streams. You spend so many clocks marshalling data around to get it in and out of the thing that it's *much* more efficient to use the MAC instructions native to ARM v4 on normal registers! Even the Intel engineers who put together their IPPs have avoided the DSP coprocessor since it provides no real advantage.

      It's pretty clear to me the v4/v5 thing is a red herring. Let's face it, DEC was much better at putting out a general-purpose ARM-based CPU than Intel.

      --
      Anybody want a peanut?
    2. Re:ARMv5 versus ARMv4 and why Intel sucks by Grishnakh · · Score: 2

      Well, comparing the Alpha (designed in the 80's) to the Itanium, it seems DEC was much better at designing all kinds of CPUs than Intel.

    3. Re:ARMv5 versus ARMv4 and why Intel sucks by ceallaigh · · Score: 1

      I likewise have worked with both the StrongARM and the Cotulla and Sabinal flavors of the XScale. The performance problems are real, and in fact it was very difficult to get Intel to own up to them. Only after we had clear data showing what was going on did they come forward and reveal to us that there was a problem. I really felt that Intel was not being very open about these issues. There is nothing particularly innovative about the V5 architecture; Intel will try to hype it for as much as it's worth, but as the previous poster noted, it's more a variant of V4 with a deeper pipeline and the extended DSP functions. Overall, the ld/str multiple performance is pathetic without changes to your standard C library routines.

  41. Maths by leonbrooks · · Score: 2

    Simple arithmetic: if it was CPU-bound, halving the clockspeed should roughly halve the FPS. Suspect the graphics chip. BTW, having it beaten by an iPaq in a graphics benchmark sucks rocks. A friend of mine got a 50-fold speed improvement in iPaq graphics by rewriting the GDI driver. Either the gfx driver is broken or there's something badly wrong with the gfx hardware.
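    The arithmetic can be spelled out with the Pocket TV figures quoted earlier in the thread (XScale at 400MHz: 19 fps, at 200MHz: 14 fps). This is only a sketch; the function name is made up for illustration.

```python
def clock_scaling_efficiency(fps_low, fps_high, mhz_low, mhz_high):
    """Observed speedup divided by the speedup a purely CPU-bound task would show."""
    observed_speedup = fps_high / fps_low
    ideal_speedup = mhz_high / mhz_low
    return observed_speedup / ideal_speedup


observed = 19 / 14                                   # about 1.36x for 2x the clock
eff = clock_scaling_efficiency(14, 19, 200, 400)     # about 0.68

print(round(observed, 2), round(eff, 2))
```

    Doubling the clock buys roughly a 1.36x frame rate rather than 2x, which is what you would expect if a good share of each frame is spent waiting on something that does not scale with the CPU clock.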

    --
    Got time? Spend some of it coding or testing
  42. Well, mail it to me by leonbrooks · · Score: 2

    We can use MIPS portables here.

    --
    Got time? Spend some of it coding or testing
  43. Re:Judging by modern Linux DEs.... by himi · · Score: 3, Informative

    Most of the core ideas in Unix were developed in the 60's, actually.

    Computing in the 50's was a very different thing, so limited that the idea of wasting cycles on things like memory management or protected memory would have been considered insane. It wasn't until hardware developed to the point where there were cycles and memory to spare that anything like Unix (or MULTICS, which is where most of Unix's ideas were developed) became possible.

    himi

    --

    My very own DeCSS mirror.
  44. Re:Judging by modern Linux DEs.... by steveha · · Score: 2

    Gnome should in theory be faster, but they kill any speed increase they'd otherwise get by having a slower (well, in v1.4) graphics library and by using incredibly heavy things such as CORBA for ipc, and a daemon for configuration etc.

    I don't know about the graphics library thing, but the GNOME ORB is somewhat stripped down to make it faster. Unless you actually serve objects remotely over the net, the GNOME CORBA ORB basically just adds a little bit of function call overhead. I'm willing to accept that tiny bit of overhead for a tested, industrial-strength object model like CORBA. KDE, as I understand it, is inventing their own somewhat lightweight object model, and I'm worried they will later find some situation where they wish they had left in a feature they stripped out to make it lightweight.

    As for Win98, don't forget that it is a candy coating around Win95, and Win95 was aggressively optimized for size and speed. The target machine for Win95 was a 486 with 4MB of RAM. There is a bunch of assembly language in there, in critical places; and some of it is even 16-bit code. (16-bit code is much harder to write, since you have to cram things into smaller spaces and you have to explicitly handle near/far pointers, but it's tiny! Even with the thunking overhead, it won't slow you down too much if you just use it rarely.) Also don't forget that the MS C compiler actually does produce very good code, better than GCC is able to now. (Although I hear good things about the latest version of GCC, I don't think they have caught up with MS C yet.)

    The tiny size of the Win95 core means that it caches well, too, and a high cache hit rate makes for speedy performance.

    I'd be interested to see benchmarks of Win98 vs. a really stripped-down Linux system (no daemons running, etc.) that was compiled with aggressive optimizations and is running a really lightweight window manager (IceWM or ROX or something). And definitely no Nautilus; try your system with ROX Filer instead. I saw a huge speed jump when I did that. (Debian makes it so easy to try such experiments!)

    There is still room left in Linux-based systems for size and speed improvements. Every time GCC gets better, every part of the system gets a little better. And I don't believe that much work has been done on either GNOME or KDE to make it stripped-down lean-and-mean... the first law is "make it work before you make it faster", and folks are still busy making it work right. (But Nautilus has had a lot of speed work done on it lately, and I've heard it is much improved compared to its 1.0 release.)

    steveha

    --
    lf(1): it's like ls(1) but sorts filenames by extension, tersely
  45. Battery life by Mr_Silver · · Score: 2
    I love this comment:

    We are aware that PXA250 (XScale)-based devices are not demonstrating the huge performance gains that were anticipated. That said, Pocket PCs continue to offer the best performance and the richest functionality vs. other handhelds on the market today.

    Translation: We know your new car only goes 40mph instead of the 65mph your old car did, but it beats a bicycle, doesn't it? (credits to Jim S for that one).

    Even better:

    I think the market expectation of what performance on a 400 MHz processor vs. 206 MHz processor has been unreasonable.

    Not at all. The processor is almost twice as fast; I don't think it is utterly unreasonable to expect the product to be at least one and a half times faster.

    But my question is, how is the battery life on one of these things? If it really is 12-16 hours instead of the current 8, then the XScale is still a worthwhile bet.

    --
    Avantslash - View Slashdot cleanly on your mobile phone.
  46. I really doubt that by pslam · · Score: 1
    I haven't worked on X-Scale cpus, but I've certainly worked a ton on StrongARMs. A StrongARM SA-1100 has 16KB code and 8KB data cache. It's half-line 'dirty flag' write-back, 32 bytes per line, 32-way associative, and round robin evicted. The cache is very rarely the bottleneck - low MHz is more often the problem.

    It's usually pretty hard to thrash a code cache to the point of it being the bottleneck. You pretty much have to deliberately write code to do that. For reference, an Athlon has a 64KB code cache, and that's running at a far higher speed than both of these ARM processors. Your figure of 50-100 million ops/second assumes an unreasonable 100% instruction cache miss rate. You'd have to have a program totally devoid of loops to achieve that. At a still unreasonable hit rate of 95% you'd still get 96% ((95*400+5*100)/40000) of full performance.
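    For anyone who wants to play with the numbers, here is a small sketch of that estimate. `rate_weighted` reproduces the figure above (a hit-rate-weighted average of the two clock rates). `time_weighted` is an alternative I am adding as an assumption: it averages cycles per instruction instead, charging one core cycle for a hit and core/bus = 4 core cycles for a miss, which is still optimistic since a real miss involves a line fill.

```python
def rate_weighted(hit_rate, core_mhz=400, bus_mhz=100):
    """Fraction of full speed, averaging the hit and miss *rates* (the figure above)."""
    return (hit_rate * core_mhz + (1 - hit_rate) * bus_mhz) / core_mhz


def time_weighted(hit_rate, core_mhz=400, bus_mhz=100):
    """Fraction of full speed, averaging *cycles per instruction* instead."""
    miss_cycles = core_mhz / bus_mhz          # assumed cost of one miss, in core cycles
    avg_cycles = hit_rate * 1 + (1 - hit_rate) * miss_cycles
    return 1 / avg_cycles


for hit_rate in (0.0, 0.95, 0.99):
    print(hit_rate, round(rate_weighted(hit_rate), 3), round(time_weighted(hit_rate), 3))
```

    At a 95% hit rate the two models give about 96% and 87% of full speed. Either way it is nowhere near the 25% that a 100% miss rate implies, which is the only case where the 100 million ops/second ceiling applies.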

    My guess why the real world performance is so bad is probably Microsoft's lack of optimization specific to the processor. There's a few trade-offs Intel have made to get the clock higher, including:

    • The latency of the barrel shifter is now 1 clock, instead of free. The compiler will no doubt assume the latter, as will tons of hand coded assembler.
    • The multiplier has higher latency (I think), but you can pipeline it alongside loads, stores and arithmetic (nice). It won't have been compiled for this.
    • Loads/stores have higher latency, but you can pipeline them alongside other stuff. The load/store multiple instruction has been "obsoleted" such that it's slower than separate instructions. That's pretty major for a compiler to ignore.

    I certainly wouldn't expect to be able to take code targeted for StrongARM and see all of the performance increase that 200->400 would indicate. I can imagine hand coding assembler to work around the latencies and getting near to 100% performance. It's not that hard for a modern compiler to work this out either - a current day x86 is far more difficult to target than an X-Scale.