Slashdot Mirror


Explaining Disappointing XScale Performance In Pocket PCs

JYD writes: "I found this new article on a Pocket PC web site where Microsoft talks about why XScale Pocket PCs aren't as fast as people thought they would be. Is it the OS? The CPU not supporting ARM4 properly? I wonder if the Linux port would run faster on 400 MHz ... or did Intel screw up the CPU?"

16 of 133 comments

  1. You think thats slow by brejc8 · · Score: 3, Interesting

    My group has been working on a synthesizable secure G3 card CPU and it will probably be the slowest ARM ever made.

    The CPU will be fully delay insensitive and asynchronous to stop power and clock glitch attacks.

    We are currently looking at 4 MHz on a 0.18 micron process.

  2. Cant find the link but by MrBandersnatch · · Score: 3, Interesting

    a review I read showed a 400 MHz XScale performing at 50%-75% of the speed of a 206 MHz StrongARM chip. I would be really interested in some non-OS-specific tests showing whether or not the XScale offers any performance benefit whatsoever. I know that it is supposed to scale to 1 GHz and has better battery life than the 206 MHz StrongARMs, but if it NEEDS to run at 800 MHz just to perform at the same level as its older sibling, then it is a waste of space.
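
    As a rough check on those figures, the arithmetic can be sketched in a few lines of Python. This assumes performance scales linearly with clock speed, which is optimistic if the bus or memory is the real bottleneck; the function name is made up for illustration.

```python
def break_even_clock_mhz(clock_mhz, relative_speed):
    """Clock the newer chip would need to match the older one.

    Assumption (not from any benchmark): speed scales linearly with clock.
    relative_speed is the newer chip's speed divided by the older chip's.
    """
    return clock_mhz / relative_speed


if __name__ == "__main__":
    for relative_speed in (0.50, 0.75):
        needed = break_even_clock_mhz(400, relative_speed)
        print(f"at {relative_speed:.0%} of StrongARM speed: about {needed:.0f} MHz to break even")
```

    Under that assumption the 800 MHz figure corresponds to the pessimistic 50% end of the range; at 75% the break-even clock is roughly 533 MHz.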

    1. Re:Cant find the link but by Chanc_Gorkon · · Score: 5, Interesting

      Well I found it and the performance is NOT 50-75 percent slower than an iPaq. From the numbers on pocketnow.com, the Toshiba e740 is actually ahead in most categories with the exception of graphics. There's the real kicker. I don't think it's the XScale so much as it's the ATI Imageon graphics chip in it. This is also a new chip, and as the benchmarks prove, its driver has a problem, or so it would seem. I actually heard that it's kind of operating in an emulation mode of sorts (kind of like standard SVGA on a desktop). ATI should provide driver code to Toshiba and it can then be fixed in a flash. I have an e740 and love it so much. The XScale is a nice chip and will indeed improve in performance as it's flashed up, but in my book, the other features are worth more. The wireless works well, the dual slots are a godsend (WE DON'T NEED NO STEEEKIN SLED! ;) ) and the price is GREAT for what you get. All in all, I would buy another one or an updated one (like the Toshiba e550 coming out soon!). One thing I am looking for is the availability of the 3000 mAh high cap battery. The standard is fine for day to day use, but when you use the wireless a lot you hear a giant sucking sound coming from the battery. The other accessory I would look for is the 99 buck adapter that goes on the bottom. You add that and you can attach a USB keyboard and also drive an SVGA monitor or a projector with it and have your handheld run your PowerPoint stuff on the road.

      --

      Gorkman

  3. Amulet cores by brejc8 · · Score: 5, Interesting

    The Amulet group has been working for years to make low-power yet high-speed asynchronous ARM processors.
    The Amulet 3 runs at 120 MHz and consumes very little power. Most of all it's asynchronous, so when you don't have much processing to do it just sits there consuming "no" power.

    They take a hell of a beating and still run. I connected one to a hamster wheel and you can see it here running despite the power fluctuating madly.

    The only reason it only goes at 120 MHz is because the memory isn't fast enough.

    It's a little strange that only three ARM production licenses were given out: one to Intel, one to Motorola and one to the Amulet group.

  4. Stranding Users... by Anonymous Coward · · Score: 5, Interesting

    From the article: "We're not prepared to strand an installed base of over 2 million iPAQ users."

    Umm... right, that's why my PocketPC 2000 Cassiopeia E115 is now as useful as a doorstop as it has a MIPS chip in it.

    When I got my PocketPC, MS touted that 'software matters' - even in their publicity. Suddenly, they ditch all the SH3 and MIPS users and just support ARM in PocketPC 2002. Not only that, but they won't release applications like Terminal Services and Messenger for the older machines. I see a lot of people saying that this is because PocketPC 2002 is based on CE.NET - that's not correct. PocketPC 2002 is just another revamp of PocketPC 2000, and both are based on CE 3.0. So when it all boils down, it's just Microsoft playing marketing tricks. Net result of their decision - my £450 PDA became obsolete in 18 months.

    I now own a Palm.

    1. Re:Stranding Users... by Xtifr · · Score: 3, Funny

      From the article: "We're not prepared to strand an installed base of over 2 million iPAQ users."

      Umm... right, that's why my PocketPC 2000 Cassiopeia E115 is now as useful as a doorstop as it has a MIPS chip in it.

      Sorry, there were only 1,999,999 users of that specific system, so it was below our threshold. :)

  5. Pocket PC hw spec lockdown by Qrlx · · Score: 3, Insightful

    Pocket PCs aren't as fast as people thought they would be. Is it the OS?

    It could be the OS, which is the obvious answer since it's a Microsoft OS, and this is Slashdot. But I don't know. I've never tried running anything other than PocketPC OS on the iPaq, and probably never will. (It's a work thing.)

    How did Microsoft become so popular? It was DOS, wasn't it? The program that ran on any x86 computer. Well, Microsoft should take a page from their previous success and allow a little more flexibility in PocketPC design. The main gripe that I and everyone else has about these gizmos is that they're locked into a 240 by 320 by 16-bit color display. That's lame, especially if one of the highlights of PocketPC is how easy it is to port your Win32 app. If you have to redesign all the screens to fit in a tiny-ass space, it's easy on the coders but hell on the systems analysts.

    It looks to me like Palm have a much more open approach, they are using the same tactic that established Microsoft's dominance with DOS back in the 80s. You can get that new Sony Clié with TWICE the screen real estate (as in pixels) of ANY PocketPC available. Kind of a no-brainer if you ask me.

    Off to the solstice parade!

    1. Re:Pocket PC hw spec lockdown by aussersterne · · Score: 5, Interesting

      No onslaught here.

      The Newton 2100 kicks ass. I used Palm and Windows CE before finally trying out a Newton 2x00 series. The Newton made me swoon.

      It's the best damn computing device out there, PC, PDA, or otherwise. I used to do my e-mail, my diary-keeping, my word processing, etc. on my PC in Linux, but now I even write my books and do 90% of my e-mailing on my Newton 2100 directly over ethernet. I read news on it, make travel plans on it, I have my household inventory on it (in Notion)... and I read BBC World News and Slashdot on it in Newt's Cape.

      The PC only gets touched every few days. The Palm and CE devices are long gone. I only regret that Apple killed the Newton, so there won't be a color version. :(

      --
      STOP . AMERICA . NOW

  6. Synopsis of "interview' by brooks_talley · · Score: 5, Funny

    Q: What could possibly have gone wrong?

    A: While we acknowledge that some people's perception is of something having gone wrong, we believe that any wrongness is unavoidable.

    Q: Well, some analysts say it's Intel's fault

    A: We have implemented what we could implement, and don't believe there is any implementable implementation that would implement significant gains.

    Q: Analysts also say it will be 2004 before the issue is fixed

    A: It is too early to talk about 2004. That said, we are committed to delivering a good product.

    Q: This is really bad news for the Pocket PC platform

    A: Yes, it is. However, fortunately the issue is so small that this really isn't bad news for the Pocket PC platform.

    Cheers
    -b

    1. Re:Synopsis of "interview' by Moosifer · · Score: 3, Funny

      I'm the VP of Marketing for a large Internet company whose name I cannot disclose in a public forum. I'd like to offer you a director's position in our marketing department. Name your price. Can you start on Monday?

  7. I might add..... by Chanc_Gorkon · · Score: 4, Interesting

    This complaint was also based on the FIRST XScale PDA to EVER be released. Sure there's GOING to be problems. The iPaq started off with similar issues, but you don't hear anyone talking about it now do ya? There are a lot of reasons that add up to create the total performance picture. Maybe Toshiba used cheaper internal RAM? Maybe they need more memory for video (I think it has like 256 K maybe?? I don't know but I know it has dedicated video RAM). The point is the performance of ONE XScale-based PocketPC does not predict how the others will perform. Also as these are flashable, we can expect even the Toshiba to get better performance as flash updates are made available.

    --

    Gorkman

  8. Comment removed by account_deleted · · Score: 3, Informative

    Comment removed based on user account deletion

  9. Re:Judging by modern Linux DEs.... by IamTheRealMike · · Score: 3, Informative

    Hmm, well .....

    Linux with KDE is slower than Windows 98 basically for two reasons. The first is that Linux does more stuff. For instance, it runs various daemons in the background to allow for remote access, it journals filesystem logs, it implements proper crash protection, it has a usable command line with virtual terminals etc. Windows 98 doesn't have these things, so it can be faster.

    The second reason is that KDE is written largely in C++, and the Linux C++ linker is inefficient (it is much faster at C). The programs run fine, but they take longer to start up, which is what makes it "feel" slow. GNOME should in theory be faster, but they kill any speed increase they'd otherwise get by having a slower (well, in v1.4) graphics library and by using incredibly heavy things such as CORBA for IPC, and a daemon for configuration etc.

    The reason other window managers (not just ancient ones, others such as WindowMaker or E) are faster is because a) they are simpler and b) tend to be written in C

    The speed of GTK is improving, though CORBA/ORBit will always be slow on the GNOME side imho. The Linux linker issues with C++ are known about and are being resolved, which will lead to much better performance.

    Another problem is that some modern distros are quite bloated. My SuSE 7.3 box loads all sorts of stuff at startup that I don't actually need, but I never got around to switching it off. Combine that with the slow start of KDE and the fact that it loads after login (which Windows does before login), and it begins to feel slow.

    Performance is improving, however it's still largely in the hands of the GNU folks and the distro companies.

    thanks -mike

  10. Re:Seems obvious, bus speed & not enough cache by KlausB · · Score: 3, Interesting

    There are no new ARMv5 instructions that affect performance in any noticeable way for general purpose computing (i.e. using an optimized C compiler with your old code).

    The main new instructions are:

    - a "find first one bit in word" instruction, which helps software division and Huffman encoding

    - some DSP instructions like 16x16-bit multiplication/40-bit add for filters (audio encoding, etc.)

    Both these enhancements more or less require assembly coding.
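
    To make the software division point concrete, here is a minimal Python sketch of shift-and-subtract division that uses a count-leading-zeros step to skip the iterations a plain 32-step loop would waste. It only models the idea: `clz32` and `udiv32` are illustrative names, not routines from any ARM runtime library, and a real implementation would be written in assembly.

```python
def clz32(x):
    """Count leading zeros of a 32-bit unsigned value (0 gives 32)."""
    assert 0 <= x < 2**32
    return 32 - x.bit_length()


def udiv32(n, d):
    """Unsigned 32-bit division by shift-and-subtract.

    The leading-zero counts tell us how far the divisor can be shifted
    left before it lines up with the dividend's top bit, so the loop
    runs only for the quotient bits that can possibly be set.
    Returns (quotient, remainder).
    """
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    if n >= d:
        shift = clz32(d) - clz32(n)  # >= 0 because n >= d > 0
        for s in range(shift, -1, -1):
            if n >= (d << s):
                n -= d << s
                q |= 1 << s
    return q, n
```

    For example, dividing 100 by 7 needs only four loop iterations here instead of 32, because the two leading-zero counts differ by 3.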

    The other major architectural enhancements are branch prediction (offset by higher penalties on branch misses) and larger caches (32K dcache versus 8K and 32K icache vs 16K, if I remember correctly)

    However, the cache latency has increased from 1 to 3 cycles.

    It means that when you load a value from memory and hit the cache, the compiler needs to find 3 unrelated instructions you can execute before you can use the result in the fourth instruction after the load.

    This is a severe blow if your compiler does not figure it in, and even if it tries, or if you use assembly, you often cannot find three such instructions (table walks, or under register pressure)

    In the worst case (table walks, LUTs), this effectively halves your processor speed.
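
    The effect of load-use latency on dependent loads can be illustrated with a toy cycle counter. This is a deliberately simplified model of an in-order, single-issue pipeline, not a description of the real PXA or SA1110 pipelines; the latencies 1 and 3 are the figures quoted above, and how they map onto stall cycles on actual silicon is an assumption of the model.

```python
def count_cycles(program, load_latency):
    """Cycles for an in-order, single-issue pipeline (toy model).

    program is a list of (dest, sources, is_load) tuples. A load's
    result is usable load_latency cycles after the load issues; any
    other result is usable on the next cycle. An instruction stalls
    until all of its source registers are ready.
    """
    ready = {}   # register -> first cycle at which its value can be read
    cycle = 0    # earliest cycle at which the next instruction may issue
    for dest, sources, is_load in program:
        issue = max([cycle] + [ready.get(reg, 0) for reg in sources])
        ready[dest] = issue + (load_latency if is_load else 1)
        cycle = issue + 1
    return cycle


def pointer_chase(n):
    """n loads where each address comes from the previous load (a table walk)."""
    return [("p", ["p"], True) for _ in range(n)]


def lut_loop(n):
    """n iterations of: load an index, load table[index], accumulate."""
    body = [
        ("idx", ["src"], True),
        ("val", ["idx"], True),
        ("sum", ["sum", "val"], False),
    ]
    return body * n


if __name__ == "__main__":
    for name, prog in (("pointer chase", pointer_chase(100)), ("LUT loop", lut_loop(100))):
        fast = count_cycles(prog, load_latency=1)
        slow = count_cycles(prog, load_latency=3)
        print(f"{name}: {fast} cycles at latency 1, {slow} at latency 3")
```

    In this model the lookup loop costs 3 cycles per iteration at latency 1 and 7 at latency 3, so a halving of effective speed on such code is plausible whenever the compiler finds nothing independent to schedule into the gaps.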

    As far as I know, the bus interface has not improved from the SA1110, and this was not too efficient to start with (does not exploit accessing a preloaded bank, cache line has to be completely filled before execution, etc.)

    Apart from that, there are some issues in the PXA silicon, which I think force some time-consuming workarounds (extra cache flushes, write-back cache does not work, slow bus cycles). I would guess that these affect performance even more than the 100 MHz SDRAM clock - after all that's about what you find in your 1 GHz+ P-III design.

    However, this is only what I gathered from the datasheets. I have not yet used a PXA system, as it does not yet seem to be enough of an improvement over the SA1110 to justify a new design.

  11. ARMv5 versus ARMv4 and why Intel sucks by jeffmock · · Score: 5, Insightful

    It's important to differentiate between architecture optimizations
    and CPU specific optimizations. The ARMv5 instruction set is a
    relatively minor architectural tweak to the ARMv4 instruction set.
    The names give you the impression that it's some grand change between
    v4 and v5, if a technical guy did the naming it would be ARMv4 and
    ARMv4.01. ARM is playing some games with architecture naming
    to protect their business position with patents in a silly way.

    ARMv5 adds a couple of new instructions over v4, an instruction to count
    leading zeros in a register (which a compiler would likely never
    use), and a better method of switching between the ARM instruction
    set and the 16-bit Thumb instruction set. The latter isn't
    relevant for PocketPC since Thumb mode isn't supported. I think
    v5 might have a new debugging hook as well.

    The new XScale parts are ARMv5te, the T is for the 16-bit Thumb
    instruction set, which no one seems to care about. The "E" adds
    some DSP oriented instructions that are pretty interesting for
    media codecs and such. They are the MMX equivalent for the ARM
    world. They likely won't improve performance of the general
    purpose aspects of the platform.

    I think it's a red herring to chase Microsoft for not optimizing for
    the ARMv5, the changes are really small and I don't see any
    performance impact, certainly not if you have to maintain another
    version for all of the StrongARM based products.

    Now, as far as CPU specific optimizations for the PXA250 (XScale)
    implementation of the ARM architecture. IMHO Intel chased
    MHz and left behind a lot of good sense about system performance.
    The high order bit is bus performance as others have already
    pointed out.

    In addition to the bus performance, Intel made many tradeoffs
    to optimize for clock speed: The 7-stage pipe has a 4-clock penalty
    for a mis-predicted branch. This is compared to the circuit
    design heroics in the StrongARM that implements "all branches
    are 2-cycles". The XScale approach is much more complicated, it
    probably doesn't perform any better, but you get a high clock speed.
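
    A back-of-the-envelope way to compare the two branch schemes, as a Python sketch. The 2-cycle and 4-clock figures are the ones given above; the assumption that a correctly predicted XScale branch costs a single cycle is mine and may not match the real part, and all names are illustrative.

```python
STRONGARM_BRANCH_CYCLES = 2     # "all branches are 2-cycles"
XSCALE_MISPREDICT_PENALTY = 4   # extra clocks for a mispredicted branch
XSCALE_PREDICTED_CYCLES = 1     # assumption: a predicted branch costs 1 cycle


def xscale_avg_branch_cycles(mispredict_rate):
    """Average cycles per branch for a given misprediction rate (0.0 to 1.0)."""
    return XSCALE_PREDICTED_CYCLES + XSCALE_MISPREDICT_PENALTY * mispredict_rate


def break_even_mispredict_rate():
    """Misprediction rate at which both schemes cost the same number of cycles."""
    return (STRONGARM_BRANCH_CYCLES - XSCALE_PREDICTED_CYCLES) / XSCALE_MISPREDICT_PENALTY


def cycles_to_ns(cycles, clock_mhz):
    """Convert a cycle count to nanoseconds at the given clock."""
    return cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    print("break-even misprediction rate:", break_even_mispredict_rate())
    for rate in (0.05, 0.25, 0.50):
        avg = xscale_avg_branch_cycles(rate)
        print(f"rate {rate:.0%}: {avg:.2f} cycles, {cycles_to_ns(avg, 400):.2f} ns at 400 MHz")
    print(f"StrongARM: {cycles_to_ns(STRONGARM_BRANCH_CYCLES, 206):.2f} ns at 206 MHz")
```

    Under these assumptions the predictor wins on cycle count only while fewer than one branch in four is mispredicted; any remaining advantage comes from the shorter cycle time of the faster clock.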

    Intel adds clock cycles to all load/store-multiple instructions
    in Xscale. This is a pretty big deal in ARM since they are
    used in the entry and exit of most C functions, in memcpy(),
    and any time you are moving chunks bigger than a register.

    The load-use penalty is bigger in Xscale. This is a pretty big
    deal in ARM. The ARM instruction set is pretty compact. It is a
    RISC processor, but the combination of shifting operations
    combined with ALU operations makes it possible for a good compiler
    to generate reasonably compact code. As a result, it's harder
    for a compiler to put instructions between a load and instructions
    that use the destination of the load. This is another trade-off
    in Xscale that allows a higher clock speed but hurts performance
    otherwise.

    I go on too long, but the DEC designed StrongARM used in the SA1100
    is a tour-de-force of clean implementation and balanced system
    performance. It's amazing that core was designed in 1993 (I think,
    someone please correct me) and is still the leader for handheld
    apps. The Intel guys went after clock speed at the expense of
    everything else in Xscale and it will probably never optimize well
    for a platform like PocketPC.

    jeff

  12. Re:Judging by modern Linux DEs.... by himi · · Score: 3, Informative

    Most of the core ideas in Unix were developed in the 60's, actually.

    Computing in the 50's was a very different thing, so limited that the idea of wasting cycles on things like memory management or protected memory would have been considered insane. It wasn't until hardware developed to the point where there were cycles and memory to spare that anything like Unix (or MULTICS, which is where most of Unix's ideas were developed) became possible.

    himi

    --

    My very own DeCSS mirror.