faragon · Slashdot Mirror

Re:Serious question on Google Chrome For Linux Goes 64-bit · 2009-08-23 08:16 · Score: 3, Insightful

64bit also vastly speeds up long and double math. It doesn't really apply to a browser, but if you were using 64bit integers to store currency amounts, you'd notice a huge speedup. Adding/subtracting from longs is one thing that SSE probably won't help. ;)

No speedup at all, for these reasons:

1) With 64-bit two's complement integer registers, you can speed up your 64-bit integer code, because you operate on 64-bit integers directly instead of chaining 32-bit results as on a 32-bit CPU. However, you're missing the point that most heavy computation, such as RSA's big numbers, DES, AES, Blowfish, etc., doesn't use general purpose registers but vector SIMD opcodes (e.g. SSE*), already available in 32-bit mode (with 8 instead of 16 registers, yes), which are faster than 64-bit integer operations.

2) Floating point ("double math") remains almost the same, but with 8 additional SSE registers.

3) Related to "adding/subtracting from longs": in 32-bit mode, an SSE3 (or later) functional unit can execute *four* 32-bit operations per clock (fetching 128 bits of data at once), while the CPU is already able to execute from 2 to 4 integer + load/store instructions per clock (e.g. Core2Duo or K8), so it would still be faster even while chaining 32-bit results.
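
To make "chaining 32-bit results" concrete, here is a minimal C sketch of a 64-bit addition done with 32-bit operations only, next to the native 64-bit version. The names (u64_halves, add64_chained, add64_native, split_halves, join_halves) are made up for illustration; this is roughly what an ADD/ADC pair does on 32-bit x86, not a claim about what any particular compiler emits.

```c
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value held as two 32-bit halves,
   the way a 32-bit CPU keeps it in general purpose registers. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_halves;

/* Split a 64-bit value into its low and high 32-bit halves. */
static u64_halves split_halves(uint64_t v)
{
    u64_halves r;
    r.lo = (uint32_t)(v & 0xFFFFFFFFu);
    r.hi = (uint32_t)(v >> 32);
    return r;
}

/* Rebuild the 64-bit value from its halves. */
static uint64_t join_halves(u64_halves v)
{
    return ((uint64_t)v.hi << 32) | v.lo;
}

/* Add two 64-bit values using only 32-bit operations.
   The carry out of the low half is chained into the high half. */
static u64_halves add64_chained(u64_halves a, u64_halves b)
{
    u64_halves r;
    uint32_t carry;

    r.lo = a.lo + b.lo;          /* wraps modulo 2^32 */
    carry = (r.lo < a.lo);       /* 1 if the low addition overflowed */
    r.hi = a.hi + b.hi + carry;  /* high halves plus the chained carry */
    return r;
}

/* The same operation when a native 64-bit register is available. */
static uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}
```

The chained version costs one extra dependent operation per addition, which is the overhead point 1) concedes for 32-bit mode.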

Hardware AND software revisions on A History of the Shrinking Game Console · 2009-08-23 06:08 · Score: 5, Informative

It is not just a hardware revision, it also implies cuts in software: remember that Sony has removed the possibility of running Linux on the new PS3 "Slim" model by disabling the "Other OS" boot option, because of the cost of programming new drivers for virtualizing the new I/O devices through the hypervisor.

Unofficial reply from Sarah Ewen, a Sony employee:

BY: sarahe
DATE: 2009-Aug-21 22:23
SUBJECT: RE: Why no Linux in PS3 Slim?

Hi aragon,

I'm sorry that you are frustrated by the lack of comment specifically regarding the withdrawal of support for OtherOS on the new PS3 slim.

The reasons are simple: The PS3 Slim is a major cost reduction involving many changes to hardware components in the PS3 design. In order to offer the OtherOS install, SCE would need to continue to maintain the OtherOS hypervisor drivers for any significant hardware changes - this costs SCE. One of our key objectives with the new model is to pass on cost savings to the consumer with a lower retail price. Unfortunately in this case the cost of OtherOS install did not fit with the wider objective to offer a lower cost PS3.

We'll see if we can get the official OtherOS page updated with something to this effect so that an official explanation is provided. Thank you for your comments.

Sarah.

Re:Serious question on Google Chrome For Linux Goes 64-bit · 2009-08-23 05:29 · Score: 4, Interesting

Cons:

- The benefit of going from 8 to 16 general purpose registers is very small, and often counterproductive: the total number of "true registers", the ones used for register renaming in OoOE, remains the same, so with twice the general purpose registers you halve the renaming register pool. That was especially noticeable in the first AMD64 CPUs, and *very* noticeable on Intel Pentium D CPUs (Pentium 4 with x64 support and other minor changes), which suffered from an insufficient register pool for OoOE operation in x64 mode. Newer CPUs, having a larger pool of registers, take less of a hit when executing x64 code.

- Memory and data cache waste: pointers take 64 bits, so unless you're doing your own memory management, with 32-bit offsets instead of using the full 64-bit address space, you're wasting more memory and, what is worse, using more data cache for the same purpose, with unnecessary CPU-RAM bus overload (remember that OoOE implies data fetching! Imagine a contiguous vector of 32 64-bit pointers, taking 2048 bits instead of the 1024 bits that it would take with 32-bit pointers).
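
The pointer-size arithmetic above, and the "32-bit offsets instead of 64-bit pointers" trick, can be sketched in C like this. It is only an illustration under assumptions: the names (N_REFS, table_bits, ref32, ptr_to_ref, ref_to_ptr) are invented here, and the offset scheme only works if everything lives inside one arena smaller than 4 GB.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of references in the example vector from the comment above. */
enum { N_REFS = 32 };

/* Bits taken by a contiguous table of N_REFS references,
   given the size in bytes of one reference. */
static size_t table_bits(size_t ref_bytes)
{
    return (size_t)N_REFS * ref_bytes * 8;
}

/* Hypothetical compact reference: a 32-bit byte offset from the start
   of a single arena, used instead of a full 64-bit pointer. */
typedef uint32_t ref32;

/* Convert a pointer inside the arena into a 32-bit offset.
   Assumes p points into the arena and the arena is below 4 GB. */
static ref32 ptr_to_ref(const void *arena_base, const void *p)
{
    return (ref32)((const char *)p - (const char *)arena_base);
}

/* Convert a 32-bit offset back into a usable pointer. */
static void *ref_to_ptr(void *arena_base, ref32 off)
{
    return (char *)arena_base + off;
}
```

With 8-byte references the table takes 2048 bits, with 4-byte offsets 1024 bits, at the price of one extra addition on every dereference.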

Pros:

However, for some things there is a true benefit: the number of registers for SSE operations has also been doubled, from 8 to 16. SSE code can make good use of them, because vector processing code is usually less prone to branch misprediction and has less register aliasing.

Corollary:

In my opinion a 64-bit operating system makes sense, but an application that doesn't need more than 2GB of RAM, and doesn't need the extra 10% speedup when running optimized SSE vector code, should be compiled in 32-bit mode.

Ok, I'll take it. on Wired Writer Disappears, Find Him and Make $5k · 2009-08-19 06:47 · Score: 1

Now, how do you want him, dead or alive?

Re:Can you scale an x86 processor down? on Dell Considering ARM-Based Smartbooks · 2009-08-15 04:50 · Score: 2, Informative

Yes, they do OoOE, but not with the insane amount of register renaming of the x86 or PowerPC OoOE designs, nor with the same speculative execution depth. The ARM Cortex OoOE is very well balanced power-wise; however, and this is just my opinion, it is completely unnecessary (you could put 3 in-order-execution cores instead of the 2 out-of-order-execution ones).

Re:Can you scale an x86 processor down? on Dell Considering ARM-Based Smartbooks · 2009-08-14 20:56 · Score: 3, Interesting

It is not the "x86 emulator": it takes a tiny percentage of the die, and 90-95% of instructions are decoded to a single underlying RISC equivalent. Most of the power consumption comes from OoOE, huge pipelines, and huge caches. In my opinion OoOE processors are an aberration intended to maximize serial code performance by wasting 4 to 8x the resources: it is like having many processors executing future code paths "just in case" (misusing the instruction cache just to feed the branch-predicted execution paths), while also misusing the system bus by loading data for instructions that will be discarded 1 out of every 10 times (misusing the data cache by fetching data for instructions that will, in large part, be discarded). So in an "advanced OoOE CPU" you're saturating the bus to compute worthless instructions. As an example, in the area of a P4 CPU you could have 8 to 16 MIPS or ARM in-order CPU cores, making much better use of the shared cache, with 4 to 8x more executed instructions per transistor and efficient system bus usage.

Re:They need... on EVE Online's Fight Against Currency Farmers · 2009-08-12 08:52 · Score: 1

You're right. Keeping fixed prices in a free market is nonsense.

To some extent, the problems of long distance (interstellar) commerce were addressed by Paul Krugman in the following essay: The Theory of Interstellar Trade (Paul Krugman, 1978) [PDF] (related Slashdot article here).

They need... on EVE Online's Fight Against Currency Farmers · 2009-08-11 20:38 · Score: 1

... inflation.

Re:6 out of 11 is not "virtually every" on No Windows 7 XP Mode For Sony Vaio Z Owners · 2009-08-11 07:32 · Score: 1

Well, it could also be said that 9 out of 10 "branded" laptops with VT-capable Core2Duo CPUs have that feature disabled by their BIOS. The point is not about CPUs lacking a feature, but about CPUs that have the feature being crippled at BIOS level. Example: Acer Aspire 2930 (I own one, with an Intel Core2Duo P7350, which supports VT, but it is disabled at BIOS level, with no option to enable it in the BIOS menu). It seems that there are hacks for enabling it (1), but they involve reflashing the BIOS, which is, in my opinion, too much of a risk.
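
For anyone wondering how "supported by the CPU but disabled by the BIOS" looks at register level, here is a small C sketch. It assumes you already have the two raw values (ECX from CPUID leaf 1, and the IA32_FEATURE_CONTROL MSR, index 3Ah, which can only be read in ring 0, e.g. via a kernel driver); the function and macro names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX bit 5: the CPU implements VMX (Intel VT-x). */
#define CPUID_1_ECX_VMX            (1u << 5)

/* IA32_FEATURE_CONTROL (MSR 3Ah) bits set by the firmware at boot. */
#define FEATURE_CONTROL_LOCK       (1ull << 0)  /* MSR is locked until reset */
#define FEATURE_CONTROL_VMX_NO_SMX (1ull << 2)  /* VMX enabled outside SMX */

/* True if the CPU itself advertises VT-x support. */
static bool cpu_has_vt(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & CPUID_1_ECX_VMX) != 0;
}

/* True if the CPU supports VT-x but the firmware locked the MSR
   with VMX left disabled, i.e. the "crippled at BIOS level" case.
   The caller is assumed to supply the raw register values. */
static bool vt_locked_off_by_firmware(uint32_t cpuid_1_ecx,
                                      uint64_t feature_control)
{
    if (!cpu_has_vt(cpuid_1_ecx))
        return false;
    return (feature_control & FEATURE_CONTROL_LOCK) != 0 &&
           (feature_control & FEATURE_CONTROL_VMX_NO_SMX) == 0;
}
```

Once the lock bit is set, software cannot change the MSR until the next reset, which is why the workarounds end up touching the BIOS itself.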

Add also Acer to the 'evil list' on No Windows 7 XP Mode For Sony Vaio Z Owners · 2009-08-11 02:36 · Score: 1

My Acer Aspire 2930 laptop (Intel Core2Duo CPU) has the VT extensions disabled at BIOS level. Don't buy this model, and be wary of buying other models from Acer.

For sure I will not buy anything from Acer. In addition to the VT %$%$$%-ing, the laptop's VGA output is not properly shielded because of poor design, and produces a signal with a bit of flickering (to get a digital DVI output you additionally have to spend over 125 € on an "Easyport IV" docking station).

Just nonsense on Study Claims Point-of-Sale Activation Could Generate Billions In Revenue · 2009-06-28 04:46 · Score: 4, Informative

And what about the sales lost because of annoying the *customer*? Greedy idiots.

Re:Decoding Chips on YouTube, HTML5, and Comparing H.264 With Theora · 2009-06-14 09:15 · Score: 1

Magically no, but embedded device manufacturers would move quickly to: 1) provide "YouTube-resolution enough" Theora decoding in software on ARM SIMD, 2) provide hardware-accelerated Theora on GPU.

New features are adopted slowly on embedded devices; take the Flash player for browsers as an example. The change will come after demand, and Google could flip the situation at will: no matter which way they choose, they hold the Ace of Spades.

Re:Decoding Chips on YouTube, HTML5, and Comparing H.264 With Theora · 2009-06-14 09:00 · Score: 1

Most newer ARM CPUs inside systems-on-chip include SIMD extensions, so while being less efficient than GPU-accelerated h264, they should be enough for decoding YouTube-sized Theora video. GPU-accelerated Theora is a matter of time, but demand has to come first.

Re:Decoding Chips on YouTube, HTML5, and Comparing H.264 With Theora · 2009-06-14 08:57 · Score: 1

It is not a problem for Google.

Re:Decoding Chips on YouTube, HTML5, and Comparing H.264 With Theora · 2009-06-14 08:21 · Score: 2, Informative

The problem is encoding, not decoding, as the decoding is done on third-party hardware (the end user's). Also, in the transcoding process, i.e. converting from whatever format to h264/Theora, decoding is much faster than encoding (because of pattern matching and motion analysis). Anyway, bandwidth is the main problem, as uploaded video is re-encoded *once* and played *many* times.

Re:Metaphor on Looking at Intel's New-ish Desktop Socket, LGA 1366 · 2009-05-31 22:54 · Score: 1, Funny

From 3 to 4 blades? Come on!

Re:"functional programming languages can beat C" on World's "Fastest" Small Web Server Released, Based On LISP · 2009-05-25 08:13 · Score: 1

Why should it? Fortran is compiled, and so is C, and both are very simple and easily optimizable languages (GCC). Lisp can be compiled too, but because of its flexibility even compiled code implies higher overhead in parameter processing and more data cache thrashing, because of additional control structures requiring extra pointer usage.

In the C vs Fortran case you mention, the most time-consuming part is the complex domain square root (the loop can be unrolled, and the integer multiply and FP integer load can be pipelined, along with the FP multiply). Loop optimization, constant propagation, and strength reduction can be done by both Fortran and C compilers, so there is not much left to be done (needless to say, the complex 'sqrt' implementation is probably written in C).

Re:BEST POST OF MONTH on R.I.P. MS-DEBUG 1981 - 2009 · 2009-05-11 09:27 · Score: 1

Of month, you insensitive clod XD

P.S. thank you for the link, very cool demo.

Already gone... on R.I.P. MS-DEBUG 1981 - 2009 · 2009-05-08 07:59 · Score: 2, Interesting

... for XP64 and Vista64.

Here is my last tribute:

C:\Users\faragon>copy con hifolks.com
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz^Z
1 files copied.

C:\Users\faragon>debug hifolks.com
-a 100
187E:0100 jmp 112
187E:0102 db "Hello Slashdot!$"
187E:0112 mov ah,9
187E:0114 push cs
187E:0115 pop ds
187E:0116 mov dx,102
187E:0119 int 21
187E:011B int 20
187E:011D
-w
Writing 00048 bytes
-q

C:\Users\faragon>hifolks.com
Hello Slashdot!
C:\Users\faragon>
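
For those without DEBUG at hand, the same program image can be built by hand. This is a sketch in C of the bytes the session above assembles, assuming the usual 8086 encodings (short JMP = EB, MOV AH,imm8 = B4, PUSH CS = 0E, POP DS = 1F, MOV DX,imm16 = BA, INT = CD); build_hifolks is a made-up name. The program proper is 1Dh (29) bytes; the 48h bytes written in the session presumably reflect the size of the placeholder file created with copy con.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the .COM image from the DEBUG session into out[].
   Offsets in the comments are relative to the load address 100h.
   Returns the image size in bytes, or 0 if the buffer is too small. */
static size_t build_hifolks(uint8_t *out, size_t cap)
{
    /* '$' terminates the string for INT 21h, AH=09h (print string). */
    static const char msg[] = "Hello Slashdot!$";
    const size_t msg_len = sizeof msg - 1;   /* 16 bytes, no C NUL */

    static const uint8_t tail[] = {
        0xB4, 0x09,        /* 0112: mov ah,9   */
        0x0E,              /* 0114: push cs    */
        0x1F,              /* 0115: pop ds     */
        0xBA, 0x02, 0x01,  /* 0116: mov dx,102 */
        0xCD, 0x21,        /* 0119: int 21     */
        0xCD, 0x20         /* 011B: int 20     */
    };
    const size_t total = 2 + msg_len + sizeof tail;

    if (cap < total)
        return 0;

    out[0] = 0xEB;               /* 0100: jmp 112 (short jump)       */
    out[1] = (uint8_t)msg_len;   /* displacement 10h skips the text  */
    memcpy(out + 2, msg, msg_len);                /* 0102: db "..."  */
    memcpy(out + 2 + msg_len, tail, sizeof tail);
    return total;
}
```

Writing the returned bytes to a file named hifolks.com should give an equivalent program, minus the trailing filler.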

The problem is not threads vs processes... on New Firefox Project Could Mean Multi-Processor Support · 2009-05-07 09:23 · Score: 3, Insightful

... it is surrendering and accepting buggy-as-hell plug-ins or memory leaks as "acceptable".

The current multithreaded Firefox is already able to use multiple CPUs, so the real reason for splitting the tabs into independent processes is to surrender to mediocrity. How about increasing QA, doing proper synchronization between components, and not allowing untested components to be used without showing a big warning at installation?

Re:0.3 Megapixels... on Disassembling the US Nintendo DSi · 2009-04-08 10:50 · Score: 3, Funny

Fuck everything, give me the 5MP!

Re:Uhm, Intel makes ARM chips too on ARM — Heretic In the Church of Intel, Moore's Law · 2009-04-07 09:26 · Score: 1

Yes, with a license there is no problem for Intel to produce IBM, ARM, NVidia or AMD chips.

Re:Uhm, Intel makes ARM chips too on ARM — Heretic In the Church of Intel, Moore's Law · 2009-04-06 07:09 · Score: 1

They can manufacture under a Marvell license, but the IP is Marvell's property, except for IXP (Network Processors) and IOP (I/O Processors): that means that Intel should no longer be able to build "application" ARM CPUs (PXA*) without a license.

Re:Uhm, Intel makes ARM chips too on ARM — Heretic In the Church of Intel, Moore's Law · 2009-04-05 08:58 · Score: 1

You missed that Intel sold its ARM division to Marvell in 2006 (1).

Re:No laws overrridden on ARM — Heretic In the Church of Intel, Moore's Law · 2009-04-04 21:48 · Score: 1

ARM is not breaking any "law".

Are you sure that it is not breaking the law?

User: faragon

Comments · 372