kscguru · Slashdot Mirror

Useless API, for simple drivers only on Linux Kernel To Have Stable Userspace Drive · 2007-07-21 20:06 · Score: 3, Informative

Maybe I'm unique in that I not only RTFA, but browsed the patches themselves.

Which led me to the conclusion that this patch set is worthless. It allows remapping of memory-mapped I/Os to a userspace app, and allows a thread to "wait" on an interrupt. Both are nice ideas, and it would be very easy to implement a nice serial port driver with the new APIs. (As any kernel hacker knows, serial port is one of the simplest device drivers; it's easy.)
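To make those two primitives concrete, here is a minimal Python sketch of a userspace driver loop. It is not taken from the patches. It assumes the device shows up as a file node whose mmap() exposes the register window and whose blocking 4-byte read() returns a running interrupt count. The /dev/uio0 path, the 4096-byte window and the status register at offset 0 are illustrative assumptions as well.

```python
import mmap
import os
import struct


def map_registers(fd, length):
    """Map `length` bytes of the device's register window into this process."""
    return mmap.mmap(fd, length, mmap.MAP_SHARED,
                     mmap.PROT_READ | mmap.PROT_WRITE)


def wait_for_interrupt(fd):
    """Block until the device signals, then return the interrupt count.

    Assumption: each interrupt makes a blocking read() return 4 bytes
    holding a native-endian unsigned counter.
    """
    raw = os.read(fd, 4)
    if len(raw) != 4:
        raise OSError("short read from device node")
    (count,) = struct.unpack("=I", raw)
    return count


def serve(path="/dev/uio0", length=4096):  # hypothetical node and window size
    """Wait for interrupts forever and dump a (hypothetical) status register."""
    fd = os.open(path, os.O_RDWR)
    try:
        regs = map_registers(fd, length)
        while True:
            count = wait_for_interrupt(fd)
            status = regs[0]  # hypothetical status register at offset 0
            print(f"interrupt #{count}, status=0x{status:02x}")
    finally:
        os.close(fd)
```

Note what the loop does not have: no DMA setup, no locking against other kernel code, no timing guarantees. Every interrupt costs a trip out to user mode and back.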

The new API is completely useless for binary-only drivers. I/Os / IRQs are enough for extremely simple devices - these are, after all, the primitives for talking to hardware. But if this were all a driver needed, don't you think Nvidia / ATI would have used this model for shim drivers a long time ago? Simple things like DMA and PCI configuration access are not present - but to be fair, those are implementable with these primitives. Reality check: real world drivers are a lot more complicated. What is impossible is fast thread switching, kernel synchronization primitives, access to the network stack (wireless!), ring-0 CPU instructions, real-time timing access, and the huge reduction in context switches / cache flushes that comes from running within the kernel (moving code to user-mode increases latency by a factor of 3, roughly). Kiss the lag-free desktop goodbye as hard drive latency skyrockets, watch your 3D framerate drop by 70%, see your webcam stutter into unusability.

The kinds of drivers this API can support are the simplistic ones, the kind that are already GPLed and are already in the kernel, the 80% of devices in this world Linux has always had good support for. The kinds of drivers this API cannot support are 3D graphics, high-performance disk or networking, wireless networking, latency-sensitive USB or Firewire, the virtual devices (VMware, KVM, Xen, even /dev/tty) - notice that most of the devices Linux supports poorly (and all the common binary-only drivers) fall into this list.

To be fair, the official (e.g. from Linus) announcements I've seen only claim this interface is useful for embedded devices (which tend to code for a specific kernel, and not get updated). No official announcement claims the new API will help binary-only drivers. It's just the OSS-zealot crowd making unwarranted assumptions. Yes, this is the bad news: the stable userspace driver API will do nothing to solve binary-only driver dilemmas. Sorry.

Re:Nvidia is not the competition on Insight Into AMD's Linux Driver Development · 2007-06-06 05:55 · Score: 1

I have other replies, but only one point here is really important.

Driver API stability: a modern 3D graphics driver is a full OS in its own right, with internal threading models, schedulers, memory management, context switches, etc.; a modern driver needs more than just bugfixes. Every good developer knows the way to keep two large codebases manageable is a stable API between them; the only people who don't seem to get this are otherwise-intelligent Linux kernel hackers [kroah.com].
In Linux, drivers are a part of the Kernel, not outside programs. It is, in a sense, a single large codebase, not multiple large code bases. nVidia wants to be part of that codebase, without actually contributing itself to that codebase.

Fundamentally, totally, and irreconcilably wrong. (With respect to various kernel devs, who disagree). Drivers as part of the kernel work if - and only if! - the driver is trivial. 80% of the drivers out there are trivial - bus drivers, serial ports, chipsets, cheap network cards, VGA, generic USB devices. 20% are not trivial - 3D graphics, enterprise-class network cards (the ones with TSO, checksum offloading, huge ring buffers, etc.), VMware/Xen/KVM, advanced filesystems (most of which exist in FUSE). The key thing all these nontrivial drivers share is a large API with other components (usually user-level) - advanced filesystems expose other types of metadata, virtualization drivers expose a whole virtual CPU, 3D drivers expose an entire array of GPUs plus a huge texture memory, and all the APIs to control it.

The difference between a trivial driver and a non-trivial driver is the stability of the API the driver provides. Most drivers present in Linux today (including the misleading "supports more devices than any other OS" mantra) have a stable, simple API, and only need bugfixes. The drivers Linux is conspicuously lacking (i.e. the drivers that are present in Windows, Solaris, sometimes Mac, but not in Linux) have complex, evolving, independently-versioned APIs - APIs with huge dependencies on a codebase that exists outside the Linux kernel, code that is under active development even after the hardware ships. It's fine that kernel devs would happily update the APIs between a driver and the kernel - who is going to update the APIs between the driver and the codebase that uses the driver? For all the griping, the kernel-driver API doesn't change that much ... the driver-userlevel API changes much more rapidly for all the drivers I named above, and keeping that up-to-date is harder than maintaining out-of-tree drivers. And thus, the driver logically belongs in the codebase to which it is more tightly coupled - the companies' codebase, NOT the kernel's codebase. Moving the driver to the kernel makes the kernel dev's life easier, but the companies' life much harder.

I'll summarize: Linux does not have complex drivers for complex devices (like 3D graphics) because the Linux kernel is not set up to interface with complex drivers. The Linux kernel assumes drivers need only bugfixes or integration with new kernel APIs; complex drivers also need integration with user-level APIs, and the Linux kernel development model makes such integration extremely difficult to impossible. The real Linux device support mantra is this: "Linux supports more trivial devices than any other OS, but can't handle non-trivial devices."

I don't particularly blame Linux - it evolved as a Unix replacement in a world where there were no complex drivers, where the drivers-are-in-kernel-tree convention worked. Now Linux is trying to jump to the modern desktop world where there are complex, out-of-tree drivers. If Linux wants appeal outside of the Unix-replacement crowd, Linux will have to grow out of this drivers-in-kernel-tree myopia.

Allow out-of-tree drivers, and all the other problems in the original post disappear.

Re:Nvidia is not the competition on Insight Into AMD's Linux Driver Development · 2007-06-04 14:28 · Score: 3, Insightful

That's why Linux hackers want all drivers in the kernel tree, so they can find anything which breaks due to an API change, and fix the problems.

And that's exactly why hardware manufacturers DON'T want their drivers to live in the kernel tree. They don't get:

Bug fix backports: Kernel hackers apply a patch to the latest vanilla kernel, then say backports are the distro's problem. Two distros do, fifteen don't, users of those fifteen scream at the hardware companies for having buggy drivers. With an out-of-tree driver, you just need to update the driver. Windows and OSX are light-years ahead of Linux here.
Community contributions: Kernel hackers only add GPLed code; very few even allow dual-license, and certain vocal hackers are zealously GPL-only (to the extent of rewriting code JUST to make it GPLed). Which means the hardware vendor can't take a fix from a linux driver and integrate it into a Windows driver without GPLing that too. The vendor has to either GPL all drivers (Windows included), or maintain two separate trees (GPL and non-GPL), or not open drivers at all. Guess which is easiest / cheapest?
Driver API stability: a modern 3D graphics driver is a full OS in its own right, with internal threading models, schedulers, memory management, context switches, etc.; a modern driver needs more than just bugfixes. Every good developer knows the way to keep two large codebases manageable is a stable API between them; the only people who don't seem to get this are otherwise-intelligent Linux kernel hackers.
Kernel API freedom: Kernel hackers like stable userspace APIs (for good reason). But hardware vendors don't need to provide stable APIs if they have a shim library that actually talks to the cards (e.g. atioglxx.dll, the ATI OpenGL implementation). It's a lot easier to let the API change rapidly and only commit to a stable API at the library interface (the OpenGL API).
Easier work. The Linux kernel development process is optimized for making the kernel hacker's life easier at the expense of the driver developer's (hint: saying "we'll update your driver for you" clashes very badly when the HW vendor is simultaneously making changes). If kernel hackers want to see better device drivers, they need to stop treating drivers as second-class citizens. Microsoft is very good at courting driver developers; Linux is the definition of arrogance.

Re:"consumer products" only on GPLv2 Vs. GPLv3 · 2007-06-04 05:45 · Score: 1

The freedom to use a program means the freedom for any kind of person or organization to use it on any kind of computer system

Fine. I want gcc to run on my ancient Apple IIe. The FSF seems to believe I have as much right to run gcc on my Apple IIe as someone else has to run modified Linux on their TiVo.

The reason I dislike all this anti-tivoization junk in GPLv3 is that it's absurd - to the point of being comical - in terms of real hardware's limitations. How is a TiVo box any different from my toaster oven? Neither was designed to run hacked-up TiVo software. It's like giving a blind man the right to read or giving a man the right to be pregnant - the right is so totally absurd that it only looks stupid, yet the FSF is patting themselves on the back for doing so.

If you want a modified TiVo box, build your own hardware. GPLv2 was already sufficient to guarantee access to the source code. So what if it can't run on one brand of box? Build your own. What is the obsession with TiVo hardware about, anyway? Why is it that (legally) modified TiVo software MUST run on genuine TiVo hardware? - because you've already got the right to run it on other hardware because of GPLv2. Why is anything in GPLv3 required? I see no reason; GPLv3 exists only to force cheaper-than-a-computer hardware to be open so that cheapskate hackers can get ever-cheaper computers.

I read the FSF paragraph above as stating that freedom is the opportunity to port free software to any computer system. I see nothing that suggests freedom is that every computer system must be open to having free software ported to it - yet these TiVo clauses attempt to create such a right. I find it a distortion of otherwise sane FSF beliefs.

Re:hmm on Parallels 3.0 Announced, 3D Graphics Included · 2007-06-02 05:43 · Score: 1

I'm quite curious what virtualization platform you are using, where enabling VT/SVM in BIOS makes such a difference. It's certainly not VMware, because current hardware virtualization runs slower than the state-of-the-art in software. AMD and Intel have new chips with second-generation hardware virtualization (which DO make that much of a difference), but you can't buy those yet. I do suggest reading the paper linked above - it's a great description of where all the virtualization overheads actually are.

Re:The hassel factor on Is Linux Out of Touch With the Average User? · 2007-05-23 05:37 · Score: 1

No, that's not what he's saying. Here is the point:

User experience polish is lacking.

It's not a game of whack-a-mole, where you keep finding something that's a little inconvenient and making it better. It's a holistic process of making sure an AVERAGE user can go through an entire day without once having to read man pages, help files, or use Google to figure something out. The Windows help system isn't really useful (it's more of a safety blanket), but nobody notices because you don't need a help system to run Windows. I've been using Linux for many years, and any time I do something mildly complicated, the first thing I do is hit Google to find the options / name of the magic configuration program / location and format of the config file. I like Linux - once I figure out how to do something, I can do it again in a few keystrokes, instead of navigating fifteen menus - but Linux is hard to use and has a terrifying learning curve.

Linux is like a 10-in-1 kitchen appliance - it slices, it dices, it makes bread, pops popcorn, and does everything I might care about. But if you look at my kitchen shelves, I have 5 different appliances, and none of them are fancy, multi-use gizmos. The 10-in-1 gizmo seems like a great deal when you buy it, but it's hard to use - takes 10 minutes to reconfigure it for something else - you end up using most of the options once a year, if that - and I'd rather have five gizmos that just plug in and work with zero configuration.

Re:Suggestion to Stanford students: on Stanford To Charge Reconnect Fee For DMCA Notices · 2007-05-17 15:03 · Score: 2, Informative

You can address it to "resident" for each room number, no need to know names or anything else. Alternatively, do a ping sweep (or similar tactic) on your local network, and send notices to their office regarding all IPs you hit.

Ah, the amateur hacker at work. Stanford's IT people are much more clever than that.

Stanford filters the takedown notices for "real" ones. Dorm numbers / street addresses aren't enough. A valid takedown request has an IP address and time of connection ... and guess what? Stanford looks in their router logs, verifies that a connection was made with that IP at that time. If you want to claim your computer wasn't even there, Stanford IT is going to ask exactly how packets with your IP address entered your port in the switch. And then you get expelled for your transparent lie.

The IT folks are very much on the student's side here. Despite all the knee-jerk reactions here on Slashdot, illegitimate complaints simply don't get through - that's what a competent IT staff buys you. Of course, that much staff time on the student's behalf really does take money.

More generous than before on Stanford To Charge Reconnect Fee For DMCA Notices · 2007-05-17 04:30 · Score: 5, Insightful

Before Slashdot overreacts, I graduated from Stanford two years ago; this policy is more forgiving than what was in place in 2005.

Read it carefully - roughly, after the first notice, it's a $100 fee. After the second notice, it's $500 plus a notice to the residence dean (like a referral to the principal). After the third, it's $1000 plus a referral to Judicial Affairs (which, given Stanford's honor code, is likely to result in a suspension). The previous policy was a network disconnect until a student certifies offending material is removed, the second offense was another disconnect plus a notice to the residence dean, then after the third, referral to Judicial Affairs and a student was PERMANENTLY BANNED from the Stanford network. (Makes it quite difficult to do classwork.) I'm personally bothered with this new policy; makes it too easy for a rich kid to ignore everything.

Stanford's networking folks do look carefully at the notices, protect student privacy unless faced with a court order, and a student can contest the DMCA takedown notice without penalty with the eager assistance of Student Legal Affairs - although doing so waives your privacy. As of two years ago, no student had ever contested a notice - they were all clear-cut DMCA violations. And only well-documented violations ever got passed to students.

Now, let's be honest here ... I have yet to see a single person on Slashdot ever suggest running a file-sharing service from their desktop at work. So exactly why is a university a different story? Regardless of the merits of the DMCA itself (I personally think it's a stupid law, guilty-until-proven-innocent and with punishments far worse than the violation itself), the DMCA is still the law; why should a university be expected to shield individuals engaged in illegal behavior?

could match in a couple of months' time? on The State of Open Source 3D Modeling · 2007-05-06 06:36 · Score: 4, Insightful

Disclaimer: I haven't actually looked at any of these codebases. BUT - this jumped out:

Each of them offers a modern, much saner, more coherent, and more powerful basic architecture and could match Blender in a couple of months' time with some extra manpower.

Here is the problem. Actually, there are several problems all tied up here.

Each of them: great, there are three projects offering equivalent functionality, each hoping to supplant the current favorite? And which, pray thee, should an experienced developer contribute to? "Any of them"? --- bzzt, wrong answer. You're asking somebody to contribute when there is a 2 in 3 chance the contribution will be dead code when one of these emerges as a favorite? A born-into-money aristocrat who doesn't have to make his own living can do that; the rest of us have more limited time and can't. Hint: companies pay product managers quite a bit to keep developers from doing wasted work, partly to avoid overhead but partly because wasting a developer's work is the fastest way to kill any enthusiasm. Picking one option (even if it's wrong) is better than indecisiveness. And if you truly think multiple options are the best, then find a way for them to coexist (pluggable rendering cores) instead of killing each other off.
modern, much saner, more coherent, and more powerful: all of these are in the eye of the beholder. But here's an opportunity to defend yourself: if these new architectures are that much more powerful, it must be possible to implement the blender architecture with them. Which happens to be a sane migration path, instead of the throw-away-anything-old not-invented-here approach of an entirely new project. Blender is open source: fork it and insert the new architecture, instead of griping about how somebody else should do something better. (I know full well this isn't as simple as I'm making it out to sound. But you know full well these new architectures aren't unambiguously better than the old.)
could match Blender in a couple of months' time: such a confident development-time prediction! Anyone with predictions that solid should be administrator of a project already! Now that I'm done being sarcastic, "a couple of months" is totally unrealistic. Every additional developer needs ~1 month to get up to speed on a new codebase (and understand what Blender does), another X months to implement the new functionality to match Blender, and 2X months to work the bugs out of the new functionality. Wine has been a few months from being usable for general apps for years; Gnome has been a few months and a few developers from being able to replace Windows for years; Windows has been a few months from being bug-free for a decade.
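The schedule arithmetic in that last point is easy to run. This is only a worked example of the rule of thumb stated above (about 1 month of ramp-up, X months to implement, 2X months to debug). The function name and the sample values of X are made up for illustration.

```python
def months_to_parity(feature_months, ramp_up_months=1):
    """Ramp-up + X months of implementation + 2X months of debugging."""
    if feature_months < 0:
        raise ValueError("feature_months must be non-negative")
    return ramp_up_months + feature_months + 2 * feature_months


for x in (1, 2, 4):
    print(f"X={x}: {months_to_parity(x)} months")
# X=1: 4 months
# X=2: 7 months
# X=4: 13 months
```

Even if matching Blender's features took a single month of coding, the total is already four months per new developer, which is not "a couple".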

I don't mean to degrade the whole idea of finding something better than Blender. It's a fantastic goal, advances the state-of-the-art, and all sorts of other good things. I do dispute the misrepresentation of the ease with which it can be done: if it were even a tenth that easy, it would already be done.

Developers are willing to put up with the arcane code base because (1) it works, (2) it's Good Enough, which means anything newer has to overcome the training / usability barriers associated with switching, and (3) the newer options are not unambiguously "better". Remember: if app Bar (Blender) is already the standard, app Foo (these alternatives) not only has to be better for someone just starting, but also has to be better for an experienced user of Bar.

Re:Would be interesting if done right on Virtualizing Cuts Web App Performance 43% · 2007-03-29 05:14 · Score: 1

No, you did totally screw it up.

VMware Server vs ESX. One is for development and is free; the other has real performance. Ballpark, Server loses 15-20% of the performance you can get from ESX.
CPU counts. Host had 2 processors (well, 4 with HT). No idea what your guest had, but I'll assume 2 ... catch is, vSMP guests are experimental in VMware Server: they're known to have a non-trivial performance hit (~15% right there). If you really want to compare, set up TWO VMs, each with one processor, and see what overall performance is. Even at 1GB per VM (which is unfair, BTW - Server can share memory between the two, 1.3GB per VM is more realistic), I bet two 1P VMs will have more throughput than a single 2P VM. And oh by the way, this clustered solution resists software failures in the guest OS or web server by virtue of having two instances.

So you totally misunderstood the software, and try to cop out of it with "we didn't do any optimizations". Sorry - your benchmark isn't just lazy configuration, it's incompetent configuration, and there's quite a bit of difference between the two.

Re:Bogus Test on Virtualizing Cuts Web App Performance 43% · 2007-03-29 04:12 · Score: 2, Interesting

Not really - HP's and IBM's projects get 20% improvements by optimizing slow code - that is, untuned userspace applications. Take a whole system, including a kernel that multitudes of people have spent years tuning (Linux, Windows), server apps that already squeeze in as many tricks as possible (Apache), and the net gains of re-translating instructions diminish as the underlying apps already pull in more of these optimizations. Dynamo and DAISY also gloss over one crucial detail: you need a good-sized cache to store all these translations, so they're really a time/space tradeoff (more efficient in time with a cost in space efficiency).

That said, the COST of binary translation is never very high. A good BT engine gets essentially native performance (1% overhead is quite obtainable), and is limited only by the size of the translation cache.
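A toy model of why the translation cache size matters. This is a sketch under loud assumptions: real BT engines translate basic blocks of machine code, while here "translation" is just a string, and the LRU eviction policy is my choice for illustration, not something any particular engine is known to use.

```python
from collections import OrderedDict


class TranslationCache:
    """Bounded cache of translated blocks, keyed by guest address (LRU eviction)."""

    def __init__(self, capacity, translate):
        self.capacity = capacity
        self.translate = translate      # the expensive step we want to avoid repeating
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, address):
        if address in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(address)    # mark as most recently used
            return self.blocks[address]
        self.misses += 1
        translated = self.translate(address)
        self.blocks[address] = translated
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block
        return translated


def run_trace(trace, capacity):
    """Replay a list of block addresses; return (hits, misses)."""
    cache = TranslationCache(capacity, translate=lambda addr: f"translated@{addr:#x}")
    for address in trace:
        cache.lookup(address)
    return cache.hits, cache.misses


hot_loop = [0x1000, 0x2000, 0x3000] * 100   # three blocks executed in a loop
print(run_trace(hot_loop, capacity=3))       # (297, 3)
print(run_trace(hot_loop, capacity=2))       # (0, 300)
```

With room for the whole working set, translation is paid three times and then amortized away. One slot short, and in this toy every single execution pays for a retranslation.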

Re:GPLv3 on USDTV Subscribers Gouged For Linux USB Keys · 2007-03-27 16:31 · Score: 3, Insightful

the GPLv2 was never meant to allow you to see source code, but not be able to produce a modified binary that works. (emphasis mine)

GPLv2 guarantees you the right to produce a modified binary that works on some system.

GPLv3, with the TIVO-ization clauses, guarantees you the right to produce a modified binary that works on the original system.

The difference between those is huge! TIVO complies with GPLv2 (you could build your own TIVO box and install your modified source on it). USDTV seemed to comply with neither of these.

And this new GPLv3 doesn't clear up the GP's points about what is "distribution" (the 2nd GPLv3 draft only adds confusion, defining "distribution" as "conveying" without defining "conveying") - so that definition will still have to evolve via case law in the courts. Sorry FSF, y'all got too focused on fixing TIVO-ization and didn't actually clear up the ambiguities...

Re:Functional programming on Multi-Threaded Programming Without the Pain · 2007-03-22 04:48 · Score: 1

Windows doesn't have fork(). The POSIX subsystem has it, but you can't mix the posix subsystem with the win32 subsystem in the same app, and all the Windows apps you're thinking of run as win32. (win32's C library has a lot of posix calls ... fork() isn't one of them).

On Windows, process creation is expensive and complex, while thread creation is cheap, thus Windows design favors using threads for parallelism. On most Posix systems (unix, linux), process creation is cheap (e.g. fork, which has 40 years of optimization behind it), threads didn't enter the design until much later (Linux: NPTL, late 2.4 kernels?) and are mostly hacked in as "lightweight processes", so Posix design favors using processes linked by shared memory or pipes for parallelism.
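The two styles side by side, as a minimal Python sketch. The function names are made up for illustration. The fork version only runs on POSIX systems (os.fork() does not exist on Windows, which is the point above), and it uses a pipe to get the result back because the child has its own copy of memory.

```python
import os
import threading


def square_in_child_process(n):
    """POSIX style: fork a child, have it compute, read the answer from a pipe."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: write the result and exit without running parent code.
        try:
            os.close(read_end)
            os.write(write_end, str(n * n).encode())
        finally:
            os._exit(0)
    # Parent process: collect the result and reap the child.
    os.close(write_end)
    data = os.read(read_end, 64)
    os.close(read_end)
    os.waitpid(pid, 0)
    return int(data)


def square_in_thread(n):
    """Thread style: the worker shares memory, so it just appends to a list."""
    result = []
    worker = threading.Thread(target=lambda: result.append(n * n))
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    print(square_in_child_process(12), square_in_thread(12))  # 144 144
```

Same answer either way; what changes is whether the workers share an address space or have to talk through a pipe.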

But the underlying point is fine - every modern OS has support for some variant of parallelism. It doesn't have to be baked into the language or the compiler. And to anyone who understands multiprogramming, it's quite easy to switch between programming models.

Re:Its about the bug, not the environment on MS Security Guy Wants Vista Bugs Rated Down · 2007-03-18 05:46 · Score: 4, Informative

His security features are /GS, /SafeSEH, layout randomization and an execute bit? Okay, he really is full of it.

/GS. In theory works fine. In practice, you MUST (1) get the software publisher to compile with the switch, (2) cannot use inline assembly (/GS bails out on such code), and (3) must be willing to sacrifice a small bit of performance. In other words, a fair amount of real-world code can't use this. And oh by the way, this doesn't protect against all buffer overflows - it only protects against the easiest category. It's still quite possible to corrupt data with a buffer overflow, and maybe use that data to gain control.
/SafeSEH. Right ... how many common languages don't have good exception handling? You said C only, right? And how often do you use Windows exceptions in C? Not much, you say? When I've seen SEH code, it's almost always very narrowly scoped and thus easy to get right - in real code, Windows SEH is just a trampoline to get into another exception mechanism. Making it "safer" adds no value.
ASLR. This one makes generating a successful exploit a little more difficult - moves it from medium-easy to medium, because it's harder to hit a "target buffer". Of course, for compatibility reasons, a fair number of apps turn this off (they have assumptions about where code lives, and/or need the wasted address space). It helps - statistically. But a lucky guess is still going to succeed, and I don't trust luck for security.
DEP. A two-pronged technology, which (1) uses the NX bit and (2) disallows syscalls from data segments. Oh but wait, (1) requires having a fairly recent processor and (2) is fine for some apps, but breaks for anything that does dynamic code (e.g. a Java runtime), so it's also disallowed for many, if not most, apps.
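To put a number on the "statistically" in the ASLR point: a back-of-envelope sketch, assuming the load address is drawn uniformly from 2^bits positions and that each attempt is an independent guess (e.g. the target restarts and re-randomizes). The 8 bits used below are a purely illustrative figure, not a claim about any particular OS.

```python
def guess_success_probability(entropy_bits, attempts):
    """Chance that at least one of `attempts` independent guesses hits the target."""
    if entropy_bits < 0 or attempts < 0:
        raise ValueError("entropy_bits and attempts must be non-negative")
    miss_once = 1 - 2.0 ** -entropy_bits
    return 1 - miss_once ** attempts


for attempts in (1, 16, 256):
    p = guess_success_probability(8, attempts)   # 8 bits: illustrative only
    print(f"{attempts:4d} attempts: {p:.1%}")
```

Under those assumptions, one guess in 256 lands on the first try, and an attacker who can keep retrying gets better than even odds well before exhausting the space. Randomization raises the cost of an attack without ruling it out.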

So what do we find out from this list? You get defense-in-depth - IF you are running the latest hardware, IF you use only software built with MSFT's favorite options (some of which are opt-in), and IF you only run apps that embrace all these strategies. How many Joe Consumers fit into those ifs? Datacenters might be closer, but I'll bet even they can't generally say all these hold true.

I'm glad open-source is adopting some of these measures. But let's be realistic - all any of these technologies do is make a sieve less leaky by putting a second sieve underneath. Something is nice, but we would be fools to treat any of these security "features" as more than a speed bump.

Re:Trying to have her cake and eat it too? on Hacker Defeats Hardware-based Rootkit Detection · 2007-03-04 20:56 · Score: 1

Rootkit VMs are "impossible" in the same sense that cracking strong crypto is "impossible".
You really seem to be drinking the same kool-aid you accuse Rutowska of - there is nothing even remotely like the rigorous mathematical proof of crypto in the VM world. All there is the opinion of people, who are certainly experienced but hardly infallible.

When my doctor says "take this pill or you'll stay sick", he's certainly experienced but hardly infallible. Yet I don't accuse him of being wrong.

Anyway, you want a quasi-mathematical argument?

The problem: whether a rootkit-VM can hide itself from a "guest" without serious performance degradation or lack of functionality.
Given: the "guest" can run any arbitrary code it wants. This follows from the requirement of not lacking functionality.
Lemma: the rootkit-VM cannot detect "guest" code trying to detect a rootkit. Why? Solving the general case is akin to the halting problem (trivially: the guest code halts if it detects a rootkit), and is thus undecidable. The best we can do is heuristics: guest code looks like a rootkit detector and should be modified to give wrong answers. Problem is, any heuristic has false positives and/or false negatives (else it would solve the halting problem); false negatives would allow detection of the rootkit directly and false positives would allow detection indirectly (trivially: my program said "safe", a real computer would have said "unsafe" for this code). Antivirus vendors have the same problem: every new heuristic causes virus writers to change viruses to escape the heuristics, in a never-ending arms race. For the rootkit-VM, the advantage of innovation lies with the detector. By the way, this is the fallacy of the naive "my hypervisor has a rootkit-detector-detector" game.
(I'm ignoring the far easier argument: the complexity of a rootkit-detector-detector is large, both in hypervisor size and in runtime speed; hypervisor size limits the possible attack vectors for loading the rootkit and runtime speed makes timing attacks even easier.)
From this, we conclude that the "guest" can trust its own code enough. Not "nobody can see me" trust, but "result is correct" trust. That is, the "guest" only has to detect that it is within a rootkit, or equivalently, that the "guest" is running in an environment that is not expected (i.e. an environment that doesn't match real hardware).
Now for the other side: the rootkit detector is really looking for hardware that doesn't act like the real hardware should. PCI-based memory scans and external time sources are just examples of the general category.
To prevent hardware-functionality attacks, a hypervisor must either trap all inputs to hardware (i.e. virtualize all devices) or allow only inputs that cause outputs with known effects (i.e. pass through safe devices). The former is impossible because you won't know a priori what the hardware on the system does; the latter is impossible because the VM can't force arbitrary hardware to be safe. (At least, not without VT-d and an IO MMU, but that doesn't exist yet except as an unimplemented spec.)
If you dispute the last point, I challenge you to come up with a way to virtualize arbitrary hardware devices. If you do, VMware, Xen, Microsoft, KVM, Parallels, and everybody else in the business would like to talk to you.

Re:Trying to have her cake and eat it too? on Hacker Defeats Hardware-based Rootkit Detection · 2007-03-04 13:21 · Score: 1

Either rebut the core argument, that hardware-based memory retrieval is subject to manipulation by malware

Irrelevant. Hardware-based memory retrieval was already subject to manipulation by malware - she's found a new way of doing so, nothing exciting here. (How was it possible before? As mentioned above, SMM BIOSes already use the same tricks to make some memory non-accessible by devices. They aren't using it for malware - and they aren't deliberately trying to hide themselves - but the effect is the same).

There are only two new wrinkles here. (1) using the PCI mapping manipulations to deliberately hide code instead of accidentally hiding data, and (2) assuming that the PCI mapping modifications are undetectable because of a fantasy rootkit-VM. In fact, I don't think Rutkowska actually claimed (2); I point it out only because everyone arguing with me assumes it.

The question is whether we have a legitimate hole in our use of RAM dumps as forensic evidence.

If you were assuming RAM dumps are perfect, unforgeable forensic evidence before, then yes, there's a new hole there. Me, I look at the whole idea as absurd (if there's a PCI device to capture memory, it has to connect to the PCI bus, which means it has to exist in PCI space, which means any rootkit would be able to see it, and why is it the rootkit is being insanely smart about hiding itself from PCI devices while still being stupid enough to run on a system with a PCI RAM dumper?)

Re:Trying to have her cake and eat it too? on Hacker Defeats Hardware-based Rootkit Detection · 2007-03-04 13:03 · Score: 1

Obviously (1) is really no defense at all because it only needs to be done once and then it is "free"

If you read the other end of those links, you'll read people who have been working on this problem a long time declaring it technically hard. Pass-through won't work because hardware's view of memory isn't virtualized; virtualizing devices doesn't work because of the need to virtualize every device in the underlying system (and the implicit need to know about every possible interface to do so). Sure it only needs to be done once ... just like proving P = NP only needs to be done once. That doesn't make it possible.

Rootkit VMs are "impossible" in the same sense that cracking strong crypto is "impossible". Rutkowska's research says "assume VMs are undetectable. Then attack X is possible". This is akin to saying "assume strong crypto can be cracked. Then attack X is possible", except Rutkowska ignores the experts who say that undetectable VMs are as unsolvable as strong crypto.

I'm quite serious when I say that Rutkowska has done some work, stated "the rest is left as an exercise for the reader", and declared victory. Catch is, anyone doing work on VMs sees that she's done 10% of the work and gleefully ignored the 90% where the known difficulties are. The woman DELETES criticisms from her blog and dismisses other attacks as "theoretical" (Anthony Liguori is a Xen contributor). Odd ... her rootkit is far more theoretical than the attacks against it, since solid rootkit-detectors exist and her rootkit is a self-admitted prototype that ignores the need to hide itself.

(2) is deceptively difficult to implement in a fool-proof manner that does not incur large costs on a per-site basis -- it requires an external time source *and* an external performance measurement system

Just an external time source - since the detection code is arbitrary (i.e. I could be sending cryptographically signed timestamps, or reading a timestamp off some well-known URL like yahoo.com), it's not possible to block the external measurement. (Sure the VM could look for such loops, but it has to detect any POSSIBLE loop, even dynamically generated ones, which is a halting problem variant and thus undecidable).
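A minimal sketch of that timing check, in Python for readability. Everything named here is hypothetical: `external_now` stands in for whatever trusted off-box clock is used (signed timestamps, a well-known server), `workload` for a loop of operations that get slower under a VM, and `baseline_seconds` for a figure measured on known-clean hardware. The 1.5x tolerance is an arbitrary placeholder, not a recommended value.

```python
from typing import Callable


def looks_virtualized(workload: Callable[[], None],
                      external_now: Callable[[], float],
                      baseline_seconds: float,
                      tolerance: float = 1.5) -> bool:
    """Return True if the workload ran suspiciously slowly by an external clock."""
    start = external_now()            # timestamp taken from outside the machine
    workload()                        # operations expected to slow down under a VM
    elapsed = external_now() - start  # the local clock is never consulted
    return elapsed > baseline_seconds * tolerance
```

The point of the design is that the code never reads a clock the machine under test controls, so a VM cannot simply lie about elapsed time.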

I've heard a second defense against timing analysis: ban external time sources. But this means banning a machine from the Internet, which escapes the whole security problem anyway.

What is to prevent that rootkit-VM (the one that twiddled the registers in the first place) from faking out any attempt by the legit OS to read those registers?

Because it doesn't have to be the OS reading them - it could be SMM code, or even a device snooping northbridge bus accesses to detect oddly-mapped memory accesses. But this assumes a rootkit-VM is possible in the first place - and we return to point (1).

Trying to have her cake and eat it too? on Hacker Defeats Hardware-based Rootkit Detection · 2007-03-04 04:55 · Score: 4, Insightful

Let's see ... last year, she got all over the headlines claiming that virtual machines are a Bad Idea because rootkits could use them to remain undetectable (even though virtual machine experts discounted her "trivially easy and left unimplemented" parts as technically intractable).

And now a year later, she claims we need specialized hardware interfaces to scan memory for rootkits, even though this problem is laughably easy in the world of virtual machines.

And on to the actual work ... the research basically observes that the MTRRs (some of the MSRs in the CPU) can cause memory mappings to look different between the CPU and the northbridge, and then comes up with a pretty easy way to cause the northbridge to either lock up or read data that is different (really easy once you see the specs for the appropriate registers). And she totally ignores the possibility of a system defending itself against this attack by verifying the registers she's modifying. Lousy research, girl.

Oddly enough, this "hack" is ALREADY IN USE ON YOUR SYSTEM and is actually necessary. See, when the processor is running in SMM (System Management Mode), it switches to exactly this configuration: the PCI bus sees VGA hardware mapped at the well-known address, but the processor maps the RAM at that address, which gives SMM mode a few kilobytes of memory that the normal system can't touch. SMM mode is used for things like "legacy USB devices" (e.g. having your USB keyboard act like PS/2 so DOS can use it) and other implement-in-software hacks that your OS doesn't know about, but your BIOS vendor gives you as "value-added features".
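A toy model of that split view, in Python. This is only an illustration of the idea, not how any chipset is programmed: `ToyMachine` and the `cpu_window_to_ram` flag are made up, and the flag stands in for whatever register setting redirects CPU accesses. The address range is the legacy VGA window, 0xA0000 to 0xBFFFF.

```python
VGA_WINDOW = range(0xA0000, 0xC0000)  # legacy VGA window


class ToyMachine:
    """Two backing stores behind one address range, chosen by who is asking."""

    def __init__(self):
        self.ram = {}                   # what sits behind the window in DRAM
        self.vga = {}                   # what the VGA hardware exposes
        self.cpu_window_to_ram = False  # hypothetical stand-in for the register setting

    def _target(self, addr, from_cpu):
        # Devices always decode the window to VGA; the CPU does too unless redirected.
        if addr in VGA_WINDOW and not (from_cpu and self.cpu_window_to_ram):
            return self.vga
        return self.ram

    def cpu_read(self, addr):
        return self._target(addr, True).get(addr, 0)

    def cpu_write(self, addr, value):
        self._target(addr, True)[addr] = value

    def device_read(self, addr):
        return self._target(addr, False).get(addr, 0)
```

With the flag set, a value the CPU writes at 0xA0000 is readable by the CPU but invisible to `device_read`, which is the whole trick: same address, different contents depending on the observer.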

How about the David Patterson perspective? on IBM's Chief Architect Says Software is at Dead End · 2007-01-30 06:07 · Score: 3, Interesting

Instead of an IBM executive, how about David Patterson? Hint: he wrote The Book on computer architecture.

Berkeley tech report (inc. Patterson as author)

Brief summary (I heard the same talk when he spoke at PARC): computational problems fall into one of thirteen categories that range from matrix multiplication to finite state automata. Most existing research (academia and industry) into parallelism tends to focus on about seven of those categories that are most easily parallelized - think supercomputer cluster. Most apps that you or I use fall into the graph traversal or finite-state categories (think compilers, apps with an event loop, etc.), into which there is essentially no research. Patterson even suspects that finite state machines are inherently serial and CANNOT be parallelized.
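One way to see the intuition behind the "inherently serial" suspicion, sketched in Python. The function names and the parity example are illustrative only, and this shows the data dependency rather than proving anything about what can or cannot be parallelized.

```python
def run_fsm(transitions, start, inputs):
    """Step a finite state machine: each step needs the state from the step before."""
    state = start
    for symbol in inputs:
        state = transitions[(state, symbol)]  # depends on the previous iteration
    return state


def scale(values, k):
    """Contrast case: every output element depends only on its own input."""
    return [k * v for v in values]  # iterations could be split across cores freely


# Example machine: tracks whether the number of "1" symbols seen so far is even or odd.
PARITY = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

In `scale` the loop iterations can run in any order; in `run_fsm` the written loop cannot start step n until step n-1 has produced its state.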

So ... the apps that we already use can't really get faster on parallel cores without major, fundamental advances in computer science that don't seem to be approaching. Which means we'll be using our current apps for a LONG time.

Additional note: IBM (and other chip manufacturers) have a vested interest in telling everyone that parallelism is the future. They can't make faster chips anymore, they can only compete on sheer number of cores.

Entirely pragmatic! on Torvalds Describes DRM and GPLv3 as 'Hot Air' · 2007-01-16 05:26 · Score: 1

But if he'd [written git] in the first place, git would have been years ahead in development by now, and the Linux community could have avoided an embarrassing debacle.

If Linus had written git in the first place, the Linux kernel would still be at v1.2, used only by guys with beards in university basements. (Where is Hurd right now, Stallman?). Linus is a pragmatist - he uses the best tool for the job. When Bitkeeper stopped being the best tool, he switched to git. Linus actually learned a very important lesson from Microsoft - it doesn't have to be Right, it just has to be Good Enough that you can make it Right later (which the GPL promises). Linus is so good at this that the kernel, under his stewardship, is beating Microsoft at their own game.

Re:fine line between "moderate" and "apolitical" on Torvalds Describes DRM and GPLv3 as 'Hot Air' · 2007-01-16 05:12 · Score: 2, Insightful

If you're at a store and notice that a customer keeps distracting the cashier,

The idea that you have a moral authority to force others to accept their rights is completely - and utterly - wrong. In your store analogy, I don't know if your robber is armed - and if I'm with somebody important to me, I have a moral responsibility to NOT notice, because interfering puts my somebody at risk. To go back to DRM, I do not have any moral responsibility to educate my users about the "evils" of DRM, particularly if I happen to believe that DRM is useful and necessary in some cases.

I don't need Big Brother watching over my shoulder and telling me to assert my rights. And I find it ironic, and very saddening, that Free Software tries so hard to invade my life and demand I accept rights that I reject - and worse, tell me I should demand others accept those rights via GPLv3. Sure, I can pick a different license - this is exactly what Linus has done by retaining GPLv2. And it's entirely hypocritical to call him apathetic for taking the same action you suggest I take.

Re:I want a gaming designed VM on VMware Fusion goes Beta · 2006-12-23 06:12 · Score: 1

Apples and oranges. Glide is ANCIENT - it predates hardware T&L, it's before vertex shaders and pixel shaders. Newer DirectX and OpenGL exposes a lot more hardware-level detail that has to be translated back and forth. And it has to be translation - you can't directly pass through the card, the host and guest would end up fighting each other for control and instantly kill the machine.

Re:Not surprising at all on Linus Puts Kibosh On Banning Binary Kernel Modules · 2006-12-14 06:36 · Score: 1

Are you saying Linus' views aren't politically correct? Well tickle me pink ... let's throw him in the Gulag!

With all due respect to your low UID, greedy algorithms are better than more forward-looking algorithms in most cases. Sure they aren't perfect, but they tend to be good enough, and good enough is the difference between shipping next month and two years from now (when your competitor controls the market). In fact, I claim the reason Linux has succeeded where GNU has not (HURD?) is precisely because of Linus' pragmatism. And I would drop Linux immediately were he to change.

Re:For those brain-dead like me: on Linux Kernel to Include KVM Virtualization · 2006-12-12 06:04 · Score: 1

"vmx" for Intel, "svm" for AMD

Re:Windows XP NUMA support on AMD QuadFX Platform and FX-70 Series Launched · 2006-11-30 06:06 · Score: 4, Insightful

Having done NUMA benchmarks ... on AMD chips, certain workloads take a latency hit from 60ns / memory access to 80ns / memory access. Bandwidth is halved. Net, it's a 5-10% slowdown across all workloads (5% if you try for average-case performance, 10% if you just hope for the best). Both sites point out that it is NUMA making the difference, yet both sites insisted on staying with Windows XP. A new motherboard like that, it's defaulted for a NUMA OS, so this is the 10% realm. As you point out, it would be extremely simple to run modern Linux (or Windows 2003, or Windows XP x64, or Vista) and see how well a NUMA scheduler works. (Note to Linux fans: Linux didn't have good NUMA scheduling either when Windows XP came out. A fair comparison would be against Linux 2.4.3 or so). This benchmark is fantastically stupid - it's the equivalent of running game benchmarks with a Voodoo3 graphics card to see CPU differences, determining that the graphics card is the bottleneck, then claiming one CPU is faster! Their benchmarks exposed a major slowdown in the memory system, easily corrected with an OS upgrade, and they refused to correct it.
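As a rough illustration of the two latency figures above, here is the arithmetic in Python. The linear mixing model and the `remote_fraction` parameter are assumptions made for the sketch; it covers raw memory latency only and does not reproduce the 5-10% end-to-end figure, which is a measured number.

```python
def average_latency_ns(remote_fraction, local_ns=60.0, remote_ns=80.0):
    """Mean memory latency when a given fraction of accesses go to the remote node."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns
```

With a NUMA-unaware scheduler and two nodes, roughly half the accesses might land on the wrong node, so `average_latency_ns(0.5)` gives 70 ns against 60 ns for the all-local case; a NUMA-aware OS tries to push `remote_fraction` toward zero.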

In short, once you factor NUMA out of these benchmarks, the difference between AMD quad and Intel quad is approximately the same as the difference between AMD's K8 arch and Intel's Core arch for single cores. Umm... duh?
