piranha(jpl) · Slashdot Mirror

Re:Hmmmmmmmmn, on Fluendo To Sell Proprietary Codecs For Linux · 2007-01-16 03:58 · Score: 2, Informative

NVidia sat on their hands for years with that security problem, didn't they?

Re:what's the purpose of a language, anyway? on PHP Application Insecurity - PHP or Devs Fault? · 2007-01-11 22:11 · Score: 1

An entire class of common web-programming bugs, Cross Site Scripting, could be eliminated if people would use DOM.

Currently, common practice for dynamic web applications is to build your output HTML document piece by piece, string-wise, splicing variables left and right to get the desired result. Unless you're extremely diligent in properly encoding plaintext to HTML (meaning changing <, >, &, and sometimes " and ' into HTML character entity references), you will slip up and people will find ways to inject arbitrary HTML (and therefore, arbitrary Javascript) into the output of some page in your application.

DOM is the structured approach to creating an HTML document. You don't splice strings, you manipulate tree structure fragments. If you want to include the output of variable foo, you might say: elem.appendChild(document.createTextNode(foo)). DOM models the data in a structured format, and makes it just about impossible to introduce XSS bugs, because textual content is always properly encoded when the document is serialized and sent to the browser.
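To make that concrete, here is a minimal sketch in Python using the standard library's xml.dom.minidom rather than PHP's DOM. The variable names and the hostile input string are made up for illustration; the point is only that text inserted as a text node comes out encoded when the tree is serialized.

```python
from xml.dom.minidom import getDOMImplementation

# Hypothetical untrusted input, e.g. a comment submitted by a user.
foo = "<script>alert(1)</script> & more"

# Build the document as a tree instead of splicing strings.
impl = getDOMImplementation()
doc = impl.createDocument(None, "p", None)
elem = doc.documentElement

# The untrusted value goes in as a text node, never as markup.
elem.appendChild(doc.createTextNode(foo))

# Serialization encodes <, > and & in text nodes.
serialized = elem.toxml()
print(serialized)
```

The script tag never reaches the browser as a tag; it arrives as harmless character entity references.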

It's like the difference between building SQL queries by hand or using prepared statements. The idea is to use a special notation for these specialized tasks (building a query, building a document). Then you stop worrying about trivial syntactic security issues and start actually operating on a more abstract and efficient level.
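The SQL side of the analogy, sketched in Python with the standard sqlite3 module. The table, the row, and the hostile input are invented for the example; any database library with bound parameters would show the same thing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# Hypothetical hostile input.
name = "nobody' OR '1'='1"

# Built by hand: the input rewrites the query and matches every row.
spliced = conn.execute(
    "SELECT name FROM users WHERE name = '" + name + "'"
).fetchall()

# Prepared statement: the input is bound as a value and matches nothing.
bound = conn.execute(
    "SELECT name FROM users WHERE name = ?", (name,)
).fetchall()

print(spliced, bound)
```

The hand-built query leaks the row for alice; the bound query returns nothing, because the quote characters are treated as data instead of syntax.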

What floors me is that PHP's DOM implementation absolutely blows chunks. For a "web programming language," they've really dropped the ball. DOM can make you more efficient: it abstracts HTML document generation so that you don't have to worry about quoting things (something of a relief, actually). You don't have to worry about stupid malformed-document bugs (say an obscure function closes an element prematurely). And it's suddenly easier to write software that returns bits of HTML code and ties it in with other HTML code.

PHP's DOM is unreliable, or at least greatly surprising. Half the functions aren't documented. And PHP's DOM doesn't even follow the W3C DOM standard: every real DOM uses the same names for functions and attributes. PHP does it "their" way, because they don't understand that the standardization of DOM is part of what makes it so fantastic.

Part of the problem may be PHP's reference semantics. It wasn't until considerably after I tried PHP's DOM that I really learned how references work in PHP. (The official documentation is far from clear on the subject.) Whenever you do assignment or call a function, all values are deep-copied. So, if you have variable a referencing an object, write "b = a", and change the object by dereferencing "a", you won't actually change the object referenced by "b". You actually have to use a special syntax in assignment to copy the reference to some object and not to instantiate a brand-new one. This flies in the face of convention in contemporary imperative/object-oriented languages, and must make writing a DOM implementation impractical.

For my own project, I gave up on mutating a PHP DOM tree, and was forced to clone the whole tree just to modify it. Unlike most PHP documentation, the DOM documentation has few to no examples. Because of PHP's special reference semantics, examples would have been essential.

Poor scientific practice on Virtualization In Linux Kernel 2.6.20 · 2007-01-09 23:31 · Score: 3, Informative

Why do they document the model of CD-ROM drive they used, but not the configuration of each emulation/simulation environment? I was shocked by the LAME compile times--and forced to wonder and guess what the filesystem configuration was. Is the filesystem located in an image file on the "host" computer's filesystem? Wouldn't it be interesting to try using a comparable medium across all benchmarks (shared NFS server, or low-level access to the same block device)?

Not enough data (CPU time vs. real time, etc.), not enough benchmarks (different filesystem media, etc.), poor documentation (configuration, anyone?), on what doesn't even amount to an official release. Correct me if I'm wrong.

Re:Don't use C++ as if it was only "C with classes on How Do You Know Your Code is Secure? · 2007-01-08 13:44 · Score: 1

an inherently insecure language (i.e., any compiled language)

Hmm. Would you call Java, Haskell, OCaml, or Common Lisp "inherently insecure languages"?

Re:Don't overlook popularity on File Systems Best Suited for Archival Storage? · 2007-01-06 01:04 · Score: 2, Insightful

Does anyone use RAR outside of the copyright infringement scene?

Re:This Was Possible A While Ago on MultiSwitch, the First USB Sharing Hub · 2006-12-21 09:20 · Score: 1

USB/IP is hot. If it worked with Xen, I'd be using it for my thin-clients (e.g. CD burning).

USB/IP: USB sharing for Linux on MultiSwitch, the First USB Sharing Hub · 2006-12-21 09:15 · Score: 1

USB/IP is a Linux project to export USB devices on one computer so that others on the network may use them. As with the hardware described in this article, two computers may not simultaneously use the same device; USB has no provisions for that.

Re:This is... on Detecting Rootkits In GNU/Linux · 2006-12-18 10:44 · Score: 1

No no, the real fruitbats disable the kernel module facility entirely. It could be used as a rootkit vector!

Re:the whole point... on Vista's TCP/IP Promises and Perils · 2006-12-13 02:46 · Score: 1

That doesn't make a lick of sense. References please.

Re:You want Lisp. on Bjarne Stroustrup on the Problems With Programming · 2006-12-07 08:26 · Score: 1

Or just write a Common Lisp implementation in Python...

Yeesh. Maybe when there's a self-hosting Python compiler.

Re:You want Lisp. on Bjarne Stroustrup on the Problems With Programming · 2006-12-05 01:19 · Score: 1

That's prefix notation, but you knew that.

You want Lisp. on Bjarne Stroustrup on the Problems With Programming · 2006-12-04 22:24 · Score: 4, Insightful

You want Lisp. Hear me out.

Of course, the character syntax is superficially different. Operators use prefix notation ("(+ 1 2)" is analogous to "1 + 2"), and have the same character syntax as function calls ("+", an operator in Lisp jargon, may be implemented as a function).

If you can sleep at night after that, you can define your own higher-level language syntax that looks exactly like any other Lisp syntax. Lisp is extremely flexible in its naming of functions and variables (symbols). If you'd like, you could define an operator named .= as a function: (.= string new-character-strings ...) would modify the given string object, string, in-place, appending each specified new-character-string to the end.

Recognizing the downside to modifying random strings in-place, perhaps you'd rather have your .= operator assign a newly-instantiated string to the variable referenced by string. You could, by writing the operator as a macro. The macro would act like a function, taking as input each "raw" argument (symbols and lists, the structure as they appear in your program, before evaluation) and returning as output replacement Lisp code to evaluate in its place. So your .= operator form of (.= out "lalala") is semantically equivalent to (setf out (concatenate 'string out "lalala")) (like out = out . "lalala"; in other languages).

It's not just simple textual substitution. You can use any function or macro in your macro definition to transform your input arguments into whatever replacement code you'd like. I'm using macros in Common Lisp to generate recursive-descent parsers based on a grammar production expression: the following form defines a function named obs-text that takes a string as input and returns a list of matches found as output:

(defproduction obs-text (LF :* CR :* (obs-char LF :* CR :*) :*))

This function is defined in place and evaluated and compiled immediately by the Common Lisp implementation.

Macros can be abused, but they add a tremendously powerful capability of abstraction not possible with many other languages.

Re:Your Bad Call was... on Mark Shuttleworth Tries To Lure OpenSUSE Devs · 2006-11-26 12:47 · Score: 1

Link, paraphrased: "Closed source Linux kernel modules are illegal. Don't ask me how. Oh, but I know. Just talk to a lawyer. Don't ask anyone this question in public."

Blessings of the state. Blessings of the masses. on First Company Logo Visible From Space · 2006-11-14 19:56 · Score: 1

Let us be thankful we have commerce. Buy more. Buy more now. Buy more and be happy.

Re:I'd recommend doing experiments on Which Filesystem is Best for CompactFlash? · 2006-10-19 17:22 · Score: 1

As for thrashing due to memory limits - don't use swap space. Ante up for more memory and write your code so it fails gracefully if it is out of RAM.

Keep in mind Linux will kill processes which use "too much" RAM, nullifying graceful memory shortage recovery code. See: Memory overcommittal, or how to avoid the OOM killer.

Re:Safari has similar capabilitites on New Web Browser Leaves No Footprints · 2006-08-31 03:01 · Score: 1

Create a special "privacy-mode" profile for Firefox. Each profile uses entirely different settings, history, cache, and so forth from any other profile. This way, I've created a "privacy" profile that saves nothing to disk and is configured to use Tor as its proxy. What's more, you can run two Firefox instances of different profiles at once.

Re:decNumber libary from IBM on The Trouble With Rounding Floats · 2006-08-14 01:00 · Score: 1

You're right; I've obviously never implemented a rational arithmetic library. Thanks for pointing that out.

Re:decNumber libary from IBM on The Trouble With Rounding Floats · 2006-08-13 16:01 · Score: 3, Informative

Rational number arithmetic is a more general solution. Any number that can be expressed in decimal or floating-point notation is rational; any rational number can be expressed as (n/d), where n and d are integers. We have "bigints;" unbounded-magnitude integers constrained only by the memory of the computer they are stored on. Rational numeric data types pair two bigints together to give you unbounded magnitude and precision, and have been implemented for decades.

They probably aren't directly supported in your favorite programming language because they are slow to work with when you need very high precision; after each calculation, the rational number needs to be reduced to its lowest terms. This means computing the greatest common divisor of the numerator and denominator, which takes longer as the terms themselves grow.
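A small sketch in Python, whose standard fractions module pairs two arbitrary-size integers in the way described. The helper reduce_to_lowest_terms is a made-up name for illustration and assumes a positive denominator; it shows one common way to reduce a fraction, dividing out the greatest common divisor.

```python
from fractions import Fraction
from math import gcd

# Binary floats cannot represent 0.1 exactly, so the sum drifts.
float_sum = 0.1 + 0.2

# Rationals built from integers stay exact.
exact_sum = Fraction(1, 10) + Fraction(2, 10)


def reduce_to_lowest_terms(n, d):
    """Reduce n/d by dividing out the greatest common divisor (d > 0 assumed)."""
    g = gcd(n, d)
    return n // g, d // g


print(float_sum == 0.3)                  # floats: not equal
print(exact_sum == Fraction(3, 10))      # rationals: equal
print(reduce_to_lowest_terms(30, 100))
```

The reduction step runs after every operation, which is where the cost comes from once numerators and denominators grow to hundreds of digits.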

Consider the use of integers, floats, or decimals only as an optimization when it has been shown that an application is suffering a serious performance hit because of rational arithmetic, and when you can use a faster data type knowing that your program will perform within accuracy goals.

For 90% of computing problems, monetary calculations included, you shouldn't even have to worry about what numeric type you're using. Your language should assume rationals unless told otherwise. Common Lisp, Scheme, and Nickle do exactly that.

C developers can use GMP. Other developers can use one of many bindings to GMP.

Re:Huh? on RSS and Web Feeds a Risk? · 2006-08-07 16:04 · Score: 1

What you're describing is basically a blacklist of all the ways that JavaScript could make its way into HTML.

You don't seem to understand what I proposed. I'm describing building a whitelist of allowable HTML elements in a document. And advising you to make informed decisions on whether to support certain elements and attributes, erring on the side of not supporting them for the sake of security.

As an example, many know that Javascript can be included, legally, in CSS. Let's say we're implementing the system I described in a pre-CSS-dominance world. We aren't familiar with the <style> element (even though it is defined in HTML), so we decide not to worry about formally parsing CSS—we simply discard <style> elements. We can choose to throw away the <style> element entirely, or include its content in place of the element. The result, 3 years and zero software releases later: someone tries a Javascript-URI-in-CSS exploit and falls on their face because without <style>, the browser doesn't interpret the intended CSS stylesheet as CSS.

Furthermore, your approach relies on a pretty wild presumption: that the source is properly structured HTML.

The BBcode approach relies on a pretty wild presumption: that the source is properly structured BBcode. (Remark: HTML and BBcode are both markup languages. Exercise: which has the most variety of lenient parsers? Which has been proven? Which is formally defined?)

Today's browsers will try to interpret both XHTML and HTML even if they're not structured properly, so invalid [X]HTML still becomes a JavaScript carrier and your blacklist enterprise is doomed to a neverending journey of catching all the possible ways of abusing this markup. Good luck with that.

You really missed the point. It doesn't matter if the system I described uses a lax parser or a strict parser. Because the parser generates a data structure, and because it is the data structure which is manipulated, and because the output of the system is always well-formed HTML or XHTML from an HTML or XML serializer, there's no risk of Internet Explorer misinterpreting any not-well-formed data. Such data is sanitized by virtue of the design of the system. As an example, consider an HTML fragment: <script\x00 type="...">...</script> (I don't know if this is a real MSIE vulnerability, but it is close to one.) This system will either:

Pretend it didn't see the bogus <script\x00> open-tag, and skip the non-matching </script> close-tag. No danger.
Interpret it as an actual <script> element, which will be filtered by the higher-level logic. No danger.
Interpret it as the literal text "<script\x00>", which will correctly be reserialized upon output as "&lt;script\x00&gt;". No danger.

[kses] is a very thorough filtering library, it's being used internally by WordPress, and still hasn't stopped recent versions of WordPress 2.x from suffering from this kind of security vulnerability.

Can you provide a reference? I couldn't find any information about this vulnerability.

By contrast, consider the whitelist approach proposed by strip_tags + BBcode. You use strip_tags and thus you wipe clean every trace of JS exploit attempt. Then you interpret BBcode in a controlled manner, a markup which has no way of being interpreted in "creative" ways should it escape as is into a browser.

To implement that in a way that results in well-formed HTML or XHTML, you would have to parse the BBcode and serialize the resulting data

Re:Huh? on RSS and Web Feeds a Risk? · 2006-08-06 18:14 · Score: 1

PHP is limiting the way you consider solving the problem. Just because strip_tags() doesn't do the trick for MSIE doesn't mean there's no reliable way. This is the function PHP needs to bundle in its standard library.

Re:Oh God on RSS and Web Feeds a Risk? · 2006-08-06 17:43 · Score: 1

Feed formats are a vector for vulnerability. The proper analogy isn't "C++ is evil," it is "throwing feeds on your site without sanitization is as bright as running arbitrary executables from the Internet."

Pulling the Javascript, plugin, and ActiveX junk out of arbitrary XML data is much less trivial than "remembering to turn off JavaScript support." There is no such check box. This is apparently hard to get right, judging by the rash of XSS bugs. There needs to be the equivalent of such a check box in any library that is tempting to use for Web development.

On "LOL INTERNET": I hope you were being facetious. XSS is much more dangerous than that.

Re:VESA F'ING BIOS on Could Graphics Drivers be Included on the Card? · 2006-07-30 10:25 · Score: 1

See also: OpenFirmware; Sun & Mac video adapters.

Re:Real Beneficiaries of Hardware Virtualization.. on Undetectable Rootkits Through Virtualization? · 2006-06-29 21:04 · Score: 1

The fact of the matter is, everything described in the article is implementable right now with existing hardware. Any x86 machine can emulate an x86 machine. If you don't believe me, take a look at qemu, which can do so entirely in user-mode.

Rutkowska's approach to detecting whether code is run in a virtual machine is based on checking the address of the IDT. This approach fails to detect qemu. Below, furthermost is a qemu machine; evanescence is a "bare-metal" AMD K6-2:

piranha@furthermost$ ./redpill
idt base: 0xc02bd000
Not in Matrix.

piranha@evanescence$ ./redpill
idt base: 0xc03b8000
Not in Matrix.

The IDT address that qemu reports doesn't match the signature of a VM. I should imagine this address could be modified to return the same result as my K6-2 does.

The emulator has complete control over the guest environment. Complete. Control. Including completely spoofing the hardware of the real host system.

This shouldn't be shocking.

I'd also like you to qualify or expound on your allegations. It's not exactly clear: what do Intel or AMD stand to gain from undetectable rootkits?

Re:Can't understand on XSS Vulnerabilities Reviewed and Re-Classified · 2006-06-22 23:57 · Score: 4, Informative

As someone else has pointed out, that's a naïve and incorrect approach.

HTML is a standard. BBcode is a whim. HTML wins for its ubiquity. BBcode gives you nothing.

People that don't think they can effectively and safely include HTML content from untrusted sources are not viewing the problem in a formal way. Address the cause, not the symptom.

The cause is not thinking of and treating your HTML input as structured data. Rather, you're thinking of it as a character stream. Textual substitutions are a sign of that line of thought.

Your user's HTML content is a tree structure. Parse it. Then filter out all elements that are not in your allowed-elements list. Filter out all element attributes that are not in your allowed-attributes lists. Construct these lists by examining the HTML specification and considering the risks of each element or attribute.
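A rough sketch of the parse, filter, serialize pipeline in Python, using the standard library's lenient html.parser. The allowed-element and allowed-attribute lists are placeholders, not a vetted policy, and the Node, TreeBuilder, serialize and sanitize names are invented for this example. It does not treat void elements like br specially, and attribute values such as href still need the URI filtering described in this post.

```python
from html import escape
from html.parser import HTMLParser

# Assumed policy for illustration; build the real lists from the HTML spec.
ALLOWED_ELEMENTS = {"p", "b", "i", "em", "strong", "a"}
ALLOWED_ATTRIBUTES = {"a": {"href", "title"}}
# Elements dropped together with their content.
DROP_WITH_CONTENT = {"script", "style"}


class Node:
    """An element in the parsed tree; children are Nodes or text strings."""

    def __init__(self, tag=None, attrs=()):
        self.tag = tag
        self.attrs = list(attrs)
        self.children = []


class TreeBuilder(HTMLParser):
    """Lenient parser that turns the input into a Node tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node()
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray close tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def serialize(node):
    """Serialize the tree, keeping only whitelisted elements and attributes."""
    out = []
    for child in node.children:
        if isinstance(child, str):
            out.append(escape(child, quote=False))
        elif child.tag in DROP_WITH_CONTENT:
            continue
        elif child.tag in ALLOWED_ELEMENTS:
            allowed = ALLOWED_ATTRIBUTES.get(child.tag, set())
            attrs = "".join(
                ' %s="%s"' % (name, escape(value or "", quote=True))
                for name, value in child.attrs
                if name in allowed
            )
            out.append(
                "<%s%s>%s</%s>" % (child.tag, attrs, serialize(child), child.tag)
            )
        else:
            # Unknown element: drop the tag, keep its sanitized content.
            out.append(serialize(child))
    return "".join(out)


def sanitize(html_text):
    builder = TreeBuilder()
    builder.feed(html_text)
    builder.close()
    return serialize(builder.root)


print(sanitize('<p onclick="evil()">Hi <b>there</b><script>alert(1)</script></p>'))
```

Because the output is always produced by the serializer, whatever the input looked like, the browser only ever sees well-formed markup made of whitelisted pieces.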

Take it a step further. For each attribute value that contains a URI, parse that URI using a formal grammar. Filter out all URI schemes ("http", "ftp", etc) that are not in your allowed-schemes list. Certain characters, like non-printables, should never occur in a URI directly—signal an exception to the user to inform them of their error. Don't just stop if you don't find anything wrong! Reconstruct the URI from its constituent parts and replace the original with your sanitized version.
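A sketch of the URI step in Python. It leans on urllib.parse.urlsplit, which is a convenient splitter and not a strict formal-grammar parser, so treat it as a stand-in for one. The allowed-schemes list, the SAFE character set, and the sanitize_uri name are assumptions made for this example.

```python
from urllib.parse import quote, urlsplit, urlunsplit

ALLOWED_SCHEMES = {"http", "https", "ftp"}  # assumed policy

# Characters left unescaped when rebuilding; "%" keeps existing escapes intact.
SAFE = "/?:@!$&'()*+,;=%~-._"


def sanitize_uri(uri):
    """Return a rebuilt URI, or raise ValueError if it is not acceptable."""
    # Whitespace and non-printables must never occur in a URI directly.
    if any(ch.isspace() or not ch.isprintable() for ch in uri):
        raise ValueError("URI contains whitespace or non-printable characters")
    parts = urlsplit(uri)
    # Relative URIs have an empty scheme and are rejected here as well.
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError("URI scheme not allowed: %r" % parts.scheme)
    # Rebuild from the parsed parts rather than echoing the original string.
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc,
        quote(parts.path, safe=SAFE),
        quote(parts.query, safe=SAFE),
        quote(parts.fragment, safe=SAFE),
    ))


print(sanitize_uri("HTTP://example.com/path?q=1#frag"))
```

Raising an exception covers the "signal an exception to the user" part; the caller decides whether to show the error or drop the attribute.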

Likewise, formally parse all CSS code: in referenced external stylesheets, embedded stylesheets, and in style attributes. Filter out anything not explicitly allowed. Replace any URIs with the output of the same URI-sanitization function above. Reserialize the content. (This is hard; drop all CSS as a short-cut.)

When you're done, you'll serialize the HTML document and transmit that to your clients. I guarantee that this will eliminate XSS problems stemming from Internet Explorer incorrectly interpreting malformed HTML, CSS, or URIs. There are other attack vectors; be careful of what you allow to be included inline with documents, or linked to. (Think Flash.)

This is the correct solution, and most flexible to your users. It's not another idiosyncratic language to learn. It's the world standard for rich textual documents on the World Wide Web.

Unfortunately, it requires work.

CD-single cases on Replacement for Jewel Cases? · 2006-06-14 15:02 · Score: 1

Own any CD-singles? These come in a slim package similar to jewel-cases. Unlike most "slim" cases you find CD-Rs in, these have an enclosure for J-shaped paper inserts, designed for titles to show through the transparent spine.

So, versus jewel cases, you save space and keep the ability to scan through a stack of them for the right disc, but you also keep the fragility of plastic.

I can't imagine you'd find any enclosure system not based on a plastic package that lets you scan through a stack of them.

A manufacturer.
Google.


Comments · 143