Unicode Encoding Flaw Widespread

Limited impact. by shird · 2007-05-21 18:38 · Score: 3, Informative

This appears to be limited to content scanning, and isn't really a vulnerability in itself. Relying on content scanning to prevent an exploit to reach an exploitable system is a pretty bad idea, much better to fix the system than the extra layer of defense on the outside.

Content scanning is mostly useful against filtering known exploits, and is hardly meant to be your primary defense. Being able to bypass this scanning won't buy you much. If the content scanner is aware of an exploit it scans for, chances are so are the systems being targeted and are patched to protect against it.

--
I.O.U One Sig.

Re:Limited impact. by KevMar · 2007-05-21 19:03 · Score: 1

So this is another case of don't trust user input.

I don't see anything new here, just another trick to look for. Most well tested systems should not be affected.

Unless I'm overlooking something? I'm not am I?

--
Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
Re:Limited impact. by jrumney · 2007-05-21 20:03 · Score: 1

There have been many vulnerabilities in the past that were based on encoding a URL in some broken (or even non-broken) way to get past the first level of URL checking to a lower level where directory traversal is possible. On Unix based servers, the risk of this is mitigated by running your webserver in a chroot jail. On IIS, you just have to hope that IIS 6.0 is actually fundamentally secure down to its lowest levels, not just an insecure product with a thin veneer of security layered over it like previous versions were.
Re:Limited impact. by Anonymous Coward · 2007-05-21 21:01 · Score: 0

Fortunately, IIS 6 runs by default as the Network Service account, which has no privs on the box.
Re:Limited impact. by jrumney · 2007-05-21 21:31 · Score: 1

What does "no privs" mean on Windows? Clearly IIS 6.0 does have privileges. It has opened port 80 for listening for example, and it can read files and run scripts. So it cannot really mean no privileges.
Re:Limited impact. by MikeB90 · 2007-05-21 21:37 · Score: 1

Yes you are. The point is that various Unicode characters may be translated by your backend language into a "normal" character, so %uFF07 (I think that was one example) will become ' in your strings. Unless you escape or trash all Unicode stuff. Which has its own issues :)
Re:Limited impact. by TheRaven64 · 2007-05-21 22:27 · Score: 4, Informative

Windows makes no distinction between privileged and unprivileged ports, so any application that can open sockets can listen on port 80. That said, every port number (and every other object in the NT kernel) has an associated ACL, so it is possible to limit them on an individual basis. I've never seen this exposed to the UI though, so I've no idea how you'd go about doing it. Filesystem objects also have ACLs, so I'd imagine that IIS is not allowed access to the filesystem outside the tree it is sharing.
The NT kernel provides a lot of facilities that are very useful for writing secure code. I often wonder if the application developers at Microsoft ever noticed that they weren't writing code on top of DOS anymore...

--
I am TheRaven on Soylent News
Re:Limited impact. by Ravnen · 2007-05-21 22:44 · Score: 2, Insightful

The Network Service account on Windows has similar privileges to a normal user, which means it can't access files owned by other users, but can of course read some files owned by the system. The notion of reserved ports doesn't exist on Windows, so no software makes security assumptions based on whether or not a port is below 1024, and the ability to open port 80 doesn't imply any higher privileges than the ability to open any other port.
At any rate, running in a chroot jail is arguably better in some ways than just running as an unprivileged user. Vista has some sandboxing features, using 'integrity levels' and redirecting various file and registry accesses to a 'virtual store', but I'm not really familiar with them, except for the basics, and I don't know if IIS uses them anyway.
Re:Limited impact. by fatphil · 2007-05-21 23:32 · Score: 4, Insightful

I think you've missed his point. There are now two ways that, for example, a quote character can be passed as user input to your program: either as " or as %ublah.

Your program, sitting below the layer performing the unicode translations, doesn't need to do anything differently from before, as it doesn't matter which of the two methods were used. If you _relied on_ the layers above you to strip out, reject, escape, or whatever, quote characters, then you're writing teabag code, and should get a job selling flowers instead, as software engineering is beyond you.

Always validate user input to your own specification. Never rely on something external to do it.

This exploit hasn't changed the rules one little bit, it's just highlighted the fact that some idiots don't follow them.

--
Also FatPhil on SoylentNews, id 863
Re:Limited impact. by SEMW · 2007-05-22 00:43 · Score: 1

I've never seen this exposed to the UI though, so I've no idea how you'd go about doing it
IIRC, in Windows XP, View -> Folder options -> untick "Use simple file sharing (recommended)" will let you see and edit an object's permissions through its properties dialogue.

In Vista this is now enabled by default, which I suppose is inevitable since MS are making permissions so much more visible with UAC and such; but I do wonder how many people will go randomly clicking around to see what it does, click through the UAC dialogue, and end up doing something like removing permission to access the C: drive for everyone but their pet dog...

--
What's purple and commutes? An Abelian grape.
Re:Limited impact. by CastrTroy · 2007-05-22 00:57 · Score: 2, Insightful

Is this another problem with unescaped quotes? When will people learn? Not an hour goes by that a system doesn't get attacked by SQL injection attacks. Why do programmers continue to not use things like prepared statements, which are invulnerable to such attacks? I blame it on the people writing the tutorials. Every beginner tutorial on the web shows queries being constructed at runtime, and doesn't have any mention of how insecure doing things like this is. It's hard to break the habit once you've been programming like that for so long.
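To make the difference concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module and a made-up users table; the function names are illustrative only and nothing here comes from the article.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Query text is built at runtime by concatenation, so a quote
    # inside `name` changes the structure of the SQL statement.
    sql = "SELECT id FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()

def find_user_safe(conn, name):
    # The `?` placeholder binds `name` as data; it is never parsed as SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()

def make_demo_db():
    # Hypothetical two-row table, held in memory for the demo.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn

conn = make_demo_db()
payload = "nobody' OR '1'='1"
print(find_user_unsafe(conn, payload))  # every row comes back
print(find_user_safe(conn, payload))    # no row is named like the payload
```

The bound version does not care which encoding trick delivered the quote, because the value never gets spliced into the statement text.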

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Limited impact. by flydpnkrtn · 2007-05-22 01:03 · Score: 2, Informative

I think he meant getting to "port object permissions" on a programmatic level... with an API. What you are describing are filesystem Access Control Lists. He's talking about using ACLs on ports. Everything being an object in NT, and being able to have ACLs applied to "everything," is a good idea. As the grandparent said, the application developers at MS just have to use them.

Basically the "Security tab" you see for files could be applied to individual ports.

--
Here's to the crazy ones
Re:Limited impact. by Anonymous Coward · 2007-05-22 02:28 · Score: 0

"I think he meant getting to "port object permissions" on a programmatic level... with an API. What you are describing are filesystem Access Control Lists. He's talking about using ACLs on ports. Everything being an object in NT, and being able to have ACLs applied to "everything," is a good idea. As the grandparent said, the application developers at MS just have to use them." - by flydpnkrtn (114575) on Tuesday May 22, @09:03AM (#19219899)

Maybe using the native "port filtering" is a method (one way, preferably used in conjunction with the 'other way', using IP Security Policies, in combination with it).

Read the article(s) below from Microsoft (regarding how IP Packet processing occurs in Windows OS) & AnalogX (regarding IPSec), first, & some of the steps you need to implement this AT A PORTS LEVEL!

"Basically the "Security tab" you see for files could be applied to individual ports." - by flydpnkrtn (114575) on Tuesday May 22, @09:03AM (#19219899)

You have this on ports, as mentioned above, in two ways: Port filtering, & alternately, IP Security Policies (ontop of software firewalls (which also have some control here & at the application level no less) & hardware 'firewalls').

You may find this useful (or, others may, as YOU in particular may be aware of this stuff already, one never knows, but I am mentioning it here in detail anyhow for your reference, or for that of others who use Windows NT-based OS that have these features (Windows 2000/XP/Server 2003/VISTA):

Read this article, because it shows you how to limit/unleash various ports and what drivers act on them as filters, & @ what levels in the network stack for Windows:

TCP/IP Packet Processing Paths:

http://www.microsoft.com/technet/community/columns/cableguy/cg0605.mspx

IpNat.sys, IpFltDrv.sys, IpSec.sys, & TcpIp.sys in Windows 2000/XP/Server 2003/VISTA each has abilities for port restrictions!

This sounds like what you guys are looking for!

The steps below are basically how to use it (implement it) for limiting access to various ports, via GUI interfaces no less, in Windows versions noted above.

All of this & the tools noted can be used for LAYERED SECURITY in this manner (port filtering, IP Security Policies, software firewalls, & hardware NAT routers (true packet stateful inspection ones, & 'ordinary' NAT units as well)!

They ALL can be used simultaneously/concurrently, in layers, per the article from MS above entitled "TCP/IP Packet Processing Paths"

IPSecurity Policies are implemented in secpol.msc (this is the most complex of the lot, and I recommend "AnalogX's" model, as it works (but, can be troublesome with filesharing tools like EMule mind you), & can be downloaded here:

ANALOGX IP SECURITY POLICY OVERVIEW/HOW TO EXPLANATION:

http://www.analogx.com/contents/articles/ipsec.htm

ANALOGX IP SECURITY POLICY TEMPLATE DIRECT DOWNLOAD:

http://www.analogx.com/files/aps-ipsec.zip

(You can tune AnalogX's template model as you like above & beyond its original form for apps YOU use in particular)

AnalogX's IP Security Policy provides a good template to start with!

IP PortFiltering is done here:

Start Button -> Control Panel -> Network Connections -> Local Area Connection (or whatever you called yours) -> Properties Button -> (Next Popup dialog screen) -> Highlite "Internet Protocol (TCP/IP) -> Click the PROPERTIES button -> Click the ADVANCED button @ the bottom of this screen -> Go to the OPTIONS tab & highlite TcpIP Filtering & click the PROPERTIES button -> Check off "ENABLE TCP/IP Filtering on ALL Adapters" -> Permit only (a
Re:Limited impact. by mother_reincarnated · 2007-05-22 02:39 · Score: 1

This appears to be limited to content scanning, and isn't really a vulnerability in itself. Relying on content scanning to prevent an exploit to reach an exploitable system is a pretty bad idea, much better to fix the system than the extra layer of defense on the outside.
While that is strictly (and inarguably) true, it really doesn't matter when you aren't the people who wrote/own the application. Lobbing problems over the cubicle wall to the 'other' group only tends to lead to least-common-denominator solutions that take forever to get in place.
Content scanning is mostly useful against filtering known exploits, and is hardly meant to be your primary defense. Being able to bypass this scanning won't buy you much. If the content scanner is aware of an exploit it scans for, chances are so are the systems being targeted and are patched to protect against it.
Without getting into the religion of Integrity/Availability/Confidentiality... This is why people use layered approaches to security like Firewall -> IDS/IDP -> WAF -> Secure Applications -> Secured Platforms -> Simple Obscurity. What you are pointing out is the need for a Web Application Firewall (WAF) that uses a positive security model- one that defines what is allowed, not what is bad. If your positive model is granular enough you should catch 'zero day' attacks, as well as prevent a hax0r from probing your application since their usage patterns will probably fall way outside of normal.

PS It's still better to catch an attack in an outer layer than get hacked...
Re:Limited impact. by Anonymous Coward · 2007-05-22 02:53 · Score: 0

"If the content scanner is aware of an exploit it scans for, chances are so are the systems being targeted and are patched to protect against it."
No... to make the content scanner aware of the exploit is quicker and simpler than making your 10000 desktops aware of the exploit by patching them.
Re:Limited impact. by jZnat · 2007-05-22 02:56 · Score: 2, Insightful

Well, the way I see it, there are three ways to handle Unicode characters (one of which is wrong): store as full two-byte Unicode values (inefficient when using mostly ASCII characters like in English), store in a UTF character set such as UTF-8 (useful for primarily ASCII text as it is a superset of ASCII), or pretend it isn't Unicode and treat it as two (or three if input is in UTF-8 for example) separate ASCII characters (bad).

So, perhaps if data was all stored and represented in UTF-8, for example, this wouldn't be a problem? Or perhaps stored as raw Unicode characters via wchar_t (or language equivalent like u"" in Python)?

--
'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
Re:Limited impact. by rabtech · 2007-05-22 03:11 · Score: 4, Informative

The NT kernel has a root namespace for everything in the system (from local filesystems to network drives to sockets to synchronization objects like mutexes), and in fact treats everything as a file (just like Unix) underneath.

Using the Native (NT Executive) API you can read or set the ACL on any object in the namespace, assuming you have the appropriate user rights and you own the object (or the ACL allows you to modify the permissions). NT kernel objects can also be case-sensitive (though that can confuse some Win32 programs). Often, you can delete, move, etc. files that are locked by the Win32 subsystem, which can be useful in certain situations (though in Vista they made the IO system capable of cancelling outstanding IOs on its own so the zombie process bug that ends up locking files doesn't happen anymore. It's unfortunate Vista is so DRM-laden, or I'd try upgrading.)

The APIs are NtQuerySecurityObject and NtSetSecurityObject and I believe the devices are in \Device\Tcp, \Device\Ip, \Device\RawIp, \Device\Udp, etc. Check out http://undocumented.ntinternals.net/ for more details on what is in the native API (ntdll). This API provides everything necessary to implement a full POSIX layer, which is exactly what Services for Unix does, installing itself as a new runtime subsystem right next to the Win32 subsystem. (With Server 2003 R2 SP2 they shipped it as an available component as part of the install; I've even got setuid support and GCC installed as part of the package.)

--
Natural != (nontoxic || beneficial)
Re:Limited impact. by rsvedersky · 2007-05-22 03:23 · Score: 1

You always have to think about what you are programming. Just because something is inside a prepared statement doesn't make it secure. Sure it is a much better way to go, but if your procedures blindly pass data around (like an idiot I saw who was using 1 stored procedure for an entire project and simply 'executed' the SQL statement passed into the stored procedure), then you are not really buying yourself more protection. You always need to inspect input [period.]
Re:Limited impact. by Anonymous Coward · 2007-05-22 04:27 · Score: 0

Anybody know how to delete an open file using this?
Re:Limited impact. by Talennor · 2007-05-22 04:30 · Score: 1

Relying on content scanning to prevent an exploit to reach an exploitable system is a pretty bad idea, much better to fix the system than the extra layer of defense on the outside.
And while this seems good in theory, it is quite possibly the case that the content scanning system has additional logging and reporting functionality that could prove useful either during the attack if preventative actions can be performed by other security products, or it could be extremely useful in a forensic analysis of what has happened and what systems are suspected to be infected by something. Basically, running good code on a machine is helpful in keeping that machine running, but the security industry has had to take a different approach for obvious reasons (there's lots of bad code). So the security they provide is quite different than just stopping an attack at a vulnerability every time. And this exploit avoids that detection and everything that comes along with "security".

--

//TODO: signature
Re:Limited impact. by CoughDropAddict · 2007-05-22 04:34 · Score: 1

Two bytes is not enough for all Unicode characters. UTF-16, which stores characters under U+FFFF using two bytes, is still a variable-length encoding for characters higher than U+FFFF. If you want a fixed-length encoding, use UTF-32.
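A quick way to see this for yourself, sketched in Python (the helper name is mine, and the three sample characters are arbitrary picks: an ASCII letter, the fullwidth apostrophe from this story, and a character above U+FFFF):

```python
def encoded_sizes(ch):
    # Byte length of one character in each encoding. The "-le" variants
    # are used so that no byte-order mark is counted.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

for ch in ["A", "\uff07", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Only the UTF-32 column stays constant at four bytes; UTF-16 needs a surrogate pair (four bytes) once you go past U+FFFF.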

I recommend checking out the Wikipedia article Comparison of Unicode encodings.

So, perhaps if data was all stored and represented in UTF-8, for example, this wouldn't be a problem?

You can't impose this on the whole world; a lot of the protocols and file formats that we use every day explicitly allow for multiple encodings. Correct software simply has to buck it up and transcode when necessary.
Re:Limited impact. by fatphil · 2007-05-22 05:02 · Score: 1

Prepared queries are good, but nothing beats actually sanitising the input properly to provide a global assurance that the user input contains nothing dodgy.
I tend to match everything against a tight regexp, such that unless I say it's in, it's out.
Unfortunately, you can't remove the single quote from things like "O'Reilly".
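Roughly what I mean, as a sketch in Python. The pattern is a hypothetical rule for a surname field, not a recommendation for real-world names, and NAME_RE and is_valid_name are names invented for this example.

```python
import re

# Hypothetical whitelist: a leading ASCII letter, then up to 63 more
# characters drawn from letters, apostrophe, space and hyphen.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z' -]{0,63}")

def is_valid_name(value):
    # fullmatch() requires the WHOLE string to fit the pattern, so
    # anything not explicitly listed (including U+FF07) is rejected.
    return NAME_RE.fullmatch(value) is not None

print(is_valid_name("O'Reilly"))
print(is_valid_name("x'; DROP TABLE users; --"))
print(is_valid_name("O\uff07Reilly"))
```

Note that the plain apostrophe still gets through, which is exactly the O'Reilly problem: the value passes validation but still has to be escaped or bound as a parameter wherever it is used.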

--
Also FatPhil on SoylentNews, id 863
Re:Limited impact. by Doctor+Memory · 2007-05-22 05:55 · Score: 1

I do wonder how many people will go randomly clicking around to see what it does, click through the UAC dialogue, and end up doing something like removing permission to access the C: drive for everyone but their pet dog...
Oh, c'mon, nobody's that dumb. Well, except maybe this guy...

--
Just junk food for thought...
Re:Limited impact. by Anonymous Coward · 2007-05-22 06:03 · Score: 0

Unfortunately, you can't remove the single quote from things like "O'Reilly".
O RLY?
Re:Limited impact. by MikeB90 · 2007-05-22 06:13 · Score: 2, Interesting

The point is that you, in your own program, might have escaped or regexed items incorrectly and be open to this attack. Of course you don't blindly depend on some "magic" function. Duh! But you yourself are mortal too. And I doubt many people knew about fullwidth/halfwidth Unicode transforms. The fact that one of the articles linked to this says they did a successful SQL injection SHOWS there are issues. BTW, insulting people is not normally a useful technique.
Re:Limited impact. by Foolhardy · 2007-05-22 06:57 · Score: 1

One nitpick: while open sockets are indeed file objects, and starting with Server 2003 SP1 the endpoint drivers do support ACLs on open sockets, unopened sockets (i.e. the port numbers themselves) are not objects, and do not have ACLs. There are firewalls that can control access to socket operations on a per process basis, but they're implemented as special TDI filters with special rules, usually not standard ACLs.

I've spent some time implementing a security descriptor editor designed to expose ALL objects with NT ACLs, and if there was an program interface to apply ACLs to port numbers, I would jump at the chance to make it available.

The endpoint devices themselves, e.g. \Device\Tcp, DO have ACLs which are checked before allowing socket ops (at least in 2003 SP2). There is no standard interface to them, AFAIK. SD Edit can edit them with sdedit t file tc ntapi n \Device\Tcp or with Udp or Ip or Nwlink. The only thing you can really do is deny all network access on a transport, but that can still be quite useful. Execute + synchronize access is sufficient to open/create sockets. Read/write access allows reconfiguration of the transport.
The NT kernel provides a lot of facilities that are very useful for writing secure code. I often wonder if the application developers at Microsoft ever noticed that they weren't writing code on top of DOS anymore...
Ugh. I'm hearing you there. The momentum of DOS and Win95 single-user, no security software design is still a plague on Windows software.
Re:Limited impact. by Foolhardy · 2007-05-22 07:15 · Score: 1

Also, the object manager namespace can be browsed with winobj or winobjex.

Actually, the IO system has always been able to cancel IO operations, including by terminating the thread owning the operation. However, IO can only be canceled when the drivers owning the operation allow it to be, and Vista got rid of many of the places IO could block but couldn't be canceled in the standard drivers. MUP (which does UNC network host lookups) in particular.

I had the same idea about reaching the ACLs of objects without a convenient interface to them, so I wrote SD Edit. It even uses native functions when possible and supports case sensitivity. The ACLs on \Device\Tcp can be edited via sdedit t file tc ntapi n \Device\Tcp
Re:Limited impact. by Anonymous Coward · 2007-05-22 11:01 · Score: 1, Informative

"The notion of reserved ports doesn't exist on Windows" - by Ravnen (823845) on Tuesday May 22, @06:44AM (#19218927)

Check this then:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

And, there? Check the "RESERVED PORTS" parameter... here is documentation (scant) on it from MS:

How to reserve a range of ephemeral ports on a computer that is running Windows Server 2003 or Windows 2000 Server:

http://support.microsoft.com/kb/812873

Apparently, this does exist, albeit apparently ONLY for "ephemeral ports" (short-lived ones).

(AND, iirc, UDP based ones only are used for this value afaik - the reason I make that statement, is when I have attempted to perform UDP filtering, it NEVER works out right & I have connection problems, when I attempted to use that on UDP!)

Port filtering stuff that I outlined here (and this is WHY I mention I let everything in on UDP, instead of limiting it as I did on TCP to ports 80/8080/443 only).

http://it.slashdot.org/comments.pl?sid=235621&threshold=-1&commentsort=0&mode=thread&cid=19221131

IP PortFiltering is done here/HOW TO, STEP-by-STEP:

Start Button -> Control Panel -> Network Connections -> Local Area Connection (or whatever you called yours) -> Properties Button -> (Next Popup dialog screen) -> Highlite "Internet Protocol (TCP/IP) -> Click the PROPERTIES button -> Click the ADVANCED button @ the bottom of this screen -> Go to the OPTIONS tab & highlite TcpIP Filtering & click the PROPERTIES button -> Check off "ENABLE TCP/IP Filtering on ALL Adapters" -> Permit only (add ports as you need to here)

E.G./I.E. -> In the tcp list section, I leave 80/8080/443, for my personal home use @ least. In the UDP list I let all pass thru, & in the IP stack list, I only allow 6 (tcp) & 17 (udp).

(Any feedback on this note is appreciated. I can learn from you all, like anyone else is why.)

APK
Re:Limited impact. by Ravnen · 2007-05-22 11:34 · Score: 1

Interesting, I didn't know about that. All I meant is it doesn't use the old BSD distinction of ports below 1024 being reserved for privileged users, with 1024 and above being open to unprivileged users. I suppose you could effectively set it up that way using port filtering.
Re:Limited impact. by HeroreV · 2007-05-22 11:42 · Score: 1

Is this another problem with unescaped quotes?
Sort of. The problem is that when Unicode is translated into other character sets, some characters that didn't need to be escaped before are translated into characters that do need to be escaped.

scenario:
1) You escape a Unicode string that contains fullwidth characters. The fullwidth characters have no special properties, so they aren't escaped.
2) You translate the escaped Unicode string into ASCII. Fullwidth characters are translated into halfwidth characters. Some of those halfwidth characters, like quotes, have special properties.

The lesson here is that you should never translate fullwidth characters into halfwidth characters unless you know whether they should be escaped or not, and you should escape them during translation if they need to be. Also, it's not a good idea to translate an escaped string between character sets.
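Here is the scenario as a small Python sketch. Big assumption: I'm using NFKC normalization from the standard unicodedata module as a stand-in for whatever fullwidth-to-halfwidth translation your backend actually performs, and escape_quotes is a deliberately naive escaper written for this example.

```python
import unicodedata

def escape_quotes(s):
    # Naive SQL-style escaping: double every ASCII apostrophe.
    return s.replace("'", "''")

def to_halfwidth(s):
    # NFKC folds fullwidth forms such as U+FF07 into their ASCII
    # equivalents (stand-in for the backend's own translation).
    return unicodedata.normalize("NFKC", s)

raw = "x\uff07 OR 1=1 --"

# Steps 1 and 2 above: escape first, translate afterwards.
wrong = to_halfwidth(escape_quotes(raw))
# The other order: translate first, escape last.
right = escape_quotes(to_halfwidth(raw))

print(wrong)  # x' OR 1=1 --    (a bare quote has appeared)
print(right)  # x'' OR 1=1 --   (the quote is doubled)
```

The escaper never saw an ASCII apostrophe in the first case, so it had nothing to escape; the translation created the dangerous character after the fact.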
Re:Limited impact. by HeroreV · 2007-05-22 12:24 · Score: 1

This appears to be limited to content scanning
The problem here isn't with content scanning. This problem comes from converting fullwidth characters to halfwidth characters without escaping them.

If you escape a Unicode string that contains fullwidth characters (without escaping the fullwidth characters), and then convert that escaped string to something like ASCII, the fullwidth characters may be converted to unescaped halfwidth characters.

If you never translate Unicode strings to non-Unicode, there's nothing to worry about.
Re:Limited impact. by Anonymous Coward · 2007-05-22 15:05 · Score: 0

"Interesting, I didn't know about that." - by Ravnen (823845) on Tuesday May 22, @07:34PM (#19229887)

It's cool: I was not aware of the tidbit/factoid you laid out next, which I quote below:

"All I meant is it doesn't use the old BSD distinction of ports below 1024 being reserved for privileged users, with 1024 and above being open to unprivileged users." - by Ravnen (823845) on Tuesday May 22, @07:34PM (#19229887)

All I am aware of on this note? Is that Microsoft DID use BSD code for their IP Stack initially/afaik, but has made alterations over time (many in Server 2003, and moreso on SP#2 (see their TcpChimney stuff, which allows for (iirc) offloading of certain IP processing to NICS that support TcpOffload stuff)).

"I suppose you could effectively set it up that way using port filtering" - by Ravnen (823845) on Tuesday May 22, @07:34PM (#19229887)

It's an idea: One I figured would satisfy things for folks here in this thread, where folks were asking about controlling ports via GUI methods in Windows 2000/XP/Server 2003 & yes, VISTA too.

I went into WAY more detail on all of the methods, here in this thread, here:

http://it.slashdot.org/comments.pl?sid=235621&threshold=1&commentsort=0&mode=thread&cid=19223109

IP Filtering though, has some "downsides" vs. using IP Security Policies (but then again, they in turn have downsides vs. using Port Filtering (the poor man's firewall, both of them, imo @ least)).

Still - IP Filtering, IP Security Policies & various security oriented .reg file hacks help make it more secure, ontop of hardware NAT (and true stateful inspecting types) units AND software firewalls as well... layered security really.

APK
Re:Limited impact. by fatphil · 2007-05-22 23:04 · Score: 1

Or simply:
- translate into raw (unsafe) data that you can handle first
- escape anything dodgy *last*

Given that different contexts require different escaping (such as parameters for a shell command versus characters intended for a regexp match, versus parameters for a SQL query, versus something to be used as a filename, etc.) it often makes most sense to try to store, in an encapsulated form, the dangerous raw string, and to request it to be rendered in the appropriate way on demand. That can alas come with a run-time overhead in some scenarios, and when it doesn't it can add to the memory footprint instead (as you'll be juggling both the raw and the ad-hoc escaped form(s) simultaneously).
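A minimal sketch of that encapsulation, in Python. The class name Untrusted and its method names are made up for illustration; the escaping itself leans on the standard library's shlex.quote, re.escape and html.escape. SQL is left out on purpose, since bound parameters cover that case.

```python
import html
import re
import shlex

class Untrusted:
    """Holds the raw string and renders it per context on demand."""

    def __init__(self, raw):
        self._raw = raw

    def for_shell(self):
        # Quoted so a POSIX shell sees one literal argument.
        return shlex.quote(self._raw)

    def for_regex(self):
        # Regex metacharacters are escaped so the text matches literally.
        return re.escape(self._raw)

    def for_html(self):
        # Markup characters and quotes become character references.
        return html.escape(self._raw, quote=True)

name = Untrusted("O'Reilly & <sons>")
print(name.for_shell())
print(name.for_regex())
print(name.for_html())
```

Here each rendering is computed on every call, which is the run-time cost mentioned above; caching the rendered forms would trade that for memory.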

--
Also FatPhil on SoylentNews, id 863
Re:Limited impact. by Anonymous Coward · 2007-05-23 00:32 · Score: 0

"I've spent some time implementing a security descriptor editor designed to expose ALL objects with NT ACLs, and if there was an program interface to apply ACLs to port numbers, I would jump at the chance to make it available." - by Foolhardy (664051) on Tuesday May 22, @02:57PM (#19225425)

Time for some poetry:

"Hey Fool: KICK A$$ TOOL!"

I just tried it, very VERY nice.

(Is this your handiwork/did you create it? If so, I am very impressed!)

Your thoughts on my approach here are appreciated (as regards TheRaven64's questions about how to control access to ports & reserved ports (Tcp Parameters controls this latter part via its RESERVEDPORTS value (ephemeral ports on udp only, iirc)):

http://it.slashdot.org/comments.pl?sid=235621&threshold=-1&commentsort=0&mode=thread&pid=19219899

That was for folks here (TheRaven64) asking about ports access control, when no sourcecode is available for people to control ports @ an individual programmatic level, for controlling ports accesses (wholesale) via either IP Filtering, or IP Security Policies!

(Both are the "poor man's firewall" imo & work. These techniques also work in combination simultaneously as well w/ one another, plus in combination with software (good for application level control) AND hardware (NAT & true stateful inspection types) firewalls) via GUI methods exposed by the OS shell).

All for layered security.

APK
Re:Limited impact. by Foolhardy · 2007-05-23 11:29 · Score: 1

(Is this your handiwork/did you create it? If so, I am very impressed!)
Yes, I am the author of SD Edit.
Thanks. I appreciate it :)
Your thoughts on my approach here are appreciated (as regards TheRaven64's questions about how to control access to ports & reserved ports (Tcp Parameters controls this latter part via its RESERVEDPORTS value (ephemeral ports on udp only, iirc))
What you and other posters have mentioned about port control (the port filtering control panel, ipsec rules, application level firewalls, ACLs on transport devices), plus the "Routing and Remote Access" service in server versions, cover everything I can think of. This is a white paper on TCP implementation notes and parameters you might find interesting.
Re:Limited impact. by Anonymous Coward · 2007-05-24 06:58 · Score: 0

"What you and other posters have mentioned about port control (the port filtering control panel, ipsec rules, application level firewalls, ACLs on transport devices), plus the "Routing and Remote Access" service in server versions, cover everything I can think of." - by Foolhardy (664051) on Wednesday May 23, @07:29PM (#19246287)

Excellent, because imo, you DO know what you are about in this field (based on your replies and work in SDEdit in fact)... I find it good to know the methods I extolled can be used to control ports via GUI methods (which is what TheRaven64 was asking about, in addition to RESERVED PORTS in Windows 2000-2003-VISTA).

They're all good stuff, & work concurrently/simultaneously w/ one another, which is good (even with hardware NAT firewalls etc.), no hassles, & provide layered security & PORTS ACCESS CONTROL (to some extent, but not really @ an ACL level though). Still, the results are there & do work, & I am glad to see you "2nd my motions" on this account.

(Also - Perhaps I missed it, but, I didn't see anybody else mention the material on port filters, or IPSec (the "poor man's firewalls"), & this is why I did that 'big writeup' on them & how to use them (as well as download of the AnalogX prebuilt IP Security Policy - which is VERY good, strong, and flexible (but, does cause hassles with tools for filesharing (emule, etc.)))).

That is one case where (filesharing programs) where IP Port Filtering is actually superior for port access control, vs. IP Security Policies in fact.

Now, I did note your comments on ACL's on the TDI interfaces & one other persons (while speaking of managing this IN SOURCECODE for an application), & it was "kick ass", informative, & possibly useful to me at some point (developer here too).

The points you & the other fellow brought up are ones I was NOT aware of, but it is good to know!

"This is a white paper on TCP implementation notes and parameters you might find interesting." - by Foolhardy (664051) on Wednesday May 23, @07:29PM (#19246287)

Yes, I have read up on that one when it issued (big fan of the MS daily downloads pages here is why -> http://www.microsoft.com/downloads/Results.aspx?DisplayLang=en&nr=50&sortCriteria=date )

They made even MORE alterations, for the good mind you, in the SERVICE PACK #2 release (see the material on it regarding TcpChimney settings, which iirc, allow offloading of various tasks from the System CPU to the processors on NICS that support it).

Well, since your tool is excellent, I can only offer up a "tit-for-tat" trade/return to you, with an application I wrote years ago that still survives in the Shareware/Freeware circuit (and runs unmodified all the way thru ALL Win32 OS, even 9x-VISTA & all iterations in between):

APK Registry Cleaning Engine 2002++ SR-7:

http://www.techpowerup.com/downloads/389/foowhatevermakesgooglehappy.html

Enjoy it, it IS the safest & most thorough registry cleaner there is, bar-none!

APK

Send your claim in now by Anonymous Coward · 2007-05-21 18:42 · Score: 1, Informative

Quick! Claim the $16,000! http://it.slashdot.org/article.pl?sid=07/05/18/189208

"It's very hard to exploit [those listed applications]," Aitel said. "IIS 6 hasn't had a public remotely exploitable bug in it. Ever."

Doh!

Re:Send your claim in now by QuantumG · 2007-05-21 18:47 · Score: 4, Funny

"IIS 6 hasn't had a public remotely exploitable bug in it. Ever."

That's bullshit anyway, I've got dozens of remote exploits for IIS 6.

Oh, you said public.. hehe, forget I said anything.

--
How we know is more important than what we know.
Re:Send your claim in now by Anonymous Coward · 2007-05-21 18:57 · Score: 0

What's the exploit here? There may be a bug in the Unicode character translation, but it doesn't allow remote code execution.

dom
Re:Send your claim in now by tuxedobob · 2007-05-21 18:59 · Score: 1

Exactly what I was thinking.
Re:Send your claim in now by Anonymous Coward · 2007-05-21 19:02 · Score: 0

I realize, but it was a shot at being funny =P
Re:Send your claim in now by jrumney · 2007-05-21 19:06 · Score: 1

It allows you to hide an exploit from first level scanners so it gets through to a deeper level.
Re:Send your claim in now by LurkerXXX · 2007-05-22 01:40 · Score: 1

Right, which is an exploit which allows you to claim $16,000 exactly how? Hint: It doesn't. This isn't an exploit at all.

Incident response by Anonymous Coward · 2007-05-21 19:11 · Score: 4, Interesting

I work incident response in a large web company (hence anonymous posting, natch) and currently we're treating this as "interesting, but case not proven". We test that our web apps filter all input, so I'm adding double-width unicode to our security regression test cases; however I'm happy to let the FD posters lab it out between them in the short term. These alleged IIS exploits don't work for us - which is not to say that we don't have some system, somewhere, for which this is an issue. At the end of the day it's just a clear restatement of something that's obvious to anyone - you need to filter input carefully, and you need to be aware of issues around alternative encodings. But it's not a "BRB" (big-red-button, i.e. emergency stop and all hands to the pumps to fix a vulnerability) issue for us - yet. The last time we had one of those, it was the Microsoft DNS server remote root... because most of our internal domain controllers were also running DNS servers.
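
For anyone wanting to add the same kind of regression case, here is a minimal sketch of the idea. It assumes a scanner that matches the literal string "../" and a backend that folds fullwidth characters back to ASCII (modelled here with NFKC normalization). Both filter functions are hypothetical stand-ins, not any vendor's actual logic.

```python
import unicodedata


def to_fullwidth(text: str) -> str:
    """Map printable ASCII (0x21-0x7E) onto the Unicode fullwidth forms block."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if 0x21 <= ord(ch) <= 0x7E else ch
        for ch in text
    )


def naive_filter_blocks(payload: str) -> bool:
    """Hypothetical scanner: flags input only if it contains the literal '../'."""
    return "../" in payload


def normalizing_filter_blocks(payload: str) -> bool:
    """Same check, applied after folding compatibility characters with NFKC."""
    return "../" in unicodedata.normalize("NFKC", payload)


attack = to_fullwidth("../") + "etc/passwd"
print(naive_filter_blocks(attack))        # False: the scanner sees no '../'
print(normalizing_filter_blocks(attack))  # True: folding reveals the traversal
```

Whether a given server actually folds fullwidth forms this way is product-specific, so treat the result as a test case to run against your own stack rather than proof of a hole.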

Smelly foreigners by Anonymous Coward · 2007-05-21 19:14 · Score: 0, Funny

Who needs Unicode anyway? ASCII is good enough for most civilized people. If you can't sufficiently Romanize your language, maybe it's time to just let it die?

Re:Smelly foreigners by nmoog · 2007-05-21 19:45 · Score: 1

Yeah. Actually, 7 or 8 bits per character really seems excessive to me, and opens the door to additional attack vectors. Surely if people can't take the time to learn to communicate in 1 bit they should not be allowed to use the internet.
Re:Smelly foreigners by Anonymous Coward · 2007-05-21 20:01 · Score: 0

Yeah, anyone not born within 100 miles of Rome should be made to learn Latin, especially those smelly foreign Americans.

Oh, you meant ... oh, I see, you're an American. Sorry for screwing with your idea of the centre of the universe.
Re:Smelly foreigners by MLS100 · 2007-05-21 20:35 · Score: 1

0 you then!
Re:Smelly foreigners by Anonymous Coward · 2007-05-21 20:36 · Score: 2, Insightful

To think that even English fits in 7-bit ASCII is naïve.
Re:Smelly foreigners by Hognoxious · 2007-05-21 20:39 · Score: 2, Interesting

Would some of the things that led to computers - morse code, telegraphy etc have been feasible using, say, Chinese in its normal written form? Are computers biased towards English (and other languages using the same or similar alphabets) because they were largely invented by English speakers, or is the language fundamentally more amenable to small, simple encoding?

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Re:Smelly foreigners by Anonymous Coward · 2007-05-21 20:52 · Score: 0

Korean is even more amenable to encoding in fewer than 7-8 bits... it really is just a matter of technological biases. There's nothing particularly natural about the way English is represented on computers.
Re:Smelly foreigners by ettlz · 2007-05-21 20:57 · Score: 5, Funny

To think that English doesn't fit in 7-bit ASCII is na\"ive.
Re:Smelly foreigners by Anonymous Coward · 2007-05-21 21:35 · Score: 0

There are no accent marks in English. When loan words with accent marks come into English, the accent marks are dropped. Of course, there is no Academy to define English, so some people do retain the marks. But dropping them is far more common -- and yes, that does make it correct.

Loan words that have been in English long enough even tend to have their pronunciations and/or spellings Anglicized.
Re:Smelly foreigners by kahei · 2007-05-21 22:21 · Score: 1

Would some of the things that led to computers - morse code, telegraphy etc have been feasible using, say, Chinese in its normal written form?

Well, they weren't feasible using English in its normal written form... so I'd guess they wouldn't be feasible using Chinese in its normal written form either.

Offhand I can't think of any human script or language that's fundamentally suitable to telegraphy. Which isn't really all that surprising.

--
Whence? Hence. Whither? Thither.
Re:Smelly foreigners by jhol13 · 2007-05-21 22:46 · Score: 1

01010100 01101111 ... ah, screw it, you got the point.
Re:Smelly foreigners by TempeTerra · 2007-05-21 22:51 · Score: 3, Interesting

The notable difference between Chinese and English (or most other written languages) is that several English characters combine to form syllables, which combine to form words (i.e., we use an alphabet). In Chinese, each character corresponds directly with a word (each character is a logogram). If you're interested you can look up Alphabet on Wikipedia as a starting point, although I must admit I find the article hard to follow even though I know what it should be saying.

The practical result of this is that English is normally encoded as a long sequence of 0-25 values (a-z), whereas Chinese would be encoded as a shorter sequence of 0-~100,000 values (Wikipedia reports Chinese dictionaries with 85,000 characters). Naturally, there would be fewer Chinese characters required for a message as each character corresponds to an entire word.

I guess that since morse code is rather like binary and English letters can be encoded using 5 bits, Chinese morse codes would need to be... about 20 bits long? It's late at night, brain not work so good. It seems to me that morse codes using 20 dots/dashes would be extremely difficult to learn; but on the other hand it shouldn't be any more difficult than learning Chinese characters in the first place.
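
A rough check on that estimate, assuming a fixed-width binary code (real morse is variable-length, so this is only a ballpark) and taking the 85,000-character dictionary figure quoted above at face value:

```python
def bits_needed(symbols: int) -> int:
    """Smallest width, in bits, of a fixed-width code for `symbols` distinct values."""
    return (symbols - 1).bit_length()


print(bits_needed(26))      # 5  (letters a-z)
print(bits_needed(85_000))  # 17 (the dictionary size quoted above)
```

So the fixed-width figure comes out at 17 bits per character rather than 20, which is the same order of magnitude as the guess.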

I wouldn't be surprised if English morse codes were more robust against poor data, siny Englxsh is stvll reahible even if sew2eral cheracter; are wrong.

Disclaimer: I don't know anything about the subject, I'm talking out of my elbow for the sake of discussion.

--
.evom ton seod gis eht
Re:Smelly foreigners by earthbound+kid · 2007-05-21 22:54 · Score: 1

When you wrote \"i you used three bytes (assuming an ASCII-style one byte per character encoding) to represent one character. In contrast, Unicode represents ï as code point U+00EF, which in UTF-8 ends up as two bytes, 0xC3 and 0xAF.
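
The byte counts are easy to confirm. This is just a sketch using Python's built-in codecs, and the variable names are made up for illustration:

```python
tex_escape = '\\"i'      # the three ASCII characters: backslash, quote, i
code_point = "\u00ef"    # LATIN SMALL LETTER I WITH DIAERESIS

print(len(tex_escape.encode("ascii")))   # 3
print(code_point.encode("utf-8").hex())  # c3af
```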

You should amend your quote to "you can represent English in 7-bits... just so long as you're willing to use more than 7-bits to do it."
Re:Smelly foreigners by petermgreen · 2007-05-21 23:07 · Score: 1

a few people trying to look posh may use the odd diacritic on a loanword or as a heavy metal umlaut but really they aren't necessary for english.

you can get down to 6 bits per character if you are prepared to do away with either most punctuation or mixed case.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Smelly foreigners by cortana · 2007-05-21 23:19 · Score: 1

How na+AO8-ve!
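
For the record, that spelling is UTF-7, which squeezes non-ASCII characters into 7-bit bytes by wrapping base64-encoded UTF-16 between "+" and "-". A small sketch with Python's standard utf-7 codec:

```python
encoded = b"na+AO8-ve"

# Every byte fits in 7 bits, yet the text decodes to a word with a diaeresis.
print(all(byte < 0x80 for byte in encoded))  # True
print(encoded.decode("utf-7"))               # naïve
```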
Re:Smelly foreigners by Hognoxious · 2007-05-21 23:40 · Score: 2, Funny

There are no accent marks in English.
è is sometimes used to indicate that the e in a past participle is pronounced, eg learnèd (rhymes with Bernard) as opposed to learned (rhymes with burned).
When loan words with accent marks come into English, the accent marks are dropped.
The umlaut in naïve is retained to indicate that it doesn't rhyme with glaive.
Loan words that have been in English long enough even tend to have their pronunciations and/or spellings Anglicized.
Yes, that's why I'm posting from an internet caffay.

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Re:Smelly foreigners by vtcodger · 2007-05-21 23:55 · Score: 2, Informative

***Would some of the things that led to computers - morse code, telegraphy etc have been feasible using, say, Chinese in its normal written form?***
The answer would seem to be -- sort of ... maybe. See http://www.njstar.com/tools/telecode/jim-reeds-ctc.htm.
Summary: For telegraphy, Chinese characters are assigned numeric codes in radical-stroke count order. That's the way that Japanese, and -- I assume -- Chinese, dictionaries, are arranged.
It may seem inefficient to use 20 bits (sort of) to encode a character, but remember that each character is a word, not a letter, and that composite words like "Beijing" or "paleontology" are only two words. That means that most "words" will be either 2.5 or 5 eight "bit" characters. Conventional telegraphy is really a trinary rather than a binary code -- pause, short, and long, and the 'digits' differ in length -- so bit count isn't really all that accurate an analogy.
So, no, the Chinese language probably wouldn't have made the development of computers by the Chinese all that much more difficult than European languages did. And the classic Chinese numeric notation is not as convenient as 'arabic' notation. But it's much less unwieldy than say Roman numerals, so I don't think it would have been an insurmountable hurdle either.

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
Re:Smelly foreigners by The+Warlock · 2007-05-22 00:47 · Score: 1

How low can you go if you completely forgo proper spelling?

--
I've upped my standards, so up yours.
Re:Smelly foreigners by steelfood · 2007-05-22 02:08 · Score: 1

First, only about five thousand characters are actually commonly used, with less than two thousand tones to represent those five thousand characters. That gives rise to my second point, which is that spoken Chinese can be highly contextual (hence the propensity for puns and other wordplay).

My guess is that morse code would have evolved to be the same way that ASL simplifies language considerably. Each sequence would represent a different idea, or character, but every idea could pretty much be conveyed with a few hundred such unique sequences.

For computers, either there would have been a push to a standardized alphabet-like set, like bopomofo or even using roman characters a la pinyin, or 16- or 32-bit characters would be a necessity before computers became ubiquitous. Technology usually develops around human convenience, so I'm imagining that the latter would be more likely.

For input, it would be possible that instead of keyboards, handwriting recognition might be the de facto standard. I'm certain there would be some level of context-sensitivity and simplification though to make input short and easy. More likely, we'd still have the keyboard, but there'd be keys for each of the strokes, and keys for every radical. Which means perhaps 100 keys rather than 40 keys to represent the whole language. Remember the red keyboard in that James Bond movie with Michelle Yeoh?

--
"If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
Re:Smelly foreigners by amRadioHed · 2007-05-22 06:16 · Score: 1

Just a nitpick, in Chinese each character represents a syllable, not a word. Most characters do also correspond to words on their own, but this is not always the case (e.g. the various particles). Also, FYI, Chinese is not a monosyllabic language, unlike Thai for example.

--
We hope your rules and wisdom choke you / Now we are one in everlasting peace
Re:Smelly foreigners by dlane1 · 2007-05-22 06:35 · Score: 1

Um, no. Those are needed for correct English spelling. The woman's name is Zoë, not Zoe. It's coöperate, not cooperate. Those aren't umlauts, either. It's called a dieresis.
Re:Smelly foreigners by crolix · 2007-05-22 07:09 · Score: 1

New EC Regulations
The European Commission have just announced an agreement whereby English will be the official language of the EU rather than German, which was the other possibility. As part of the negotiations, Her Majesty's Government conceded that English spelling had some room for improvement and has accepted a 5 year phase-in plan that will be known as "Euro-English":
In the first year, "s" will replace the soft "c"... Sertainly, this will make the sivil servants jump with joy. The hard "c" will be dropped in favor of the "k". This should klear up konfusion and keyboards kan have 1 less letter.
There will be growing publik enthusiasm in the sekond year, when the troublesome "ph" will be replaced with the "f". This will make words like "fotograf" 20% shorter.
In the 3rd year, publik akseptanse of the new spelling kan be expekted to reach the stage where more komplikated changes are possible. Governments will enkourage the removal of double letters, which have always ben a deterent to akurate speling. Also, al wil agre that the horible mes of the silent "e"'s in the languag is disgracful, and they should go away.
By the 4th yar, peopl will be reseptiv to steps such as replasing "th" with "z" and "w" with "v". During ze fifz yar, ze unesesary "o" kan be dropd from vords kontaining "ou" and similar changes vud of kors be aplid to ozer kombinations of leters.
After ziz fifz yar, ve vil hav a reli sesibl riten styl. Zer vil be no mor trubls or difikultis and evrivun vil find it ezy tu understand ech ozer.
ZE DREM VIL FINALI KOM TRU ! ! !

Re:Smelly foreigners by Anonymous Coward · 2007-05-22 07:18 · Score: 0

So what you're saying is Chinese and English are written with the same number of distinct symbols, so there'd be no impact on the length/complexity of encoding needed in each case?
Offhand I can't think of any human script or language that's fundamentally suitable to telegraphy. Which isn't really all that surprising.
You not being able to think doesn't surprise me either.
Re:Smelly foreigners by Anonymous Coward · 2007-05-22 07:36 · Score: 0

For telegraphy, Chinese characters are assigned numeric codes in radical-stroke count order.
Those look like workarounds after the fact, adaptations of a telegraphy method that already exists. Trying to invent, from scratch, something like the telegraph might seem so complicated that nobody would have bothered trying. Which is what the question looks like to me.
Re:Smelly foreigners by jc42 · 2007-05-22 10:19 · Score: 2, Informative

The notable difference between Chinese and English (or most other written languages) is that several English characters combine to form syllables, which combine to form words (i.e., we use an alphabet). In Chinese, each character corresponds directly with a word (each character is a logogram).

Actually, this is pretty much a myth that originated from people with very little knowledge of Chinese language and writing. In all the Chinese languages ("dialects";-), most of the vocabulary is two-syllable words, as in English. Three-syllable words aren't uncommon. The writing system is actually a sort of syllabary, and the meaning of most two-character words can't be inferred from knowing what the syllables mean as standalone words.

It's similar to how lots of English words, e.g. "insight", can be parsed as two words ("in"+"sight"), but this doesn't really help you understand what the word actually means. Or, an example that shows how such things evolve is the English word "upstairs". If I say I'm going upstairs and take the elevator, did I lie to you? Of course not, because "upstairs" doesn't mean going up stairs. It did a few centuries ago, but hasn't meant that during the lifetime of anyone alive now. Similarly, proto-Chinese of N thousand years ago may have been mostly single-syllable words, but this hasn't been true for at least the few thousand years that we have readable examples of the writing system.

For a Mandarin example, which I'll write in pinyin (or pin1yin1;-) to get past the /. filters, consider the word zi4ran2. The zi4 syllable is a word, and means "from" or "since" (and is also used like "-ly" to form adverbs). The ran2 syllable is also a word, and basically means "correct" or "yes". The zi4ran2 combination means "nature" or "naturally". Like "insight", you might be able to kludge some sort of connection here, but in reality you just have to learn zi4ran2 as a separate word unrelated to its two syllables. It may have been a two-word idiom several thousand years ago; it's a two-syllable word now.

For an entertaining debunking of both this myth and a very common trope among Western pseudo-intellectuals and pop psychologists, read this article at languagelog. After chuckling at that particular bit of silliness about Chinese writing, you can find other articles there that go into the general problem in more detail. A number of experts in East-Asian linguistics regularly contribute to that blog, and they've been pushing for a campaign to debunk the nonsense that Westerners insist on saying about these languages.

Oh, well; I haven't yet heard any claim that Chinese doesn't have a word for "freedom". But I wouldn't be surprised. (Hint: the word starts with the same character as the above "zi4ran2", but has a different second character. ;-)

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Re:Smelly foreigners by TempeTerra · 2007-05-22 19:01 · Score: 1

Thanks, that was a very useful reply. It's always nice to feel that I'm less stupid than I was at the start of the day ;)

--
.evom ton seod gis eht
Re:Smelly foreigners by jc42 · 2007-05-23 03:23 · Score: 2, Informative

[T]he classic Chinese numeric notation is not as convenient as 'arabic' notation. But it's much less unwieldy than say Roman numerals, so I don't think it would have been an insurmountable hurdle either.

Actually, classical Chinese numbers are only slightly worse than Arabic notation (which apparently developed in India but was spread by Arab traders who knew a good accounting system when they saw it). The Chinese notation was far better than any of the Western number notations that the Arabic notation supplanted, such as the Greek or Hebrew notation. Roman was probably the worst notation ever invented, and nobody ever really used it for accounting.

The basis of the Chinese system was symbols for 1 to 9, and symbols for powers of 10. To illustrate with ascii characters, the symbol for 10 looks like a large '+' sign, so we can use + for 10, H for hundred, T for thousand. We'd write the number 5347 as 5T3H4+7. Unused powers of 10 are omitted, so 2007 would be 2T7. 1024 would be T2+4 or 1T2+4. And so on. There are symbols for a few more powers of 10, and they can be chained to get higher powers of 10, so HT could be used for 100,000.
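
To make the rule concrete, here is a small sketch of that ASCII stand-in notation. The function name and the `omit_leading_one` flag are invented for illustration, and it only covers 1 to 9999, i.e. the T, H and + markers used above.

```python
def to_power_notation(n: int, omit_leading_one: bool = False) -> str:
    """Write n (1..9999) with digits 1-9 and markers T=1000, H=100, +=10."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only handles 1..9999")
    markers = [(1000, "T"), (100, "H"), (10, "+"), (1, "")]
    parts = []
    for value, marker in markers:
        digit, n = divmod(n, value)
        if digit == 0:
            continue  # unused powers of ten are simply omitted
        if digit == 1 and marker and omit_leading_one and not parts:
            parts.append(marker)  # a leading "1" before a marker may be dropped
        else:
            parts.append(f"{digit}{marker}")
    return "".join(parts)


print(to_power_notation(5347))                         # 5T3H4+7
print(to_power_notation(2007))                         # 2T7
print(to_power_notation(1024))                         # 1T2+4
print(to_power_notation(1024, omit_leading_one=True))  # T2+4
```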

Nowadays, most numeric work in East Asia is done using the Western version of Arabic notation. But you also see a hybrid form that uses the Chinese 1-9 characters plus the Western 0. Converting between this notation and the traditional Chinese notation is essentially trivial and can be done as fast as you can write the numbers. But for arithmetic on paper, the Arabic form (or Arabic with Chinese digits) is a bit simpler than the traditional Chinese notation, since using 0 as a place holder results in correct alignment in columns of numbers, and the digits 1-9 are a bit faster to write than the Chinese digits.

An interesting aspect of the Chinese system is that the basic symbols have alternate "fancy" forms with a lot more strokes. These characters have the property that you can't add strokes to convert them to a different character. So they're an anti-tampering, fraud-proof way of writing numbers. I don't know of another numeric notation with this feature. Asian financial documents have historically used these fancy forms of numbers.

Actually, the Chinese and Arabic notations are the 3rd and 2nd easiest numeric notation that various societies have invented. A few years ago, Scientific American had an interesting article explaining the Mayan number system, and included an explanation of why it was a lot easier to use than the Arabic system. For example, instead of the big multiplication table that we memorized in school, the Mayan system really only needs one rule: 5x5=15. (This makes sense if you understand that they used base 20.) The rest of the rules for adding, subtracting and multiplying consist of the techniques for "carrying" and "borrowing", and are essentially similar to what you do with an abacus.
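
The "5x5=15" rule looks odd until you write the product in base 20. A tiny sketch, assuming a plain positional base-20 system (the helper name is made up):

```python
def to_base20(n: int) -> list:
    """Return the base-20 digits of a non-negative integer, most significant first."""
    digits = []
    while True:
        n, digit = divmod(n, 20)
        digits.append(digit)
        if n == 0:
            break
    return digits[::-1]


print(to_base20(5 * 5))  # [1, 5], i.e. one twenty plus five, written "15"
```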

But I suppose we're stuck with the Arabic system. It's good enough, really, for the remaining uses where we don't bother with a computer.

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Re:Smelly foreigners by vtcodger · 2007-05-25 00:44 · Score: 1

Great Post -- thanks
***An interesting aspect of the Chinese system is that the basic symbols have alternate "fancy" forms with a lot more strokes. These characters have the property that you can't add strokes to convert them to a different character. So they're an anti-tampering, fraud-proof way of writing numbers. I don't know of another numeric notation with this feature. Asian financial documents have historically used these fancy forms of numbers.***
I'd never thought about it. Not only are you right, but Chinese numbers -- even the basic forms -- are much less ambiguous than our familiar 'arabic' forms. I was once involved in a project that tried to OCR the amount field on checks. Amongst the many problems, some people manage to handwrite 1, 2, 4, 7, 9 in shapes that are virtually identical. Try decoding that sort of check. The biggest surprise: Courier typeface 5s and 6s differ only in a couple of dots. Even people sometimes can't distinguish the two when typed/printed with an old ribbon.

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
Re:Smelly foreigners by Anonymous Coward · 2007-05-30 01:41 · Score: 0

//5UAGgAYQB0ACcAcwAgAGoAdQBzAHQAIABzAGkAbABsAHkALgA=

"Not vunerable" by iamacat · 2007-05-21 19:20 · Score: 2, Informative

According to the advisory, Apple products do not provide HTTP content filtering and are therefore not vulnerable. This will do nothing to help someone build a functioning protection system.

Re:"Not vunerable" by KiloByte · 2007-05-21 21:31 · Score: 1

Yeah, no "content filtering" is needed, why would it be? Any text is either the request (and thus not "content") or mere data, in the second case it shouldn't be filtered unless something is terribly broken.

Trying to parse encapsulated data is a bad idea generally; as is trying to detect the same attack twice. Of course, unless you're a snakeoil^Wsecurity software salesman.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.

I don't think you know what you're talking about.. by msimm · 2007-05-21 19:21 · Score: 1

$-$

They've been trying to sell this kind of kit to us for years.

--
Quack, quack.

Apple and HP? by AHuxley · 2007-05-21 19:32 · Score: 1

What did Apple and HP get right?
Obsolete, obscure, open source, in house?

--
Domestic spying is now "Benign Information Gathering"

Re:Apple and HP? by Anonymous Coward · 2007-05-21 20:27 · Score: 0

BS(coff)D
Re:Apple and HP? by Anonymous Coward · 2007-05-21 20:38 · Score: 1, Informative

Their comment on Apple is that they don't have anything that provides this functionality, so there's nothing that can be vulnerable.
Re:Apple and HP? by Gadget_Guy · 2007-05-21 20:52 · Score: 2, Funny

Ooh, poor old BSD sounds really sick there. I hope that it doesn't die!

empty list? by farkus888 · 2007-05-21 19:47 · Score: 1

will someone please explain to a non security guy why this list is so empty. it seems to me that this should be easy to test for in every IDS type software. I could see more exotic equipment like alcatel being untested yet but it seems there should be enough accessibility of d-link and fedora for example that they should be tested by now. I'd personally test them myself when I got home from work if I knew enough to be able to attempt the exploit.

--
thats right, I rarely use capitals. deal with it. but don't mistake my laziness for stupidity

Re:empty list? by udippel · 2007-05-21 22:12 · Score: 1

I could see more exotic equipment like alcatel being untested yet but it seems there should be enough accessibility of d-link and fedora for example

d-link doesn't do content filtering; at least not in your home.
Fedora is probably the same as other Debian/GNU/BSDs; depending on the applications performing the filtering.
I fail to see the usefulness of the list of platforms mixed with trade names here.

Am I the only one ?
Re:empty list? by aliquis · 2007-05-21 22:36 · Score: 1

No, how can they know for sure that everything from HP is safe? And even less everything in any major dist with lots of packages.

Must take a while to figure out, though I don't know how much software HP have made, but I guess many companies run small inhouse projects maybe written by someone as their ex-job or whatever.
Re:empty list? by farkus888 · 2007-05-21 22:38 · Score: 1

the firewall in my dlink router does SPI, and can list "suspicious" traffic in a log file similar to snort. I agree that the list seemed kind of suspicious. on fedora I've always used snort for IDS, listing an OS in this list seemed suspicious to me... especially something like fedora or gentoo where I can choose my own IDS from any of a dozen different packages. am I mistaken that this is really just a way to get past an IDS to get to the real goods when the network's security is overly dependent on that one piece of software?

--
thats right, I rarely use capitals. deal with it. but don't mistake my laziness for stupidity
Re:empty list? by udippel · 2007-05-21 23:24 · Score: 1

the firewall in my dlink router does SPI

Don't want to quarrel with you, despite being on /., SPI isn't what you might think it was. It doesn't perform full Layer7 processing. It doesn't process content.
Re:empty list? by farkus888 · 2007-05-21 23:34 · Score: 1

I do know what SPI is, I don't have a firm grasp of this exploit. from what I am reading here this is similar to the "this app can break" bug in notepad, they take too small of a data sample to be able to guess character encoding 100% of the time. I do see how SPI has nothing to do with that. and I'm not trying to come across as argumentative, just trying to get a better understanding and learn something. like I said before I am no security expert, but I do computer networking so that makes security relevant and I feel I should know more about it than I do.

--
thats right, I rarely use capitals. deal with it. but don't mistake my laziness for stupidity
Re:empty list? by farkus888 · 2007-05-21 23:46 · Score: 1

I definitely agree. the only alcatel equipment I am familiar with is their microwave t3's, saying they are vulnerable to this would be like listing cat-5 cable. fedora is maybe slightly different than other *nix's because it has selinux, but selinux always seemed to be more about solid file permissions than firewall and IDS work. I definitely agree with you that they need to talk specific packages in the *nix world to really have a useful list for the non windows crowd.

--
thats right, I rarely use capitals. deal with it. but don't mistake my laziness for stupidity
Re:empty list? by fatphil · 2007-05-21 23:59 · Score: 1

If it does not provide content filtering, then it never claimed to protect you from dodgy content in the first place.
So such devices are not failing to meet their specification.

--
Also FatPhil on SoylentNews, id 863

bypassing great firewall? by z-j-y · 2007-05-21 19:48 · Score: 2, Interesting

I'm wondering if the great firewalls (Cisco product?) are also vulnerable to this. At least it'll force them to do longer string matching.

I wonder... by Anonymous Coward · 2007-05-21 20:06 · Score: 0

I wonder how much of this vulnerability has to do with the C programming language's lack of Unicode support...

Re:I wonder... by Anonymous Coward · 2007-05-21 20:15 · Score: 0

You mean ignorant programmers' misuse of wchar_t, surely?
Re:I wonder... by /ASCII · 2007-05-21 23:59 · Score: 1

Misuse of wchar_t? Care to elaborate? My only complaint with wchar_t is that it is barely used at all. From what I've seen, programs that use wchar_t are shorter, more readable, and more secure.

--
Try out fish, the friendly interactive shell.
Re:I wonder... by Anonymous Coward · 2007-05-22 00:10 · Score: 0

The limited use of wchar_t is a problem, but it also seems that a lot of developers think that using wchar_t automatically means they are using "Unicode", which of course is not true. wchar_t is a convenient container for Unicode data, but shoving 8bit ASCII into a wide character type doesn't solve all your problems for you.

At the same time I can't really blame people for sticking their head in the sand with regards to Unicode. It's insanely complex, and the documentation doesn't help much unless you're prepared to wade through it. Given a choice between that and sticking with what they know, who can blame them?
Re:I wonder... by /ASCII · 2007-05-22 00:34 · Score: 1

My experience with unicode and C is that it is painless when done right.

You only need two things.

1) Remember to call setlocale.
2) Use wide character wrappers around all system functions that don't already have one, e.g. wopen, wrealpath, etc.. Never ever directly use narrow character strings for anything.

--
Try out fish, the friendly interactive shell.
Re:I wonder... by Srin+Tuar · 2007-05-22 03:23 · Score: 1

About your point #2:

On linux (any unix really) you want to avoid wchars and wide functions like the plague.

The way to go for i18n is using utf-8 and bytes for character strings everywhere. (look into the gtk+ library for examples of this)

The whole wchar experiment has been declared a failure, and is deprecated for any usage really.
Re:I wonder... by Anonymous Coward · 2007-05-22 18:11 · Score: 0

And broken. wchar_t does not solve any problem whatsoever concerning Unicode. I can specify a glyph which requires two wchar_t just the same as I can specify one requiring two char (the same glyph, even). If you don't understand how or why, then you're part of the problem.
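
A minimal sketch of the point above, in Python rather than C (an assumption made purely for illustration, since nothing in the thread uses Python). The helper name storage_units is hypothetical. It counts how many code points, UTF-8 bytes and UTF-16 code units the same visible glyph needs in its decomposed and precomposed spellings:

```python
import unicodedata

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

def storage_units(text):
    """Count storage elements for text in three representations."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }

print(storage_units(decomposed))   # two elements even with wide characters
print(storage_units(precomposed))
# Both spellings render as the same glyph and are canonically equivalent.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Whatever the element width, one glyph can still span more than one array element, so the container type alone does not settle the question.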

Not a surprise... by gweihir · 2007-05-21 20:22 · Score: 1, Insightful

That Unicode is a very bad idea in all semantics carrying containers is nothing new. In fact one of the counterarguments to Unicode is that it is a nightmare to secure. Filter evasion was expected to be a typical security concern. We will see more of this and all only because some people want features without ever reflecting on what problems they might cause.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

Re:Not a surprise... by etnu · 2007-05-21 20:42 · Score: 5, Insightful

You'd prefer securing against vulnerabilities in dozens, if not hundreds of different encodings? The only people who are against Unicode are those that have never had to work with more than one written language in the same project. Yes, it's a lot easier to secure stuff when you only accept ASCII or ISO8859-1/Windows CP-1252, but then you're limiting your software to about a third of the world (if that). Crappy engineers are going to write crappy code no matter what the encoding. No sense compromising for the sake of poorly written software.
Re:Not a surprise... by KiloByte · 2007-05-21 21:23 · Score: 3, Insightful

Wrong, the flaw in Cisco's "security" software and IIS is due to them converting things to 8-bit charsets, not due to Unicode. In fact, the whole idea of "code pages" is fundamentally broken, as it assumes all data only ever moves to other places in the same region.

The idea of double-width characters is broken too, yeah, and they are there only to appease the users of some broken Chinese/Japanese software -- but there's nothing wrong with having strange characters in file names. They don't match any file they are not supposed to unless you try to shoehorn them into a limited character set.

So, it's a flaw in the software, not Unicode by itself.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Not a surprise... by kahei · 2007-05-21 22:19 · Score: 5, Insightful

Down below this post, there's a troll writing something like 'lol if u cant just use ASCII u shud let ur language die u foreign creeps lol k thx'.

And a whole bunch of people then jump on the troll and criticize him for his US-centrism, and so on, and the troll is at -1.

Yet the post I'm replying to, which is at +4, really comes to the same thing as this troll; it's simply UNIX 8-bit centric rather than USA ASCII centric.

The fact is, computers are used for text, and much if not most text is non-ASCII. How would you rather represent that text:

--With Unicode
--With KOI-8, KOI-8R, KOI-8RU, EBCDIC, EUC-KR, EUC-JP, shift-JIS, Shift-JIS-the-Jphone-version, ISCII, VISCII, ISO-2022-*, and the many many other encodings that have evolved in different times and environments.

Seriously, which is going to be easier to secure (and otherwise manage) -- one encoding (which is HEAVILY documented and discussed) or a large number of encodings (the actual number being ever-changing and impossible to really know) many of which are not well documented and have forgotten ramifications and assumptions?

Right -- so now you know why people use Unicode so much.

But the interesting question is, why is one error ("All teh world is teh USA lol! Shouldn't you learn to speak English?") rightly jumped on and pounded flat, whereas another form that's actually more problematic ("All teh world is C on UNIX lolz!! Shouldn't you stop wanting dangerous extra features?") isn't?

Actually, I see in another window that some people have indeed been pounding the parent poster flat, so perhaps my question isn't valid after all.

--
Whence? Hence. Whither? Thither.
Re:Not a surprise... by kalidasa · 2007-05-21 22:22 · Score: 1

He's an idiot. I sometimes work with languages for which there simply IS NO OTHER ENCODING THAN UNICODE. Does he really want me to create new 8-bit encodings for each of them? Ones that won't be standardized, and so won't be easily exchangeable with other users?
Re:Not a surprise... by Teancum · 2007-05-21 22:53 · Score: 1

There is the option of using UTF-8 instead of UTF-16 for the encoding of Unicode characters. Most implementations of Unicode insist upon UTF-16 (meaning all characters including Latin alphabets use 16 bits per letter). If you have some software that your anticipated audience is primarily Latin alphabet users but you want to make Unicode available, you can use UTF-8 to keep mostly 8-bit characters but allow the full Unicode code points (including 32-bit characters as well) if you need those non-Latin characters.
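
As a rough illustration of the size trade-off described above, here is a small Python sketch (Python is an assumption; the function name encoded_sizes is hypothetical). It reports how many bytes a single character occupies in UTF-8, UTF-16 and UTF-32:

```python
def encoded_sizes(ch):
    """Return (utf8, utf16, utf32) byte counts for one character."""
    return tuple(
        len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")
    )

samples = {
    "LATIN A": "A",
    "E ACUTE": "\u00e9",
    "CJK U+4E2D": "\u4e2d",
    "MUSICAL G CLEF U+1D11E": "\U0001d11e",
}

for label, ch in samples.items():
    print(label, encoded_sizes(ch))
```

Latin text stays at one byte per letter in UTF-8, while UTF-16 needs at least two bytes per character and four for anything outside the first 65,536 code points.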

I would have to agree here that this is not a failure of the Unicode encoding standard, but the software implementation of using the Unicode standard, trying to communicate to software not prepared (due to poor implementation of existing standards and very lazy software developers) to deal with this sort of content. Of course it was software developers like this that gave us the Y2K bugs as well.
Re:Not a surprise... by gnasher719 · 2007-05-21 23:46 · Score: 2, Informative

Unicode is of course not the problem at all.

The problem is using character sets that can represent huge amounts of different characters, and among them characters that have similar looking glyphs. That is at the same time a feature that people really really want.

So spam filters will have a problem. They filter out "Viagra" but they don't filter out sequences of letters that look the same. Well, tough. If you follow the rule not to follow any links in emails but type them in yourself, that gets you mostly around it.
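
A small Python sketch of the lookalike problem (Python and the names naive_filter and scripts_used are assumptions for illustration, not any real spam filter). It swaps the Latin "a" for the Cyrillic letter that looks the same, and shows that compatibility normalization does not fold the two together:

```python
import unicodedata

latin = "Viagra"
lookalike = "Vi\u0430gr\u0430"   # U+0430 is CYRILLIC SMALL LETTER A

def naive_filter(text, banned=("viagra",)):
    """Flag text containing a banned word, compared code point by code point."""
    lowered = text.lower()
    return any(word in lowered for word in banned)

def scripts_used(text):
    """Crude script guess: first word of each character's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text}

print(naive_filter(latin))        # True
print(naive_filter(lookalike))    # False, the filter is evaded
print(unicodedata.normalize("NFKC", lookalike) == lookalike)  # unchanged
print(scripts_used(lookalike))    # mixed scripts are one possible warning sign
```

Flagging words that mix scripts is only a heuristic, and plenty of legitimate text mixes scripts too.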

The other "problem" is filtering to prevent SQL injection and all that crap. There I'd have to say two things: 1. It is just common sense if you accept Unicode to translate it into a canonical form first, either precomposed canonical or predecomposed canonical (by the way, predecomposed canonical UTF8 is what the MacOS X file system uses). Once that is done, nothing unexpected should slip through. 2. Why would you need to filter out anything at all? This is a completely brain-damaged approach in the first place, using user input to form commands that could potentially be dangerous and filtering out user input that would produce dangerous commands. Instead, there shouldn't be any commands that could be dangerous in the first place.
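
One common way to avoid building commands out of user input (point 2 above) is to bind the input as a parameter instead of splicing it into the SQL text. A minimal sketch using Python's sqlite3 module; the table, the helper names and the hostile string are made up for illustration:

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with a single users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    return conn

def find_user(conn, name):
    """Look up a user; name is bound as data, never parsed as SQL."""
    cur = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

conn = make_demo_db()
hostile = "x'; DROP TABLE users; --"

print(find_user(conn, "alice"))
print(find_user(conn, hostile))   # no match, and nothing is executed
print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])
```

Because the value never becomes part of the command text, it does not matter which quote characters it contains, ASCII or otherwise.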
Re:Not a surprise... by iandog · 2007-05-21 23:52 · Score: 1

Actually, double-width characters are quite pervasive in Japanese software and are used in everything from mobile phones to PDAs to normal PCs. In fact Windows IME, and any input system in Linux or Mac lets you enter these characters wherever you like. You can hardly say that it is only used in "broken" software. As with some of the people you're criticizing, I wouldn't make such sweeping statements without knowing the target domain.

--
-Ian
Re:Not a surprise... by Carewolf · 2007-05-22 00:24 · Score: 1

How are you going to exchange your subset of unicode if no one else is using it anyway? Who is going to have the right fonts installed?

The problem with unicode is that you assume people can decode all your data, but they actually can't. With small encodings people either have it installed or not. With unicode you have it, but it doesn't actually work for 99% of the symbols, because there are no complete fonts.
Re:Not a surprise... by gweihir · 2007-05-22 00:32 · Score: 1

Oh, I don't dispute that Unicode is a good idea for Text representation. It just has no place in anything that is carrying executable code or commands. If you allow Unicode in command languages, then there is no way to secure them with human possible effort, since filters essentially stop working.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Not a surprise... by gweihir · 2007-05-22 00:43 · Score: 1

2. Why would you need to filter out anything at all? This is a completely brain-damaged approach in the first place, using user input to form commands that could potentially be dangerous and filtering out user input that would produce dangerous commands. Instead, there shouldn't be any commands that could be dangerous in the first place.

And how do you propose to not have dangerous commands when you actually need them in places, just not coming in from that particular channel? Remove, e.g., "drop table" from SQL entirely? Sure, you could do a semantic analysis to see whether a command is dangerous first, if you have the time to spend more effort than the original SQL interpreter took to design and implement. Most do not have that and hence try a syntactic approach. That only works if you cannot get your syntax past the filter. Normalization is a good idea. However, that requires your filter to actually be in-line, preventing any type of IDS/IPS from using it, since with the complexities of Unicode, there is no way in hell to ensure that the normalization on your sniffer and on your target system actually work the same. If they do not, you can get dangerous stuff past the sniffer.

Still, for markup I have no issue with Unicode. It just has no place in protocol messages, source code and script code. Not because I do not want it there, but because the human race is currently not able to handle the security problems that arise. Maybe in a few more decades, but not now. So the error made would be to implement functionality that cannot currently be secured for most practical purposes. And it was quite obvious way back that it cannot be secured. So to reiterate, nobody working in the security field who has thought about the implications of Unicode is surprised by this mess. In fact it would have been very surprising for it not to happen. Also there is strong indication that this problem cannot currently be fixed at all.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Not a surprise... by TapeCutter · 2007-05-22 01:03 · Score: 1

"due to poor implementation of existing standards and very lazy software developers"

Tip of the day: Source code is like shit, everyone else's stinks.

After two decades as a software developer (plus another as an amateur) I can tell you that 99% of the time both your design and implementation will be constrained by an existing code base. The whole thing is recursive: if your "bleeding edge" project becomes "leading edge", it will end up as a legacy system that will in its turn crush the ambition of a new generation of energetic developers. Sure, some "hacks" are ignorance/laziness, but I find most to be a reasonable compromise between price/quality/performance at the time of writing.

IMHO: Mapping ASCII onto unicode was (at the time) a good compromise, ditto for the majority of serious Y2K problems. If unicode had not accommodated ASCII then a different standard would have been adopted by the industry - in other words the designers of unicode had no option but to support a range of popular 8-bit character pages. If you look back into the history of unicode you will find there were plenty of competing standards, they all promised "transparent" support for popular (extended) ASCII code pages (eg: The code pages found in the back of DOS manuals).

Disclaimer: I was responsible for Y2K compliance for a $100M-5yr project. Fixing the code was trivial - 80+% of the budget was spent on testing for bugs, retesting fixes for bugs, and of course endless meetings explaining calendars and timespan calculations to various groups of PHBs.

--
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Re:Not a surprise... by mrogers · 2007-05-22 01:37 · Score: 1

Not everybody who stores or carries the data will need to display it. If the data's in Sanskrit, nobody needs to have a Sanskrit font except the person entering the data and the person reading it, but everyone else can still cleanse, fold and manipulate it because it's in Unicode.
Re:Not a surprise... by ultranova · 2007-05-22 01:59 · Score: 1

If you allow Unicode in command languages, then there is no way to secure them with human possible effort, since filters essentially stop working.

Um... Why ? Why is filtering command sequences made from 32-bit characters inherently any more difficult than filtering 7-bit characters ? It doesn't make any sense to me.

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Not a surprise... by gweihir · 2007-05-22 02:35 · Score: 1

It is not, directly. But the problem is that Unicode allows more than one representation for some characters. If one normalizer knows most, but not all of them, and another knows all, then the first one will see some strings as different that the second one will see as equal. Think variable names, function names and the like and you see the problem.

Of course, if the normalizer is completely correct, then the problem does go away. Because of the complexity of Unicode, this is at the moment very hard to impossible to reach. Think of someone using sections of the character set that are hardly or not at all used, where nobody would notice this inconsistency. Also think of mistakes by the implementors, where they thought some characters to be different that are not, and where this mistake is patched in some Unicode implementations, but not in others.

All this can be used to successfully hide code and command functionality from filters, even if they go down to the semantic level and do not only work syntactically.

The fundamental problem is not Unicode per se, but its complexity. The fundamental design error is to allow different encodings for one symbol, so that symbol equality has to be decided via code and is not obvious in many cases.
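
To make "different encodings for one symbol" concrete, here is a short Python sketch (Python and the helper name same_text are assumptions for illustration). Three distinct code point sequences all stand for the letter A with a ring above, and only a normalization step makes them compare equal:

```python
import unicodedata

precomposed = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "A\u030a"     # 'A' followed by COMBINING RING ABOVE
angstrom = "\u212b"        # ANGSTROM SIGN, canonically equivalent to U+00C5

def same_text(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Raw comparison treats all three spellings as different strings.
print(precomposed == decomposed, precomposed == angstrom)

# Comparison after NFC treats them as the same text.
print(same_text(precomposed, decomposed), same_text(precomposed, angstrom))
```

Two components that disagree on whether to run this step, or that run different versions of it, will disagree on whether two names are the same, which is the scenario described above.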

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Not a surprise... by KiloByte · 2007-05-22 02:36 · Score: 1

There is the option of using UTF-8 instead of UTF-16 for the encoding of Unicode characters. Most implementations of Unicode insist upon UTF-16 (meaning all characters including Latin alphabets use 16 bits per letter).
"All characters"? I'm afraid that's only 1/17 of Unicode. And according to the law of mainland China, software which doesn't support codes over 16 bits can't be sold there -- well, the commies are nothing but lawful so it's mostly a paper requirement, but it's there.

And UTF-16 has all the flaws of UTF-8 and UCS32 with none of the advantages. You neither get the 1-character-to-array-element ease nor having out-of-the-box compatibility for 99% of GUI software. With UTF-8, if your program supports ASCII (duh) and doesn't rely on character cell display (ie, full screen text mode software) or need to have some means of counting characters, you're set without doing anything.

Fortunately, the only major implementation which uses UTF-16 in a place visible to the user is win32, and as only a small minority of software uses their fooW APIs instead of fooA, it's those unlucky to depend on Windows who suffer from charset problems in this millennium. Oh wait, I forgot those who have to exchange data with Windows users, and that's... ugh...

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Not a surprise... by jZnat · 2007-05-22 03:05 · Score: 1

You can have multiple font families installed that will cover most (if not all) of the Unicode characters when union'd together, so as long as the font manager grabs glyphs from a list of available fonts, all characters should be covered. Therefore, you don't really need a complete font family with all the Unicode glyphs.

--
'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
Re:Not a surprise... by iabervon · 2007-05-22 03:30 · Score: 1

The difference in this thread is that the OP claims that only ASCII should be used for "semantics-carrying containers", which is a confusing way of saying "control structures". The real flaw is that some systems will allow SQL string constants to be ended by non-ASCII double quote characters. In this case, the issue is the Unicode section for ASCII characters to be used in text where the normal characters have square space allotments. If the application behind a filter is using a human-meaning-preserving conversion to a character set that doesn't have these characters (which is the really dumb thing), a wide double quote (＜) gets turned into a regular one, which now changes the structure of the request.

Even people who don't know any human languages that use a latin character set can probably get by writing HTML with narrow angle brackets, and ASCII tag and attribute names.
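
The filter-then-convert scenario in this post can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function names are hypothetical, and NFKC normalization stands in for whatever "human-meaning-preserving" charset conversion the real backend performs.

```python
import unicodedata

def naive_filter_allows(text):
    """Front-end check: reject input containing ASCII quote characters."""
    return '"' not in text and "'" not in text

def backend_fold(text):
    """Stand-in for the backend's conversion; folds fullwidth forms to ASCII."""
    return unicodedata.normalize("NFKC", text)

payload = "\uff02 OR 1=1 --"   # starts with U+FF02 FULLWIDTH QUOTATION MARK

print(naive_filter_allows(payload))                # True, filter sees no quote
print(backend_fold(payload))                       # now begins with an ASCII quote
print(naive_filter_allows(backend_fold(payload)))  # False, but it is too late
```

The filter and the backend each behave sensibly on their own; the hole comes from the conversion happening after the check.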
Re:Not a surprise... by jhol13 · 2007-05-22 03:49 · Score: 1

"unicode allows more than one representation for some characters"

Unicode states how normalization should occur: http://www.unicode.org/unicode/reports/tr15/. Are there problems with this, or what are you referring to?
Re:Not a surprise... by HopeOS · 2007-05-22 04:01 · Score: 1

I have no problem with an encoding that is capable of encapsulating all the world's languages. I use it daily.

I take issue with the fact that they implemented it so poorly.

1. It is impossible to determine if a character is whitespace; you have to look it up in a table.
2. It is impossible to determine if the character is even printable; you have to look it up in a table.
3. It is impossible to determine if the character has another, more canonical presentation; you have to look it up in a table.

That's a lot of tables. And they change. And they take up space in memory. If you compact them into sorted lists, you still have to search them. That's a huge cost, per character, when parsing strings.
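
For a sense of what those table lookups look like in practice, here is a sketch using Python's bundled copy of the Unicode character database (Python is an assumption here, and describe is a hypothetical helper). None of these answers can be derived from the bit pattern of the code point alone:

```python
import unicodedata

def describe(ch):
    """Look up table-driven properties of a single character."""
    return {
        "category": unicodedata.category(ch),   # general category, e.g. 'Zs'
        "is_space": ch.isspace(),
        "is_printable": ch.isprintable(),
    }

for ch in ("A", " ", "\u00a0", "\u3000", "\u200b"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

U+3000 (ideographic space) counts as whitespace, while U+200B (zero width space) does not, even though neither draws any ink. The tables also ship with the runtime, so two runtimes built against different Unicode versions may give different answers for newer characters.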

Moreover, the fact that there are multiple encodings for exactly the same string is an error in design. The whole canonical forms nonsense is just the tip of the iceberg.

-Hope
Re:Not a surprise... by Anonymous Coward · 2007-05-22 04:32 · Score: 0

And UTF-16 has all the flaws of UTF-8 and UCS32 with none of the advantages. You neither get the 1-character-to-array-element ease nor having out-of-the-box compatibility for 99% of GUI software. With UTF-8, if your program supports ASCII (duh) and doesn't rely on character cell display (ie, full screen text mode software) or need to have some means of counting characters, you're set without doing anything.

You don't get "1-character-to-array-element" with Unicode in any form. If your application does nothing with text except recognizing 7-bit ASCII codes and ignoring everything else in the stream, yes UTF-8 is excellent. If you want to do anything interesting with the text and still work in languages other than English, you need to understand Unicode.

Fortunately, the only major implementation which uses UTF-16 in a place visible to the user is win32

Uh, Java?

and as only a small minority of software uses their fooW APIs instead of fooA

Heh, no actually quite a lot of software uses the Unicode APIs. Since the ANSI versions only really work with code pages that have a maximum of 2 bytes per character, UTF-8 doesn't help there either.

Recompiling to work with 16-bit characters is pretty easy compared to actually understanding all of the characters. That's why we're here discussing this story! IIS is definitely written using the UTF-16 APIs. It even uses OS services to fold full- and half-width Latin characters into the standard forms. The vulnerability scanner probably uses 16-bit chars too, but doesn't do that folding. Using a different encoding for Unicode wouldn't make any relevant difference.
Re:Not a surprise... by kahei · 2007-05-22 04:33 · Score: 1

Well, yes, 'semantics-carrying containers' did confuse me a bit. The problem in the above scenario is not Unicode -- it's having one bit of code check the input string, and *then* having another bit of code (with different assumptions, e.g. that a Chinese quote mark can end a SQL string) change it before it is used!

In such a case, unless the two bits of code share common assumptions, there's bound to be a hole.

I don't think the issue of Unicode versus some other encoding matters, once you already have a filter transforming the string AFTER it is checked. Are there any systems that really have to work like this?
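
One way to avoid the check-then-transform ordering is to transform first and hand only the transformed string onward. A minimal Python sketch, with hypothetical function names and NFKC normalization assumed as the stand-in for the transformation:

```python
import unicodedata

def canonicalize(text):
    """Apply every transformation before any check is made."""
    return unicodedata.normalize("NFKC", text)

def is_safe(text):
    """Example policy: no ASCII quote characters allowed."""
    return not any(q in text for q in ('"', "'"))

def accept(raw):
    """Return the canonical string if it passes the check, else None."""
    canonical = canonicalize(raw)
    if not is_safe(canonical):
        return None
    # Pass on exactly the string that was checked, never the raw input.
    return canonical

print(accept("\uff02 OR 1=1 --"))               # None: fullwidth quote is caught
print(accept("\uff48\uff45\uff4c\uff4c\uff4f")) # fullwidth letters become 'hello'
```

This only helps when the checker and the consumer share the same canonicalization code; a separate sniffer on the wire still has to guess what the target will do.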

--
Whence? Hence. Whither? Thither.
Re:Not a surprise... by Intron · 2007-05-22 05:02 · Score: 1

http://www.unicode.org/versions/

Any time a standard has been changed, you will have some outdated, but perfectly correct software. Hence, two pieces of software may not agree on the meaning of a Unicode string even without a software error.

--
Intron: the portion of DNA which expresses nothing useful.
Re:Not a surprise... by DrVomact · 2007-05-22 05:02 · Score: 1

This comment does not make a bit of sense. What are "semantics carrying containers"? Why would Unicode be harder to secure than Shift-JIS or ANSI?

--
Great men are almost always bad men--Lord Acton's Corollary
Re:Not a surprise... by Anonymous Coward · 2007-05-22 06:15 · Score: 0

Just because there is no other encoding than Unicode doesn't mean that no one else is using the Unicode encoding. There's plenty of data out there from writing systems that were never encoded before that have been added to Unicode, and there are new fonts to handle some of those systems.
Re:Not a surprise... by spitzak · 2007-05-22 08:31 · Score: 1

Unfortunately Microsoft has completely fucked it up, and the "A" suffix functions are useless, too. What they do is use a "code page" to translate the bytes into that nasty utf16 and thus it will not pass utf8 through. Filenames on NTFS are actually stored in utf16, which means a nightmare of future compatibility. There is a third interface (the "multibyte" one) that could save us, but Microsoft oh-so-conveniently left out any way for a program to force the multibyte encoding to UTF8.

Microsoft is not the only one at fault. Sun and all the other Unix vendors were convinced that UTF16 (well UCS2 then) was the solution to I18N, and if they had not been wiped out by Microsoft it would have been established sooner, UTF8 would not have been seen, and we would be far worse than we are now.

The whole thing is very sad. The insistence that all the characters be the same size is really due to political correctness but you would be amazed at what illogical and insane things otherwise-intelligent people will say to defend it. (Hint for the clueless: text is made of words and many other things besides characters. Also Unicode has combining characters and invisible characters. And damn Microsoft still thinks a line break is two characters. This variable size is NOT a problem. Think before you say something silly!)
Re:Not a surprise... by gnasher719 · 2007-05-22 10:04 · Score: 1

'' Oh, I don't dispute that Unicode is a good idea for Text representation. It just has no place in anything that is carrying executable code or commands. If you allow Unicode in command languages, then there is no way to secure them with human possible effort, since filters essentially stop working. ''

Why would they stop working? As two examples, the bash shell and the Perl language don't assign any special meaning to any character with a code above 0x80, so Unicode using UTF8 encoding would be completely transparent. (For example, a Perl script that reads a file name and generates a valid bash shell command that would delete that file or copy it to another directory would work without any changes).
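
The transparency claim rests on a property of UTF-8: every byte of a multi-byte sequence has its high bit set, so ASCII bytes in the stream can only come from ASCII characters. A small Python check of that property (Python, the helper name and the sample file name are assumptions for illustration):

```python
def ascii_bytes_only_from_ascii_chars(text):
    """True if the ASCII bytes in the UTF-8 form match the ASCII chars in text."""
    encoded = text.encode("utf-8")
    from_bytes = bytes(b for b in encoded if b < 0x80)
    from_chars = "".join(ch for ch in text if ord(ch) < 0x80).encode("ascii")
    return from_bytes == from_chars

filename = "日本語 report;v2.txt"
print(ascii_bytes_only_from_ascii_chars(filename))

# No byte of the non-ASCII part can be mistaken for a shell metacharacter.
metachars = b";|&$`'\"\\ "
non_ascii_part = "日本語".encode("utf-8")
print(any(b in metachars for b in non_ascii_part))
```

The ASCII metacharacters in the name (the space and the semicolon) still need quoting as usual; the point is only that the non-ASCII characters cannot smuggle in extra ones.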
Re:Not a surprise... by gnasher719 · 2007-05-22 10:07 · Score: 1

'' Any time a standard has been changed, you will have some outdated, but perfectly correct software. Hence, two pieces of software may not agree on the meaning of a Unicode string even without a software error. ''

Actually, the normalisation functions are defined to be unaffected by future changes.
Re:Not a surprise... by HeroreV · 2007-05-22 11:55 · Score: 1

Most implementations of Unicode insist upon UTF-16 (meaning all characters including Latin alphabets use 16 bits per letter).
16 bits is only enough to represent code points in the Basic Multilingual Plane. Unicode has 17 planes (0-16). Surrogate pairs can be used to represent characters outside the BMP. It is very common for software to choke on surrogate pairs, since many developers, like you, assume all characters encoded in UTF-16 are encoded in just 16 bits. I wouldn't be surprised if there were a few security bugs due to that.
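
The surrogate pair arithmetic can be written out in a few lines. This Python sketch (the language and the function names are assumptions for illustration) computes the pair by hand and cross-checks it against the built-in UTF-16 encoder:

```python
def to_surrogates(code_point):
    """Return the UTF-16 code units for a code point, as a tuple of ints."""
    if code_point < 0x10000:
        return (code_point,)              # BMP: a single 16-bit unit
    offset = code_point - 0x10000         # 20 bits remain
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return (high, low)

def utf16_units(ch):
    """Decode the built-in encoder's output into 16-bit units."""
    data = ch.encode("utf-16-be")
    return tuple(
        int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)
    )

clef = "\U0001d11e"   # MUSICAL SYMBOL G CLEF, outside the BMP
print([hex(u) for u in to_surrogates(ord(clef))])
print(to_surrogates(ord(clef)) == utf16_units(clef))
```

Code that indexes or truncates a UTF-16 buffer by unit count can split such a pair in half, which is the kind of bug the post is warning about.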
Re:Not a surprise... by gweihir · 2007-05-23 03:20 · Score: 1

There is a problem in that, namely that it is too complex. It gets implemented wrongly. One wrong implementation is all it takes to cause a major security issue, but there seem to be numerous wrong and/or incomplete implementations out there.

In fact, from experience, I would say that the only normalization that works for security critical stuff is the identity transformation and the only comparison that works is bitwise identity. Everything else, programmers manage to screw up.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Not a surprise... by Teancum · 2007-05-24 22:44 · Score: 1

I should have been more specific here. All characters encoded with UTF-16 including the Latin alphabet will be at least 16 bits.

The point I'm making is about the lousy implementation of Unicode by Microsoft, which has two different character encoding standards for most of its API function calls when that wasn't really even necessary. (UTF-8 has been around long enough for this to have been fixed and done properly since Windows '98 was released... possibly even in an earlier service pack patch to Windows '95.) The whole Win32 API treats Unicode as this perverted disease that needs to be avoided when at all possible, when it simply doesn't need to be dealt with that way.

There is the issue of diacritical marks and the code mapping to some Latin characters that normally use the 8th bit of modified ASCII instead of formal Unicode, but that is a problem anyway for these same documents. I guess it took too much effort on the part of the MS-Word development team to properly transition to Unicode as a standard, as they used the 8th bit of ASCII for document formatting characters as well. The Win32 API is in many places obviously designed by the MS-Office dev team. But that is another issue and thread entirely.
Re:Not a surprise... by Teancum · 2007-05-24 22:51 · Score: 1

There is also a widespread problem within software development circles to think that the reference implementation of a software specification is the one and only solution, and that anybody attempting to re-implement that standard doesn't have a brain.

I have seen far too many bugs in the reference implementations (just like any other software) to trust them entirely, although the unfortunate aspect of that is the reference implementation *becomes* the standard instead of the documentation which is supposed to be the standard. Hence the reason most people simply use the reference implementation and pretend the formal specification doesn't even exist. If the standard says a byte ought to be "12" but the reference implementation produces an "08", guess which one is most frequently found in data files using the standard?

Indeed, English is simpler by Anonymous Coward · 2007-05-21 20:56 · Score: 0

The English alphabet is quite small, small enough that we can waste encoding space on every character twice (upper and lower case). I dunno about Arabic, but the Chinese alphabet is quite a bit larger, even if you only encode unique strokes. To answer your question, yes the English is fundamentally simpler to work with for a computer.

Re:Indeed, English is simpler by magores · 2007-05-21 21:59 · Score: 1

1) Chinese doesn't have an alphabet.
2) And the number of unique strokes is actually quite small. It's the combinations of strokes that makes it difficult.

Not all ACs are morons, but this one is.

Depends on alphabet size by Viol8 · 2007-05-21 21:49 · Score: 2

If you want to represent a language on a computer (and not just numbers) then you need a way to enter and store all the characters that language uses. Obviously the fewer characters the better. The Latin alphabet with all its variations, the Cyrillic, Hebrew, Arabic & Korean all lend themselves to this quite easily since they all have a manageable number of letters. Languages such as Chinese and Japanese don't; they don't even use alphabets, they use characters for each object/concept, which as you can imagine is a bugger when you need a keyboard of a manageable size, not to mention the memory to store each character bitmap (not an issue now, but 40 years ago it would have been a nightmare).

This is only a guess on my part but I suspect computers would have developed completely differently if they'd been developed by a culture that used symbols rather than alphabets.

Re:Depends on alphabet size by setagllib · 2007-05-21 22:08 · Score: 1

Japanese is actually even more complicated than Chinese in that regard. On the one hand it *does* have definite limited alphabets (e.g. hiragana) but also imports a huge amount from Chinese characters. So not only do they have multiple base alphabets, all of which have large distinct character counts, they also have a character library. In doing so I think they have the worst of both worlds - a lot to remember, hard to encode, and not even very compatible with other languages.

Chinese ideographs are so numerous and difficult to remember that they are considered one of the reasons for China's incredibly low literacy rate. The language reform didn't really help either. I guess it's a side effect of bringing in complex written language, once only used by the educated elite, to the average masses who don't have the time or resources to learn. It's not as much of a big deal with simpler languages. Russian language is a bit more complicated than English, but also more sensible and internally consistent, and Russia has one of the highest literacy rates of any nation. I don't know if that's because of better education or just a better language design, but the numbers are there looking us in the face.

--
Sam ty sig.
Re:Depends on alphabet size by TheRaven64 · 2007-05-21 22:24 · Score: 2, Informative

"Chinese ideographs are so numerous and difficult to remember that they are considered one of the reasons for China's incredibly low literacy rate."

If you want some evidence of this, then take a look at what happened to Korea when it dropped the Chinese ideograms in favour of a new, home-grown phonogram-based alphabet.

--
I am TheRaven on Soylent News
Re:Depends on alphabet size by magores · 2007-05-21 22:40 · Score: 1

The Pinyin system actually did quite a bit to increase the literacy rate in China.

I would argue that a large reason for the low literacy rate in China is the fact that parents must pay for their child to attend school. Even grades 1, 2, etc are not free.

Many rural parents cannot afford the payments. Therefore, the child doesn't attend school. Hence, low literacy.
Re:Depends on alphabet size by ickoonite · 2007-05-21 23:38 · Score: 1

I don't think it actually has much to do with the complexity of the script - Japanese is, as you say, more complicated, and yet Japan has long had some of the highest literacy rates in the world, even before its modern era. I think - as someone else has suggested here - it has far more to do with the lack of access to education due to poverty, etc. rather than the inherent complexity of hanzi.

Besides, because of the vast number of homonyms in Chinese, an ideographic writing system makes discerning intended meanings so much easier. Whilst Korean - which, because of Chinese influence, is also awash with homonyms - has a writing system which is much easier to get to grips with, it relies much more on context to convey the intended meaning of a word. At least in the CJK case, a loss of clarity is the price you pay for a more accessible language.

And I'm way offtopic...

iqu :|
Re:Depends on alphabet size by rabtech · 2007-05-22 03:27 · Score: 4, Interesting

IIRC, China was on its way to moving to an alphabet system (certain characters can be used for their alphabetic sounds in various circumstances) and so was Japan (look at Katakana/Hiragana).

It is likely that the introduction of the printing press (and later mass media like TV/radio and computers) have "arrested" this natural evolution. It may also be possible that the development of a national identity and cohesive society tends to put the brakes on some developments as well - if a single unified language is mandated by culture or a central authority then local variations are much less important.

Romanji (and to a certain extent English itself) is definitely influencing the Japanese; the younger generations even moreso. Japan may end up using an alphabet for day to day needs almost exclusively within the next 100 years. The situation in China is much less clear but it will probably happen eventually.

If we look into the past, nearly all societies with ideographic/logographic writing systems eventually moved to an alphabetic system. Hell, even Ancient Egyptian Hieroglyphs were partially syllabic much like Katakana. Much as previous posters have pointed out, changing to an alphabetic system from Chinese-characters has allowed Korea to dramatically raise literacy rates. There is only so much time for schooling and memorization, and only so much effort to expend on literacy. If a simpler writing system is more accessible then that is a net gain, even if there are a few things that logographic writing systems do better than alphabetic ones.

--
Natural != (nontoxic || beneficial)
Re:Depends on alphabet size by Anonymous Coward · 2007-05-22 09:00 · Score: 1, Informative

The word you're looking for is "romaji"

As for Japanese, the language needs to change severely to make all-alphabet practical. They have homophones like you wouldn't believe (comes from the small number of sounds the language uses). It's very easy to write a sentence that, even in context, could mean multiple widely varying things if you don't have kanji to indicate meaning.
Re:Depends on alphabet size by ShakaUVM · 2007-05-22 09:05 · Score: 2, Informative

You're missing the key roadblock to simply replacing characters with pinyin, or any other romanization: Chinese is a heavily overloaded language. While there are a few homophones in English, *every* word in Chinese is a homophone, with something like 13 different homophones per sound on some of them. We differentiate some homophones by writing them differently (layed, laid, etc.); Pinyin *cannot* differentiate these homophones -- it's an exact transcription of the sound. Chinese differentiate their written words with characters. When having a conversation you can get by with spoken Chinese or pinyin, since you can always ask the other person which character they meant if there's confusion, and Chinese will make do with pinyin in a pinch, but it's more or less impossible to ask them to switch to a Romanisation for all purposes.
Re:Depends on alphabet size by loyukfai · 2007-05-22 11:35 · Score: 3, Interesting

IIRC, China was on its way to moving to an alphabet system (certain characters can be used for their alphabetic sounds in various circumstances)...
I'm a Chinese but I have never heard of this. Would you be so kind to educate me on this...? Where did you hear such things?
I'm serious.
Re:Depends on alphabet size by xigxag · 2007-05-22 15:21 · Score: 1

I realize that what you're stating is the received wisdom and as such can scarcely be questioned, but I'd wonder how illiterate Chinese people throughout history were able to even hold a conversation without being baffled by homophones. Using characters is a clever solution to the problem of transferring Chinese words to paper but it is not the only one, as evidenced by the existence of the Dungan and Xiao-er-jin syllabaries.

--
There are two kinds of people: 1) those who start arrays with one and 1) those who start them with zero.
Re:Depends on alphabet size by ShakaUVM · 2007-05-22 20:57 · Score: 1

>>but I'd wonder how illiterate Chinese people throughout history were able to even hold a conversation without being baffled by homophones.

They ask something like: "ditie de di haishi didi de di?" if there's confusion. That's why I said it's less of an issue when having a two way conversation or playing online. You can ask.

But it's very suboptimal to expect them to live with a crippled system when the characters provide the disambiguation.

>>I realize that what you're stating is the received wisdom and as such can scarcely be questioned

It's not like the Chinese government hasn't tried to replace characters with romanizations. It was rejected by the people for the reasons I listed before.

Ron Paul for Regulation!!! by Anonymous Coward · 2007-05-21 22:09 · Score: 0

In Soviet Russia, Ron Paul is for regulating software security!

Wait.

That made no sense.

And this isn't an article about Ron Paul for President..

But, hey.. This comment is likely just about as relevant to the article, as the rest of them.

Maybe more so.

I'm not Ron Paul, and I don't approve this message.

Re:Hmmmm.... by Anonymous Coward · 2007-05-21 22:22 · Score: 2, Funny

4) You are an idiot
5) You are an asshole

Nothing to see, move along ... by udippel · 2007-05-21 22:24 · Score: 4, Insightful

It is a vulnerability, in the strict sense.
It is a self-inflicted misbehaviour as in common sense.
It is like those silly Cisco content inspectors on port 25, that try to avoid attacks on flimsy MTAs.
It is like someone dying from a jab against measles: the jab protected that person from contracting measles, actually.
It is like those stupid anti-virus programs that are more vulnerable than the daemons they profess to protect.

When the attacker uses a codepage different from the one that you think she ought to use, she can circumvent your content filter. Which ought not be an attack vector, in any case.

As I said: nothing to see, move along ...

Re:Nothing to see, move along ... by MrMista_B · 2007-05-21 23:48 · Score: 1

What if the attacker is a guy?

('he' is gender-neutral in the common sense, unless referring to a specific individual, 'she' is not)
Re:Nothing to see, move along ... by Anonymous Coward · 2007-05-22 04:13 · Score: 0

Actually, in the PC world (that is, Politically Correct, nothing to do with ASCII code pages), the use of "she" is now supposed to be gender neutral. Of course, that simply reverses the sexist issue, rather than correcting it. At some point, when the pendulum has swung too far to the other side, we'll be forced to revert back to "he". Perhaps a new Unicode character can be devised to provide a truly gender neutral 3rd person singular pronoun.

Re:Hmmmm.... by peragrin · 2007-05-21 22:55 · Score: 4, Interesting

1) unicode is better than having a hundred other encodes to debug
2) there are nearly two billion Chinese and Indians, who can't use your encoding.
3)I get just as much spam from US companies as I do foreign ones

--
i thought once I was found, but it was only a dream.

There is no proof of concept for IIS by Anonymous Coward · 2007-05-21 22:56 · Score: 0

The linked "proof of concept" is an example of how to encode a "q" letter. It is a perfectly legal coding and it is correctly converted by IIS/ASP.

Microsoft is not listed as having any product affected by this "vulnerability". They are listed as "unknown".

This affects content filtering systems (such as naive firewalls) which rely on byte-for-byte comparison to filter out unwanted content.
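
A minimal sketch of the byte-for-byte problem described above. The filter function is hypothetical and stands in for no particular product; it only shows that the UTF-8 bytes of a full-width angle bracket never contain the ASCII byte a naive scanner looks for.

```python
# Hypothetical naive filter: flags input only if the raw bytes contain b"<".
# This is an illustration, not the logic of any real firewall.
def naive_filter_blocks(payload: bytes) -> bool:
    return b"<" in payload

ascii_payload = "<script>".encode("utf-8")
# U+FF1C / U+FF1E are the full-width forms of "<" and ">".
fullwidth_payload = "\uFF1Cscript\uFF1E".encode("utf-8")

print(naive_filter_blocks(ascii_payload))      # True
print(naive_filter_blocks(fullwidth_payload))  # False
print(fullwidth_payload[:3].hex())             # efbc9c
```

The filter only becomes a problem if something downstream later folds the full-width form back into the ASCII one.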

flawed design .. by rs232 · 2007-05-21 22:58 · Score: 1

What kind of a flawed design is it where character encoding can impact security? The concept of scanning for unsafe strings is also flawed, as in the case of virus scanning, as it only knows about the stuff it knows about. This is another example of Ranum's "enumerating badness". If the SQL engine used only stored procedures then you wouldn't have to run a content scanner as the only thing coming over HTTP is DATA.

--
davecb5620@gmail.com

Another likely example of OSS? by erroneus · 2007-05-21 23:23 · Score: 1

Back in the Win95 days, I recall a stupid little exploit that would lock up a Win95 machine. The root of the problem, however, was in the TCP/IP code from BSD's source. Microsoft had used BSD's TCP/IP stack code in building one for Win95. I'm not here to complain that big bad commercial vendors are "stealing" from the open source community. I'm just suggesting that perhaps this is yet another example of how OSS has made yet another important, though silent, contribution.

It's annoying to me when people suggest that OSS is sub-par in some way or another. It would be nice if there were a long list somewhere of all the more commonly identified examples of OSS contributions to commercial code. Then it can be more conveniently shown that commercial code quite often depends on the derided OSS code.

Re:Another likely example of OSS? by El_Muerte_TDS · 2007-05-22 00:18 · Score: 1

Just because you use "free" code doesn't mean you don't have to check it for correctness.
If X works in system Y doesn't imply it works in system Z. Heck, the reason it works in Y could be because of a bug in Y.
Re:Another likely example of OSS? by SEMW · 2007-05-22 00:53 · Score: 0

Apparently, Vista's networking stack has been rewritten from scratch -- which does make you wonder how much of the reason for that was technical, and how much was MS wanting to be seen to get rid of all the BSD/*nix code in Windows in preparation for their patent offensive...

--
What's purple and commutes? An Abelian grape.
Re:Another likely example of OSS? by Frankie70 · 2007-05-22 02:51 · Score: 2, Informative

Apparently, Vista's networking stack has been rewritten from scratch -- which does make you wonder how much of the reason for that was technical, and how much was MS wanting to be seen to get rid of all the BSD/*nix code in Windows in preparation for their patent offensive...

Why should using BSD code come in the way of their patent offensive?
Using BSD code isn't infringing on BSD's or someone else's patent.
Re:Another likely example of OSS? by Anonymous Coward · 2007-05-22 04:02 · Score: 1, Informative

When NT 3.1 first shipped in 1993, Microsoft did not have the resources to develop all of the network stacks. They wrote the NetBIOS stack but licensed the OSI stack from a 3rd-party and contracted the TCP/IP stack from another 3rd-party. This TCP/IP contractor used the BSD code for expediency, which is the source of the rumor.

However, the BSD stack required emulation of the STREAMS interface which was pretty inefficient. For the NT 3.5 release in 1994 they wrote their own TCP/IP stack in-house, which is the same stack that shipped with Win95.

MS hasn't shipped the BSD TCP/IP stack for 14 years. The reason they rewrote it for Vista is to incorporate IPv6. With XP, you had to install the v6 stack separately.

dom

Re:Hmmmm.... by iainl · 2007-05-21 23:30 · Score: 1

Grandparent is correct; what these Floridians send doesn't qualify as English.

--
"I Know You Are But What Am I?"

Re:Hmmmm.... by Anonymous Coward · 2007-05-22 00:20 · Score: 0

1) unicode is better than having a hundred other encodes to debug

No, it's not.

TCP/IP code from BSD .. by rs232 · 2007-05-22 00:36 · Score: 1

'Back in the Win95 days, I recall a stupid little exploit that would lock up a Win95 machine. The root of the problem, however, was in the TCP/IP code from BSD's source'

I assume you are referring to the ping of death. The root cause was a bug in the TCP protocol, and it occurred on other platforms not using the BSD code.

was Another likely example of OSS?

--
davecb5620@gmail.com

Is "Bernard" pronounced like "burnèd"? by tepples · 2007-05-22 00:42 · Score: 0, Offtopic

è is sometimes used to indicate that the e in a past participle is pronounced, eg learnèd (rhymes with Bernard)

In Disney's 1994 film The Santa Clause , an elven character named Bernard (David Krumholtz), who accents his name on the last syllable, explains some of the basic rules to Santa. I've never heard "Bernard" pronounced any other way.

Otherwise, I agree with your post.

Re:Is "Bernard" pronounced like "burnèd"? by Nick+Number · 2007-05-22 03:44 · Score: 1

Brits tend to pronounce it BURN-urd, whereas Americans favor bur-NARD.

--
Promote proofreading. Don't mod up sloppy posts.

I don't know Japanese law, so why support kanji? by tepples · 2007-05-22 00:57 · Score: 1

The fact is, computers are used for text, and much if not most text is non-ASCII. In order to market my product in some other country, I have to familiarize myself with its laws. As of the foreseeable future, I have the time to do this only for the United States of America, for which ISO-8859-1 is "good enough" especially on a handheld device with 4 MB of RAM. It also costs money to license foreign fonts, unless you just want rectangles everywhere.

Stored procedure cross-compatibility? by tepples · 2007-05-22 01:02 · Score: 1

If the SQL engine used only stored procedures then you wouldn't have to run a content scanner as the only thing coming over HTTP is DATA.
Do the popular free software implementations of SQL (MySQL, PostgreSQL, Firebird SQL, etc.) implement stored procedures in any sort of standard manner?

Re:Stored procedure cross-compatibility? by rs232 · 2007-05-22 02:46 · Score: 1

'Do the popular free software implementations of SQL (MySQL, PostgreSQL, Firebird SQL, etc.) implement stored procedures in any sort of standard manner?'

I don't know what you mean by standard manner. According to this, PostgreSQL uses something called procedural languages. But then again, since when was SQL ever implemented to a common standard? Remember when Microsoft 'extended' SQL so as to allow spaces in table names, you only have to wrap the name in square brackets [] or back-ticks ``.

But my point is still valid, there must be any number of ways to achieve the same usability as sending an SQL query to the client, which fills in the variables and sends it back to the server. So that things like the below don't happen, where you replace the $ENV variable with a bogus SQL query.

SELECT fieldlist
FROM table
WHERE field = '$ENV';
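
One hedged way to keep the data out of the query text is a parameterized query rather than a stored procedure; this is a swapped-in technique, not what the post above proposes. The sketch uses Python's standard sqlite3 module with a made-up table and a made-up hostile value.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

hostile = "x' OR '1'='1"  # assumed attacker-supplied value

# Unsafe: the value is pasted into the SQL text, so it can change the query.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % hostile
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the value travels separately as a bound parameter, purely as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # 2 (every row matched)
print(len(safe_rows))    # 0 (no user has that literal name)
```

With the bound parameter, no encoding trick in the value can turn it into SQL syntax, because it is never parsed as SQL.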

--
davecb5620@gmail.com

Re:Hmmmm.... by Anonymous Coward · 2007-05-22 01:06 · Score: 0

chinese and other asian people can't use Unicode : lots of characters are missing.

now those utf-8 zealots will tell you stupid things like "who cares, chinese really use about only 10k chars and those are in utf-8, the other 40k or 70k are not used daily by those people, so utf-8 is still ok right ?"

well no, it defeats the whole purpose of Unicode. if you think its ok, then why should west-europeans and americans/canadians/australians CARE about non-ISO-8859-1 chars ?

You're the moron, HA! by Anonymous Coward · 2007-05-22 01:12 · Score: 0

Composing the strokes is not difficult. No, it's not. It's the character encoding that's troublesome. Har har.

Oh, and there are still a good number of strokes, some that aren't supported by common setups even today. And complaining about it being called an alphabet is the same as being anal.

IIS's fault by phasm42 · 2007-05-22 01:50 · Score: 1

After reading through this carefully, it seems the fault is really with the webserver software (in this case, IIS). The problem is that normally a full-width character (such as FF1C in the example) and the regular character "<" are not equivalent, but IIS is translating the full-width form of a character into the regular character, so although the two forms were distinct before reaching the frontline filters, they are no longer distinct by the time it reaches application code running under IIS.

I guess whether you call this a fault depends on whether you think the webserver should be translating full-width characters to their equivalents. I tested a webpage, and both IE and FF do not consider "<" and "\uFF1C" (encoded in UTF-8) to be equivalent. The latter is displayed as a giant angle bracket, and does not work as the start of an HTML tag. I would think that the webserver should also respect this distinction. Anyone know more about full-width characters, or why they even exist?
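
A small sketch of the distinction discussed above, using Python's standard unicodedata module. Unicode's NFKC compatibility normalization is used here only as an analogous folding step; it is an assumption for illustration, not a claim about how IIS performs its conversion.

```python
import unicodedata

fullwidth_lt = "\uFF1C"

# The two characters are distinct code points with distinct names.
print(unicodedata.name(fullwidth_lt))  # FULLWIDTH LESS-THAN SIGN
print(fullwidth_lt == "<")             # False

# Compatibility normalization folds the full-width form into the ASCII one.
folded = unicodedata.normalize("NFKC", fullwidth_lt)
print(folded == "<")                   # True

# Canonical normalization (NFC) leaves the full-width form alone.
print(unicodedata.normalize("NFC", fullwidth_lt) == fullwidth_lt)  # True

# East Asian Width property: "F" is fullwidth, "Na" is narrow.
print(unicodedata.east_asian_width(fullwidth_lt))  # F
print(unicodedata.east_asian_width("<"))           # Na
```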

--
"No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner

Re:IIS's fault by mzsanford · 2007-05-22 02:53 · Score: 1

Full width characters are used to maintain alignment in east Asian languages. A good example is the Chinese encodings, where every character is more or less square. If you put a half-width (e.g. Latin) question mark in there it would throw off the natural alignment. A full-width question mark however keeps the alignment by taking up a full square. Might seem like a silly distinction but Chinese typesetting (even web typesetting) expects squares, and so do the viewer's eyes.
Re:IIS's fault by DrVomact · 2007-05-22 06:04 · Score: 1

"Full width" vs. "Half width" (or, as I prefer, "half-wit") characters exist for typographical convenience in rendering Japanese characters. (Take a look at the Unicode spec, section 10.3 for example http://www.unicode.org/book/ch10.pdf/). This does not, however, explain why certain symbols that are already defined in other parts of the Unicode standard, such as the less-than symbol (or left angle bracket) are duplicated there. I suspect that it has something to do with possible confusions that might arise when parsing or transcoding mixed double-byte and single-byte characters...but that's just a guess.

In any case, the effect of this is that there are 2 ways of producing the < glyph: you can use character code x8B or xFF1C. However, your experiments have shown that browsers do not treat these two codes as being the same character...even though they look the same. I'm not sure if that's right or wrong, if there is a right and wrong way to handle this issue (I suppose that means it's excellent grounds for a religious war)--it's just important that it be handled consistently. From what you found, IE and FF are consistent with each other, while IIS handles the two codes as identical characters. I would think that IIS would at least be on the same page with IE...but wait, that's MS we're talking about.

--
Great men are almost always bad men--Lord Acton's Corollary
Re:IIS's fault by spitzak · 2007-05-22 08:43 · Score: 2, Informative

They are there for compatibility with some Japanese and Chinese character sets, which contained most of the ascii characters in both "half" and "full width" forms. The full-width ones were twice as wide to match the square characters, which was useful for lining up columns.

This is all pointless now with proportionally-spaced fonts (and multiple fonts, you could easily select the "wide" font to print those characters instead). However Unicode had as a design requirement that translating from any common encoding to unicode and back again would be the identity transform. Thus if any character set existed with two ways of representing the same character, then there had to be two ways to represent it in Unicode. Therefore the full-width characters. This is also why Unicode has hundreds of random accented characters even though the combining characters would allow all of them to be represented easily with only a few dozen characters.
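
Both points above can be sketched with Python's standard codecs and unicodedata module. The choice of Shift_JIS as the legacy encoding and the sample string are assumptions made for illustration.

```python
import unicodedata

# Round trip: a legacy encoding that has both forms must survive
# conversion to Unicode and back unchanged.
sample = "A<\uFF1C"  # ASCII "<" followed by full-width "<"
legacy_bytes = sample.encode("shift_jis")
print(legacy_bytes.decode("shift_jis") == sample)              # True
print("<".encode("shift_jis") != "\uFF1C".encode("shift_jis"))  # True

# Precomposed accented letter versus base letter plus combining accent.
precomposed = "\u00E9"   # e with acute, one code point
combining = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT
print(precomposed == combining)                                # False
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == combining)  # True
```

Code that compares strings without normalizing first will treat the two spellings of the accented letter as different text.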
Re:IIS's fault by phasm42 · 2007-05-22 11:31 · Score: 2, Insightful

there are 2 ways of producing the < glyph: you can use character code x8B or xFF1C.
Shouldn't that be x3C?
I'm not sure if that's right or wrong, if there is a right and wrong way to handle this issue (I suppose that means it's excellent grounds for a religious war)--it's just important that it be handled consistently.
I thought about this a little more, and I think the difference will be in what it is used for. In HTML, the "<" glyph has a special meaning, so it makes sense that a different version (in this case, full-width) of the character should have a different meaning. From an application perspective, perhaps they should be the same. IIS translates the full-width version to the regular version, probably reasoning that if a full-width angle bracket was submitted to the webserver, such as in "<something@somewhere.com>", a regular one was intended. However, this isn't a safe assumption, which leads me to another question -- anyone know if this is optional behavior in IIS, and if so, is it defaulted to on or off?

--
"No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
Re:IIS's fault by DrVomact · 2007-05-22 14:02 · Score: 1

Shouldn't that be x3C?

Er...yes, of course. Apparently x8B is one of those European-style single quotes (at least that's what I think the purpose of that character is) that looks like a small left angle bracket. (There's a double version as well.)

That's what I get for posting from work, where I have to keep looking over my shoulder watching for my boss, who doesn't understand that posting to /. is research.

--
Great men are almost always bad men--Lord Acton's Corollary

MOD PARENT UP by TheRaven64 · 2007-05-22 03:36 · Score: 1

Great reply, thank you. I haven't used Windows for a few years, but it's good to keep up with this kind of thing, and I'm sure others can benefit from this information.

--
I am TheRaven on Soylent News

Don't Steal my WoW account! by Evil+W1zard · 2007-05-22 03:50 · Score: 1

So how long til we find out that there has been exploitation of this vulnerability for X number of months for the sole purpose of stealing our WoW accounts!!!

Why steal someone's real identity when you can steal their uber virtual Undead Priest identity and sell it for 16 bucks.

--
News Reporters Make Tasty Polar Bear Treats!

US-CERT != CERT by mabu · 2007-05-22 04:19 · Score: 1

Am I the only one who has noticed that since CERT partnered with the US Government, the response time on advisories has been much slower, and the details and depth of reports are less comprehensive? CERT advisories used to be a critical part of our security strategy. Now by the time they hit the mailing list (if at all), they're more of an afterthought.

Is there a better alternative to CERT now? Because it just isn't cutting it. I am familiar with Bugtraq and Security Focus. By the time CERT mentions something, usually it's actively exploited. It wasn't always like this but now the service isn't nearly as helpful to administrators.

Re:Limited impact CHECK THIS for Ip filterings by Anonymous Coward · 2007-05-22 04:28 · Score: 0

"Windows makes no distinction between privileged and unprivileged ports, so any application that can open sockets can listen on port 80. That said, every port number (and every other object in the NT kernel) has an associated ACL, so it is possible to limit them on an individual basis." - by TheRaven64 (641858) on Tuesday May 22, @06:27AM (#19218865)

Programmatically, on a "per-application basis", the other respondents outlined (@ a kernel level, using NtAPI/ZwAPI calls) a method for you to explore here:

http://it.slashdot.org/comments.pl?sid=235621&cid=19221887

Now, on this material next below?

Well, I think this might help you some as well, as to limiting ports accesses on various ports WHOLESALE (Ip stack filtering) &/or on a user-defined basis (via IP Security Policies), below after this quote of yours, next:

"I've never seen this exposed to the UI though, so I've no idea how you'd go about doing it" - by TheRaven64 (641858) on Tuesday May 22, @06:27AM (#19218865)

You have this on ports, via a GUI method as mentioned above!

There are 2 ways:

1.) Port filtering

& alternately

2.) IP Security Policies (on top of software firewalls (which also have some control here & at the application level no less) & hardware 'firewalls').

You may find this useful (or, others may, as YOU in particular may be aware of this stuff already, one never knows, but I am mentioning it here in detail anyhow for your reference, or for that of others who use Windows NT-based OS that have these features (Windows 2000/XP/Server 2003/VISTA):

FIRST - Read this article, for background. Mainly because it shows you how to limit/unleash various ports and what drivers act on them as filters, & @ what levels in the network stack for Windows:

TCP/IP Packet Processing Paths:

http://www.microsoft.com/technet/community/columns/cableguy/cg0605.mspx [microsoft.com]

IpNat.sys, IpFltDrv.sys, IpSec.sys, & TcpIp.sys in Windows 2000/XP/Server 2003/VISTA each has abilities for port restrictions!

This sounds like what you guys are looking for!

The steps below are basically how to use it (implement it) for limiting access to various ports, via GUI interfaces no less, in Windows versions noted above.

All of this & the tools noted can be used for LAYERED SECURITY in this manner (port filtering, IP Security Policies, software firewalls, & hardware NAT routers (true packet stateful inspection ones, & 'ordinary' NAT units as well)!

They ALL can be used simultaneously/concurrently, in layers, per the article from MS above entitled "TCP/IP Packet Processing Paths"

IPSecurity Policies are implemented in secpol.msc (this is the most complex of the lot, and I recommend "AnalogX's" model, as it works (but, can be troublesome with filesharing tools like EMule mind you), & can be downloaded here:

ANALOGX IP SECURITY POLICY OVERVIEW/HOW TO EXPLANATION:

http://www.analogx.com/contents/articles/ipsec.htm [analogx.com]

ANALOGX IP SECURITY POLICY TEMPLATE DIRECT DOWNLOAD:

http://www.analogx.com/files/aps-ipsec.zip [analogx.com]

(You can tune AnalogX's template model as you like above & beyond its original form for apps YOU use in particular)

AnalogX's IP Security Policy provides a good template to start with!

IP PortFiltering is done here/HOW TO, STEP-by-STEP:

Start Button -> Control Panel -> Network Connections -> Local Area Connection (or whatever you called yours) -> Properties Button -> (Next Popup dialog screen) -> Highlite "Internet Pro

Re:Hmmmm.... by AmishElvis · 2007-05-22 04:50 · Score: 1

Yeah, but according to a business week article, 1.847 billion of those chinese and indians won't buy your software, they'll just pirate it. So fuck it, let them write their own software. They can use however many bytes per character they want. Anyway, I miss plain ASCII. Much more elegant, and you can use char buffers as buffers for binary data without dicking around with lo-byte / hi-byte nonsense.

half-wit encoding? by DrVomact · 2007-05-22 04:59 · Score: 1

Full-width and half-width encoding is a technique for encoding Unicode characters.

That comes as a complete surprise to me, and I thought I knew at least a little about Unicode and other character encoding schemes. The usual methods of encoding Unicode character points are UTF-8 (variable-length scheme where characters may be represented by anything from one to six bytes), UTF-16 (fixed-width double byte encoding), UTF-32 (fixed-length 4 byte encoding), and well there's UTF-7 and other oddballs. But the closest I've ever heard of "half width" and "full width" is in connection with Asian--specifically Japanese--characters. There are Asian character sets that have "half-width" and "full-width" variants. Though this is inconsistent with the original intent of Unicode, these character variants (each pair of which are really forms of the same character that have different widths) were defined as separate Unicode characters.

Maybe everyone else knows what they're talking about, and I missed some crucial piece of information...but I don't understand how mistaking one character for another is going to break anything--you'll just have the wrong characters. Anyway, are they talking about HTML content sent via HTTP, or URLS, or what? Can anyone explain this better? Or am I not supposed to understand it?

--
Great men are almost always bad men--Lord Acton's Corollary

Re:half-wit encoding? by HeroreV · 2007-05-22 12:13 · Score: 2, Insightful

UTF-16 (fixed-width double byte encoding)
UTF-16 is a variable-width encoding. Code points from plane 0 are encoded in 16 bits and code points from planes 1 through 16 are encoded as two 16 bit surrogates. Many developers, like you, aren't aware of this, so it's very common for software to choke on UTF-16 with surrogate pairs.
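
A quick sketch of that variable width, using Python's built-in utf-16-be codec. The two sample characters are arbitrary picks: one from plane 0 and one from plane 1.

```python
bmp_char = "\u4E2D"         # a CJK ideograph in plane 0
astral_char = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, in plane 1

bmp_bytes = bmp_char.encode("utf-16-be")
astral_bytes = astral_char.encode("utf-16-be")

print(len(bmp_bytes))      # 2 (one 16-bit code unit)
print(len(astral_bytes))   # 4 (two 16-bit surrogates)
print(astral_bytes.hex())  # d834dd1e (high surrogate D834, low surrogate DD1E)
```

Software that assumes one 16-bit unit per character will split the second character in half.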
I don't understand how mistaking one character for another is going to break anything
scenario:
1) You escape a Unicode string that contains fullwidth characters. The fullwidth characters have no special properties, so they aren't escaped.
2) You translate the escaped Unicode string into ASCII. Fullwidth characters are translated into halfwidth characters. Some of those halfwidth characters, like quotes, have special properties.

The fullwidth quote was perfectly safe, because it wasn't treated like a quote. It was treated the same as an "A" or "b". But when it was translated to a "normal" quote, it went from being a plain old character to being a quote character, with a completely different meaning.

The lesson here is that you should never translate fullwidth characters into halfwidth characters unless you know whether they should be escaped or not, and you should escape them during translation if they need to be. Also, it's not a good idea to translate an escaped string between character sets.
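
The scenario can be sketched in a few lines. Here html.escape stands in for "escaping" and NFKC normalization stands in for the translation to halfwidth characters; both are assumptions chosen for illustration, not a description of any particular server.

```python
import html
import unicodedata

# Input written with full-width angle brackets (U+FF1C, U+FF1E).
user_input = "\uFF1Cscript\uFF1Ealert(1)\uFF1C/script\uFF1E"

# Wrong order: escape first, translate afterwards.
# The full-width brackets are not special, so escaping leaves them alone,
# and the later translation turns them into live markup.
wrong = unicodedata.normalize("NFKC", html.escape(user_input))

# Right order: translate first, then escape the final form.
right = html.escape(unicodedata.normalize("NFKC", user_input))

print(wrong)  # <script>alert(1)</script>
print(right)  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Escaping has to be the last transformation before the string reaches the consumer that gives those characters meaning.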

Nothing to do with Unicode by Tzutzu · 2007-05-22 05:44 · Score: 0

"Full-width and half-width encoding is a technique for encoding Unicode characters" No, these are not Unicode encoding techniques, whoever wrote that description has no clue. They are forms of Latin characters used in Japanese, with clear usage, and long history. You can find them in the JIS standards (Japanese Standards Association), and they were used by UNIX, MS-DOS, Windows 3.x, Mac OS, long-long before Unicode was even created. The problem is not Unicode, it is not even JIS; it is the vendors that have no clue about international issues that have existed for more than 20 years now.

Unlimited impact. by dolmen.fr · 2007-05-22 11:26 · Score: 1

On Unix based servers, the risk of this is mitigated by running your webserver in a chroot jail.
chroot jail doesn't protect your application against XSS.

Where do you find other services? by SkiifGeek · 2007-05-23 06:03 · Score: 1

No, you haven't been the only one to notice that CERT has some timeliness issues when it comes to reporting on threats. Other CERTs, such as AusCERT, have the same sort of problem - particularly when you consider their public notification data (separate from their paid-for disclosure lists). Accepting that it takes time to analyse and report information, and accepting that they are disclosing to their fee-paying / sponsoring clients first, the recorded dates of information discovery are often significantly incorrect. This particular report comes as quite a surprise to us. We had always considered that variable-width encoding was relatively well understood by InfoSec companies, especially those that provide services in multiple languages. It always seemed more self-evident than HTTP-Request/Response splitting, for example.

The same timeliness problem also affects moderated sources such as BT and the various SecFocus sources, where there can be a several day delay between initial disclosure and appearance on those sources (if not longer - one particular list has recently developed a delay of > 1 week for new posts). Plus, you always get the problem of identifying what sources are accurate and relevant (hint: the CitiBank Screencap argument is about 2 years too late).

So, where do you look for additional resources? You could always look at companies like Secunia, FrSIRT, eEye, Symantec, or McAfee, but it is possible to time threat disclosure so that there is an approx 72 hour delay before they pick up on the threat, and there is always the question of coverage - McAfee will always have a focus on virus, worm and some malware threats.

Or, you could always use our services (http://www.beskerming.com).

We have a number of established free and fee-based services that deliver timely, relevant and accurate information about current and emerging threats. They effectively cut out the irrelevant noise that is most of the massive amount of data (across a number of different information channels) that is Information Security disclosure.

We have no vendor affiliation, do not rely on sponsorship or advertising in order to deliver our services, and strive to be platform neutral when analysing and reporting on issues. We know that our services are already being used by companies to augment their Incident Response Team information sources (as well as to validate the data coming from their more expensive, less-timely data sources), and for some clients our services form the core of their security response strategies.

Why not get in touch? We're more than happy to have someone chat to you about your InfoSec needs.

--
InfoSec that matters, when it counts.
