A Text Message Can Crash An iPhone and Force It To Reboot
DavidGilbert99 writes with news that a bug in iOS has made it so anyone can crash an iPhone by simply sending it a text message containing certain characters. "When the text message is displayed by a banner alert or notification on the lockscreen, the system attempts to abbreviate the text with an ellipsis. If the ellipsis is placed in the middle of a set of non-Latin script characters, including Arabic, Marathi and Chinese, it causes the system to crash and the phone to reboot." The text string is specific enough that it's unlikely to happen by accident, and users can disable text notification banners to protect themselves from being affected. However, if a user receives the crash-inducing text, they won't be able to access the Messages app without causing another crash. A similar bug crashed applications in OS X a few years ago.
it's a parsing bug, what difference would sanitizing user input make...
Yes, technically there is a way to execute phone-specific code with specially crafted text messages. This is not doing that. It's not executing a program. The system is trying to abbreviate the contents of the message to display in a notification banner or on the lock screen through a widget (or whatever Apple calls them). The system is doing something it's designed to do, but due to lack of foresight or just shoddy development, they never bothered testing this with special characters. And some clown obviously found the bug. This is actually pretty big. So in the past few months I've learned about 3 important issues with iOS devices, even those running the current release: 1) They are still including a Chinese root cert that has been delisted for handing out forged Google certs, and who knows what else. 2) A specially crafted access point in range of your iOS device can cause it to become unstable and eventually crash, even if you have not connected to that network. 3) A specially crafted text message can crash your phone upon receiving it. Let's be clear, I'm not saying Android doesn't have some major issues as well, so don't try to fanboy me. But this is not what I expect from Apple. This is just bad. Lack of sanity testing? Keeping their users at risk seemingly just to say FU to Google?
https://xkcd.com/327/
I still laugh at this... am I an idiot? Don't answer that.
Getting notifications on an Apple Watch also protects the iPhone from the bug.
They have to push sales of the iWatch some ways...
Sanitize a language people use for actual communication? The text shouldn't have to change. This is a case of bad coding and nothing more.
Generally, if a carefully crafted input can cause your application to crash, similarly crafted data may be able to exploit the same bug and cause execution of malicious code. If, as is usually the case, the crash is due to a buffer overflow and I can stomp over your app's memory, I may be able to place my code in the right place and it will be executed as part of the app...
There are ways to mitigate that, such as declaring the data parts of memory non-executable, but the earlier successful exploits of buffer overflows in the image-parsing code suggest Apple is not using this.
Security, like any good work in general, is hard. Disproportionately harder than merely OK work. The real measure is not the number of bugs, really, but the speed of the fixes once the problems are discovered. Unfortunately, Apple seems to be slow at that too...
In Soviet Washington the swamp drains you.
We can't tell you. It requires Unicode characters that /. won't render.
I'd be willing to bet that the Unicode library they were using was UTF-16. Either that or they were handling Unicode in a straight binary string with something homebrewed. Both are horribly dangerous - the latter for obvious reasons, but the former in particular because it makes it easy to code something that "just works" for 99.99% of cases, but those rare 0.01% side cases involving Unicode characters that need two 16-bit code units slip through testing and come back and bite you down the road. It's amazing how many apps have incorrect behavior with 4-byte Unicode characters, on a wide range of platforms.
Both should be considered bad practice, and programming languages should evolve to standardize on UTF-8 for any string format that is to handle Unicode. C++ for example needs to introduce something along the lines of "std::ustring" that makes Unicode string ops "just work" with a UTF-8 backend, at the cost of some memory and performance vs. std::string, which should be seen as exclusively for ASCII and binary string operations. std::wstring should be obsoleted.
POTUS Witch Hunt tracker: 75 charges filed against 19 witches, 4 witches cooperating and 5 witches have pled guilty.
I think you hit the nail on the head when you observed "they never bothered testing."
As long as software vendors have zero liability for defects, we'll probably continue to see easy-to-catch and easy-to-exploit bugs in software. Even software out of large, mature dev groups that should really know better.
Confused or want karma points, probably the latter.
Apple's NSString implementation is at least 15 years old. Sure it's been improved over the years, but it's not like Apple's a newcomer to parsing Unicode.
Bugs happen.
No, it's 1985 again. Or even earlier. 1985 was when I found an escape sequence that would reboot the HP100 portable computer my boss used to access the message system on the HP 3000 minicomputer. Cue me sending an email with it in the subject. The reboot took so long that the messaging system logged you off, and handily, when you logged back in, it printed the subjects of your unread emails, and around you went again.
This kind of stuff never gets old.
It wouldn't be a parsing bug if the parser sanitized its input.
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
Why? One big reason...
What percentage of current-generation Android phones will be able to get the next 2-3 major releases of the OS?
5-10% ?
What percentage of current-generation iPhones will be able to get the next 2-3 major releases?
~100%.
The Apple phone does everything I need it to, and because of OS fragmentation on the Android side, third-party apps are typically better / more stable on iOS. (Exceptions always exist, of course.)
I'm quite happy to hear arguments to the contrary, but my broad-brush perspective is that while Apple's ecosystem is a walled garden, Android's ecosystem is the wild west.
I'm willing to accept a garden with walls if it means I don't have to constantly worry about what unpatched vulnerability is ripe for exploitation on my phone.
Years ago, I had a number of Nokia flip phones. I also converted emails to text messages and sent them to the phone (actually, probably MMS, not SMS), so that I could read my emails on a dumb phone.
However, every now and again, I would receive a "text of death". The phone would receive a text message, crash, reboot, attempt to download text messages again, crash, and so on. It continued to do this until the network gave up attempting to send that MMS message.
I had several phones of the same model and they all did this.
The real "Libtards" are the Libertarians!
Eccch ... and suddenly I'm reminded of things where you need to escape the escape of the escaping of the escape so that it doesn't keel over.
And then it becomes "Yo, Dawg, I hear you like escape characters".
It can pretty quickly devolve into really annoying things, especially when something else wants to read it, or when humans have to. I've seen cases that turned into nightmares of "\\\\\\".
Lost at C:>. Found at C.
I'm willing to accept a garden with walls if it means I don't have to constantly worry about what unpatched vulnerability is ripe for exploitation on my phone.
You mean like that vulnerability where I can send you a text message and cause your phone to crash? ;-)
I wish I were as sure of anything as some people are of everything
Here.
https://www.reddit.com/r/apple/comments/37enow/about_the_latest_iphone_security_vulnerability/
We've come to a time when reddit posts are more informative than slashdot articles.
Just to clarify, that came out rather rhetorically, but it's a genuine question: where does the parsing stop? How do you sanitise without having to protect the sanitising parser too?
Never. Ever. Do. That. Again.
Or I will mark you as a troll if I have mod points. And frankly, I hope somebody does that to this one.
If you're going to post an informative link from Wikipedia, go straight to Wikipedia; not that wikiwand crap. Using that link to a site pushing a formatting extension that changes the way Wikipedia's UI looks is trolling for users to hijack with a MitM attack. This is fucking /. The general population knows better than to install random extensions from unverified sources. Go peddle your crapware on reddit.
It's not a special character that needs escaping. It's a character that needs multiple bytes to specify the code point. The parser just isn't handling the fact that you can't just crop a character mid code point - it's operating at the byte level when it should be operating at the code point level during a crop operation.
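To make the byte-level vs. code-point-level distinction concrete, here is a minimal Python sketch. It is not Apple's code; the helper names `crop_bytes` and `crop_code_points` are made up for illustration, and it assumes the text is stored as UTF-8.

```python
# Hypothetical helpers for illustration only; this is not how iOS does it.

def crop_bytes(text: str, max_bytes: int) -> bytes:
    """Naive crop: cut the UTF-8 encoding at a fixed byte count."""
    return text.encode("utf-8")[:max_bytes]


def crop_code_points(text: str, max_points: int) -> str:
    """Crop on code point boundaries and add an ellipsis if anything was cut."""
    if len(text) <= max_points:
        return text
    return text[:max_points] + "\u2026"


# "Hi " followed by five Arabic letters; each letter is 2 bytes in UTF-8.
message = "Hi \u0645\u0631\u062d\u0628\u0627"

naive = crop_bytes(message, 4)  # 3 ASCII bytes plus half of the first letter
try:
    naive.decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False  # the cut landed inside a 2-byte sequence

safe = crop_code_points(message, 4)  # "Hi " + first letter + ellipsis
```

The byte-level crop produces a sequence that no longer decodes, while the code point crop always yields valid text. Python `str` slicing works on code points, which is why the second helper is so short.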
Just because it's unlikely with a real text string doesn't mean that any of the text is invalid for a message. The text string should still not need to be changed. The bug only affects notifications, and it's clear that the text can be displayed just fine in conversation view.
This is almost certainly due to splitting multibyte characters on sub-character boundaries. That's a design problem, not a sanitizing problem.
It's not that NSString itself is broken. It's that 99.99% of the time an NSString is one 16-bit code unit per glyph, so apps using it rarely test the case where it's two code units per glyph. So a person goes in and writes an app that inserts a new character at a particular offset and it works 99.99% of the time, but if the new character happens to get stuck in the middle of a multi-code-unit glyph, the program breaks.
The documentation is no help. First off, it lies:
As we all should know, that's simply not true. Unfortunately, a lot of people don't know better. Unicode is not a universal, uniform encoding scheme that is 16 bits per character. Even UTF-16 isn't that.
characterAtIndex returns a 16-bit integer, so obviously it has no way to actually represent wider Unicode characters. The length method returns not the number of characters on the screen but the number of code units, which is different, and highly misleading to programmers. They're, again, the same thing 99.99% of the time, but those rare cases where they're not generally slip through testing. And this is why UTF-16 is such a hazardous encoding to use.
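A quick way to see the gap between code units and code points is to count both. The sketch below is Python, not Objective-C, so `utf16_length` and `utf16_unit_at` are hypothetical stand-ins that only mimic what a code-unit-based length and a 16-bit index lookup would report.

```python
def utf16_length(text: str) -> int:
    """Number of UTF-16 code units needed to store the text."""
    return len(text.encode("utf-16-le")) // 2


def utf16_unit_at(text: str, index: int) -> int:
    """The 16-bit code unit at a code unit index (stand-in for a 16-bit char lookup)."""
    data = text.encode("utf-16-le")
    return int.from_bytes(data[2 * index: 2 * index + 2], "little")


bmp = "A\u00e9"          # two characters, both inside the Basic Multilingual Plane
astral = "A\U0001F600"   # "A" plus an emoji outside the BMP

print(len(bmp), utf16_length(bmp))        # code points and code units agree
print(len(astral), utf16_length(astral))  # 2 code points, 3 code units
print(hex(utf16_unit_at(astral, 1)))      # a high surrogate, not a character
```

For the second string the lookup at index 1 hands back half of a surrogate pair, which is exactly the case that tends to slip through testing.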
Yes, NSString is old. And that's part of the problem. It was made at a time where many thought that unicode was only going to be 16 bits. It hasn't aged well. And it's caused a lot of bugs over its time. And now I'd bet that it or something similar has created a brand new iPhone-equivalent of Winnuke.
Programmers really need two types of strings, and only two, for the lion's share of tasks. One, binary strings, where a char is always 8 bits and operations can be optimized to heck and back. And two, Unicode strings, where a char always represents a whole Unicode character that you would display, and the count of characters represents the count of display characters and so forth. None of this "99.99% of the time it's one thing, but every so often it's another...". That's asking for bugs.
And to sanitise the input, what process would you need to perform on the input? Is it called parsing? And would you need to sanitise the sanitiser's parser...
Yes, you could do it with a simpler parser, e.g. delete all non-Latin characters from user input because the people who designed our parser were noobs. Or go on a case-by-case basis: this character is used internally for such and such, so if user input has this character then put an escape character in front of it or whatever.
For example, a fun gag on a new Linux user is to create a file called "-rf" and ask them to delete it via a command line. If they naively type "rm -rf" then it gets parsed as an option for the rm command rather than a filename. There are, of course, several ways to deal with that sort of thing which involve sanitizing the filename. I suppose it might be even more fun to create a file or directory named "--help & rm -rf $HOME/*". Point being that if you use something internally to execute commands, you'd better be damned sure that user input can't bypass your parser and execute arbitrary commands. It's not an easy thing, and if you can't handle it, just reject the input that's too complicated for you (e.g. forbid interesting characters in filenames).
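For the filename gag, two of the usual defenses can be sketched in Python. The helper names `rm_argv` and `rm_shell_line` are hypothetical; the sketch only builds the commands and never runs them. It relies on the common convention that "--" ends option parsing, and on `shlex.quote` from the standard library.

```python
import shlex
from typing import List


def rm_argv(filename: str) -> List[str]:
    """Build an argv list for rm. The "--" marks the end of options,
    so a name like "-rf" is treated as a file, not as flags."""
    return ["rm", "--", filename]


def rm_shell_line(filename: str) -> str:
    """If a shell string is unavoidable, quote the name so the shell
    cannot split it or interpret metacharacters such as & and $."""
    return "rm -- " + shlex.quote(filename)


nasty = "--help & rm -rf $HOME/*"
print(rm_argv("-rf"))
print(rm_shell_line(nasty))
```

Passing an argv list (for example to `subprocess.run` without `shell=True`) avoids the shell entirely, which is usually the simpler option of the two.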
What percentage of current-generation iPhones will be able to get the next 2-3 major releases?
~100%.
How many of those phones, after receiving 2 or 3 major releases, will be so buggy or sluggish as to be unusable? Considering the amount of bitching every time it happens, I'd say non-zero.
Yeah, that was the point I was trying to get at. Most people take the privacy of their most intimate secrets for granted - they keep it in their email, on their mobile devices etc. And while these things are pretty well guarded, from a technological standpoint a single bug can lead to the mass subversion of a whole ecosystem. It seems to me that the day all of Gmail (or some other major email provider, or the private data on everyone's iPhones, etc.) is hacked and made public will be a historic event: "The day privacy died." It should be alarming, but it's one step away. As many posted in this thread, bugs happen...
I immediately tried to crash every phone of every coworker who has an iPhone within earshot of me and it didn't work.
I too enjoy getting fired over stupid shit. Do you have any other suggestions I might try?
Dewey, what part of this looks like authorities should be involved?
Yep, they have been UTF-16 for a long time. And Unicode has been widely broken for a long time. It's not a coincidence.
Someone on StackExchange did some tests last year, adding in 4-byte unicode characters in common applications and seeing how they behaved. The results were really bad:
I've had more than my share of these sort of experiences too.
UTF-16 is dangerous, and should be phased out as much as possible. Where absolutely needed for performance reasons, it should be an internal representation only, hidden from the developer as much as possible. In particular, "length" functions should return the actual string length in characters, not code units; indexing functions should take character offsets, not code unit offsets; and returned "single characters" exposed to the developer should be of a format capable of handling multi-code-unit glyphs. Anything involving working with actual singular UTF-16 code units should only be available as a "for advanced users only, use at your own risk" functionality.
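As a rough sketch of what "hide the code units" could look like, the Python function below turns a list of UTF-16 code units into whole code points, so that length and indexing on the result are per character. The name `code_points_from_utf16` is hypothetical, and passing lone surrogates through unchanged is an assumption of this sketch, not a recommendation.

```python
from typing import List


def code_points_from_utf16(units: List[int]) -> List[int]:
    """Combine surrogate pairs so each list element is one code point."""
    points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding of a surrogate pair.
            points.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP code unit, or a lone surrogate passed through unchanged.
            points.append(unit)
            i += 1
    return points


units = [0x41, 0xD83D, 0xDE00]          # "A" followed by one emoji
points = code_points_from_utf16(units)  # two code points
```

With this layer in place, `len(points)` is the character count and `points[1]` is the whole emoji rather than half of it.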
If you need to truncate after X characters, you don't just truncate after X*8 bits. Sure, that works if you're using an 8-bit encoding, but we're talking about multi-language script, variable-length encodings like UTF-8 here. You truncate after X code points when dealing with those, it's not a fixed number of bytes, and sanitizing your input (which I'm sure they're already doing) does not protect you against cutting a multi-byte character in half if you're counting bytes for truncation.
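If the limit really has to be expressed in bytes, one option is to back the cut up to the nearest code point boundary. This is a minimal Python sketch with a hypothetical name, `truncate_utf8`, and it assumes the input is already valid UTF-8.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut valid UTF-8 to at most max_bytes without splitting a code point."""
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # Continuation bytes look like 0b10xxxxxx. If the first dropped byte is
    # one of those, the cut is inside a sequence, so back up to its lead byte.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


text = "\u4f60\u597d\u4e16\u754c".encode("utf-8")  # four CJK characters, 3 bytes each
print(truncate_utf8(text, 4).decode("utf-8"))      # only the first character survives
```

The result may be a few bytes shorter than the budget, but it always decodes cleanly.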
APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
Judging from the negative mod I got for my remark, the answer is yes, I was right, and yes it was inconvenient for the Fandroids out there.
Have nice day.
"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)
It's not a special character that needs escaping. It's a character that needs multiple bytes to specify the code point. The parser just isn't handling the fact that you can't just crop a character mid code point - it's operating at the byte level when it should be operating at the code point level during a crop operation.
Too bad I don't have mod points because you are absolutely correct! As more and more code points are defined, the number of bytes needed to represent characters increases. Their abbreviation mechanism should at least recognize surrogate pairs and combining characters.
And since some characters have different lengths, even counting characters might not be good enough. (Can't use max_bytes=80, nor max_chars=40.)
The message could be "displayed" in memory with the chosen font and size to calculate its length, and the string then truncated in character mode to fit within the limited area.
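Measuring rendered width needs a real font engine, but the combining character part can be sketched in a few lines of Python. The helper `truncate_keep_marks` is hypothetical, and treating every code point in general category M as attached to the preceding character is a simplification of the full grapheme cluster rules (it ignores joiners and conjuncts, for instance).

```python
import unicodedata


def is_mark(ch: str) -> bool:
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def truncate_keep_marks(text: str, max_points: int) -> str:
    """Truncate to at most max_points code points, never leaving a
    combining mark separated from its base character."""
    if len(text) <= max_points:
        return text
    cut = max_points
    # If the first dropped code point is a mark, the cut is inside a cluster.
    # Back up until the cut sits in front of a base character.
    while cut > 0 and is_mark(text[cut]):
        cut -= 1
    return text[:cut]


word = "cafe\u0301s"                       # "e" followed by a combining acute accent
hindi = "\u0928\u092e\u0938\u094d\u0924\u0947"  # Devanagari with a virama and a vowel sign
```

Cutting `word` at 4 code points drops the whole accented "e" instead of leaving a bare "e" with its accent stripped off.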
No, the problem is code that pretends that illegal UTF-8 sequences magically don't exist!
For some reason UTF-8 turns otherwise intelligent programmers into complete morons. Here is another example from Apple. Let me state some rules about how to deal with UTF-8:
1. Stop thinking about "characters"!!!! This is a byte stream. The ONLY reason to think about a "character" is because you are DRAWING it on a display designed for a human to read, and humans do think about "characters". All other software either does not care, or is concerned with far more complex patterns (such as regexp and editors that deal with words and sentences), these second ones are not helped at all by an intermediate translation.
2. It is TRIVIAL to detect that the byte sequence you are looking at is not a valid UTF-8 character. In this case draw a replacement for exactly ONE byte and then try the next byte to see if it is a valid sequence. Do not skip more. There must be one error per byte so that the maximum number of good characters is preserved and so that a sequence with errors can be parsed bidirectionally without looking more than a few bytes ahead, and so that it is possible to search for error patterns. It also means there are only 128 different errors, not millions.
3. NEVER "translate to Unicode" (ie UTF-16) because this will be a lossy conversion of these invalid sequences and thus you have not preserved the original data. I'm sorry but Microsoft really screwed us here. Best recommendation is to write a wrapper around the filesystem calls and translate from UTF-8 to UTF-16 at the last moment, using U+DCxx as a translation for the error bytes (this is lossy but filenames already are, due to case independence, Apple's normalization, and even on Unix where "./foo" and "foo" are the same file).
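Rules 2 and 3 can be sketched in Python. The function name `decode_one_error_per_byte` is hypothetical. The second half uses Python's built-in `surrogateescape` error handler, which maps each undecodable byte to a code point in the U+DC80 to U+DCFF range; that is Python's mechanism, shown here as one existing take on the U+DCxx idea, not anything Apple or Microsoft is known to use.

```python
def decode_one_error_per_byte(data: bytes) -> str:
    """Decode UTF-8, emitting exactly one U+FFFD per invalid byte."""
    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            bad = pos + err.start  # err.start is relative to the slice
            out.append(data[pos:bad].decode("utf-8"))  # the valid prefix
            out.append("\ufffd")   # replace one byte only
            pos = bad + 1          # retry from the very next byte
    return "".join(out)


def to_text(raw: bytes) -> str:
    """Keep invalid bytes as U+DCxx code points instead of dropping them."""
    return raw.decode("utf-8", errors="surrogateescape")


def to_bytes(text: str) -> bytes:
    """Reverse of to_text: U+DCxx code points become the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8
name = to_text(raw)
```

In this sketch the round trip `to_bytes(to_text(raw))` gives back the original bytes, so the invalid data is preserved rather than silently repaired.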
This is blatantly obvious if you substitute "words" for "characters" and imagine how you would write a program to deal with text strings. Words are also composed of multiple bytes in a row. For some reason nobody seems to crash on misspelled words, and they manage to concatenate and split strings and make whole file systems and diff programs and all kinds of other fancy text manipulation without having to translate the text so that each word is a fixed-sized integer. Amazing!