Slashdot Mirror


Names That Break Computers (bbc.com)

Reader Thelasko writes: The BBC has a story about people with names that break computer databases. "When Jennifer Null tries to buy a plane ticket, she gets an error message on most websites. The site will say she has left the surname field blank and ask her to try again." Thelasko compares it to the XKCD comic about Bobby Tables, though it's a real problem that's also been experienced by a Hawaiian woman named Janice Keihanaikukauakahihulihe'ekahaunaele, whose last name exceeds the 36-character limit on state ID cards. And in 2010, programmer John Graham-Cumming complained about web sites (including Yahoo) which refused to accept hyphenated last names. Programmer Patrick McKenzie pointed the BBC to a 2011 W3C post highlighting the key issues with names, along with his own list of common mistaken assumptions. "They don't necessarily test for the edge cases," McKenzie says, noting that even when filing his own income taxes in Japan, his last name exceeds the number of characters allowed.

8 of 372 comments

  1. Interesting read about names by angel'o'sphere · · Score: 3, Informative

    http://www.kalzumeus.com/2010/...

    Nothing to say, read it.

    There is similar stuff about Dates, Time, Time Zones etc. on the internet. I should make a collection of it.

    But I can't figure out how to write into my /. journal, nor how to use the old /. bookmark feature.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    1. Re:Interesting read about names by Athanasius · · Score: 3, Informative

      Someone already did: http://spaceninja.com/2015/12/...

  2. Aw, come on ... by Alain+Williams · · Score: 3, Informative

    Her name in a (web) form would be put into a database field as a string ... the word NULL is a keyword, not the string 'NULL'. I am not saying that this did not happen, I just find it hard to see how a string and a database keyword could possibly be confused.

    It would be: INSERT INTO Customer (Surname) VALUES ('NULL')

    not: INSERT INTO Customer (Surname) VALUES (NULL)
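
    A minimal sketch of where the confusion can still creep in, assuming Python's standard sqlite3 module and an in-memory database. The Customer table mirrors the statements above; naive_sql is a hypothetical helper that illustrates one plausible bug, not a claim about what any airline site actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Surname TEXT)")

# Bound parameters keep the value's type: a Python str stays a string,
# and only Python's None is stored as SQL NULL.
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", ("Null",))
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", (None,))

rows = conn.execute(
    "SELECT Surname, Surname IS NULL FROM Customer ORDER BY rowid"
).fetchall()


def naive_sql(surname):
    """Hypothetical buggy helper: pastes text into SQL and treats the
    word 'null' (any case) as the marker for a missing value."""
    if surname.strip().lower() in ("", "null"):
        return "INSERT INTO Customer (Surname) VALUES (NULL)"
    return "INSERT INTO Customer (Surname) VALUES ('%s')" % surname


print(rows)
print(naive_sql("Null"))
```

    With bound parameters the surname is stored as an ordinary string. The confusion only appears in code like naive_sql, where a layer in front of the database decides that the text "null" means "no value".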

  3. Re:Hyphens in last names? by wisnoskij · · Score: 3, Informative

    More to the point, care about the future. Do you really want your children's children to be called Robert Smith-Schmidt-Maier-Kilgore? Not picking a single last name is just a huge FU to all future generations.

    --
    Troll is not a replacement for I disagree.
  4. Re:Updated Policy: by rjune · · Score: 5, Informative

    After the IRS started requiring social security numbers to claim children as dependents on tax returns, about 7 million of those dependents vanished. In this case, it appears that the move was justified. http://www.snopes.com/business...

  5. Re:Hyphens in last names? by Anonymous Coward · · Score: 2, Informative

    wisnoskij,

    I trust you understand that hyphenated last names in English have a definite form.
    For example, Dr. Martin Lloyd-Jones used both his mother's and his father's last names in a hyphenated form.
    When children come about, one of the names, usually the mother's last name, is dropped.
    So Dr. Lloyd-Jones's child would become Robert Jones.
    Now Robert Jones may want his mother's name and become Robert Smythe-Jones.
    Only in America would the atrocity of a multiply hyphenated name stand a chance of occurring since Americans don't know customs or history.

  6. Re:LoL by Megol · · Score: 3, Informative

    Byte size has varied a lot in the past and could conceivably vary in the future too (but it's unlikely). Even the definition of byte as a concept has varied: most systems define the byte as the smallest addressable element, while some defined it as the character size, etc. Word-addressed machines very seldom used "byte" to describe the addressable element size, but some had word-sized characters... It's a mess.

    A more correct name is octet which by definition consists of 8 binary digits.

  7. Re:Updated Policy: by Anonymous Coward · · Score: 5, Informative

    They do exist; they are called string parsers.

    The real issue is that practically *any* integer could be a valid text character in any given input because of the number of codepages that exist. Then you have to take the trouble of identifying the specific codepage used by the input to know what can be safely excluded. Then you need to deal with non-printable control characters. Which amounts to reading raw bytes from the input to decide what to interpret or print as a character, and how. (Example, UTF-8: if the first bit of a byte is zero, the character is one byte long and the remaining 7 bits are the character data. Otherwise the first byte of the character starts with a run of 1 bits equal to the number of bytes that compose the character, terminated by a zero bit, and every following byte of that character starts with the bits 10. Misinterpret a bit or get misaligned, and you start interpreting garbage.) Etc.
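
    The UTF-8 rule can be sketched roughly as follows, in Python. The function names are made up for illustration, and the sketch only reads the length from the lead byte; it does not check continuation bytes, overlong encodings or truncated input, which a real decoder must do.

```python
from typing import List


def utf8_sequence_length(lead: int) -> int:
    """Return the byte count announced by a UTF-8 lead byte."""
    if (lead & 0b1000_0000) == 0:
        return 1  # 0xxxxxxx: one byte, 7 bits of data
    if (lead & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx
    if (lead & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx
    if (lead & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx
    # 10xxxxxx is a continuation byte; anything else is not valid here.
    raise ValueError("not a valid UTF-8 lead byte: 0x%02x" % lead)


def split_characters(data: bytes) -> List[bytes]:
    """Split a byte string into per-character chunks using lead bytes only."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


data = "é€😀a".encode("utf-8")
print(split_characters(data))
```

    Starting one byte late (data[1:]) lands on a continuation byte and raises ValueError, which is the "get misaligned" case.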

    Add to all of this complexity a short time span to develop libraries (i.e. it needs to be done three days ago) and a minimal budget ("What do you mean we need to support diacritics? No, we're not spending that money to add support for it. Ship the damn thing without it; if they want it they can pay for an upgrade."), and you can see why these problems exist. Mostly it's the idea that the support isn't needed for everyone, so they can get away with not implementing it and blame any issues that crop up on the end user / some bug / a bad connection / etc.

    Sadly, TFA is yet another call to attention for this issue, which ultimately will not be addressed unless it gets "fixed" by an unrelated upgrade / patch being rolled out that just so happens to fix these kinds of issues, in addition to whatever the real purpose of the upgrade / patch was.

    PS: Read the summary, if "NULL" is considered a valid error result from a string parser, then that parser needs to be rewritten to support proper error codes. Practically anything could be valid input and returning the error status as part of the damn output string is ASKING for trouble. Why? Because then you need a parser to check the error status, so the original parser just made more work for the caller, and guess what? Something tells me the caller didn't check for the EXACT error string correctly, and thus interpreted "Null" as "NULL". Hence the error given to the user.
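
    A rough Python sketch of that failure mode, and of keeping the error status out of the output string. All function names here are hypothetical, and the case-insensitive check is an assumption about what the caller might have done.

```python
from typing import Optional


def parse_surname_sentinel(raw: str) -> str:
    """Flawed design: reports a missing surname as the string 'NULL'."""
    cleaned = raw.strip()
    return cleaned if cleaned else "NULL"


def caller_sentinel(raw: str) -> str:
    result = parse_surname_sentinel(raw)
    # Loose match on the magic string: a real surname can trip it.
    if result.upper() == "NULL":
        return "error: surname left blank"
    return "ok: " + result


def parse_surname(raw: str) -> Optional[str]:
    """Out-of-band status: None means missing, any string is data."""
    cleaned = raw.strip()
    return cleaned or None


def caller(raw: str) -> str:
    result = parse_surname(raw)
    if result is None:
        return "error: surname left blank"
    return "ok: " + result


print(caller_sentinel("Null"))
print(caller("Null"))
```

    The second pair keeps "missing" as a value that no user can type, so any surname, including "Null", passes through as data.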