Slashdot Mirror


Names That Break Computers (bbc.com)

Reader Thelasko writes: The BBC has a story about people with names that break computer databases. "When Jennifer Null tries to buy a plane ticket, she gets an error message on most websites. The site will say she has left the surname field blank and ask her to try again." Thelasko compares it to the XKCD comic about Bobby Tables, though it's a real problem that's also been experienced by a Hawaiian woman named Janice Keihanaikukauakahihulihe'ekahaunaele, whose last name exceeds the 36-character limit on state ID cards. And in 2010, programmer John Graham-Cumming complained about web sites (including Yahoo) which refused to accept hyphenated last names. Programmer Patrick McKenzie pointed the BBC to a 2011 W3C post highlighting the key issues with names, along with his own list of common mistaken assumptions. "They don't necessarily test for the edge cases," McKenzie says, noting that even when filing his own income taxes in Japan, his last name exceeds the number of characters allowed.

20 of 372 comments

  1. Updated Policy: by fuzzyfuzzyfungus · · Score: 3, Funny

    Users with unacceptably deviant names will be assigned GUIDs for standardized interaction with all systems. Thank you for your compliance with this exciting and mandatory efficiency initiative.

    1. Re:Updated Policy: by rjune · · Score: 5, Informative

      After the IRS started requiring Social Security numbers to claim children as dependents on tax returns, about 7 million of them vanished. In this case, it appears that the move was justified. http://www.snopes.com/business...

    2. Re:Updated Policy: by Anonymous Coward · · Score: 5, Informative

      They do exist; they are called string parsers.

      The real issue is that practically *any* integer could be a valid text character in any given input, because of the number of codepages that exist. So you have to identify the specific codepage used by the input before you know what can be safely excluded. Then you need to deal with non-printable control characters. All of which amounts to reading raw bytes from the input and deciding how to interpret or print each one as a character. (Example, UTF-8: in a multi-byte character, the number of leading 1 bits in the first byte, terminated by a zero bit, is the number of bytes that compose that character, and every following byte starts with the bits 10. If the character is a single byte, the first bit is zero and the remaining 7 bits are the character data. Misinterpret a bit or get misaligned, and you start interpreting garbage.) Etc.
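      For what it's worth, the UTF-8 lead-byte rule can be sketched in a few lines of Python. The function names here are made up for illustration, and this is only a rough sketch: it checks the bit patterns but does not reject overlong encodings or surrogates the way a real decoder must.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with lead_byte occupies.

    Raises ValueError for bytes that cannot start a sequence.
    """
    if not 0 <= lead_byte <= 0xFF:
        raise ValueError("not a byte value")
    if (lead_byte & 0b1000_0000) == 0:
        return 1  # 0xxxxxxx: single byte, 7 bits of character data
    if (lead_byte & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx
    if (lead_byte & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx
    if (lead_byte & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx
    # 10xxxxxx is a continuation byte; 11111xxx never appears in UTF-8.
    raise ValueError(f"byte 0x{lead_byte:02X} cannot start a UTF-8 sequence")


def split_utf8(data: bytes) -> list[bytes]:
    """Split UTF-8 data into one bytes object per encoded character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunk = data[i:i + n]
        # Every byte after the lead byte must look like 10xxxxxx.
        if len(chunk) != n or any((b & 0b1100_0000) != 0b1000_0000 for b in chunk[1:]):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        chunks.append(chunk)
        i += n
    return chunks
```

      Starting in the middle of a character (the "misaligned" case) lands on a continuation byte, so split_utf8(b"\xab") raises instead of silently producing garbage.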

      Add all of this complexity to a short time span to develop libraries (i.e. it needs to be done three days ago) and a minimal budget ("What do you mean we need to support diacritics? No we're not spending that money to add support for it. Ship the damn thing without it, if they want it they can pay for an upgrade."), and you can see why these problems exist. Mostly it's the idea that the support isn't needed for everyone, so they can get away with not implementing it and blame any issues that crop up on the end user / some bug / a bad connection / etc.

      Sadly, TFA is yet another call to attention for this issue, which ultimately will not be addressed unless it gets "fixed" by an unrelated upgrade / patch that just so happens to fix these kinds of issues, in addition to whatever the real purpose of the upgrade / patch was.

      PS: Read the summary. If "NULL" is considered a valid error result from a string parser, then that parser needs to be rewritten to support proper error codes. Practically anything could be valid input, and returning the error status as part of the damn output string is ASKING for trouble. Why? Because then you need a parser to check the error status, so the original parser just made more work for the caller, and guess what? Something tells me the caller didn't check for the EXACT error string correctly, and thus interpreted "Null" as "NULL". Hence the error given to the user.
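      A toy illustration of that last point, in Python. Everything here (the PASSENGERS table and the function names) is hypothetical, and it is a guess at the failure mode, not a description of what any real airline site does.

```python
from typing import Optional

# Hypothetical record store, used only for illustration.
PASSENGERS = {"A1": "Null", "B2": "Graham-Cumming"}


def surname_with_sentinel(record_id: str) -> str:
    """Anti-pattern: reports 'not found' through the same channel as the data."""
    return PASSENGERS.get(record_id, "NULL")


def surname_out_of_band(record_id: str) -> Optional[str]:
    """Reports 'not found' as None, which no user-supplied string can equal."""
    return PASSENGERS.get(record_id)


def is_missing_sentinel(record_id: str) -> bool:
    # A sloppy caller that does not check for the EXACT sentinel string.
    return surname_with_sentinel(record_id).upper() == "NULL"


def is_missing_out_of_band(record_id: str) -> bool:
    # The status is carried separately from the text, so "Null" is just a name.
    return surname_out_of_band(record_id) is None
```

      With the sentinel version, is_missing_sentinel("A1") reports Jennifer Null's surname as missing; the out-of-band version only reports a record as missing when it really is absent.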

  2. Interesting read about names by angel'o'sphere · · Score: 3, Informative

    http://www.kalzumeus.com/2010/...

    Nothing to say, read it.

    There is similar stuff about Dates, Time, Time Zones etc. on the internet. I should make a collection of it.

    But I can't figure out how to write into my /. journal, nor how to use the old /. bookmark feature.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

    1. Re:Interesting read about names by Athanasius · · Score: 3, Informative

      Someone already did: http://spaceninja.com/2015/12/...

  3. Hyphens in last names? by jader3rd · · Score: 4, Funny

    Just pick one already.

    1. Re:Hyphens in last names? by wisnoskij · · Score: 3, Informative

      More to the point, care about the future. Do you really want your children's children to be called Robert Smith-Schmidt-Maier-Kilgore? Not picking a single last name is just a huge FU to all future generations.

      --
      Troll is not a replacement for I disagree.

    2. Re:Hyphens in last names? by JustOK · · Score: 3, Funny

      They dealt with that during the Name2K crisis

      --
      rewriting history since 2109

    3. Re:Hyphens in last names? by Anonymous Coward · · Score: 3, Funny

      Y-TO-K STATUS REPORT

      Our staff has completed the 18 months of work on time and on budget. We have gone through every line of code in every program in every system. We have analyzed all databases, all data files, including backups and historic archives, and modified all data to reflect the change.

      We are proud to report that we have completed the "Y-to-K" date change mission, and have now implemented all changes to all programs and all data to reflect your new standards:

      Januark, Februark, March, April, Mak, June, Julk, August, September, October, November, December

      As well as: Sundak, Mondak, Tuesdak, Wednesdak, Thursdak, Fridak, Saturdak

      I trust that this is satisfactory, because to be honest, none of this 'Y to K' problem has made any sense to me. But I understand it is a global problem, and our team is glad to help in any way possible.

      And what does the year 2000 have to do with it? Speaking of which, what do you think we ought to do next year when the two digit year rolls over from 99 to 00?

      We'll await your direction.

  4. Aw, come on ... by Alain+Williams · · Score: 3, Informative

    Her name in a (web) form would be put into a database field as a string ... the word NULL is a keyword, not the string "NULL". I am not saying that this did not happen; I just find it hard to see how a string and a database keyword could possibly be confused?

    It would be: INSERT INTO Customer (Surname) VALUES ('NULL')

    not: INSERT INTO Customer (Surname) VALUES (NULL)
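    To make the distinction concrete, here is a small sketch using Python's built-in sqlite3 module with bound parameters. The Customer table and Surname column are the ones from the statements above; the in-memory database and the sample rows are only assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Surname TEXT)")

# Bound parameters are passed separately from the SQL text, so the string
# "Null" is stored as text and is never parsed as the NULL keyword.
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", ("Null",))

# Python's None is what maps to a real SQL NULL.
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", (None,))

# An apostrophe needs no manual escaping when the value is bound.
conn.execute(
    "INSERT INTO Customer (Surname) VALUES (?)",
    ("Keihanaikukauakahihulihe'ekahaunaele",),
)

rows = conn.execute(
    "SELECT Surname, Surname IS NULL FROM Customer ORDER BY Id"
).fetchall()
for surname, is_null in rows:
    print(repr(surname), "is SQL NULL" if is_null else "is a string")
```

    At the database layer the two really are different things, so if "Null" gets rejected, the confusion is more likely in application code sitting in front of the query.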

    1. Re:Aw, come on ... by Alain+Williams · · Score: 4, Insightful

      Have you ever seen an application (web or otherwise) that tested an input field against the value "NULL"? Yes: test if it is NULL (note the missing quote marks) or if it is the empty string, but not the string "NULL". I can, just about, accept that some programmer high on something illegal might have done so once, but the impression given by the article is that this happens a lot.

      I find this hard to believe. If it were true then the applications involved would be open to worse exploits than simple SQL injection.

  5. Teh by MichaelSmith · · Score: 4, Funny

    An Asian co-worker of mine whose family name is Teh has found that his name is almost impossible to type in tools like Microsoft Word, which autocorrect Teh to The.

  6. And then there's filters... by jeffasselin · · Score: 3, Funny

    I've had issues a few times with filters on names rejecting mine for supposedly referring to a body part...

    --
    If he explores all forms and substances Straight homeward to their symbol-essences; He shall not die.

    1. Re:And then there's filters... by blindseer · · Score: 4, Funny

      I heard a story from a college friend of mine about someone in his family, his dad I think, getting in some trouble while drinking with some Army buddies. So these three friends go out and have a few too many and are picked up by the local police for public intoxication or something similar. The cop asked for their names. They replied in turn, Dicks, Cox, and Bahl (pronounced like "ball"). The cop thought they were trying to be funny. They were hauled off to the station and were only released after the First Sergeant showed up to verify their names.

      --
      I am armed because I am free. I am free because I am armed.

  7. Ridiculous Premise by BoFo · · Score: 4, Insightful

    Data cannot break computers. Data whose contents differ from the preconceptions of application programmers can cause errors in poorly designed, written, or tested applications.

  8. Programers can not even figures by Anon-Admin · · Score: 3, Interesting

    Most programmers cannot even figure out how to validate a f--ing email address, let alone a person's name.

    How about they fix the email problem first and stop rejecting my email address ^_^@mydomain
    Yes, you can put that on my domain listed below and email me, and yes it is a valid email address as per the RFC.
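    One common way around this, sketched in Python: stop trying to validate the full grammar, do only a permissive shape check, and let a confirmation email do the real verification. The function name is made up for illustration, and this is deliberately not an RFC validator.

```python
def looks_like_email(address: str) -> bool:
    """Permissive sanity check, not an RFC 5322 validator.

    Accepts anything shaped like local@domain and leaves real verification
    to actually sending a confirmation message.
    """
    # Split on the LAST "@" so an "@" inside the local part does not confuse us.
    local, at, domain = address.rpartition("@")
    return bool(at) and bool(local) and bool(domain)
```

    With this check, looks_like_email("^_^@mydomain") is accepted, while input with no "@" or with an empty side is rejected.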

    1. Re:Programers can not even figures by Anonymous Coward · · Score: 3, Insightful

      > Most programmers can not even figure out how to validate a f--ing email address, let alone a persons name.
      >
      > How about they fix the email problem first and stop rejecting my email address ^_^@mydomain
      > Yes, you can put that on my domain listed below and email me, and yes it is a valid email address as per the RFC.

      Because the spec for email addresses is ridiculously complex. The problem isn't that programmers can't validate email addresses, it's that they can't write good specs for email addresses in the first place.

    2. Re:Programers can not even figures by StormReaver · · Score: 3, Insightful

      Programmers who write database-aware programs that choke on the literal words "null" or "blank", or that can't accept an apostrophe, are simply incompetent or just plain stupid. There is absolutely no excuse for that kind of idiocy.

  9. Re:LoL by Megol · · Score: 3, Informative

    Byte size has varied a lot in the past and could conceivably vary in the future too (but it's unlikely). Even the definition of the byte as a concept has varied: most systems define the byte as the smallest addressable element, while some defined it as the character size, etc. Word-addressed machines very seldom used "byte" to describe the addressable element size, but some had word-sized characters... It's a mess.

    A more correct name is octet which by definition consists of 8 binary digits.

  10. Re:I solved the problem with my long complicated n by Sarten-X · · Score: 3, Insightful

    As long as your last name isn't a single letter. That catches my pseudonym fairly regularly.

    Back when I worked in medical data, I encountered real people with single-character names. It happens for real names, too. For programmers, the rule is simple: Don't use names for anything except your application's convenience, and don't have any restrictions on them. Don't even require their existence.
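    A rough sketch of what that rule might look like in Python. The Person type and clean_name helper are hypothetical, and the Unicode NFC normalization is my own assumption about what "application's convenience" could include; it is not a restriction on what people may enter.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    # Hypothetical record type: identity comes from an opaque key, never the name.
    person_id: int
    display_name: Optional[str] = None  # a name is not required to exist


def clean_name(raw: Optional[str]) -> Optional[str]:
    """Tidy a name for storage without rejecting anything.

    No length limits, no character whitelist, no reserved words.
    """
    if raw is None:
        return None
    name = unicodedata.normalize("NFC", raw).strip()
    return name or None  # blank input means "no name given", not an error
```

    Single-character names, "Null", hyphens and apostrophes all pass through unchanged, and Person(7, clean_name("")) is simply a person without a name.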

    --
    You do not have a moral or legal right to do absolutely anything you want.