Slashdot Mirror


Falsehoods Programmers Believe About Names

Jamie points out this interesting article about how hard it is for programmers to get names right. Since software ultimately is used by and for humans, and we humans are pretty tightly linked to our names (whatever the language, spelling, or orthography), this is a big deal. This piece notes some of the ways that names get mishandled, and suggests rules of thumb (in the form of anti-suggestions) to encourage programmers to handle names more gracefully.

122 of 773 comments

  1. As the author of RFC 2100... by jra · · Score: 4, Interesting

    I found the piece very interesting.

    Though my inability to post this comment appears to have outlived the slashdotting of the site.

    1. Re:As the author of RFC 2100... by OneAhead · · Score: 3, Funny

      That doesn't make sense. I can read your comment, therefore your inability to post it has gone away. The site is still slashdotted. Ergo, the slashdotting of the site has outlived your inability to post.

      Oh wait... RFC2100...

    2. Re:As the author of RFC 2100... by TheLink · · Score: 2, Interesting

      I dunno, the guy just lists out reasons why you can't uniquely identify people by names. e.g. "some people don't have names".

      Well that's why Governments start handing out people national ID numbers[1]. Then even if you aren't who you claim you are, at least the poor data entry person has something to key in and can actually type it in on his/her keyboard ;).

      [1] As for foreigners without a passport number or national ID, please wait here for those friendly guys in uniforms...

    3. Re:As the author of RFC 2100... by patio11 · · Score: 2, Funny

      After Reddit got done with the site yesterday, I decided "Sure, why not upgrade to Wordpress 3.0. I'll just turn off caching for a little while and..."

    4. Re:As the author of RFC 2100... by pushf+popf · · Score: 5, Funny

      I found the article to be contrived and pointless.

      Yes, there are people and entities that do not fit into a normal name slot in a database, and no, I don't care at all because it hasn't been a problem for anything I've written in the last thirty years. When someone pops up and says "My name is this thing I drew on the sidewalk using chipmunk poop, and it doesn't fit in your database", I'll say "Yes, you're right it doesn't, then go have a beer.

      You can't handle every edge case in the universe because you'll never actually release anything.

    5. Re:As the author of RFC 2100... by Anonymous Coward · · Score: 5, Funny

      If you program like you talk, you'll never ship anyway, because it'll never compile.

      Unexpected EOF in String constant:

      "Yes, you're right it doesn't, then go have a beer.

      You can't handle every edge case in the universe because you'll never actually release anything.

    6. Re:As the author of RFC 2100... by vikstar · · Score: 2, Insightful

      I would find it more interesting if it contained approximate statistics for each point. I will not spend time designing a system which caters for the 2 individuals having some weird exception to the detriment of millions of others who adhere to a much more useful schema. I.e., sure you can just have Name and accept a 2048-length UTF-16 string to accommodate everyone, or skip a few outliers and have given and last names with certain restrictions to catch user error in the input.

      --
      The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
    7. Re:As the author of RFC 2100... by fbjon · · Score: 2

      I found the article to be insightful. It shows that there is no point in messing around with assumptions about names. Just put one field in the form that takes up to e.g. 128KB that accepts any string of data, including the empty string, and call it "Your real full name". Put in a dropdown for the encoding if you need it, another field for low-ascii transliterated name if you need something recognizable by most, and separate fields for official registered name if you really need that sort of thing.
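
A minimal sketch of the single-field approach described above, assuming Python 3, where `str` is already Unicode (so the encoding dropdown is left out). The class name `PersonName` and its field names are hypothetical, chosen only for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    """Hypothetical record: one free-form name plus optional extras."""

    # The name exactly as the person entered it; the empty string is allowed.
    full_name: str = ""
    # Optional low-ASCII transliteration, supplied by the person, never derived.
    ascii_name: Optional[str] = None
    # Optional official registered name, only if the system really needs it.
    official_name: Optional[str] = None

    def display(self) -> str:
        """Return the entered name, or a placeholder when it is empty."""
        return self.full_name or "(no name given)"


if __name__ == "__main__":
    people = [
        PersonName("Jennifer 8 Lee"),
        PersonName("周潤發", ascii_name="Chow Yun-Fat"),
        PersonName(""),
    ]
    for person in people:
        print(person.display())
```

Nothing here splits, reorders, or re-cases the name; anything beyond storing and displaying it is left to the person who typed it.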

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    8. Re:As the author of RFC 2100... by TheLink · · Score: 2, Insightful

      Fake, duplicate or not, numeric IDs are still easier to key in ;).

      As most slashdotters will know, if your data records are in a digital computer, it's pretty hard to avoid being linked to at least one number.

      Even if you don't have national ID numbers, someone could go around claiming to be you, or the "System" could still confuse you with someone else.

      At least accidental/erroneous duplicate IDs are easier to detect.

      Of course if some Big Bad Ruler/Government starts issuing citizens with Citizen Certificates that have to be renewed every year then that's a problem :).

    9. Re:As the author of RFC 2100... by mcgrew · · Score: 2, Interesting

      This one was humorous: "surely people's names are diverse enough such that no million people share the same name"

      Anybody who's ever run a mainframe database knows that's just stupid. Back in 1997 Altavista found six people with my exact full name on the internet. In 1997!

      I hate doing a name lookup on my company's database -- do you have any idea how many people in Chicago alone are named "Johnson"? I once joked that they should rename it Johnson City.

      And in the US, there are people with more than one SSN, and people who have none at all. I know a guy with two SSNs; he somehow got in the middle of a feud between the Outlaws and the Hell's Angels about twenty years ago and a judge ordered him to change both his name and SSN, so it was not only done with the government's blessing, but on their orders.

    10. Re:As the author of RFC 2100... by Maxo-Texas · · Score: 2, Funny

      One of the interesting problems a friend of mine in the food industry deals with is duplicate social security numbers combined with duplicate first and last names.

      At some restaurants every one of the half dozen servers has the same first and last name.

      Oh.. and sometimes their social security numbers are not consistent from week to week either.

      --
      She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
  2. Sounds like people need to fix thier names by h4rr4r · · Score: 3, Funny

    Who the hell has numbers in there name?

    1. Re:Sounds like people need to fix thier names by Anonymous Coward · · Score: 5, Funny

      3Jane Tessier-Ashpool, for one.

    2. Re:Sounds like people need to fix thier names by ChipMonk · · Score: 2, Informative

      Chad 8 5, for another.

    3. Re:Sounds like people need to fix thier names by Khakionion · · Score: 5, Funny

      homonyms?

      Hey, learn a little tolerance, bud.

      --
      OMG! Wau!
    4. Re:Sounds like people need to fix thier names by 0100010001010011 · · Score: 4, Informative

      Mr. Ochocinco

      For those who aren't privy to American football: apparently some guy with the number 85 renamed himself 85.

    5. Re:Sounds like people need to fix thier names by spitzig · · Score: 5, Informative

      Chinese, written in pinyin, has numbers. Pinyin is how Chinese is typed. The numbers represent tones and every word in Chinese has a tone.

    6. Re:Sounds like people need to fix thier names by Kitkoan · · Score: 2, Funny
      --
      Attention... all grammer nazi"s! Is they're anything; wrong with: my post,
    7. Re:Sounds like people need to fix thier names by notthepainter · · Score: 2, Interesting

      Bo3b Johnson

      http://www.linkedin.com/pub/bo3b-johnson/13/846/a52

      The 3 is silent. And no, I don't know him but I know someone who does.

    8. Re:Sounds like people need to fix thier names by DarrenBaker · · Score: 2, Informative

      OCHOCINCO!!!!

    9. Re:Sounds like people need to fix thier names by PopeRatzo · · Score: 5, Funny

      Who the hell has numbers in there name?

      Well, for starters, Thurston B. Howell, III. Malcolm X, and Jimmy Two Times.

      --
      You are welcome on my lawn.
    10. Re:Sounds like people need to fix thier names by BluBrick · · Score: 4, Informative

      Bo3b? Presumably, the 3 is silent because he wants to point out how individual he is (ironically, by rehashing a joke made over 50 years ago.)

      From Tom Lehrer's introduction to "We will all go together when we go":

      I am reminded at this point of a fellow I used to know whose name was Henry, only to give you an idea of what an individualist he was he spelt it H-E-N-3-R-Y. The 3 was silent, you see.

      --
      Ahh - My eye!
      The doctor said I'm not supposed to get Slashdot in it!
    11. Re:Sounds like people need to fix thier names by Fnordulicious · · Score: 4, Informative

      You are a little confused. Please reread the Wikipedia article on Hanyu Pinyin. It normally uses diacritics - namely macron, acute, hacek ("caron"), and grave - to represent the Mandarin tones other than neutral tone. Numbers have been used by people who lack diacritics on their typewriter or input system, but using numbers is not standard in Hanyu Pinyin, instead it's a kludge.

      That said, if your input form doesn't allow some guy to type in his name with tone number suffixes on a US Windows keyboard layout where he lacks access to diacritics, then you're not a very thoughtful programmer.

      Also, people who make software with input fields that accept Unicode but specify a particular font that has a tiny character repertoire suck.

      Oh, and Slashdot sucks even more for only supporting ASCII and stripping everything else.

    12. Re:Sounds like people need to fix thier names by Miseph · · Score: 3, Informative

      He legally changed his name because fans refer to him as "Ochocinco" and he wanted to put it on his jersey, but because the NFL hates both fans and lulz, they only allow a person's legal surname to appear there. Rather than lay down and take it, he gave them a massive middle finger by changing his name.

      The NFL actually has a surprising number of players that behave like btards, it's rather amusing.

      --
      Try not to take me more seriously than I take myself.
    13. Re:Sounds like people need to fix thier names by Speare · · Score: 2, Interesting

      Love the literary reference. In a much earlier sci-fi story, This Perfect Day, every citizen has a nameber, an identifier that is part name, part number. There are only four male names, four female names, and these are combined with a multi-digit code to make the ID unique. Ever since online forums started suggesting logins like "MaryBeth131" I can't help but think of namebers.

      --
      [ .sig file not found ]
    14. Re:Sounds like people need to fix thier names by sonamchauhan · · Score: 3, Funny

      And King James III

    15. Re:Sounds like people need to fix thier names by fishexe · · Score: 3, Informative

      Who the hell has numbers in there name?

      Former New York Times writer Jennifer 8 Lee does.

      --
      "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
    16. Re:Sounds like people need to fix thier names by kenj0418 · · Score: 2

      The NFL actually has a surprising number of players that behave like btards, it's rather amusing.

      I'd be a bit more concerned with the Michael Vicks and Leonard Littles of the NFL than some guy who changes his name. (dogfighting and drunk-driving-with-fatal-accident for those not in the US or otherwise not aware)

    17. Re:Sounds like people need to fix thier names by aiht · · Score: 3, Funny

      What about Arthur "Two Sheds" Jackson?
      Nah, I guess that doesn't count 'cause it's written as a word.

    18. Re:Sounds like people need to fix thier names by shutdown+-p+now · · Score: 3, Interesting

      That said, if your input form doesn't allow some guy to type in his name with tone number suffixes on a US Windows keyboard layout where he lacks access to diacritics, then you're not a very thoughtful programmer.

      Or you code in some language where Unicode support is not there by default, and you have to jump through hoops to get it working.

      Like, say, PHP. Or stable Ruby.

      Which might explain a lot of things about why so much of the Net is largely broken I18N-wise even on the most basic level, come to think of it.

    19. Re:Sounds like people need to fix thier names by fishexe · · Score: 5, Informative

      Pinyin is how Chinese is typed. The numbers represent tones...

      No it isn't. Pinyin is how Chinese is romanized. Chinese is typed using an IME to produce Han characters. Pinyin is typically only used to represent pronunciation, for example in dictionaries, and to represent names in contexts where romanization is necessary (such as international contexts, like Western media), as well as a few other limited contexts. Writing Chinese in Pinyin, even with tone marks, is often inadequate because each syllable/tone combination corresponds to several characters, and the distinction between them is easily lost in romanization. For example, Zhang Zilin and Zhang Ziyi do not have the same surname, even though both are Zhang1 in pinyin.

      --
      "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
    20. Re:Sounds like people need to fix thier names by arekq · · Score: 2, Informative

      Pinyin is just one way Chinese is typed.
      There are other ways to type Chinese characters, for example, the Cangjie input method, which is based on the graphological aspect of the characters instead of their sound.

    21. Re:Sounds like people need to fix thier names by deniable · · Score: 4, Funny

      Yep, there's rampant homophonia around here.

    22. Re:Sounds like people need to fix thier names by Kitkoan · · Score: 3, Funny

      What about Arthur "Two Sheds" Jackson?

      A tragic accident happened, and now he's Arthur 'No Sheds' Jackson. Very tragic.

      --
      Attention... all grammer nazi"s! Is they're anything; wrong with: my post,
    23. Re:Sounds like people need to fix thier names by mogness · · Score: 3, Interesting

      If you are a guy (not an unreasonable assumption on /.), I think it's really strange that online forums are suggesting you the name "MaryBeth131"
      What were your parents thinking?

      --
      that's teh shizzle bizzle
    24. Re:Sounds like people need to fix thier names by Anonymous Coward · · Score: 2, Funny

      Johnny 5

    25. Re:Sounds like people need to fix thier names by nacturation · · Score: 3, Funny

      What about Arthur "Two Sheds" Jackson?

      A tragic accident happened, and now he's Arthur 'No Sheds' Jackson. Very tragic.

      Last I heard he was living next to No Shed Sherlock.

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    26. Re:Sounds like people need to fix thier names by Kitkoan · · Score: 2, Funny

      If you are a guy (not an unreasonable assumption on /.), I think it's really strange that online forums are suggesting you the name "MaryBeth131" What were your parents thinking?

      Maybe they were listening to 'A Boy Named Sue'?

      --
      Attention... all grammer nazi"s! Is they're anything; wrong with: my post,
    27. Re:Sounds like people need to fix thier names by droopycom · · Score: 2, Informative

      The Queen of England

      God save her from programmers!

    28. Re:Sounds like people need to fix thier names by Malc · · Score: 2, Interesting

      A lot of mobile phones, including my Samsung phone, use Pinyin as a way of entering Chinese characters. For each word/syllable I enter, there's a sometimes long list of matching Chinese characters to select from.

      Pinyin is also used on things like street signs in some of the larger cities, which gives Western people at least some chance of recognising names.

    29. Re:Sounds like people need to fix thier names by vux984 · · Score: 2, Interesting

      "Bo3b"

      Never seen that one but I've heard of a: !bo

      The leading exclamation is apparently a...lol i dunno what its called, but its apparently one of the hollow popping/clicking sounds you see in some African languages.

    30. Re:Sounds like people need to fix thier names by snowgirl · · Score: 2, Interesting

      Funny, I actually use the Chinese IME on Windows... it is called "Chinese (simplified) - Microsoft Pinyin - New Input Style"

      And I do actually type in characters using Pinyin, because they have adaptive algorithms that guess at what the most likely character to follow is. They guess well, but it also displays 9 choices at a time, that you select with number keys.

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    31. Re:Sounds like people need to fix thier names by Anonymous Coward · · Score: 2, Interesting

      http://en.wikipedia.org/wiki/Perri_6

      This is how you become top listed in every citation index.

    32. Re:Sounds like people need to fix thier names by Hognoxious · · Score: 3, Funny

      Without it he'd get three offtopic mods, one overrated, and two replies saying [citation needed]

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  3. Rip out the vowels by jimmydevice · · Score: 2, Funny

    and let god sort them out...

    1. Re:Rip out the vowels by bkpark · · Score: 3, Funny

      and let god sort them out...

      If written Hebrew is any indication, God doesn't bother with vowels either, apparently.

  4. I've been dealing with this for years. by Wonko+the+Sane · · Score: 4, Interesting

    I am fortunate enough to be the child of a professional smart-ass who intentionally gave all his children two middle names so that we would not fit into the computer systems of the era.

    When I grew up my parents used my first middle name as a "given nickname" (it's actually in quotation marks on my birth certificate). So most of the time when I give my name for something I use my "given nickname" as my first name. Unless I feel like using my legal first name as my first name in which case I use that. There are probably four or five different versions of my name attached to my SSN in various different databases.

    I've also got a suffix: III. I don't have two ancestors with the exact same name as me, but since the various parts come from two different relatives my parents settled on III.

    1. Re:I've been dealing with this for years. by Graff · · Score: 5, Funny

      I prefer the story of this mom.

    2. Re:I've been dealing with this for years. by arekq · · Score: 2, Interesting

      A similar issue happens with Chinese names.
      Most Chinese people have one word or two word names.
      If a person has a two-word name and fills it in on a form as "Chow, Yun Fat", the system would likely take "Yun" as the middle name and "Fat" as the first name, or vice versa, which often reduces the name to "Chow, Yun" or "Chow, Fat".
      One way to reduce this confusion is to use a hyphen to join the words, like "Chow, Yun-Fat".

    3. Re:I've been dealing with this for years. by JaredOfEuropa · · Score: 4, Interesting

      I have an apostrophe in my surname, and you'd be surprised at how many systems break when I try to enter it... even in this day and age where character escaping and scrubbing for SQL are readily available in most languages, often even in the standard libraries. And you'd be surprised at how many systems return a response that hints at something like that cartoon being possible...

      Even worse are the systems that seem to accept the response, then break down internally. I've had some bitter arguments over reservations at car rental and airline check in counters.
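
For what it's worth, a minimal sketch of why an apostrophe need not break anything, assuming Python's standard `sqlite3` module and an in-memory database. The table name `customers`, the helper `store_and_fetch`, and the sample surname are hypothetical:

```python
import sqlite3


def store_and_fetch(name: str) -> str:
    """Insert a name with a bound parameter and read it back unchanged."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"
        )
        # The ? placeholder passes the value separately from the SQL text,
        # so an apostrophe is never treated as a string delimiter.
        conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        row = conn.execute(
            "SELECT name FROM customers WHERE name = ?", (name,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(store_and_fetch("O'Brien"))
```

With bound parameters there is no escaping or scrubbing step to forget; the name goes in and comes out byte for byte.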

      --
      If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
    4. Re:I've been dealing with this for years. by unkiereamus · · Score: 4, Interesting

      I actually knew a girl in HS who came from a very traditional Mexican family, as a result, she had 7 middle names.

      Here's the thing, in California, in order to be issued a driver's license, your full name had to appear on the card, and there was insufficient space for all of her middle names, as a result, in order to get a driver's license, she had to have her name legally changed.

      --
      I needed a sig so people would know who I am, but I was too drunk to make something witty, so you get this instead.
  5. Slashdotted already? by RenQuanta · · Score: 5, Informative

    After just 15 minutes of the story being posted?

    Wow, that's gotta be a personal best for /. (or, the site is a wee bit underpowered... ;)

    Here's the Google cache in the meanwhile: http://webcache.googleusercontent.com/search?q=cache:http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

    1. Re:Slashdotted already? by RenQuanta · · Score: 2, Funny

      Not so... back in the day, such a slashdotting was quite regular. Surely you remember that

      Yeah, I might, if my memory weren't failing with age. ;-)

  6. Text only cache by SuperKendall · · Score: 2, Informative

    Even the cache needs tweaking to load.

    Text only version.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  7. Article text by Anonymous Coward · · Score: 5, Informative

    John Graham-Cumming wrote an article today complaining about how a computer system he was working with described his last name as having invalid characters. It of course does not, because anything someone tells you is their name is--by definition--an appropriate identifier for them. John was understandably vexed about this situation, and he has every right to be, because names are central to our identities, virtually by definition.

    I have lived in Japan for several years, programming in a professional capacity, and I have broken many systems by the simple expedient of being introduced into them. (Most people call me Patrick McKenzie, but I'll acknowledge as correct any of six different "full" names, and many systems I deal with will accept precisely none of them.) Similarly, I've worked with Big Freaking Enterprises which, by dint of doing business globally, have theoretically designed their systems to allow all names to work in them. I have never seen a computer system which handles names properly and doubt one exists, anywhere.

    So, as a public service, I'm going to list assumptions your systems probably make about names. All of these assumptions are wrong. Try to make less of them next time you write a system which touches names.

    1. People have exactly one canonical full name.
    2. People have exactly one full name which they go by.
    3. People have, at this point in time, exactly one canonical full name.
    4. People have, at this point in time, one full name which they go by.
    5. People have exactly N names, for any value of N.
    6. People's names fit within a certain defined amount of space.
    7. People's names do not change.
    8. People's names change, but only at a certain enumerated set of events.
    9. People's names are written in ASCII.
    10. People's names are written in any single character set.
    11. People's names are all mapped in Unicode code points.
    12. People's names are case sensitive.
    13. People's names are case insensitive.
    14. People's names sometimes have prefixes or suffixes, but you can safely ignore those.
    15. People's names do not contain numbers.
    16. People's names are not written in ALL CAPS.
    17. People's names are not written in all lower case letters.
    18. People's names have an order to them. Picking any ordering scheme will automatically result in consistent ordering among all systems, as long as both use the same ordering scheme for the same name.
    19. People's first names and last names are, by necessity, different.
    20. People have last names, family names, or anything else which is shared by folks recognized as their relatives.
    21. People's names are globally unique.
    22. People's names are almost globally unique.
    23. Alright alright but surely people's names are diverse enough such that no million people share the same name.
    24. My system will never have to deal with names from China.
    25. Or Japan.
    26. Or Korea.
    27. Or Ireland, the United Kingdom, the United States, Spain, Mexico, Brazil, Peru, Russia, Sweden, Botswana, South Africa, Trinidad, Haiti, France, or the Klingon Empire, all of which have "weird" naming schemes in common use.
    28. That Klingon Empire thing was a joke, right?
    29. Confound your cultural relativism! People in my society, at least, agree on one commonly accepted standard for names.
    30. There exists an algorithm which transforms names and can be reversed losslessly. (Yes, yes, you can do it if your algorithm returns the input. You get a gold star.)
    31. I can safely assume that this dictionary of bad words contains no people's names in it.
    32. People's names are assigned at birth.
    33. OK, maybe not at birth, but at least pretty close to birth.
    34. Alright, alright, within a year or so of birth.
    35. Five years?
    36. You're kidding me, right?
    37. Two different systems containing data about the same person will use the same name for
    1. Re:Article text by feepness · · Score: 5, Funny

      Nice rules. Still wouldn't handle my name.

    2. Re:Article text by bertok · · Score: 2, Interesting

      Reminds me of a classic database developer nightmare story that I heard:

      A local school was receiving complaints that two students were getting the exam results and the like mixed up.

      The two students? Identical twins living in the same house, with the same name.. John Smith Jnr.

      Apparently their father was John Smith Snr, and the whole "Senior / Junior" thing has been done for generations of "John Smiths", and it was a tradition and all, and we can't just break a tradition just because we had twin boys.. so... we'll name them both John Smith Jnr.

    3. Re:Article text by JaredOfEuropa · · Score: 2, Funny

      Who complained? The parents? If so, the only proper response would have been: "Well, what did you expect, numbnuts?"

      --
      If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
  8. Dumbfuck summary by oldhack · · Score: 5, Insightful

    Names of what?!

    --
    Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
    1. Re:Dumbfuck summary by bigstrat2003 · · Score: 4, Informative

      Yeah, TFS is very ambiguous about that. Turns out that TFA is talking about names of people, and the pitfalls you can run into when allowing someone to enter their name into a system.

      --
      "16MB (fuck off, MiB fascists)" - The Mighty Buzzard
    2. Re:Dumbfuck summary by thePowerOfGrayskull · · Score: 2
      Indeed. Reading the summary, I thought it was some kind of article on how programmers can't remember names (I know I can't...)

      But basically, it's some dude whining about how - because there is no single set of rules that can be universally applied to all names - no systems handle them correctly. That seems kind of self-evident to me; computers are rules-based creations. If you can't define the rules, it sure is hard to code for them. Blaming the programmers is stupid - as his own article shows. (e.g. "[don't assume] Names are case-sensitive. [don't assume] Names are case-insensitive".)

      Not sure why this made it to slashdot -- it's just a rant.

    3. Re:Dumbfuck summary by sjames · · Score: 2, Informative

      Many of the systems that handle names the worst are the ones that try to be "clever", doing things like insisting on first (and only first) letter capitalized, rejecting digits, refusing to allow middle name (or initial) to be blank, always using the first letter of the Middle name and adding a period after or refusing to accept a single character as a name, and many more sins. The "dumb" systems are actually more graceful about it.

      The best policy is to accept what is entered. Even that tends to fail if someone has more than 3 names. Then there's the Spanish naming conventions.
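
A rough sketch of what "accept what is entered" could look like, assuming Python 3 and its standard `unicodedata` module. The function name `clean_name` is hypothetical, and whether to apply even NFC normalization is a judgment call rather than something the article prescribes:

```python
import unicodedata


def clean_name(raw: str) -> str:
    """Return the name as entered, minus surrounding whitespace.

    Deliberately does NOT change case, reject digits, require a minimum
    length, or split the value into first/middle/last parts.
    """
    # NFC only unifies equivalent code point sequences (e.g. "e" followed
    # by a combining acute accent becomes a single precomposed character).
    return unicodedata.normalize("NFC", raw).strip()


if __name__ == "__main__":
    for entered in ["  Jennifer 8 Lee ", "bo3b johnson", "X", ""]:
        print(repr(clean_name(entered)))
```

Digits, single characters, all-lowercase input, and the empty string all pass through untouched, which is exactly what the "dumb" systems get right.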

  9. Article makes wrong assumption about software. by Vellmont · · Score: 5, Insightful

    Software is NOT designed to be perfect and cover every case. Have a numeral in your name? Too bad. Need some names to be case sensitive, and others case insensitive? Sucks to be you. Have a 200 character name that doesn't fit in the 100 characters the designers thought no crazy person would ever have? Tough.

    I started reading through the list, and it's just ridiculous. There are a few good points, like the assumptions that names don't change or that names are unique. But they're so obvious that the vast majority of the time it's not a big problem. More often it's just a matter of training the data edit/entry folks how to change someone's name, or how to not assume a name is a sole identifier.

    But assuming the worst and trying to design a system that'll allow people's names to be Chinese characters when you don't do business in China, have presence in China, or ever ever plan to? That's ridiculous. Software doesn't have to be perfect out of the chute. It should be adaptable though if some unforeseen shortcoming becomes a larger problem. Gee, I guess if you ever chose to do business in China and need Chinese character names you might have to re-write part of the damn software. Oh well, that's what software developers are FOR!

    If you don't even HAVE a name, then I submit you're crazier than the artist formerly known as the artist formerly known as Prince. At least HE had a name, though it was an unpronounceable symbol. The world can't accommodate every possibility, and software is no exception.

    --
    AccountKiller
    1. Re:Article makes wrong assumption about software. by lennier · · Score: 4, Insightful

      But assuming the worst and trying to design a system that'll allow people's names to be Chinese characters when you don't do business in China, have presence in China, or ever ever plan to? That's ridiculous.

      Or sell in New Zealand, or Australia, or anywhere else in the Pacific, or deal with immigrants, or be used by anyone who has a Chinese name?

      This is the Internet now. Welcome to it.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    2. Re:Article makes wrong assumption about software. by Vellmont · · Score: 2, Insightful


      Then it's designed to fail

      Anything ever designed is designed to fail. This applies to bridges, the pyramids, and all software. This belief you have that software doesn't have to be maintained is as ridiculous as the idea that a bridge or any physical structure doesn't have to be maintained. Software lives and dies like anything else. Nothing lives forever.

      --
      AccountKiller
    3. Re:Article makes wrong assumption about software. by Trepidity · · Score: 5, Insightful

      Most Chinese emigrants to countries that use a Roman alphabet are perfectly capable of writing their name in Roman characters if they need to. If they weren't, they wouldn't have been able to get visas and get into the country in the first place.

    4. Re:Article makes wrong assumption about software. by PrecambrianRabbit · · Score: 3, Insightful

      You're overreacting (I know, I know, "welcome to the Internet"). Software should behave in some sane, safe manner given any input. Sometimes, the sane thing to do is to throw an error, or say "Sorry, Dave, I can't do that."

      In particular, systems don't necessarily have to shoehorn insane data into their processing. To use a relevant example, simply because Prince wants to upload a PNG in the "Name" field doesn't mean that the software has to let him. Rejecting this case does not doom said software system to "become a botnet" or "leave a trail of broken data."

    5. Re:Article makes wrong assumption about software. by jrumney · · Score: 2, Insightful

      Generally when building a form that asks for a name, I create a first name field and a surname field.

      And you fail right there. For some people, their first name is their surname. Others don't have a surname. Some of those without a surname may use a patronym or matrinym as part of their full name, but you never use it to address them without their personal name. Some people have a first name and second name that always go together, so parsing a first name out of the full name, or disallowing whitespace in the first name field is another common fail.

      Names are complex. Don't assume it doesn't matter because your database is only intended for local use, because unless you live somewhere as closed as North Korea, there are immigrants in your town that break your assumptions.

    6. Re:Article makes wrong assumption about software. by canajin56 · · Score: 3, Funny

      So, you shouldn't deploy software that doesn't, as the retarded article says, properly handle people with names that are over 65,000 characters long, where some portions are case sensitive, so if that part is lowercase instead of upper, that's a different name. But other parts are case insensitive, so it's still the same name even in all caps. Oh, did I also mention that some of the letters in the name aren't part of any character set, so they can't even be typed in the first place? Because the article says that assuming names can even be text at all is wrong and your software is broken if you made that stupid assumption. (See Prince) PS, that person with the 65 thousand letter long name? He has 8,000 aliases and needs to enter all of them, better hope you allow that many aliases. Also, there is a huge subset of his name in common with a friend of his, but they are not related, it is sheer coincidence, you better not assume relation just because only two people in the world have the same last few hundred words of their name in common! Also, his brother has no name. Not, like his name is "No name" or he goes by "The artist formerly known as Prince", as in, his name is just the empty string, so your software better fucking not have name as a required field!

      --
      ASCII stupid question, get a stupid ANSI
    7. Re:Article makes wrong assumption about software. by Draek · · Score: 5, Funny

      You're not a programmer, are you?

      Oh, don't worry, I can tell.

      --
      No problem is insoluble in all conceivable circumstances.
    8. Re:Article makes wrong assumption about software. by shutdown+-p+now · · Score: 2, Insightful

      You can't "just use Unicode" and do no validation, though, unless you're perfectly fine with all sorts of bidi control characters showing up random places

      That is an output validation problem, not an input validation one. So go ahead and strip it on output, when (and if) you need to do it.

      or nonprintable characters causing two different names to look identical

      And why would that be a problem (any more so than people with identical names in general)?

      And yes, I would say that if someone can't invent something to put in a "family name or surname" field, then too bad. They would also find themselves unable to travel to most countries, since most countries' immigration forms have such a box

      Having a box is not a problem. By all means, keep one. The problem is making input there mandatory. Do immigration forms make it so?
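
      A minimal Python sketch of the "strip it on output" idea. The function name `display_name` and the exact set of characters removed are assumptions made for illustration, not anything the thread specifies. The stored value is never modified; only the copy that gets printed is filtered. Stripping directional marks can change how some legitimate names render, so treat this as a starting point only.

```python
import unicodedata

# Explicit bidirectional control characters: marks, embeddings,
# overrides and isolates. This set is an illustrative assumption.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u061c",                                          # Arabic letter mark
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def display_name(stored: str) -> str:
    """Return a copy of the stored name with bidi controls and
    C0/C1 control characters (category Cc) removed."""
    return "".join(
        ch
        for ch in stored
        if ch not in BIDI_CONTROLS and unicodedata.category(ch) != "Cc"
    )


# The database keeps the name exactly as entered.
stored = "Ann\u202eeiram"
# Only the rendered copy is filtered.
shown = display_name(stored)
```

      Because the filter runs at output time, tightening or loosening the character set later needs no data migration.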

    9. Re:Article makes wrong assumption about software. by SEE · · Score: 4, Informative

      Is it so hard for you to just use Unicode

      Unicode doesn't cover the full set of CJK characters used for names, nor does it cover all writing systems in actual use.

    10. Re:Article makes wrong assumption about software. by shutdown+-p+now · · Score: 2, Informative

      That's true also. However, Unicode covers much more ground immediately with practically no effort required from the programmer - but once you go beyond that, the complexity increases very rapidly (since you have to start dealing with multiple different encodings simultaneously etc).

      As well, new Unicode versions come out regularly which expand its reach, and new frameworks/databases update their Unicode support every now and then, so if you start using it today, it'll be much easier (in many cases, completely free) for you to expand coverage in the future in a backwards-compatible way.

      In contrast, if you, say, use Latin-1 today, you'll either have to start dealing with multiple encodings much sooner, or to recode the database eventually.

    11. Re:Article makes wrong assumption about software. by Concerned+Onlooker · · Score: 2, Funny

      "Software doesn't have to be perfect out of the shoot."

      Are you saying software is like a bullet? Or perhaps you meant chute.

      --
      http://www.rootstrikers.org/
    12. Re:Article makes wrong assumption about software. by digitig · · Score: 2, Insightful

      But assuming the worst and trying to design a system that'll allow people's names to be Chinese characters when you don't do business in China, have presence in China, or ever ever plan to? That's ridiculous.

      No, but making a conscious design decision not to accommodate names in non-Roman character sets, and documenting that in the specification, is sensible.

      If you don't even HAVE a name, then I submit you're crazier than the artist formerly known as the artist formerly known as Prince.

      The discussion gives examples of people who don't have names, such as somebody born into slavery in the Sudan. In that case, it's not the person who is crazy. Do you need to account for that in your data entry? Well, it depends. If it's online sales then the chances are that that person will never be a customer. If you're doing a missing persons database for a relief agency, though, you probably need to find a way to account for them. So no, you don't have to address all of the cases that the author mentions, but if you're smart you'll at least consider whether you should in your particular context.

      --
      Quidnam Latine loqui modo coepi?
  10. Yeah, article is kind of asinine by Trepidity · · Score: 5, Insightful

    He's essentially arguing that, because names vary a lot and are complex, your software should never do anything useful with them. Sorry, but that's a stupid answer. In a lot of systems, being able to sort by surname may well be more important than being able to handle people who claim they have no surname.

    Of course, you shouldn't gratuitously do stupid things, and interfaces should aim to be relatively clear. But most people can figure out how to enter their names into relatively standardized forms, and those that don't should probably figure out how.

    1. Re:Yeah, article is kind of asinine by snowgirl · · Score: 2, Informative

      I'm going to throw in my agreement here. Yes, there are people who put numerals in their names, or non-unicode point characters, or various other things, but there just isn't a reason to foist that on other people.

      There is frustration about things like, "people have N number of names", and "names don't change" which are good and valid points... but some of the things are just like "dude... seriously..."

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
  11. Thanks, Prince by BlueBoxSW.com · · Score: 4, Informative

    Thanks, Prince

  12. Re:I don't know what the complaint is about? by scdeimos · · Score: 4, Interesting

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    Are you sure? What if "Mac Clean" is actually somebody's first and last names?

    I know plenty of people whose legal name is a single word, such as "Alex", "Max" or "Virgil." Would your system put that in the first_name, middle_name or surname column? Storing names and using them sensibly is hard, as TFA acknowledges.

    You'd think that e-mail addresses by comparison would be simpler, but I have a hard time trying to register my e-mail address with sites that won't allow even simple things like "+", "-" or "." characters in the local part.

  13. Re:I don't know what the complaint is about? by Dragonslicer · · Score: 4, Interesting

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    I assume you left out a "not" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are not the same.

  14. Irish need not log in? by thepainguy · · Score: 5, Insightful

    My last name is O'Leary and over the past 5 years web sites have not gotten any better, and arguably have gotten worse, at handling the apostrophe in my last name

    Help me Slashdot, you're my only hope.

    1. Re:Irish need not log in? by kenj0418 · · Score: 5, Funny

      You've probably compiled a lengthy list of sites vulnerable to SQL-injection. I'm sure you could sell that to someone somewhere to compensate you for your pain and suffering.

    2. Re:Irish need not log in? by shutdown+-p+now · · Score: 4, Funny

      Help me Slashdot, you're my only hope.

      Do you think they'd let you change your legal name to something like "O';DROP TABLE users"? If so, you should be all set.
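
      For anyone wondering what the apostrophe jokes are about: the usual fix is to pass the name as a bound parameter instead of pasting it into the SQL text. A minimal sketch using Python's standard `sqlite3` module; the `users` table and the helper names are invented for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def add_user(conn: sqlite3.Connection, name: str) -> int:
    # The ? placeholder sends the name as data, so quotes and
    # semicolons in it are never parsed as SQL.
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return cur.lastrowid


def find_users(conn: sqlite3.Connection, name: str) -> list:
    rows = conn.execute(
        "SELECT name FROM users WHERE name = ? ORDER BY id", (name,)
    )
    return [row[0] for row in rows]


conn = make_db()
add_user(conn, "O'Leary")
add_user(conn, "O';DROP TABLE users")
```

      Both names are stored verbatim and the table survives. No escaping function is involved, because the driver keeps the query text and the values separate.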

  15. not surprising by Phoenix+Dreamscape · · Score: 2, Interesting

    Considering how many entry forms still don't allow '+' in an e-mail address (or, worse, allow it in the sign-up box but not in the unsubscribe box), and considering how many banks still restrict you to an 8-character password, does it come as any surprise that they have difficulty with something that isn't defined in an RFC?

  16. Well Duh by Saint+Stephen · · Score: 4, Insightful

    First thing I learned back in 1993 when I got started.

    1) George Foreman has five boys named George Foreman. Your database better be able to handle that.
    2) Your database better be able to handle Cher (no last name).
    3) People are not required to have Social Security numbers. (it's an optional program - you don't have to participate).
    4) Not everyone's last name starts with a capital letter.
    5) Mexican people's names break ASCII (the tilde n).
    6) People named O'Grady have a hard time getting their name in a database sometimes and have a hard time getting their name passed via a URL sometimes and generally mess stuff up.
    7) People from Sri Lanka will break your name length limits.
    8) Some people's name is only a single letter.
    9) Some people go by their middle name god damn it! :-)

    1. Re:Well Duh by Eskarel · · Score: 2, Interesting
      1. Don't use names as a unique identifier, they're not.
      2. Cher has a last name, as most likely did Homer and Virgil and everyone else, their last names might have been "from _____ or the ______", but they still had one.
      3. It's illegal to use SSN as a unique identifier, so don't use it as one.
      4. Who cares, don't muck around with case, and search case insensitive, more matches are better than not enough.
      5. There are conventions to get around that in ASCII, but unicode solves most of it anyway.
      6. Always properly encode and decode your data to meet the requirements of your medium.
      7. You still have to have name limits, and someone's name will always break it, using some ridiculous number of characters in your database is just going to kill your database.
      8. No one's name is a single letter in any language I've ever heard of (a single character, but that's not the same thing), and since names aren't unique or identifying this doesn't really matter.
      9. Who cares?

      Names are not meaningful except to the people who have them, and they're deluding themselves. You are not your name, and your name is not you.
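
      Point 6 above ("encode and decode your data to meet the requirements of your medium") can be sketched with Python's standard library. The `/people/` URL layout and both function names are made up for this example. The same stored name is encoded one way for a URL path and another way for HTML.

```python
from html import escape
from urllib.parse import quote, unquote


def profile_url(name: str) -> str:
    """Percent-encode the name for use as one URL path segment.
    safe="" means even "/" inside a name gets encoded."""
    return "/people/" + quote(name, safe="")


def name_from_url(segment: str) -> str:
    """Reverse of the encoding above."""
    return unquote(segment)


def greeting_html(name: str) -> str:
    """Escape the name for inclusion in HTML text or attributes."""
    return "<p>Hello, " + escape(name) + "</p>"


url = profile_url("O'Grady")
html_snippet = greeting_html("O'Grady")
```

      The stored name stays `O'Grady` throughout; each output channel applies its own encoding at the last moment.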

    2. Re:Well Duh by SpecBear · · Score: 2

      Our software can handle eight of those. Possibly nine, I don't know how long Sri Lankans' names get.

      The company I work for gets paid to make software. If someone wants to pay my employer to support certain features, then we'll build in that support. If the client says "Anyone without a last name can suck it" (and that has happened) then the system won't support that.

      As a hired gun (keyboard?) whether what I believe about names is true or false is irrelevant. I believe what I get paid to believe.

    3. Re:Well Duh by FauxPasIII · · Score: 2

      > 8) Some people's name is only a single letter.

      A former coworker of mine is named "Hyun O". He was the only person in the company not to follow the first initial, last name standard for his email address. =)

      --
      25% Funny, 25% Insightful, 25% Informative, 25% Troll
  17. Programmers hate my real name by SexyKellyOsbourne · · Score: 4, Funny

    My first name: "where 1=1 "
    My last name: "'; drop table users; --"

  18. Why do programmers get the blame? by justfred · · Score: 5, Insightful

    I code to spec. The product and marketing departments write the spec (what little there is); the QA department amends the spec with overly specific test cases. I suggest that the spec is incomplete and won't handle...but I'm told, just code it to spec. I recommend changes, but we don't have time for edge cases. I point out potential problems, but we're unlikely to get any of those. I warn of potential compatibility problems but we don't care. Are you just trying to be difficult? If there's a problem QA will catch it. The project is overdue already, and by the way here are some new requirements that need to make it in, and we can't change the release date because we already promised the stockholders. Why is your code so complicated, my twelve-year-old kid could write this.

    It's not my fault. I code to spec.

  19. it's not software, it's people by yyxx · · Score: 2, Insightful

    Software shouldn't have to satisfy every whim and eccentricity. If you don't have a well-defined first name and last name that consists of extended alphanumeric characters in Unicode and starts with a letter, well, then get one, OK? And while you're at it, come up with decent Romanized and ASCII (traditional Latin) versions of your name, conformant with one of the common Romanization systems of your language; you will need that too if you want to travel internationally. Single letter names are also a potential problem because they are confusable with abbreviations, so consider using a variant spelling ("O" -> "Oh").

    This isn't because programmers have some sort of hangups about names, it's because people themselves need to be able to refer to individuals in some reasonable and standardized way, they need to be able to write your name, alphabetize it, and correct errors.

  20. ...so what? by SanityInAnarchy · · Score: 4, Insightful

    It seems to me that most misconceptions about names can be fixed by the following:

    Allow a single, Unicode-enabled field of "unlimited" length (let's say 4 kilobytes) which represents "name". Several would be defined by different roles -- "Real name", "Nickname", "login", where only login (sometimes simply an email address) is required to be globally unique.

    Now let's look at what that breaks:

    First, #1, 2, 4, and 5. How am I supposed to avoid assuming these? People should be allowed to enter an arbitrary number of names for themselves? I suppose that's possible, but it immediately kills most of the potential uses of this data. If I want to set a nickname that goes with my forum posts, say, what good is it for me to have five nicknames? Seems like the only potential use would be making people easy to find by real name -- so, a social network.

    #6 -- surely 4k is enough, but this is also not a terribly difficult assumption to change later. Annoying, but not devastating, not even as hard as changing from the first name / last name combination into one "real name" field.

    #7, 8 -- most systems would make it trivial for people to change their names.

    #9, 10 -- UTF8 is easy.

    #11 -- very, very curious to see an example. And wouldn't that be a bug in Unicode? And this is again one where I have to ask -- how do you change this? Allow arbitrary images?

    #12, 13 -- obvious solution is to make the name system case-preserving, thus allowing both case-sensitive and case-insensitive searches.

    #14 -- again, avoid by simply allowing the name to be a single opaque field.

    #15, 16, 17 -- if your name supports random unicode, no idea why these would be a problem.

    #18 -- not sure why it matters.

    #19, 20 -- again, if it's just arbitrary text, it just works.

    #21, 22, 23 -- not sure how I'd make that assumption.

    #24, 25, 26, 27 -- again, the name is just an opaque bunch of characters.

    #28 -- what?

    #29 -- opaque characters.

    #30 -- keep the original text as-is. If you want to try to split people out by naming scheme, do it later, but keep the original. This should be a "duh" concept -- always preserve the original user input. Cache transformations for speed, if you like, but they're a cache -- keep the original. Your algorithm might change.

    #31 -- bad idea to assume bad words won't cause problems in general. I currently play an MMO in which I physically can't talk about Emily Dickinson, and have occasion to more frequently than you might suspect.

    #32-36 -- why would it matter? Unless...

    #37 -- Fine, but how would I otherwise connect the same person?

    #38 -- How about unicode-equivalent? And of course, they might not -- one might make a mistake, or the name might be represented differently. But you'd have to deal with typos anyway, so this isn't exactly shocking.

    #39 -- I'm going to have to agree with the assumption, though. If I develop a system which works well for people who only follow the US standard, and I suddenly have a ton of people from China wanting to use my service -- enough that this is actually a problem for me -- that's a nice problem to have.

    #40 -- People can make up names. I guess this explains #32-36, though.

    The sense I get is that half the list is stuff you'd almost have to be stupid to run into (seriously, who doesn't use Unicode?), and the other half involves some seriously weird names and cultures that are going to have to meet me halfway, if they expect me to do anything interesting with their name. As I understand it, the only way to get this right would be to allow people to have zero or more names, each of which is either an unlimited amount of text in any encoding, or an image (raster or vector) of unlimited size. To query such a system requires insane amounts of logic just to deal with the text, and throw in some OCR for good measure.

    I think this is a case where I would much rather see people evolve to match the technology, rather than the other way around.
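
    A rough Python sketch of the scheme described above: one opaque, case-preserving Unicode field per role, with the original input kept as-is and any search key treated as a rebuildable cache. The `Person` class, the `search_key` helper and the NFC-plus-casefold recipe are illustrative assumptions; only the 4-kilobyte ceiling and the three roles come from the comment.

```python
import unicodedata
from dataclasses import dataclass

MAX_NAME_BYTES = 4096  # the "4 kilobytes" ceiling suggested above


def search_key(name: str) -> str:
    """Derived key for comparisons: normalize, case-fold, normalize again.
    It can be recomputed at any time from the original text."""
    folded = unicodedata.normalize("NFC", name).casefold()
    return unicodedata.normalize("NFC", folded)


@dataclass
class Person:
    login: str           # the only field required to be globally unique
    real_name: str = ""  # stored exactly as typed; may be empty
    nickname: str = ""

    def __post_init__(self) -> None:
        for value in (self.real_name, self.nickname):
            if len(value.encode("utf-8")) > MAX_NAME_BYTES:
                raise ValueError("name exceeds the size limit")

    def matches(self, query: str) -> bool:
        """Case-insensitive, Unicode-equivalent comparison (#12, #13, #38)."""
        return search_key(query) == search_key(self.real_name)


# "e" followed by a combining acute accent, as some keyboards produce it.
person = Person(login="jose1", real_name="Jose\u0301 MacDonald")
found = person.matches("JOS\u00c9 MACDONALD")
```

    Nothing here splits the name or guesses at a surname; if the matching rules change, only `search_key` changes and the stored text is untouched.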

    --
    Don't thank God, thank a doctor!
  21. Re:I don't know what the complaint is about? by paeanblack · · Score: 3, Informative

    You'd think that e-mail addresses by comparison would be simpler, but I have a hard time trying to register my e-mail address with sites that won't allow even simple things like "+", "-" or "." characters in the local part.

    Proper email validation is not trivial

    Check out the huge regex at the bottom of the RFC 5322 compliant validator from CPAN:

    http://cpansearch.perl.org/src/RJBS/Email-Valid-0.184/lib/Email/Valid.pm

  22. Just more nerd bashing! by RomulusNR · · Score: 2, Insightful

    Yes. It's the programmers' fault that they write applications that make poor assumptions about names -- not the people who design software requirements who are neither programmers nor usually very worldly.

    Perhaps we should have a list of "assumptions people make about developers"!
    * Developers get to design their own software.
    * Developers get to have some say in how their software is designed.
    * Developers at least can prevent really stupid things from being put in the software they write.
    * Developers aren't smart enough to know that outliers are inevitable.
    * Developers aren't smart enough to know that of course there are people with punctuation, extra words and spaces, even letters that no one has seen before.
    * Developers wouldn't rather code just one column to hold an identifier rather than two.

    --
    Terrorists can attack freedom, but only Congress can destroy it.
  23. Re:I don't know what the complaint is about? by nacturation · · Score: 3, Informative

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    I assume you left out a "not" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are not the same.

    Read between the lines a bit. Treat them the same means: treat them as all potentially valid, not that all the names would match in a string comparison.

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  24. Not just programmers or computers by zill · · Score: 3, Funny

    This issue is pretty much universal. Even outside the binary world people still make silly assumptions about people's names.

    For example, numerous people have raised objections about my signature. They always give me bullshit complaints like "Sir, that is not legible." or "um... that's not your name." or even "Did you just draw a penis on the dotted line?".

    My signature does not have to be legible.
    My signature does not have to be my name.
    My signature does not have to contain my name.
    My signature does not have to contain any name.
    My signature does not have to be in the English language.
    My signature does not have to be in any human language.
    My signature does not have to consist of meaningful symbols.
    I swear if I hear one more complaint about my signature I will carry around a portable photo printer to render goatse as my signature:

    "Yes, my signature is a 600 ppi out-stretched anus. Deal with it. The law says that any mark that I make is a legally valid signature and you have to recognize it as such. You either sign the mortgage or I'm going to the next bank."

  25. Re:I don't know what the complaint is about? by Anonymous Coward · · Score: 4, Funny

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    I assume you left out a "noot" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are noot the same.

    fixed that

  26. Re:I don't know what the complaint is about? by scdeimos · · Score: 2

    A lot of forums and banking sites written by noob programmers for starters. Notice I said "local part", I haven't found one that cares about -'s and .'s in the domain part.

  27. Re:I don't know what the complaint is about? by fishexe · · Score: 4, Funny

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    I assume you left out a "not" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are not the same.

    Yeah, one goes in front of 'Donald's' and the other goes in front of 'beth'.

    --
    "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
  28. Re:I don't know what the complaint is about? by Shoe+Puppet · · Score: 2

    But in practice, shouldn't you just be able to copy the regex from somewhere (open source, with attribution of course) and check if it matches?

    --
    (+1, Disagree)
  29. Re:I don't know what the complaint is about? by TedRiot · · Score: 4, Informative

    True. I run into email validation problems constantly. I have a two-part first name that has "-" in the middle, so my firstname.lastname email addresses (usually work addresses) always have a "-". In addition at the moment I'm a consultant in a large company, where they put "ext-" in front of everyone who is not employed by them but works for them and has an email account from them. I also often run into problems with length, because my name is 19 characters and the last place I worked for had a 15 character company name and when you add TLD to that, you sum to an email address that is 39 characters long, which for some seems to be too much. I really don't get why you would use only 32 characters to store an email address.

    This problem very often bites in name fields, too, that don't accept "-" and two capital letters in my first name.

    And I used to live near a border of two cities, where my postal address was from one city while my real city of residence was the other one. I have had a lot of problems with that, when the guys who made the systems were trying to deduce my city of residence from my postal address. Which is also impossible in my country, because the national post office also permits addresses that have postalnumber + company (instead of city) for large companies who take their mail in one place and deliver it themselves the rest of the way.

  30. Re:I don't know what the complaint is about? by Anonymous Coward · · Score: 5, Insightful

    The regular expression, if one must be used, doesn't need to be any more complex than:

    ^[^@]+@[^@]+$

    Sending out response emails to an improperly validated address just turned you into an open relay. Spammers can use your server to send spam by embedding their entire message as the email address, trailed by '\x004@.'

    Validate your inputs. Always.
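
    Both halves of this exchange can be combined in a few lines of Python: keep the permissive "something@something" shape, but refuse whitespace and control characters so a pasted message or extra header can't ride along. The function name `plausible_email` is an assumption for this sketch, and it deliberately makes no claim of RFC compliance.

```python
import re

# The permissive shape quoted above: exactly one "@" with text on both sides.
SHAPE = re.compile(r"[^@]+@[^@]+")


def plausible_email(address: str) -> bool:
    """Cheap sanity check before sending a confirmation mail."""
    # Reject whitespace and non-printable characters (newlines, NULs, ...)
    # that could smuggle extra headers or a message body into the mail.
    if any(ch.isspace() or not ch.isprintable() for ch in address):
        return False
    # fullmatch avoids the "$ also matches before a trailing newline" trap.
    return SHAPE.fullmatch(address) is not None


ok = plausible_email("ext-first.last+tag@example.com")
injected = plausible_email("user@example.com\nSubject: spam")
```

    Note that in Python the quoted pattern on its own would accept the second address, since `[^@]` happily matches a newline. Characters such as "+", "-", "." and "'" in the local part pass untouched.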

  31. Re:I don't know what the complaint is about? by VinylPusher · · Score: 5, Informative

    Wow, if you consider McLean and MacLean the same, I suggest you never visit Scotland.

    The Mc's and the Mac's consider the correct usage as a matter of extreme pride. You could end up with one or more bruises if you get it wrong and then insist that "well, they're the same anyway".

  32. Re:First hand experience. by mpe · · Score: 4, Informative

    The author must have missed his history lesson explaining that family names only became popular in Western European culture when governments started tabulating people. In a rural village everyone knows that Jack the butcher is different from Jack the baker.

    Hence Butcher, Baker, Smith, Brewer, Tanner, Farmer, etc became "family names".

    *Even if the system did a conversion to a latin representation of an asian name most people can't pronounce them because they are based on different sound primitives.

    Such a "translation" can easily be one to many, dependent on various factors.

    Which is why Asians tend to adopt westernised versions of their real names.

    Or they adopt a regular English, German, French, Spanish, etc name to be known by.

  33. Re:I don't know what the complaint is about? by CarpetShark · · Score: 2, Insightful

    Read between the lines a bit. Treat them the same means: treat them as all potentially valid, not that all the names would match in a string comparison.

    I don't think that's what it meant at all. I think the author is trying to be too smart by suggesting that someone looking for MacDonald might have heard it wrong, and so might type in McDonald instead. It's probably a valid point for fuzzy searches, but to say that they should all be treated the same is wrong.

    That said, his other points, especially the one about not all names being properly mapped in unicode, are good ones. I just wish he'd posted citations and solutions, rather than simply pointing out the issues. But the first step in fixing a problem is acknowledging the problem.
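
    The fuzzy-search reading can be sketched in Python like this: names are stored and displayed exactly as given, and only a separate lookup key folds "Mc", "Mac" and the optional space together. The `fuzzy_key` and `search` helpers are hypothetical, and the folding rule is a deliberately crude assumption that will also fold unrelated names starting with "Mac".

```python
import re

# Leading "Mac" or "Mc", optionally followed by whitespace.
_MAC_PREFIX = re.compile(r"^(?:mac|mc)\s*")


def fuzzy_key(surname: str) -> str:
    """Lookup key only; never use it for display or as an identifier."""
    folded = surname.strip().casefold()
    folded = _MAC_PREFIX.sub("mc", folded, count=1)
    # Drop spaces, apostrophes and hyphens inside the name.
    return re.sub(r"[\s'\u2019-]+", "", folded)


def search(names: list, query: str) -> list:
    """Return stored names whose fuzzy key equals the query's key."""
    key = fuzzy_key(query)
    return [name for name in names if fuzzy_key(name) == key]


stored = ["MacDonald", "McDonald", "Donald", "Mac Clean"]
hits = search(stored, "Mc Donald")
```

    A search for "Mc Donald" finds both the Mac and the Mc spelling while each record keeps its own; spelling variants such as "McCleen" are not folded and would need a real phonetic or edit-distance match.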

  34. I didn't understand by SimonInOz · · Score: 4, Interesting

    I thought the article was about the inability of programmers to remember names and recognise people. Maybe I should have read the article.

    It's a real problem though - is it just me? I often know things about people (ah yes, plays squash, good at making cakes, father of that kid who rides a unicycle), but their actual name - no. It's a miracle if I recognise them at all.
    Mind you, it means if anyone says "Hello" to me, I am obliged to be polite to them as I might actually know them quite well, but haven't recognised them yet - and certainly don't know their name.

    It's a right pain. Anybody else suffer from this - and what the heck do they do about it? (I'd like a camera attachment that would whisper in my ear "that's Mrs Jones, her daughter, Kira, is in the same class at school as your daughter. Likes chess and is obsessed with kayaking" - something tiny that could clip on my glasses, maybe).

    --
    "Cats like plain crisps"
    1. Re:I didn't understand by delinear · · Score: 2, Insightful

      I'm the same, faces for me just won't stick - the first few weeks going to a new employer is torture for me as I'll be bombarded with names and I just can't remember anyone, and of course the problem is multiplied because you're the new guy so everyone knows your name. I don't know if it's specifically a developer issue. I did read that people with borderline Asperger's find it difficult to recognise faces and a lot of developers I know seem to fit the patterns for that (awkward in social situations or around new people, like to collect things, etc) so maybe there's some correlation that people who fit those patterns are drawn to careers where they can focus on impenetrable logic problems and not have to deal with people too much (I know I'm making some massive generalisations here, and this is purely anecdotal, but it fits my observations).

  35. Re:I don't know what the complaint is about? by Hognoxious · · Score: 2, Interesting

    Why would you be doing the validation in the database?

    If he'd meant "should treat them all as valid", then he should have written that.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  36. Re:I don't know what the complaint is about? by bickerdyke · · Score: 2, Funny

    You could end up with one or more bruises if you get it wrong and then insist that "well, they're the same anyway".

    Isn't that a general risk when dealing with scotsmen? :-)

    But where does the difference with the missing 'a' come from?

    --
    bickerdyke
  37. Re:I don't know what the complaint is about? by somersault · · Score: 3, Informative

    Just looked it up. I'm Scottish, live in Scotland and always hear people say that the difference in Mac/Mc is important because of the Scots/Irish thing, but according to this article, that's bollocks:

    http://www.scottishhistory.com/articles/misc/macvsmc.html

    --
    which is totally what she said
  38. Re:I don't know what the complaint is about? by delinear · · Score: 4, Insightful

    Sometimes I despair when I read or hear somebody referring to eg. Djengis Khan as "Mr Khan" ("Khan" is a title, not a name) or even call Hu Jintao, "Mr Jintao"; you would have thought people would, by now, have caught on to the idea that something like half the world's population has the family name first.

    Oh, come now - are you seriously saying you expect every single person to understand every subtle nuance of every other culture's use of titles and names? Here are some non-English equivalents to Mr., are you seriously telling us you know all of these? Here are the various forms of address in the UK alone, do you know all of these and every other culture's equivalent? How many of these should I learn before I go from being someone you despair of to someone you feel is welcome in your titular elite?

    If half the world's population has the family name first, which half do I choose to offend when I don't know the exact rule for the home country of the person I'm speaking to? That's even assuming I know which country they're from. There's no reason to assume in this shrinking planet that someone who looks like they're from country A wasn't in fact born in country B to parents from countries A and C - a person born in Japan but with lineage in China might take great offence if I use Chinese honorifics to address him, surely it's better to be polite within the confines of my own known culture than to make such crass assumptions about his? The key thing I take from someone saying "Mr Khan" or "Mr Jintao" is that they're at least making the effort to communicate in a civil manner, which certainly causes me no despair.

  39. Re:I don't know what the complaint is about? by Anonymous Coward · · Score: 2, Interesting

    Because no one ever automated the process of filling out web-forms right?

  40. on insisting everyone know your bikeshed's tint by FuckingNickName · · Score: 3, Insightful

    I was born with a complicated Spanish name.
    One first name.
    Two second names.
    One hyphenated, accented surname from my father.
    One simpler surname from my mother.

    One of the first things I've done since reaching majority is to give a precise, simple, standard name to everyone who asks for it:
      Xxxxx Xxxxxxx
    where X is in A-Z and x is in a-z. Xxxxx is my first name, and Xxxxxxx is a shortened, accent-and-hyphen-free version of my father's surname.

    You know why?

    Because, in life, there are lots of things one must be "unreasonable" about in order to effect progress, but accommodation of one's name is not one of them. It's a tedious, selfish expression of nothing more than ego which ultimately will land you in more trouble than others: some day you will be denied access to something thanks to some computer system not being designed to handle your name, and "computer says no" gets priority over the angry demands to the immigration officer of "Joe\0\rBlogg$ 3'); DROP TABLE citizens; -- [insert spinning cube here] Jr."

    If you and your friends/colleagues have some other name by which they call you, sure why not? But, as any cat will tell you, the world is best when you have three names:

    (i) one for communicating formally;
    (ii) one for more intimate discourse (there's no reason why this can't be the same as (i), though many people end up with peculiar nicknames); and
    (iii) one personal identification which you can keep to yourself and you can't express in words.

    If you want the sum of all your history, culture and personality as expressed in (iii) to be embodied in (i), you're both expecting others to be burdened with your ego and bad at understanding human communication. All I asked for was a couple of words I can use in a reasonably uniform way to easily pick/call you out from a small crowd - that's what (i) is for, after all.

    tl;dr The naming of cats is both a delightful poem and an insightful account of the multiple namespaces for kitty/human names and their different purposes. Don't confuse them.

  41. My surname ... by OneSmartFellow · · Score: 2, Funny

    .. is Hfuhruhur-Uumellmahaye you insensitive clod !

  42. Re:I don't know what the complaint is about? by Sique · · Score: 4, Interesting

    To make things worse, it's not necessarily the family name you use to address someone politely.

    If you have to speak to Paul McCartney (of Beatles' fame), you have to formally address him as "Sir Paul". No, "Sir McCartney" is impolite, you shouldn't use it.
    If you have to speak to Vladimir Putin, you won't address him as "Mr. Putin". It's "Vladimir Vladimirovich", please!

    --
    .sig: Sique *sigh*
  43. Re:I don't know what the complaint is about? by Migala77 · · Score: 2, Interesting

    Proper email validation is not trivial

    The regular expression, if one must be used, doesn't need to be any more complex than:

    ^[^@]+@[^@]+$

    Actually, the local part of an e-mail address can be a quoted string, containing pretty much any character, so "user@host"@example.org is a perfectly valid e-mail address, and doesn't match your regex. Most systems won't accept it, but it's valid...
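
    A quick check of that claim, as a minimal Python sketch. The names SIMPLE_EMAIL and simple_match are made up for illustration; the pattern and the address are the ones quoted above.

```python
import re

# The simple pattern proposed above: one "@" with at least one
# non-"@" character on each side.
SIMPLE_EMAIL = re.compile(r"^[^@]+@[^@]+$")


def simple_match(address: str) -> bool:
    """Return True if the address matches the simple one-"@" pattern."""
    return SIMPLE_EMAIL.match(address) is not None


ordinary = "user@example.org"
quoted_local_part = '"user@host"@example.org'  # quoted local part containing "@"

print(simple_match(ordinary))           # True
print(simple_match(quoted_local_part))  # False: the string contains two "@"
```

    So the pattern is stricter than the address syntax allows, which may or may not matter for a given system.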

  44. And that attitude is the whole problem by Moraelin · · Score: 4, Insightful

    You know, attitudes like yours are IMHO the root of all that's wrong with computers today. And I'm saying that as a programmer, not as Jane Grandma. The whole idiotic OCD idea that you _must_ make up rules about everything, and that your rules are more important than what people are actually trying to do. The idea that if even someone's name doesn't fit "your" database, then you can just brush them off and have a beer.

    Here's some free clue: yes, you can't handle every edge case in the universe, but you'll find it's easier if you don't create such edge cases in the first place. If your database (actually more likely the program in front of it) can't handle last names with more than one capital letter, or with a dash in the middle, or which are more than 32 bytes long (which with UTF-8 might mean less than you'd think), then guess what? _You_ created an artificial edge case that had no reason to be there in the first place. Instead of handling every edge case in the universe, how about not creating them in the first place?
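
    To put a number on the UTF-8 point, here is a small Python sketch. The helper names and the 32-byte limit are only illustrative, not taken from any real schema.

```python
def utf8_len(name: str) -> int:
    """Number of bytes the name occupies once encoded as UTF-8."""
    return len(name.encode("utf-8"))


def fits_byte_limit(name: str, limit: int = 32) -> bool:
    """True if the UTF-8 encoding of the name fits in `limit` bytes."""
    return utf8_len(name) <= limit


latin = "Alex"
cyrillic = "Владимир Владимирович"  # 21 characters

print(len(latin), utf8_len(latin))        # 4 4
print(len(cyrillic), utf8_len(cyrillic))  # 21 41
print(fits_byte_limit(cyrillic))          # False: 41 bytes > 32
```

    A 21-character name already blows a 32-byte limit, because each Cyrillic letter takes two bytes in UTF-8.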

    I find that about 90% of the problems boil down to the above: some idiot put some artificial limits or rules, that really aren't needed anywhere else. Just because he has the delusion that he's some kind of Moses on the mountain and just _has_ to come down with some rules.

    E.g., he just had to define a byte limit, because he's prematurely optimizing a non-problem he doesn't understand. God forbid wasting space in the database by allowing 256 or 2000 byte strings... never mind that if he actually understood the underlying database, he'd know that a VARCHAR is not padded to that max length. If someone just entered "Alex", the same 4 bytes will actually be used in the database, regardless of whether the field is defined with a maximum of 4, 32, 256 or 2000 characters. But nah, he has to put some restrictive number there, 'cause it looks more like he's doing some smart job.

    There is hardly any reason to even use a user name for anything other than display purposes. (You do have a primary key for that record for everything else, right?) As such there is no reason to make any assumptions about it, or enforce any particular format, or anything. There's no reason to even disallow SQL keywords (just effing quote it before using it in SQL) or angle brackets (just quote it before using it in HTML).
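
    A rough sketch of what that looks like in practice, using Python's standard sqlite3 and html modules. Note the substitution: it uses bound parameters instead of hand-quoting for the SQL side, and html.escape for the HTML side. The table, column and sample name are invented for the example.

```python
import html
import sqlite3

# A deliberately hostile "name" containing SQL and HTML metacharacters.
hostile_name = "Robert'); DROP TABLE users; -- <b>Jr.</b>"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, display_name TEXT)")

# The name is passed as a bound parameter, so it is stored as data and
# never interpreted as part of the SQL statement.
conn.execute("INSERT INTO users (display_name) VALUES (?)", (hostile_name,))

# Everything else goes through the primary key, not the name.
stored = conn.execute(
    "SELECT display_name FROM users WHERE id = ?", (1,)
).fetchone()[0]

# Escape only at the point of output, when the name is put into HTML.
rendered = "<p>Hello, " + html.escape(stored) + "</p>"

print(stored == hostile_name)  # True: the name round-trips unchanged
print(rendered)
```

    The name is stored byte for byte as entered; it only gets transformed at the boundary where it is used.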

    There is no reason to create any edge cases in the first place.

    And really it's not even just about names. Names are just one case where people make up BS rules just to feel more like they did the great design job. One could make the same case for the gazillion other pointless rules imposed upon the user or his work-flow or data, not because they're actually needed anywhere, but just because some OCD idiot feels like he _must_ impose some rigid structure upon things that really have none and don't need any. But he'd just feel naked without defining that kind of rigid structure, or without imposing upon humans some data structures theory that was intended only for use by programs.

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:And that attitude is the whole problem by russotto · · Score: 5, Insightful

      The idea that if even someone's name doesn't fit "your" database, then you can just brush them off and have a beer.

      We can. Fact is, trying to write a system which can deal with all those 40 assumptions and still do anything useful with names is impossible. Even covering most of them is impractical, if you want programmers to do anything else. It has nothing to do with OCD. The programmers aren't making the rules because of some inner desire for order, but because the requirements of the system require they be made.

      Suppose your system is some sort of order-taking system. And one of the things it must do is print your name on a mailing label. How do you handle that if the name doesn't _fit_ on the mailing label? Or if there is no name at all? Or if the mailing label printer doesn't handle the name's character set? Or if the postal services of the countries in question have standards for names which are not met?

  45. Re:I don't know what the complaint is about? by Frater+219 · · Score: 3, Insightful

    Check out the huge regex at the bottom of the RFC 5322 compliant validator from CPAN:

    Honestly, this sort of thing is an example of people overusing regex because it's the only parsing tool they know. Regex becomes unwieldy when you put too much of it in one place -- but this is because regex is unwieldy, not because the problem of parsing email addresses is fundamentally hard. Parsing email addresses is a case for a modular parser such as Parsec (or any of its ports and imitators) ... which will give you the added advantage of useful error messages on invalid input, instead of just a match failure.
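
    To illustrate the error-message point only, here is a hand-rolled, step-by-step parser in Python. It is not Parsec and not an RFC 5322 implementation: it assumes a drastically simplified syntax (a bare or double-quoted local part, no escaped quotes, no comments, no domain validation), and AddressError and split_address are made-up names.

```python
from typing import Tuple


class AddressError(ValueError):
    """Raised with a message saying which part of the address is wrong."""


def split_address(address: str) -> Tuple[str, str]:
    """Split a simplified address into (local_part, domain)."""
    if address.startswith('"'):
        # Quoted local part: runs up to the next double quote.
        end = address.find('"', 1)
        if end == -1:
            raise AddressError("unterminated quoted local part")
        local, rest = address[: end + 1], address[end + 1:]
    else:
        # Bare local part: everything before the first "@".
        at = address.find("@")
        if at == -1:
            raise AddressError("missing '@'")
        local, rest = address[:at], address[at:]

    if not local:
        raise AddressError("empty local part")
    if not rest.startswith("@"):
        raise AddressError("expected '@' after local part")

    domain = rest[1:]
    if not domain:
        raise AddressError("empty domain")
    if "@" in domain:
        raise AddressError("unexpected '@' in domain")
    return local, domain


for candidate in ['"user@host"@example.org', "user@", "a@b@c"]:
    try:
        print(candidate, "->", split_address(candidate))
    except AddressError as err:
        print(candidate, "-> invalid:", err)
```

    Each rule lives in its own step, so a rejected address comes back with a reason rather than a bare "no match".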

    Moreover, isn't it kind of silly to point at an example of someone already having written the code to do something as a way of saying that doing it is difficult? In code, once it's already been done once, correctly, it doesn't need to be done again. If you think CPAN's huge regex (or any other implementation) is correct, and you've tested it to your satisfaction, you don't need to reimplement it; just use it.

  46. Re:I don't know what the complaint is about? by Bigjeff5 · · Score: 2, Funny

    Because no one ever automated the process of filling out web-forms right?

    Pffft, the idea is absurd! You'd need a computer to do that, and what spammer has a computer?

    Oh, right. Yeah, all of them. My bad.

    --
    Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  47. What a coincidence! by valkenar · · Score: 2, Funny

    So is mine!