Falsehoods Programmers Believe About Names
Jamie points out this interesting article about how hard it is for programmers to get names right. Since software ultimately is used by and for humans, and we humans are pretty tightly linked to our names (whatever the language, spelling, or orthography), this is a big deal. This piece notes some of the ways that names get mishandled, and suggests rules of thumb (in the form of anti-suggestions) to encourage programmers to handle names more gracefully.
I found the piece very interesting.
Though my inability to post this comment appears to have outlived the slashdotting of the site.
I am fortunate enough to be the child of a professional smart-ass who intentionally gave all his children two middle names so that we would not fit into the computer systems of the era.
When I grew up my parents used my first middle name as a "given nickname" (it's actually in quotation marks on my birth certificate). So most of the time when I give my name for something I use my "given nickname" as my first name. Unless I feel like using my legal first name as my first name in which case I use that. There are probably four or five different versions of my name attached to my SSN in various different databases.
I've also got a suffix: III. I don't have two ancestors with the exact same name as me, but since the various parts come from two different relatives my parents settled on III.
What's in a name?
http://www.youtube.com/watch?v=kFmJsTKWEBI
http://www.youtube.com/watch?v=cuUE7oKIkVI
Bo3b Johnson
http://www.linkedin.com/pub/bo3b-johnson/13/846/a52
The 3 is silent. And no, I don't know him but I know someone who does.
A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...
Are you sure? What if "Mac Clean" is actually somebody's first and last names?
I know plenty of people whose legal name is a single word, such as "Alex", "Max" or "Virgil." Would your system put that in the first_name, middle_name or surname column? Storing names and using them sensibly is hard, as TFA acknowledges.
You'd think that e-mail addresses by comparison would be simpler, but I have a hard time trying to register my e-mail address with sites that won't allow even simple things like "+", "-" or "." characters in the local part.
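For the single-word-name case, one way out is to not split names into columns at all. A minimal sketch, assuming a hypothetical `Person` record with one free-form `full_name` field plus an optional `preferred_name` (both names are made up for illustration, not taken from TFA):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Hypothetical layout: one free-form field instead of
    # first_name / middle_name / surname columns.
    full_name: str
    # What the person actually wants to be called, if different.
    preferred_name: Optional[str] = None

    def display_name(self) -> str:
        # Fall back to the full name when no preference was given.
        return self.preferred_name or self.full_name


people = [
    Person("Virgil"),                                  # legal name is a single word
    Person("Alexander Example", preferred_name="Alex"),  # made-up sample data
]
display_names = [p.display_name() for p in people]
```

Nothing here has to guess which column "Virgil" belongs in, because there is only one.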
A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...
I assume you left out a "not" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are not the same.
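To see why "treat them all the same" goes wrong, here is a rough sketch of what such a rule might look like. `naive_fold` is a hypothetical function of my own, not something from TFA or any real library:

```python
import re


def naive_fold(name: str) -> str:
    """Hypothetical 'same name' key: lowercase, drop spaces, turn a leading Mac into Mc."""
    key = re.sub(r"\s+", "", name.lower())
    return re.sub(r"^mac", "mc", key)


variants = ["McClean", "MacClean", "MCLean", "Mc Clean", "Mac Clean"]
keys = {v: naive_fold(v) for v in variants}

# "Mac Clean" (possibly a first name plus a surname) now collides with "McClean",
# while "MCLean" still gets a different key because it has only one "c".
collides = keys["Mac Clean"] == keys["McClean"]
still_distinct = keys["MCLean"] != keys["McClean"]

# Unrelated names get mangled too.
mangled = naive_fold("Macy")
```

So the rule both merges names that may belong to different people and misses variants it was supposed to catch.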
Considering how many entry forms still don't allow '+' in an e-mail address (or, worse, allow it in the sign-up box but not in the unsubscribe box), and considering how many banks still restrict you to an 8-character password, does it come as any surprise that they have difficulty with something that isn't defined in an RFC?
Scratched Emulsion
Love the literary reference. In a much earlier sci-fi story, This Perfect Day, every citizen has a nameber, an identifier that is part name, part number. There are only four male names, four female names, and these are combined with a multi-digit code to make the ID unique. Ever since online forums started suggesting logins like "MaryBeth131" I can't help but think of namebers.
That said, if your input form doesn't allow some guy to type in his name with tone number suffixes on a US Windows keyboard layout where he lacks access to diacritics, then you're not a very thoughtful programmer.
Or you code in some language where Unicode support is not there by default, and you have to jump through hoops to get it working.
Like, say, PHP. Or stable Ruby.
Which might explain a lot of things about why so much of the Net is largely broken I18N-wise even on the most basic level, come to think of it.
If you are a guy (not an unreasonable assumption on /.), I think it's really strange that online forums are suggesting the name "MaryBeth131" to you.
What were your parents thinking?
that's teh shizzle bizzle
Names are not meaningful except to the people who have them, and they're deluding themselves. You are not your name, and your name is not you.
A lot of mobile phones, including my Samsung phone, use Pinyin as a way of entering Chinese characters. For each word/syllable I enter, there's a sometimes long list of matching Chinese characters to select from.
Pinyin is also used on things like street signs in some of the larger cities, which gives Western people at least some chance of recognising names.
"Bo3b"
Never seen that one but I've heard of a: !bo
The leading exclamation mark is apparently a... lol, I dunno what it's called, but it's apparently one of the hollow popping/clicking sounds you hear in some African languages.
Reminds me of a classic database developer nightmare story that I heard:
A local school was receiving complaints that two students were getting the exam results and the like mixed up.
The two students? Identical twins living in the same house, with the same name.. John Smith Jnr.
Apparently their father was John Smith Snr, and the whole "Senior / Junior" thing has been done for generations of "Johns Smiths", and it was a tradition and all, and we can't just break a tradition just because we had twin boys.. so... we'll name them both John Smith Jnr.
Funny, I actually use the Chinese IME on Windows... it is called "Chinese (simplified) - Microsoft Pinyin - New Input Style"
And I do actually type in characters using Pinyin, because they have adaptive algorithms that guess at what the most likely character to follow is. They guess well, but it also displays 9 choices at a time, that you select with number keys.
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
http://en.wikipedia.org/wiki/Perri_6
This is how you become top listed in every citation index.
I thought the article was about the inability of programmers to remember names and recognise people. Maybe I should have read the article.
It's a real problem though - is it just me? I often know things about people (ah yes, plays squash, good at making cakes, father of that kid who rides a unicycle), but their actual name - no. It's a miracle if I recognise them at all.
Mind you, it means if anyone says "Hello" to me, I am obliged to be polite to them as I might actually know them quite well, but haven't recognised them yet - and certainly don't know their name.
It's a right pain. Anybody else suffer from this - and what the heck do they do about it? (I'd like a camera attachment that would whisper in my ear "that's Mrs Jones, her daughter, Kira, is in the same class at school as your daughter. Likes chess and is obsessed with kayaking" - something tiny that could clip on my glasses, maybe).
"Cats like plain crisps"
Why would you be doing the validation in the database?
If he'd meant "should treat them all as valid", then he should have written that.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Because no one ever automated the process of filling out web-forms right?
To make things worse, it's not necessarily the family name you use to address someone politely.
If you have to speak to Paul McCartney (of Beatles' fame), you have to formally address him as "Sir Paul". No, "Sir McCartney" is impolite, you shouldn't use it.
If you have to speak to Vladimir Putin, you won't address him as "Mr. Putin". It's "Vladimir Vladimirovich", please!
Proper email validation is not trivial
The regular expression, if one must be used, doesn't need to be any more complex than:
^[^@]+@[^@]+$
Actually, the local part of an e-mail address can be a quoted string, containing pretty much any character, so "user@host"@example.org is a perfectly valid e-mail address, and doesn't match your regex. Most systems won't accept it, but it's valid...
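A quick sketch to check both claims: the regex above accepts "+", "-" and "." in the local part, but rejects the quoted-string address. The looser `split_on_last_at` check is my own assumption about how one might avoid that, not a full validator:

```python
import re

# The regex from the parent comment, unchanged.
SIMPLE = re.compile(r"^[^@]+@[^@]+$")


def matches_simple(addr: str) -> bool:
    return SIMPLE.match(addr) is not None


def split_on_last_at(addr: str) -> bool:
    """Hypothetical looser check: non-empty text on both sides of the LAST '@'."""
    local, at, domain = addr.rpartition("@")
    return bool(local) and bool(at) and bool(domain)


samples = [
    "user+tag@example.org",
    "first.last@example.org",
    '"user@host"@example.org',  # quoted local part containing an '@'
    "no-at-sign",
]
simple_results = [matches_simple(s) for s in samples]
loose_results = [split_on_last_at(s) for s in samples]
```

Neither check proves the address exists; sending a confirmation mail is still the only real test.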