Address Formatting for International Mailing?
linuxbaby asks: "Anyone have any advice or wisdom from experience about address formatting for international shipping? I'm starting to doubt the process of asking individual questions of 'name, company, address, city, state, postalcode, country' because of complaints or misunderstandings from places like Ireland (no postalcodes), Germany (postalcode goes before city), Japan and England (many lines of address info needed). Maybe the best approach is to just get the country as an option-select list of 2-character country codes, but leave the other lines wide open ('address1', 'address2', 'address3', 'address4') for the person to fill in as they see fit. The point here is not data mining, but shipping packages as accurately as possible, anywhere in the world. Thoughts?"
You need to speak with your shipping company.
They do this for a living. They should be able to give you all the information you need.
It's spelt "L-I-N-U-X", but pronounced as "Free Beer"
FRANK'S COMPULSIVE GUIDE TO POSTAL ADDRESSES is probably the best resource you're going to find on the topic - it covers every continent and most countries, with details on postcodes, street addresses and more. Very geeky, but also highly useful.
There's nothing more annoying than forms that require you to enter your address in a specific way that doesn't fit for that particular company. If you're shipping to only one country, that's fine, but otherwise:
What's wrong with just letting the user enter the address in a freeform text field? The user probably knows what his own address is, and can write it in a form that the local post office can deliver to. Just include a dropdown box for the country, and that should be all there is to it.
Google is your friend. I found a good example quickly, about the fourth entry (one of the first three is the result of the submitter having asked this exact question on oreillynet.com):
g i/up
There's an example of what looks like a good solution used at https://www.theperlreview.com/cgi-bin/subscribe.c
They make some fields required (name and country would make sense) and others (such as state) are marked "required for some countries", with a big freeform text area marked "Mailing label" with the text "International subscribers: You can tell us what your mailing label should be, following your country's address format".
This seems a fair way of doing it. Whatever your parser cannot make sense of (i.e., addresses from countries whose conventions you don't know) can be checked manually, with an additional contact-the-customer-and-verify step if you are really unsure.
As you contact customers and learn more about their specific formatting needs, update your parser -- use it to check the freeform address format, and perhaps warn the user if it doesn't seem to be valid (but allow them to continue anyways).
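A minimal sketch of that warn-but-allow check, in Python. The function name and the per-country patterns are my own assumptions for illustration, not anything taken from the subscription form above. The point is only that the check returns warnings and never rejects the label.

```python
import re
from typing import List

# Hypothetical per-country postcode hints. Countries not listed here are
# never flagged, because we do not know their conventions.
POSTCODE_HINTS = {
    "US": re.compile(r"\b\d{5}(-\d{4})?\b"),
    "DE": re.compile(r"\b\d{5}\b"),
}

def check_label(label: str, country: str) -> List[str]:
    """Return warnings for a freeform mailing label; never reject it."""
    warnings = []
    lines = [ln.strip() for ln in label.splitlines() if ln.strip()]
    if len(lines) < 2:
        warnings.append("Label has fewer than two lines; is the address complete?")
    hint = POSTCODE_HINTS.get(country)
    if hint is not None and not hint.search(label):
        warnings.append("No postcode in the usual format for " + country + " was found.")
    return warnings
```

The form would show any warnings next to the text area and still accept the submission if the customer confirms.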
Somebody get that guy an ambulance!
Try the Universal Postal Union, specifically documents they have on properly addressing international addresses.
Also, this looks interesting: International Address Standard UPU S42-1
(BTW, I know nothing about this stuff, but I found it via Wikipedia, which these days is proving itself more useful than Google.)
_______
2B1ASK1
You can get the addressing standard and the worldwide database from the Universal Postal Union.
Zip codes, in countries that use them, are checksums. You need them in a separate field because you should check with the post office of that country to make sure it matches with the city. If the zip code and city/state do not match up you should make me verify the address. If the two match up odds are the address is good enough to get things to the right person.
If you can get someone's mail to the right zip code the post office doesn't really need the rest of the address, just the name. (Though it is much easier to deal with full addresses, so only try this when other options fail.) This doesn't work so well if your name is common, but if your name is slightly obscure (which is most names, since obscure only means nobody else in town shares it) you are probably the only one in town with that name, and they can figure out where you live.
In short, the name and street address are checksums to each other, the local post office will notice a mismatch and try to correct them if they can. City/state, and zip are checksums to each other, and you should check them to be sure you get to the right town.
Now of course each country is different, but for most there is some variation of the above that you should use to verify the address is likely to be correct.
Of course "checksum" isn't the right term, since a real checksum involves math. However the concept is the same.
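As a rough Python sketch of the city/zip cross-check: the lookup table below is a made-up two-entry sample, and in practice the data would have to come from the post office of the country in question. Note that "unknown" is a distinct result, so countries you have no data for are not blocked.

```python
from typing import Dict, Set, Tuple

# Hypothetical sample data; a real table would come from the relevant
# postal authority.
KNOWN_CITIES: Dict[Tuple[str, str], Set[str]] = {
    ("US", "10001"): {"new york"},
    ("DE", "10115"): {"berlin"},
}

def postcode_matches_city(country: str, postcode: str, city: str) -> str:
    """Return 'match', 'mismatch' or 'unknown' for a postcode/city pair."""
    cities = KNOWN_CITIES.get((country.upper(), postcode.strip()))
    if cities is None:
        return "unknown"  # no data for this code: do not block the customer
    return "match" if city.strip().lower() in cities else "mismatch"
```

On "mismatch" you would ask the customer to verify the address rather than refuse it.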
Here in the UK we have a mishmash of numbers and letters for our post codes. So whatever you do, don't try to validate it. RG21 7EJ, WC1P 1AA, E22 3NL and EH22 3NL are all valid. There is nothing that pisses me off more than when an internet site tries to validate a post code as 5 digits, or 5 digits hyphen 4 digits. Give me a freeform text box, I'll give you my address in the form that MY post office will understand.
Sigs. We don't need no steenking sigs.
Actually while it's customary to have many lines of address in England, all that is actually required is the house number, and the post-code. Everything else can be derived from those two. Having all the extra info just makes sure if you get the post-code wrong, it will still get to the destination.
"When I grow up, I want to be a weirdo"
I think frankly your best bet here is to be freeform. They know best how their addresses are written. So long as the country goes last, to get the parcel from your country out to the appropriate country, the rest of the address should be written to their custom so that their postal service will be most likely to deliver it.
I've seen all the things you describe - stuff like "90167 Bucharest" where the postcode precedes the city - and you're just not going to cope with all that if you try and enforce a complex system of validation.
Our database just has Address1-Address5 (use as many or as few as you want), Postcode (this can be blank), Country (this can't be).
When we tried entering a lot of addresses into the address book software of a certain well-known courier company, we ran into all sorts of problems. It would keep insisting on postcodes where they weren't appropriate, and so on. It's just more hassle than it's worth, and creates more problems (with literally not being able to enter what you know is correct) than it solves (stopping accidental bad data entry).
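For what it's worth, the Address1-Address5 / Postcode / Country layout described above can be sketched in a few lines of Python. The class and method names are my own invention. Printing a separately entered postcode on its own line before the country is also just an assumption of this sketch; the customer can always put it wherever it belongs in the freeform lines.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ShippingAddress:
    lines: List[str]     # Address1..Address5, in the customer's own order
    country: str         # the only mandatory field
    postcode: str = ""   # may be blank

    def __post_init__(self) -> None:
        # Drop empty lines so customers can use as many or as few as they want.
        self.lines = [ln.strip() for ln in self.lines if ln.strip()]
        if len(self.lines) > 5:
            raise ValueError("at most five address lines")
        if not self.country.strip():
            raise ValueError("country is required")

    def label(self) -> str:
        """Render the label with the country as the last line."""
        parts = list(self.lines)
        code = self.postcode.strip()
        # Only add the postcode if the customer has not already written it.
        if code and not any(code in ln for ln in parts):
            parts.append(code)
        parts.append(self.country.strip().upper())
        return "\n".join(parts)
```

Nothing here validates the content of the lines, which is the whole idea.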
I see this question as a special case of, how should I constrain data supplied by a user?
Other good examples are telephone numbers (not all countries use ten digits, and sometimes you need to add a note like "ask for extension 36914" or "ask the receptionist to page me, I don't have a direct line"), gender (it may surprise you to know that not everyone identifies as male or female, and not everyone is happy with saying which label they want to apply, so make it optional) and even country (is Taiwan a country? It depends who you ask).
You need always to be aware that when a computer model of the external physical world disagrees with the external physical world, it's the model that's inaccurate or wrong, not the external physical world. This sounds pretty obvious, but look at the replies to this article and you'll see suggestions that might make me unable to give my address.
I've had Web forms ask for my Canadian postal code (by the way, spaces are significant in UK postal codes, and are not in Canadian ones), and then tell me (because they re-used some JavaScript) that a postal code must be five digits. When I tried 00000, the server-side software tried matching that to the billing address of my credit card. As a result I was unable to buy an airline ticket!
In that case I used the 'phone. It took an hour on hold on an 800 number to place the order, because they had to process my credit card by hand, since their computer system didn't allow Canadian customers to fly from US destinations; I wonder how many millions of dollars they had lost before someone took the time to fight this? In the end I got a letter from support saying I should have used the Canadian and not the US Web page, and when I wrote back saying that's what I had done in fact, and please forward this to the programmers, I got a reply saying the bug was fixed.
It's still pretty common to find Web sites whose programmers don't have the concept "Some people live outside the US", let alone "Some people live in the US but have foreign credit cards, as they are temporary residents".
So when you use the billing address as a "checksum" against the credit card, and find they are different, the right thing to do is to ask the customer for confirmation and then believe the customer.
Keep a record of the information, so that if they complain later you can work out where they asked things to be shipped, and maybe recover. Obviously, your goal is to deliver the package, so you want clear text that is written to be easily understood, not a legal disclaimer in all-caps that's there so you can slither out of the clutches of a disgruntled customer!
The principles are:
Early quality: It's cheaper and more effective to get good data early on than to correct data later. Using input fields like house number, street name, postal region, county and so forth can help, as can parsing what they type, identifying the various parts, and asking the user if they are correct.
Allow the user to insist: If the user says their postal code is BEWARE OF THE DOG, the Post Office might not agree, but maybe it's the only way they can work out how to get an extra line of text onto the address label. It's probably better to let them do this than to lose them as a customer.
Don't over-model: If you are not going to need the individual address fields later, why are you making the customer type them in a form? Identify the minimum you know you'll need and ask yourself if it's really enough. Large forms are intimidating, and people may be discouraged or complete them incorrectly because they are overwhelmed. Your database may only have twenty customers today, but when it has half a million addresses, consider the cost of storing an extra dozen fields per customer when you don't need them.
The Real World is Right
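To make "ask for confirmation, then believe the customer, and keep a record" concrete, here is a small Python sketch. All the names are hypothetical, and the suggestion is assumed to come from whatever parser or lookup you already have. The customer's own text wins whenever they decline the suggestion, and what they typed is always kept.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AddressDecision:
    entered: str              # exactly what the customer typed
    suggested: Optional[str]  # what our (hypothetical) parser proposed, if anything
    final: str                # what goes on the label
    customer_insisted: bool   # True when the customer rejected our suggestion

def resolve(entered: str, suggested: Optional[str],
            accept_suggestion: bool) -> AddressDecision:
    """Pick the value to ship with, always recording the original input."""
    if suggested is None or suggested == entered:
        return AddressDecision(entered, suggested, entered, False)
    if accept_suggestion:
        return AddressDecision(entered, suggested, suggested, False)
    # The customer said no: believe the customer.
    return AddressDecision(entered, suggested, entered, True)
```

Storing the whole record rather than just the final value is what lets you work out later where they asked things to be shipped.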
Live barefoot!
free engravings/woodcuts
What if it is easier (for the customer and local Post Office) to use foreign characters in the address label? If you are adding free form text fields, you might want to be prepared for Unicode support of various languages.
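One way to prepare for that, sketched in Python: normalize each freeform line to Unicode NFC and tidy the whitespace, without stripping or transliterating anything. The function name is my own, and whether your label printer and carrier accept the resulting characters is a separate question this does not answer.

```python
import unicodedata

def normalize_line(text: str) -> str:
    """Normalize to NFC and collapse whitespace, keeping non-ASCII characters."""
    composed = unicodedata.normalize("NFC", text)
    return " ".join(composed.split())
```

Storing the result as UTF-8 keeps the line intact on a round trip through the database.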
227-3517
The downside, of course, is that postal codes, by extension, become traceable private information, so you'd have to start zealously guarding that as well.
More than mere navel gazing.
What you are discussing is very difficult to do using relational databases. The whole theory of associative databases is to allow data that is usually in a particular form but to allow for exceptions. It's an entirely different theory of data modeling and needs to be introduced at the earliest stages of design.
That's a lot to ask for a small percentage of the market. It may not be the case for most businesses that it is "better to let them do this than to lose them as a customer"; in terms of profits, it might just be better to lose them as a customer.
OTOH, for people who just need flexibility, associative rather than relational models give most of the advantages of a rigid relational system without requiring rigidity.