Software Internationalization
Anonymous Coward writes "It seems that the folks over at O'Reilly have quietly released a book entitled "Java Internationalization". The website for the book can be reached from the O'Reilly Java site. The authors also have a website dedicated to the book.
I'm curious as to how developers are treating software internationalization, not just in Java, but in other programming languages like C#, C++, Perl. For software designers out there today, is internationalization and localization a forethought or an afterthought? Is Java the only viable language for writing truly multi-lingual applications?"
The tricky part has nothing to do with which programming language you prefer; it lies in the overall design of the application itself. Provided that you can come up with acceptable translations of all your output strings -- which itself can be tricky -- that still doesn't address the more subtle interface issues you might face, depending on what you're trying to do.
For web design, it could be worthwhile to have drastically different versions of your content for different locales -- IKEA and the BBC are interesting case studies for this. For other applications, one interface framework might be fine, but really this involves a lot of work and study of your target audience, and it goes far beyond (and is much more interesting than) the question of what language you code in.
That said, Unicode is a truly terrifying thing, and any language that makes it easier to work with is welcome. Java supposedly uses Unicode internally, and if that helps as much as it seems like it should, then great. Otherwise, or maybe even still, you face a much gentler slope in going to other Latinish languages (most of the European ones, and any of the others that have adopted that alphabet or at least have a cultural standard for and acceptance of it; thus Japanese counts, Chinese doesn't) than in going to anything with a much different character set (Russian, Arabic, Hebrew) and beyond (the CJKV languages -- Chinese, Japanese, Korean, Vietnamese).
I can deal with the prospect of planning for French, German, Spanish, and Italian versions of work that I do, but having to go beyond that is a very daunting prospect. And, of course, an interesting one... :)
DO NOT LEAVE IT IS NOT REAL
To internationalize, put all of your translatable strings, images, and formats into resources. A resource can be a text file, an image, or whatever. Your program must then get all of that information from the resources.
The basic idea is that you have a resource that needs to be translated: resource.txt. Your program determines the locale (say en_US) and then fetches resource.txt.en.US. It then merges that with resource.txt.en and resource.txt. The nice thing is that this works even if you can't list your files (they may be on a web server, for example). Also, because you are merging files, if something is the same for the USA and Great Britain, it can go in resource.txt.en and you don't have to duplicate work in .US and .GB.
Besides having the libraries to handle this stuff, the only thing that Java makes easy is determining the current locale. But the concept is simple, and with a couple of weeks of work you could have similar libraries up for any language.
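A minimal sketch of that lookup-and-merge idea, for illustration only. The class name, the helper methods, and the in-memory `store` map (which stands in for files on disk or on a web server) are all hypothetical. Java's own java.util.ResourceBundle does a similar fallback with its own naming scheme; this sketch just follows the file names used above. Note that the candidate names are computed from the locale, so nothing ever has to list a directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ResourceFallbackSketch {

    // Candidate names from most general to most specific, e.g.
    // resource.txt, resource.txt.en, resource.txt.en.US
    static List<String> candidateNames(String baseName, Locale locale) {
        List<String> names = new ArrayList<>();
        names.add(baseName);
        String language = locale.getLanguage();
        if (!language.isEmpty()) {
            names.add(baseName + "." + language);
            String country = locale.getCountry();
            if (!country.isEmpty()) {
                names.add(baseName + "." + language + "." + country);
            }
        }
        return names;
    }

    // Merge every candidate that exists; later (more specific) layers
    // override earlier (more general) ones. Missing layers are skipped.
    static Map<String, String> load(Map<String, Map<String, String>> store,
                                    String baseName, Locale locale) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String name : candidateNames(baseName, locale)) {
            Map<String, String> layer = store.get(name);
            if (layer != null) {
                merged.putAll(layer);
            }
        }
        return merged;
    }

    // Hypothetical sample data standing in for the resource files.
    static Map<String, Map<String, String>> sampleStore() {
        Map<String, Map<String, String>> store = new HashMap<>();
        store.put("resource.txt", entries("app.title", "Editor", "open", "Open"));
        store.put("resource.txt.en", entries("open", "Open file"));
        store.put("resource.txt.en.US", entries("color", "Color"));
        store.put("resource.txt.en.GB", entries("color", "Colour"));
        return store;
    }

    // Small helper: builds a map from alternating key/value arguments.
    static Map<String, String> entries(String... keyValuePairs) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
            map.put(keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> store = sampleStore();
        System.out.println(load(store, "resource.txt", Locale.US));
        System.out.println(load(store, "resource.txt", Locale.UK));
        System.out.println(load(store, "resource.txt", Locale.FRANCE));
    }
}
```

Here the shared English string lives once in resource.txt.en, only the spelling difference goes in .US and .GB, and a locale with no files of its own (French in this sample) simply falls back to resource.txt.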
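For what it's worth, here is a small sketch of the locale part using the standard java.util.Locale and java.text.NumberFormat classes. The class and method names are made up for the example, and number formatting is only one of the "formats" you would want to keep locale-aware.

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocaleSketch {

    // Language and country codes of a locale, joined as language_COUNTRY.
    static String describe(Locale locale) {
        return locale.getLanguage() + "_" + locale.getCountry();
    }

    // Formats a number using the conventions of the given locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        // The locale the JVM picked up from the host environment.
        Locale current = Locale.getDefault();
        System.out.println("Current locale: " + describe(current));
        System.out.println("1234.5 here:    " + formatNumber(1234.5, current));

        // The same value under two explicit locales.
        System.out.println("1234.5 (US):      " + formatNumber(1234.5, Locale.US));
        System.out.println("1234.5 (Germany): " + formatNumber(1234.5, Locale.GERMANY));
    }
}
```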
Disclaimer - I've been out of the shrinkwrap game for a while. The following may be out of date.
Most commercial apps I've worked with have a core in C or C++, then port the UI to whatever is available. Nearly all Adobe apps, for example, have a cross-platform core and a platform-specific localization. Macs get MPW code (and a lot of ResEdit), Windows gets VC + properties files (or whatever Windows gets these days), and Unix gets X (or your favorite UI API).
Nowadays, string localizations may be done more and more in the specific country, but this is possible in Java as well.
Sigh, most real client app companies (in my limited experience) which are truly shipping to more than a very few countries are still willing to trade off the pain of porting the UI for the stability of the shared core in C or C++.
The great part about Java is still that it can be a dynamically configurable server app for many languages and people at the same time. That could be the way of the future, or not. I ain't gonna wax philosophical in Developers.
You may not like this advice, but Microsoft have lots of information on i18n and l10n.
Some is Windows based, obviously, but some isn't.
It's a good reference if you're not ideologically opposed to visiting some sites.
http://www.thehungersite.com
Another issue that Java deals with is text direction. At the simplest level, this can just be left-to-right or right-to-left, but Java also handles mixing different languages and thus text directions. Think about the hassles of embedding r-t-l text in l-t-r text, e.g. a Hebrew quote or name inside English text. Especially, consider text selection as you select from the English text into the Hebrew! Java's I18N package can handle this. There was a good discussion of this a couple of years back in Java Report. This article at IBM's DeveloperWorks looks to be similar to what I remember, and discusses the Arabic lettering issues.
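To make the mixed-direction case concrete, here is a rough sketch using java.text.Bidi, which is in the standard library from Java 1.4 onward and may not be the exact API the articles covered. The class name, method name, and sample sentence are invented for the example. It only splits a string into directional runs; selection and rendering are a layer above this.

```java
import java.text.Bidi;

class BidiRunsSketch {

    // Splits text into directional runs and reports each one as
    // LTR[start,limit) or RTL[start,limit), using character offsets.
    static String describeRuns(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            // Odd embedding levels are right-to-left, even levels left-to-right.
            boolean rightToLeft = (bidi.getRunLevel(i) % 2) == 1;
            if (i > 0) {
                out.append(' ');
            }
            out.append(rightToLeft ? "RTL" : "LTR")
               .append('[').append(bidi.getRunStart(i))
               .append(',').append(bidi.getRunLimit(i)).append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // English sentence with the Hebrew word "shalom" embedded in it.
        String mixed = "He said \u05E9\u05DC\u05D5\u05DD to me.";
        Bidi bidi = new Bidi(mixed, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        System.out.println("Mixed directions? " + bidi.isMixed());
        System.out.println(describeRuns(mixed));
    }
}
```

A selection that starts in the English and ends inside the Hebrew crosses a run boundary, which is why the logical (in-memory) order and the visual order of the selected characters stop matching.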
Complexity is Easy. Simplicity is Hard.