As a number of people have mentioned, Internationalization and localization can be an incredibly complex process.
Since you are working with an existing system, you don't have the option of designing in I18N support from the very beginning.
Get a good book.
I recommend "XML Internationalization and Localization" by Yves Savourel, and "Beyond Borders web globalization strategies" by John Yunker. Both the authors have been in the I18N business a long time. They know what they are talking about.
Choose your tools wisely.
Use MySQL 4.1 (or newer) --
Since MySQL 4.1, you have the option of choosing which character set to use on a per-DB, per-table, or per-field basis. The simplest solution is to just make the entire DB use the UTF-8 character set (though this may not be appropriate for optimization or other reasons).
Learn about Unicode/UTF-8. (Others have provided links)
Store your localized data in UTF-8. Using a single character set makes life much easier.
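To make the single-character-set point concrete, here is a minimal sketch (in Python, purely for illustration; the sample strings are made up). It shows that one encoding can hold text from several scripts, and that the byte size differs from the character count, which matters anywhere a length limit is measured in bytes.

```python
# Illustrative sample strings in several scripts (not taken from any real site).
samples = {
    "en": "Search",
    "de": "Übersicht",
    "ja": "検索",
    "ar": "بحث",
}


def utf8_profile(text):
    """Return (character count, UTF-8 byte count) for a string."""
    encoded = text.encode("utf-8")
    # Round-trip check: decoding must give back the original text.
    assert encoded.decode("utf-8") == text
    return len(text), len(encoded)


for locale, text in samples.items():
    chars, size = utf8_profile(text)
    print(f"{locale}: {chars} characters, {size} bytes")
```

ASCII text costs one byte per character, while the other scripts here cost two or three, so size any byte-limited storage with that in mind.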
Use a fairly recent version of PHP --
PHP 4.1.1 (or newer) comes bundled with GNU Gettext.
GNU gettext
http://www.gnu.org/software/gettext/ You probably don't need to download it, since it should be included with your version of PHP. Just enable it in the php.ini, or compile it in from source.
GNU Gettext has been around for a number of years. It's fairly efficient, well maintained and has a large user base. It basically maps a reference ID and a language-locale to a string of text. It replaces the ID with the appropriate text in your template to create a finished document. Text for each language-locale is stored in a separate file called a PO file.
You will also want a PO file editor.
Here are a few articles on GNU Gettext:
http://www.phpdig.net/ref/rn26.html
http://www.onlamp.com/pub/a/php/2002/06/13/php.html
http://www.uberdose.com/php/php-and-gettext-for-i18n/
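The ID-to-text mapping described above can be sketched roughly like this. This is not the gettext API itself, just an illustration of the lookup idea in Python. The in-memory `CATALOGS` dict stands in for PO files, and the `{{ID}}` placeholder syntax and the `translate`/`render` names are my own assumptions. The one behaviour borrowed from real gettext is the fallback: when no translation exists, you get the original ID back.

```python
import re

# Hypothetical in-memory catalogs standing in for per-locale PO files.
CATALOGS = {
    "fr_FR": {"Welcome": "Bienvenue", "Search": "Rechercher"},
    "de_DE": {"Welcome": "Willkommen"},
}


def translate(msgid, locale):
    """Return the text for msgid in locale, falling back to msgid itself."""
    return CATALOGS.get(locale, {}).get(msgid, msgid)


def render(template, locale):
    """Replace each {{ID}} placeholder in the template with its translation."""
    return re.sub(
        r"\{\{(.+?)\}\}",
        lambda match: translate(match.group(1), locale),
        template,
    )


print(render("<h1>{{Welcome}}</h1> <a>{{Search}}</a>", "fr_FR"))
print(render("<h1>{{Welcome}}</h1> <a>{{Search}}</a>", "de_DE"))
```

Note how the German output keeps the untranslated "Search". A missing string degrades to the source text instead of breaking the page, which is handy while translations are still trickling in.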
If you are going to be using professional translators, you may want to consider XLIFF as a document exchange format. There are XLIFF to PO converters available.
You may be considering XML (XHTML, XSLT and XLIFF) for internationalization. The PHP solution, using Sablotron, is not yet fully baked. I would avoid it for a production system, though it shows promise for the future. Plus, XLIFF is not recommended as a storage format. You'll probably find some performance issues if you try to use it as a direct data store.
Use templates, if at all possible.
You may not be able to use the same template for all language-locales, but one template should work for most cases. If you have a bidirectional (BiDi) language, for example Arabic or Hebrew, you will likely need a separate template.
Localize your CSS stylesheets.
You may have locale-specific layout and formatting information in your stylesheets.
From a design point of view, consider using a combination of a Front Controller pattern to switch languages and a Page Controller pattern to apply the templates.
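Roughly, the Front Controller / Page Controller split could look like the sketch below (Python for illustration only). The URL scheme with a leading locale segment, the locale list, and all function and variable names are assumptions of mine, not anything from your system.

```python
SUPPORTED = ("en", "fr", "ar")   # hypothetical list of supported locales
DEFAULT_LOCALE = "en"
RTL_LOCALES = {"ar", "he"}       # right-to-left languages

TEMPLATES = {
    "article": "<html lang='{lang}' dir='{dir}'><body>{body}</body></html>",
}


def article_page(locale, slug):
    """Page controller: builds one page for an already-chosen locale."""
    direction = "rtl" if locale in RTL_LOCALES else "ltr"
    body = f"[{locale}] article: {slug}"  # placeholder for real localized content
    return TEMPLATES["article"].format(lang=locale, dir=direction, body=body)


PAGES = {"article": article_page}


def front_controller(path):
    """Front controller: resolves the locale from the URL, then dispatches."""
    parts = [p for p in path.split("/") if p]
    locale = DEFAULT_LOCALE
    if parts and parts[0] in SUPPORTED:
        locale = parts.pop(0)
    if len(parts) != 2 or parts[0] not in PAGES:
        return "404 Not Found"
    page, slug = parts
    return PAGES[page](locale, slug)


print(front_controller("/ar/article/intro"))
print(front_controller("/article/intro"))
```

The point of the split is that language selection lives in exactly one place, so the page controllers never have to guess which locale they are rendering for.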
Where are you storing the article data? Is it in the MySQL DB, or is it in static files that are referenced by the DB? Focus most of your efforts on the part that is most critical, MySQL if most of the data is in the DB, or PHP if most of the data is static. But remember, you are going to have to internationalize both parts of your system.
Don't forget, text in many other languages takes up more space than English to say the same thing. Sometimes 30-50% more space. This can significantly impact layout in heading sections, column widths, a
I don't really see why people haven't mentioned it earlier, but large-scale systems development is hard.
We're not even talking about building something new here. It's re-implementing a 40-year-old legacy system, maintaining legacy data, and adding new features to bring it into the 21st century. The software engineering task is mind-bogglingly huge.
The IRS's problem, and others like it (FAA - Air Traffic Control System), have been going on for a long time. For a little history, check out the Risks Forum Digest.
June 1991 - http://catless.ncl.ac.uk/Risks/11.92.html#subj5.1
March 1996 - http://catless.ncl.ac.uk/Risks/17.96.html#subj4.1
February 1997 - http://catless.ncl.ac.uk/Risks/18.81.html#subj2.1
April 1998 - http://catless.ncl.ac.uk/Risks/19.68.html#subj2.1
Regarding the IRS doing the development in-house -- While on the surface it may seem like a good idea, in practice it isn't. The IRS does have the domain expertise. But they don't have the experience with large-scale systems development. This is a situation where outsourcing to one of the large consultancies or gov't contractors really is a good idea. The challenge is to have effective communication between the experts at the IRS and the developers, not just during requirements gathering, but throughout the software development process.
Coffee mug - for obvious reasons. Can also be used for tea, pencils, paper-clips, etc.
Champagne glasses (set of 2) - You may not need them often. But you'll be glad you've got them when you do.
The term is "disintermediation".
Removing the intermediaries from the development and distribution process. Those of you who remember the early days of the Web will recall that this is what the Internet is all about.
All the efficiencies that will be gained. All the money that will be made. All by getting rid of the middle-man.
I read somewhere that, if current demographic trends continue, by 2005 there will be more lawyers than people.
One item we changed comes to mind immediately - the rubber feet on Inspiron 7000's were originally made of a material that marked nearly every surface we set them down on. Many people had multiple black spots and marks where the systems sat on their desk. Ick.
My favorite question for pushy computer salesmen who seem to know everything: Does your computer come equipped with the latest RFT? Rubber Foot Technology is clearly important.
But the trick to good estimating is to build enough flexibility into the schedule.
Understand that requirements will change. Build in a change factor. If the change(s) requested by the customer are so big that they would bust the schedule, let the customer know. Push back if necessary.
Don't forget to factor in the things most people forget: testing, integration, education, marketing issues, personal and vacation time. All of these can impact the schedule and should be considered.
Lastly, remember that a schedule is a living document. Estimates become more refined as more information becomes available. That task you had no clue how to accomplish, and initially estimated at 2 months, may be down to 3 weeks after a little research and prototyping. That's fine.
As many people have said, risk management is key. If you expect some parts of the estimate to be off, try to balance them with parts of the estimate that are known to be fairly solid. You can build in a fudge factor, but you should have some idea of what you're planning to use it for.
With completely unknown projects, all bets are off.
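For what it's worth, here is a back-of-the-envelope sketch of the arithmetic (in Python; every number, task name and the 15% change factor is invented for illustration, so plug in your own). Each task gets a low and a high estimate in weeks, the commonly forgotten items go in as tasks of their own, and the change factor pads both ends of the range.

```python
def schedule_range(tasks, change_factor=0.15):
    """Sum (low, high) week estimates and pad both ends by change_factor."""
    low = sum(lo for _, lo, _ in tasks)
    high = sum(hi for _, _, hi in tasks)
    return low * (1 + change_factor), high * (1 + change_factor)


# Hypothetical task list: (name, low estimate, high estimate), in weeks.
tasks = [
    ("core feature", 4, 6),
    ("unknown integration", 3, 8),   # wide range: nobody has done this before
    ("testing", 2, 3),
    ("vacation and training", 1, 1),
]

low, high = schedule_range(tasks)
print(f"initial estimate: {low:.1f} to {high:.1f} weeks")

# After some prototyping the unknown task is better understood, so refine it.
tasks[1] = ("unknown integration", 3, 4)
low, high = schedule_range(tasks)
print(f"refined estimate: {low:.1f} to {high:.1f} weeks")
```

The gap between the low and high totals tells you where the risk sits, and it should shrink as the unknowns get researched.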
You say "Potato", I say "Potatoe".
Well, speaking as someone with a BA in CS who did tech support as my first job out of college, I'll tell you it was a wonderfully enlightening experience.
No, I don't like being yelled at by someone who finally got angry and frustrated enough to actually call Tech Support. You've really got to want help before you'll go through the hell of the voice menu system. Just let 'em vent, and get on with their question.
What I did learn was what the users really want, and what problems they are having. This information is very useful in software development and user interface design.
Maybe I've read too many RTFM/(l)user comments in this thread. The customer is always wrong. But could it be, just possibly, that the products *we* are putting out there suck? Could it be that if you have to RTFM then that means that there might be something wrong with the design?
Sure, computers are complex. Not everything can be done without some training or reading a manual. But if people are buying computers as commodities, or people are trying to use software out of the box without reading the manual, then let's design hardware and software that allows them to do that.
I can't begin to count the number of times when faced with a user's problem, I had to say, "Well, it's because Windows was designed like that." A little empathy about the problem. Explain that it could have been, and probably should have been better designed.
We can't put all of the blame on (l)users. Sure, some didn't do their homework, and some aren't safe on the roads, much less with a computer. But at least part of the fault rests with us as software and hardware developers. We're supposed to know better. And we're in the best position to fix the stupid design problems. A little user testing goes a long way to finding problems, folks.
Oh yeah, did I mention that good design is a hard, challenging problem? We can do that. Fix all the most basic design problems. That'll leave the complicated, interesting problems for the Tech Support people, and make their lives more fun.
Hehe...Anyone sell anti-TEMPEST blankets for a PDA? :)
Yeah, aluminum foil. :-)
There is a generic term. It's "unix". If you want to differentiate between open source vs. proprietary, that's another story.
Being humble is important. It doesn't mean you have to just sit back and watch all the time. It also doesn't mean that you don't have value. You have to find value in yourself and self-satisfaction in your work. You can't always wait for recognition from others. Sometimes you have to pat yourself on the back, because no one else is going to do it for you.
Understand that what you do know has value. You noted specific ways in which you contribute to the company you work for. These are important. But these are only part of the picture. Equally important is knowing what you don't know. Put yourself in somebody else's shoes. What is the Manager's perspective? What is the customer's perspective? What valuable contributions do they bring to the table? Learn from them.
The world is not a perfect place (unfortunately). Developing software is not a perfect process (especially if you've got to sell it and make a profit in competition with others). Compromises need to be made. It is best that they be informed decisions. You can contribute by providing some of that information. But, in all likelihood, the final decision is going to be made by somebody else with priorities different from yours. Respect that they know things you don't. Help contribute to reaching the goal. Be humble. People will respect you for it.