Slashdot Mirror


Microsoft Forced To Translate Office Into Nynorsk

An anonymous reader writes "Beeb reports, "The main organisation working for the Nynorsk language got most of Norway's high schools to threaten to boycott all Microsoft software if they didn't come up with a New Norwegian version of Office." Which brings up questions for Open Source developers: What's involved in translating programs? Is there a process that can be followed to make the inevitable easier? Is there a group providing guidelines for this already? -- Do you work in program translation? Step up and do tell."

15 of 303 comments

  1. Well.. by Lord+Bitman · · Score: 4, Interesting

    Quite simply, keep all your text in a separate file which can be compiled completely separately from the rest of your project. The same goes for Dialogs, Menus, and Labels. This primarily makes it easier to allow users to switch from one language to another.
    There really isn't that much that can be done other than that. What do you want us to say? Break your descriptions into simple enough language that some automatic translator can spit something out? I don't think so. Your best bet is to just keep all your text in one place [aside from debugging messages or other things that the user is never supposed to see] so you won't have to go looking around for [and potentially miss] it when the time comes. Don't you hate it when the whole program is translated except for the one error message that it keeps giving you? :)
    Of course, documentation is a different story. Nothing you can do there except keep everything very well documented so that there will be less confusion in translation. If it's a complete idea instead of a quick phrase thrown out, it's more likely to be translated correctly.

    --
    -- 'The' Lord and Master Bitman On High, Master Of All
    1. Re:Well.. by Jim+Hall · · Score: 2, Interesting

      Quite simply, keep all your text in a separate file which can be compiled completely separately from the rest of your project. The same goes for Dialogs, Menus, and Labels. This primarily makes it easier to allow users to switch from one language to another.

      This is called a "message catalog", by the way. It's the easiest way for almost any program to support internationalization ("I18N" = "I" + 18 letters + "N".)

      On most commercial UNIX systems, the preferred library is catgets(). On Linux (GNU) systems, the preferred library is gettext(). In the FreeDOS Project we wrote an implementation of catgets(), called Cats, because it turns out to be quite easy to write. There's also another library for FreeDOS called MSGLIB that does the same thing.

      What it all comes down to is keeping every string the program prints in the "message catalog". catgets() or gettext() is just a way to retrieve the string you want from the catalog for the current language setting (the LANG env variable under UNIX.) catgets() references each catalog by a number, and each string in the catalog by a "set" number and a "message" number, so you have three points of identification. gettext() is more complicated, and searches all open catalogs based on the untranslated string.
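      The two lookup styles described above can be sketched roughly as follows. This is Python rather than C, the catalogs are hypothetical in-memory dicts (the real catgets() and gettext() read catalog files from disk), and the function names are made up for illustration.

```python
# Hypothetical in-memory catalogs; real catgets()/gettext() read catalog files.
CATGETS_STYLE = {
    "en": {(1, 1): "File not found", (1, 2): "Permission denied"},
    "es": {(1, 1): "Archivo no encontrado"},
}

GETTEXT_STYLE = {
    "es": {"File not found": "Archivo no encontrado"},
}


def language_from_lang(lang_value):
    """Reduce a LANG value such as 'es_MX.UTF-8' to its language code 'es'."""
    return lang_value.split(".")[0].split("_")[0]


def lookup_by_number(lang, set_id, msg_id, default):
    """catgets-like: a string is identified by (set, message) numbers.

    The caller passes a default that is returned when the catalog has no entry.
    """
    return CATGETS_STYLE.get(lang, {}).get((set_id, msg_id), default)


def lookup_by_text(lang, untranslated):
    """gettext-like: the untranslated string itself is the lookup key."""
    return GETTEXT_STYLE.get(lang, {}).get(untranslated, untranslated)
```

      In both styles a missing translation falls back to the original string, so a half-finished catalog still gives a usable program.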

      Since I've supported I18N using catgets() in my programs, it's been really easy to keep my Free software / open source programs up to date because volunteers from around the world will email me the message catalog for my programs, translated into their language. I just add the catalog to my distribution, and that's all I have to do to support the new language.

      Of course, you also have to keep in mind the locale (monetary symbols, "." or "," as "decimal point", ...) and character set. :-)
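      As a rough illustration of the locale point, here is a Python sketch that formats the same amount under two conventions. The CONVENTIONS table and format_money() are hypothetical and hard-coded for the example; a real program would take these values from the system locale database. It assumes non-negative amounts.

```python
# Hypothetical per-locale conventions, hard-coded for illustration only.
CONVENTIONS = {
    "en_US": {"decimal": ".", "thousands": ",", "currency": "$", "symbol_first": True},
    "de_DE": {"decimal": ",", "thousands": ".", "currency": "€", "symbol_first": False},
}


def format_money(amount, locale_name):
    """Format a non-negative amount with the locale's separators and symbol."""
    conv = CONVENTIONS[locale_name]
    # Start from the "1,234.50" shape, then swap in the locale's separators.
    whole, frac = f"{amount:,.2f}".split(".")
    whole = whole.replace(",", conv["thousands"])
    number = f"{whole}{conv['decimal']}{frac}"
    if conv["symbol_first"]:
        return f"{conv['currency']}{number}"
    return f"{number} {conv['currency']}"
```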

      Oh, and supporting double-byte character sets (Chinese, ...) is different.

      -jh

  2. Before you gloat.... by first+axiom · · Score: 2, Interesting

    I'd like to point out that Microsoft usually does a great job of translating to other languages. Here in Mexico, Age of Empires was the hit multiplayer game. Everyone played it and nothing else. Why? It was the only game of its kind translated to Spanish.

  3. My success... by scorp1us · · Score: 3, Interesting

    I wrote a program that was translated into 5 languages. Fortunately, all of them fit in a single-byte character set, so no multi-byte char issues were present.

    I came up with an enum file that held lines like:
    enum phrases{
    IDL_YES=0,
    IDL_NO,
    IDL_MAX_PHRASES};

    Then a file for each language:
    English.dic:
    Yes
    No

    Spanish.dic:
    Si'
    No

    etc... At runtime it loaded the last language configured or defaulted to English.

    I also added support so you could use %s, %d, %x etc, so you can use them in sprintfs. It worked damn well. No need to re-compile. Just drop another .dic file in, have a dialog that at runtime looks for .dic files, and you're done.

    It worked extremely well. The only thing it could ever need was multi-byte support, but as I said before that was not a requirement.

    PLEASE PLEASE stay away from the way that MS Dev Studio does it. It sucks ass.

    Incidentally, the same class (I used a class when I could use C++) also works well for handling various dialects of SQL. MSSQLServer.dic, PostgreSQL.dic, etc....

    Very simple and fast.
    The only pain is that you have to come up with a unique IDL_name for each string. I'd like to have an associative array so you could say
    IDL("Yes") and have that translated. That was the next step for me, but I never got the time to do that.
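    A rough sketch of the scheme described in this comment, in Python rather than the original C/C++. The Phrase enum mirrors the IDL_* idea, the two strings stand in for the contents of English.dic and Spanish.dic, and the Dictionary class, its method names and the FILES_COPIED phrase are all assumptions made for illustration, not the poster's actual code. The idl() method is one guess at how the IDL("Yes") associative lookup could work.

```python
from enum import IntEnum


class Phrase(IntEnum):
    """Mirrors the IDL_* enum: each value is a line index into a .dic file."""
    YES = 0
    NO = 1
    FILES_COPIED = 2


# Stand-ins for the contents of English.dic and Spanish.dic (hypothetical).
ENGLISH_DIC = "Yes\nNo\nCopied %d files to %s\n"
SPANISH_DIC = "Sí\nNo\nSe copiaron %d archivos a %s\n"


def parse_dic(text):
    """A .dic file holds one phrase per line, in enum order."""
    return text.splitlines()


class Dictionary:
    def __init__(self, phrases, fallback=None):
        self.phrases = phrases
        # Associative lookup keyed on the fallback (English) text.
        self.by_text = dict(zip(fallback or phrases, phrases))

    def get(self, phrase_id, *args):
        """Look up by enum id and apply printf-style arguments, if any."""
        template = self.phrases[phrase_id]
        return template % args if args else template

    def idl(self, english_text):
        """Look up by English text; unknown text passes through unchanged."""
        return self.by_text.get(english_text, english_text)
```

    Keying on the English text removes the need to invent a name per string, but two identical English strings then share one translation, which is not always what a translator wants.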

    Hope that helps!

    --
    Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
  4. nynorsk is irritating.. by Anonymous Coward · · Score: 5, Interesting

    Norway has two official written languages: the one used by the majority of the people, called bokmål, and another one called nynorsk. Not that they are two separate languages or anything.. it's sort of like the difference between British English and American English, only a little more. This is because we were for quite a time, many years ago, in a union with Denmark, and when the union broke, many Norwegians felt they needed something that would separate them a little from Denmark (as Denmark had been the bigger brother in the union, so to speak). Ivar Aasen roamed the countryside and created a new language on the basis of the many dialects Norwegians spoke throughout the country.. this was the birth of nynorsk. However, nynorsk never prevailed, and now we're stuck with two languages.. much to the dismay of many Norwegian students, because although very, very few speak nynorsk in the big cities, you still have to take exams in both languages. In some areas, though, many speak nynorsk.. or at least something close to it. No one really speaks exactly as bokmål and nynorsk are written. Close, but not quite.

  5. Microsofts refusal by kyrre · · Score: 3, Interesting

    I read some years ago that Microsoft refused to make a 'nynorsk' version due to the high development cost. $3 million, they claimed. A high price compared to the income they could expect from the small minority that uses 'nynorsk' in Norway.

    This price seemed a bit too much to me. The two Norwegian written languages differ little in actual grammar and sentence building, so word-by-word replacement should do most of the trick.

    KDE and Gnome and their office-like replacement apps have been available in both languages for a long time.

    Guess the threat of working open source alternatives has forced MS into submission.

    An open source project called Skolelinux (School Linux) is on its way to creating a replacement for Windows for use in Norwegian schools, threatening the current MS monopoly on Norway's educational system.

  6. Localizing Perl applications... by Anonymous Coward · · Score: 1, Interesting
    What's involved in translating programs? Is there a process that can be followed to make the inevitable easier? Is there a group providing guidelines for this already? -- Do you work in program translation? Step up and do tell.

    I gave a presentation at OSCon 2002 and a paper at ICOS 2002 that addressed these problems in the context of Perl-based web applications. The paper is also available in Chinese.

  7. Re:That's why having resources in files is helpful by MonoSynth · · Score: 2, Interesting

    And what about the difference in length of words between languages? Some examples between English and Dutch:

    File - Bestand
    Edit - Bewerken
    Tools - Gereedschappen
    Cancel - Annuleren

    If you make a very slick interface for one language, it can be completely fsck'd up in another language. Buttons need to be bigger, menubars don't fit anymore, and so on.

    Especially in cheaper software, they use very strange constructs to make words fit when translated to non-English languages, like removing the middle part of a word and replacing it with an apostrophe.
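    One way to catch this before release is to check translated labels against the space the layout allows. A minimal Python sketch using the English and Dutch words listed above; the LABELS table, the function names and the character budget are assumptions for illustration, and a real check would measure rendered pixel width rather than character count.

```python
# Labels from the English/Dutch examples above, keyed by a made-up identifier.
LABELS = {
    "en": {"file": "File", "edit": "Edit", "tools": "Tools", "cancel": "Cancel"},
    "nl": {"file": "Bestand", "edit": "Bewerken",
           "tools": "Gereedschappen", "cancel": "Annuleren"},
}


def labels_over_budget(lang, max_chars):
    """Return the identifiers whose label is longer than the layout allows."""
    return sorted(key for key, text in LABELS[lang].items() if len(text) > max_chars)


def expansion_ratio(key, source="en", target="nl"):
    """How many times longer the translated label is than the source label."""
    return len(LABELS[target][key]) / len(LABELS[source][key])
```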

  8. Switching to OpenOffice by anarchima · · Score: 3, Interesting

    Well, the situation in Norway is quite interesting, because there is already a switch from Microsoft licenses to Linux in the education system. In fact, the state has sponsored a project called "Skolelinux" (SchoolLinux), where Norwegian/Nynorsk/Sami language editions are being made based on the Debian operating system. One of the reasons it was started was obviously the lower cost, but another was the ability to have more native language output. The site is at www.skolelinux.no but I think it's only in Norwegian...

  9. UNICODE and string tables by videodriverguy · · Score: 5, Interesting

    To fully support all languages, including Asian ones, there really is no alternative to the UNICODE format. That, and sticking to the use of tables for strings, menus etc.

    One of the major things Microsoft got right some time ago was realizing this - hence for most of their products a different resource file is all that's needed to support another language (I'm ignoring help files etc.). IMHO, it's a great pity that the Linux system didn't realize this earlier (especially as it was written in a non-English-speaking country).
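    A small Python sketch of why multi-byte text needs care: once strings are Unicode, character count and byte count stop being the same thing, so byte-oriented truncation can cut a character in half. The sample string and the truncate_utf8() helper are made up for illustration.

```python
TEXT = "文件"  # "File" in Chinese: two characters


def truncate_utf8(text, max_bytes):
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    # errors="ignore" drops the incomplete byte sequence left at the cut point.
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


def naive_truncate(text, max_bytes):
    """Byte-oriented cut; raises UnicodeDecodeError if it splits a character."""
    return text.encode("utf-8")[:max_bytes].decode("utf-8")
```

    Here len(TEXT) is 2 while the UTF-8 encoding is 6 bytes long, so a buffer sized by character count would be too small.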

    Since I'm currently working in China, this has become a very important issue, more so to me because I am designing a natural language scripting tool that has to understand both Chinese characters and syntax. Whilst we may find some translations by the Chinese into English funny, it's just because English (to them) is as foreign as Chinese is to us. All of us English speakers should realize that just because C/C++/Python etc. make sense to us, they don't to others. It's just not reasonable to say, well, if you want to learn programming, then you must learn English first.

    1. Re:UNICODE and string tables by Pentomino · · Score: 2, Interesting

      Strangely enough, the Ruby language was designed in Japan, by a Japanese person. The language is in English and makes a great deal of sense. It may help that the creator of the language is proficient in English, but the language's local popularity may have more to do with the idea that the world takes it for granted that people program in English. On the one hand, it's only fair that people should be able to program in their native language. On the other hand, Microsoft translates Visual Basic into other languages, and the result is said to not always work well. I remember a Swedish-speaking Finn telling me the horrors of having to program in Finnish Visual Basic. Then there's Perligata.

  10. Answer is simple! Plugins! by RyuuzakiTetsuya · · Score: 3, Interesting

    Language packs. Have each prompt and piece of text be dynamically linked to an external language pack. Either integrate it at compile time, in which case simply copying in a new language pack and recompiling will do, or have the program load it on the fly. I know this is being done in several projects, including the emulator Kawaks...

    --
    Non impediti ratione cogitationus.
  11. What about English outside the U.S. by Lord_Scrumptious · · Score: 2, Interesting

    Why can't Microsoft translate its software and operating systems so they use the correct spelling for other English-speaking countries? The UK, Canada, Australia, and New Zealand all use what's often referred to as International English, where spelling differs from U.S. English. Examples: Colour (not color), Favourites (rather than favorites), Network Neighbourhood (rather than neighborhood).

    For all their expertise in internationalisation, it seems that Microsoft still can't manage this. Is it a question of cost and convenience? Some of their more specialised software, such as Encarta, has been properly localised, but probably because they promote this heavily as a resource for schools. How many U.S. users would be happy with an operating system and applications that used, say, UK spellings? Not many I'd venture to guess. But it's not just Microsoft, the last time I installed Mandrake Linux, the default install only offered U.S. English.

  12. The problems I encountered with a translation by LeftOfCentre · · Score: 5, Interesting

    I translated Uropa 2 - The Ulterior Colony, an Amiga game, to Swedish on behalf of Vulcan Software.

    One thing that I seem to remember causing problems was that individual words in the separate translation file were sometimes reused in multiple places, with assumptions being made about where that could happen based on what works in the English language. That is a definite no-no. Don't assume that an English word which can mean several things also has a single matching word in a foreign language.

    Also, don't assume that foreign languages have an easy way to change between singular and plural or that as in English, there is only one article for all nouns.

    In conclusion, always give the translator the option to choose the exact wording based on the context -- even if that means that the English (or whichever is the original language of your software) version of the resource file has many words duplicated. What works in one place may not work in another, even if that is the case with your language.
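    The advice above could be sketched like this in Python. The catalogs, context names and function names are hypothetical; the point is only that the lookup key includes where the string is used, and that singular and plural are stored as whole phrases instead of being assembled from parts.

```python
# Hypothetical catalog keyed by (context, source text), so the same English
# word can get a different translation in each place it is used.
SWEDISH = {
    ("menu", "Open"): "Öppna",         # verb: open a file
    ("door-status", "Open"): "Öppen",  # adjective: the door is open
}

# Whole phrases per count, so the translator controls word form and order.
PLURALS = {
    "en": {"doors_found": ("Found %d door", "Found %d doors")},
    "sv": {"doors_found": ("Hittade %d dörr", "Hittade %d dörrar")},
}


def translate(catalog, context, text):
    """Look up text in a given context; fall back to the source text."""
    return catalog.get((context, text), text)


def plural(lang, key, n):
    """Pick the singular or plural phrase and fill in the count."""
    singular, many = PLURALS[lang][key]
    return (singular if n == 1 else many) % n
```

    This only covers languages with one singular and one plural form; languages with more plural categories would need a per-language rule for choosing among them.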

    1. Re:The problems I encountered with a translation by Hank+Powers · · Score: 2, Interesting

      In Finnish that really is a problem since we don't have any articles or prepositions at all. E.g. the Finnish for "use the mouse to ping Microsoft's server" is "käytä hiirtä (hiiri=mouse) pingataksesi (pingata=ping) Microsoftin palvelinta" (palvelin=server). Microsoft hasn't quite grasped this, and because of that their localization team has had to take a shortcut. In some places they're using the word object ("kohde") in conjugated form. E.g. "use the mouse to ping the object Microsoft's server", which is in Finnish "käytä hiirtä pingataksesi kohdetta Microsoftin palvelin". Sounds quite lame to me.

      --
      hapo