Slashdot Mirror


Changing Your Filesystem's Locale?

dybdahl asks: "Now that Red Hat has changed the default character set to be UTF-8, none of the existing filenames that included local characters like æ, ø, å, (Denmark) are handled correctly by Konqueror or can be seen correctly with "ls" in a shell. Is there a tool out there that can convert an ISO8859-1 ext3 filesystem to UTF-8?"

15 comments

  1. Convert what? by Sam+Lowry · · Score: 3, Informative

    The filesystem has been storing the filenames in UTF-8 for ages. What you have to do is make sure there is iocharset=utf-8 in the mount options in /etc/fstab.

    In general, man mount helps a lot.

    1. Re:Convert what? by cyberkreiger · · Score: 3, Informative

      According to "man mount", "iocharset" is an option available to filesystems (v)fat, iso9660, and ntfs only. It's also available for smbfs.

      --
      Stumbling in the dark
      I hear slavering of jaws
      Eaten by a grue.
    2. Re:Convert what? by amorsen · · Score: 4, Informative

      The filesystem thought it was using UTF-8 filenames. That is what the specification says it should use. However the unfortunate poster has used ISO-8859-1 (or -15) file names. Therefore he now has a file system that does not conform to the standard, and of course he wants to do something about it.

      --
      Finally! A year of moderation! Ready for 2019?
  2. Did you look at freshmeat? by cyberkreiger · · Score: 5, Informative

    I think convmv may be what you're looking for.

    --
    Stumbling in the dark
    I hear slavering of jaws
    Eaten by a grue.
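
What a convmv-style conversion does can be sketched in a few lines of Python. This is only an illustration, not convmv's actual implementation: the function name `latin1_names_to_utf8` and its `dry_run` parameter are made up here, and it assumes a Linux filesystem that stores filenames as raw bytes. Names that already decode as UTF-8 (including plain ASCII) are left alone; everything else is read as ISO-8859-1 and renamed to its UTF-8 form.

```python
import os


def latin1_names_to_utf8(root, dry_run=True):
    """Rename entries under root whose names are not valid UTF-8,
    interpreting the old name as ISO-8859-1.

    Returns the list of (old_path, new_path) pairs, as bytes.
    Hypothetical helper for illustration only.
    """
    root = os.fsencode(root)  # work on raw bytes, not decoded names
    planned = []
    # Walk bottom-up so children are renamed before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
                continue  # already valid UTF-8, nothing to do
            except UnicodeDecodeError:
                pass
            new_name = name.decode("iso-8859-1").encode("utf-8")
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            planned.append((old_path, new_path))
            # Never overwrite an existing file with the converted name.
            if not dry_run and not os.path.exists(new_path):
                os.rename(old_path, new_path)
    return planned
```

Calling it with the default `dry_run=True` only reports what would be renamed, so the plan can be checked before anything is touched. As far as I recall, convmv behaves similarly and needs an explicit flag before it really renames, but check its man page rather than take that from me.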
  3. during install by Apreche · · Score: 2, Interesting

    I've been trying shitloads of distros lately (journal has more info). And despite other problems, all of them have asked me what my locale is, what character sets I want to support, and all that kind of stuff. I must say, if there is one thing that is more trouble in Windows than in *nix, it's internationalization. As with everything though, there is a config file somewhere and a package to install.

    --
    The GeekNights podcast is going strong. Listen!
  4. Re:Have you tried... by Anonymous Coward · · Score: 0

    format c:? In Linux? Wouldn't do much, would it?

  5. Re:Have you tried... by Tesseract · · Score: 0, Offtopic

    perhaps you should try rm -rf /mnt/Windows?

    --
    Show me what you want, and I'll show you how to get along without it...
  6. Change the RedHat Default by SpaFF · · Score: 2, Informative

    Ok, so RedHat makes the default charset UTF-8. Just change the default to ISO8859-1. It's like a 2-line change in /etc/sysconfig/i18n. I had to do a similar change when we switched our mailserver to RH8 because early versions of SpamAssassin (more specifically Perl, though, I think) didn't like playing with UTF-8.

    -Lee

    --
    -----BEGIN GEEK CODE BLOCK----- Version: 3.12 GIT d? s: a-- C++++ UL++++ P++ L+++ E- W++ N o-- K- w--- O- M+ V PS+ P
  7. What they should do by spitzak · · Score: 2, Insightful

    Forget all this nonsense about "locales". It is obvious there are exactly 2 "locales" of interest, UTF-8 and ISO-8859-1. Now, surprisingly enough, these can co-exist almost perfectly, so there can be *one* "locale" and we can be rid of all these brain-dead attempts at i18n.

    What systems should do is treat all streams of bytes as UTF-8, with the additional rule that all sequences of bytes that are not legal UTF-8 (including a Unicode value encoded with more bytes than necessary) should be treated as individual bytes in ISO-8859-1. It turns out that you need three accented characters in a row, or a capitalized accented character followed by a foreign punctuation mark, for an ISO-8859-1 string to be confused with UTF-8.

    I very much believe this works, although I think a search should be done through lots of ISO-8859-1 text to find out if there are any common sequences that are confused with UTF-8.

    Even if this is not a perfect solution, it certainly is better than the current scheme. Most filenames will be readable. More importantly it gets rid of the idea of an "error" in a character string, significantly simplifying the interfaces.
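
The decoding rule described above can be tried out with a custom error handler in Python. This is a minimal sketch under stated assumptions, not an existing system interface: the handler name `latin1_fallback` and the function `decode_mixed` are invented for illustration. Python's UTF-8 decoder already rejects overlong encodings, so those bytes also take the ISO-8859-1 path.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret bytes that are not legal UTF-8
    as individual ISO-8859-1 characters, then resume decoding."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return bad.decode("iso-8859-1"), err.end


# "latin1_fallback" is a name chosen for this sketch, not a built-in handler.
codecs.register_error("latin1_fallback", _latin1_fallback)


def decode_mixed(data: bytes) -> str:
    """Decode as UTF-8, falling back to ISO-8859-1 byte by byte."""
    return data.decode("utf-8", errors="latin1_fallback")
```

With this, both `"blåbær".encode("utf-8")` and `"blåbær".encode("iso-8859-1")` decode to the same string, and no input raises an error. The known weak spot remains: the ISO-8859-1 bytes for "Ã©" are also a legal UTF-8 encoding of "é", so `decode_mixed` returns "é" for them.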

  8. This can also happen by djeca · · Score: 1

    when moving files from a filesystem on the ISO-8859-15 charset to one on the UTF-8 charset - say vfat to ext3.

    I know.

    Luckily there were only about 12 files (courtesy of a recent trip to Sweden) and mv-ing them wasn't too tricky.

    Any more and I would have got seriously frustrated, and probably ended up writing convmv myself.

  9. Can you believe this !! by Anonymous Coward · · Score: 0
  10. Important Question by Anonymous Coward · · Score: 0

    Sometimes when I'm on a road trip I get the urge to take a shit. I'm not talking about your garden variety shit, either. These are the "I've been eating nothing but Taco Bell and Arby's for 3 days" type of shits, where you know that when you finally achieve the glorious release, you're going to be splattering liqui-poo all over the floor tiles of whichever poor establishment has the godawful fate of being conveniently situated at the next exit. My question is, does travel etiquette require that I patronize the establishment (perhaps buying a Coke, or an order of fries) after turning their restroom into a veritable no-man's-land of stench and brownness, or am I free to leave immediately after my fecal Holocaust?