Slashdot Mirror


Any rxvt-Sized Unicode-Aware Terminal Emulators?

Viqsi writes "Just on a lark, I started a short while back to try to convert my environment totally to UTF-8. One of the big hangups that I've run into so far, however, is my X terminal emulator. I've been very happy with rxvt (I tend to have several $XTERMs open at once, so Low Memory is Good!), but it doesn't seem to support anything Unicode. A bit of searching has turned up nothing that isn't as big as or larger than xterm itself. So, the question -- are there any low-memory terminal emulators that support UTF-8, or any other Unicode encoding? (tabbed-window style terminals Don't Count, and that goes double for Konsole!)"

21 of 48 comments

  1. Why yourself by fm6 · · Score: 2
    Has there been, or ever will be, a form of Un*x that natively supports Unicode in all things? Or would doing such a thing create too many problems?
    I don't know how difficult it'd be, but what's the motivation? End users, and even application developers, may need to interact using Han or Cyrillic or Hangul characters. But it's hard to see why a kernel hacker needs more than 2^7 characters.
    1. Re:Why yourself by keesh · · Score: 3, Insightful
      But it's hard to see why a kernel hacker needs more than 2^7 characters.


      Or more than 640KBytes of RAM. Or more than a 33MHz processor. For that matter, why do they need both uppercase and lowercase? Why do they want monitors? What's wrong with a VT100?

      The days of "sorry, no accents or unAmerican characters" are over. Unicode support at every level would be a big help for non-US-English development.
    2. Re:Why yourself by Papineau · · Score: 3, Informative

      It's not much for the internal strings (function names, variables, etc.), more for the messages that the kernel can send to the user (or root). For the first kind of strings, you'd have to redefine the C standard, and change compiler (and maybe also change text editor).

      When it boots, there's a whole slew of lines, many of which are purely numbers, but some of them could be localized (just as a big part of userland is). Now, since it would be in kernelspace, you'd maybe want to choose which one at compile time, but if you don't have access to a kind of Unicode for outputting strings, your choice of supported languages will be quite small.

      Now, this doesn't get into the argument of whether it's a desired behavior or not, just that it's something necessary before outputting in any language.

  2. Re:Why by Anonymous Coward · · Score: 2, Funny

    I thought it was so big because it has to contain the proper responses to anything that I type in it. For example "ls". How does it know?

  3. Plan 9 uses Unicode. by Hobart · · Score: 2
    Has there been, or ever will be, a form of Un*x that natively supports Unicode in all things?

    From the 1995 paper describing "Plan 9" , the OS from the authors of Unix at Bell Labs:

    Another departure from ANSI C is that Plan 9 uses a 16-bit character set called Unicode [ISO10646, Unicode]. Although we stopped short of full internationalization, Plan 9 treats the representation of all major languages uniformly throughout all its software.
    --
    o/~ Join us now and share the software ...
    1. Re:Plan 9 uses Unicode. by norwoodites · · Score: 2

      Plan 9 is not UN*X but Mac OS X is and uses UTF-8 and UTF-16 for almost everything, and has a lot of fonts to support the different languages (Japanese, Chinese [Both traditional and Simplified], Korean, and other languages).

      Also, don't use any of the M$ products for the Mac because they do not support Unicode at all.

      Use OmniWeb for a web browser and just use TextEditor for a word editor.

      Also Terminal uses UTF8 for the default encoding, you can change this if you want.

    2. Re:Plan 9 uses Unicode. by mirabilos · · Score: 2

      Actually, Mac OSX is not UNIX(tm), but it is as
      much Unix as OpenBSD, Linux, Windows NT/2000/XP are.

      Well, I wish OpenBSD had native UTF-8 - currently it
      is totally locale unaware and just makes 8bit==latin-1
      assumptions.

      This does not mean I want NLS or I18N: localized
      error messages, locale in general, LANG= and LC_*=
      do suck a lot.

      --
      My Karma isn't excellent, damn it! (And /. still does not get UTF-8 right in 2012. Wow.)
    3. Re:Plan 9 uses Unicode. by norwoodites · · Score: 2

      Windows NT/2000/XP is nowhere near as much UN*X as Linux and *BSD are, mainly because of the idea of a kernel and what goes into it.
      The Windows server in Windows NT/2K/XP is part of the kernel, while under UN*X it is a userland program. Also, Mac OS X and *BSD are based loosely around the UNIX source, and Linux is inspired by MINIX, which is inspired by UNIX.

    4. Re:Plan 9 uses Unicode. by RevAaron · · Score: 2

      No sense arguing with this type of person- unless you're on this list you're not "Unix." Silly trademark issue. Apple hasn't paid the assload of money to get certified, nor has OpenBSD. So they're not "Unix." I say, who cares if it's called a car or an auto, still does the same damned thing.

      --

      Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
    5. Re:Plan 9 uses Unicode. by Hard_Code · · Score: 2

      "The Windows server in Windows NT/2K/XP is part of the kernel"

      If I remember correctly the "Windows" part of "Windows NT" is simply a "personality" which is hosted under the NT kernel. In the beginning Windows NT wasn't even going to be called "Windows" - it was just "NT". Only part-way through the development of the GUI-agnostic NT kernel was the Windows personality added on (there was also a short-lived OS/2 personality IIRC). So your statement isn't so accurate. Now perhaps it actually lives in kernel space, or over time has integrated more or less with the kernel proper, but making this claim seems to be a standard "criticism" of NT. For all intents and purposes, operating systems should come with GUIs. I'd be glad if Linux (which is now being pitched as a desktop OS) actually came with real GUI support in the kernel. Or at least shift the video drivers from XFree into the kernel.

      --

      It's 10 PM. Do you know if you're un-American?
    6. Re:Plan 9 uses Unicode. by divbyzero · · Score: 2
      Or at least shift the video drivers from XFree into the kernel.

      You might want to look at the kernel framebuffer support. That's exactly what it does. My [somewhat uninformed] understanding of why there are still video drivers in both places is that:
      1. The Linux kernel framebuffer support is not yet all that mature.
      2. XFree86 is designed to be used with more systems than just Linux, and as such, cannot rely on the existence of an underlying framebuffer API.
      --
      But my grandest creation, as history will tell,
      Was Firefrorefiddle, the Fiend of the Fell.
  4. freshmeat has many.... by therealmoose · · Score: 5, Insightful
    http://freshmeat.net/browse/158/?topic_id=158

    Ask Slashdot is becoming increasingly ridiculous, with the answer to almost every question found at either Google or within OSDN. I don't mean to flame the editors, but it would be good if they would be a little more selective WRT Ask Slashdot.

  5. Context! Purpose! by fm6 · · Score: 2
    You know, there are applications that run perfectly fine in a small memory space, on a slow processor, and don't require fancy GUIs. I still enjoy playing old text-mode and low-rez computer games, designed in the days of 16K memory spaces and 1 MHz processors. (I play them on emulators, not on the hardware they were originally designed for, but that's a matter of convenience and desk space.) High-powered systems are fun, and sometimes even useful -- but not every app needs them.

    The Linux kernel is not a very big entity -- which is a prime reason for its success. I find it hard to see what use it can make of extended character sets. And even if adding such a feature to the kernel had some benefit, there's a cost in terms of size, speed, and risk of bugs and security holes. You need to weigh the benefits against the risks, not just assume every bit of software in the world has to support UTF-32768. And the plain fact is, there doesn't seem to be any benefit at all.

    Perhaps I'm wrong. But to prove me wrong, you're going to have to suggest some real-world examples or scenarios that contradict me. Reciting cliches about vaguely relevant history says nothing.

    Note that I'm not attacking the general concept of Internationalized software. In point of fact, I spend a lot of time documenting the International features of my own company's products. All serious development tools support Internationalization. But they support it from the run-time-library level on up, where 99.99% of all development occurs.

    1. Re:Context! Purpose! by Jon+Peterson · · Score: 2

      I don't understand.

      If you are a kernel hacker (I'm not....), and you want to use Chinese characters as variable names, why should you not be able to do that? Is that what you meant? Just which parts of a computer should only ever be referred to using glyphs from the Roman empire, and why?

      I can't think of any reason for any part of a computer system being ascii only, except the reason:

      "It's quite hard to change, and there's not much demand". Which is fine, but given that most people in the world can't communicate using ASCII, it's surely only a temporary excuse....

      --
      ----- .sig: file not found
    2. Re:Context! Purpose! by dvdeug · · Score: 2

      If you are a kernel hacker (I'm not....), and you want to use Chinese characters as variable names, why should you not be able to do that?

      Because the common language of kernel hackers is English.

    3. Re:Context! Purpose! by fm6 · · Score: 2
      Actually, the Romans didn't invent most of our glyphs. The Romans didn't have J, U, W, Y or Z, and only used K when writing Greek words. And they didn't have lower-case letters or punctuation. All invented later.

      But I digress. You seem to be assuming that "Unicode support" is this magic thing you can add to any program and suddenly support every Unicode character. In fact, no program directly supports the complete Unicode character set -- Unicode was never meant to be used that way. Instead you support an "encoding" that gives you access to a manageable subset of Unicode.

      The most widely-used encoding is UTF-16, which supports a 2^16-character subset of Unicode. Most major programming languages support both ASCII and UTF-16. Some (notably Java and Visual Basic) support UTF-16 instead of ASCII. Unfortunately, the documentation for these languages usually refers to their "wide" characters as "unicode" characters, as if Unicode were just a 16-bit "universal" character set. In fact, there are important Unicode-supported characters that are not in UTF-16.
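
      A single 16-bit unit can only index 2^16 values, so any character numbered above that range needs something more. As a rough sketch (assuming Python's built-in "utf-16-be" codec; the helper name and the two sample characters are picked purely for illustration), this lists the 16-bit units that come out for one character inside the 16-bit range and one outside it:

```python
def utf16_code_units(ch):
    """Return the 16-bit code units produced when ch is encoded as UTF-16."""
    data = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# Sample characters are assumptions made for this sketch.
inside_16_bit = "\u6f22"       # U+6F22, a Han character
outside_16_bit = "\U0001d11e"  # U+1D11E, MUSICAL SYMBOL G CLEF

for ch in (inside_16_bit, outside_16_bit):
    units = utf16_code_units(ch)
    print(f"U+{ord(ch):04X}: {len(units)} unit(s):", [hex(u) for u in units])
```

      The second character does not fit in one unit and comes out as a pair of units, which is the part that "wide character == Unicode character" documentation tends to gloss over.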

      There's also UTF-8, a variable-width encoding that's backward-compatible with ASCII. I believe GCC already supports UTF-8, and could probably be made to support UTF-16, as most C compilers already do. And since GCC is written in GCC, you could probably allow kernel programmers to use these extended character sets in their source code. But it's a tricky thing to do, and it's difficult to see the benefit.
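
      To make "variable-width" and "backward-compatible with ASCII" concrete, here is a minimal sketch using Python's built-in UTF-8 codec. The helper name, the sample characters, and the sample command string are assumptions chosen for illustration only:

```python
def utf8_lengths(chars):
    """Map each character to the number of bytes it takes in UTF-8."""
    return {ch: len(ch.encode("utf-8")) for ch in chars}


# Illustrative samples: ASCII, Latin-1 range, Hangul, and a character
# beyond the 16-bit range.
samples = ["A", "\u00e9", "\ud55c", "\U0001d11e"]

lengths = utf8_lengths(samples)
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s) in UTF-8")

# Pure ASCII text is byte-for-byte identical when encoded as UTF-8,
# which is what "backward-compatible" means here.
ascii_text = "ls -l /usr/bin"
assert ascii_text.encode("ascii") == ascii_text.encode("utf-8")
```

      The width grows from one to four bytes as the code point gets larger, while anything that was plain ASCII stays exactly as it was.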

      I hate the term "politically correct", but maybe it applies here. You seem to feel that any software that isn't character-set-agnostic is unfair to non-Western users. Putting such an assumption ahead of issues of reliability and security is a very poor kind of prioritizing.

  6. gnome-terminal --use-factory by sig · · Score: 3, Informative

    You can use gnome-terminal with the --use-factory option. It makes one process for all your terms, so if you have a lot of windows open, it doesn't use that much memory.

  7. Why so many xterms? by zeda · · Score: 3, Insightful

    Memory is cheap, why worry.

    Using screen also helps.

  8. Re:Why by dvdeug · · Score: 2

    the OSes themselves use only ascii (from what I understand).

    The OSes support arbitrary strings of 8-bit characters, which means they support UTF-8. There is no point in a modern Unix kernel where you would want to use UTF-8, and it won't let you, short of arbitrary hardware or standards limitations (weird foreign filesystems and what not.)
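
    A small sketch of why byte-oriented code copes with UTF-8: every byte that UTF-8 produces for a non-ASCII character is 0x80 or higher, so a name never gains a stray NUL or '/' byte that a kernel treating names as plain bytes would trip over. The function name and the file name below are hypothetical, made up for this example:

```python
def non_ascii_bytes_are_high(text):
    """True if every UTF-8 byte of each non-ASCII character is >= 0x80."""
    for ch in text:
        if ord(ch) > 0x7F and any(b < 0x80 for b in ch.encode("utf-8")):
            return False
    return True


# Hypothetical file name mixing ASCII, accented Latin, and Han characters.
filename = "r\u00e9sum\u00e9-\u65e5\u672c\u8a9e.txt"
raw = filename.encode("utf-8")

assert non_ascii_bytes_are_high(filename)
# No NUL terminator or path separator sneaks into the encoded name.
assert b"\x00" not in raw and b"/" not in raw
# The bytes round-trip, so storing them untouched loses nothing.
assert raw.decode("utf-8") == filename
print(len(filename), "characters,", len(raw), "bytes")
```

    Nothing here touches a real filesystem; it only checks the byte-level property that lets 8-bit-clean code pass such names through untouched.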

  9. Messages, variables by fm6 · · Score: 2
    OK, localized messages and variables are good reasons to have an internationalized kernel. But good enough to justify the costs? I don't see it.

    And before you accuse me of being English-centric, note that the guy who invented and still maintains the Linux kernel is not a native English speaker.

  10. Re:Why by Tet · · Score: 2
    is XTerm so large? I've been hearing about this for a while, as it is usually cited as a reason for using rxvt.

    Yes, it's really large. It uses 2.3MB of RAM on my machine, of which 1.8 or so is shared with other processes. The sad fact is, though, that so is rxvt these days (and indeed, all current terminal emulators). xterm includes a Tektronix emulator, amongst other things, which 99% of users will never need (in fact, I'm the only person I know that has ever had a genuine need for it). As a reaction to xterm's size, one of my lecturers at University wrote xvt, a minimal terminal emulator, without the bloat. Over time, that evolved into rxvt, which is growing more and more features, and is no longer as small as it once was. It's still smaller than xterm, but that's not hard. For comparison, opening up a new term uses up the following amount of RAM per new window:

    • 496: xterm
    • 296: rxvt
    • 856: gnome-terminal
    • 140: gnome-terminal --use-factory
    • 1068: konsole
    • 160: konsole (first new tab)
    • 36: konsole (subsequent new tabs)

    It should be noted that both konsole and gnome-terminal have massive startup costs, with konsole starting up 4 kdeinit processes, and gnome-terminal starting up gnome-pty-helper.

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown