Slashdot Mirror


Linux Developers Consider On-Screen QR Codes For Kernel Panics

An anonymous reader writes "Linux kernel developers are currently evaluating the possibility of using QR codes to display kernel oops/panic messages. Right now a lot of text is dumped to the screen when a kernel oops occurs, most of which isn't easily archivable by normal Linux end-users. With QR codes as Linux oops messages, a smart-phone could capture the display and either report the error string or redirect them to an error page on Kernel.org. The idea of using QR codes within the Linux kernel is still being discussed by upstream developers."

32 of 175 comments

  1. Good idea by Primate Pete · · Score: 5, Insightful

    I'm not sure how hard it would be to pull this off in practice, but kudos to the team for improving (or at least thinking about) better usability from the kernel out.

    1. Re:Good idea by Anonymous Coward · · Score: 2, Insightful

      How soon until someone accidentally posts a QR code containing confidential information, since they cannot read it themselves?

    2. Re:Good idea by Kjella · · Score: 4, Insightful

      Very unlikely. The information in a QR code is probably just enough to say "I run kernel X (build Y) and it crashed with error code Z at instruction 12345 in module 123". If it were a kernel dump that would be different, but I have seen these without the QR codes and there's nothing sensitive there.

      --
      Live today, because you never know what tomorrow brings
    3. Re:Good idea by Zocalo · · Score: 3, Informative

      It might actually be more than that. Worst case, the screen is in 80x25 text mode (assuming a PC), which gives 2,000 binary bits, but if you start playing around with extended ASCII graphics characters you could probably encode a KB of data quite easily. Hardly a crash dump, but easily enough to get across the essentials.

      --
      UNIX? They're not even circumcised! Savages!
    4. Re:Good idea by Anonymous Coward · · Score: 2, Interesting

      You just have to reprogram the VGA font table with 2-wide by 4-high block bitmaps (all 256 such patterns fit exactly into the standard VGA font table). You then have 16000 pixels to work with instead of 2000: a bitmap display with a 160x100 pixel resolution. VGA text mode is 640x400 pixels and the standard VGA font is 8x16 pixels, so each virtual pixel is 4x4 screen pixels.

      BTW, you can't encode 2000 bits into a QR code with a 2000 bit bitmap, as it has parity and spatial clock recovery built in to the code.

      Since the system has crashed, there is no harm in replacing the vga font table, and it is so universal that you can do it on all hardware with VGA (the font table is always at a fixed memory address in PC architecture) without any interaction with device drivers.

      Of course you can also put the VGA into a standard graphics mode with no knowledge of the previous graphics card state, simply by programming known io ports with known values, this actually seems altogether more reliable than relying on the Linux framebuffer drivers which don't always work when X11 drivers are using the hardware, but could be considered reliable if KMS is used.
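The 2-by-4 glyph scheme described above can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the 80x25 cell grid and 8x16 font from the comment; the function names are made up for the example and nothing here touches real VGA hardware.

```python
# Sketch of the "2 wide by 4 high" pseudo-pixel trick for 80x25 VGA text mode.
# All names are illustrative; this only models the font table, it writes nothing.

COLS, ROWS = 80, 25          # text cells on screen
SUB_W, SUB_H = 2, 4          # pseudo-pixels per cell (2 wide, 4 high)
GLYPH_W, GLYPH_H = 8, 16     # font cell size assumed in the comment above


def glyph_bitmap(index):
    """Return the 16 row bytes of the glyph whose 8 pseudo-pixels match `index`.

    Bit (sy * SUB_W + sx) of `index` lights the pseudo-pixel at column sx, row sy.
    """
    block_w = GLYPH_W // SUB_W   # 4 screen pixels per pseudo-pixel, horizontally
    block_h = GLYPH_H // SUB_H   # 4 screen pixels per pseudo-pixel, vertically
    rows = []
    for y in range(GLYPH_H):
        row = 0
        for x in range(GLYPH_W):
            sx, sy = x // block_w, y // block_h
            lit = (index >> (sy * SUB_W + sx)) & 1
            row = (row << 1) | lit   # leftmost pixel ends up in the top bit
        rows.append(row)
    return rows


def cell_index(bitmap, cx, cy):
    """Pick the glyph index for text cell (cx, cy) from a 160x100 0/1 bitmap."""
    index = 0
    for sy in range(SUB_H):
        for sx in range(SUB_W):
            if bitmap[cy * SUB_H + sy][cx * SUB_W + sx]:
                index |= 1 << (sy * SUB_W + sx)
    return index


glyph_count = 2 ** (SUB_W * SUB_H)                   # one glyph per on/off pattern
virtual_size = (COLS * SUB_W, ROWS * SUB_H)          # pseudo-pixel resolution
virtual_pixels = virtual_size[0] * virtual_size[1]   # total addressable pseudo-pixels
```

Because every one of the 256 on/off patterns gets its own font slot, drawing a bitmap reduces to writing `cell_index(...)` as the character code for each of the 2000 cells.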

    5. Re:Good idea by rnturn · · Score: 2, Insightful

      ``Hardly a crash dump, but easily enough to get across the essentials.''

      Here's a crazy idea: instead of working on displaying cutesy graphics images that need to be decoded using a smart phone and a web site, what about actually generating a freakin' crash dump? Is there a technical reason that Linux is unable to do this? If crash dumps are really not possible, how about a plain ol' text file in the root directory containing the reason for the crash/panic?

      --
      CUR ALLOC 20195.....5804M
    6. Re:Good idea by Primate Pete · · Score: 2

      No, it's adding smartphone users. I presume the basic panic information would continue to be available.

    7. Re:Good idea by Levex · · Score: 5, Informative

      We are encoding the full Oops, i.e. from the "cut here" to the "end trace" marker. Classic won't ever go away, and we have already created a configuration option called CONFIG_QR_OOPS that can disable this entirely. In case your distro or you compiled it in and you don't want QR codes on your screen, I just added a new kernel parameter, currently called 'qr_oops', which can disable it as well.

      --
      Cheers Levente Kurusa
    8. Re:Good idea by Pinhedd · · Score: 4, Insightful

      Kernel crashes occur when the kernel enters an inconsistent or invalid state from which it cannot recover.

      When a user program fails, the kernel maintains consistency, can cleanly terminate the process, and can accurately report the cause of the failure if need be (illegal instruction, deadlock, access violation, etc...).

      When a kernel fails the very systems that it relies on to report failures may very well be compromised by whatever caused the kernel to fail in the first place. As such, any kernel fault reporting needs to be incredibly robust and as independent of other kernel mechanisms as possible. Dumping text to a serial terminal is the preferred method because it's incredibly simple and relies on nothing else, meaning that barring a failure of the system memory it should always act as a reliable fallback.

      Dumping kernel memory to a disk might fail if the state of the file system is compromised, if the storage controller is compromised, or if any number of intermediary systems are compromised by the inconsistent state of the kernel. Many operating systems do attempt to dump crash memory to the swap file / swap partition as this is less likely to cause data corruption than writing to a particular file in the file system.

      It "can" be done, but that does not necessarily make it a good idea.

    9. Re:Good idea by AmiMoJo · · Score: 3, Insightful

      You usually don't want to write to the filesystem in the event of a kernel panic. It could make things worse and corrupt it. Once you kernel panic you are basically screwed and can't rely on any services beyond really low level BIOS stuff to work. Poking some text to the screen buffer is about it.

      Windows does core dumps using a specially reserved area of the boot drive and using low level boot driver calls. It can still fail but at least has a fairly low probability of damaging the filesystem further. I suppose Linux could maybe dump to the swap partition or something.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    10. Re:Good idea by Lemming Mark · · Score: 3, Insightful

      As AmiMoJo also noted, when you have a kernel panic all bets are off regarding which parts of the kernel are OK. If the behaviour of the disk driver or filesystem has been affected, it could damage your filesystem to try to write a kernel dump into a normal disk partition. It might work but it does seem a good idea to be properly paranoid. I didn't know that Windows uses a special reserved area of the boot drive - that does make sense as a solution!

      There have been various systems for crash dumping under Linux, though. I think the de-facto solution (the one that was accepted by the kernel devs) ended up being kdump, which is based on kexec (kexec is "boot directly to a new kernel from an old kernel, without a reboot"). This allows full crash dumps with (hopefully) decent safety, so it is possible to do this if configured.

      In kdump, you have a "spare" kernel loaded in some reserved memory and waiting to execute. When the primary kernel panics it will (if possible) begin executing the dump kernel, which is (hopefully) able to reinitialise the hardware and filesystem drivers, then write out the rest of memory to disk. I'm not sure how protected kdump's kernel is from whatever trashed the "main" kernel but there are things that would help - for instance, if they map its memory read only (or even keep it unmapped) so that somebody's buffer overflow can't just scribble on it during the crash.

      Obviously, having a full kernel available to do the crashdump makes it easier to do other clever tricks, in principle - such as writing the dump out to a server on the network. That's not new, in that there used to be a kernel patch allowing a panicked kernel directly to write out a dump to network, it just seems easier to do it the kdump way, with a whole fresh kernel. Having a fully-working kernel, rather than one which is trying to restrict its behaviour, means you can rely on more kernel services - and probably just write your dumper code as a userspace program! Having just installed system-config-kdump on Fedora 20, I see that there's an option to dump to NFS, or to an SSH-able server - the latter would never be sanely doable from within the kernel but pretty easy from userspace.

      Various distros do support kdump. I think it's often not enabled by default and does require a (comparatively small) amount of reserved RAM. So that's some motivation for basic QR code tracebacks. I suppose another reason is if they expect they can mostly decipher what happened from a traceback, without the dump being necessary - plus, with a bug report you can easily C&P a traceback.

      This discussion has just inspired me to install the tools, so maybe I'll find out what it's like...
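For anyone wanting to check whether the reserved-RAM part of kdump is actually set up, here is a rough sketch. It assumes the usual `crashkernel=` boot parameter and the `/sys/kernel/kexec_crash_loaded` flag; the helper names are invented for this example, and distro tooling may report the same thing more thoroughly.

```python
from pathlib import Path


def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    prefix = "crashkernel="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


def read_text_or_none(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def kdump_summary(cmdline, crash_loaded):
    """Describe the kdump state from the command line and the sysfs flag text."""
    reserved = crashkernel_setting(cmdline)
    if reserved is None:
        return "no crashkernel= reservation on the command line"
    if crash_loaded != "1":
        return f"crashkernel={reserved} reserved, but no dump kernel loaded"
    return f"crashkernel={reserved} reserved and a dump kernel is loaded"


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    loaded = read_text_or_none("/sys/kernel/kexec_crash_loaded")
    print(kdump_summary(cmdline, loaded))
```

Keeping the parsing separate from the file reads means the logic can be tried on any machine by passing in a command-line string directly.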

  2. Huh? by Anonymous Coward · · Score: 2, Insightful

    And if no one with a phone is there?

    1. Re:Huh? by ledow · · Score: 5, Interesting

      You lose nothing.

      Anything that could have been logged to disk will have been.

      Anything that couldn't is probably FAR TOO LONG to even start taking down any other way and almost certainly will cut through the screen buffer limit anyway (every kernel panic I've had - which is about a dozen I think - was like that).

      Let's compare and contrast to, say, Windows. Bluescreen with minidump and error code that has 7 million potential causes.

      At least with a QR code, for those totally undumpable errors, you stand half a chance of snapping it and providing several kilobytes of useful information for someone to work from - that they know hasn't been transcribed wrongly. And can be taken from even a completely hung machine.

      It's a good idea. Someone needs to make a patch for it. The biggest problem - as always - will be making sure you can get to the point that you can write to the video memory and do so with enough processing / storage to be able to write something useful into the QR code.

    2. Re:Huh? by ledow · · Score: 2

      Just over a kilobyte, I think.

      But that can be compressed as it doesn't NEED to be human-readable any more. So you can easily fit in a few Kb of useful data, I should think.

      And as data density rises, so does the error correction but if the QR code reads (you have a device that reads them directly, why bother to snap a shot then process the image separately?) then it was a success. Hover and hold until you get the beep, on almost any smartphone made this decade.

      But, no, you won't get CORRUPT data. The QR code either works or doesn't, like barcodes either scan or don't. You don't scan a book and get sold a DVD. Same principle.

      What you might have is trouble getting a decent QR read on a crappy low-res camera but that's - again - no worse than the prior situation where I've seen kernel-panic screenshots you can't even read, let alone decode.

    3. Re:Huh? by Guspaz · · Score: 4, Informative

      QR codes use Reed-Solomon error correction, so you don't get missing or corrupt data (in that the QR reader knows if it reconstructed all the data correctly or not). Readers will typically only "read" the code if they manage to reconstruct the entire thing. The error correction helps compensate for poor image quality, and the fact that the image is monochrome makes things like exposure less critical. There are four levels of error correction, which allow for the reconstruction of 7%, 15%, 25%, or 30% of codewords respectively.

      QR codes can store up to a bit under 3KB of data (the largest size with the lowest error correction), but I couldn't get my phone to read any v40 QR codes (the largest ones), and v25 took some effort. The plan for QR codes of kernel oopses will probably fail for that reason, if nothing else (that they need v40 codes to store an entire oops, and few phones will read v40 codes).
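To put rough numbers on the version sizes and correction levels mentioned above, here is a small sketch. It relies on the standard QR geometry (a version v symbol is 17 + 4v modules per side) and the nominal recovery percentages quoted in the comment; the 4-module quiet zone is the commonly recommended margin, and the helper names and screen sizes are just assumptions for illustration.

```python
# Nominal share of codewords each error correction level can reconstruct,
# as quoted in the comment above.
EC_RECOVERY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}


def modules_per_side(version):
    """Side length of a QR symbol in modules, without the quiet zone."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version


def fits_on_screen(version, width, height, pixels_per_module=1, quiet_zone=4):
    """Check whether the symbol plus its quiet zone fits a width x height screen."""
    side = (modules_per_side(version) + 2 * quiet_zone) * pixels_per_module
    return side <= min(width, height)


if __name__ == "__main__":
    for version in (25, 40):
        print(version, modules_per_side(version),
              fits_on_screen(version, 640, 400, pixels_per_module=2))
```

Fitting on the screen is the easy part; whether a phone camera can resolve modules drawn that small is the separate problem raised in the comment.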

    4. Re:Huh? by MightyYar · · Score: 2

      I'm sure the folks discussing this are smarter than I am, but this is Slashdot, so I'll do some uninformed speculation anyway. Because it's fun.

      The sample oops is 3134 bytes in plain ASCII. Plain old zip gives 1589 bytes, xz does a little better with 1492 (and only 1446 with lzma). I believe xz could do even better if the dictionary could be fixed and thus not embedded in the file. Doing a base64 encode on that gets to 1929 bytes.

      So it looks to me (based on my sample of one...) like you could use version 27 or 28 with the lowest level of error correction. Probably let the library just scale it to whatever size is necessary.
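The size comparison above is easy to repeat with the Python standard library. This is a sketch only: `zlib` uses the same DEFLATE algorithm as zip but not the zip container, so its numbers will differ slightly from the ones quoted, and the sample text is a made-up placeholder, not a real oops.

```python
import base64
import lzma
import zlib


def base64_length(n):
    """Length of the padded base64 encoding of n bytes (no trailing newline)."""
    return 4 * ((n + 2) // 3)


def encoded_sizes(text):
    """Return byte counts for a few ways of packing `text` into a QR payload."""
    raw = text.encode("ascii")
    deflated = zlib.compress(raw, 9)
    xz = lzma.compress(raw, format=lzma.FORMAT_XZ)
    legacy_lzma = lzma.compress(raw, format=lzma.FORMAT_ALONE)
    return {
        "raw": len(raw),
        "zlib": len(deflated),
        "xz": len(xz),
        "lzma": len(legacy_lzma),
        "lzma+base64": len(base64.b64encode(legacy_lzma)),
    }


# Placeholder input: repetitive, trace-like lines invented for this example.
SAMPLE = "\n".join(
    f"[placeholder] frame {i:02d}: function_{i}+0x10/0x40" for i in range(40)
)

if __name__ == "__main__":
    for name, size in encoded_sizes(SAMPLE).items():
        print(f"{name:12s} {size}")
```

Base64 always costs about a third more bytes, so it only makes sense if the payload has to survive as text; a QR code in byte mode can carry the compressed data directly.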

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    5. Re:Huh? by Levex · · Score: 2

      Yes this causes a big output since my (and I suspect yours as well) terminal uses size 12 fonts, but in the crash situation we do it one screen pixel per QR pixel. There is a file called qr_code.png in that folder I gave in the thread, which is the actual result unscaled. It's 147x147 which can fit on every screen I know of.
      How to handle textmode is still an ongoing question. We'll first get it working on the framebuffer then maybe we'll find a solution for textmode, if it's even possible.

      --
      Cheers Levente Kurusa
  3. Re:long-term applicablity? by istartedi · · Score: 2

    No, just rebuild the kernel. It should be a build option for text or QR panics.

    --
    For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
  4. Not enough data by MichaelSmith · · Score: 2

    QR codes are highly redundant and don't actually contain much data. There isn't enough space for a stack trace or anything like that. Probably not even a register dump on those big modern CPUs.

    1. Re:Not enough data by smittyoneeach · · Score: 2
      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    2. Re:Not enough data by Anonymous Coward · · Score: 2, Insightful

      1) No, 2953 bytes is not enough for a "kernel dump". "Kernel dump" as a term/phrase doesn't even make any sense, come to think of it. Did you mean a stack trace? Register dump? Because "kernel dump" makes me think of "memory dump", i.e. dumping all contents of RAM to swap + rebooting system (which later notices the crash dump header in swap and hopefully extracts it).

      2) If just a stack trace or register dump: 40-L may be too high a resolution to reliably work when using a mobile phone camera to take a picture of an LCD screen. There's often too much noise (high ISO) in this situation. Lower-resolution QR codes means more likely successful recognition and decoding. http://en.wikipedia.org/wiki/Image_noise#Low_and_high-ISO_noise_examples

      3) What I haven't seen mentioned: how exactly do the developers plan on printing a QR code when someone's using a text-only console? Don't tell me "everything on Linux console uses a graphical framebuffer now" because that's completely false (lots of folks disable this, and some distros disable it by default). What's going to happen when the kernel crashes? It uses BIOS INT 0x10 to switch to 320x200 VGA mode and show a QR code? Is it going to change the on-screen font masks/bitmaps to display "tiled" pixel data that represents a QR code?

      I have a better idea: how about just keeping things how they are. People using mobile phones to take a photo of a stack trace + register dump mostly works reliably (barring wobbly hands). Console fonts are quite legible even if the person has consumed too much cappuccino, while QR codes, especially high-resolution QR codes, are going to be a lot less legible in that situation. My reaction to this proposal would be: what does using a QR code get us that we don't already have available with existing technology and methodologies in place? (FYI: the correct answer to that question is: "nothing")

  5. Re:Dump kernel to serial printer by smittyoneeach · · Score: 3, Funny

    You gonna need a big bowl to catch all them corn flakes, mister.

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
  6. Wish other OSs did this... by jpellino · · Score: 2, Insightful

    Anything's an improvement over:
    "My computer froze."
    "What happened?"
    "It put some message on the screen."
    "What did it say?"
    "Something about an error."
    "What error?"
    "I dunno. It had some numbers and letters and stuff."

    --
    "Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
    1. Re:Wish other OSs did this... by Anonymous Coward · · Score: 5, Insightful

      And with QR codes, the conversation becomes this:

      "My computer froze."
      "What happened?"
      "It put some white and black crap on the screen."
      "What did it say?"
      "How the fuck should I know? It was random white and black dots! Like a fucking Rorschach test!"
      "It probably was a kernel panic. What was the error?"
      "I dunno, because like I said, ALL IT HAD WAS SOME DOTS AND SHIT. Then it rebooted! So it's gone! FUCK!"

      How is that an improvement? Yes it's a change, but it's not an improvement.

    2. Re:Wish other OSs did this... by Anonymous Coward · · Score: 3, Insightful

      I doubt the kernel developer that implements this would forget to put the message

      "Take a photo of these black-and-white dots and send it to crash@kernel.org so we can try to figure out what happened. Thanks for making the Linux kernel better!"

      at the top of the black and white dots.

  7. You get the prize of dumbest comment on slashdot by Anonymous Coward · · Score: 2, Insightful

    Really? You think your end user who hasn't got the brains to take a screenshot of human readable text and send it to you and who probably has never even heard of QR codes is going to have the presence of mind and technical knowledge and ability to take a picture of the code and send it to you?

    That has to be one of the dumbest things I've heard on slashdot...and that's REALLY saying something.

    It's even more worrying that the Linux Kernel devs are giving this idea the time of day.

  8. Re:Dump kernel to serial printer by Anonymous Coward · · Score: 2, Funny

    No! We cannot do both! It must be either one or the other!

  9. No way! by msobkow · · Score: 3, Funny

    I am NOT buying a fucking cell phone to read a core dump.

    Just fuck right off already. Not everyone wants a digital leash.

    --
    I do not fail; I succeed at finding out what does not work.
  10. The matrix by BlazingATrail · · Score: 3, Insightful

    I prefer all my BSOD, crashes and core dumps to use the Matrix dripping green characters and pixel crap method of reporting errors. It's easier to see the patterns. Guru meditation # 42

  11. Re:Dump kernel to serial printer by flyingfsck · · Score: 3, Funny

    Bah. Punch cards are so much better. You young, know nothing, whipper snappers with your newfangled hoosammawhatsits...

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!
  12. Re:Dump kernel to serial printer by icebike · · Score: 2

    Or just display a short number code. Displaying a QR code won't solve anything, it will just obfuscate the error and leave the user without any easily memorable reference. This sounds more to me like "let's do it because it's modern and hip" rather than it being actually useful.

    The QR code can not only indicate the exact location of the error, but can take you to a website on the phone, with a url long enough to log many key points about the error.

    Even if it logs very little, developers will get more input this way than they do now, because when your machine is crashed, you can't report anything and once it reboots, you have other priorities than digging in the last crash dump.

    However, other than collecting statistics, it might not do any good. Even when you do submit a dump, you get the request to install debug symbol packages and trigger the crash again. Ah, no, that isn't going to happen. Or there will be necessary drivers installed that taint the kernel, and devs won't touch it until you replace your video card, untaint your kernel, and trigger another dump.

    --
    Sig Battery depleted. Reverting to safe mode.
  13. Just show smileys by Anonymous Coward · · Score: 3, Interesting

    Linux must be ubuntufied. We need to hide everything because it's way too complicated for the common user or his dog. We need more splash-screens to hide all the stuff that makes no sense anyway. Who wants to know if a module didn't get loaded? As a matter of fact, we should remove unnecessary logs (like message, dmesg, audit), because nobody gives a rat's ass. Also: Why have a console? Or init-mode 3? People want the graphical stuff, let's get rid of all the ballast like command-line. Those few people still using ancient tools like 'make', 'vi' or (o my god) 'ifconfig' should go and find themselves something else to brag with. Linux MUST go mainstream.