Slashdot Mirror


A New Web Image Format

MrP- writes: "BetaNews is reporting that a company called LizardTech has developed a new image format for the Web called DjVu." Apparently, it differentiates between foreground and background components of an image, and compresses each appropriately. Good idea, but I'm skeptical of improvements (especially because they say it's "20 times faster than GIFs" -- who measures compression in terms of speed? And they also say it compresses faster than PDF, but PDF isn't really an image format). No Linux support. And I don't see any source code on the format, so don't expect it to get a lot of support on any major Web sites, regardless of the compression.

21 of 154 comments

  1. Why are people so stupid? by GoRK · · Score: 4

    DjVu is almost three years old. It was developed by AT&T, not LizardTech. LizardTech just bought it about 8 months ago. It is not, was not, and will never be designed to replace PDF, GIF, JPEG, or PNG. I have been using DjVu as the core of a web-based document management system for over a year and a half. It is absolutely, bar none, the best, fastest, and most cross-platform way to go from paper->web out there.

    Look at the ways to get a scanned document on the web:

    1) GIF, PNG, JPEG: Large filesize or bad rendering. If I need to send a 300dpi page to a web browser, the browser isn't going to let a user pan and zoom on it and it certainly won't print it correctly. JPEG is the only one of the three formats that actually has a place to store the document DPI regardless.

    2) PDF: Creating a PDF from a scanned image means either encapsulating a lossy or lossless image in a file or doing OCR and risking unreliable information.

    DjVu regularly achieves compression ratios of 1200:1 or more at very acceptable quality. There is an IW44 wavelet-compressed background layer and a losslessly compressed foreground layer. The information is also progressive: as the file downloads, the foreground shows first, then the background, then the color information loads. Example documents on the DjVu website have shown entire 300 dpi full-color Sharper Image catalogs compressed to fit on a floppy disk.

    Btw, DjVu plugins are available not only for Windows, Macintosh, Linux, and Solaris; let's not forget HP-UX and IRIX. How's that for covering the bases? If you're not supported, you can write your own for your particular flavor of UNIX.

    Geez get it straight.

    ~GoRK

  2. Let's get all worked up about nothing! by Hard_Code · · Score: 3

    Did anybody even follow the LizardTech link? Right on the front page is a link to a page describing DjVu. The whole product ("image format") seems geared towards scanning in Real Life documents and presenting them online. If you *read* the page it explains why it claims it is faster (first downloads high contrast data, then photographs and graphics, clarifying the image as it goes) and smaller (some wavelet kung foo). I don't see anywhere where they are pitching this as competition for gif or png, so everybody put down the flamethrowers. This is a very small niche product for digitizing and presenting real life mundane documents.

    --

    It's 10 PM. Do you know if you're un-American?
  3. Misleading by Happosai · · Score: 3

    The BetaNews article is actually very misleading - it is confusing two of LizardTech's products.

    DjVu is a document format (like PDF), not an image format, and the techniques mentioned in the article refer to compression of documents.

    LizardTech do have a compressed image format. This is called MrSID, and uses completely different techniques for compression.

    I would be interested to see an independent comparison between MrSID and PNG - unless there are huge advantages to using the proprietary MrSID format over the open-source PNG format, I don't predict much of a future for MrSID on the web (although it would seem that LizardTech are touting it more for internal use than for general distribution anyway).

    [Happosai]

  4. DjVu has Linux support by Lemuel · · Score: 3

    I don't know why the summary on Slashdot says "No Linux support". LizardTech has both decoders and encoders available for Linux.

    Also the summary picks on LizardTech's use of speed as a feature. While this isn't a standard measurement, it is a way to tell people that you will get your images faster because the files are smaller. That's not a big crime. They also do talk more specifically about their format producing smaller files, so they do understand real measurements. BTW, while it is possible that they say elsewhere that DjVu compresses faster than pdf, what I saw was that the documents download faster, not compress faster.

    The Slashdot write-up complains about LizardTech's comparison of DjVu with pdf, pointing out that pdf isn't an image format. True, but the LizardTech description refers to DjVu as "DjVu for Documents", and their web page describes why DjVu is good for documents. Images seem to be just part of the data they need to handle.

    Finally, I haven't seen any source for Acrobat either, but it is very popular on the Internet, so lack of source won't necessarily keep LizardTech from succeeding with DjVu.

    Is DjVu actually any good? I have no idea. Slamming the product with incorrect and misleading comments doesn't help one decide, though.

  5. Reminds me of FIF by Tony+Hoyle · · Score: 3

    A few years ago a company came up with a compression format which was actually rather good, using fractals. It was called FIF. They made the mistake of letting greed overtake common sense and tried to charge for a license to write compressors for it... The result - when is the last time you saw an FIF file?

    If these guys don't have an open format they will simply go the same way.

  6. They got it from AT&T by dne · · Score: 4
    1. Re:They got it from AT&T by SEWilco · · Score: 4
      And we discussed it here on Slashdot:
  7. Re:Some facts off the top of my head... by Anonymous Coward · · Score: 5
    Gary got it mostly right.

    I am one of the four persons who created DjVu in the first place. The events took place in AT&T-Labs Research between 1997 and 1999.

    1. There is Linux support. Just go to the download page and select the Linux platform. Most of DjVu was first coded under Linux.
    2. After the Lizardtech deal, we set up a "non commercial site" named DjVuZone. It contains general information, benchmarks, links, a searchable digital library, etc.
    3. There is source code. Lizardtech recently had the good idea to relicense version 2 of the DjVu reference library under the GPL. We have the corresponding online documentation on DjVuZone. We are just waiting for the release of version 3 to redo that part of the site.
    4. DjVu combines several new technologies including new approaches to arithmetic coding (Z'-coder), new compression methods for textual images (Soft pattern matching, JB2), new wavelet method (IW44), and new ways of combining them together. The current implementation is geared toward compressing scanned document images in 24 bit colors around 300 dpi (raw size is 25MB) and typically packs them into 50-60KB. Neither TIFF, nor JPEG, nor JPEG-2000 nor Fax-G4 can do that. None of these technologies will let you realistically view such documents over the web. DjVu can.
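
    A back-of-the-envelope check of the figures in point 4, as a minimal sketch. The page size is an assumption (US Letter, 8.5 x 11 in); the comment only gives the resolution, the color depth, and the 25MB / 50-60KB sizes. The function names are made up for illustration.

```python
# Assumption: a US Letter page; the comment does not state the page size.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0
DPI = 300
BYTES_PER_PIXEL = 3  # 24-bit color


def raw_size_bytes(width_in: float, height_in: float, dpi: int, bytes_per_pixel: int) -> int:
    """Uncompressed size of a scanned page, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


def compression_ratio(raw_bytes: int, compressed_bytes: int) -> float:
    """How many times smaller the compressed file is than the raw scan."""
    return raw_bytes / compressed_bytes


raw = raw_size_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, DPI, BYTES_PER_PIXEL)
print(f"raw scan: {raw / 1e6:.1f} MB")

# The 50-60KB range quoted in the comment, taking 1 KB = 1000 bytes.
for compressed_kb in (50, 60):
    ratio = compression_ratio(raw, compressed_kb * 1000)
    print(f"{compressed_kb} KB file: roughly {ratio:.0f}:1")
```

    Under that page-size assumption the raw scan comes out close to the quoted 25MB, and the implied ratios land in the hundreds-to-one range.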

    Hope this helps :-).

    - Leon Bottou, AT&T-Labs Research.

  8. -1 redundant by buttfucker2000 · · Score: 4

    The article has it about right. Png is vastly superior - excellent lossless compression, sometimes better than lossy methods (plus technical features like a full alpha channel), and, most importantly for its dominance over gif, it is unencumbered by patents or closed source algorithms.

    Speed of compression is not a factor in compression - otherwise we would use bmps or xpms, which have zero compression time - because they're uncompressed. Size matters. Speed doesn't.

    I really can't see much market, and very little application for this compression. On-the-fly compression of images for web download would be redundant, since a png would be smaller than this format, so the speedy on-the-fly compression of uncompressed images is pointless.

    And in any case, modern PCs are more than powerful enough to almost transparently display well compressed images, so a simpler format is about 10 years out of time.

    If it were open source, it could perhaps have a market in replacing things like xpms, which are used in games for processing speed, but even if it were, the benefit would be marginal, since hard disk space is, relative to image size, almost infinite, so compressing them slightly wouldn't make much difference - and for download those images would be gzipped anyway.

    --
    Free Anne Tomlinson!!
    1. Re:-1 redundant by harmonica · · Score: 3

      DjVu is for scanned documents. There is no major accepted file format in use for this kind of data. This will be a huge market once bureaucracies around the world start digitizing their tons of documents. OTOH, DjVu is there for quite a while already and I don't see it having succeeded. Plus, when I installed the plugin under IE 5 a year ago, it was in some dubious beta state. Not nice to work with.

      Lossy / lossless image compression types. You cannot compare PNG to lossy schemes. PNG cannot beat a lossy method because the goals are different. Lossless: Compress as small as possible (but the exact original must be restorable). Typically, the algorithms that throw more resources (CPU and memory) at it are better. Lossy: For a given file size, reach the best quality. You can easily beat PNG with a lossy scheme by simply choosing very bad quality.
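
      The lossy/lossless distinction can be shown with a small sketch. It uses Python's zlib (DEFLATE, the same compression family PNG uses) on a synthetic image, and crude bit-dropping stands in for a real lossy codec; neither the test image nor the quantization step comes from DjVu or any format discussed here.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 64
rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic "scan": a horizontal gradient with a little noise in the low bits.
pixels = bytes(
    min(255, x + rng.randrange(8))
    for _ in range(HEIGHT)
    for x in range(WIDTH)
)

# Lossless: the decoder must restore the exact original, noise included.
lossless = zlib.compress(pixels, 9)
assert zlib.decompress(lossless) == pixels

# Lossy (illustrative only): throw away the low 4 bits of every pixel,
# then compress what is left. The original can no longer be restored.
quantized = bytes(p & 0xF0 for p in pixels)
lossy = zlib.compress(quantized, 9)
restored = zlib.decompress(lossy)

max_error = max(abs(a - b) for a, b in zip(pixels, restored))
print(f"raw {len(pixels)} bytes, lossless {len(lossless)} bytes, lossy {len(lossy)} bytes")
print(f"largest per-pixel error after the lossy round trip: {max_error}")
```

      The lossy file is smaller only because information was discarded first, which is why comparing its size against a lossless format says little unless quality is held fixed.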

      Open source. There are a few programs out there. Try TIC. It's GPL'd and beats JBIG-1 by about 40 percent on scanned images, according to the website.

      Resources: Image Compression Resources, The Data Compression Library.

  9. CmdrTaco needs more sleep by Hasdi+Hashim · · Score: 3

    He posted this two years ago: DjVu plug-in available on Linux/Irix/Solaris/Mac. First post by sengan:
    New Image Compression Algorithm claims 1000:1 ratio

    Hasdi :-)

  10. Re:Actually quite an old product by Suydam · · Score: 4
    On that note, I've been using DjVu's open-source encoder for several years to encode text documents. Its compression ratios are incredible and the plugin is also free and easy to install.

    The big problem I have with this article is that DjVu isn't a "new image format". It doesn't even display things inline (like GIF, PNG and JPEG). It is however an excellent alternative to PDF if file size is your main concern.

    The extensive references to "speed" when compared to GIFs and PDFs could be one of two things. They could be talking about download speed (my personal experience shows DjVu files to be about 10 times smaller than GIFs, and even more when compared to PDFs). Or they might be speaking of encoding speed, which DjVu seems to excel in.

    Here is a problem however: the command line encoder used to be free for non-commercial use. I was using DjVu for encoding swim-team documents for a small non-scholarship collegiate swim team. Certainly this counts as non-commercial. However, the new version from LizardTech would cost me $2,000 USD to run. That is absurd by comparison. So I'm abandoning DjVu since I can no longer afford the encoder.

    Incidentally, if you want to see how it worked for me, I used it on nearly every swim meet results page for a few years. Here is an example, just click on the links next to the word "Splits" in each event: http://www.k-swimming.org/cgi-bin/swimming/results/meet_view.pl?8

    --


    Werd.
  11. There is Linux support by snookums · · Score: 4

    They have a browser plugin for Linux/x86/glibc2 available for download here

    Yes, I know that link is broken because /. put a space in it -- you'll have to get rid of it yourself
    --
    Be careful. People in masks cannot be trusted.
  12. Actually quite an old product by Nailer · · Score: 5

    DjVu has been around for 2 years, and isn't anything new. In fact, it wasn't actually designed by Lizardtech - it was developed as an Open Source technology in the Olivetti and Oracle Research labs in Cambridge, UK, and was sold when US telco AT&T purchased the labs.

    Hence the Open Source products generally only seem to be there to satisfy existing licensing requirements from prior to Lizardtech's purchase. It's doubtful Lizardtech tend to encourage that aspect of the technology, and they're only promoting the closed source stuff.

    However, the compression is indeed very real and the cross-platform nature makes it quite useful for archiving stuff that won't be modified frequently in the future - remember, that text ain't vectorized, it's just another layered image, AFAICT.

    1. Re:Actually quite an old product by dkh2 · · Score: 5
      True. If you want to see DjVu in action, go get the plugin at djvu.com and visit one of my projects here at CWRU. http://www.cwru.edu/UL/DigiLib/Hours/homepage.html

      Picture this: Start with a 15th century Flemish "Book of Hours", hand illuminated on vellum (goat skin). Scan it at 600dpi 24bit for archival purposes. Reduce your tiffs to 300dpi and you still have 1.06 GB of image data (not very downloadable). Using the DjVu compressor we achieved 205:1 compression so the final product totals 5.44MB. By separating the pages so they only download when called for the initial download is a mere 45.06KB (including all of the HTML and other images on the page) with an average download of subsequent pages only 21.34KB.
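
      To put those sizes in perspective, here is a rough sketch of the download times involved. The link speed is an assumption (an ideal 56 kbit/s dial-up modem with no protocol overhead), as is the choice of binary units (1 KB = 1024 bytes); only the four sizes come from the figures above.

```python
LINK_BITS_PER_SECOND = 56_000  # assumption: dial-up modem, not stated above

# Assumption: binary units.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

sizes_bytes = {
    "300 dpi TIFFs": 1.06 * GB,
    "whole DjVu set": 5.44 * MB,
    "first page incl. HTML": 45.06 * KB,
    "average later page": 21.34 * KB,
}


def download_seconds(size_bytes: float, bits_per_second: int = LINK_BITS_PER_SECOND) -> float:
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return size_bytes * 8 / bits_per_second


for label, size in sizes_bytes.items():
    print(f"{label:24s} {download_seconds(size):12.1f} s")

# Ratio implied by the two totals; rounding in the quoted sizes means this
# only approximately matches the 205:1 figure.
implied_ratio = sizes_bytes["300 dpi TIFFs"] / sizes_bytes["whole DjVu set"]
print(f"implied ratio: about {implied_ratio:.0f}:1")
```

      The point of the per-page split shows up in the last two rows: a reader on a slow link waits seconds for a page rather than many minutes for the whole book.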

      DjVu was developed by AT&T Research. It was then purchased by LizardTech last year.

      Code commentary is like sex.
      If it's good, it's VERY good.

      --
      My office has been taken over by iPod people.
  13. It will not replace the "standards" we have now by funkman · · Score: 3
    Here is the press release. Their market is not for general sites like slash, yahoo, etc. This is a business oriented product for storing digital assets, so they may easily be cataloged and transformed into a format a user may see. From their press release, some example apps:

    Corporate digital asset collections

    Real estate sites

    Online catalogs/retail companies

    Auction sites

    Libraries

    Medical sites

    Geospatial imagery/government agencies

    Corporations are storing everything digitally now: pictures, instructions, etc., and are searching for a way to manage all of this. This product is an attempt to fill that void.

    In a nutshell, this will be a specialized format that we will see for businesses that need to pass digital assets to the user.

  14. Not a web image format by Nailer · · Score: 3

    It's more a document archiving format than a new web image format [although it happens to be viewable over the web] - as the article states, although it's non-vectorized, it uses layered bitmaps to create more efficiently encodable data chunks.

    And, actually, there is Linux support, and source code available. Just Lizardtech aren't going out of the way to tell anybody about it - see my above post :-)

  15. Some facts off the top of my head... by gary.flake · · Score: 4
    As far as I can tell CT's post and the article have a number of things wrong. I've known some of the people involved with DjVu for a couple of years, so let me list a couple of facts in no particular order:
    1. DjVu was originally developed at AT&T by a group that has traditionally worked in machine learning. LizardTech purchased the technology from AT&T.
    2. This format is specialized for scanned documents.
    3. The technology is very different from just about everything else because it separates background and foreground planes. The background is compressed with wavelets, and the foreground probably uses a form of clustering on character shapes (in a typeface and language independent manner). As a result of the latter, you get a form of OCR almost for free. You can also do text search.
    4. Everything can be viewed at 300dpi directly in your browser and in realtime (you normally only view at 100dpi but you can zoom in).
    5. The Linux viewer plugin and compressor have been available for years.
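
    The foreground/background split in point 3 can be illustrated with a toy sketch. This is not DjVu's actual segmenter: the fixed threshold, the mean-fill of masked pixels, and the function name are all assumptions made for illustration. A real encoder would then compress the two layers with different methods.

```python
from statistics import mean


def split_layers(gray, threshold=128):
    """Split a grayscale page (rows of 0..255 values) into two layers.

    Returns a bitonal mask (1 = dark "ink" pixel) and a background image
    in which the ink pixels are replaced by the average paper color, so
    the background stays smooth and easy to compress.
    """
    mask = [[1 if p < threshold else 0 for p in row] for row in gray]
    paper = [p for row, mrow in zip(gray, mask) for p, m in zip(row, mrow) if not m]
    fill = round(mean(paper)) if paper else 255
    background = [
        [fill if m else p for p, m in zip(row, mrow)]
        for row, mrow in zip(gray, mask)
    ]
    return mask, background


# A tiny 3x4 "scan": light paper with a few dark ink pixels.
page = [
    [250, 248, 20, 251],
    [249, 15, 18, 247],
    [252, 250, 22, 250],
]

mask, background = split_layers(page)
for mask_row, background_row in zip(mask, background):
    print(mask_row, background_row)
```

    The mask carries the sharp edges that text needs, while the background no longer contains any abrupt dark pixels, which is what lets each layer be handled by a method suited to it.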

    The main attraction of DjVu is that your scanned documents are tiny (typically less than 50KB), which makes it feasible to put them on the web. Just about every other format results in files too big for easy distribution on the web. Interestingly, you can convert a *.ps.gz file into a DjVu file, and see a dramatic improvement in file size while preserving almost all of the detail. I am not talking about simple pages here, but very complex ones with a mixture of real images / artwork, and text.

    Apologies for any mistakes, but I think that I got most of it right.

    -- GWF

    1. Re:Some facts off the top of my head... by dmarney · · Score: 3

      I have been using DjVu for more than a year now, and have tested it extensively against other image compression technologies. If you've got scanned images to display on the web, especially if they are in color, then DjVu is far and away the best technology on the market. Here is just a short list of the things that I like about DjVu:
      1. I can encode a 20MB color scan into a 100-200K file, and still get a great-looking image on the screen. I can create astonishingly tiny B&W images (how about a full page of text scanned at 300 DPI rendered in 17K?) Nothing else comes close.
      2. I can create separate image and data layers. DjVu produces a color background layer, color foreground layer, high-contrast B&W layer, and a data layer. This is essential for doing OCR, where we really need to have that B&W image to get to the text.
      3. I can encode the DjVu image to automatically upsample to match the greater resolution of my printer vs. the screen, in the same file.
      4. I can create a multi-page document either as a single file, or as a linked list of individual pages, with a file for each page. No more horrid PDF byte-serving! (Please pardon us as the Author weeps for joy.)
      5. I can construct a URL that will drop a user into the middle of a document (page 50 of 100, for example), and not lose the context of the other pages.
      6. I can use the EMBED tag to provide automatic installation of the free DjVu viewer, and I get to specify which image comes up once the software is installed. There are no sign-up forms, no harvesting of my customer's email addresses, and no taking the user out of my visual space.
      7. The viewer zooms and pans on the fly. You can zoom in to 100%, and pan through the image simply by clicking and dragging the mouse. This is the only way this should be done.
      8. All of the encoding and decoding tools are completely free for eval purposes. The decoder is free for all purposes, and most of the source is open.
      9. Once I create a DjVu file, I can convert it to other file formats such as JPEG. Try doing that with PDF!
      10. IT LOOKS BETTER -- A LOT BETTER -- THAN ANYTHING ELSE.

  16. I saw this working at Seybold by karzan · · Score: 3

    LizardTech had a booth at Seybold this year, and let me tell you this technology is very, very impressive. They demonstrated extremely high-res files at various zoom points--and explained that the files were very small. This thing also worked lightning fast. Except for a just-visible delay, zooming happens almost instantly. Another thing: I'm fairly sure it was doing raster, not vector. In any case, it was obvious it went way beyond PNG.

  17. Yeah, but... by Black+Parrot · · Score: 3

    > Apparently, it differentiates between foreground and background components of an image, and compresses each appropriately.

    Yeah, but can it detect the difference between nude and naked?

    --
    Sheesh, evil *and* a jerk. -- Jade