Pdf.js Reaches First Milestone
theweatherelectric writes "The pdf.js project aims to implement a PDF viewer using standards-compliant Web technologies. The project has reached its first milestone: it renders the sample PDF (a paper on Mozilla's Tracemonkey JavaScript engine) perfectly. However, that perfection currently comes with some caveats: 'pdf.js produces different results on pretty much every element in the browser×OS matrix. We said above that pdf.js renders the Tracemonkey paper "perfectly" if you're running a Firefox nightly. On a Windows 7 machine where Firefox can use Direct2D and DirectWrite. If you ignore what appears to be a bug in DirectWrite's font hinting. The paper is rendered less well on other platforms and in older Firefoxen, and even worse in other browsers. But such is life on the bleeding edge of the web platform.'"
Goatse
Even reading the summary, it is clear that this is very, very early development work. This is their *first* milestone; of course it's going to be severely lacking in almost every way. Of course it's not cross-browser and doesn't allow selectable text... but eventually it will be. I, for one, think this is a great idea, and can't wait to see it done!
If you read the article (I know, I know)...
pdf.js has now reached the point where a significant portion of its issues are actually browser-rendering-engine bugs, or missing features. Finding these gaps and filling some of them has been one of the biggest returns on our investment in pdf.js so far.
The problem isn't what they've written so much as the browsers not being able to support the latest and greatest HTML5/JS functionality.
I *think* it's supposed to be:
We said above that pdf.js renders the Tracemonkey paper "perfectly" if you're running a Firefox nightly on a Windows 7 machine where Firefox can use Direct2D and DirectWrite, if you ignore what appears to be a bug in DirectWrite's font hinting.
I can understand the use of this to find and fix browser bugs.
But it seems amazingly inferior to a platform-native PDF reader, on any platform imaginable. It will be slower than native x86/ARM code by far, and won't integrate well with the desktop environment.
What's with this trend recently to build everything on fundamentally sucky technologies?
I currently have PDFs set to be downloaded and opened in an external application, because PDF rendering in a browser tab (using Adobe's PDF plugin) fucks up important shortcuts: Cmd-W no longer closes the tab but throws up an annoying dialog. That alone would be reason enough to switch.
So, "Firefoxen" is now the plural of Firefox?
Oh wow, retro-trolling. Soon we'll be back to page-widening, Stephen King is dead and bell bottoms.
They've actually failed to grasp the point of PDF. You might as well go back to HTML if your PDF reader can't render the same everywhere, considering that was the whole point of PDF to begin with.
If it renders it all as images, then why go to the trouble of making this client-side?
Render as images on the server once and your problems are over for all platforms and browsers. Unless, of course, you do something stupid like using the BMP image format instead of PNG.
This is really cool. Now we just need to have web2js instead of web2c, and we can typeset documents with TeX in the browser.
-- Ed Avis ed@membled.com
(congrats, 'twas a while ago I was goatsed the last time)
I hope you mean saw that picture!
https://github.com/andreasgal/pdf.js (It's BSD licensed, minus the credits clause).
Who's the asshole now?
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
Why not use a vector format which can properly scale the text and even include the text in a format readable to search engines?
Like, for example, PDF?
They cut the sentences into separate clauses to emphasise each of them ...
PDF is old vector graphics news. If they want to help a parky out they can get TinySVG support built in to Firefox so I can finish rebuilding all of my XUL UIs in SVG. ...that don't work now unless the user knows how to re-enable support then ends up getting owned instead of a warning like getting a self-signed cert... Cough. Sorry. Oh, while I'm dreaming, getURL, putURL, and parseXML functions so I don't have to "if (typeof parseXML == 'undefined')" override them every time would be nice too :) Oh, and for those wondering about the relationship here, PDF to SVG is easier to convert and display on the fly. Well. For me. So programmers can probably do it in their sleep. Oh. Oh. And a pony.
Having to work for a living is the root of all evil.
Bzzzzt! iOS handles viewing and saving PDFs fine. Thank you for playing "I Bashed Apple on Slashdot". Try again.
Facebook is the new AOL
Render as images on the server once
"The document could not be displayed because you are not connected to the Internet."
I already have a perfectly fine Pdf viewer, called Okular.
From Okular's web site: "For Windows have a look at the KDE on Windows Initiative webpage for information on how to install KDE on Windows." The download page on KDE Windows Initiative links to detailed installation instructions. I'm not in a position to try it myself because the PC on which I'm typing this has integrated graphics, which isn't enough to run KDE according to a forum post linked from a Google search for kde system requirements.
I'm guessing you're modded Funny because - sadly - this hasn't been true of PDF in a long time now.
Even between desktop PDF readers there's now too much of a difference to even remotely be able to 'rely' on it. Even bitmaps are getting less and less reliable with applications choosing to either respect or ignore gamma tags, let alone color profile information.
As it is, I used to use FoxIt, but that started to get bloated and include oddball toolbars. So I switched to PDF-Xchange. I'm about to switch again (Sumatra, maybe?) because PDF-Xchange takes far too long to render pages with lots of graphics. And neither FoxIt nor PDF-Xchange renders vector content with anti-aliasing correctly.
http://i.imgur.com/f8udg.png
( original image source: sinfest.net . Note that left image was at 172% zoom, right image was at 150% zoom, to get the same on-screen size. wtf? )
I'm going to guess that Sumatra really won't be much better and maybe I'll have to go back to FoxIt.
Either way, it looks like Adobe Reader will have to remain installed for when these alternatives don't quite get things right.
I find it quite hilarious that people speak seriously about coding artificial intelligence as if it will happen in this decade, when at the same time we can't even achieve a consistent rendering of the same elements in different browsers.
I can render a PDF perfectly on all OSes I own (Windows, OS X, iOS, Windows Phone 7) already!
My book: Friendly F#, fun with game development and XNA; my game: Galaxy Wars by VSTeam; my gamedev language: Casanova.
This is just silly. While I can appreciate it from a point of curiosity and it is probably a fun project, this is really overloading the browser.
I would submit that things like this are actively breaking the browser paradigm. Every PDF viewer allows you to save a local copy of the PDF after it has read it from the temp directory or the download directory. To implement this thing correctly, it would require that JS have direct access to the file system, which, as I understand it, ain't fucking supposed to happen, since that would create untold numbers of security problems in a system already plagued by security problems.
While there may be arguments that this would be ok, they would all be moronic.
The entire notion of the browser needs to be forked out to an application shell with hard as nails security and a presentation shell and never the twain shall meet.
Hey KID! Yeah you, get the fuck off my lawn!
I'm not in a position to try it myself because the PC on which I'm typing this has integrated graphics, which isn't enough to run KDE according to some idiot who doesn't know what he's talking about.
Fixed that for you. KDE 4 works perfectly with integrated graphics, you just have to turn desktop effects off. It's perfectly usable without desktop effects enabled, all applications detect it and degrade gracefully, and all the controls etc. work pretty much the same. I have a laptop with integrated graphics that doesn't support desktop effects, and I don't notice the difference apart from once a week or so when I suddenly wonder why my terminal emulator doesn't have a transparent background.
This graceful degradation is a pleasant contrast to GNOME 3, I might add.
Pirate Party UK
Oh, and I also use Okular on Windows. It works quite nicely.
The end game is that shifting focus from desktop applications to cloud applications makes the desktop operating system much less important.
Envisage a day when you don't need to run a particular operating system just so you can run that one specific app.
This might sound over the top, but I am sure that given time we will be able to play the new "Crysis" (whatever that might be) in the browser on any operating system. (Of course there will still likely be some beefy hardware requirements and a juicy broadband connection.) I'm fairly confident that day will come, and it is probably not as far off as we think! Developers can target the one platform to rule them all.
Is this a troll, or did you just spectacularly miss the point? That forum post is fairly obviously about the system requirements for the KDE equivalent of Aero Glass or whatever it's called these days...
You can't do those things with most PDF files anyway.
May the Maths Be with you!
Direct2D and DirectWrite? Sorry but browser graphical acceleration must end.
WebGL should be implemented with no hardware acceleration, using graphic card emulation.
> Either way, it looks like Adobe Reader will have to remain installed for when these alternatives don't quite get things right.
You might consider installing a VM like VirtualBox or some kind of sandboxing solution so that you can convert / print / export and subsequently erase any "side effects".
Memory and surplus CPU power are getting cheaper and cheaper; I don't understand why more people don't talk about going this route.
Or perhaps you've failed to grasp the point of a v0.2 pre-release on github? In fact TFA specifically states that pixel perfect rendering _is_ their goal.
The blog post describes the current progress; it now has good rendering on one platform, progress from last week.
Back when I still used Windows, Acrobat Reader was an absurdly large app and by far the slowest PDF reader I knew over all platforms. It always struck me as absurd that Apple and Linux users had built-in, capable, lightweight PDF viewers while most Windows users used that bloated POS. Maybe acroread is better these days, but I kind of doubt it.
Switch back to Slashdot's D1 system.
Ha, that's a ghost argument.
"See, my infinite energy machine works perfectly fine, it's just that the laws of physics have not caught up yet"
does not sound like any less bullshit.
The Javascript code isn't doing the rendering of text. It uses dynamically loaded fonts and lets the platform's own font renderer render the glyphs. The Javascript code isn't pushing pixels.
There's less need for PDF than there used to be, now that you can download fonts in the browser. It might be worthwhile to take this PDF viewer and turn it into a server-side PDF to HTML translator.
Did you try GNOME's Evince? It has a little-known Windows version, which is easy to use and very decent.
Being free software makes it free of toolbars, ads and clutter. Plus, using the same software no matter the OS is nice.
But WHY? Why spend precious cycles that eat battery life and heat up your PC innards doing the same thing through twenty layers of twisted human logic, that a piece of native runtime plugin code can do as well? 'Plugin' is just a word; it doesn't need to be insecure, alien or buggy. And even if they are that, the problem lies at another level.
If anything, Pdf.js will be suitable where and when energy and resource conservation isn't a factor.
As for me, I prefer to avoid all the extra layers of abstractions and have the ball fall into the same basket.
Please enlighten me, a software developer of many years, what is this gold that is Pdf.js? I mean, apart from proof-of-concept being gold in itself.
Actually Sumatra is pretty nice. It is fast and low on resources. I haven't done any side-by-side comparison, but so far the ton of PDFs that I have look the same as they did in Foxit, so I assume they are rendering correctly.
So if you want to give Sumatra a spin, the easiest way is to use Ninite, which turns "clicky clicky next next next" into click and run. They also have tons of nice software, from CCleaner to Glary Utilities, and all of it TOOLBAR FREE without having to worry about checkboxes hidden on page 5.
As for TFA: good luck with the big complicated security mess that is PDFs. They are gonna need it. Personally I don't think having everyone with the same default PDF engine would be all that smart, as it gives a common denominator for the bad guys to target. But then again I still think H.264 sucks compared to Flash performance on older machines, so what do I know.
ACs don't waste your time replying, your posts are never seen by me.
The problem, though, is that they will never be able to test it on more platforms than those that have native PDF rendering. So it is unlikely that it will ever be advantageous over those. If we are going this way, let's just download, compile and run C code from the browser automatically. If anything, it may be easier to write cross-platform C than cross-platform Javascript+HTML!
Umm, how is parent a troll for posting a link to the actual thing the top parent spoke of? He's not.
Whoever wrote that should just go and shoot themselves in the face a couple of times.
To have a right to do a thing is not at all the same as to be right in doing it
Except that you can expect the missing browser functionality to be added, or in the case of open source browsers potentially add it yourself.
Changing the laws of physics is a rather different matter.
It's official. Most of you are morons.
Reader 10 is still a large download but it loads much faster than, say, Reader 8 or 9.
I believe it is mainly because they don't preload plugins anymore.
Search RapidShare and MegaUpload!
the problem though is that they will never be able to test it on more platforms than those that have native pdf rendering.
Wrong. The test suite compares the canvas rendering against reference images (potentially generated on a different device).
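A minimal sketch of what a reference-image comparison could look like, assuming both images are already decoded into equal-length lists of RGBA tuples. This is not pdf.js's actual test harness; the function names and the `tolerance` and `max_differing` parameters are made up for illustration, and Python is used only to keep the example short.

```python
def pixel_diff_count(rendered, reference, tolerance=0):
    """Count pixels where any channel differs by more than `tolerance`.

    `rendered` and `reference` are equal-length lists of (r, g, b, a)
    tuples. This is an assumed in-memory format, not pdf.js's own.
    """
    if len(rendered) != len(reference):
        raise ValueError("images must have the same number of pixels")
    differing = 0
    for got, want in zip(rendered, reference):
        if any(abs(g - w) > tolerance for g, w in zip(got, want)):
            differing += 1
    return differing


def matches_reference(rendered, reference, tolerance=0, max_differing=0):
    """Return True if the rendering is close enough to the reference."""
    return pixel_diff_count(rendered, reference, tolerance) <= max_differing
```

The reference list can come from any machine, which is the point being made above: the device running the comparison does not need a native PDF renderer, only the stored reference pixels. A non-zero `tolerance` is one way to absorb small anti-aliasing or hinting differences between platforms.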
The big downside is that it's all images and you can't do all those fancy things you can do with text. Like select, copy & search.
I'm working on it. To get text out of pdf.js as is, you just implement a TextGraphics object (like their existing CanvasGraphics one) and just implement the text and coordinate transform commands. There are lots of ways of getting that into a copy/pasteable form afterwards, but it's early days and I'm just coding up the OCR-ish algorithms needed to infer reading order from non-tagged PDF (the most common case).
I'm not associated with the project, but this is on their todo list too, and someone else might get it done before me. But it will be done.
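A rough sketch of the idea, under stated assumptions: a collector that only handles a "set transform" and a "show text" command, records where each text run lands on the page, and then guesses reading order by grouping runs into lines. The class and method names are hypothetical and are not the pdf.js API; it is written in Python for brevity, and the reading-order heuristic is a naive single-column one, not the algorithm described above.

```python
class TextCollector:
    """Records text runs with page coordinates instead of drawing them."""

    def __init__(self):
        # Affine matrix (a, b, c, d, e, f), identity by default.
        self.matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
        self.runs = []  # list of (page_x, page_y, text)

    def set_transform(self, a, b, c, d, e, f):
        self.matrix = (a, b, c, d, e, f)

    def show_text(self, text, x=0.0, y=0.0):
        # Map the local point (x, y) through the current matrix.
        a, b, c, d, e, f = self.matrix
        page_x = a * x + c * y + e
        page_y = b * x + d * y + f
        self.runs.append((page_x, page_y, text))

    def reading_order(self, line_tolerance=2.0):
        # PDF y grows upward, so sort top to bottom by descending y,
        # group runs whose y is within line_tolerance of the line's
        # first run, then order each line left to right.
        lines = []
        for x, y, text in sorted(self.runs, key=lambda r: -r[1]):
            if lines and abs(lines[-1][0] - y) <= line_tolerance:
                lines[-1][1].append((x, text))
            else:
                lines.append((y, [(x, text)]))
        return [" ".join(t for _, t in sorted(runs)) for _, runs in lines]
```

This would fall over on multi-column layouts, rotated text and footnotes, which is presumably where the harder inference work comes in.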
As it is, I used to use FoxIt, but that started to get bloated and include oddball toolbars. So I switched to PDF-Xchange. I'm about to switch again (Sumatra, maybe?) because PDF-Xchange takes far too long to render pages with lots of graphics.
Have you tried STDU Viewer? I switched to it after Foxit started trying to emulate Acrobat, it's small, lightweight, and reasonably good at rendering stuff, although a bit slow on page transitions sometimes. Another thing that impressed me was that I sent in a feature request, got a reply within 24 hours, and it was added in the next update.
And neither FoxIt nor PDF-Xchange renders vector content with anti-aliasing correctly
Where can I get a sample PDF with this? I've just tried it with STDU Viewer and it seems to render at least diagonal lines with antialiasing properly, but if there's some test PDF that really shows it up I'd be interested in trying it.
They cut the sentences into separate clauses to emphasise each of them ...
They cut. The sentences. Into separate clauses. To emphasise. Each of them.
FTFY.
Wow this post is incorrect on so many levels.
Java != Javascript. They are radically different languages, and if you had worked with either of them you would know that converting from one to the other is by no means trivial.
There exist quite a few implementations of a PDF reader in Java, some of them open source. Icepdf comes to mind but there are many others, even whole libraries of code for this purpose.
Javascript really has nothing in common with Java except the name. You can thank the marketing department at Netscape for that.
...Thus making them non-sentences. Win.
Unity? Screw that: XFCE. Slashdot Beta? Screw that: SoylentNews. Australis? Screw that: Pale Moon. UX developers DIAF
Changing the laws of physics is a rather different matter.
Black holes do it all the time! :p
Remember to maintain your supply of
There's also another open source PDF project called Trapeze. It currently works in Firefox, Safari and Chrome. Demo at: http://trapeze.xyrka.com/ Project at: http://code.google.com/p/trapeze-reader/ I even posted it to Slashdot a while back, but it never made it to the front page... http://slashdot.org/firehose.pl?op=view&id=17924850