Not that I'm bagging them out... tokenisation for the CJK languages is extremely difficult. The most naive kind of tokenisation just treats each character as a separate word (i.e. what Office is doing), but that definitely isn't going to be correct in all cases either.
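To make the difference concrete, here is a rough Python sketch comparing the one-character-per-word approach with a greedy longest-match lookup. The three-word dictionary and the function names are made up for illustration, and greedy matching is only one simple alternative; it still gets some sentences wrong.

```python
def char_tokens(text):
    """Naive approach: every non-space character is its own 'word'."""
    return [ch for ch in text if not ch.isspace()]


def longest_match_tokens(text, dictionary, max_len=4):
    """Greedy forward longest match against a word list.

    Characters that start no dictionary word fall back to
    single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary, purely for illustration.
WORDS = {"東京", "東京都", "京都"}
sample = "東京都に住む"

print(char_tokens(sample))
print(longest_match_tokens(sample, WORDS))
```

The greedy version keeps "東京都" together where the per-character version splits it into three, but it has no way of knowing when the shorter "東京" or "京都" would have been the right reading, which is why real tokenisers lean on much larger dictionaries and statistics.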
I'm genuinely curious as to whether there are better algorithms floating around, particularly because the company I work for develops software which analyses text for indexing and searching. It would be rather nifty, say in the case of Japanese, to normalise text which can be written three or more ways into a single representation of each word.
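The easy part of that normalisation can be sketched with nothing but the Python standard library: fold half-width katakana to full-width with NFKC, then shift katakana onto hiragana. The function name is my own, and this deliberately does nothing about kanji, which would need a dictionary of readings.

```python
import unicodedata


def normalise_kana(text):
    """Fold half-width and full-width katakana onto hiragana.

    NFKC turns half-width katakana into full-width katakana; the
    katakana block U+30A1..U+30F6 sits exactly 0x60 above the
    matching hiragana code points.
    """
    folded = unicodedata.normalize("NFKC", text)
    out = []
    for ch in folded:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:
            out.append(chr(code - 0x60))
        else:
            out.append(ch)  # kanji, hiragana, Latin etc. pass through
    return "".join(out)


# Three spellings of the same kana sequence collapse to one key.
for spelling in ("カタカナ", "ｶﾀｶﾅ", "かたかな"):
    print(normalise_kana(spelling))
```

An index keyed on the normalised form would let a search for any of the kana spellings find the others; matching those against a kanji spelling is the hard bit this sketch leaves out.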
I'm amazed they're using Word so much. My last job was at a company designing XML-based document management software, and the boss there was an ex-lawyer. He was telling me in great detail how the legal profession would outright shun Word in favour of WordPerfect.
Yeah, if doctors start making phone calls. In real life, doctors refuse to give out details over the phone even if you call them, so there isn't much to be lost here.
Sure, but that's different than using a blatantly wrong term for something.
Right, the last update of WinLAF was around 18 Feb 2005, but when was the last update of the Windows L&F itself? Some time around 2001? Seriously, this project does make Swing look much more native, and even if it hasn't been officially released in a while, that still means it's being updated twice as fast as Swing itself. ;-)
BTW, there is a third-party project providing a look and feel for Swing which fixes some of the bugs with Sun's implementation which would otherwise take forever to fix.
It's called WinLAF.
I don't think it's okay to use an excuse which amounts to saying "but heaps of other idiots do it".
They said "features of".
All joking aside, I'll take more notice when I see a site dealing with serious traffic involving database transactions and other services. Else RoR will always be seen as a bit of a hobbyist's plaything, suitable for prototyping or small sites only.
There are already some of those in the list. Just because not every single site is under heavy load, you're saying that all the sites are toys? Fair enough, then. Slashdot is probably a toy too.
"Can you name any website or application currently in production that does."
The Rails Wiki has a list.
"Who uses TIFF uncompressed? Or did you mean TIFF with bitmap encoding?"
We use it, because every other option we tried, LZW in particular, turned out to be incompatible with some user's software.
Well, I imagine that the PubSub for each site would be centralised (either at the site or hosted somewhere else), but each site would probably have its own distribution node, so it's decentralised in that respect. Either way it's not free, because someone still pays for all the bandwidth, and the site still pays for its hosting. ;-)
Funny, I used to submit reports in PostScript format all the time, and those used to have images. Did it suddenly become less featureful in recent years or is it just something that happened as soon as PDF was created?
Once there is working PubSub, the work will be concentrated at the PubSub nodes. The site will send one message to the PubSub service, and that service will be one built with large-scale messaging in mind.
For printing tasks, PostScript seemed to work perfectly fine before PDF existed. For web browsing, users should never be subjected to a format which stores text as binary.
"If Adobe folds up tomorrow, PDF will survive."
Damn... that's a shame.
So, any idea how we can kill the beast which is PDF? There must be some way to get rid of the piece of crap.
I know it's a crazy suggestion, but instead of having hundreds of people polling a single RSS feed, why not have the server which hosts the RSS feed actually PUSH the updates out to the people who are interested?
We already have a nice and simple protocol (XMPP) which could be used for this, although admittedly PubSub isn't as final as it could be.
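For what it's worth, the push idea boils down to something like the toy below. This is an in-memory Python sketch, not XMPP and not any real PubSub library; the class, the topic name and the reader counts are all invented for illustration.

```python
class PubSubNode:
    """Toy publish/subscribe node: one publish fans out to all subscribers."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, item):
        """Deliver item to every subscriber; return how many got it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(item)
        return len(callbacks)


def polls_per_day(readers, poll_minutes):
    """Requests the feed host sees per day when every reader polls."""
    return readers * (24 * 60 // poll_minutes)


node = PubSubNode()
inboxes = {name: [] for name in ("alice", "bob", "carol")}
for inbox in inboxes.values():
    node.subscribe("example-feed", inbox.append)

# The site publishes once; the node does the fan-out.
delivered = node.publish("example-feed", "New story posted")
print(delivered)

# Hypothetical numbers: 300 readers polling every 30 minutes.
print(polls_per_day(300, 30))
```

With polling, the host answers every one of those requests whether or not anything changed. With push, the site sends one message per update and the fan-out cost lands on the node, which is the point of having a service built for it.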
"Yes, I know: sending a binary image by PDF wastes bandwidth; TIFF is much more efficient, and there are plenty of free TIFF viewers. But I can't assume that everybody has those viewers, and I'm not going to complicate my professional life by forcing people to download software when I know they already have software that will do the job."
TIFF is a bitch in itself.
When generating TIFF files, I started to discover that even if a user had a TIFF viewer, the odds of them being able to open the specific variety of TIFF which we created would vary. The only variety which all users could open was the uncompressed one. Every kind of compression we tried was unsupported in some particular app which the user insisted on using.
And of course, TIFF is awfully inefficient when sent uncompressed. You'd be better off with PNG, in my opinion.
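To put a rough number on it, the Python sketch below builds a synthetic 200x200 RGB image (flat white with a blue square, my own invention) and compares the raw pixel size, which is roughly what an uncompressed TIFF has to carry, with the same bytes run through DEFLATE, the compression PNG uses. A real PNG also filters each row first, and a photo would compress far less than this flat test image.

```python
import zlib

WIDTH, HEIGHT = 200, 200  # arbitrary size chosen for the illustration


def synthetic_rows():
    """Yield RGB rows: white background with a blue square in the middle."""
    for y in range(HEIGHT):
        row = bytearray()
        for x in range(WIDTH):
            if 50 <= x < 150 and 50 <= y < 150:
                row += bytes((0, 0, 128))
            else:
                row += bytes((255, 255, 255))
        yield bytes(row)


raw = b"".join(synthetic_rows())
deflated = zlib.compress(raw, 9)  # lossless, so the pixels survive intact

print(len(raw), len(deflated))
```

The exact compressed size depends on the zlib build, so it isn't quoted here, but for flat artwork like this it comes out at a small fraction of the raw size.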
Why use a plugin when you can already store JavaScript, CSS and images in a single, STANDARD HTML file?
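In case it isn't obvious what a single-file page looks like, here is a rough Python sketch that generates one. It assumes the images are embedded as base64 `data:` URIs (RFC 2397), which is my reading of the idea and which not every browser supports; the function names and the page content are made up.

```python
import base64


def data_uri(payload, mime_type):
    """Encode raw bytes as a base64 data: URI (RFC 2397)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return "data:" + mime_type + ";base64," + encoded


def build_page(image_bytes):
    """Return one HTML document with its CSS, script and image all inline."""
    image_src = data_uri(image_bytes, "image/png")
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        "<title>Single-file page</title>\n"
        "<style>body { font-family: sans-serif; }</style>\n"
        "<script>function greet() { alert('hello'); }</script>\n"
        "</head>\n<body>\n"
        '<img src="' + image_src + '" alt="inline image" onclick="greet()">\n'
        "</body>\n</html>\n"
    )


# Placeholder bytes standing in for a real PNG file's contents.
page = build_page(b"\x89PNG placeholder")
print(page)
```

Base64 inflates the image by about a third, so this trades some size for having everything in one file.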
Probably quite a long time, but it's not something you do every day. :-)
So in other words, you don't disagree with my comment.
You could always have a RAID-6 array of petabyte-sized hard drives, couldn't you?
Man, TV in the US must suck more balls than I thought.
At times like these, it's nice to know that Ruby on Rails' AJAX helper functions already handle neat things like scripting actions on errors or sending errors to a different DOM element.
On Gizmo, you can have a conference call with as many people as your system and bandwidth can support. Supposedly someone once set up a 28-way conference.