No offense, but when I saw your nick, I couldn't stop laughing for two minutes... Imagine a phoenix talking about the reason of some mystical fires... like "it wasn't me, it was the neighbor's dog (or wait, maybe it was volcanic activity?)"...
As far as I know, image search in the way you want it is still only a dream. But. Approx 2 years ago I attended a conference focused (mainly) on theoretical computer science. I saw some researchers (I think they were from Italy, not sure) present an early implementation of their algorithm to look for similar images to the one you select.
The idea behind it: for a computer, it's not easy to tell what exactly an image contains. E.g. take all those "type the word you see above inside this box to prove you are not a bot" registration forms. If there are no working algorithms to tell "this image contains the word SLASHDOT written in yellow and blue stripes on a pink-dotted black background", the chances of creating an algorithm to tell "this is a game of tennis, it is probably played in the afternoon somewhere in England" are really low.
However, by using various approaches from CG (comp. graphics), you MAY be able to tell whether two images are similar or not -- as simple examples consider edge detection, color spectrum, etc. As I mentioned, such algorithms have already been implemented and their success ratio is reasonably high. I expect that it won't take long until we see them on Google.
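To make the "color spectrum" idea concrete, here is a minimal sketch in Python. It is NOT the algorithm those researchers presented (I don't know its details); it just compares coarse color histograms using histogram intersection. The function names, the bin count and the toy pixel lists are all made up for illustration.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized coarse RGB histogram of a list of (r, g, b) pixels (0-255)."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}

def similarity(pixels_a, pixels_b, bins=4):
    """Histogram intersection: 1.0 = same color distribution, 0 = disjoint."""
    ha = color_histogram(pixels_a, bins)
    hb = color_histogram(pixels_b, bins)
    return sum(min(ha[k], hb[k]) for k in ha.keys() & hb.keys())

# Toy "images" as flat pixel lists (invented data, not real photos).
sunset_1 = [(250, 120, 30)] * 60 + [(40, 20, 60)] * 40
sunset_2 = [(240, 110, 20)] * 55 + [(50, 30, 60)] * 45
tennis = [(30, 160, 40)] * 80 + [(255, 255, 255)] * 20

print(similarity(sunset_1, sunset_2))  # high: both are orange over dark
print(similarity(sunset_1, tennis))    # no shared color buckets at all
```

A histogram ignores where the colors sit in the picture, which is why a real system would combine it with something like edge detection.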
Note that using the ideas above you CAN search for an image with a given subject -- it just requires two stages. Suppose you want an image of the sun setting somewhere in the mountains. Stage 1: you enter "sunset" into Google's present search engine. You get lots of sunsets, several dogs named Sunset, a Chinese girl Sun Set, etc. Stage 2: you select the sunset most resembling the image you want and you tell Google (or some other engine) to find all similar images. Et voila.
> He claims that file-renaming is better in nautilus because the only way to do it is through a context menu, and furthermore, the filename without extension is highlighted by default. Personally, I find both of those "features" terribly annoying. Quite often, all I want to do is change the extension on a file. Nautilus' behavior makes this much harder than it is in windows.
No offense, but I don't think that parent post is so terribly insightful.. IMHO the difficulty in this particular case is exactly the same: press END (highlight gone, cursor at the end), press BACKSPACE 3 times, type the new extension.
The only difference comes into play if you want to modify all BUT the extension (most often IMHO), where you can save 4 keypresses. Not much, but it helps. And surely it is a new and cool idea.
Anyhow, nothing beats regexps when renaming... in some Linux distros you can find a small Perl script "rename" that does exactly what I want it to do... as soon as I have to rename more than three files in a similar way, no thanks, fancy GUI, command line it is...
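I won't reproduce the Perl script here; as a rough illustration of the same idea, here is a Python sketch. The function names are hypothetical, and it does no collision checking (two files mapping to the same new name), so treat it as a sketch and keep dry_run on until the plan looks right.

```python
import os
import re

def plan_renames(names, pattern, replacement):
    """Return (old, new) pairs for every name the regex actually changes."""
    regex = re.compile(pattern)
    plan = []
    for old in names:
        new = regex.sub(replacement, old)
        if new != old:
            plan.append((old, new))
    return plan

def rename_in_dir(directory, pattern, replacement, dry_run=True):
    """Apply the plan inside one directory; with dry_run=True only report it."""
    plan = plan_renames(sorted(os.listdir(directory)), pattern, replacement)
    for old, new in plan:
        if not dry_run:
            os.rename(os.path.join(directory, old),
                      os.path.join(directory, new))
    return plan

# Normalize two spellings of the same extension in one go.
print(plan_renames(["a.JPG", "b.jpeg", "notes.txt"], r"\.(JPG|jpeg)$", ".jpg"))
# prints [('a.JPG', 'a.jpg'), ('b.jpeg', 'b.jpg')]
```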
This is already at least the second problem somehow connected with ptrace() in the kernel. Kernels prior to 2.2.19 were vulnerable to a race-condition attack that enabled local users to gain root privileges. This was one of the most "famous" problems of recent years and it's known as the execve/ptrace exploit.
More details:
This vulnerability exploits a race condition in the 2.2.x Linux kernel
within the execve() system call.
By predicting the child-process sleep() within execve(), an attacker
can use ptrace() or similar mechanisms to subvert control of the child
process. If the child process is setuid, the attacker can cause the
child process to execute arbitrary code at an elevated privilege.
There are also other known lesser security issues with Linux kernels prior
to 2.2.19 which have been noted as fixed.
> someone just has to look through the executable for strings.
Oh my... looks like you never wrote anything remotely similar to a backdoor. And I don't mean a just-for-debugging-and-then-i-remove-it backdoor, I mean a real one. A backdoor that's meant to be misused. E.g. you are writing some software for your bank ;) That kind of backdoor. The first rule is that the backdoor should be as invisible as possible. And some strange password-like string is the simplest way of shouting "hey, I'm here!"
A real backdoor should look e.g. like the legendary one that was once in a C compiler... for more info see the Jargon File entry on backdoor.
> The only solution I can think of is wide-spread adoption of PGP (or equivalent) aware mailers and certification of mail.
I have to dampen your optimism a bit. IF public-key encryption ever finds its way to the general public (I hope and think so), there are two possibilities:
a) Your public key will be available to the general public -- this is how it will probably work. If someone wants to send you an e-mail, he obtains your public key in a trusted way (e.g. from a trusted key server), encrypts the message and sends it. If a spammer wants to send you spam, once he gets your e-mail address, he does exactly the same: obtains your public key, encrypts the spam and sends it. The only difference from today's situation: it will be impossible to filter spam by content on the server side (the server can only block known spamming IP addresses).
b) You give your public key only to friends you trust. This is exactly the approach "everything coming from an address that's not in my address book has to be spam", and it even contradicts the basic idea: it's your public key...
> The compression ratio achieved therefore measures how many repeated fragments, words or phrases occur in the text.
There is a minor problem with this sentence. And with this whole gzip business. It is misleading. Words, phrases? You cannot force gzip to match words; gzip tries to exploit every similarity it finds, even at the character level. E.g., if your "spam dictionary" contains the words sex and pants, mail about sextants will have a good compression ratio. And there is no way to prevent this. That's why Bayesian filters (operating on words) outperform gzip by a league. That's (one of several reasons) why I think this article belongs not to /. but to a wastebin instead. It simply presents a worse approach to doing something. Interesting idea, yes, but that's all.
(Just FYI: it has been proven that the bzip2 algorithm due to Burrows and Wheeler exploits all such repetitions in the input file nearly optimally -- within some small ratio. Hence, it is even worse to use it as a spam filter :-)
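The sex + pants = sextants effect is easy to reproduce. Below is a minimal Python sketch using zlib (the same DEFLATE algorithm gzip uses) with a preset dictionary. This is my own toy setup, not the article's actual filter: the helper name, the three-word "spam dictionary" and the control word are all invented.

```python
import zlib

def compressed_size(message: bytes, dictionary: bytes) -> int:
    """Bytes DEFLATE needs for `message` when primed with `dictionary`."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return len(comp.compress(message) + comp.flush())

spam_words = b"sex pants viagra"  # invented toy "spam dictionary"

# "sextants" is innocent, but it shares the substrings "sex" and "ants"
# with the dictionary; "notebook" has the same length and shares nothing.
innocent = compressed_size(b"sextants", spam_words)
control = compressed_size(b"notebook", spam_words)

print(innocent, control)
print("sextants looks spammier:", innocent < control)
```

The compressor has no notion of word boundaries, so the innocent word gets credit for fragments of spam words, which is exactly the objection above.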
> One problem with this is the right to open other people's mail. But you could use some basic scrambling (rot-13) to make sure that no one sees the inside. It wouldn't make difference to the comparing script.
And why exactly should rot13 help? If root on your machine wants to read your (non-encrypted) mail, he does so. Anybody else will still have the same chance to read it, rot13 or not. When the e-mail arrives, a mail daemon takes it and puts it into the appropriate user's mailbox. (Or sends back an error message -- no such user, etc.) The only change will be that this daemon will call another program -- the spam filter. Both of them will run with root privileges and no user (except for root) will be able to see your e-mail.
Notice the word non-encrypted in the previous paragraph. As soon as public-key cryptography becomes more widely used by the general public, there will be some "default" ways to publish your public key. AND there will be no problem for the spammer to obtain your public key automatically with your e-mail address (or maybe to obtain it as soon as he knows your e-mail). When this comes true, server-side spam filtering will become impossible, because the server sees only the encrypted message and has no way to tell whether it's spam or not.
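For the record, rot13 is a fixed letter substitution with no key, so anyone who can read the file can undo it by applying it again. A tiny Python sketch (the sample strings are made up; the rot_13 codec ships with the standard library):

```python
import codecs

def rot13(text: str) -> str:
    """Rotate every ASCII letter by 13 places; everything else is untouched."""
    return codecs.encode(text, "rot_13")

scrambled = rot13("Hello")
print(scrambled)         # prints Uryyb
print(rot13(scrambled))  # prints Hello: applying it twice undoes it
```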
Ever been subscribed to a prolific e-mail conference? Say hundreds of mails daily? And did you manage to keep track of the separate discussion threads WITHOUT a mail client that supports threading? Well, you have my respect, but I really like the computer to do such things for me...
Well, pardon me if I'm wrong... but if you are fond of an all-text interface (+aalib for viewing attached images, etc. :-) as I am and if you were really willing to learn _all_ the kbd shortcuts in pine, then mutt (and NOT pine) is the right client for you. Mutt has had threading support for _ages_, it is a much more powerful tool and the kbd shortcuts are IMHO more logical, especially to someone used to working with Linux and the editor vim.
Aaargh... why oh why did they have to add the threading support? Looks like pine is starting to be usable.. (It's about time, the version is already 4.50...) And that means fewer mutt users :(( Mutt is the one and only _real_ mail client! Hypnotoad uses mutt! All hail the hypnotoad!
This reminds me of an old joke:
American astronauts arrive on the Moon. Their communication with Earth:
> > The primary subtitle is "Bigger Disk", which is suspiciously similar to the subject lines of half of the spam I get.
> Are there any internal drives with this kind of capacity?
You want a "bigger disk" that's internal??? I think that all the e-mails mentioned above were talking about external ones...
Duron, Opteron.. I suppose the next one will be called Moron ;)
(after considering other names like Dodecahedron, Rhododendron, CowboyNeal-drone, etc.)
I remember Ingo Molnar introducing this scheduler running in O(1) time months ago, sometime late in 2002... AFAIK it has been a part of the 2.5 kernel for quite a long time.. and at the time it was first tested there were some benchmarks.. I vaguely remember something about "we tried to launch several hundreds of processes, w/o the scheduler: 15 minutes, with the scheduler: 2 seconds." So what is so new about some benchmarks being available?
Or am I completely off-topic? ;)
There are many correct answers, my favourite being seeing e-mail addresses makes you panic!
$ more /var/spool/mail/$USER ???
I say: /var/spool/mail/$USER
$ lpr