daviddlewis · Slashdot Mirror

Re:Emailchemy.. on How Do You Store and Reconcile Email Archives? · 2005-03-13 04:45 · Score: 1

Agreed. Emailchemy was the only program I found that was able to convert my 2GB Outlook Express archive to other formats. (I happened to move to mail.app on the Mac, but Emailchemy supports many other output formats, including pure mbox.) By comparison, Thunderbird choked on importing this archive.

I actually used a beta version of Emailchemy that retained the hierarchical structure of my (approximately 3000) folders. This was the biggest sticking point with most tools I looked at.

Dave

who best to support: EFF, PubPat, ? on EFF To Fight Dubious Patents · 2004-04-20 05:56 · Score: 2, Interesting

Any opinions on which organization is most worthy of support with my limited donations: EFF or PubPat or someone else?

Dave

Re:probably just a fluke on Analyzing AT&T's Anti-Anti-Spam Patent · 2003-11-16 04:02 · Score: 1

I worked with Bob Hall at Bell Labs and AT&T Labs. He's had a long interest in helping people manage their personal email, as evidenced by his other research papers and patents. He's one of the good guys.

Though I don't have any firsthand knowledge, I think there's a small chance this patent might actually end up getting used against spammers. AT&T runs an ISP, so it is in a very good position to know if one of its users is sending out messages that use hashing techniques for avoiding duplicate detection. They can simply forbid this, of course, but having the patent is another weapon, particularly against large abusers.

In contrast to what some other posters indicated, hash-based duplicate detection is widely used by ISPs, and spammers do widely use anti-hashing techniques. I recently did some consulting work designing anti-anti-hashing techniques, but have already seen spammers use anti-anti-anti-hashing. And so it goes.
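For anyone who hasn't seen the idea, here is a minimal sketch of hash-based duplicate detection and the simplest way around it. It is only an illustration, not what any particular ISP or the patent does: the normalization rule, the choice of SHA-256, the function names, and the sample messages are all assumptions made up for the example.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Hash a message body after light normalization (lowercase, collapsed whitespace)."""
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def count_duplicates(messages):
    """Count how many messages share each fingerprint."""
    counts = {}
    for body in messages:
        key = fingerprint(body)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical bulk mailing: the first two differ only in case and spacing,
# the third has a random token appended to change its hash.
bulk = [
    "Buy cheap widgets now!",
    "buy  cheap widgets NOW!",
    "Buy cheap widgets now! xq7zt",
]
counts = count_duplicates(bulk)
```

The first two messages collapse to one fingerprint, while the appended token gives the third a different one, which is the basic anti-hashing trick. Countering it means hashing something more robust than the whole body, and that is where the arms race starts.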

Dave

FYI: Patent number is 6,643,686 on Analyzing AT&T's Anti-Anti-Spam Patent · 2003-11-16 03:51 · Score: 1

You can search by patent numbers at http://patft.uspto.gov/netahtml/srchnum.htm

Dave

Chicago on Captured! By Robots - A Musical/Mechanical Marvel? · 2003-11-10 01:28 · Score: 1

Saw CBR in Chicago, last year I think. "Robots" is a bit generous to describe the technology. Think more like the classic one-man band with electronics, and that gives it all a silly/fun character rather than a BDSM enslaved-by-robots feel. Don't know enough about punk to comment on the quality of the music. Very very loud, so that I couldn't make out the lyrics at all.

Re:Heinlen or Niven on Power Outages Strike East Coast · 2003-08-14 10:08 · Score: 1

Well, Joe Haldeman in _Worlds_ wrote about the *really* last US blackout.

And of course Arthur C. Clarke in "The Nine Billion Names of God" wrote about the really last universal blackout.

Re:Tax deductible? on Funding Open Source? · 2003-07-15 02:43 · Score: 1

A large telecom company I once worked for donated some software developed by their research lab to a university, had it formally appraised for value, and took a tax deduction. I gather the legal paperwork was substantial, but so was the deduction. Anybody know of any companies that have contributed to open source in this fashion?

Dave

It's nuts to use IBM patent server anyway on Delphion To Start Charging For Patent Access · 2001-05-14 18:37 · Score: 1

No one who wants to protect their intellectual property - including their right to give it away - should use the IBM patent server. Do you really want to let the world's most aggressive pursuer and licenser of patents see your query...?

Dave

Say, what's this about needing a sig to avoid losing your last line?

Yes, Really a Neural Net (or something similar) on AOL Introduces Neural-Net Content Filtering · 2001-05-08 20:34 · Score: 1

The idea would be that pages that get through and shouldn't (in someone's opinion) are used as positive training examples for a neural net. Pages that don't get through and should are used as negative examples. One trains the neural net to distinguish between positive and negative examples, using words, phrases, etc. as input features.

The big advantage of these techniques is that they can more or less gracefully combine hundreds of different clues, something that is difficult for a human to do by hand. They still make mistakes of course - the question is whether the number of mistakes is reasonable, in your opinion.

There's a huge literature on using neural nets and other machine learning techniques to train systems for distinguishing between all sorts of text content. See the sections on text categorization in _Machine Learning_ by Mitchell or _Foundations of Statistical Natural Language Processing_ by Manning & Schuetze. There's also a survey paper at http://faure.iei.pi.cnr.it/~fabrizio/, and an upcoming workshop described at http://www.daviddlewis.com/events/otc2001/

Dave
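To make the training idea concrete, here is a minimal sketch using a single-layer perceptron, about the simplest thing that could be called a neural net. This is not a claim about what AOL actually built: the bag-of-words features, the function names, and the four toy pages are assumptions invented for the example. Label +1 means "should be blocked" (a positive example) and -1 means "should get through".

```python
from collections import defaultdict


def features(text):
    """Bag-of-words features: the set of lowercase tokens in the text."""
    return set(text.lower().split())


def train_perceptron(examples, epochs=10):
    """Train a single-layer perceptron on (text, label) pairs, label in {+1, -1}."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = features(text)
            score = bias + sum(weights[w] for w in feats)
            predicted = 1 if score > 0 else -1
            if predicted != label:
                # Mistake-driven update: nudge every active word toward the true label.
                for w in feats:
                    weights[w] += label
                bias += label
    return weights, bias


def classify(text, weights, bias):
    """Return +1 (block) or -1 (allow) for a new page."""
    score = bias + sum(weights.get(w, 0.0) for w in features(text))
    return 1 if score > 0 else -1


# Hypothetical training pages.
examples = [
    ("win free casino money", 1),
    ("casino poker bonus free", 1),
    ("school homework math help", -1),
    ("free math homework help", -1),
]
weights, bias = train_perceptron(examples)
```

Note that "free" appears on both sides, so no single-word rule works; the learned weights combine several clues per page, which is the point made above. A real system would use far more data, richer features, and held-out pages to estimate the error rate.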

Re:Bell Labs isn't AT&T any more on Bell Labs Creates Plastic Superconductor · 2001-03-09 21:00 · Score: 1

Right - AT&T's lab is called, creatively, "AT&T Labs". Lucent's lab is "Bell Labs".

Dave

evil vs. clumsiness on Making Sense Of An Employee IP Agreement · 2001-02-18 22:31 · Score: 1

At least half the problems with NDAs and IP agreements that I see in my consulting work result from the company using a lawyer who doesn't specialize in IP. The result can be bizarre: draconian claims in one part of the agreement and gaping holes in others, a document they think is an NDA but really makes me their employee, etc. Unfortunately, I sometimes end up paying my IP attorney to rewrite their documents for them, just to protect myself!

Dave

can search engines be improved? on Search Engines-Does Obscurity Prevent Exploitation? · 2000-09-13 11:27 · Score: 1

Absolutely! There are a number of techniques already known in the information retrieval research community (relevance feedback in particular) that aren't being exploited in current web search engines, and would make a big difference. What's holding them back is usually some combination of efficiency problems and lack of a good interface metaphor for allowing naive users to effectively use the technique. As others have pointed out, most people don't even use phrases.
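For readers who haven't met relevance feedback, here is a minimal sketch of one classic form of it, the Rocchio update: the user marks a few results as relevant or not, and the query is reweighted toward the former and away from the latter. Treat it as an illustration only. The raw term counts, the alpha/beta/gamma values (commonly quoted defaults, chosen here for the example), the function names, and the sample documents are all assumptions.

```python
from collections import Counter


def term_vector(text):
    """Raw term counts for a piece of text."""
    return Counter(text.lower().split())


def rocchio(query, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a reweighted query vector using the Rocchio formula."""
    new_query = Counter()
    for term, weight in term_vector(query).items():
        new_query[term] += alpha * weight
    for doc in relevant_docs:
        for term, weight in term_vector(doc).items():
            new_query[term] += beta * weight / len(relevant_docs)
    for doc in nonrelevant_docs:
        for term, weight in term_vector(doc).items():
            new_query[term] -= gamma * weight / len(nonrelevant_docs)
    # Terms pushed to zero or below are dropped from the new query.
    return {t: w for t, w in new_query.items() if w > 0}


# Hypothetical session: the user wanted the animal, not the car.
expanded = rocchio(
    "jaguar",
    relevant_docs=["jaguar cat habitat"],
    nonrelevant_docs=["jaguar car dealer"],
)
```

The expanded query picks up "cat" and "habitat" and drops "car" and "dealer". The hard part is not this arithmetic but getting users to supply the judgments, which is the interface problem mentioned above.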

I think it's a non-issue whether the criteria the engines use are publicized or not. There are enough index spammers out there that any weaknesses in the criteria get discovered, exploited, and patched fairly quickly.

Dave

Re:Its called prior work on What's A Reluctant Inventor To Do? · 2000-09-11 08:32 · Score: 1

Agreed. I (not a lawyer) am pretty sure both you and your company's attorneys are legally obliged to mention any relevant prior art you're aware of. So go become aware.

--Dave

Comments · 13