juntunen · Slashdot Mirror


User: juntunen

juntunen's activity in the archive.

Stories: 0
Comments: 5
First seen: 2003-03-02
Last seen: 2005-03-25
Profile: (view on slashdot.org)

Comments · 5

Netcraft says AdTI's web server is running FreeBSD on Open Source As Legal Time Bomb · 2005-03-25 18:06 · Score: 1

http://uptime.netcraft.com/up/graph/?host=www.adti.net
Popfile for mailing lists on Bayesian Filtering Outside of Email? · 2004-03-30 01:54 · Score: 1

I am trying to set up Popfile to sort mailing list messages into multiple buckets: very interesting, mildly interesting, worthless, and so forth. I belong to several high-volume mailing lists, and I've been wishing for an easier way to find what I care about without having to skim several hundred messages. I am hoping the classifier will eventually pick up on which people and topics I like best.
How about the .name TLD? on Registering a Locality Based .US Domain? · 2004-01-08 07:31 · Score: 3, Informative

I've been looking for a simple domain to host personal photos and journals.

Why not just get something in the .name domain? The Global Name Registry says ".name goes from a special interest, third level TLD to a mainstream, second level domain space for individuals, structurally identical to a TLD like .com, on 11:30AM EST on 14 January 2004."
Those Wacky Apple Folks on Apple Switches tcsh for bash · 2003-08-26 05:28 · Score: 4, Interesting

I followed the link provided, and found the section titled "Unix-lover Heaven" rather funny. It said, "Panther will include a final X11 window server for Unix-based apps, improved NFS/UFS, FreeBSD 5 innovations as well as support for popular Linux APIs, IPv6 and other important acronyms." I'm guessing the marketing folks wrote that last bit...
Is the theory sound? on TarProxy Creates Tar Pit... For Spammers · 2003-03-02 04:04 · Score: 1

Someone correct me if I'm wrong here (my math is not spectacular) but as I understand Bayesian techniques, a whole message is tokenized and then the fifteen or so "most interesting" tokens -- as compared to the spam corpus -- are analyzed to come up with a probability that the given message is or isn't spam. If TarProxy is creating tokens and then analyzing them as they arrive, how much of the message has to be received before a spam identification is made? Will it work poorly with a small corpus? The first things to arrive are the headers, which are tokenized along with the message body. Lots of spam comes from Yahoo, and so does traffic from friends -- wouldn't classification based purely on header tokens tend to call all Yahoo mail spam under this scheme? I hope someone with a background in the math this is based on can make a comment.
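
To make the question concrete, here is a rough Python sketch of the "fifteen most interesting tokens" scoring as I understand it from Paul Graham's "A Plan for Spam". It is not TarProxy's actual code, which I have not checked. The function names and every per-token probability below are made up for illustration; the 0.4 default for unseen tokens and the 0.01 to 0.99 clamp follow the essay as I remember it.

```python
from math import prod

def interesting_tokens(tokens, spam_prob, n=15, unknown=0.4):
    """Return the n tokens whose spam probability is furthest from neutral (0.5)."""
    scored = []
    for tok in set(tokens):
        p = spam_prob.get(tok, unknown)   # unseen tokens get a mildly "hammy" default
        p = min(0.99, max(0.01, p))       # clamp so no single token is absolute
        scored.append((tok, p))
    # Most interesting first; break ties by token so the result is deterministic.
    scored.sort(key=lambda pair: (-abs(pair[1] - 0.5), pair[0]))
    return scored[:n]

def combined_spam_probability(tokens, spam_prob, n=15, unknown=0.4):
    """Combine the picked per-token probabilities into one score for the message."""
    picked = interesting_tokens(tokens, spam_prob, n, unknown)
    p_spam = prod(p for _, p in picked)
    p_ham = prod(1.0 - p for _, p in picked)
    return p_spam / (p_spam + p_ham)

# Hypothetical per-token spam probabilities, invented for this example.
probs = {"yahoo.com": 0.7, "viagra": 0.99, "lunch": 0.05, "thanks": 0.1}

headers_only = ["yahoo.com"]                     # all that has arrived so far
full_message = ["yahoo.com", "lunch", "thanks"]  # a friend's complete message

print(combined_spam_probability(headers_only, probs))
print(combined_spam_probability(full_message, probs))
```

With these invented numbers, the header-only score is 0.7, while the complete message from a friend drops to about 0.013. That is the worry in a nutshell: a decision made before the body arrives only sees the header tokens.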