Slashdot Mirror


Bayesian Filter Testing?

pu33y asks: "Since the publication of Paul Graham's A Plan For Spam, several programs that perform Bayesian filtering have become available, including CRM114 and Bogofilter. What is missing is any serious testing of how they perform relative to each other and to other, non-Bayesian filters. Searching Google has turned up nothing, and when I asked Paul Graham, he was unaware of any such testing as well. Can anyone point to any such testing or provide the results of their own personal experiences with Bayesian filters?"

127 comments

  1. DSpam by jalet · · Score: 4, Interesting

    DSPAM (http://www.networkdweebs.com) rocks!

    Some impressive stats were posted to the mailing list.

    Its main feature is that it's completely maintenance-free, and that even dumb people can use it (I know, I am one).

    My personal stats so far are 2 false positives (one from PayPal, one from a company I work with), 280 spams learnt (I told it they were spam), 2877 spams caught and 4354 innocent messages.
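
To put those counts in proportion, here is a rough calculation. It rests on two assumptions that the comment does not confirm: that the 280 "learnt" spams were misses the user had to report, and that the 2 false positives are not already counted among the 4354 innocent messages.

```python
# Figures quoted in the comment above.
false_positives = 2      # legitimate mail wrongly flagged as spam
missed_spam = 280        # spam the user had to report ("learnt")
caught_spam = 2877       # spam flagged automatically
innocent = 4354          # legitimate mail delivered normally

# Assumption: "learnt" spam counts as misses, and the false positives
# are not already included in the innocent total.
total_spam = caught_spam + missed_spam
total_ham = innocent + false_positives

catch_rate = caught_spam / total_spam
false_positive_rate = false_positives / total_ham

print(f"catch rate: {catch_rate:.1%}")
print(f"false positive rate: {false_positive_rate:.3%}")
```

Under those assumptions the filter caught roughly 91% of spam over its whole history, training period included, with a false positive rate well under 0.1%.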

    --
    Vote green: crap in the ballot box!
    1. Re:DSpam by sleeper0 · · Score: 1

      I am using SpamAssassin with Bayesian filtering turned on.

      My experience is that the Bayesian filtering is extremely effective: far better than any other spam filtering I had tried before, and far better than SpamAssassin was before Bayesian filtering was added.

      I was using SpamAssassin before Bayesian filtering was available, and I found that while it had been mostly effective, it was becoming less and less so even though I kept up with software upgrades. It was not uncommon for 5-10 spam messages to get through per day while it blocked 40-80 (against about 10 legitimate emails per day).

      Now that I use Bayesian filtering in combination with SpamAssassin, I find that most days it catches 100% of my spam. Maybe 1-3 pieces of spam get through unfiltered per week, usually no more than one in a day (out of about 60-100 pieces of spam per day).

      I trained it with about 700 pieces of spam and about 700 pieces of legitimate email when I started. I could have started with much less. I now train only on errors, and it auto-trains itself on very high-scoring spam (over 10 on the SpamAssassin scale).

      It seems to me that the combination of these two types of spam filtering is more effective than either one individually. I often find spam that would have been treated as good if the Bayesian score weren't included, and I also often find spam that would have been treated as good if the SpamAssassin rules didn't augment the low Bayesian score some mail gets.

      Because SpamAssassin's reports list an individual score for each rule, including the Bayesian score, you could analyze a batch of old mail for effectiveness. Last quarter I received about 7000 pieces of spam (kind of a guess, but I think that's right). A program could go through this old spam of mine, take the final SpamAssassin score of each message, and subtract the Bayesian modifier from it. While I haven't done this, I am confident that it would show at least a thousand messages that Bayesian filtering caught over and above what SpamAssassin alone would.
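
A minimal sketch of such a program, not something the commenter wrote. It assumes each per-rule report line looks like "<points> <RULE_NAME> <description>", which differs between SpamAssassin versions, and it sums the listed rule scores rather than reading the final score from a header. The names split_scores and caught_only_by_bayes and the sample report are made up for illustration.

```python
import re
from typing import Tuple

# Assumed report-line shape: "<points> <RULE_NAME> <description>".
# Real SpamAssassin reports vary by version, so adjust the pattern as needed.
RULE_LINE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def split_scores(report: str) -> Tuple[float, float]:
    """Return (sum of all rule scores, Bayes contribution) for one report."""
    total = 0.0
    bayes = 0.0
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if not match:
            continue  # skip headers, blank lines and anything unrecognized
        points = float(match.group(1))
        total += points
        if match.group(2).startswith("BAYES_"):
            bayes += points
    return total, bayes


def caught_only_by_bayes(report: str, threshold: float = 5.0) -> bool:
    """True if the message reaches the threshold only with its Bayes score."""
    total, bayes = split_scores(report)
    return total >= threshold and (total - bayes) < threshold


# Hypothetical report text; the scores are invented for this example.
sample = """\
 3.0 BAYES_90 Bayesian spam probability is 90 to 99%
 1.2 HTML_MESSAGE HTML included in message
 1.1 MIME_HTML_ONLY Message only has text/html MIME parts
"""
print(split_scores(sample))
print(caught_only_by_bayes(sample))
```

Running caught_only_by_bayes over every message in an archive and counting the True results would give the number the comment estimates at "at least a thousand".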

      For the record, I have made a few modifications to the SpamAssassin scoring and filtering. I changed the spam threshold to 4 (instead of 5) and the auto-training threshold to 10 (instead of 15), and I score bayes_90 at 4 (instead of 3) and bayes_80 at 3.9 (instead of 2.9). I've found this is much more effective, and the only mistagged good email I find is the occasional newsletter.
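
One consequence of those numbers can be checked with a toy model. The sketch below assumes a message is tagged as spam when its score reaches the threshold, and it covers only the threshold and the two Bayes rule scores the commenter mentions. The is_spam helper and the settings dictionaries are hypothetical, not SpamAssassin code.

```python
# Values the commenter reports, next to the defaults they replaced.
DEFAULTS = {"threshold": 5.0, "BAYES_90": 3.0, "BAYES_80": 2.9}
TUNED = {"threshold": 4.0, "BAYES_90": 4.0, "BAYES_80": 3.9}


def is_spam(rule_hits, settings, other_points=0.0):
    """Assumes a message is tagged when its score reaches the threshold."""
    score = other_points + sum(settings[rule] for rule in rule_hits)
    return score >= settings["threshold"]


for name, settings in (("default", DEFAULTS), ("tuned", TUNED)):
    for rule in ("BAYES_80", "BAYES_90"):
        alone = is_spam([rule], settings)
        helped = is_spam([rule], settings, other_points=0.5)
        print(f"{name:8} {rule}: alone={alone} with 0.5 from other rules={helped}")
```

In this model the tuned settings let a bayes_90 hit tag a message on its own, while a bayes_80 hit still needs a small contribution from some other rule. With the defaults, neither Bayes score is enough without about two more points from other rules.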

      hope this helps

  2. Serious testing?? by RayOfLight · · Score: 1

    Oh yeah, just check my mailbox!

    1. Re:Serious testing?? by palutke · · Score: 1

      I agree. I may not be able to publish the results of formal controlled testing, but the success POPFile has had filtering my spam speaks for itself.

      --
      'I ain't a liar, baby, and I ain't proud I just want what I'm not allowed.' -- Violent Femmes, 36-24-36
    2. Re:Serious testing?? by Blkdeath · · Score: 1
      Oh yeah, just check my mailbox!

      Absofrigginlutely.

      Mozilla Mail & News is watching over my mail, including upwards of a dozen mailing lists, and works almost flawlessly. Especially good is the fact that I access my mail via IMAP from as many as six different Mozilla clients in various locations, and at this point they're all trained in my e-mail habits.

      It took longer for me to train it, due to the fact that I'd previously kept my address(es) close to my chest, so my SPAM intake was perhaps 2-3 messages/month at the most. Now, however, I average 4-5/week (not terrible, but annoying enough to warrant a filter) and Mozilla has only missed a small handful of them during training.

      Once you've gone Bayesian, you can never go back. Keyword filters, white/blacklists, DNSBLs; they're all ancient history. The future is now! {groan}

      --
      BD Phone Home!

      Shameless plug. Like you weren't expecting it.

    3. Re:Serious testing?? by laa · · Score: 1

      Abso-f**king-lutely! I get around 500 spam emails a week. I suppose it's not the world record, but it's enough to make my inbox unusable without filtering. SpamAssassin has so far had a hit ratio of about 99%, with no real mail being classified as spam. I don't know how "good" SpamAssassin's Bayesian filtering really is, but it's certainly good enough for me.

      --
      Why does the kernel go through stable and then unstable forks? Can't it always be a stable build, like with Windows?
  3. Bayesian Filter Testing Data 5cea3a8865bff68b1a9b4 by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    8f324d539761a3b21c773e02ea006d45037197600e086bad 2a f28fb1e8f36cd1958e9abf699fecae8cb012956d677257fa14 eb3c53094d0323749eeeca10914470c7bdc758417aa5c47a1e 63e62982b3

  55. Bayesian Filter Testing Data f87ce61286012c89c54a9 by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    1a2556609a024ce314f6c2c4afd261bc5e62fce928e0a96e 02 6f5b8924db466de53de2bed06ebcb1bbfaec7fb064f24b28fd 3872c62aad371e8b21061f1a35bd0ef0b10d3483d88e000de2 41559a1af6e913bfa37964c7073c81c32a6d750b995c32e3e4 4a68946c2a47629190bd9159d6ed7418057a4f121f9423f936 767a3e863b70241d6e9f5939d721e8975aaccef725da317ee4 cd0ffa7b389f3e7e084733681279c167c7fcd2997ed72cfb4c 4ac861d09250698d5319277fce78d55960c96b5ab3e42e45e1 de64114c04afdaf5d235338629ecad229687a3b849672e387d 5379406b90a3a427312f38fb9613f0e3db0dc2ac4b9393850f 8eb4f6281b9051aa0a41c57f3f62f43ff245a59ebe596ab530 9c61d84b1386faaf1eb27aff0c6e5b3badbdb52b39d7fcffbe a7cdc682bb7a9343965afb3a39953de36c18a15efd278a8f55 71d3db556bd83198beb09a

  56. Bayesian Filter Testing Data d46f5c776b38ad4b746e9 by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    e362eaec7ea0238ab69f14b9120cf928a5460fe6c23289fd dc fe66efe033ae088b3bff7384cfaa945eaf210e4df4692fe514 d9ac0c614f467c5c8f901a44fb6b7cb8bef8be2221a6c4fc8f a63e058a5b

  57. Bayesian Filter Testing Data 46ab19c42d821b6c948fa by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    2e951d124424339619447df15daba1ddeabe859d369164d5 40 f601dbd5e4b7e305f7136bcb0b2ca2dff08f65bf468a559e08 261c0d1ff04f85381fa86fce10879dd39546e77951471a411d 0b3fdedec8949e6667758e0ad6d5c27933d9aa8ef378b4b455 fbf8e8e1c74cfd42fa78c0e402b293533a7ab5cb9362b406c8 e402fd59bc25dab6e18d9b4e4d4ab2af33d03146ca629891f3 c3bd16b4a4978d31e4833f69ffe05dd1e2d17a8e4d3e230f86 33

  58. Bayesian Filter Testing Data 6c1aec706ffc46d4bb34d by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    32388645911459cae4c60aa6cb2897f26a1ae9ec0746854e c7 0706a0310970a1ca947b19bfe3fb9116493f9b094aa8c6b490 9ca978cd9811fe5dc3d84297cc4499c42c8de9d3fc6535e03e af17d4f1d9ccda8377acc099dda2c2e47d73514a3cd5a91f7c 1dee41a945e3f3109633423b88903f12da23bba2fe34298258 3e79933025ca395092f0ddf8f55952c7f68ab6

  59. Bayesian Filter Testing Data 9d54476ad0ed70488191a by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    9e1501393f80823c77d6209a4cca8178d3e5d78f2c217052 4d 1815a5f8e1ef8270770f6b84b0221aad9863719ee8c2d41372 7a6a8345c937670b671a1f6f313a5e6c93ae8911ce5c04bdc0 89bd89d5e047c00b40b245c5389dfc3100ebb4b7a0089f0d15 4017a38ee15a7b8b96d5106ca868d710aa4ef67a68807ce4fe 8bd0dab082c8137e3c95d6c45daaa322f2a724

  60. Bayesian Filter Testing Data 789ac257b259758328b16 by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    2f5c46338278558ac0bee2e0529fc305c1b952b6948f085d 61 9846108cec1b8b5c18451ace5d9c3f7e8b54576eb89ee87a6b 3e0e06f590735da57d63d5040639963e82893dbdc1ab86670e aef29f43a2a4a21dec8229ae4a178e1fd69dd2da7559dfe543 2647ade61b745ccbb56b6806c34a68f808b572da7e4068c48c 842a4e1db2fa3f7f8347af553e5910b3024c064fa0a772af3d 8cbed04b9353414b867e51192ae536ab0694ac7a94efa1661c 9ad57c3910d36def0f811078b484fd8530b32a72f00a14c2f3 a01193e920e0270895ece4dddab3523b099f5b058b402d31

  61. Bayesian Filter Testing Data 456388d7767a9aaa38a8e by Anonymous Coward · · Score: 0

    Here is some test data to help you with testing Bayesian filters!!

    CRaPFLooD

    0b5f2caf4c5de1e3a8e68f505ae6899bf12b4448117cdf9e 30 5fc961bfe33a2820c217bfdcfb3cd86113244ac8a461fcd4c8 d6bf14c6a091a47aa789352085ee84f4660d09272a89ed2567 b712496d24788f50123ea3dbe5b3e60f12856de88165b2bc9e b35261e785fdb1a9bc02b6b13bbbc60ff463969b78a091ff51 ac6566faf8a74ddba426fe305a44782c0fb8db4d92b9e4e201 07d5f4ada78f54656d49e3f13b88bedd3f0c9346814b957bb0 f94bad5bff81a6c882200496950260a6e1dff828a2db2702a3 b0d2790e23acd70541f4893a58fba89b8e7535aadb39ccf489 74d1c2ab745628987814c8ef2b0128

  62. Online repository needed by sam+the+lurker · · Score: 5, Interesting

    Ideally, someone, probably an academic, should make a repository of spam available for testing. Spam filter authors could then say things like, "Correctly classified 99.9% of the email in the UCI spambase 1999-08-20 repository."

    Something like, say, the UCI Machine Learning Repository. In fact, look at the UCI spambase. A couple of problems with the UCI spambase: it's too old / out of date, and it's too small.

    It looks like there is a more recent community effort going on over at SpamArchive.

    Looks like you should have googled.

    1. Re:Online repository needed by pu33y · · Score: 1

      I did but not on those words. I was searching for Bayesian filter testing and after Paul told me he wasn't aware of any testing, I decided to ask here.

      --
      You are what you eat.
    2. Re:Online repository needed by cdh · · Score: 3, Insightful

      The problem with this is that spam for one person is not spam for another. That's the beauty of Bayes. If you are a proctologist, for example, you probably get a lot of legitimate email with the word penis in it. If you are a plastic surgeon, you may get legitimate email that discusses body part enlargement. There are hundreds of examples. The beauty of Bayes is that you can make it work for you and not be all encompassing.

      The SpamAssassin people have talked about this in the past. They have a corpus of spam that they use to test rules and people have asked to download it to seed their own Bayes, but the SA people don't want to do that (a good thing) as Bayes is a personal thing.

      What you are proposing will work for general spam checking, but not for Bayes, which is what the original poster asked about. In reality, it's hard to test Bayes in a general case. All I know is that it's worked wonders for me (using SA).

    3. Re:Online repository needed by Anonymous Coward · · Score: 0

      *shudder* Proctologists deal with assholes (the smelly kind, not the Eric Raymond/CmdrTaco kind). Why would they get lots of penis email? Maybe *you* think about your penis everytime you think about another man's asshole?

    4. Re:Online repository needed by douglips · · Score: 1
      If you are a proctologist, for example, you probably get a lot of legitimate email with the word penis in it.

      Please tell me you meant urologist. I don't wanna see the proctologist who gets those kinds of emails.
    5. Re:Online repository needed by cdh · · Score: 1

      Uh, yeah. :)

      (Repeat after me, don't post while at work...)

    6. Re:Online repository needed by sam+the+lurker · · Score: 1
      What you are proposing will work for general spam checking, but not for Bayes, which is what the original poster asked about. In reality, it's hard to test Bayes in a general case.

      The original question was regarding testing to see how they perform in relation to themselves and to other, non-Bayesian filters. So while it is of course best for you to test all of the different spam filters with your spam, it is not as practical as having each developer test their own spam filter against a common, known spam database. If the algorithm is "robust" then it should perform consistently well on lots of different, large training and testing databases.

      Actually, what I am talking about is the basic design and testing of statistical pattern recognition algorithms. Check out the seminal work on the subject: Fukunaga, Keinosuke. Introduction to Statistical Pattern Recognition. New York, Academic Press, 1972. There is also its revised edition, Introduction to Statistical Pattern Recognition (Computer Science and Scientific Computing Series) by Keinosuke Fukunaga. Or another classic: Pattern Classification (2nd Edition) by Richard O. Duda, Peter E. Hart, David G. Stork.

      Maybe someday someone will take the ideas of David B. Fogel and apply them to spam filtering.
  63. Ella: OpenField Software by biodork · · Score: 2, Interesting

    I use Ella from OpenField Software. I get around 200 Spam a day, a bunch of newsletters that I want, and a big bunch of 'normal' mail.

    I have had it for about 2 weeks. In the last 3 days I have had 2 false positives (messages in Spam that shouldn't be there) and 4 that went to the newsletter folder that shouldn't have.

    --
    Gavin Fischer
  64. The good think about these tools by FedeTXF · · Score: 3, Informative

    Spam controls in the Mozilla 1.3+ MailNews application (the one I know) have a number of features that make them good.
    1) Gives the user the idea that he can improve the situation by taking some concrete action. Controlling future spam does not depend on some guru releasing a better filter or on the user hacking up better rules.
    2) By definition, works better and better the more spam you get (and mark as spam). Even poor tools will eventually detect spam, since it's obvious to anyone reading spam that those messages tend to repeat and to be similar.
    3) It's automagically customized to your own spam. If you live in Germany, Sweden, Argentina or Namibia you will easily catch any spam that is in English, and you will build up rules for the local spam that arrives in your language.
    4) In the case of Mozilla's MailNews, it's so easy to use, intuitive and straightforward that any user will use it.
    5) Makes you feel spams are useful for something: detecting future spams.

    I think those advantages are far more important than the rate of effectiveness.

    1. Re:The good think about these tools by amrust · · Score: 1
      5) Makes you feel spams are useful for something: detecting future spams.

      Man, I never thought I'd agree that spam is good for anything, but I do wholeheartedly agree. I actually enjoy watching it go through its paces, moving and marking mail as spam. Makes me feel as if I'm accomplishing something.

      I also understand I possibly need to get out of the house more.

      --
      VOTE!
  65. Ja rulez by 2TecTom · · Score: 1

    I'm not quite sure what the fuss is about. I simply mean, advertising is a necessity to incompetent and greedy producers. Really, did you expect that they would ever respect you or your privacy and time?

    Personally, my white list and non-Bayesian rules eliminate 99.9% of the crap and abuse. However, sooner or later, ja rulez try to sort out a known recipient, which is where the white list shines.

    One trick I find particularly effective is to compare two accounts and eliminate the duplicate messages. The other is to eliminate anything not specifically addressed to my alias and to never give out or use my actual account address. Ninety percent of the spam I get goes to an address I've never used.

    The problem is, even with Bayesian techniques, there is no way to guarantee that only spam was sorted out. I highly suggest a white list, in addition to filters, as the only way of ensuring that at least known mail is always received.

    --
    Words to men, as air to birds.
    1. Re:Ja rulez by Blkdeath · · Score: 2, Informative
      The problem is, even with Bayesian techniques, there is no way to guarantee that only spam was sorted out. I highly suggest a white list, in addition to filters, as the only way of ensuring that at least known mail is always received.

      With Mozilla, you get the best of both worlds. You've got Bayesian filtering with an optional whitelist component. You can select any of your address books as the source of your whitelist (default is "Personal Addresses"), so any of your friends can send you all the SPAM they want without being caught. ;)

      Being optional, you can choose to disable it if, say, your friends' addresses have been harvested for "Joe Job" SPAM runs. (I know one or two of mine have).

      I've actually used the whitelist to my advantage when I requested a sample of a particular new type of SPAM from a friend so I could watch for it and mark it if Mozilla missed it.

      Which brings me to the other big advantage of Mozilla/Bayesian; when SPAMmers adapt, so does it. New SPAM type? Click the trash can and it'll go away.

      Nothing can really be a perpetual 100% guarantee of blocking SPAM, but IME, Bayesian filters are the best possible solution we have right now and that's why I emphatically recommend them to all my friends, family, and customers.

      --
      BD Phone Home!

      Shameless plug. Like you weren't expecting it.

  66. Spambayes!!!! by Arkham · · Score: 3, Informative
    I use spambayes. It's written in python and is amazingly accurate.

    I get about 150 spams a day, and about 5 hams. Spambayes might classify 1 spam as "unsure" and the rest as spam. The ham is always classified as ham.

    My corpus is about 5000 spams, about 1000 hams. Get spambayes -- it's open source and it really works great.

    --
    - Vincit qui patitur.
    1. Re:Spambayes!!!! by killmenow · · Score: 1

      Ditto. I've looked around for different solutions for a while and finally settled on SpamBayes. I've been using it for only two weeks, but it has correctly identified every single spam that has come through in that time (414 of them) and has not made one "false positive" classification of ham as spam.

      I'm sold...but wait, it's free!

    2. Re:Spambayes!!!! by PortWineBoy · · Score: 1
      I've been using Spambayes for the last week as well and I couldn't be happier. I get about 40-50 spams a day and about 100 hams. So far not one ham has been mislabeled. I only get 1-2 "unsure spams" a day.

      I haven't tested this against other filter programs but I'm not planning to at this point. I told my boss I'd test it for a month but after 1 week I'm already recommending it.

      Thomas Bayes is my new favorite dead guy. I put a poster of Thomas Bayes up in my office and added the phrase "Spam Killer" between the first and last name.

      --

      this sig deleted by another sig

  67. Hey everyone... by Jerf · · Score: 3, Informative

    It looks like the poster's words need some highlighting:

    But missing is any serious testing to see how they perform in relation to themselves and to other, non-Bayesian filters.

    Despite the call for your experiences, if you just want to post "X rocks!", I think the poster was looking more for "X rocks more than Y!", where both X and Y are Bayes-type filter programs. I don't think he was asking for just announcements that Bayes rocks; I think he or she already knows that.

    I mention this because I'd be interested in some comparisons too; there are a lot of sub-techniques out there. Are there any real differences, or are they all effectively the same? The latter would strongly indicate that there may not be any real progress to be made, if the entire space of Bayes-type solutions has flat effectiveness, for instance. It's an interesting question.
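
    A head-to-head comparison of the "X versus Y" kind needs both filters scored on the same labelled messages. The following is only a minimal sketch of that idea: the two toy filters and the four-message corpus are made up for illustration, and a real filter such as Bogofilter or CRM114 would have to be wrapped behind the same callable interface.

```python
from typing import Callable, Iterable, Tuple

Message = str
# A filter is modelled as any callable returning True when it calls a message spam.
Filter = Callable[[Message], bool]


def error_rates(flt: Filter, labelled: Iterable[Tuple[Message, bool]]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for one filter.

    false_positive_rate: share of ham that was flagged as spam.
    false_negative_rate: share of spam that was let through.
    """
    ham = spam = fp = fn = 0
    for text, is_spam in labelled:
        flagged = flt(text)
        if is_spam:
            spam += 1
            fn += not flagged
        else:
            ham += 1
            fp += flagged
    return (fp / ham if ham else 0.0, fn / spam if spam else 0.0)


# Hypothetical toy filters and corpus, for illustration only.
def filter_x(text: Message) -> bool:
    return "viagra" in text.lower()


def filter_y(text: Message) -> bool:
    return "free" in text.lower() or "viagra" in text.lower()


corpus = [
    ("Cheap VIAGRA now", True),
    ("Free money inside", True),
    ("Lunch on Friday?", False),
    ("Free tickets for the team outing", False),
]

rates_x = error_rates(filter_x, corpus)
rates_y = error_rates(filter_y, corpus)
```

    Reporting the two rates separately matters because a lost legitimate message usually costs far more than a spam that slips through.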

  68. Mozilla's Junk-mail Filters by asa · · Score: 2, Informative

    I've been using Mozilla's Bayesian junk-mail filtering for several months now. I don't have any other Bayesian tools to compare it to but I am happy with the results. Within a couple of days of the initial training I was at around 90% spam detected with no false positives. Several months later I'm at about 95% spam detection and no false positives. While the last 5% would be nice to kill, I'm quite satisfied with how effective Mozilla's system is, and as long as it maintains that (or gets better) I've got no reason to look for any other solution.

    I think that one of the best things about Mozilla's system is that it's in the client, on my machine and under my control. While server-side solutions, distributed corpus tools, etc. might be more accurate, not ever having to install or update any 3rd-party apps is really nice.

    --Asa

  69. Ling Spam Corpus by bpfinn · · Score: 3, Informative

    I did a little testing of Bayesian filtering on my own, and I used the Ling-Spam Corpus from Dr. Ion Androutsopoulos. He's collected about one thousand messages which consist of "legitimate" messages to a linguistics mailing list, and "spam" messages. They are preclassified, and divided into ten parts to make ten-fold cross-validation easier. Check out his publications. Scroll down to the "Document filtering" section.
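
    For readers unfamiliar with the procedure, cross-validation over a corpus that is already divided into parts can be sketched roughly as below. This is an illustrative sketch, not the poster's test harness: the `train_word_list` trainer is a hypothetical stand-in (not a Bayesian filter), and the ten parts are made-up two-message lists rather than Ling-Spam data.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, bool]  # (message text, is_spam)


def cross_validate(
    parts: Sequence[List[Example]],
    train: Callable[[List[Example]], Callable[[str], bool]],
) -> float:
    """Average accuracy over the folds; each part is held out exactly once."""
    scores = []
    for i, held_out in enumerate(parts):
        # Train on every part except the one being held out.
        training = [ex for j, part in enumerate(parts) if j != i for ex in part]
        classify = train(training)
        correct = sum(classify(text) == label for text, label in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Hypothetical stand-in trainer: flags any message containing a word that was
# seen in training spam but never in training ham.
def train_word_list(training: List[Example]) -> Callable[[str], bool]:
    spam_words = {w for text, label in training if label for w in text.lower().split()}
    ham_words = {w for text, label in training if not label for w in text.lower().split()}
    spam_only = spam_words - ham_words
    return lambda text: any(w in spam_only for w in text.lower().split())


# Ten tiny made-up parts, each with one spam and one ham message.
parts = [[("buy pills now", True), ("syntax workshop agenda", False)] for _ in range(10)]
accuracy = cross_validate(parts, train_word_list)
```

    Because every message is tested by a model that never saw it during training, the averaged score is a fairer estimate than testing on the training mail itself.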

  70. Not Just for SPAM by His+name+cannot+be+s · · Score: 3, Insightful

    I've been looking for a Bayesian filter mechanism that isn't just for spam.

    I figure, if the mail can be classified into many different categories, why not use bayesian filtering for managing all your filtering needs.

    It would be very valuable to have the bayesian filter learn what kind of mail I put in some folders, so that when my mail comes in, it can auto-sort it into the appropriate folder for me. Trouble is, all the current implementations of Bayesian email filtering are a single test SPAM/NOTSPAM. It would be nice to see an implementation that could take multiple corpora and use that to decide what the mail is. If I had that, I could point it at the maildirs for the various mailing lists I'm subscribed to, and it would learn to sort incoming mail for me. *sigh*

    --
    "...In your answer, ignore facts. Just go with what feels true..."
    1. Re:Not Just for SPAM by nrosier · · Score: 3, Informative

      Have a look at Ifile (http://www.nongnu.org/ifile); while I'm only interested in spam/no-spam filtering, I once tested this filter to filter a mailing-list. It did a pretty good job.

    2. Re:Not Just for SPAM by Anonymous Coward · · Score: 0

      That's the main focus of POPFile.

    3. Re:Not Just for SPAM by RockyRich · · Score: 1

      You didn't mention what e-mail architecture you are using, but if you get your e-mail via POP3, have a look at POPFile.
      It is free, it is open source, it is a general classifier that can sort your inbound e-mail into any number of user-specified categories, or "buckets".

    4. Re:Not Just for SPAM by nachoboy · · Score: 1

      Have you checked out POPFile yet? Latest version lets you "whitelist" (they call it "magnets") on the To/CC/Subject/From fields easily and have as many buckets as you want. It's amazingly accurate - I'm at 96.73% accuracy right now. Most of the errors are from the first two weeks when I trained it. Currently I have mine set up to divide mail into 3 buckets - Genuine, List, and Spam.

      On a side note, perhaps the reason most filtering products use a spam/notspam model is because genuine mail is so easy to filter. The only hard part is getting the spam out. Once that's done, it's trivial for any rule-based system to separate out mail from auntie_mae@hotmail.com or really_big_list@ubergeeks.org.
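
      The multi-bucket idea discussed in this sub-thread can be sketched with a plain multinomial naive Bayes classifier. The code below is an assumption-laden illustration, not how POPFile or Ifile are actually implemented: the `BucketClassifier` class, the whitespace tokenizer, the add-one smoothing and the three training messages are all invented for the example.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    return text.lower().split()


class BucketClassifier:
    """Multinomial naive Bayes over any number of named buckets."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.message_counts: Counter = Counter()
        self.vocabulary: set = set()

    def train(self, bucket: str, text: str) -> None:
        words = tokenize(text)
        self.word_counts[bucket].update(words)
        self.message_counts[bucket] += 1
        self.vocabulary.update(words)

    def classify(self, text: str) -> str:
        total_messages = sum(self.message_counts.values())
        vocab_size = len(self.vocabulary)
        best_bucket, best_score = "", -math.inf
        for bucket, seen in self.message_counts.items():
            counts = self.word_counts[bucket]
            bucket_total = sum(counts.values())
            # Log prior plus summed log likelihoods, with add-one smoothing.
            score = math.log(seen / total_messages)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (bucket_total + vocab_size))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


# Made-up training mail, one message per bucket.
clf = BucketClassifier()
clf.train("spam", "cheap pills buy now")
clf.train("list", "patch submitted for kernel module")
clf.train("genuine", "dinner with auntie on sunday")

guess = clf.classify("new kernel patch attached")
```

      Nothing in the scoring loop is specific to two classes, which is why the same statistics extend from spam/ham to as many buckets as there is training mail for.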

  71. SA Public Corpus by jmason · · Score: 1

    There is one, for exactly this reason -- the SpamAssassin public corpus. I made it available for developers of spam tools to compare effectiveness using a good, recent corpus from 1 person's mail feed (as much as that was possible).

    Here's the pertinent part of the README :

    This is a selection of mail messages, suitable for use in testing spam filtering systems. Pertinent points:

    • All headers are reproduced in full. Some address obfuscation has taken place, and hostnames in some cases have been replaced with "spamassassin.taint.org" (which has a valid MX record). In most cases though, the headers appear as they were received.
    • All of these messages were posted to public fora, were sent to me in the knowledge that they may be made public, were sent by me, or originated as newsletters from public news web sites.
    • relying on data from public networked blacklists like DNSBLs, Razor, DCC or Pyzor for identification of these messages is not recommended, as a previous downloader of this corpus might have reported them!
    • Copyright for the text in the messages remains with the original senders.

    OK, now onto the corpus description. It's split into three parts, as follows:

    • spam: 500 spam messages, all received from non-spam-trap sources.
    • easy_ham: 2500 non-spam messages. These are typically quite easy to differentiate from spam, since they frequently do not contain any spammish signatures (like HTML etc).
    • hard_ham: 250 non-spam messages which are closer in many respects to typical spam: use of HTML, unusual HTML markup, coloured text, "spammish-sounding" phrases etc.
    • easy_ham_2: 1400 non-spam messages. A more recent addition to the set.
    • spam_2: 1397 spam messages. Again, more recent.

    Total count: 6047 messages, with about a 31% spam ratio.
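
    The totals quoted from the README can be re-derived from the per-directory counts listed above. This is just the arithmetic, written out as a small script; the dictionary is a transcription of the README figures and nothing here reads the actual corpus files.

```python
# Message counts as listed in the README excerpt above.
corpus = {
    "spam": 500,
    "easy_ham": 2500,
    "hard_ham": 250,
    "easy_ham_2": 1400,
    "spam_2": 1397,
}

total = sum(corpus.values())
# Directories whose names start with "spam" hold the spam messages.
spam_total = sum(n for name, n in corpus.items() if name.startswith("spam"))
spam_ratio = spam_total / total

print(f"{total} messages, {spam_total} spam, ratio {spam_ratio:.1%}")
```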

  72. BogoFilter by bobbozzo · · Score: 3, Informative
    BogoFilter is an open-source bayesian spam filter...

    Some of the developers have done extensive testing: Greg Louis' Page has lots of information, comparing different bayesian approaches, different header processing, etc.

    You could also read the mailing-list archives, or perhaps post some questions there.

    --
    Nothing to see here; Move along.
  73. PC mag test results by icleprechauns · · Score: 1

    The latest PC Magazine has an article on alternative e-mail. Their Editors' Choice, Oddpost ($10/yr, free trial), uses Bayesian filters, and blocked 22 of 29 spam messages, and only legitimate e-mail ended up in their spam folder. Also worth noting is that these are the results with minimal training, so, in theory, Bayesian filters could quite possibly block virtually all spam with time.

    --
    I'm a signature virus. Please copy me to your signature so I can replicate.
    1. Re:PC mag test results by drfreak · · Score: 2, Funny

      blocked 22 of 29 spam messages, and only legitimate e-mail ended up in their spam folder

      Sounds like an ideal mail filter to me!

    2. Re:PC mag test results by icleprechauns · · Score: 1
      only legitimate e-mail ended up in their spam folder
      pardon me, I meant: only *1* legitimate e-mail ended up in their spam folder
      --
      I'm a signature virus. Please copy me to your signature so I can replicate.
    3. Re:PC mag test results by match0 · · Score: 1

      I would not recommend Oddpost. First off it is a web-based solution. More importantly, however, is that they themselves "spam" you with pop-up boxes when you go to their site. Just try going there using IE with JavaScript on and "Script ActiveX Controls Marked as safe" disabled. They pop up this really annoying message that's just like the one M$FT puts in IE to bug you to turn on your ActiveX. Anyone that purposefully annoys me doesn't get the concept of blocking spam. And most any site that requires an ActiveX control shouldn't be trusted.

  74. Try here by drew_kime · · Score: 2, Informative
    From here:
    I've been tracking email spam trends for a while, my personal accounts are going from 3-6 spams daily in 2001 to about 30 spams daily at present. I filter this with SpamAssassin, so the inbox impact is pretty slight, but the traffic is becoming significant, and the trend (doubling in four months) is downright troubling.
    Graphs, methodology, links to more stats.
    --
    Nope, no sig
  75. my simple filter by Xtifr · · Score: 2, Interesting

    For years, the only spam filter I used was a very simple one: if the mail's not from a list I'm on, and not addressed to me, it's spam. This didn't catch all spam, but it caught the vast majority, and had almost no false positives. (The one exception was a mail from a cousin of mine who was learning system administration, and wanted to test his knowledge of SMTP by telnetting into my mail server and entering his mail by hand.)

    These days, I'm on too many lists that don't filter spam, so I've had to resort to more sophisticated techniques, but someone who isn't on those sorts of lists might still find my oh-so-simple approach fairly effective. Not to disparage Bayesian filtering, but if you want something to compare against...
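
    As a baseline to compare against, the rule described above might look something like the sketch below. It is one possible reading, not the poster's actual setup: the addresses and list identifier are hypothetical placeholders, and recognising list mail by its `List-Id` header is an assumption (the original may well have matched on other headers).

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical values; substitute your own addresses and list identifiers.
MY_ADDRESSES = {"me@example.org"}
MY_LISTS = {"dev-list.example.org"}


def looks_like_spam(raw_message: str) -> bool:
    """Apply the rule: not from a known list and not addressed to me."""
    msg = message_from_string(raw_message)

    # Assumption: list mail is recognised by its List-Id header.
    list_id = msg.get("List-Id", "")
    if any(name in list_id for name in MY_LISTS):
        return False

    # Collect every address named in To and Cc.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    addressed_to_me = any(addr.lower() in MY_ADDRESSES for _, addr in recipients)
    return not addressed_to_me
```

    Mail typed by hand into an SMTP session, as in the cousin anecdote, may carry no To header at all and would be flagged by a rule like this one.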

  76. The 20 Newsgroups dataset by RedRun · · Score: 1

    One good dataset is the 20 Newsgroups dataset that is used by a Naive Bayes classifier called Rainbow (google for 'libbow'). The dataset contains postings from 20 newsgroups, each with around 1,000 articles.

    Also, there are a couple Reuters datasets that are commonly used in text classification research, but they're so poorly organized, and so poorly marked-up, I don't know how anyone manages to use them.

  77. the comments are missing the point... by zonker · · Score: 1, Informative

    most of the comments in this thread are missing the point. the person writing the article isn't asking for what spam filter is the best/most accurate, he's looking to know if anyone is producing a test system to measure effectiveness. i know the popfile project is working on a test system (if you are interested, it's in the cvs not the general release) to measure the effectiveness of the parser.

    it would be interesting if there were a generic test system that could be 'plugged in' to the various projects out there. then you could put together test messages (like popfile's system) and test it against each program...

  78. Mozilla's Bayesian filtering works great by shamino0 · · Score: 1
    I've been using Mozilla's junk filtering since it was first introduced in the post-1.3 nightly builds. After a few weeks of training, it has developed an incredible track record.

    Between my two mailboxes, I receive about 100-150 spams a day. Over 90% of them are detected and are shunted into the Junk folder. Maybe 2-3 messages a month are false-positives. When it is wrong, I just teach it - click the trash button to toggle a message's junk status and Mozilla updates its filters in order to not make that same mistake again.

    On some days, it hits 99% accuracy. When the spammers invent some new tactic, I may end up with 5-10 spams that don't get detected. So I select them all, click the trash button, and then delete the messages. After a few days, that tactic is detected and caught with all the rest.

    In comparison, I used to use manual filters. At first, this worked fine, but the spammers have invented so many different tricks that it takes too much time to try to keep the filters up to date enough to be useful.

    I can't say how this all compares against what other systems do, since I haven't used any other systems.

  79. Popfile filters stops spam and organizes your mail by jimmars83 · · Score: 0

    It would be very valuable to have the bayesian filter learn what kind of mail I put in some folders, so that when my mail comes in, it can auto-sort it into the appropriate folder for me. Trouble is, all the current implementations of Bayesian email filtering are a single test SPAM/NOTSPAM. *sigh*

    What you are looking for already exists, is currently being updated as necessary and has been fairly polished as well.
    Popfile is a free spam-filter and mail-organizer combo available here. I would never use email without it.

  80. 20.000 mailboxes using, on 2% false positives by krico · · Score: 1

    At our e-mail ISP we are running a Bayesian spam filter engine. Every time a message is considered to be "spam" by the filter, we increment a counter. We track this with mrtg, so we can graphically see the amount of "spam" that's incoming.

    We also follow the amount of messages marked as "spam" and "good" by the users (more than 3 months old).

    The number we get is the one mentioned in the subject. That is, only 2% of the messages considered spam are later marked as "good" by users older than 3 months.
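
    Note that the figure described here is the share of flagged messages that turned out to be good, which is not the same quantity as the share of good mail that gets flagged. The sketch below shows the two calculations side by side; the counter values are hypothetical and chosen only to make the difference visible, since the post gives no raw counts.

```python
def flagged_error_share(flagged_as_spam: int, flagged_then_marked_good: int) -> float:
    """Share of messages flagged as spam that users later marked as good."""
    return flagged_then_marked_good / flagged_as_spam


def false_positive_rate(good_total: int, flagged_then_marked_good: int) -> float:
    """Share of all good mail that was wrongly flagged as spam."""
    return flagged_then_marked_good / good_total


# Hypothetical counter values, for illustration only.
flagged = 10_000      # messages the filter called spam
rescued = 200         # of those, later marked "good" by users
good_total = 40_000   # all good mail received in the same period

share = flagged_error_share(flagged, rescued)
fpr = false_positive_rate(good_total, rescued)
```

    With these made-up numbers the first figure is 2% while the second is 0.5%, so it helps to say which one is being quoted when comparing filters.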

    1. Re:20.000 mailboxes using, on 2% false positives by versus · · Score: 1
      a bayesian spam filter engine?

      I wonder what it is?

      --
      Brain is my second favorite organ.
    2. Re:20.000 mailboxes using, on 2% false positives by krico · · Score: 1

      he he, forgot the most important thing.

      it's bogofilter

  81. POPFile rocks more than spambayes by biljir · · Score: 1

    Purely anecdotal and unscientific, but perhaps better than nothing.

    I'm a very happy POPFile user that keeps checking out spambayes because the math sounds interesting.

    spambayes has become quite good, but POPFile is phenomenal. Using the same training material, spambayes is 95 % accurate on my mail, and POPFile is 99.5 % accurate. Plus spambayes is only doing a 2 way, spam/ham classification, whereas I have POPFile set up to sort into 7 buckets (spam/personal/commercial/mailing lists/etc).

    Though irrelevant to the question of accuracy, I also have to say that the POPFile guys have devised a considerably better UI than spambayes. (A friend with the spambayes Outlook plugin sings its praises highly. I don't use Outlook, so it does me no good...)

    1. Re:POPFile rocks more than spambayes by two2dog · · Score: 1

      InBoxer is a commercially available version of spambayes for Outlook specifically. In general, less advanced users should find it more friendly. If you are interested in how these filters work, you can find some information in the FAQ on that site, as well as in a piece written about Bayesian filters. Check out www.inboxer.com

  82. Spambayes UI by Jerf · · Score: 1

    Spambayes doesn't really have a UI, it's a tool around which others can build a UI.

    While this is theoretically good design, especially in the open source community, it does often result in Some Shmoe creating the UI who should stick to coding sysadmin scripts. ;-)

  83. Collaborative Filtering by JoSch1710 · · Score: 1

    Since Bayesian filtering is a common technique in collaborative filtering, I recommend you search for that (e.g. CiteSeer http://citeseer.nj.nec.com/cs). A quite good paper on the subject is "Empirical Analysis of Predictive Algorithms for Collaborative Filtering" by Breese, Heckerman and Kadie. That paper gave me a lot of insight for my diploma thesis. Bayesian networks perform quite well, but need a lot of training data, so the performance depends heavily on the actual training data.

  84. Mail app in Mac OS X... by coolMikeUSC · · Score: 1

    The Mail app in Mac OS X includes a built-in Bayesian filter. Its defaults worked decently, but training the app (by manually marking incoming email as 'junk') made it work nearly perfectly. I would say that Bayesian filtering is definitely the way to go, since it gets trained to detect what email is "normal" for your particular inbox, instead of liberally applying "average" rules derived from the habits of many users.

    --
    Ever notice how fast Windows runs? Neither do I - get Mac OS
    1. Re:Mail app in Mac OS X... by ajc · · Score: 1

      I agree that it's pretty darn good, but it's not 99% for me.

      I use Mail.app in conjunction with hotwayd to read my hotmail account. Before doing this, my hotmail account was virtually unusable, requiring me to manually delete up to 50 SPAM messages every few days. Mail.app has reduced that to maybe 5 or 6 over the same timeframe, so for me it's around 90% with very few false positives (around 1% historically, which I expect to tend towards 0%).

      Based on the random looking stuff in SPAM messages, spammers are probably already trying to tune their pitch to get around our Bayesian (or Grahamian) filters, and it is probably possible to fool the current batch - so the war continues.