I have been using Gandi for several years and have been very happy with their service.
They offer domain registration in .com/.org/.net/.biz/.info/.name/.be for EUR12 a year (about $14, lately). That includes optional free web redirection, email redirection, DNS hosting, and secondary DNS. Almost all administration is automated on their website and very easy to use. I have zero complaints and nothing but compliments for them, and have been recommending them to friends for low-cost, high-quality domain registration.
From their site:
GANDI SARL is a french company created in 1999 by four persons known in the french Internet world (Pierre Beyssac, Laurent Chemla, Valentin Lacambre et David Nahmias).
Our service focuses primarily on individuals and non profit organisations.
Gandi's aim is to provide to individuals domain names easily (for the technical and administratrive part) and for a price as low as possible.
This may not help your particular situation with your existing hardware, but I switched to a USB KVM switch (with a USB keyboard and USB wheel-mouse) and it works just fine in Windows 2000/XP and Redhat Linux (8.0). There are, of course, other issues with using a USB keyboard if you need boot-time support, but a modern PC motherboard should be able to handle it without much difficulty.
Many good posts so far. Here are my contributions. I work in financial services and have perspective from the "other side", as well as my own research managing my own finances.
First, I highly recommend you use Quicken/Money religiously. I only bank/etc with companies that I can download direct to Quicken. Watching your net-worth chart over months/years can be inspiring. Seeing each month if you made more than you spent is crucial.
Second, the best piece of advice is to invest at least 10% of what you make. To follow common advice, "pay yourself first"! This means making that 10% disappear from your paycheck before you even think of it as money to use to pay bills, buy pizza, etc. If you can put this in a 401k, great! If not, do it anyway.
Third, the vast majority of people investing should probably be using index funds. Yes, some people make money in the market, but it's probably just luck. (Really! There are some 30,000 or so mutual funds -- half don't beat the market each year. Imagine beating the market as equal to flipping heads on a coin. Now watch 30,000 people flip a coin once a year for 10 years or so. You're still going to have some lucky few who flip heads 9/10 times. Ever wonder why mutual fund companies have so many funds? It's so they always have at least one or two that did well last year that they can advertise. If professional money managers can't do better than 50/50, why do you hear about so many successful individuals? You're not talking about 30,000, now you're talking millions. At 50/50 odds -- or worse -- you'll always hear about someone who made a lot of money.) My advice for anyone starting out is to open an account at Vanguard and stick your money in a broadly diversified index fund.
Fourth, make sure you have an emergency fund -- three months' expenses is usually cited. Once that's topped up, you should think about investing most of the rest. Weight investments heavily towards stocks when you're young, and shift gradually towards bonds as you approach retirement.
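To put a rough number on the coin-flip argument for index funds above, here is a quick Python sketch. It assumes each fund's year is an independent 50/50 flip (a simplification) and takes the 30,000 figure at face value; the function name is just for illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

FUNDS = 30_000   # rough fund count quoted above
YEARS = 10

# Chance that pure luck gives a fund 9 or 10 market-beating years out of 10.
p_lucky = prob_at_least(9, YEARS)
expected_lucky = FUNDS * p_lucky

print(f"P(>= 9 winning years of {YEARS} by luck) = {p_lucky:.4f}")
print(f"Expected lucky funds out of {FUNDS}: {expected_lucky:.0f}")
```

Under those assumptions about 1% of funds, a bit over 300 of them, would show nine or ten winning years from luck alone.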
When I was very young (high school), I found that Smart Money, by Ken & Daria Dolan, was a good overview. That's old, but they have more recent texts.
In the "funny story" model of teaching basics of financial planning, you might consider The Wealthy Barber (recommended by a friend of mine, though I found it too simple by the time I got around to it) and/or The Richest Man In Babylon (parables about money originally written in 1926 and still applicable -- see the Amazon reviews!).
For why not to confuse chance with skill, try Fooled by Randomness. It has a lot of snide commentary at industry insiders, but makes a good point about why humans mistake luck for skill, broadly.
I agree that a Wiki might do better for broad knowledge capture and participation from many people, but for a more static and searchable form of communication than e-mail, why not just use a BBS? I like and use phpbb, which is a snap to install and maintain, and requires less user knowledge than a Wiki in order to post content.
(This is like a throwback to the user-group days of old when we actually used green-screen BBS's on 2400 bps modems... but sometimes the oldies are goodies.)
My code in perl -- rough as it is, undocumented, etc. can be found here. I personally label them something like v0.05.
Two programs -- one to make a .db file with the words and probabilities from a good corpus and bad corpus, the other to test a mailbox against the word database. In both cases, the mail files probably need to be in unix mbox format.
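For anyone who wants the idea without reading the Perl, here is a rough Python sketch of the first program's job: turning a good corpus and a bad corpus into a word-to-spam-probability table. It is not the code linked above. The tokenizer regexp, the function names, and the in-memory dict (instead of a .db file) are all assumptions, and the weighting loosely follows Paul Graham's published description (good counts doubled, words seen fewer than five times dropped, probabilities clamped).

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, apostrophes, dollar signs, hyphens.
TOKEN_RE = re.compile(r"[A-Za-z0-9'$-]+")

def tokenize(text):
    return TOKEN_RE.findall(text)

def count_words(messages):
    """Total occurrences of each token across a list of message strings."""
    counts = Counter()
    for msg in messages:
        counts.update(tokenize(msg))
    return counts

def build_word_db(good_msgs, bad_msgs, good_weight=2, min_count=5):
    """Map each sufficiently common word to an estimated spam probability."""
    good, bad = count_words(good_msgs), count_words(bad_msgs)
    ngood, nbad = max(len(good_msgs), 1), max(len(bad_msgs), 1)
    db = {}
    for word in set(good) | set(bad):
        g = good_weight * good[word]   # good mail counts extra
        b = bad[word]
        if g + b < min_count:          # too rare to be a robust predictor
            continue
        g_rate = min(1.0, g / ngood)
        b_rate = min(1.0, b / nbad)
        p = b_rate / (g_rate + b_rate)
        db[word] = max(0.01, min(0.99, p))   # never fully certain
    return db
```

The second program would then look up each word of an incoming message in this table and combine the probabilities.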
I've implemented it in part -- my code is in perl and will flag e-mails, but I haven't worked it into a filter yet.
My experience is that I get a few percent false negatives and about 1% false positives. I'm not seeing zero false positives, like many people are, but that probably has to do with the training sets used. Statistically speaking, you always have to trade off false negatives against false positives, so it's reasonable in my 'real world' tests.
As a side note, everyone should test out of sample. E.g. set aside half your good e-mails and half your spam e-mails, build the filter on one half, and then test on the other half. That's the only way to get a fair test of the filter.
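A minimal Python sketch of that out-of-sample split, assuming messages are plain strings and that `classify` is whatever function your filter exposes (a hypothetical name here) that returns True when it flags a message as spam:

```python
import random

def split_half(messages, seed=0):
    """Shuffle a corpus and split it into (training half, held-out half)."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def error_rates(classify, good_test, spam_test):
    """Return (false positive rate, false negative rate) on held-out mail."""
    false_pos = sum(1 for m in good_test if classify(m))
    false_neg = sum(1 for m in spam_test if not classify(m))
    return (false_pos / max(len(good_test), 1),
            false_neg / max(len(spam_test), 1))
```

Build the word database only from the training halves of the good and spam corpora, then call `error_rates` on the two held-out halves.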
For my "good" email corpus, I dumped my entire e-mail archive since 1995. That included personal e-mail, receipts from online shopping, some mailing lists, etc. The few things that get flagged as spam (a) are almost always sent in HTML format, and (b) are very short with little real content. (E.g., "Hey, looking forward to seeing you this weekend. Call me if you go out. My number is... Bye.")
The spam corpus I took from an online resource while I build up my own. The e-mails that slip by unflagged are usually (a) short and (b) phrased like a friend making a suggestion. (E.g., "Hi, I just thought you'd be interested in hearing about this new, cool website, http://...") It seems to be close enough to a real message to slip through. Thankfully, few of them are like that.
I'm including subject lines, from addresses, and the body so far. I'm not parsing ip addresses or html tags specially, however, just basic words using a simple perl regexp.
Interestingly, "COLOR" is one of the words most often flagged as indicating spam. HTML formatting text seems to be the biggest culprit in my false positives. I might explicitly exclude the ones that show up in good mail (e.g. from friends who use crappy e-mail programs like aol) like COLOR, FONT, FACE, etc., but leave in the ones that spammers use like TD, TR, etc.
Gary is both right in some respects and irrelevant in others. Here's the key line in his article that deflates it a bit:
It is untested as of now. It is based purely on theoretical reasoning. If anyone wants to try and it test it in comparison to other techniques, I'd be very interested in hearing the outcome.
On the other hand Paul Graham has actually tested his model and it works. I've worked it up in perl and tested it on my own data set and it works there, too. Paul acknowledges that he's being a bit fast and dirty, but the proof is in the pudding. The rest is just academic quibbling over the fine points.
I'm not sure why this particular article needed to be posted, as it's just one of several alternative approaches and an untested one at that. On Paul's page, he also lists several published academic papers with other alternatives -- all actually tested, of course.
Gary is basically right in questioning the use of the word "Bayesian". Paul's approach is more about weighing "evidence" as given by the appearance of certain words, rather than in figuring out the probability of spam assuming a "prior". See Paul's explanation, but if you check the article he references at the end, you'll note that the method Paul uses is only one of several methods to solve an underspecified problem. It's a reasonable guess, not necessarily the only guess.
Looking at another article Paul references, given the word independence assumption, the more formal Naive Bayesian approach calculates as follows:
p(spam|word1,...,wordn) = [ p(spam)*p(word1|spam)*...*p(wordn|spam) ] / [ p(spam)*p(word1|spam)*...*p(wordn|spam) + p(!spam)*p(word1|!spam)*...*p(wordn|!spam) ]
This is similar to Paul's approach except for including a "prior" assumption of p(spam) -- the expected probability of any email being spam, calculated from the historically observed frequency of spam. By leaving it out, Paul implicitly assumes that 50% of mail is spam -- that's his "prior" estimate of the spam rate. Given the other adjustments he makes to his sample, that appears to be acceptable in practice. (Paul overweights the spam prior, but also overweights the effects of "good" words.)
I'd personally prefer to overweight the "good" e-mails entirely rather than just put a "good-multiplier" on them like Paul does, but that's just quibbling over small bits.
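To make the role of the prior concrete, here is a small Python sketch of the Naive Bayes formula above. The function name and the example likelihoods are made up for illustration; they are not taken from anyone's real corpus.

```python
from math import prod

def spam_posterior(word_probs, prior=0.5):
    """Naive Bayes posterior p(spam | words).

    word_probs is a list of (p(word|spam), p(word|!spam)) pairs;
    prior is p(spam), the assumed share of all mail that is spam.
    """
    spam_side = prior * prod(ps for ps, _ in word_probs)
    good_side = (1 - prior) * prod(pg for _, pg in word_probs)
    return spam_side / (spam_side + good_side)

# Two hypothetical words, each more likely to appear in spam than in good mail.
words = [(0.8, 0.2), (0.6, 0.3)]
print(spam_posterior(words))             # implicit 50% prior
print(spam_posterior(words, prior=0.2))  # only 20% of mail assumed to be spam
```

With the same evidence, the posterior drops from about 0.89 to about 0.67 when the prior goes from 50% to 20%, which is the difference the 50% assumption hides.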
As to the bit that Gary raises about Paul assuming a spam probability for an unknown word -- Paul originally said .2, then revised to .4, but really should have put it at .5 or just excluded it from all calculations. A new word has no robustness as a predictor (which is why Paul dropped words that didn't appear five times anyway). In practice, a new word at .4 isn't going to be among the 15 most interesting words to make the calculation from, anyway.
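A rough Python sketch of why the unknown-word value barely matters. It assumes, as I understand Paul's description, that "interesting" means far from .5 and that the chosen probabilities are combined without a prior; the names and the tiny word table are hypothetical.

```python
from math import prod

def most_interesting(tokens, word_db, unknown=0.4, n=15):
    """Spam probabilities of the n distinct tokens farthest from a neutral 0.5."""
    probs = [word_db.get(t, unknown) for t in set(tokens)]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    return probs[:n]

def combine(probs):
    """Combine per-word spam probabilities, treating words as independent."""
    spam = prod(probs)
    good = prod(1 - p for p in probs)
    return spam / (spam + good)

# Hypothetical word table: one spammy word, one friendly word, one neutral word.
word_db = {"free": 0.99, "weekend": 0.01, "hello": 0.5}
picked = most_interesting(["free", "zzz", "hello"], word_db, n=2)
print(picked, combine(picked))
```

An unknown word at .4 sits only .1 away from neutral, so any real message with 15 strongly spammy or strongly friendly words pushes it out of the calculation. A word at exactly .5 contributes the same factor to both sides of the ratio, so it is equivalent to leaving the word out.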
ABSTRACT: I think that the goal of having everyone use "strong" encryption the "right" way has made it harder to get people to adopt encryption for e-mail. Today the average web surfer transparently uses public key encryption when visiting secure web pages. Why? Because they don't have to worry about any of the complexity. PGP or GPG, while robust and "correct", require too much from end users by way of key management to facilitate adoption. My assertion is that simpler, less robust methods of encryption and identity verification will serve better to kickstart adoption. Finally, consumers must have a valid reason for identity verification presented to them. I propose one: spam stomping.
Several comments have pointed out that encryption demand just wasn't there. I agree. While we would like to think that every end user would see the need for encrypted e-mail, we all know that hasn't happened. Yes, if MS or AOL made including encryption a standard part of their e-mail packages, that would go a long way, but the complexity of encryption needs to be hidden from the end-user.
Truth be told, most people really don't need encryption on a message-by-message basis. Encryption activists feel that a world with strong encryption is broadly a better one, but that requires a "network effect" from adoption and the current costs for adoption for end users are just too high in terms of complexity for them to want to go to the effort of adopting it for a vague future goal -- even assuming they comprehend it and agree.
Effecting real change may require backing away from some of the ideal crypto solution as a way of avoiding the complexity costs. People have mentioned several areas of complexity, including e-mail program transparency, certificate management, out-of-band verification, and trust networks.
However, if we consider web-browsing, these have been effectively hidden from end-users and have thus failed to hinder adoption. Web browsers have public key encryption built in, plus a master set of certificates for verification. Because web browsers don't require two-way verification of identity, users don't have to worry at all about managing their own key. Adoption of SSL has been effectively transparent to users. Those who seek to have crypto email become the standard should seek similar solutions to transparent adoption first, rather than seeking to deliver the most sophisticated crypto. Once adoption has been catalyzed, the technology can be improved as the masses become familiar with it.
Designers of the next generation of e-mail software should look to make certificates a natural part of the email environment. This should first center on identity verification, not encryption. There are several places this could happen:
When first setting up a new email address, the user should be prompted to create (or import) a digital ID for signing the e-mail. This certificate should be automatically (and transparently) sent to a central key-server. This key should be non-expiring. There should be options for upgrading/replacing the digital ID for advanced users. Passphrases should not be used by default, but could be an option for advanced users seeking additional security.
Address books or nickname managers should include an icon/notation to indicate which addresses have digital IDs. When nicknames are created or imported, the program should check all e-mail addresses against a keyserver and import the public keys.
When sending mail, signing should be the default.
When receiving mail, the return address should be checked against the nickname file or against the keyserver automatically. Signed mail should be specially flagged as such in the Inbox. Encrypted e-mail should be automatically decrypted on receipt/viewing. (Advanced users might opt to keep mail encrypted on disk and only decrypt it for viewing.)
For moderately advanced users, options should exist to enable encryption by default. This should automatically encrypt if the e-mail address matches one in the nickname file with a digital ID associated with it. Advanced users could opt to automatically seek a key from a keyserver and send encrypted if found or plaintext if not found.
Consumers would have to find a reason to upgrade to this kind of system. One possible option is to use the signatures to help spam stomping. Using e-mail addresses alone for filtering may work for a while, but will ultimately likely fail due to the ease of forgery. Filtering on signatures either against the ID in the nickname file or a verified key on the keyserver (one signed by a master CA, so that legit companies could send you something without you having to have them in your nickname file) might work very well. With spam being the problem that it is, this might be part of a "killer app" of next generation e-mail programs to deal with the problem.
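As a sketch of what that filtering rule might look like inside a mail client, here is some Python. Everything in it is hypothetical: the function and parameter names are invented, no real crypto library is called, and the signature check itself is assumed to have already happened and to arrive as a boolean.

```python
from enum import Enum

class Verdict(Enum):
    INBOX = "inbox"
    SUSPECT = "suspect"

def triage(sender, signing_key_id, signature_valid, address_book, ca_certified_keys):
    """Decide where a message goes based on its signature, not its From line.

    address_book maps e-mail addresses to the key ID stored in the nickname file;
    ca_certified_keys is the set of key IDs vouched for by a master CA.
    """
    if signature_valid and signing_key_id is not None:
        # Known correspondent signing with the key we already have on file.
        if address_book.get(sender) == signing_key_id:
            return Verdict.INBOX
        # Stranger, but the key is certified, e.g. a legitimate company.
        if signing_key_id in ca_certified_keys:
            return Verdict.INBOX
    # Unsigned, badly signed, or signed with an unrecognized key.
    return Verdict.SUSPECT
```

The point of the design is that a forged return address alone never reaches the inbox path; only a valid signature from a recognized or certified key does.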
What this largely throws out is the element of getting signed certificates and requiring consumers to manage them. However, I'm certain that after people got used to the idea, the notion of having their digital ID certified wouldn't be so complicated. (It would of course need to be affordable.)
(One could imagine that when governments begin to issue digital ID's these will be signed by the government and could be used for e-mail signatures as well.)
The question of course is who could make all this happen? It probably still falls to the major e-mail program makers. I'm surprised that MS hasn't started down this path. Building in transparent signatures like this should be beneficial for their corporate business, and they should be able to sell lots of PKI add-ons for corporations to do certificate management. (Ironically, I think Notes has had this built in for ages, but it only works within the Notes server.)
So, anyone from MS listening? Now's your chance to deliver on your privacy initiative. Best of luck.
This type of approach isn't exactly trying to "describe society" in the same way that physics tries to describe the functioning of the universe. So the critique above is true, but not entirely the point.
These models are interesting in the fields of sociology and economics because they allow an exploration of how local phenomena -- particularly local "rules" for agents -- give rise to macro-behavior and collective order. Economics, in particular, confronts the differences between the micro-level and the macro-level and has different techniques for each of these domains. Agent based simulation can demonstrate how certain economic phenomena (e.g. market clearing equilibrium prices) can exist from local information exchange without requiring macro structures (e.g. a centrally clearing market). Likewise, agent based models allow exploration of how underlying characteristics of the system at a local level (e.g. bell-shaped agent vision/metabolism) give rise to macro level results (e.g. skewed wealth distributions), and how changes in the underlying assumptions change the results.
For example, it is easy to say that it is "obvious" that races segregate just like two types of particles because it's the lowest energy state, but this was not common wisdom until someone expressed racial segregation in the model of typed particles on a lattice. The result is obvious to a physicist without needing a simulation, but posing the question that way was not obvious. The point of the exploration of segregation wasn't that there was phase separation at all, but rather how differences in the amount of "attraction energy" (racial preference), changed the shape/smoothness of the interface between types. The interesting result was that very little racial preference (on both sides) is needed to explain segregation. Yes, the model could be improved by considering friction/viscosity of motion of the particles/agents, etc., etc., but the base insight that segregation is a "natural" phenomenon, not a "racist" one is of interest.
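For anyone who wants to play with the segregation idea, here is a small Python sketch. It is a Schelling-style relocation model, which is my stand-in and not necessarily the lattice model with "attraction energy" described above: agents of two types sit on a wrap-around grid, and any agent with fewer like neighbors than its `preference` moves to a random empty cell. Grid size, empty share, and all function names are arbitrary assumptions.

```python
import random

def neighbors(grid, r, c):
    """Occupied cells among the 8 surrounding cells (edges wrap around)."""
    n = len(grid)
    cells = [grid[(r + dr) % n][(c + dc) % n]
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [x for x in cells if x is not None]

def like_fraction(grid, r, c):
    """Share of an agent's neighbors with its own type (1.0 if it has none)."""
    near = neighbors(grid, r, c)
    if not near:
        return 1.0
    return sum(1 for x in near if x == grid[r][c]) / len(near)

def mean_similarity(grid):
    """Average like-neighbor share over all agents: a crude segregation index."""
    n = len(grid)
    vals = [like_fraction(grid, r, c)
            for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(vals) / len(vals)

def make_grid(n, empty_share, rng):
    """Random n x n grid with equal numbers of "A" and "B" agents."""
    n_empty = round(n * n * empty_share)
    n_agents = n * n - n_empty
    cells = ["A"] * (n_agents // 2) + ["B"] * (n_agents - n_agents // 2)
    cells += [None] * n_empty
    rng.shuffle(cells)
    return [cells[i * n:(i + 1) * n] for i in range(n)]

def step(grid, preference, rng):
    """Move each currently unhappy agent to a random empty cell; return the count."""
    n = len(grid)
    empties = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
    if not empties:
        return 0
    unhappy = [(r, c) for r in range(n) for c in range(n)
               if grid[r][c] is not None and like_fraction(grid, r, c) < preference]
    rng.shuffle(unhappy)
    for r, c in unhappy:
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], None
        empties[i] = (r, c)   # the vacated cell becomes available
    return len(unhappy)

rng = random.Random(1)
grid = make_grid(20, 0.1, rng)
before = mean_similarity(grid)
for _ in range(30):
    if step(grid, preference=0.3, rng=rng) == 0:
        break
after = mean_similarity(grid)
print(f"mean like-neighbor share: {before:.2f} -> {after:.2f}")
```

Try different `preference` values. In models of this kind the final like-neighbor share typically ends up well above the preference each agent actually asked for, which is the "very little preference is needed" result in miniature.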
The article mentions, among other things, the work of BiosGroup. From their website:
BiosGroup, a Santa Fe-based consulting and software development company, pioneered the use of complexity science to solve complex business problems and is now the world leader in applying the techniques of this emerging science to large commercial applications.
BiosGroup was founded as a joint venture between the Center for Business Innovation of Ernst & Young (now Cap Gemini Ernst & Young) and Dr. Stuart Kauffman, a co-founder of the Santa Fe Institute and author of several books on complexity science.
-XDG
Worked with these guys in college
I worked with Epstein and Axtell in college. The author's description of them is spot on, and they are both fantastic people.
If you found this article interesting, their book is a great exposition of their early work with emergent behaviors. You can find it at Amazon here:
The article boils down more or less to the following:
1. "Old" search technologies (Altavista, Yahoo) failed because they used approaches that found words but not content (Altavista) or relied on non-scalable human editorial judgement (Yahoo).
2. Google works (and is cool) because it uses available information about the number of links to determine (a) valuable content and (b) smart judges of other valuable content
3. The government efforts at creating the Panopticon will fail because they'll be stuck using "old" keyword approaches that can't pick out real content.
This argument is flawed in two key ways:
1. The author confuses the nature of the "search". Web searching is about finding *content* and the challenge is differentiating "good" content from "bad" content. Governmental "security" searching is more akin to traffic analysis and the goal is identifying dangerous *individuals* based on the content and pattern of their traffic. The challenge there is differentiating "good" (safe) speakers from "bad" (dangerous) speakers.
2. The author assumes (based apparently simply on opinion and what is popularly reported in the press) that the government will blindly apply "alta-vista style" techniques. His lack of fear of the Panopticon is based on an assumption of incompetence in the application of surveillance methods. Given the motivation and resources (both of which the government now has in spades), there is no reason to believe that more sophisticated and effective techniques will not be developed and pursued. Assuming Echelon has really been in operation, it's hard to imagine that, in the closed halls of the NSA, researchers aren't well aware of the limitations of keyword search and are far along applying cryptanalytical techniques to the real problem identified above.
It would seem that the author is trying to take advantage of hype and concern about government surveillance not to make a serious comment about it or whether one should truly be concerned, but rather to get an audience for his opinion that Google is really cool, which most of us already knew anyway.
OK. So they reset the preferences to try to get you receiving the mail they want you to get. At least they had the courtesy to e-mail you and tell you, thus giving the option to opt out again if you choose. They even reminded you that it ought to be your choice.
If this happened again in another couple months or even in the next year, I think there would be cause to complain, but balancing their desire for e-mailing you against your desire not to get it, this seems like a reasonable approach. They want to mail you, they warn you they'll start, if it's important enough, you set your preferences to stop it again.
It's fun to bash, but remember all the companies that change policies or release data without giving any warning. Consider this one a victory for corporate responsibility.
Separately, I've had good results with Speakeasy.net as a provider. Their website for subscribers lets you track the progress through the red tape and lets you see the actual logs that Covad maintains on the progress of the install.
Ask for a "vendor meet." These are the magic words that get things done when coordination difficulties arise between DSL providers and your RBOC.
A vendor meet is where technicians from the DSL provider (in this case, Covad) and the Bell company get together in the same place at the same time to verify that service exists (or, more usually, doesn't exist).
If you ever have trouble where Bell claims something is either installed or fixed and it actually isn't, push your DSL provider to force a vendor meet. This usually gets Bell off their butts and gets things fixed.
This is of more use during the install phase -- the original poster probably needs to do at least one round of letting the system work on things before pushing. Also, don't fear to contact Covad directly (1-800-GO-COVAD) and deal with them if it's actually a problem with the line. If you're just having IP/configuration trouble, you need to deal with your ISP, but if you actually, physically, don't have service, my experience has been that Covad is willing to talk to you and try to move things along.
Well, according to the recent US vs M$ Finding of Fact:
I.2. An "operating system" is a software program that controls the allocation and use of computer resources (such as central processing unit time, main memory space, disk space, and input/output channels). The operating system also supports the functions of software programs, called "applications," that perform specific user-oriented tasks. The operating system supports the functions of applications by exposing interfaces, called "application programming interfaces," or "APIs." These are synapses at which the developer of an application can connect to invoke pre-fabricated blocks of code in the operating system. These blocks of code in turn perform crucial tasks, such as displaying text on the computer screen. Because it supports applications while interacting more closely with the PC system's hardware, the operating system is said to serve as a "platform."
There! Isn't that clearer? ;-)
To the original point, it is a sufficiently broad statement that the size of the OS code is irrelevant.
Is it me, or is /. becoming home to any sort of FUD that will get people whipped up? (Unless it's against Linux, of course :-)
This is a classic case of the ACLU and some hyper-active first-amendment activists blowing things out of proportion and slanting the facts to suit their purposes.
I actually went to the FEC's web site and citizen's guide (http://www.fec.gov/pages/citnlist.htm) for some information before posting this reply and learned some interesting things.
First, volunteering does not make someone a PAC, as some people immediately started yammering on about. From the site:
Personal Services
An individual may help candidates and committees by volunteering personal services. For example, you may want to take part in a voter drive or offer your skills to a political committee. Your services are not considered contributions as long as you are not paid by anyone. (If your services are compensated by someone other than the committee itself, the payment is considered a contribution by that person to the committee.)
As a volunteer, you may spend unlimited money for normal living expenses.
Further, what the article is talking about when you personally make a web page about a campaign is called "Independent Expenditures" -- meaning that you are doing it as an individual and independently, not linked in with some candidate campaign. Again, from the site:
Independent Expenditures
Independent expenditures provide yet another way to support Federal candidates. An independent expenditure is money spent for a communication that expressly advocates the election or defeat of a candidate. It is "independent" only if the individual making the expenditure does not coordinate or consult in any way with the candidate campaign benefiting from the communication. Independent expenditures are not considered contributions and are unlimited. You may spend any amount on each communication as long as the expenditure is truly independent.
You may, for example, pay for an advertisement in a newspaper or on the radio urging the public to vote for the candidate you want elected. Or you may produce and distribute posters or yard signs telling people not to vote for a candidate you oppose.
When making an independent expenditure, you must include a notice stating that you have paid for the communication and that it is not authorized by any candidate's committee. ("Paid for by John Doe and not authorized by any candidate's committee.") Additionally, once you spend more than $250 on independent expenditures during a year, you must file a report with the Federal Election Commission, either FEC Form 5 or a signed statement containing the same information.
There are a couple of relevant caveats in that. First, you have to say that you are independent. Second, if you spend over $250, you have to file a form. This DOES NOT mean your free speech is being restricted. All it means is that the government is requiring you to register how much you spent on your speech. Why? So that political campaigns can't get around federal law by pretending to have lots of independent contributions.
You can download the form from the web site. It's about a page long. Name, address, how much you spent. Not much more than that.
Finally, I personally think it would be hard to say that a page on your website with a political message should be "calculated" as the cost of the machine, web-space, etc., as the marginal cost of adding a page to an existing site is essentially zero. If you had a dedicated machine, they'd have a better case.
In any case, people should go looking for the facts (since they're in plain sight) before overreacting to whatever FUD people want to use /. to spread.
People seem to be forgetting that Kasparov vs. The World is really just a PR gambit. It promotes the MS gaming site "The Zone" (in MS's interests) and it promotes the game of chess (in GK's interests). It really wasn't set up as some sort of great test of "electronic democracy" -- ensuring the impossibility of cheating wasn't tops in the organizers' minds. That notion is a construct of the tech and cyber heads who are making more of this than it was ever intended to be.
People seem to want it both ways. First, this is a great test of "collective thinking" against the world champion, and then second, they get upset because the Krush/Kasparov duel got interrupted for technical reasons and they were forced to think for themselves.
And the suggestion that Kasparov might cheat is ludicrous.
As a separate aside, on the topic of whether this game "proves" that Krush and several grandmasters and lots of computer time can produce moves at Kasparov's level, I'll quote analyst commentary from move 3 about Kasparov's choice of move:
DANNY KING MOVE 3 COMMENTARY
One of the old masters once said: "When I give check I fear no one!", but don't panic, we can get out of this one easily.
Garry's own comment to his move is revealing:
"It seems that young coaches are trying to force me to play against my favourite Najdorf! Due to forthcoming match with Vishy I have to refrain from public theoretical duel. So please forgive me for selecting unattractive 3 Bf1-b5+."
Let me explain:
In the latter part of the year, most likely October till mid November, there is a good chance that Garry Kasparov will be defending his World title in a match against the world no.2, Vishy Anand from India. At this moment both players will be beginning their intense preparation for the match, including research on their opening repertoires.
It is therefore understandable that Garry wishes to reveal nothing of his future plans and so avoids the move which is generally accepted as the most critical - 3 d4 leading to an open game, rich in fighting possibilities for both sides - and turns to the bishop check, generally leading to a more closed position. The World Champion describes the move as 'unattractive', possibly because it could lead to the early exchange of pieces after, for instance, 3...Bd7, when ideally he would like to maintain as much tension as possible.
So, yes, Krush and "The World" can rival Kasparov... as long as he isn't trying his hardest.
In some bizarre narcissistic drive to see his words in print and debated, Katz seems to have taken far, far, too many lines to say something I can summarize in two:
Honestly debating ideas you disagree with is better than flaming them.
Technology creates new ideas for people to flame, but also makes it harder for flamers to stop debate.
I'm surprised the average /. reader finds this worthy of debate.
(On the other hand, the moral issues he raised as examples are certainly worthy of debate both on /. and elsewhere.)
While this may be true, they have one very major advantage: They must be very difficult to predict, which is what these chess types count on to decide their next moves.
This is, I believe, a misconception about the game of chess, that shows why human/computer matches are often misunderstood.
For humans, chess at the highest levels is not played by considering each move, predicting a response, and continuing through some search depth. Instead, chess masters always assume that the opponent will find the best response. To the extent that the response to a move must be "predicted" -- the master must find the one, single best move in response. This is why the "predictiveness" of the world is irrelevant -- the combination of the grandmaster coaches and collective voting push the game towards correct responses.
The real question is how to develop a strategy in the game that leads to small advantages. Each move, particularly early on, has effects on various factors: space, material, development, etc. This requires the ability to understand the balances and imbalances inherent in the board at any particular snapshot.
That is the difficult part of programming the computer. Searching out combinations is just a matter of power. The trick is programming the computer to analyze a snapshot of the board in order to determine the relative balance. Then the computer can use that information to examine in further depth the stronger lines.
If the programmers do a poor job of programming the analysis of the snapshots, a human player can exploit that. In several of the Kasparov vs computer matches, Kasparov's first few games were all about him examining how the computer was programmed to think, then exploiting that in the later games. For example, if the computer over-valued material advantage, Kasparov could use that knowledge to "convince" the computer to take a sacrifice that on the surface looked to offer a material advantage but that would lead to a positional imbalance that would favor Kasparov later.
(This is one reason that Kasparov is annoyed that the programmers never released information on how they programmed the computers.)
Further, the earlier comment about the effect of grandmaster coaches was dead on. One of the things that made the game so interesting to chess aficionados is that Krush pulled out a move she had been preparing for some time and "discovered" a new line in the old opening. (Look at Kasparov's comments on this.) If you look at the history of the analysts' commentary, the voting became a personality contest between the analysts taking quieter approaches and those taking active approaches.
As a final note, another comment in the history is that Kasparov mentioned at one point that he was doing something less than ideal because he didn't want to give away certain preparation he was making for his world championship match against Anand.
My overall take:
Is GK vs The World interesting? Yes.
Does it "mean something"? Not particularly.
[As an aside, I wonder how vitriolic the comments would be if the original author worked for some other company...]
Those people who are claiming that STO should be defined as "security *only* because the algorithm/implementation is unknown to attackers" and then go on to say that STO is not good security are largely arguing a truism. (If I recall my debating terms correctly.)
The logic for STO goes as follows:
(a) There is a bad method of keeping secrets
(b) We will make the secrets safer by not telling anyone about (a)
This is almost definitionally a bad way to keep secrets as (b) -- not telling -- is almost certainly an even *worse* way to keep a secret than (a)!
The two most frequent counter arguments seem to be:
i. Open algorithms prevent (a)
ii. Relying on (b) will eventually fail
Argument (i) is merely a statement of how to get good security ("if the algorithm is secure even when the attacker has the algorithm and a plaintext/encrypted text sample, it must be secure"), which actually runs counter to the initial premise that security is *ONLY* due to obfuscation. If you had good security, it definitionally wouldn't be STO!
Argument (ii) is the correct criticism of STO if STO requires the "only" clause.
A generous interpretation of the author's original point -- STO jargon misconception aside -- is that obfuscating details from an attacker may enhance security. Consider the matrix:
Obfuscated?  |  Quality of security: Low  |  Quality of security: High
N            |  A                         |  C
Y            |  B                         |  D
(Hey! How about a "pre" tag!)
A: is trivial
B: is STO
C: is "open encryption"
D: presumably, is stronger even than C
As an example, if I have a choice of strong, open encryption systems, why tell an attacker that I'm using Twofish instead of IDEA? Even if that obfuscated fact is revealed, the fallback of (C) is secure. In the meantime, however, I've layered on additional complexity for a would-be attacker.
Of course, practically speaking, (C) is sufficient, but there's certainly no need to go out of our way to tell potential attackers all the details.
I have been using Gandi for several years and been very happy with their service.
They offer domain registration in .com/.org/.net/.biz/.info/.name/.be for EUR12 a year (about $14, lately). That includes optional free web redirection, email redirection, DNS hosting, and secondary DNS. Almost all administration is automated on their website and very easy to use. I have zero complaints and nothing but compliments for them, and have been recommending them to friends for low-cost, high-quality domain registration.
From their site:
-XDG
I've been using my SMC2655W for about a year and a half. Rock solid the entire time. Highly recommended.
-xdg
This may not help your particular situation with your existing hardware, but I switched to a USB KVM switch (with a USB keyboard and USB wheel-mouse) and it works just fine in Windows 2000/XP and Redhat Linux (8.0). There are, of course, other issues with using a USB keyboard if you need boot-time support, but a modern PC motherboard should be able to handle it without much difficulty.
Best of luck,
XDG
First, I highly recommend you use Quicken/Money religiously. I only bank/etc with companies that I can download direct to Quicken. Watching your net-worth chart over months/years can be inspiring. Seeing each month if you made more than you spent is crucial.
Second, the best piece of advice is to invest at least 10% of what you make. To follow common advice, "pay yourself first"! This means making that 10% disappear from your paycheck before you even think of it as money to use to pay bills, buy pizza, etc. If you can put this in a 401k, great! If not, do it anyway.
Third, the vast majority of people investing should probably be using index funds. Yes, some people make money in the market, but it's probably just luck. (Really! There are some 30,000 or so mutual funds -- half don't beat the market each year. Imagine beating the market as equal to flipping heads on a coin. Now watch 30,000 people flip a coin once a year for 10 years or so. You're still going to have some lucky few who flip heads 9/10 times. Ever wonder why mutual fund companies have so many funds? It's so they always have at least one or two that did well last year that they can advertise. If professional money managers can't do better than 50/50, why do you hear about so many successful individuals? You're not talking about 30,000, now you're talking millions. At 50/50 odds -- or worse -- you'll always hear about someone who made a lot of money.) My advice for anyone starting out is to open an account at Vanguard and stick your money in a broadly diversified index fund.
Fourth, make sure you have an emergency fund -- three months' expenses is usually cited. Once that's topped up, you should think about investing most of the rest. Weight investments heavily towards stocks when you're young, and shift gradually towards bonds as you approach retirement.
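For anyone who wants to check the coin-flip arithmetic, here is a small Python sketch. It takes the numbers in the paragraph at face value (30,000 funds, 10 years, a 50/50 chance each year) and assumes the years are independent, which is a simplification rather than a claim about real fund returns.

```python
from math import comb

def prob_at_least(heads, flips, p=0.5):
    """Chance of getting at least `heads` heads in `flips` independent flips."""
    return sum(comb(flips, k) * p**k * (1 - p)**(flips - k)
               for k in range(heads, flips + 1))

funds = 30_000                   # figure quoted in the post
p_lucky = prob_at_least(9, 10)   # "beat the market" in 9 or 10 years of 10
expected_lucky = funds * p_lucky

print(f"{p_lucky:.4f}")          # 0.0107
print(round(expected_lucky))     # 322
```

So pure chance alone would be expected to produce a few hundred funds with a nine-in-ten record, which is plenty to fill an advertisement.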
Books that are useful:
A Random Walk Down Wall Street -- by Burton Malkiel, a Princeton economist who also served on Vanguard's board
Stocks For The Long Run -- a top academic perspective on investing
When I was very young (high school), I found that Smart Money, by Ken & Daria Dolan, was a good overview. That's old, but they have more recent texts.
In the "funny story" model of teaching basics of financial planning, you might consider The Wealthy Barber (recommended by a friend of mine, though I found it too simple by the time I got around to it) and/or The Richest Man In Babylon (parables about money originally written in 1926 and still applicable -- see the Amazon reviews!).
For why not to confuse chance with skill, try Fooled by Randomness. It has a lot of snide commentary aimed at industry insiders, but makes a good point about why humans mistake luck for skill, broadly.
I hope those are a helpful start. Best of luck!
-XDG
(This is like a throwback to the user-group days of old when we actually used green-screen BBS's on 2400 bps modems... but sometimes the oldies are goodies.)
XDG
Sorry, permissions fixed.
Also, I discovered that the new Mail::Box as of a few days ago was breaking my code, so I've got two versions up there now.
-XDG
Two programs -- one to make a .db file with the words and probabilities from a good corpus and bad corpus, the other to test a mailbox against the word database. In both cases, the mail files probably need to be in unix mbox format.
-XDG
My experience is that I get a few percent false negatives and about 1% false positives. I'm not seeing zero false positives, like many people are, but that probably has to do with the training sets used. Statistically speaking, you always have to trade off false negatives against false positives, so it's reasonable in my 'real world' tests.
As a side note, everyone should test out of sample. E.g. set aside half your good e-mails and half your spam e-mails, build the filter on one half, and then test on the other half. That's the only way to get a fair test of the filter.
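A minimal Python sketch of that half-and-half test follows. The function names and the `classify` callback are placeholders of mine, not part of the programs described in this thread; plug in whatever filter you built on the training halves.

```python
import random

def split_half(messages, seed=0):
    """Shuffle a copy of the messages and return (train, test) halves."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def error_rates(classify, good_test, spam_test):
    """Measure a filter on held-out mail only.

    classify(message) should return True when the message is flagged as spam.
    Returns (false_positive_rate, false_negative_rate).
    """
    false_pos = sum(1 for m in good_test if classify(m)) / len(good_test)
    false_neg = sum(1 for m in spam_test if not classify(m)) / len(spam_test)
    return false_pos, false_neg

# Usage outline: build the filter from good_train and spam_train only,
# then report error_rates(...) on good_test and spam_test.
good_train, good_test = split_half(["good mail %d" % i for i in range(10)])
spam_train, spam_test = split_half(["spam mail %d" % i for i in range(10)])
```

The point of the split is that no message used to build the word database is ever used to score it.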
For my "good" email corpus, I dumped my entire e-mail archive since 1995. That included personal e-mail, receipts from online shopping, some mailing lists, etc. The few things that get flagged as spam (a) are almost always sent in HTML format, and (b) are very short with little real content. (E.g., "Hey, looking forward to seeing you this weekend. Call me if you go out. My number is... Bye.")
The spam corpus I took from an online resource while I build up my own. The e-mails that slip by unflagged are usually (a) short and (b) phrased like a friend making a suggestion. (E.g., "Hi, I just thought you'd be interested in hearing about this new, cool website, http://...") It seems to be close enough to a real message to slip through. Thankfully, few of them are like that.
I'm including subject lines, from addresses, and the body so far. I'm not parsing ip addresses or html tags specially, however, just basic words using a simple perl regexp.
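For anyone who wants to experiment, here is a rough Python equivalent of the "basic words with a simple regexp" step. My own code is Perl and is not reproduced here; the pattern, the function names, and the ratio used for the per-word probability below are illustrative assumptions, not a copy of what my programs do.

```python
import re
from collections import Counter

# Hypothetical token pattern: runs of letters, digits, apostrophes, "$" and "-".
WORD = re.compile(r"[A-Za-z0-9'$-]+")

def tokenize(message):
    """Split headers and body alike into plain word tokens."""
    return WORD.findall(message)

def word_counts(messages):
    """Count, for each word, how many messages in the corpus contain it."""
    counts = Counter()
    for message in messages:
        counts.update(set(tokenize(message)))
    return counts

def spam_probability(word, good, spam, n_good, n_spam):
    """Share of a word's corpus-relative frequency that comes from spam.

    Returns None for a word seen in neither corpus.
    """
    g = good[word] / n_good
    s = spam[word] / n_spam
    if g + s == 0:
        return None
    return s / (g + s)

print(tokenize('<FONT COLOR="red">Hi</FONT>'))
# ['FONT', 'COLOR', 'red', 'Hi', 'FONT']
```

Note that HTML tag names come out as ordinary tokens with this approach, since nothing treats markup specially.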
Interestingly, "COLOR" is one of the most often flagged words indicating spam. HTML formatting text seems to be the biggest culprit in my false positives. I might explicitly exclude the ones that show up in good mail (e.g. from friends who use crappy e-mail programs like aol) like COLOR, FONT, FACE, etc., but leave in the ones that spammers use like TD, TR, etc.
-XDG
I'm not sure why this particular article needed to be posted, as it's just one of several alternative approaches and an untested one at that. On Paul's page, he also lists several published academic papers with other alternatives -- all actually tested, of course.
Gary is basically right in questioning the use of the word "Bayesian". Paul's approach is more about weighing "evidence" as given by the appearance of certain words, rather than figuring out the probability of spam assuming a "prior". See Paul's explanation, but if you check the article he references at the end, you'll note that the method Paul uses is only one of several methods to solve an underspecified problem. It's a reasonable guess, not necessarily the only guess.
Looking at another article Paul references, given the word independence assumption, the more formal Naive Bayesian approach calculates as follows:
p(spam|word1,...,wordn) = [ p(spam)*p(word1|spam)*...*p(wordn|spam) ] / [ p(spam)*p(word1|spam)*...*p(wordn|spam) + p(!spam)*p(word1|!spam)*...*p(wordn|!spam) ]
This is similar to Paul's approach except for including a "prior" assumption of p(spam) -- the expected probability of any email being spam, calculated from the historically observed frequency of spam. By leaving it out, Paul implicitly assumes that 50% of mail is spam -- that's his "prior" estimate of the spam rate. Given the other adjustments he makes to his sample, that appears to be acceptable in practice. (Paul overweights the spam prior, but also overweights the effects of "good" words.)
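Here is a short Python sketch of the Naive Bayesian formula above, mainly to show how much the prior matters. The word likelihoods are invented numbers for a two-word message, not figures from any real corpus.

```python
from math import prod

def naive_bayes_spam(prior_spam, p_words_given_spam, p_words_given_good):
    """Posterior probability of spam under the word-independence assumption."""
    spam_side = prior_spam * prod(p_words_given_spam)
    good_side = (1 - prior_spam) * prod(p_words_given_good)
    return spam_side / (spam_side + good_side)

# Made-up likelihoods for the two words in one message.
given_spam = [0.8, 0.6]
given_good = [0.2, 0.3]

even_prior = naive_bayes_spam(0.5, given_spam, given_good)  # implicit 50% prior
low_prior = naive_bayes_spam(0.2, given_spam, given_good)   # 20% of mail is spam

print(round(even_prior, 4))  # 0.8889
print(round(low_prior, 4))   # 0.6667
```

With the same word evidence, moving the prior from 50% to 20% takes the message from about 89% to about 67% likely spam, which is the effect that dropping the prior hides.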
I'd personally prefer to overweight the "good" e-mails entirely rather than just put a "good-multiplier" on them like Paul does, but that's just quibbling over small bits.
As to the bit that Gary raises about Paul assuming a spam probability for an unknown word -- Paul originally said .2, then revised to .4, but really should have put it at .5 or just excluded it from all calculations. A new word has no robustness as a predictor (which is why Paul dropped words that didn't appear five times anyway). In practice, a new word at .4 isn't going to be among the 15 most interesting words to make the calculation from, anyway.
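To see why an unknown word at .4 rarely matters, here is a Python sketch of the "15 most interesting words" calculation as I read Paul's article: rank words by how far their probability is from .5, keep the top 15, and combine them. The per-word probabilities are invented, and the sketch assumes every probability lies strictly between 0 and 1 (Paul clamps them so this holds).

```python
from math import prod

def combined_spam_score(tokens, word_prob, unknown=0.4, top_n=15):
    """Combine the top_n word probabilities furthest from 0.5.

    Assumes all probabilities are strictly between 0 and 1.
    """
    probs = [word_prob.get(token, unknown) for token in set(tokens)]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:top_n]
    spam = prod(top)
    good = prod(1 - p for p in top)
    return spam / (spam + good)

# Hypothetical word database.
word_prob = {"viagra": 0.99, "weekend": 0.01, "free": 0.9}
tokens = ["free", "viagra", "newword"]

with_unknown = combined_spam_score(tokens, word_prob)
without_unknown = combined_spam_score(tokens, word_prob, top_n=2)
```

In this toy case the unknown word only nudges the score, and in a real message with more than 15 distinctive words it would be crowded out of the calculation altogether.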
-XDG
Several comments have pointed out that encryption demand just wasn't there. I agree. While we would like to think that every end user would see the need for encrypted e-mail, we all know that hasn't happened. Yes, if MS or AOL made including encryption a standard part of their e-mail packages, that would go a long way, but the complexity of encryption needs to be hidden from the end-user.
Truth be told, most people really don't need encryption on a message-by-message basis. Encryption activists feel that a world with strong encryption is broadly a better one, but that requires a "network effect" from adoption and the current costs for adoption for end users are just too high in terms of complexity for them to want to go to the effort of adopting it for a vague future goal -- even assuming they comprehend it and agree.
Effecting real change may require backing away from some of the ideal crypto solution as a way of avoiding the complexity costs. People have mentioned several areas of complexity, including e-mail program transparency, certificate management, out-of-band verification, and trust networks.
However, if we consider web-browsing, these have been effectively hidden from end-users and have thus failed to hinder adoption. Web browsers have public key encryption built in, plus a master set of certificates for verification. Because web browsers don't require two-way verification of identity, users don't have to worry at all about managing their own key. Adoption of SSL has been effectively transparent to users. Those who seek to have crypto email become the standard should seek similar solutions to transparent adoption first, rather than seeking to deliver the most sophisticated crypto. Once adoption has been catalyzed, the technology can be improved as the masses become familiar with it.
Designers of the next generation of e-mail software should look to make certificates a natural part of the email environment. This should first center on identity verification, not encryption. There are several places this could happen:
- When first setting up a new email address, the user should be prompted to create (or import) a digital ID for signing the e-mail. This certificate should be automatically (and transparently) sent to a central key-server. This key should be non-expiring. There should be options for upgrading/replacing the digital ID for advanced users. Passphrases should not be used by default, but could be an option for advanced users seeking additional security.
- Address books or nickname managers should include an icon/notation to indicate which addresses have digital IDs. When nicknames are created or imported, the program should check all e-mail addresses against a keyserver and import the public keys.
- When sending mail, signing should be the default.
- When receiving mail, the return address should be checked against the nickname file or against the keyserver automatically. Signed mail should be specially flagged as such in the Inbox. Encrypted e-mail should be automatically decrypted on receipt/viewing. (Advanced users might opt to keep encrypted on disk and only decrypt for viewing)
- For moderately advanced users, options should exist to enable encryption by default. This should automatically encrypt if the e-mail address matches one in the nickname file with a digital ID associated with it. Advanced users could opt to automatically seek a key from a keyserver and send encrypted if found or plaintext if not found.
Consumers would have to find a reason to upgrade to this kind of system. One possible option is to use the signatures to help spam stomping. Using e-mail addresses alone for filtering may work for a while, but will ultimately likely fail due to the ease of forgery. Filtering on signatures either against the ID in the nickname file or a verified key on the keyserver (one signed by a master CA, so that legit companies could send you something without you having to have them in your nickname file) might work very well. With spam being the problem that it is, this might be part of a "killer app" of next generation e-mail programs to deal with the problem.
What this largely throws out is the element of getting signed certificates and requiring consumers to manage them. However, I'm certain that after people got used to the idea, the notion of having their digital ID certified wouldn't be so complicated. (It would of course need to be affordable.)
(One could imagine that when governments begin to issue digital ID's these will be signed by the government and could be used for e-mail signatures as well.)
The question of course is who could make all this happen? It probably still falls to the major e-mail program makers. I'm surprised that MS hasn't started down this path. Building in transparent signatures like this should be beneficial for their corporate business, and they should be able to sell lots of PKI add-ons for corporations to do certificate management. (Ironically, I think Notes has had this built in for ages, but it only works within the Notes server.)
So, anyone from MS listening? Now's your chance to deliver on your privacy initiative. Best of luck.
-XDG
These models are interesting in the fields of sociology and economics because they allow an exploration of how local phenomena -- particularly local "rules" for agents -- give rise to macro-behavior and collective order. Economics, in particular, confronts the differences between the micro-level and the macro-level and has different techniques for each of these domains. Agent based simulation can demonstrate how certain economic phenomena (e.g. market clearing equilibrium prices) can arise from local information exchange without requiring macro structures (e.g. a centrally clearing market). Likewise, agent based models allow exploration of how underlying characteristics of the system at a local level (e.g. bell-shaped agent vision/metabolism) give rise to macro level results (e.g. skewed wealth distributions), and how changes in the underlying assumptions change the results.
For example, it is easy to say that it is "obvious" that races segregate just like two types of particles because it's the lowest energy state, but this was not common wisdom until someone expressed racial segregation in the model of typed particles on a lattice. The result is obvious to a physicist without needing a simulation, but posing the question that way was not obvious. The point of the exploration of segregation wasn't that there was phase separation at all, but rather how differences in the amount of "attraction energy" (racial preference), changed the shape/smoothness of the interface between types. The interesting result was that very little racial preference (on both sides) is needed to explain segregation. Yes, the model could be improved by considering friction/viscosity of motion of the particles/agents, etc., etc., but the base insight that segregation is a "natural" phenomenon, not a "racist" one is of interest.
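For readers who want to play with the idea, here is a minimal Python sketch of a two-type lattice model of this kind (usually called a Schelling-style model; that label is mine, not the article's). The grid size, vacancy rate, move rule, and the `preference` threshold are all arbitrary choices for illustration, and it leaves out friction and everything else a serious model would include.

```python
import random

def like_fraction(grid, r, c):
    """Share of occupied neighbours (8 cells, wrap-around) matching cell (r, c).

    Only meaningful for occupied cells; 0 marks an empty cell.
    """
    n = len(grid)
    same = other = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            v = grid[(r + dr) % n][(c + dc) % n]
            if v == grid[r][c]:
                same += 1
            elif v != 0:
                other += 1
    return same / (same + other) if same + other else 1.0

def mean_like_fraction(grid):
    """Average like-neighbour share over all agents: a crude segregation index."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n) if grid[r][c]]
    return sum(like_fraction(grid, r, c) for r, c in cells) / len(cells)

def step(grid, preference, rng):
    """Move each agent that was unhappy at the start of the step to a random
    empty cell. Returns how many agents were unhappy."""
    n = len(grid)
    unhappy = [(r, c) for r in range(n) for c in range(n)
               if grid[r][c] and like_fraction(grid, r, c) < preference]
    empties = [(r, c) for r in range(n) for c in range(n) if not grid[r][c]]
    rng.shuffle(unhappy)
    for r, c in unhappy:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], 0
        empties[i] = (r, c)  # the vacated cell becomes available
    return len(unhappy)

def simulate(n=20, empty=0.1, preference=0.3, steps=50, seed=1):
    """Random start, then repeated steps; returns (index before, index after, grid)."""
    rng = random.Random(seed)
    cells = [0] * int(n * n * empty)
    rest = n * n - len(cells)
    cells += [1] * (rest // 2) + [2] * (rest - rest // 2)
    rng.shuffle(cells)
    grid = [cells[i * n:(i + 1) * n] for i in range(n)]
    before = mean_like_fraction(grid)
    for _ in range(steps):
        if step(grid, preference, rng) == 0:
            break
    return before, mean_like_fraction(grid), grid
```

Calling `simulate()` with different `preference` values and comparing the before and after index is a quick way to explore how little preference it takes for the two types to separate.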
-XDG
Predicting the Unpredictable
The article mentions, among other things, the work of BiosGroup. From their website:
-XDG
If you found this article interesting, their book is a great exposition of their early work with emergent behaviors. You can find it at Amazon here:
Growing Artificial Societies
There is a similar article on complexity and emergent behavior in the latest Harvard Business Review.
-XDG
"US prepares to invade your hard drive"
a letter to FL Sen Bill Nelson
All easily found courtesy of google (probably the better place to ask this question, anyway.)
-XDG
The article boils down more or less to the following:
1. "Old" search technologies (Altavista, Yahoo) failed because they used approaches that found words but not content (Altavista) or relied on non-scalable human editorial judgement (Yahoo).
2. Google works (and is cool) because it uses available information about the number of links to determine (a) valuable content and (b) smart judges of other valuable content
3. The government efforts at creating the Panopticon will fail because they'll be stuck using "old" keyword approaches that can't pick out real content.
This argument is flawed in two key ways:
1. The author confuses the nature of the "search". Web searching is about finding *content* and the challenge is differentiating "good" content from "bad" content. Governmental "security" searching is more akin to traffic analysis and the goal is identifying dangerous *individuals* based on the content and pattern of their traffic. The challenge there is differentiating "good" (safe) speakers from "bad" (dangerous) speakers.
2. The author assumes (based apparently simply on opinion and what is popularly reported in the press) that the government will blindly apply "alta-vista style" techniques. His lack of fear of the Panopticon is based on an assumption of incompetence in the application of surveillance methods. Given the motivation and resources (both of which the government now has in spades), there is no reason to believe that more sophisticated and effective techniques will not be developed and pursued. Assuming Echelon has really been in operation, it's hard to imagine that, in the closed halls of the NSA, researchers aren't well aware of the limitations of keyword search and are far along applying cryptanalytical techniques to the real problem identified above.
It would seem that the author is trying to take advantage of hype and concern about government surveillance not to make a serious comment about it or whether one should truly be concerned, but rather to get an audience for his opinion that Google is really cool, which most of us already knew anyway.
-XDG
If this happened again in another couple of months or even in the next year, I think there would be cause to complain, but balancing their desire for e-mailing you against your desire not to get it, this seems like a reasonable approach. They want to mail you, so they warn you they'll start; if it's important enough to you, you set your preferences to stop it again.
It's fun to bash, but remember all the companies that change policies or release data without giving any warning. Consider this one a victory for corporate responsibility.
XDG
Separately, I've had good results with Speakeasy.net as a provider. Their website for subscribers lets you track the progress through the red tape and lets you see the actual logs that Covad maintains on the progress of the install.
XDG
These are the magic words that get things done when coordination difficulties arise with DSL providers and your RBOC.
A vendor meet is where technicians from the DSL provider (in this case, Covad) and the Bell company get together in the same place at the same time to verify that service exists (or, more usually, doesn't exist).
If you ever have trouble where Bell claims something is either installed or fixed and it actually isn't, push your DSL provider to force a vendor meet. This usually gets Bell off their butts and gets things fixed.
This is of more use during the install phase -- the original poster probably needs to do at least one round of letting the system work on things before pushing. Also, don't fear to contact Covad directly (1-800-GO-COVAD) and deal with them if it's actually a problem with the line. If you're just having IP/configuration trouble, you need to deal with your ISP, but if you actually, physically, don't have service, my experience has been that Covad is willing to talk to you and try to move things along.
Good luck,
XDG
There! Isn't that clearer? ;-)
To the original point, it is a sufficiently broad statement that the size of the OS code is irrelevant.
-XDG
This is a classic case of the ACLU and some hyper-active first-amendment activists blowing things out of proportion and slanting the facts to suit their purposes.
I actually went to the FEC's web site and citizen's guide (http://www.fec.gov/pages/citnlist.htm) for some information before posting this reply and learned some interesting things.
First, volunteering does not make someone a PAC, as some people have immediately started yammering on about. From the site:
Further, what the article is talking about when you personally make a web page about a campaign is called "Independent Expenditures" -- meaning that you are doing it as an individual and independently, not linked in with some candidate campaign. Again, from the site:
There are a couple of relevant caveats in that. First, you have to say that you are independent. Second, if you spend over $250, you have to file a form. This DOES NOT mean your free speech is being restricted. All it means is that the government is requiring you to register how much you spent on your speech. Why? So that political campaigns can't get around federal law by pretending to have lots of independent contributions.
You can download the form from the web site. It's about a page long. Name, address, how much you spent. Not much more than that.
Finally, I personally think it would be hard to say that a page on your website with a political message should be "calculated" as the cost of the machine, web-space, etc., as the marginal cost of adding a page to an existing site is essentially zero. If you had a dedicated machine, they'd have a better case.
In any case, people should go looking for the facts (since they're in plain sight) before overreacting to whatever FUD people want to use /. to spread.
-XDG
People seem to want it both ways. First, this is a great test of "collective thinking" against the world champion, and then second, they get upset because the Krush/Kasparov duel got interrupted for technical reasons and they were forced to think for themselves.
And the suggestion that Kasparov might cheat is ludicrous.
As a separate aside, on the topic of whether this game "proves" that Krush and several grandmasters and lots of computer time can produce moves at Kasparov's level, I'll quote analyst commentary from move 3 about Kasparov's choice of move:
So, yes, Krush and "The World" can rival Kasparov... as long as he isn't trying his hardest.
-XDG
I'm surprised the average /. reader finds this worthy of debate.
(On the other hand, the moral issues he raised as examples are certainly worthy of debate both on /. and elsewhere)
-XDG
The more disturbing bit is that it spoofs my e-mail filters by looking like it's from someone I'll read mail from.