JimDabell · Slashdot Mirror

Re:Security Diversion on Google Desktop Search Under Fire · 2004-10-22 00:09 · Score: 1

actually the problem is that proxies can store whatever they want, there is no 'rules' for them to obey.

The 'rules' are described in RFC 2616, the HTTP 1.1 specification.

whatever meta-tags you write in html-header, most proxies cache pages and images anyway.

Like I said, <meta> elements are unreliable as proxies don't usually parse HTML. They virtually always pay attention to HTTP headers though.

Re:Security Diversion on Google Desktop Search Under Fire · 2004-10-21 09:09 · Score: 4, Informative

I've looked at the html for secure pages before and some used some kind of "nocache" tag or something like this.

If it's in the HTML, you are talking about <meta> elements, and they are an unreliable substitution for proper HTTP headers.

More importantly though, the no-cache directive still permits clients and proxies to store a copy of the resource in their cache, so long as the copy is revalidated before being used again. The directive that should be used for sensitive data is no-store.
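
As a rough sketch of the difference, assuming a hypothetical handler built on Python's standard http.server (the names cache_headers and SecureHandler and the response body are made up for illustration), the directive goes in the HTTP response header rather than in a <meta> element:

```python
from http.server import BaseHTTPRequestHandler


def cache_headers(sensitive):
    """Return the Cache-Control header to send (hypothetical helper)."""
    # no-store: caches must not keep a copy of the response at all.
    # no-cache: caches may keep a copy but must revalidate it before reuse.
    value = "no-store" if sensitive else "no-cache"
    return [("Cache-Control", value)]


class SecureHandler(BaseHTTPRequestHandler):
    """Serves one sensitive page with the header set on the response."""

    def do_GET(self):
        body = b"account details\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        for name, value in cache_headers(sensitive=True):
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, pass SecureHandler to http.server.HTTPServer and inspect the response headers with any HTTP client.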

Great interview... on Neal Stephenson Responds With Wit and Humor · 2004-10-20 04:37 · Score: 5, Funny

...but I hated the last answer.

Re:You're almost there... on Pretty Printing From An XML File? · 2004-10-18 04:53 · Score: 1

Yes. This still doesn't make HTML 4.01 a standard though; ISO-HTML is not identical to HTML 4.01 - it is a stricter specification. While ISO-HTML documents can also be HTML 4.01 documents, the inverse is not true.

Re:Good on UK Record Industry Sues 'Major Filesharers' · 2004-10-07 12:22 · Score: 1

Filesharers have no such agreement. Therefore they are in violation of copyright law.

The legality or otherwise of it cannot be a justification in itself, otherwise you end up with the circular logic of "it's illegal because it's wrong and it's wrong because it's illegal".

As everybody says when topics like this come up, the justification for copyright laws (growth of the public domain) has been eroded by corrupt politicians. So when somebody says "why not?", they aren't asking "are you sure it's illegal?", they are saying "can you justify copyright law?"

Re:Possible on MPAA Blames Linux Australia Notice on Human Error · 2004-10-06 20:50 · Score: 2, Insightful

It doesn't take a lot of time to write a robot that finds files with a certain name. I think that the most likely scenario is that they do have a bot that checks filenames, but the output would be so full of false positives that human filtering is almost certainly required. In that case, the human error would be sending out 101 notices from a list of 10,000 files when they should have only sent out 100.

Naturally, if the people are being paid for their throughput and not their accuracy, they are simply going to load up the linking page and see if it looks like "a nasty pirate site". An FTP site containing tarballs would most likely look the same as an FTP site containing MPEGs to the untrained eye.

What I don't understand is why their bot doesn't narrow things down using things like file size before it reaches the people involved. That alone would cut down the workload and reduce the false positives.
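
A minimal sketch of that kind of pre-filter, assuming the bot already records a size for each filename hit. The Candidate type, the 300 MB threshold and the example URLs are all assumptions for illustration, not anything known about their actual bot:

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    url: str
    size_bytes: int


# Assumed threshold: anything much smaller than this is unlikely to be a film.
MIN_FILM_BYTES = 300 * 1024 * 1024


def plausible_matches(candidates, min_bytes=MIN_FILM_BYTES):
    """Drop filename hits that are too small before a human sees them."""
    return [c for c in candidates if c.size_bytes >= min_bytes]


hits = [
    Candidate("ftp://ftp.example.org/pub/grind-1.0.tar.gz", 2_500_000),
    Candidate("ftp://ftp.example.net/movies/grind.mpg", 700_000_000),
]

for candidate in plausible_matches(hits):
    print(candidate.url)
```

Here the small tarball never reaches the human reviewers, even though its filename matches.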

Re:No, there's something there on Tim Berners-Lee and the Semantic Web · 2004-09-28 08:29 · Score: 1

So, if one RDF feed from MSN says that I have a 'cheat and scam' property

I don't think I was quite getting your point before. This is the mistake I think you are making now: if you cheat and scam in a game, then the "cheating and scam" property should be describing your relationship with online gaming. Attaching it as a direct property of an individual is a mistake. The two properties are fundamentally different in nature; it's only the ambiguity of English that leads you to equate them. Clearly "cheat" in a gaming context is a different matter to "cheat" in a financial context.

With your example, yes, if an online gaming service is attaching that property to individuals (in my opinion this is broken - bad policy and thus the source would not be trustable), and if an online payment processor is trusting this property to be valid in describing your attitude to paying bills (arguably correct), then it would result in the situation you describe.

On the other hand, if the online gaming service merely describes you as cheating at online games (this is what I expect), or if payment processors don't trust online gaming services to provide accurate information in this respect, then there is no problem and everything happens as I describe.

Obviously the difference lies not in the technology per se, but in the policies that define how it is applied. My contention is that policies that allow these sorts of situations to arise are fundamentally broken and cannot reach widespread use because they will result in massive amounts of mistakes. However, I do not think that because those types of policies are broken, the fundamental technology is broken or that good policies don't exist. As I see it, a policy that limits organisations to making assertions only within areas they are qualified to judge solves the problem you are describing.
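
To make the policy concrete, here is a small sketch. The QUALIFIED table, the source names and the dictionary layout are hypothetical; real RDF tooling would look quite different. The point is only that an assertion is ignored unless its source is qualified to judge that topic:

```python
# Hypothetical policy table: which topics each source is trusted to judge.
QUALIFIED = {
    "gaming.example": {"online-gaming"},
    "bank.example": {"payments"},
}


def trusted_assertions(assertions, topic):
    """Keep assertions about `topic` made by sources qualified for it."""
    return [
        a for a in assertions
        if a["topic"] == topic and topic in QUALIFIED.get(a["source"], set())
    ]


assertions = [
    {"source": "gaming.example", "subject": "account 24601",
     "topic": "online-gaming", "claim": "cheats"},
    {"source": "gaming.example", "subject": "account 24601",
     "topic": "payments", "claim": "cheats"},
]

# The gaming service's opinion about payments is simply not trusted.
print(trusted_assertions(assertions, "payments"))
```

A payment processor applying this policy sees nothing from the gaming service, while a gaming application still sees the one assertion that the service is qualified to make.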

That is to say, you are describing a bottom-up approach where every feed is already known and properly documented before you unleash the beast.

Yes and no. For things like you describe, I'd certainly expect user-agents to use a finite list of data sources or have some mechanism for deciding how trustworthy a certain source is. For example, a book finder might base trust of reviewers on how closely their rankings match your own past rankings.

On the other hand, things like Google would want to make all information available no matter if it is untrusted, but be able to categorise it or only trust limited information.

Except, that isn't what is being envisioned. The vision is that through RDF feeds across the entire web a 'total' awareness picture can be made of anything.

Maybe that's the eventual goal, but I don't think anybody is claiming that it's a simple case of connecting the dots. I see people working towards solving small, tightly defined problems before taking the next step. In many ways "the Semantic Web" is a term like "artificial intelligence" - as soon as people take the next step and start building distributed databases of knowledge using RDF etc, it will just be considered simple stuff and the real Semantic Web is the stage after that. There's no need to jump from present-day to the eventual goal immediately; that's impossible. Right now people are working on, e.g. distributed Friendsters where they just tie together people using FOAF files etc, so you can query "who do I need to talk to in order to get introduced to [x] person?" etc.
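
The introduction query is a plain shortest-path search once the "knows" links have been collected. A sketch, assuming the FOAF files have already been fetched and flattened into a dictionary (the KNOWS data and the function name are made up for illustration):

```python
from collections import deque

# Hypothetical "knows" links, as if gathered from a handful of FOAF files.
KNOWS = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["carol", "dave"],
    "carol": ["erin"],
    "dave": [],
    "erin": [],
}


def introduction_chain(knows, start, target):
    """Breadth-first search; returns the shortest chain of people, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for friend in knows.get(path[-1], []):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    return None


print(introduction_chain(KNOWS, "me", "erin"))
# -> ['me', 'alice', 'carol', 'erin']
```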

What I'm saying is I think that this is infinitely more complex than you think, although I'm certainly willing to concede that some specialized implementations may be slightly less complex than I am suggesting.

No, that's pretty much my view on it, except I think there's plenty of useful things to be done with the "specialised implement

Re:NOTHING but an open standard. on FTC Wants Comments on Email Authentication · 2004-09-28 05:50 · Score: 2, Insightful

an open standard (i.e., a standard with specifications that are public).

In my mind, an "open standard" isn't just one anybody can read, but one that is open to anybody implementing it - which means patent-free. It's no good everybody being able to read the specifications if nobody is allowed to do anything with them.

Re:No, there's something there on Tim Berners-Lee and the Semantic Web · 2004-09-28 04:10 · Score: 1

Your faith in computational logic is astounding.

Not really. Why would anybody go to the trouble of writing extra code to link together disparate properties in an unsafe way that will almost certainly break things? And if anybody did, what's the chance anybody would use such broken software?

the 'scams and cheats' property assertion of an online gamer against my account number is, by definition, a semantic inference.

If anything in your example is a semantic inference, it is when you infer that a cheater in a game is a cheater in financial transactions. And it's an inference that has no basis in logic and cannot be reached by an inference engine that is using the data you have described.

Basically that is the point. It is broken beyond usable functionality. It cannot make the conclusions advertised.

No, you've described a conclusion that shouldn't be reached, and I am pointing out that it won't. That doesn't mean that there aren't useful conclusions that can be reached.

Re:No, RIAA is a pimp on The Perfect Online Music Store? · 2004-09-28 02:00 · Score: 1

Songwriters should be compensated for the song you enjoy hearing, no?

Only to the point where it allows them to make a living from songwriting. Letting them control the scarcity of the songs perpetually actually reduces the amount of works coming into the public domain, as successful songwriters have less pressure on them to continue to work; they can simply sit back and watch money come in for work they have already done.

Re:No, there's something there on Tim Berners-Lee and the Semantic Web · 2004-09-27 06:54 · Score: 1

The semantic web has no way of telling what's relevant to me in a given situation.

Yes, it does. To take your example, the jump in logic you are making that the Semantic Web doesn't is assuming that the property "cheats and scams" attached to the relationship between "account 24601" and "online gaming" is identical to the property "cheats and scams" that might be attached to the relationship between "account 15931" and "online banking". That's an unjustified leap of logic that only software that is broken to the point of being useless would make.

Re:The rest of us call this... on Tim Berners-Lee and the Semantic Web · 2004-09-27 06:37 · Score: 3, Interesting

Google's a hack. No, really, it tries to extract meaning from web pages that really aren't engineered to store that kind of information.

Google is also an application. The Semantic Web is all about building the infrastructure so applications like Google don't have to chase the holy grail of AI to become more than a hack. Think of the Semantic Web as the layer underneath Google.

Re:You don't want a "single" web... on Tim Berners-Lee and the Semantic Web · 2004-09-27 06:16 · Score: 3, Insightful

Remember when you couldn't get a virus just by reading an e-mail?

Yes, and again, the problem is when the stuff that executes has a monoculture. It's not like you see Pine or KMail users infected by emails carrying Outlook viruses.

Re:You don't want a "single" web... on Tim Berners-Lee and the Semantic Web · 2004-09-27 06:03 · Score: 3, Insightful

This is to insure against a monoculture that is so disastrous in computer circles as demonstrated by the numerous security failings of Windows...

Windows executes stuff. The semantic web is just data. Your warnings about a monoculture apply to the semantic web about as much as they apply to text files.

Re:Irony of life, DNG can be lossy too on Adobe Releasing New Photo Format · 2004-09-27 04:46 · Score: 2, Informative

So in other words, the Slashdot writeup that stated this was a new format that was better than JPEG was completely incorrect, and in actual fact this is simply a container format that uses existing JPEG algorithms? Sounds about usual for Slashdot these days.

So where is the (more) efficient method?

According to the JPEG FAQ, PNG is more efficient than lossless JPEG for most images. Unfortunately, this specification doesn't allow for that; as far as I can tell this has little to do with picture quality and more to do with metadata and interoperability.

Re:Why? on Adobe Releasing New Photo Format · 2004-09-27 02:44 · Score: 4, Informative

standardized JPEG, which professionals don't want to use because it's lossy.

The JPEG standard includes a lossless option too; professionals don't want to use JPEG because lossless JPEG is inefficient, not because it doesn't exist.

Re:Slightly OT-Malicious spam opt-outs and MYPOINT on Spam Opt-out Link Triggers Malicious Code Attack · 2004-09-22 06:38 · Score: 1

It's not an email standard; according to the RFCs, everything before the @ is called the "local part" and is interpreted in a system-specific manner once it arrives on the server. However, the use of +suffix or -suffix is quite common and Gmail supports it - so if you send an email to the address you mentioned, it would appear in the same mail account as richardjharris@... but with a different destination address, so you could filter on it or simply find out which address spam was sent to after the fact.
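
For filtering after the fact, the suffix is easy to pull back out of the destination address. A sketch, assuming the "+" convention. The function name and the example.com address are made up, and since the local part is system-specific, a given server is free to interpret it differently:

```python
def split_subaddress(address, separator="+"):
    """Split 'user+tag@domain' into (user, tag, domain); tag may be None."""
    local, _, domain = address.rpartition("@")
    user, sep, tag = local.partition(separator)
    return user, (tag if sep else None), domain


print(split_subaddress("username+mypoints@example.com"))
# -> ('username', 'mypoints', 'example.com')
```

Passing separator="-" covers the -suffix convention in the same way.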

Re:Slightly OT-Malicious spam opt-outs and MYPOINT on Spam Opt-out Link Triggers Malicious Code Attack · 2004-09-22 05:04 · Score: 1

Next time, give them an email address of username+mypoints@gmail.com. That way, if spam comes in, you'll be able to tell whether mypoints sent it or sold the address to spammers.

Re:Google's Reply on Does Google Censor Chinese News? · 2004-09-21 21:37 · Score: 2, Insightful

Because their news and search offerings are very different. Their search results come from a vast database of every document indexed, weighted by keywords and other factors. Their news results come from a small list of pre-approved news sources. Having to determine which documents are available to the Chinese out of the billions they index on an ongoing basis is a completely different matter to determining which of their hundreds of relatively static news sources are unavailable to the Chinese. Filtering their news based on location blocking is feasible; filtering their search results based on location blocking is not.

Re:Skewed results? on Amazon's A9.com Search Engine Goes Live · 2004-09-19 04:58 · Score: 1

I wonder if A9 will automagically find sites that have amazon links and rank them higher?

You might just as easily wonder if Google automagically finds sites that have Google Adsense ads and ranks them higher.

If A9 does this, and it is less useful as a result, people aren't going to switch to it.

Re:Only good news, if it's really open on Solaris 10 to be Open Source · 2004-09-14 03:39 · Score: 1

If it's truly an open source license, this is only good news--Linux and/or the BSDs will be able to use the best bits.

Not necessarily. Just because a license can be classed as open source, it doesn't mean that it's compatible with the GPL or suitable for *BSD. For instance, it could be released under the GPL, which means the *BSDs can't use it, or it could be released under the BSD-with-advertising license, which means that Linux can't use it. Or it could be somewhere in between, meaning neither can use it.

It's not about the revenue on Does Microsoft Need China? · 2004-09-07 07:51 · Score: 4, Insightful

Microsoft has plenty of money; it's not going to run out any time soon.

The real issue is what China will do instead of using Microsoft software. They have to use something. That's an incredible amount of resources the Chinese government and businesses have that will go to Microsoft's competitors.

When the German government decided to shift its employees to Linux, they provided resources that greatly improved the KDE groupware infrastructure. Imagine what the whole of China could give us. Now see why it's important for Microsoft to dominate the Chinese market?

Re:Their site Validates. on Mozilla.org Relaunched · 2004-09-02 07:24 · Score: 1

XHTML is a lot easier to parse

Nope. True, you can use an existing XML parser to do it, but only if it's well formed (which you can't guarantee, as not all of the data may have arrived by the time you want to start parsing).

XML only requires you to throw a fatal error upon discovery of an error that makes a document malformed. It doesn't require you to ensure that a document is well-formed before beginning to render it.
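
You can see this with any incremental XML parser. A sketch using Python's xml.etree.ElementTree.XMLPullParser, where the chunks stand in for data arriving over the network and "rendering" is just collecting tag names (both are assumptions for illustration). Everything before the error is processed, and the fatal error only appears when the malformed part shows up:

```python
import xml.etree.ElementTree as ET


def render_incrementally(chunks):
    """Feed chunks as they arrive; return (rendered tags, error or None)."""
    parser = ET.XMLPullParser(events=("end",))
    rendered = []
    for chunk in chunks:
        parser.feed(chunk)
        try:
            # Handle every element completed so far, before later data exists.
            for _event, elem in parser.read_events():
                rendered.append(elem.tag)
        except ET.ParseError as err:
            # Malformed input: stop with a fatal error, keeping earlier output.
            return rendered, err
    return rendered, None


chunks = [
    "<html><body><p>first</p>",
    "<p>second</p>",
    "<p>broken</b></body></html>",  # mismatched end tag
]
done, error = render_incrementally(chunks)
print(done)               # the two well-formed paragraphs
print(error is not None)  # the third chunk triggers the fatal error
```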

Re:Their site Validates. on Mozilla.org Relaunched · 2004-09-02 07:21 · Score: 1

XHTML is a lot easier to parse, it is better defined and it is easier to render.

When it's actually being treated as XHTML, sure. But mozilla.org is served as text/html, so the only thing browsers will see is funny looking HTML, not XHTML.

HTML 4 is the bloated recommendation with the styling tags and all that crud.

XHTML 1.0 includes all that crud too. XHTML includes things like the <font> element type. If HTML 4 is bloated, then so is XHTML.

Re:I blame the Google Toolbar for a lot of this on Searching For Trouble With Google · 2004-09-01 00:56 · Score: 1

Because it wasn't linked from anywhere unless someone could guess the URL then no-one else wouldn't be able to find it.

Classic security through obscurity. Is it really wise to blame other people for this kind of screw-up?

The Google toolbar isn't the first, and won't be the last method of discovering unpublished URLs. For instance, web statistics packages are commonly available from http://www.example.com/stats and will list popular URLs. Or, if the "secret" resource is an HTML page, you'll be transmitting the URL in your Referer header when you click on any of the links (or even just by visiting the page if your browser is doing pre-caching, or if it contains inline images etc residing on an external server).

The bottom line is that if something is secret, you shouldn't give it to anybody who asks.

Comments · 849