Canadian Court Finds Website Scraping Infringes Copyright
First time accepted submitter wrecked writes "A trial judgment from British Columbia, Canada, found that Zoocasa, a real estate search site operated by Rogers Communications, breached copyright by scraping real estate listings and photos from Century 21 Canada. The decision thoroughly reviews the issues of website scraping, Terms of Use, 'Shrink Wrap' and 'Click Wrap' Agreements, robots.txt files, and copyright implications of hyperlinking. For American readers used to multi-million dollar damages, the court here awarded $1,000 (one thousand dollars) for breach of the Century 21 website's Terms of Use, and statutory copyright damages totalling $32,000 ($250 per infringing real estate photo). More analysis at Michael Geist's blog, and the Globe & Mail."
Those crappy robot sites that like to take my comments on other web sites and message boards and repost them willy-nilly all over the place in hopes of attracting ad revenue: are those affected by this ruling?
Not like I'm going to file in a Canadian court, but I do find it annoying to have comments showing up all over Google on garbage sites that only exist for a short time and that I've never heard of.
The preceding post was not a Slashvertisement.
I wonder what they were smoking, thinking that was legal to do in the first place?
I know if I was an R.E. agent & somebody scraped all the pictures and other work I had gone to in an attempt to sell a property that I was asked and contracted to sell, I would be yelling copyright violations in court. And if the property was sold, I would normally have a contract that said I got my commission if it was sold within the duration of my contract.
They walk among us, and breed too!
--
Cheers, Gene.
Seems to me that would depend entirely on what they do with the results. If the answer is 'nothing that the web sees again' then no harm, no foul. On the other hand, things quickly gray up if the info is re-posted in an altered form. The guy who scraped Craigslist and made it easier to use comes to mind. Not sure if this qualifies as 'ruining the quality' or not, but the Craigslist lawyers were quick to react. If the guy doesn't cave (or hasn't caved), it will be interesting to see what happens.
They're a real estate broker. They make money when the property is sold; displaying ads is only a means to selling the property. Why do they object when people are copying and re-publishing their ads?
DUH!
Umm, no. GP is correct, though the United States is the 3rd most populous country, behind India and China.
Gone!
*facepalm. 300,000,000/7,000,000,000 = .0429 or about 4.3%. 23% of 7 billion is about 1.6 billion, idiot.
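For anyone who wants to check the figures, here is a quick sketch in Python. The 300 million and 7 billion are the round numbers used in this thread, not census data, and the function names are made up for illustration.

```python
# Round figures as used in the thread (assumptions, not official statistics).
US_POPULATION = 300_000_000
WORLD_POPULATION = 7_000_000_000


def share_of_world(population: int, world: int = WORLD_POPULATION) -> float:
    """Return population as a fraction of the world total."""
    return population / world


def people_at_percent(percent: int, world: int = WORLD_POPULATION) -> int:
    """Return how many people a whole-number percentage of the world is."""
    return percent * world // 100


if __name__ == "__main__":
    print(f"US share of world population: {share_of_world(US_POPULATION):.1%}")
    print(f"23% of the world: {people_at_percent(23):,} people")
```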
No, it isn't. You can't copyright data, especially data that's derived from actual events. Theoretically they could copyright the presentation, but Mint and services like that are there to display the data in a different way than just displaying all the other sites.
It isn't the lawyers, it is silly juries awarding the outrageous amounts, plus your law does not cap punitive awards. In Canada the punitive part of the award cannot exceed $300,000. The rest of the award is to "make you whole" that is, undo the damage done by the party who lost. So the most you can get is what you lost, future losses, and $300,000 in cases of extreme suffering.
Anarchists never rule
You can't copyright data, but the presentation, as you call it, includes the way the text is written. In other words, as long as you tell it your own way without copy-pasting the source you used you are in the clear. I doubt these scraping sites do any meaningful rewriting of their sources, though. And how do you rewrite a picture, by the way, without the data it comes from?
How is this different than Google News? When sites bitch about Google News, I believe Slashdotters call them greedy and spout "fair use"...
If you want news from today, you have to come back tomorrow.
Century 21? What is Gerry Anderson doing selling properties in Australia?
Yes, you are definitely not a lawyer since you, as with most Slashtards, misunderstand what you are talking about. Yes, the facts themselves cannot be copyrighted, but the expression of those facts can be copyrighted. Which is why I can take the info from the phone book and publish it myself, but I can't take someone else's phonebook, copy all the pages and then republish it as my own work, because that expression of those facts is copyrighted.
Why not treat them like an end user and shut off internet access for the "user"?
I would like to see "Rogers Communications", the runner of the site, lose access to the internet.
Somehow I don't think the court would do that to one of the top ISPs in Canada.
Yet we are likely to see the courts cut off internet access for small end users. Why are the laws not equal? Corporation got people status, but it seems corporations are way above people status when it comes to some laws. That is not fair. And if it would be unfair to cut off access for a corporation then it should be unfair to cut off access for a person.
I agree with this sentiment. A week's internet black hole for Rogers Comm. would send a message their greedy board members will never forget.
--
Cheers, Gene
Zaktly. I was going to point out that same obvious fact. Don't twist law to suit your need for a solution to a problem. That spells abuse of law for you and everyone. Create new law if needed, as new law can be more easily ruled against or stricken down as needed.
You can definitely copyright a human-written description of something accompanied by a certain picture. In fact, the picture itself and a human-written description itself can be copyrighted.
As long as there is some creative element in the description. There are always some creative elements in regards to a photographer's choice of how to take a certain picture of a building/property.
Lawyers and Judges live to "twist"/ apply existing laws to new types of situations as they arise.
To suggest there will be a new law governing every possible new technology/practice is unrealistic.
I think that Google should just blackhole Canada if this ruling stands.
See, back a while ago, the newspapers in Belgium sued Google for copyright infringement, and Google was told by the court to take down the content or face a big fine, per day.
So they took it down.
Suddenly the Belgian newspapers were screaming bloody murder because they weren't getting hits.
Go ahead, "content creators", kill indexing and searching. Bring it back to the old days of no search engines. I dare you.
--
BMO
Not really, only the subscribers to Rogers would suffer, as they'd use it as an excuse to charge a "Government blocking fee" and increase their profits for that month.
>third sentence.
I cannot into English.
Century 21 usually has exclusive listing contracts, which means that no one else can sell the house for commission without getting sued out of their commission. I don't see the profit for the scraping company.
I do not fail; I succeed at finding out what does not work.
and often 10+ seconds to get the damned preview back from /.
The ten-second delay is Slashdot port-scanning your IP address for common open proxies and waiting for connections to time out. You get it on a given IP only once every 24 hours. I've been told that if you set your firewall to refuse the connections instead of letting them time out, the scan will run faster.
Seems WAY more reasonable than US copyright court.
Then again, with Google being accused of scraping Yelp! reviews etc., they might have a lot to lose.
I vote based on politicians' actions, unless contrary to my preconceptions. Often wrong, never uncertain. #iamthe99%
Google does not scrape the entirety of a site and put it up on their scraped search... they take a small blurb for you to read, and link to the original site. They do cache sites, but you have to look for the cache link, and most people aren't even aware that they exist. Very different animal.
I suspect that Google isn't in any danger here, because rather than taking business away from the sites they index, they are actually driving more business *to* those sites.
Last time I was looking for a place to rent, I felt tempted to do something like this. There are several rental websites out there, and you are lucky if their listings overlap... Comparing locations, prices, what was being offered, etc., was a pain. Some sites would at least let you right-click and open in a new tab, but others wouldn't. The maps, sometimes they were Google's, sometimes they were Bing's, and they would never let you overlay the public transit lines on top... And instead of letting you choose a location and radius on the map, some would ask you for the postal code!
So, I toyed with the idea of scraping those sites and building my own database, and building a website from it for others like me (probably sticking some Google ads on it to try to pay for the hosting). I guess it is a good thing I didn't. And it is a shame to know that I can't (and no one else can either). I don't like Rogers and part of me is smiling about this ruling... but if this means that there will be no "Google News"-like service for rent hunting, this is another case of copyright preventing useful services from coming to life.
Which is why I can take the info from the phone book and publish it myself, but I can't take someone else's phonebook, copy all the pages and then republish it as my own work, because that expression of those facts is copyrighted.
I don't understand the difference between those scenarios. The final result is the same - both contain the same information, in the same order. I just don't see where to draw the line between "copying the facts in the phone book" and "copying the phone book". At the font and page design, perhaps? That seems awfully narrow...
If anything, in this case, I suspect that the "expression" of those facts is very different - the same set of facts (real estate listings) in a different website design, probably without any inherent order and hopefully mixed with "facts" from other sources. I know, IANAL, I'm just trying to figure out why, if what can be copyrighted is the expression and not the facts, getting a final result almost exactly like the original work (phone book case) is more acceptable than aggregating several sources and presenting them in a new design (this case).
" the court here awarded $1,000 (one thousand dollars) for breach of the Century 21 website's Terms of Use"
We need an "Internet Terms of Use": "Anything on the internet that was meant to be accessible by the public is automatically public domain."
Your post shows just how badly misunderstood copyright law is nowadays. The mere fact that someone puts in a lot of effort to make a web page does not in and of itself mean that that web page is protected by copyright. Many things which require hard work do not qualify for protection under copyright law. Two examples which immediately come to mind are facts (like telephone book listings) or creative works which are considered utilitarian (like clothing/fashion designs, or individual cookbook recipes).
I haven't reviewed the facts of this case, so I don't know whether I would believe copyright was infringed (and what I believe makes no difference, what the court believed is what determines that). And I might well believe copyright wasn't infringed but the copying was immoral (rather than illegal). But I do know that the underlying message of your post: "if my work was copied I would be pissed off so it must be illegal", is just adding to the general confusion over an already complex and contorted branch of the law.
The images of the houses are definitely copyrightable.
The Tao of math: The numbers you can count are not the real numbers.
From what I understood from TFA, Zoocasa didn't comply with the robots.txt standard, and that was a point considered in the ruling. Since Google does comply with the robots.txt standard, I don't think this ruling can be directly transferred to Google.
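For readers wondering what "complying with robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the bot name and the URLs are all hypothetical; none of it is taken from the ruling or from Century 21's actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a listings site (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /listings/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler runs this check before every request.
    for url in ("https://example.com/listings/123", "https://example.com/about"):
        allowed = may_fetch(ROBOTS_TXT, "ExampleBot", url)
        print(url, "->", "fetch" if allowed else "skip")
```

A compliant crawler simply skips any URL for which the check returns False; the file is a request from the publisher, not a technical barrier.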
The Tao of math: The numbers you can count are not the real numbers.
And would not those subscribers/customers who are doing legit business over the web, and who might suffer many thousands in damages from the disconnect, have standing to sue Rogers for their loss of connectivity when it is not their fault, but Rogers' actions that got them disconnected?
:)
It seems to me that the sauce used for the goose should work equally well for the gander.
--
Cheers, Gene.
The decision was quite specific on this. The issue was not about the facts such as address, number of bedrooms, floor space, etc. It stated quite clearly that those were not copyrightable. The issue was the description of the property, which is impressions rendered in prose, and the pictures. Both of these are copyrightable.
As I understand Feist v. Rural (ianal) you could sell your own copy of the list, although reformatting it (so as not to violate copyright on a creative layout) might be a good idea. You couldn't sell a copy of the cover, prefaces, or end material.
The "hard work" of collecting facts is irrelevant to copyright law. This is probably one of the most common misconceptions of copyright law, that it exists to protect "hard work". The only thing that is relevant is the extent to which creative expression is involved.
On the other hand, this:
brings up an interesting point, since arguably fake entries are creative. On the other hand, they are also fraudulent if the phone company represents the list as factual. So maybe a suit/countersuit for copyright/fraud would cancel each other? :)
That's not true in many countries other than the US, such as in the EU, where there is a database right. Maybe not relevant for Canadian property adverts, unless they have a database right too, but perhaps for a whole lot else given that this is the Internet and you could be sued anywhere. Scraping a whole database full of property ads might well be illegal here even if you rewrite the text and get a five-year-old to draw artist's impressions of the houses. You could take one or two, but not the whole thing.
That's not true in many countries other than the US
Yes, and we were talking about the US hence why the title of this thread is "Not copyrightable in the US" (emphasis added).
I don't understand the difference between those scenarios.
What is so hard to understand? Facts themselves are not copyrightable but the work that presents those facts is copyrightable. Hence why you can copy all the names and numbers from a phone book but you can't copy the pages exactly and pass that off as your own because that is an expression of the facts which can be copyrighted.
Did you read the rest of my post? In the sentence right after that one I said why I didn't see a difference: because the end result would be the same, a phone book with exactly the same contents as the original. And then I went on to ask why copying factual information from one website and presenting it in another with a completely new design was worse than copying the data from a phone book and publishing it. The web scraping scenario even has a creative component missing from the phonebook-entry-copying scenario. Btw, I was not being antagonistic.
If anyone was copying your website 1:1, that is assuming you are competent enough to have one, and just replaced your name with his, you would be up screaming "he stole my website".
Copying a little from many makes a doctoral thesis or a fiction book; copying from one without attribution is plagiarism.
Also, the robots.txt is an established means to indicate that you do not want your content to be copied completely.
Weakening the status of the robots.txt is stupid, exactly because it allows publishers who are too stupid to set up a robots.txt, to claim that someone violated their "Terms of use". If you took the time to scan the court decision, you would find that the defendant willingly violated the standards surrounding the robots.txt file, in particular, it failed to provide a robot name.
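"Providing a robot name" normally means sending an identifying User-Agent header, so site operators can see who is crawling and address rules to that name in robots.txt. A minimal sketch with Python's standard library follows; the bot name and URLs are hypothetical, and nothing here reflects how the defendant's crawler actually worked.

```python
from urllib.request import Request

# Hypothetical crawler identity (assumption for illustration only).
BOT_NAME = "ExampleListingsBot"
BOT_USER_AGENT = f"{BOT_NAME}/1.0 (+https://example.com/bot-info)"


def build_request(url: str) -> Request:
    """Build a request that announces the crawler's name to the server."""
    return Request(url, headers={"User-Agent": BOT_USER_AGENT})


if __name__ == "__main__":
    request = build_request("https://example.com/listings/123")
    # urllib stores header names capitalized as "User-agent".
    print(request.get_header("User-agent"))
```

The request is only constructed here, not sent; a real crawler would pass it to `urllib.request.urlopen` after checking robots.txt for rules that apply to its name.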
Hey don't blame me, IANAB