Oh please. Those taxes were in place for years, and it didn't stop people getting rich.
There is only so long we can afford to spend without taxing. We need to gut the fat out of the government and the budget, but we also need to bring in some cash to pay off our outstanding debt. The people who can spare it are, frankly, the people who are still buying and selling stock, and collecting dividends.
That's crap. Government has only grown since Clinton, and it grew during Clinton, and during Bush I, and Reagan as well.
You want to argue public choice economics, fine, but don't play like it's one party's fault. And since the Republicans have been in charge for 22 of the last 30 years, they have a clear responsibility for the current size of the government.
I think Bush's 700 billion dollar bailout proves conclusively that no one is afraid of Marx anymore.
And the poor Republicans have been helpless victims for the last 8 years...They only controlled the legislature for the insignificant period between 1994 and 2006, so they clearly had no power to resist Clinton's evil ways.
Ah Clinton! Is there nothing we can't blame you for?
Looks pretty similar, numerically, to the poll Scott Adams commissioned.
For my money, I'd rather have the guy from the party that doesn't disdain education as "elitist"; economists may not be right all the time, but they're more right than the average Joe the Plumber. I'd rather someone who was more fiscally conservative, but since there is no (electable) fiscal conservative in the race, that doesn't matter.
That's the whole point. When they assigned us our block they gave us almost a hundred addresses. We use about 5.
If you design your network well, you hardly ever need more than a couple, and those are just for redundancy.
Who are you to say the Twitter guy knows nothing of scaling?
Every time someone mentions Twitter to me, I think of that comic.
Maybe I'm just too wordy, but if I can say it in 120 characters, whatever, it's probably too banal to be shared. Otherwise it's a piece of smart-assery, or an aphorism, and I don't really see a need for a forum dedicated to that part of my character.
Tell it to Netscape ;)
These things are fad driven; if Twitter doesn't get its act together, someone else will do it better.
And since Twitter is basically non-revenue generating, it's not like they're getting anything out of their early dominance except user goodwill.
Me and the Twitter guy have something in common: if we were great minds, we'd be out doing great things, not sitting around with the belief that our opinions matter.
I don't have his hubris, thinking that his laughable Twitter credentials put him in some sort of position where he is qualified to pontificate on the sad state of the internets, but I'm not so deluded as to think my sniping at his idiocy is in any way deep or meaningful.
Meh. I've got access to a block of addresses that is so hilariously larger than anything I'll ever need that I NAT some of my home servers through proxies at work for the static IP. If we reclaimed all the unused addresses, we could string out IPv4 for another decade or so.
Moving to IPv6 is one of those things that sounds like it's going to be soooooo easy, and has the potential to be hell on earth. Adoption is slow, but it's still happening. I see no reason to panic and try to force a quick transition when the only thing that will get us is chaos.
I got the impression that he was talking about divorcing the content from the presentation, which sounds fine in theory but a lot of people want to have more control of the presentation...That was kinda the point of HTML in the first place; we'd have stuck with Gopher if all we wanted was pure content with a static presentation.
Even in a modern context, we could have switched to XML to divorce the information from the presentation, and there hasn't really been a charge in that direction.
It's hard to say what he really meant because the whole thing is lacking in specifics.
Okay, so a guy who works for Twitter, a crash-prone, non-scaling application, says that the internet is "built wrong", where one of the examples of wrong is scaling. He goes on to list a few specific protocols that he thinks are good examples of "wrong", like IPv4 and SMTP, which won out against better-designed (but strangely unmentioned) alternatives because of wacky market stuff, which, again, is not described.
No one who knows anything about the Internet would say that it was perfect. It's not even close. There are a lot of places where unholy kludges exist and are perpetuated because it's a lot easier to live with them than it is to try and change everything that depends on them. Things like, for example, Twitter.
Sure there were alternatives, but they were all either patent-encumbered, or hard to deploy, or too complex to easily develop for. They died. It's called competition. TCP/IP and SMTP came out the other side, and grew into cornerstones of the largest network this world has ever known, in a shockingly short period of time. No, not perfect, but pretty damn good nonetheless.
It's very easy to sit back today and say, "Wow it could have been so much better!" But that is armchair crap at the best of times...I'd sneer if Vint Cerf said it. Coming from someone who demonstrably can't do better, and can't even be bothered to champion a specific alternative...That's as pointless and lacking in content as most of the crap that comes through his crappily coded service.
If they wanted their writings available for free, then why would they bother to publish in the first place?
Content creators deserve some rights to their works.
Nothing beats the perpetual search for...ahem...male enhancement.
The scientific pioneer was a guy around the Great Depression who made a mint selling an operation in which he would implant goat testicles into his patients, many of whom claimed dramatic improvement.
In the process he managed to revolutionize modern radio and advertising.
Linky linky: John Brinkley
It all depends on the data they're after. If it's something small that doesn't change often, they never get caught unless they're wildly stupid.
But if they need to get a LOT of data on even a weekly basis, you can almost always spot them and knock them offline, or feed them a pile of crap, or some other piece of wickedness. Even a piddly 10 dollar account becomes expensive if you have to get new ones all the time, and I am not above knocking out an entire IP block if the vast majority of the traffic is crap.
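For what it's worth, the "knock out the whole block" call can be made mechanical. This is only a rough sketch, not what anyone actually runs: the /24 grouping, the 20-request minimum, the 90% cutoff, and the idea that something upstream has already tagged each request as crap or not are all assumptions made up for illustration.

```python
import ipaddress
from collections import defaultdict

def blocks_to_ban(requests, prefix_len=24, min_requests=20, bad_ratio=0.9):
    """Return the networks where the vast majority of traffic is junk.

    requests: iterable of (ip_string, is_bad) pairs, where is_bad is a
    flag some earlier check has already set (hypothetical input format).
    """
    totals = defaultdict(int)
    bad = defaultdict(int)
    for ip, is_bad in requests:
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        totals[net] += 1
        if is_bad:
            bad[net] += 1
    # Only ban blocks with enough traffic to judge, and almost all of it bad
    return {
        net for net, count in totals.items()
        if count >= min_requests and bad[net] / count >= bad_ratio
    }
```

The minimum request count is there so that one bad hit from an otherwise quiet block doesn't get the whole block banned.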
I'd say it's just smart. They couldn't know if they were going to have a successful game to start, and they had to give it a price bump to make some of their money back.
The drop reflects that the game was reasonably popular, and that their likelihood of making a modest profit meant that they could afford to drop the price and still make their money back.
The example that would leap to my mind is a number of services that allow you to "map" an IP address to a geographic location...I use one of those for my job search homepage, and it only allows ~200 queries a day for the "free" account...It would be plenty useful to have as a free service (targeted advertising), and if you set up enough "free" accounts, you could use it that way.
Since I'm doing all my job searching away from where I'm currently living, I use mine to make sure that my job searching page always looks "under construction" for people who live where I live. My boss actually checks it occasionally, I guess to make sure I'm not trying to leave.
Yeah, but unless you're running that list across a botnet, the IP addresses are a giveaway.
Even if you are running it across a botnet it's pretty easy to pick out the patterns using some pretty trivial statistical hacks...If you graph bot traffic it looks like a heartbeat; even if you randomize the access times they don't match "human" numbers (unless you add so much randomness that it ceases to be an efficient scraper...If you could hire a guy to browse the site and write down the data faster than you can scrape it, he beats you.)
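One example of the sort of trivial statistical hack I mean, as a sketch only: look at how much the gaps between a client's requests vary relative to their average (the coefficient of variation). A heartbeat has almost no variation; a person reading pages is all over the map. The function names and the 0.5 cutoff and 10-request minimum are illustrative assumptions, not tuned values.

```python
from statistics import mean, pstdev

def gap_variation(timestamps):
    """Coefficient of variation of the gaps between requests (seconds).

    Returns None when there are too few requests to say anything.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        return 0.0  # everything arrived at once: as regular as it gets
    return pstdev(gaps) / avg

def looks_like_bot(timestamps, max_cv=0.5, min_requests=10):
    """Flag clients whose request timing is suspiciously regular."""
    if len(timestamps) < min_requests:
        return False
    cv = gap_variation(timestamps)
    return cv is not None and cv < max_cv
```

A scraper that jitters its delay between 4 and 6 seconds still comes out with a variation around 0.2, which is nowhere near what a human session looks like. To beat this it has to add enough randomness that it stops being an efficient scraper.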
I've never actually been banned for it either, but it's all a crapshoot. I used to work for a company that did GIS data and we smote scrapers on a near-hourly basis, and that one turned freak-nasty because when we found a really good scraper, we'd feed them 60% crap data, and with GIS it's not easy to tell good data from bad.
Things like posted schedules, imho, are the real legitimate use for scrapers. Those people want their data to get out, but they may lack the tools to put it out there.
The problem is that this is a "damned if you do, damned if you don't" situation.
If you build the scraper and the content provider successfully sues you (wildly unlikely in my experience), then it's your ass.
On the other hand, if the content provider notices your scraping and cuts you off (extremely probable), then your app is worthless, and, again, it's your ass.
Building a business on other people's data is just a bad idea. If you can't license it from them, and you can't collect it yourself, you're at their non-existent mercy, and scraping is extremely delicate and very easy to thwart.
This is marked troll, but frankly this is the prevailing wisdom regarding scrapers and I don't see it as trollish to point that out.
Coming from a guy who spends a non-trivial amount of time dealing with scraper-related problems, I've never met a scraper operator who felt that our hard-compiled content was anything other than their god-given right. I've dealt with complaints from people running scrapers that got blocked who literally could not understand what the problem was, even really abusive scrapers that looked more like a DDoS than a scraping.
Frankly, and I'm saying this as the guy who could be blocking the article poster's scraper in a week or two, this problem needs to be dealt with by the content providers. No law is going to stop abusive scraping...Likely they'd just offshore themselves if you bothered to try. You need to treat it the same way you'd treat any other security problem.
That's pretty common. John McCain had an issue with that earlier in his campaign when his MySpace page got hit. The guy who did the original template wasn't keen on having his images hotlinked from such a high volume site and made a hilarious substitution (which was widely misreported as a "hacking" incident in the media).
The AC is dead on. If you depend on someone else's data, they are going to notice, and they are going to remove your access, or, worse, start feeding you crap.
I've always worked at places that were victimized by scrapers, rather than the other way around. In the early days, I'd track 'em down (where possible), and try to extract some measure of satisfaction by confronting the miscreants with their misdeeds.
In my experience, most people don't even think it's wrong; in their minds it's the same as hotlinking an image. It's not their problem if the people on the other end don't protect their data. And anyway, if we didn't want the data stolen, we shouldn't have posted it on teh interwebs in the first place.
So I'm a bit amused at the sudden vehemence of the Slashdotters who commonly decry all DRM and all attempts by copyright holders to protect their IP. I would have thought the community would have come down on the other side of this issue, but I guess music and games are different from websites, photos, and other scrapable data.
Should be. It depends on what kind of data they're downloading, and whether they're just crawling link by link and hoovering up everything, or whether they're looking for something specific.
Either way, spiders and scrapers usually have programmed scan intervals which have no relation to an actual human's browsing...or they just hit the page as hard as they can, but that is so easy to block that almost no one does it that way. Even if they add a little randomness, it's only efficient to run a scraper if it's hitting every few seconds at max, and even the most ADD user won't keep that up.
Ironically, the easiest way to nail 'em is to put up a subset of "no robots" pages; if the robots crawl those pages, blacklist 'em. Every legitimate spider will respect those files.
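A bare-bones sketch of the "no robots" trap, assuming you can hook something like this in front of your request handling. The /trap/ path, the class name, and the in-memory set are all made up for illustration; a real setup would persist the list and probably key on more than the IP.

```python
# Hypothetical trap path: disallowed in robots.txt, linked only where
# a crawler (not a human) would find it.
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"
TRAP_PREFIXES = ("/trap/",)

class TrapBlacklist:
    """Blacklist any client that requests a page robots.txt told it to skip."""

    def __init__(self, trap_prefixes=TRAP_PREFIXES):
        self.trap_prefixes = tuple(trap_prefixes)
        self.banned = set()

    def allow(self, ip, path):
        """Return True if the request should be served."""
        if ip in self.banned:
            return False
        if path.startswith(self.trap_prefixes):
            # A well-behaved spider never gets here, so this one isn't
            self.banned.add(ip)
            return False
        return True
```

Once a client touches a trap page, everything it asks for afterwards gets refused, including the pages it was allowed to see before.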
Otherwise, if you're running a site with a ton of data, and something is crawling it sequentially, you can absolutely redirect their queries to whatever you want. I'd be wary of doing something cute (if you can call goatse "cute") for fear that you'll have an occasional false positive and redirect a user from a high bandwidth location to that site.
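Spotting the sequential crawl is about as simple as it sounds. Here's a sketch that assumes your pages are keyed by a numeric record ID and that you keep, per client, the list of IDs in the order they were requested; both of those and the run length of 25 are assumptions for the example.

```python
def longest_sequential_run(ids):
    """Length of the longest streak of IDs requested in +1 order."""
    best = run = 1 if ids else 0
    for prev, cur in zip(ids, ids[1:]):
        # Extend the streak on id, id+1; otherwise start a new one
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best

def is_sequential_crawler(ids, min_run=25):
    """Flag a client that walked a long stretch of records in order."""
    return longest_sequential_run(ids) >= min_run
```

Real users bounce around, so a long unbroken streak is a strong hint, but it's still only a hint, which is why I'd redirect a flagged client to something harmless rather than something cute.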
That's a Bad Analogy(tm xD), since no one sneaks into your house and installs billboards, and flyers, billboards, and direct mailings all have a significant cost attached for the initiator, which prevents abuse on the same scale as spam.