Is there any list of knobs I have to tweak to get a stock FF3 install to behave normally, i.e. no transmission of entered URLs/searches to third parties, no "auto-complete" with www. and .com/.net and any of that bullshit that has become accepted nowadays?
Yes, that's a rhetorical rant, but if anyone knows, please reply anyway.
So, you image an OS install and then image the application installs in order to run them with virtualization because they might conflict with other apps or do something to the OS?
Sounds like applying training wheels to an incapable OS, not like modern thinking.
How does that make him wrong? Your bot was ignoring robots.txt and was blocked after all, wasn't it? Isn't that the whole purpose of what we're discussing?
That you were able to circumvent the measure manually is totally beside the point. If the guys in TFA were crawling the web manually for copyright-infringing material, we wouldn't even be discussing this since there would be no news about it.
The story is about automatic crawling without human intervention and this thread is about defeating it, successfully. The End.
You make it sound as if it's technically possible to tell whether a client has violated robots.txt. Newsflash: It's just a text file. It doesn't *do* anything.
Hint: engaging brain is recommended.
You can link to an unsuspicious file so that the link is not visible to a normal visitor. Then, you edit robots.txt so that the file is not to be crawled by spiders. And finally, as soon as anyone is accessing that file, you block them.
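A rough sketch of that trap in Python, just to make the three steps concrete. The path /trap/index.html, the TrapFilter class and the in-memory blocklist are all made up for illustration; a real setup would hook this into the web server or the firewall instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical trap file: linked invisibly from a page, never shown to visitors.
TRAP_PATH = "/trap/index.html"

# Step 2: robots.txt tells every spider to stay away from the trap file.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"


def polite_crawler_may_fetch(path):
    """What a spider that honors robots.txt would conclude for this path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", path)


class TrapFilter:
    """Step 3: block any client that requests the trap file."""

    def __init__(self):
        self.blocked = set()  # client addresses that hit the trap

    def handle(self, client_ip, path):
        """Return an HTTP-style status code for one request."""
        if client_ip in self.blocked:
            return 403
        if path == TRAP_PATH:
            # Only a client ignoring robots.txt (or digging through the
            # page source) ever gets here.
            self.blocked.add(client_ip)
            return 403
        return 200
```

A spider that respects robots.txt never requests the trap file, so it never ends up on the list. Anything that does request it gets 403 from then on, for every path.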
The actions you claim to have done DO NOT fall under piracy (well, unless you did them while boarding a vessel with a cutlass between your teeth), they are fair use actions that you are perfectly entitled to do.
Not so fast. That would certainly depend on where you live.
I find NetBSD's RAIDframe to be very reliable and hassle-free. I'm using RAID-5 with 5 disks and get 110MB/s reads and 70MB/s writes. It also never gave me _any_ headache whatsoever. It just works.
I think software RAIDs are better than hardware RAIDs (for home use) due to their flexibility. You can mix different disk interfaces (IDE, SATA, SCSI,...) and sizes. If one of my 320G disks were to fail and a new disk was more expensive than the next bigger size, I could just use the bigger disk. It's a stupid example to show that you don't need components that are identical down to the number of blocks. I could also just pop in an external USB disk and use that while I get a replacement.
Or I could gradually upgrade to a new tech. Say I'm using IDE disks now. I could pop in SATA disks and duplicate all components and be running a SATA RAID from now on.
Also, you are not dependent on a $$$ piece of hardware that's incompatible in its RAID implementation details with other vendors' products. If your vendor goes out of business and your shiny hardware controller is not available anymore, what do you do? Contrast that with finding any box that can talk to your disks and downloading a copy of NetBSD.
Of course I'm totally ignoring backups. I'm talking about home use on the order of 1TB or more. Where do you back up such amounts of data cheaply, anyway? Thus for me, RAID _is_ the backup, unfortunately. That's why I favor flexibility and accessibility over speed or being able to hotswap a component to get 99.999% uptimes.
Have you looked at any professional switch at all?
My cheap "web-managed" switch allows me to specify a monitor port where all traffic from (an)other port(s) gets duplicated to. This port is not a member of any VLAN and has non-tagged ingress traffic blocked. So effectively it's a send-only tap from the switch's POV.
Why is everyone always choosing "odd" numbers of disks like 4 or 8 for RAID5?
The optimum number of components for a RAID5 is 2^n+1, i.e. 3, 5 or 9, although personally, I would stop at 5.
With 2^n data components, you can match filesystem block sizes perfectly to the stripe size, avoiding unnecessary reads on writes.
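To put numbers on that, here is a small Python sketch of the arithmetic. It assumes 512-byte sectors and a power-of-two filesystem block size; the function name is made up for illustration and none of this is RAIDframe's actual configuration syntax.

```python
SECTOR = 512  # bytes per sector (assumption)


def stripe_unit_for(disks, fs_block_bytes):
    """Per-disk stripe unit that makes one filesystem block fill exactly
    one full RAID5 stripe, or None if no whole-sector unit exists."""
    data_disks = disks - 1  # one component's worth of each stripe is parity
    if data_disks < 2:
        return None  # RAID5 needs at least 3 components
    unit, remainder = divmod(fs_block_bytes, data_disks)
    if remainder or unit == 0 or unit % SECTOR:
        return None  # block size does not split evenly across the data disks
    return unit


for disks in (3, 4, 5, 8, 9):
    print(disks, "disks:", stripe_unit_for(disks, 16384))
```

With a 16 KiB block, 3, 5 and 9 disks give whole stripe units of 8192, 4096 and 2048 bytes, so a block write covers a full stripe and parity can be computed without reading old data first. With 4 or 8 disks there is no such unit, so every block write touches a partial stripe.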
Try NetBSD. Seriously.
You must be pulling that out of your ass.
The vast majority of online stores want to be paid in advance or with pay-on-delivery. Stores charging your bank account are really the minority.
Shooting a bit over the target, are we?
And yes, console should rule the world. Just what has it got to do with the topic?
Great! What could possibly go wrong with uploading one's complete bookmarks, history and search queries to some corporate entity?
Autonomous system numbers are the perfect metric.
http://en.wikipedia.org/wiki/Autonomous_system_(Internet)
I've been using IPv6 just fine for several years now. Oh, and NetBSD has had IPv6 since 1999 or so.
Just get involved, everyone can get IPv6 right now: http://www.sixxs.net/
I beg to differ. BSD was designed. Linux was grown. About Windows, I'm not even sure.
Huh? Single-layer DVD is just as much a standard as Dual-layer DVD.
http://en.wikipedia.org/wiki/Dvd
I'm doing that as well, with an additional random string. So bestbuy.com@domain.example would become e.g. bestbuy.com-4df2@domain.example.
This helps against dictionary guesses or the guy who knows about your scheme and likes to screw with it.
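If you want to script it, something like the following Python sketch would do. The function name and the 4-character hex suffix are just my illustration of the scheme; you still need a catch-all or an alias table on the mail server, and you have to remember which suffix you handed out.

```python
import secrets


def tagged_alias(vendor, domain, nbytes=2):
    """Build vendor-XXXX@domain with a random hex suffix.

    nbytes=2 yields 4 hex characters, as in the example above."""
    suffix = secrets.token_hex(nbytes)
    return f"{vendor}-{suffix}@{domain}"


print(tagged_alias("bestbuy.com", "domain.example"))
```

Someone guessing bestbuy.com@domain.example then hits an address that doesn't exist, and a leaked alias still tells you who leaked it.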
Don't forget to blacklist a client as soon as it violates the robots.txt.
Easy, just pirate all the shows and movies you want to watch, ad-free. Teach it to your kids as well.
We are not consumers! We are citizens and customers, not sheep.
Dude, golden ears can hear _anything_. Didn't you know?
Any halfway decent switch can do that.
...does that mean it wasn't illegal up until now? That's actually more surprising to me.
Beware, though. He might just come back.
Yes.