When all you have is a Nokia Lumia, everything looks like a nail :)
Apparently, this is where the 6'2" Bob Smith lives. http://goo.gl/maps/1wGxN. That might be his daughter leaning against the pillar, texting. She'll be very upset if he's extradited. Wonder who he murdered? I mean allegedly murdered.
I anticipate the stories of how whole wheat toast is secretly the cause of cancer.
If you burn it, it is.
It's steganography, not stenography. Stenography is what people in court-houses do.
if you have 50 files that all are *exactly* 1GB in size
Hmm. To the byte?
I very rarely find any similarly sized files that large, and those that I do, there are usually only two of them. Usually, these are videos, or audio files that I've copied/rsynced around in such a way that they ended up in two places.
Of course, everyone's usage will be unique, but I can't imagine finding that scenario being common.
Ask, and ye shall receive.
Let me know how it goes for you. Especially if you're not on Linux, as I've only tried it on Linux, and I'm not sure how the symlink detection works on other OSes.
You need to compile the .java files to .class files.
You'll need something called javac (which comes with the Java Development Kit (JDK)).
It's not super easy getting the hang of it at first.
Clone the git repo, run mvn clean install (you'll need to download Maven), and then you should end up with a JAR file. Then run java -jar finddups.jar and things should start to happen.
Or should I just commit a JAR file to GitHub?
Yes, that's something I thought about. It's a trade off, isn't it? If you have two 700MB files that are both the exact same size, but are different, the way I'm doing it is quickest. If the two 700MB files are the same, then it will probably be about the same time as CRC/MD5ing the files.
If you have many small files, then I guess the IO won't be that much anyway.
My implementation offers a parameter to ignore files smaller than a specific size, which is how I run it: java -jar finddups.jar /path 200000 (for instance).
Another commenter to your comment suggested doing it adaptively, which would be easy.
Exactly. What I do is this:
1. Compare filesizes.
2. When there are multiple files with the same size, start diffing them. I don't read the whole file to compute a checksum - that's inefficient with large files. I simply read the two files byte by byte, and compare - that way, I can quit checking as soon as I hit the first different byte.
Source is at https://github.com/caluml/finddups - it needs some tidying up, but it works pretty well.
git clone, and then mvn clean install.
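For anyone who just wants the gist of those two steps, here is a minimal sketch. It is not the code from the finddups repo: the class name DupSketch and the methods groupBySize, sameContent and findDuplicates are made up for illustration, and the minSize parameter only assumes the "ignore small files" option behaves like a simple byte threshold.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual finddups implementation.
final class DupSketch {

    /** Step 1: bucket files by size, skipping anything smaller than minSize bytes. */
    static Map<Long, List<Path>> groupBySize(List<Path> files, long minSize) throws IOException {
        Map<Long, List<Path>> bySize = new HashMap<>();
        for (Path f : files) {
            long size = Files.size(f);
            if (size >= minSize) {
                bySize.computeIfAbsent(size, k -> new ArrayList<>()).add(f);
            }
        }
        return bySize;
    }

    /** Step 2: compare two files byte by byte, stopping at the first difference. */
    static boolean sameContent(Path a, Path b) throws IOException {
        try (InputStream ia = new BufferedInputStream(Files.newInputStream(a));
             InputStream ib = new BufferedInputStream(Files.newInputStream(b))) {
            int x;
            while ((x = ia.read()) != -1) {
                if (x != ib.read()) {
                    return false; // first mismatch (or b ended early): stop reading
                }
            }
            return ib.read() == -1; // equal only if b has ended as well
        }
    }

    /** Returns pairs of files that have the same size and the same bytes. */
    static List<Path[]> findDuplicates(List<Path> files, long minSize) throws IOException {
        List<Path[]> dups = new ArrayList<>();
        for (List<Path> group : groupBySize(files, minSize).values()) {
            // Sizes that occur only once never enter the inner loop.
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    if (sameContent(group.get(i), group.get(j))) {
                        dups.add(new Path[] {group.get(i), group.get(j)});
                    }
                }
            }
        }
        return dups;
    }
}
```

The pairwise loop is fine when a size bucket holds two or three files, which is what I usually see. With a bucket of 50 same-sized files it does a lot of rereading, and that is the case where a checksum per file starts to look better.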
We calculated that, on the coldest winter days, the carbon cost of driving to work was about the same as the extra heating that would be needed if she stayed at home.
But you've less chance of ending up in a fiery car accident after sliding off the road on black ice.
You are wrong. There is a fine-grained security manager that can limit most stuff.
http://docs.oracle.com/javase/1.4.2/docs/guide/security/PolicyFiles.html
and http://docs.oracle.com/javase/1.4.2/docs/guide/security/permissions.html
Sounds great.
Now, how can I be sure that the JavaScript executing in my browser a: isn't malicious, and b: hasn't been intercepted and changed by someone in the middle?
Who's first language isn't Mandarin? I doubt it.
"Who is first language isn't Mandarin?"
Do you mean whose?
Assuming the admin wasn't too lazy to set it up. :)
Assuming that the DNS for the IP address range is delegated to the admin first of all.
It's all very well setting up rDNS, but sometimes, the bureaucratic nightmare to get the range pointed at your DNS server is just not worth it.
In a skyscraper in London's Canary Wharf financial district, Olympic organizers opened a Technology Operations Center (TOC) last month that acts as mission control for monitoring the health of Olympic IT systems. The TOC's location is a soft secret, and organizers did not want its exact location to be published for security reasons.
Wow. I contracted in Canary Wharf for 3 months this year, and I'm fairly sure I could guess where it is. That's got to be the softest secret ever.
If only Gentoo had a stable version. By that, I mean something that you could leave for 3 years, only applying GLSAs, and then, when you wanted to install something new, you wouldn't be fucked because there was a new version of Portage, that needed a new version of Python, that wouldn't install for a myriad of other reasons.
The general consensus is that you have to keep bringing it up to date with all the latest ebuilds every night. And that's a pain.
I still really like Gentoo though. Especially Hardened Gentoo. If you want a very tough server, that's what you need.
Is it wrong that I laughed like a lol-cat when I first heard this on the radio?
If you're ever in Southern England, go and see the Magna Carta (one of them, but the best preserved one) at Salisbury Cathedral. It's amazing to see a document, written in 1215, still perfectly preserved and readable, with nothing between you and it but a pane of glass.
Imagine half the population of your entire city or town dying off in 1 or 2 years.
Yep. I'm waiting for Bird Flu to mutate. If I survive, it'll be the only way I can afford to be a house owner.
I'm a leg and bum man, for instance. I'll take boobies if they're there, but I'd rather have long legs and a pert bum.
Yep, me too. I bet Scientific Linux has seen a surge of new users.
Still, now that CentOS 6 is out, that's great news - normal service is resumed.
molten salt vats are kinda cool
You're doing it wrong.
I care. I liked the author's suggestion of creating a Dummy Account, just so you can view yourself and see what's visible to strangers. Ideally nothing but your name, but who knows for sure? What exactly is visible to Google, Bing, and other search engines? I'd like to know.
Why not let some strangers know your ID? And let them tell you? Maybe on some sort of forum. Such as Slashdot...?
Slashdot: The place where strangers exchange Facebook accounts to "nmap" their Facebook details.
I was wondering if either the government or someone (like Anonymous) has or is thinking about deploying a 'shadow internet'
That's one of the goals of Anonet - but their public site seems to be down.