Slashdot Mirror


User: RekkanoRyuji

RekkanoRyuji's activity in the archive.

Stories: 0
Comments: 2

Comments · 2

  1. Re:C# code to get all the duplicates on windows 7 on Ask Slashdot: How Do I De-Dupe a System With 4.2 Million Files? · Score: 1

    Oh, I forgot: the output is written to the C:\Temp folder. It will look something like this:


    C0EgbnbmRRjDX47IPZ3TxaNSUYTDifvRXMnq0YjlGIA=
    F:\Music\jpop\Saki_Nijino-Over_the_Rainbow\(Nijino_S-Rainbow)-01-Tokimeki_Arigatou.mp3
    F:\Music\Anime\Tokimeki Memorial\tokimeki memorial - over the rainbow\over the rainbow_track01_tokimeki arigatou.mp3


    Pr3lS9OFNHLjWCQ8OW3/fh+KOGL5J9lJVZzPUMqRptI=
    I:\Pictures\old s\Picture 208.jpg
    H:\Desktop Backup\old s\Picture 208.jpg
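
Editor's sketch: the sample above suggests a simple layout, with a Base64 hash line followed by the paths that share it and blank lines between groups. The snippet below is a minimal Python reader for that layout, assuming the report really is structured this way. The function name `parse_report` is hypothetical and is not part of the poster's C# program.

```python
# Sample copied from the comment above; raw string so backslashes stay literal.
SAMPLE = r"""
C0EgbnbmRRjDX47IPZ3TxaNSUYTDifvRXMnq0YjlGIA=
F:\Music\jpop\Saki_Nijino-Over_the_Rainbow\(Nijino_S-Rainbow)-01-Tokimeki_Arigatou.mp3
F:\Music\Anime\Tokimeki Memorial\tokimeki memorial - over the rainbow\over the rainbow_track01_tokimeki arigatou.mp3


Pr3lS9OFNHLjWCQ8OW3/fh+KOGL5J9lJVZzPUMqRptI=
I:\Pictures\old s\Picture 208.jpg
H:\Desktop Backup\old s\Picture 208.jpg
"""


def parse_report(text):
    """Map each hash line to the list of paths listed beneath it.

    Assumed layout: first non-blank line of a group is the hash,
    following non-blank lines are paths, blank lines end the group.
    """
    groups = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            current = None          # blank line closes the current group
        elif current is None:
            current = line          # first line of a group is the hash
            groups[current] = []
        else:
            groups[current].append(line)
    return groups


if __name__ == "__main__":
    for digest, paths in parse_report(SAMPLE).items():
        print(digest, "->", len(paths), "files")
```

Each group lists every copy, so deciding which copy to keep is still left to whoever reads the report.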

  2. C# code to get all the duplicates on windows 7 on Ask Slashdot: How Do I De-Dupe a System With 4.2 Million Files? · Score: 1

    Reading this prompted me to write a bit of code to do the de-dupe comparisons. Here is the code. You will have to mark the project to allow unsafe code :) (in the project properties). Compiled with Visual Studio 2010.

    The program reads the first 4 MB of each file and computes a hash of it. One thread is run for each drive you are searching.
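
Editor's sketch: the approach described here (hash only the first 4 MB of each file, one worker thread per drive) can be illustrated in a few lines of Python. This is not the C# program from the pastie link. The hash algorithm is an assumption: the sample digests in the follow-up comment are 44 Base64 characters, which is consistent with a 256-bit hash such as SHA-256, but the post does not say which one is used. The names `prefix_digest`, `scan_root` and `scan_roots` are hypothetical.

```python
import base64
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

PREFIX_BYTES = 4 * 1024 * 1024  # "first 4MB of each file", as described in the post


def prefix_digest(path, limit=PREFIX_BYTES):
    """Base64 SHA-256 of the first `limit` bytes (SHA-256 is an assumption)."""
    with open(path, "rb") as handle:
        data = handle.read(limit)
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def scan_root(root, limit=PREFIX_BYTES):
    """Walk one root and map each digest to the files that produced it."""
    found = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                found[prefix_digest(path, limit)].append(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
    return found


def scan_roots(roots, limit=PREFIX_BYTES):
    """Scan each root on its own thread, then keep digests seen more than once."""
    merged = defaultdict(list)
    with ThreadPoolExecutor(max_workers=max(1, len(roots))) as pool:
        for found in pool.map(lambda root: scan_root(root, limit), roots):
            for digest, paths in found.items():
                merged[digest].extend(paths)
    return {d: sorted(p) for d, p in merged.items() if len(p) > 1}
```

Results are merged before filtering so that copies living on different drives still end up in the same group. Because only a prefix is hashed, two files that agree on their first 4 MB but differ later would be reported as duplicates, so a match is best treated as a candidate rather than proof.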

    If you want all drives, comment out the section where the code says to do so; otherwise, just add the drives you want to the DrivesToSearch list.
    If you search your C drive, I suggest adding some folders to the Ignore Directories, like I have below. The "ToLower()" is there just to make sure the path is lower case; otherwise the hash match won't work.
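
Editor's sketch: one reading of the lower-casing remark is that ignored folders are looked up in a hash-based collection, where "C:\Windows" and "c:\windows" would count as different keys unless both sides are normalized first. The Python below illustrates that idea under this assumption. `walk_files` and `normalize` are hypothetical names, and the ignore list is passed in by the caller rather than copied from the poster's code.

```python
import os


def normalize(path):
    """Lower-case a path so lookups do not depend on how it was typed."""
    return os.path.normpath(path).lower()


def walk_files(root, ignore_directories):
    """Yield file paths under `root`, skipping any ignored directory subtree."""
    ignored = {normalize(p) for p in ignore_directories}
    for folder, subdirs, names in os.walk(root):
        # Editing `subdirs` in place tells os.walk not to descend into them.
        subdirs[:] = [
            d for d in subdirs
            if normalize(os.path.join(folder, d)) not in ignored
        ]
        for name in names:
            yield os.path.join(folder, name)
```

Pruning the directory list during the walk means an ignored folder is never opened at all, which matters for large system folders on a C drive.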

    Please forgive the code, as this was very quick-n-dirty.

    Code runs *far* faster than a week....
    C:\ = 185,000 files
    F:\ = 29,690 files
    G:\ = 20,765 files
    H:\ = 60,851 files
    I:\ = 52,442 files
    D:\ = 196 files (DVD-ROM)

    Total: 348,944 files on 6 drives with 3.2 TB of used space took about 50 minutes 52 seconds.

    Speed can be improved by reducing the 4 MB check to something smaller. Many of the files on F: and G: are over 4 MB in size, and those drives took the longest to complete even though they had fewer files in total.
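
Editor's sketch: the effect of the prefix size on the amount of data read can be estimated with a little arithmetic, since each file contributes at most the prefix limit. The Python below is only an illustration. The file sizes are made up, and `bytes_hashed` is a hypothetical helper, not something from the poster's program.

```python
MB = 1024 * 1024


def bytes_hashed(file_sizes, limit):
    """Total bytes read when only the first `limit` bytes of each file are hashed."""
    return sum(min(size, limit) for size in file_sizes)


# Hypothetical mix: two small files and two large media files.
sizes = [1 * MB, 3 * MB, 40 * MB, 700 * MB]

if __name__ == "__main__":
    for limit in (4 * MB, 1 * MB):
        print(limit // MB, "MB prefix reads", bytes_hashed(sizes, limit) // MB, "MB")
```

With these made-up sizes a 4 MB prefix reads 12 MB in total and a 1 MB prefix reads 4 MB, with the saving coming entirely from the large files. The trade-off is that a shorter prefix makes false matches more likely, for example between files that share a long identical header.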
    Code Below. (mutters about slashdot and their inability to allow code)

    http://pastie.org/4652387