Slashdot Mirror


Microsoft Introduces GVFS (Git Virtual File System) (microsoft.com)

Saeed Noursalehi, principal program manager at Microsoft, writes in a blog post: We've been working hard on a solution that allows the Git client to scale to repos of any size. Today, we're introducing GVFS (Git Virtual File System), which virtualizes the file system beneath your repo and makes it appear as though all the files in your repo are present, but in reality only downloads a file the first time it is opened. GVFS also actively manages how much of the repo Git has to consider in operations like checkout and status, since any file that has not been hydrated can be safely ignored. And because we do this all at the file system level, your IDEs and build tools don't need to change at all! In a repo that is this large, no developer builds the entire source tree. Instead, they typically download the build outputs from the most recent official build, and only build a small portion of the sources related to the area they are modifying. Therefore, even though there are over 3 million files in the repo, a typical developer will only need to download and use about 50-100K of those files. With GVFS, this means that they now have a Git experience that is much more manageable: clone now takes a few minutes instead of 12+ hours, checkout takes 30 seconds instead of 2-3 hours, and status takes 4-5 seconds instead of 10 minutes. And we're working on making those numbers even better.
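
The "download on first open" (hydration) behavior described in the summary can be illustrated with a toy sketch. This is not how GVFS is implemented (GVFS works at the file system level); it only models the idea in plain Python. The class `LazyWorkingTree`, the `fetch_blob` callback, and the file paths are all hypothetical names made up for this illustration.

```python
from typing import Callable, Dict, List


class LazyWorkingTree:
    """Toy model: every path is listed, but contents are fetched on first open."""

    def __init__(self, manifest: List[str], fetch_blob: Callable[[str], bytes]):
        self._manifest = set(manifest)       # all paths "appear" to be present
        self._fetch_blob = fetch_blob        # hypothetical download callback
        self._hydrated: Dict[str, bytes] = {}

    def listdir(self) -> List[str]:
        # Listing needs no downloads: it is answered from the manifest alone.
        return sorted(self._manifest)

    def open(self, path: str) -> bytes:
        if path not in self._manifest:
            raise FileNotFoundError(path)
        if path not in self._hydrated:
            # First access: download ("hydrate") the file and keep it locally.
            self._hydrated[path] = self._fetch_blob(path)
        return self._hydrated[path]

    def paths_for_status(self) -> List[str]:
        # Only hydrated files could have been modified locally, so a
        # status-like scan can skip everything else.
        return sorted(self._hydrated)


# Stand-in for a remote server (assumption for the example).
remote = {"src/a.c": b"int a;\n", "src/b.c": b"int b;\n", "docs/readme.txt": b"hi\n"}
downloads: List[str] = []


def fetch_blob(path: str) -> bytes:
    downloads.append(path)
    return remote[path]


tree = LazyWorkingTree(list(remote), fetch_blob)
tree.open("src/a.c")
tree.open("src/a.c")  # second open is served from the local copy
```

In this sketch all three paths are visible through `listdir()`, but only `src/a.c` is ever downloaded, and only that file would need to be examined by a status-like operation.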

7 of 213 comments

  1. Meh... by the_skywise · · Score: 3, Insightful

    There aren't THAT many repos with over 3 million files in them.

    The great majority of projects I've been on have been around the 100k-300k range and doing a build (to properly test the product) required ALL of them.

    And even then, once you've got all of them the first time, Git does the diffing automatically so it "scales" already.

    Maybe MS could put some of their vast R&D efforts to something more useful... like having their free Visual Studio Code editor handle files bigger than 1 GB?

  2. Did they just turn git into svn? by lucasnate1 · · Score: 5, Insightful

    The whole point of git is that you have identical copy on your machine. Why take away git's biggest advantage?

    1. Re:Did they just turn git into svn? by thegarbz · · Score: 4, Insightful

      The whole point of git is that you have identical copy on your machine. Why take away git's biggest advantage?

      Because its biggest advantage is also one of its greatest inefficiencies, and frankly, on a large project, chances are you may not need it all. The whole point is that you have an identical copy on your machine of what you're working on.

    2. Re:Did they just turn git into svn? by Anonymous Coward · · Score: 3, Insightful

      The whole point of git is that you have identical copy on your machine. Why take away git's biggest advantage?

      "A Clone now takes only minutes instead of 12+ Hours!"
      Ja, that's because you're NOT making a copy.

    3. Re:Did they just turn git into svn? by Anonymous Coward · · Score: 3, Insightful

      No, the whole point of git is that every file version is immutable and referenced by a globally unique hash. This means that it doesn't matter where the actual data is located - until you need the actual data for some actual reason. This model has been copied by countless systems since git, because it is extremely robust and has multiple benefits, and none of those other systems expect the local user to download the entire database before he even begins work. Nonetheless, such systems can also support downloading the entire database, so I'm puzzled as to why you think this work on git object caching "takes away" a feature which it quite clearly still in fact supports.

      The important thing here is the new use cases which the new caching strategy enables.
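
The content-addressing point in the comment above can be made concrete with a small sketch. It assumes Git's classic SHA-1 object format, in which a blob's ID is the SHA-1 of a short header followed by the file contents (repositories using the newer SHA-256 object format differ). The helper names `git_blob_id` and `verify_blob` are made up for this illustration and are not part of any Git API.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the ID Git's SHA-1 object format assigns to a blob with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def verify_blob(expected_id: str, data: bytes) -> bool:
    """Check that bytes obtained from any source match the ID they were requested by."""
    return git_blob_id(data) == expected_id


# The ID depends only on the content, not on where the bytes are stored.
empty_id = git_blob_id(b"")
same_content_elsewhere = git_blob_id(b"")
```

Because the ID is derived from the content, a client can fetch an object lazily from any server or cache and still confirm it received exactly the version it asked for.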

  3. Re:Ah nostalgia by AuMatar · · Score: 4, Insightful

    The fact you needed a release team and release engineers to manage a ClearCase implementation is why it's considered one of the worst systems out there, remembered with hatred by almost everyone who used it. A version control system should be easily set up by one admin in an hour or two, and then usable without reams of documentation by any of the engineers. ClearCase failed that.

    --
    I still have more fans than freaks. WTF is wrong with you people?
  4. It's the hook to make your repositories break by Ungrounded+Lightning · · Score: 2, Insightful

    The whole point of git is that you have identical copy on your machine. Why take away git's biggest advantage?

    Because its biggest advantage is also one of its greatest inefficiencies, and frankly, on a large project, chances are you may not need it all. The whole point is that you have an identical copy on your machine of what you're working on.

    So buy a bigger disk. They're cheap.

    Why did they do it? It's obvious: it's the bait on the hook to get you to break git and your open source projects (even CURRENT ones) that compete with them.

    By keeping you from having a full copy of the repository, they break git: If there are files that you didn't use in recent checkouts, they're not stored locally or not brought up to date when you pull. If something goes wrong externally - like loss or corruption at a cloud site (such as the recent lost-update debacle) - you have no non-microsoft-git-internals-expert way to recover - maybe no way to recover at all.

    You lose the ability to work offline. You lose the ability to look at history, or parts of the repository you haven't been to yet, without being back online to a working and trustworthy external server, and so on.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way