Slashdot Mirror


Microsoft Introduces GVFS (Git Virtual File System) (microsoft.com)

Saeed Noursalehi, principal program manager at Microsoft, writes in a blog post: We've been working hard on a solution that allows the Git client to scale to repos of any size. Today, we're introducing GVFS (Git Virtual File System), which virtualizes the file system beneath your repo and makes it appear as though all the files in your repo are present, but in reality only downloads a file the first time it is opened. GVFS also actively manages how much of the repo Git has to consider in operations like checkout and status, since any file that has not been hydrated can be safely ignored. And because we do this all at the file system level, your IDEs and build tools don't need to change at all! In a repo that is this large, no developer builds the entire source tree. Instead, they typically download the build outputs from the most recent official build, and only build a small portion of the sources related to the area they are modifying. Therefore, even though there are over 3 million files in the repo, a typical developer will only need to download and use about 50-100K of those files. With GVFS, this means that they now have a Git experience that is much more manageable: clone now takes a few minutes instead of 12+ hours, checkout takes 30 seconds instead of 2-3 hours, and status takes 4-5 seconds instead of 10 minutes. And we're working on making those numbers even better.
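
The "download on first open" idea in the summary can be illustrated with a toy model. This is only a rough sketch and not how GVFS is actually implemented: the `LazyRepo` class, its method names, and the `fake_fetch` downloader are all hypothetical, and a real virtual file system does this work at the file system level rather than in application code.

```python
from typing import Callable, Dict, List, Set


class LazyRepo:
    """Toy model of on-demand hydration (hypothetical, not GVFS's real design)."""

    def __init__(self, manifest: Set[str], fetch: Callable[[str], bytes]) -> None:
        self._manifest = set(manifest)          # every path the repo claims to contain
        self._fetch = fetch                     # assumed downloader, called on first open
        self._hydrated: Dict[str, bytes] = {}   # paths whose contents are already local

    def listdir(self) -> List[str]:
        # Every file appears to be present, downloaded or not.
        return sorted(self._manifest)

    def open(self, path: str) -> bytes:
        if path not in self._manifest:
            raise FileNotFoundError(path)
        if path not in self._hydrated:
            # First access: download the contents and keep them locally.
            self._hydrated[path] = self._fetch(path)
        return self._hydrated[path]

    def status_candidates(self) -> List[str]:
        # A file that was never hydrated cannot have local edits,
        # so only hydrated paths need to be examined.
        return sorted(self._hydrated)


downloads: List[str] = []


def fake_fetch(path: str) -> bytes:
    # Stand-in for a network download; records which paths were requested.
    downloads.append(path)
    return b"contents of " + path.encode()


repo = LazyRepo({"a.c", "b.c", "c.c"}, fake_fetch)
repo.open("b.c")
repo.open("b.c")  # second open is served from the local copy
```

In this model the directory listing shows all three files, only `b.c` is ever downloaded, and a status-like scan would need to look at that one file alone.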

4 of 213 comments

  1. Ah, Microsoft by Kierthos · · Score: 2, Interesting

    "Hey, how can we do what GitHub does, only stupider?"

    --
    Mr. Hu is not a ninja.
  2. Re:Meh... by Transcendent · · Score: 5, Interesting

    Microsoft's repos *are* that large. That's why they implemented this.

    Microsoft Office's repository is over 1 TB in size. Yes, terabyte. For *office*. They absolutely cannot (could not, I suppose now) use Git on it.

  3. Re: Did they just turn git into svn? by tangent · · Score: 5, Interesting

    > Why take away git's biggest advantage?

    Because "clone now takes a few minutes instead of 12+ hours, checkout takes 30 seconds instead of 2-3 hours, and status takes 4-5 seconds instead of 10 minutes."

    That problem is not unique to Git. Jörg Sonnenberger tried importing the NetBSD repository into Fossil, and "the rebuild step which (re)creates the internal meta data cache took 10h on a fast machine." There are ways to make Fossil skip the rebuild on clone, which results in a suboptimal DB, but it still takes hours to clone. NetBSD's project history goes back something like a quarter century; it's going to take time to pull and organize all that.

    DVCSes are great when you can afford their associated costs (namely, the very advantages you refer to), but for very large repos, those costs can be very high.

    Do you really need every single version going back a quarter century? And if you do, do you need it 5 minutes after the initial clone?

    One idea that's come up on the Fossil mailing list is to do a shallow clone initially, then trickle the back history in over time. I'd like a DVCS that gave me the past 30 days of history at the tip of every open branch, then over the next day or so back-filled the rest.
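
The shallow-then-backfill idea can be roughly approximated with options Git already has, though this is a sketch for Git rather than Fossil and may not match what was discussed on the Fossil list. The `backfill_plan` helper, its parameters, and the example URL are hypothetical; it only builds the command lines and does not run them.

```python
from typing import Iterator, List


def backfill_plan(url: str, dest: str, since: str = "30 days ago",
                  step: int = 1000, rounds: int = 3) -> Iterator[List[str]]:
    """Yield git command lines: a recent-history clone, then gradual deepening."""
    # Initial clone: only commits newer than `since`, for all branches.
    yield ["git", "clone", f"--shallow-since={since}", "--no-single-branch", url, dest]
    # Trickle older history in by pushing the shallow boundary back
    # `step` commits on each round.
    for _ in range(rounds):
        yield ["git", "-C", dest, "fetch", f"--deepen={step}"]
    # Finally ask for whatever history is still missing.
    yield ["git", "-C", dest, "fetch", "--unshallow"]


# Placeholder URL and directory name, for illustration only.
plan = list(backfill_plan("https://example.invalid/big.git", "big"))
```

Each entry in `plan` could be handed to something like `subprocess.run` on a schedule. One caveat: `git fetch --unshallow` fails on a repository that is already complete, so a real script would need to check for that before the last step.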

  4. Re:MS Linux by ckatko · · Score: 3, Interesting

    You must have never used their enterprise Dynamics CRM and Dynamics NAV software.

    If you can get it to run at all, half the shit is broken. Hell, the 2013 edition of CRM actually told you NOT to install the newest version of IE because it was "unsupported at this time." Yeah. IE (11?) didn't support CRM. Now I've got to explain to my clients why Windows Update completely broke their brand new system they paid thousands of dollars for.

    Another great "feature" of CRM 2013 was a completely broken IMPORT system. So if you're trying to import anything other than mind-numbingly simple data like "addresses," you have to add stuff with timestamps, dates, and so on. You surely don't want ALL USER MESSAGES to lose their order and timestamps, right? TOO BAD. Even though CRM supports setting the timestamp, for certain record types the importer is completely broken and they never cared to fix it. So the "simple" solution? All you have to do is create a C# plugin, based on non-compiling code from an obscure blog. Oh wait, you can't just write a C# plugin. You have to use their HUGE SDK, their tools to "attach" the plugin to CRM and even that requires hours of reading manuals to figure out the right triggers. And if something goes wrong? ENJOY ZERO USEFUL ERROR MESSAGES. And yes, I turned on tracing (which requires CHANGING THE REGISTRY in various places) and debug mode.

    Or how about SQL 2014/2015, which STILL doesn't properly support DPI scaling. The hallmark of Windows 10, and if you use a high resolution with a small laptop screen, random dialog boxes will not only be shrunk and force you to squint to read them... no... that'd be too easy. Some of them are so broken that you can't physically view all of the contents of the dialog AND YOU CAN'T SCROLL TO SEE IT. The dialog dimensions are shrunk and the data is to the right of a window you can't resize!

    THANKS MICROSOFT. I love fixing your shit at my job while having to explain to clients that Microsoft's "It Just Works (TM) if you stay within the MS ecosystem!" is all a bunch of bullshit and the "It works" trademark is actually paved with the blood of IT workers.

    Microsoft could make great products. Too bad they never bother to finish any of them.