
Microsoft foists fake file system for fat Git repos

No more days off after typing 'git clone'

By Thomas Claburn, 3 Feb 2017

To lighten the burden of massive Git source code repositories, Microsoft has created a virtualized file system that allows developers to interact with large codebases without sending excessive amounts of data across the network.

Git (when not applied to people or animals) refers to a distributed version control system for managing software code. Alternatives include Subversion, Mercurial, Perforce, and Concurrent Versions System (CVS). Git is presently the most popular, though Facebook and Google both favor systems developed internally.

Microsoft's Windows codebase contains more than 3.5 million files and weighs in at more than 270GB, Saeed Noursalehi, principal program manager at Microsoft, explained in a blog post.

For Microsoft engineers working on Windows, "git checkout" operations could take three hours and "git clone" could take half a day or more.

So Microsoft set about adapting Git to work better with massive code repositories. The result is GVFS, which stands for Git Virtual File System.

As Noursalehi explains, GVFS "virtualizes the file system beneath your repo and makes it appear as though all the files in your repo are present, but in reality only downloads a file the first time it is opened."
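The idea can be illustrated with a toy model. This is a minimal sketch of on-demand download, not GVFS's actual code: the `LazyRepo` class, its method names, and the `fake_fetch` function standing in for a network request are all hypothetical.

```python
from typing import Callable, Dict, List


class LazyRepo:
    """Toy model of on-demand file download (not the real GVFS code)."""

    def __init__(self, manifest: List[str], fetch: Callable[[str], bytes]) -> None:
        self._manifest = list(manifest)     # every path the repo claims to contain
        self._fetch = fetch                 # hypothetical stand-in for a network download
        self._cache: Dict[str, bytes] = {}  # contents of files opened so far

    def listdir(self) -> List[str]:
        # Every file appears present, whether or not it has been downloaded.
        return sorted(self._manifest)

    def open(self, path: str) -> bytes:
        if path not in self._manifest:
            raise FileNotFoundError(path)
        if path not in self._cache:
            # First access: download the contents and remember them.
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def hydrated(self) -> List[str]:
        # Paths whose contents have actually been downloaded.
        return sorted(self._cache)


downloads: List[str] = []  # records each simulated network request


def fake_fetch(path: str) -> bytes:
    downloads.append(path)
    return f"contents of {path}".encode()


repo = LazyRepo(["kernel/main.c", "shell/ui.c", "README.txt"], fake_fetch)
```

In this sketch, listing the repo shows all three files straight away, yet nothing is fetched until `repo.open(...)` is called, and opening the same path twice triggers only one download.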

Such tactics echo the way version control systems, for the sake of efficiency, transmit data about file changes ("diffs" or "deltas") rather than the whole file.

GVFS also makes working with large repositories easier by automatically calculating which parts of the repo can be excluded from checkout and status commands.
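A rough sketch shows why narrowing the scan helps. It assumes, purely for illustration, that a file the developer has never opened cannot carry local changes and so need not be examined. The `paths_to_scan` function and the sample paths are hypothetical and say nothing about how GVFS itself performs the calculation.

```python
from typing import Iterable, List, Set


def paths_to_scan(all_paths: Iterable[str], touched: Set[str]) -> List[str]:
    """Return only the paths a status-like scan needs to examine.

    Assumption: a file that was never opened cannot have local changes,
    so it can be skipped without being read.
    """
    return sorted(p for p in all_paths if p in touched)


# A pretend repo with 1,000 files, of which the developer has opened two.
all_paths = [f"src/module{i}/file.c" for i in range(1000)]
touched = {"src/module7/file.c", "src/module42/file.c"}

to_scan = paths_to_scan(all_paths, touched)
```

Here the scan covers two files rather than 1,000, so the work depends on how much of the repo a developer has touched rather than on the repo's total size.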

The software has sped up Git tasks immensely. A "clone" operation now takes several minutes instead of half a day, "checkout" takes 30 seconds rather than 2-3 hours, and "status" takes 4-5 seconds rather than 10 minutes, Noursalehi said.

GVFS helps not only with large repositories, but also with repositories that carry decades of code history, like that of Windows.

Microsoft has made the GVFS client software available under an open source MIT license. It requires Windows 10 Anniversary Update or later and GVFS-enabled Git for Windows. Others wishing to implement GVFS need to deploy a protocol extension. ®

The Register - Independent news and views for the tech community. Part of Situation Publishing