Ask Slashdot: Version Control For Non-Developers?
occamboy writes My spouse works at a company that deals with lots of documents (Word, spreadsheets, scans, and so forth), and they have a classic version control problem that sucks up hours of her time each week. Documents are stored on a shared server in some sort of hierarchy, but there are all kinds of problems, e.g. multiple copies get saved with slightly-different names because people are afraid of overwriting the old version 'just in case' and nobody can figure out which is the latest version, or which got sent out to a client, etc.
Version control should help, and my first thought was to use SVN with TortoiseSVN, but I'm wondering if there's something even simpler that they could use? Do the Slashdotteratti have any experiences or thoughts that they could share? The ideal solution would also make it easy to text search the document tree.
Easy to install, free for 20 users or fewer, rock solid, and clients for many OSes. Most importantly, it supports single-user checkouts, which is vital for things like Word documents that won't merge.
1) Create a rational naming convention and use that.
Or
2) Use SharePoint's (base version is free as in beer) built-in versioning system. That is what it is designed for and is one of the few things that SP does well.
http://www.doxbox.ca/
It is a document management system
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Something like Alfresco ?
I agree it's a business problem. MS Office has some pretty good versioning support built into it and multiple people can edit a document at the same time, if you know how to set it up. There should be a technical person in your wife's company who understands how MS Office and other tools work. They should train the staff on the capabilities and the staff should come up with a process that works for everyone.
With SharePoint you can have MS Office documents versioned, it is basic versioning, not like git where you can have branches and things like that. For other types of documents, it's a matter of defining a process and naming convention on how to keep a track of items.
What you are looking for is a Document Management System, something like Documentum or FileNet, which are built for this specific purpose and include additional features like workflow and extra attributes that you can add to the content to make it easier to find. Web Content Management systems are not the same thing, and will not work the way you want them to, so make sure you look at all the options out there.
I would recommend Google Docs, assuming there isn't any crazy formatting involved.
#1) It is a single document so you don't have to worry about the naming of it..
#2) Google Docs has built-in version control, in that you can roll back to an earlier version of the document, and you can see who is editing, changing etc. (assuming everyone has their own password).
It's low tech, easy to use, and the only education is to keep on using the same file name.
http://www.hawknest.com/
Lord help you if you do... It's bad enough for source code, but it's horrible for Office documents.... On the plus side, everybody has their own local repository so losing data due to drive failures is minimized over having everything on a server, but all that pushing and pulling with merging is painful on things like Word documents...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
I'd avoid SVN for anything that isn't a flat text file, otherwise it becomes a pain to merge or determine what the actual difference between two files is. I'm not aware of anything that will make viewing diffs for Word documents human readable. Never mind that some of the people who need to use it will probably be afraid of it or have even more basic problems like forgetting to commit.
If they're not doing anything that requires absolute security or precise formatting, something like Google docs might work reasonably well. It's simple to use and doesn't require the users to understand the complexities of version control. No idea if there's anything that can be hosted locally in case the company can't or would prefer not to put the data on Google's servers.
I had a similar arrangement for a medical practice using Subversion and Cornerstone but ran into the same issues mentioned in the parent, creating new weird names, forgetting to check-in their changes, etc. Given the docs were Word, Excel, and Powerpoint, merges between files aren't possible.
The only real solution was to park everything online, including the editing and version control. Removing the notion of a 'file' that had to be down/uploaded was the biggest thing to overcome but they soon adapted and having simultaneous edits while everyone is on Skype was a real win for them.
Google Docs, it's free and your collaborators don't even need Gmail accounts to contribute. Compared to the other offerings (Smartsheet, etc), the ability to add additional scripting behaviours puts it on a level above the rest. At that point you'll have to pay about $50/user/year which is quite reasonable.
If they are open to using Google Docs, it supports multiple simultaneous editing, and does versioning of files.
A lot of us are on soylent news. http://soylentnews.org/
Buck Feta!
It has an automatic versioning filesystem (Files-11)...
Far as I can tell there isn't really a 'modern' filesystem that does this. Because what you need is for no one to have to think about doing it. Save the file, done. w/ Files-11 it gets a version number appended and if it's important enough to recover I'm sure someone would manage to figure out how to dig up the older revision that they want.
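The "save the file, done" behaviour described above can be roughly imitated in a script. This is only a sketch of the idea, not how Files-11 itself works: the helper names `next_version` and `save_versioned` are made up for illustration, and the `name;N` suffix simply borrows the VMS look.

```python
import re
import shutil
import tempfile
from pathlib import Path


def next_version(path):
    """Return the next free 'name;N' path beside `path` (N starts at 1)."""
    path = Path(path)
    pattern = re.compile(re.escape(path.name) + r";(\d+)$")
    versions = [
        int(m.group(1))
        for p in path.parent.iterdir()
        if (m := pattern.match(p.name))
    ]
    return path.with_name(f"{path.name};{max(versions, default=0) + 1}")


def save_versioned(path, data):
    """Write `data` to `path`, then keep a numbered copy of what was saved."""
    path = Path(path)
    path.write_bytes(data)
    copy = next_version(path)
    shutil.copy2(path, copy)  # older numbered copies are never overwritten
    return copy


# Demo in a throwaway directory: two saves leave two numbered copies behind.
with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "report.doc"
    save_versioned(doc, b"first draft")
    save_versioned(doc, b"second draft")
    for p in sorted(Path(tmp).iterdir()):
        print(p.name)
```

The point is that nobody has to decide to make a version; the plain file name always holds the latest save and the numbered copies pile up behind it.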
Alfresco has a versioning capability: http://docs.alfresco.com/4.0/concepts/versioning.html
The greatest document version control solution will ultimately prove to be useless without considering the human, i.e. user, part of the solution. Unless you have clear procedures in place detailing how to maintain version control, teach people how to use the software, explain to them why version control is important (and yes that means you, Mr or Ms senior executive who doesn't have time or the need to follow procedures that are in place to prevent the last screwup you caused by ignoring them), and have someone who maintains the document library and keeps it in shape so it actually is easy to use, your solution will fail. Without that, people will download the latest, make edits, save a copy and upload the edited version. After a while they will simply edit the saved copy and, if you're lucky, upload it as a new document. Others will download a document, make edits, save a copy and send it out without ever checking the document back in so no one else can edit it; those people will find an older version and simply edit it.
I've been there and seen it done very poorly and very well; the key difference is that those who do it well have someone who knows how to make it work, can educate people and convince them why it is important, and actually makes it work. Those where it fails simply put in a technology solution and then wonder why it didn't work as they search for the next technology solution.
I'm a consultant - I convert gibberish into cash-flow.
There are dozens of document management and document version control systems, and many enterprise content management systems have document management as a component. The most well known is probably Microsoft SharePoint, but there are open source alternatives like LogicalDOC, OpenKM, Plone, Nuxeo, Alfresco, etc. as well as other commercial offerings like IBM Enterprise Content Management and others.
However, the technology won't replace poor training or users determined to do their own thing.
The road to tyranny has always been paved with claims of necessity.
or confluence, alfresco.. or most other CMSs.
You left out:
5.a Spend an inordinate amount of time explaining and defending your estimate to the point the CTO and CFO forget about the initial problem
Delete steps 6 - 9
I'm a consultant - I convert gibberish into cash-flow.
I've been using SubVersion since it was in beta and have used it at work and in private in multi-gigabyte projects. SubVersion was always rock-solid for us, and it handles binary files very well (which was the prime reason we decided to switch to SVN back in about 2003). Git is an excellent tool for us developers, but I feel it's way too complicated for non-technical people who don't need these bells and whistles.
The problem you have is a "process" problem. If everybody is editing documents all over the place at the same time on shared drives, you simply cannot avoid the *real* problem and that is a process one. CVS or RCS, or any other "version control system" cannot fix the process problem.
You need to think about why the "process" allows multiple people to be editing the same document at the same time. If you continue to allow this practice, your issue becomes a question of "how to merge" all this input back into ONE document. Unfortunately, Merging is pretty much *always* guaranteed to be a hard problem, especially when you are merging things that are complex in structure. Source code is bad enough, but you are dealing with stuff that most revision control systems just store as binary blobs and can usually only tell you that copy x is different than copy y, but not what the changes actually are.
So, your FIRST responsibility here is to solve the problem with your process that leads to multiple editors having the file open at once and pare that down to the minimum number of editors you can (hopefully ONE at a time) and then deal with the difficult merge task that's left. I'll warn you that you may need to enforce the process using file permissions, only giving limited people write access to the file on the share so only they can change it. Everybody else has to go through them.
THEN, you can implement just about ANY revision management system you want, or if your access controls are well enough established, just keep everything on a common share that everybody can read, but only by going through the process can they change things... If you *must* have revision management, go with something that can parse the internal changes of the files you store as much as possible. For Office documents, I would assume Microsoft has tools for that, beyond just SharePoint...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
One decidedly low-tech thing that can be done without any other changes is to have your users start saving documents with sortable times in the filenames, updated as to the time they are doing the save:
client1-document-20150217114003.doc
YYYYmmDDhhMMss
If that's done with a save-as, they get the previous version safety they seem to like just by using "save as" intelligently, and they get latest version sorted using just alpha sort, so it cuts down on the confusion factor.
It isn't much effort, but it's surprisingly effective.
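For anyone who would rather script the stamp than type it, here is a minimal sketch of the convention. The function names `timestamped_name` and `latest` are invented for this example, and `latest` only works when the files share the same name prefix, as in the example above.

```python
from datetime import datetime
from pathlib import Path


def timestamped_name(path, now=None):
    """Return a sibling path whose stem ends in a sortable date-time stamp."""
    path = Path(path)
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d%H%M%S")  # year, month, day, hour, minute, second
    return path.with_name(f"{path.stem}-{stamp}{path.suffix}")


def latest(paths):
    """Pick the newest copy using nothing but an alphabetical comparison."""
    return max(paths, key=lambda p: Path(p).name)


name = timestamped_name("client1-document.doc", datetime(2015, 2, 17, 11, 40, 3))
print(name.name)  # client1-document-20150217114003.doc

copies = [
    "client1-document-20150217114003.doc",
    "client1-document-20150301090000.doc",
    "client1-document-20141231235959.doc",
]
print(latest(copies))  # client1-document-20150301090000.doc
```

Because the stamp runs from the largest unit to the smallest, sorting by name and sorting by time give the same order.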
I've fallen off your lawn, and I can't get up.
Which is not a version control system at all. Time Machine does periodic backup once per hour, and only once per hour. Not on each save. Not multi-user. No mod points for you.
Sig?
No. You are wrong.
Git handles text files well.
Git handles infrequently changing binaries well enough.
Git handles frequently changing binaries very very very poorly.
Plus, with either svn or git, you are basically asking each user of the system to download all revisions of all files ever used on your system. That makes perfect sense for a codebase. That makes terrible terrible sense for business documents in a shared pool.
Git and SVN are different products. SVN is centralized and git is distributed. If you want to create a centralized repository and only allow people to have access to certain parts of it, SVN is a much better fit for that workflow. Neither allows the user to browse the document repository without first checking it out. Well, they both have web interfaces, but those don't support a good editing workflow.
Now you've got into rant-mode, sorry. I really hope non-technical people are never forced to actually type in commands but use a GUI instead, no matter which VCS they use. But especially with Git. I think Git is a very powerful tool and have come to like it for its features, but I still hate it for its commands and what I feel are inconsistencies and "fuck how other VCS are naming it, we use something different".
For example, discard changes on a single file: "git reset foo.bar". Discard changes on all files: "git checkout -- .". WTF? Just a few days ago, I wanted Git to give me the diff of a specific commit, the equivalent to "svn diff -c revision" or "hg diff -c revision". In git? "git diff revision^ revision" or "git diff revision^!" (which I overread when I was reading the man page and needed to look it up on Ye Olde Interweb). Or "git diff-tree -p revision" or "git whatchanged -m -n 1 -p revision" since why not? And "git add" both adds a new file to the repository but also picks a modified file to be included in the next commit (but only the parts that have not changed between add and commit). The add behaviour does make sense when you think "from the inside" of the VCS, but I was confused at first and I'm a technical guy. Normal people will have trouble with this stuff. Seriously, I've been using various VCS in the last two decades and still am doing a lot in the shell, especially VCS stuff since I feel more in control this way. But Git is the first VCS that I use almost exclusively in a GUI because its CLI is too cumbersome.
I know that Microsoft products aren't hipster and all, but the OP mentioned Word and Excel documents. SharePoint supports version control. I don't know how well it works for scanned images, but for documents and spreadsheets it works just fine.
Love sees no species.
I've worked on a SubVersion project for several years where the smallest useful checkout was 5GB (it was an in-house Linux distribution I've built and maintained). On a local network, SubVersion works pretty well for these things but you're right, I wouldn't want to do this over a poor Internet connection. It's pretty space efficient with binary files and handles things like copies and renames very well, so if you need to deal with them a lot then SubVersion is a good choice. Git and SubVersion work very differently, each have features the other doesn't have, by design. Believe it or not, SubVersion was also *designed* for large projects, but different use-cases. I really, really wouldn't want to maintain my distribution with Git. Now that I'm a "normal" developer again, we're using Mercurial and Git since they're better suited for these tasks, handling source/text files with lots of branching and merging.
http://www.opentext.com/what-w...
Disclaimer: my cousin works there.
- Michael T. Babcock (Yes, I blog)
ownCloud(https://owncloud.org/) supports versioning and will automatically sync changes. It's easy to set up on your own server.
You can set up Apache to serve files over WebDAV. WebDAV is mountable as a network FS on Windows, OSX and Linux. Apache can store the webdav files in an SVN repository, so you get file versioning built into the mounted filesystem that is completely transparent to the user.
You can also set up Apache to allow normal browsing of the SVN repo, so you can browse it online without mounting and also access old versions.
So basically you get transparently versioned files. Native read/write access. Access to old versions via a web browser. No tools required on the clients for it to work.
Also all free and open source and the data is not stored in an obnoxious format that is opaque: it's a regular SVN repo and works just as well with commandline tools.
SJW n. One who posts facts.
Configuration control is all about the methodology, and not about using a particular tool. It is possible to have great configuration control without using any software tool, and it is also possible to have no configuration control while using a software tool.
The simplest solution in the above case is to put into place configuration control procedures while not using any software tool.
Probably people will downvote me for this, but this exact scenario is why SharePoint exists. It's specifically to help non-technical users post, share and have version control for their office documents.
It integrates with Microsoft Office, so Word etc. simply presents a 'check out' button on the top, and asks you to 'check in' if you press the 'x' and try to leave, and you can add comments.
Don't know why this wasn't considered?
Git has the "stage", right. The stage is just the next commit. It's a little hidden filesystem (git tree, actually) that's already processed and ready to be attached to a commit message once you run `git commit`.
For example, discard changes on a single file: "git reset foo.bar". Discard changes on all files: "git checkout -- .". WTF?
`git checkout` is about your working directory. Use "git checkout -- foo.bar" if you made a modification and you don't want to commit it, just erase it. Or better yet, `git checkout -p`
`git reset` is about unstaging changes, it doesn't touch the filesystem. (It also has `git reset -p`)
Just a few days ago, I wanted Git to give me the diff of specific commit, the equivalent to "svn diff -c revision" or "hg diff -c revision". In git? "git diff revision^ revision" or "git diff revision^!" (which I overread when I was reading the man page and needed to look it up on Ye Olde Interweb). Or "git diff-tree -p revision" or "git whatchanged -m -n 1 -p revision" since why not?
You want to see the changes that one commit introduced, so of course you ask git: "What were the changes from parent-of-'$revision' through '$revision'?"
You're probably looking for `git show $revision`
And "git add" both adds a new file to the repository but also picks a modified file to be included in the next commit (but only the parts that have not changed between add and commit.
`git add` copies a file from the working tree to the stage (index). What happens when you use `cp` and the target file doesn't exist? It gets created. (Since you can't copy a nonexistent file, there's also `git rm` to remove files from the stage.)
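If it helps to see that as plain copying, here is a toy model of the three places a file's content can live. It is a deliberate simplification and not how git is implemented: `ToyRepo` and its methods are invented for this illustration and only mirror the single-file forms of the commands discussed above.

```python
class ToyRepo:
    """Three snapshots of file contents, each keyed by path."""

    def __init__(self):
        self.head = {}      # contents recorded by the last commit
        self.index = {}     # the stage: what the next commit will contain
        self.worktree = {}  # the files as they currently sit on disk

    def add(self, path):
        # like `git add path`: copy working tree -> stage
        self.index[path] = self.worktree[path]

    def reset(self, path):
        # like `git reset path`: copy last commit -> stage, disk untouched
        if path in self.head:
            self.index[path] = self.head[path]
        else:
            self.index.pop(path, None)

    def checkout(self, path):
        # like `git checkout -- path`: copy stage -> working tree
        self.worktree[path] = self.index[path]

    def commit(self):
        # the new commit is exactly what was staged
        self.head = dict(self.index)


repo = ToyRepo()
repo.worktree["foo.bar"] = "v1"
repo.add("foo.bar")
repo.commit()

repo.worktree["foo.bar"] = "v2"
repo.add("foo.bar")               # the stage now holds v2
repo.worktree["foo.bar"] = "v3"   # edited again after the add
print(repo.index["foo.bar"])      # still the content as it was at add time

repo.reset("foo.bar")             # unstage: the stage goes back to the commit
print(repo.worktree["foo.bar"])   # the file on disk is not touched

repo.checkout("foo.bar")          # overwrite the file on disk from the stage
print(repo.worktree["foo.bar"])
```

Seen this way, the complaint about `git add` only picking up changes made before the add is just the stage being a copy taken at that moment.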
Wonder what the public key field is for?
My experience with SharePoint: A) it does not protect against multiple check-outs followed by multiple check-ins erasing other people's changes. Basically there's no detection of collisions between your changes and changes since you checked out. This caused a lot of grief in my work group. B) The versioning is strictly linear, at least I never saw any branching. That is very unlikely to address business needs. So you will need a naming scheme to represent branches.
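For what it's worth, the collision check I was missing in A) is simple to describe. The sketch below is purely hypothetical and has nothing to do with SharePoint's actual API; `VersionedDoc` and `StaleCheckIn` are names invented here to show the idea of refusing a check-in that was based on an outdated version.

```python
class StaleCheckIn(Exception):
    """Raised when a check-in is based on a version that is no longer current."""


class VersionedDoc:
    def __init__(self, content):
        self.versions = [content]  # strictly linear history, oldest first

    def check_out(self):
        """Return (base version number, current content)."""
        return len(self.versions), self.versions[-1]

    def check_in(self, base, content):
        """Store `content` only if nobody checked in since `base` was taken."""
        if base != len(self.versions):
            raise StaleCheckIn(
                f"based on version {base}, but current is {len(self.versions)}"
            )
        self.versions.append(content)
        return len(self.versions)


doc = VersionedDoc("original text")
alice_base, _ = doc.check_out()
bob_base, _ = doc.check_out()

doc.check_in(alice_base, "Alice's edit")
try:
    doc.check_in(bob_base, "Bob's edit")  # would silently erase Alice's change
except StaleCheckIn as err:
    print("rejected:", err)
```

With a check like this the second person at least finds out there is a conflict instead of overwriting the first person's work.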
" basic versioning"
The free Notepad++ can make a backup of every save, with date and time in the .BAK file name. The .BAK files can be in a special folder.
My wife and I use MediaWiki! Seems kinda silly - but you can configure it to accept all kinds of file types - and you have all of the nice stuff like discussion pages and categories to help you to organize them.
The huge advantage is that it's insanely easy to use. Super-light on features also...but, hey...it's a thought, right?
-- Steve
www.sjbaker.org
1 - Switch to Office 365 or Google Docs in which revisions are a built-in feature of document editing
2 - Enable Office's built-in version tracking
3 - Move all document storage into a CMS like Sharepoint (which has good Office integration at least on Windows) or BaseCamp, Jive, Confluence - any system that allows for online editing and has revision tracking built-in
Any other ideas, skip. Anything having to do with a source-code like version control system will result in people "committing" but duplicating files over and over in the old pattern.
I'm out of my mind right now, but feel free to leave a message.....
My Gig currently is with a classic marketing agency. Very nice folks - a breath of fresh air when it comes to my history with agencies - but breathtakingly clueless with IT - as usual in this industry. I'm basically the only IT/dev guy in a shop of 30. Has its ups and downs. ... Whatever.
They asked me on board as a webdev, to establish a pipeline and introduce versioning. I'm using Git on a VMed central Linux system and SourceTree as client. Our outside SSH port is mapped to that VM, so the people on a project can commit docs or code on the go.
Sidenote: I wouldn't use anything other than Git, it's just not worth it. Git has won the versioning thing. End of story. ... Bazaar might be an alternative, if you need the same click-ui on windows, mac *and* linux, but that is probably a very rare case.
As a client we use SourceTree on both Mac and Windows, so all UIs look more or less the same. No Tortoise, for that exact reason! I show them where to click to see the entire file-tree as in finder or explorer, so nobody is confused and explain the difference between a commit and a push. In a pinch, the windows and mac folks can help each other out if I'm not around, since they’re all using SourceTree. And it keeps this "Versioning" thing nice and secluded. That's also a reason.
I want to get them to use versioning, so I tell them the #1 obstacle is always fear of using it. I tell them not to worry, it's practically impossible to break anything (one of the advantages of Git). I tell them to version often and comment their commits, even if it's just smalltalk. The point is getting used to commenting. We don't use branches, just master. I also tell them to try and logically group commits, but not kill themselves if it goes wrong. It happens - with me as well. No harm done.
Once everyone is pro in versioning, we might change the branching policy.
As for all the other buttons in SourceTree, I just tell them to ignore them and that they are for later. I do tell them the meaning of "Stash" and how nifty that is when you've forgotten to pull before starting your work, but only those who need and want to know. ... As soon as they get a pull conflict, they usually do want to know, so no problem here.
I've established a naming-standard with ProjectFolderName/git-repo for local clones, so everyone has a space where they can fiddle for the project without needing to immediately version if they just want to try out a new tool or salvage an older Photoshop template or something. Project docs go into /docs, developer stuff goes into /code (mostly complete WordPress installs or some other thing), DB dumps into /db, graphics, layout, DTP files and videos and other raw material usually go into /assets, etc. ... You get the picture.
We're/I'm not too strict with dir-policy and let it grow a little too. No project is like another.
Important:
I put my agency behind versioning, because right now it's Filename-02122014-final-extra-specialEdit-Peter.doc on a central drive and shit. Especially with the editorial team. Not good. I did a neat presentation and help everyone who comes into versioning to get familiar with the concept: installing SourceTree, doing a few demo commits, having them do it, showing them the red numbers, looking at the history log and file-changes and stuff.
A few months in and the online team is starting to get used to versioning on some projects. Once everyone there is on board we’ll move into other departments. My PM for one large online project is using versioning regularly now, as are the students helping out. That the bosses are behind all this helps.
Sidenote: More than half of the team is ladies, as is my PM, btw.
I tell everyone that they can ask me everything a million times and call me at 2 o’clock in the morning if it’s a versioning problem and they need my advice or some handholding. Very import
We suffer more in our imagination than in reality. - Seneca
This is NOT a technical issue that new software will solve. It is a training or management issue. If people don't understand how to use version control, they will use it like a file share instead. I've encountered this MANY times, and right now I'm struggling with the idiots (actual software developers) that are using dead-simple SubVersion tools and STILL want to make copies for new versions, create new folders for the "current" docs and rename folders as archives. Constantly. And these are supposed to be DEVELOPERS! They seem to have no concept of tagging, branching, or even versioning in general. WHY did you delete all these files and then commit a bunch of modified files into a new folder!??!?
The only way to fix this is to create some policy and procedure documents (they can be really short and simple), and then get management to ENFORCE them. Otherwise, you might as well just throw out the version control system and let everybody do whatever they want in a shared store. Because that's what they'll do anyway if they don't "get" version control.
"Somebody has to do something. It's just incredibly pathetic it has to be us."
--- Jerry Garcia
I'd love to see you explain all this to an average office lady :-)