Slashdot Mirror


Ask Slashdot: Version Control For Non-Developers?

occamboy writes: My spouse works at a company that deals with lots of documents (Word, spreadsheets, scans, and so forth), and they have a classic version control problem that sucks up hours of her time each week. Documents are stored on a shared server in some sort of hierarchy, but there are all kinds of problems, e.g. multiple copies get saved with slightly different names because people are afraid of overwriting the old version 'just in case' and nobody can figure out which is the latest version, or which got sent out to a client, etc.

Version control should help, and my first thought was to use SVN with TortoiseSVN, but I'm wondering if there's something even simpler that they could use? Do the Slashdotteratti have any experiences or thoughts that they could share? The ideal solution would also make it easy to text search the document tree.

33 of 343 comments

  1. Business problem != technology problem by Maxwell · · Score: 5, Insightful
    Throwing more technology on the pile won't help without a lot of user education, and if you had that you would not need the technology anyway...

    1) Create a rational naming convention and use that.

    Or

    2) Use SharePoint's built-in versioning system (the base version is free as in beer). That is what it is designed for and is one of the few things that SP does well.

    1. Re:Business problem != technology problem by khasim · · Score: 4, Informative

      For other types of documents, it's a matter of defining a process and naming convention on how to keep track of items.

      Seconded. It's also easier (in my experience) to get non-tech people to understand a naming standard than it is to get them to learn a new app.

      You do NOT want to be the one who has to help everyone find their "lost" documents that NEED TO BE SENT RIGHT NOW IT IS A CRISIS WE WILL LOSE THIS ACCOUNT AND IT WILL BE YOUR PROBLEM OF COURSE I CHECKED IT IN YOUR APP LOST THEM.

    2. Re:Business problem != technology problem by grcumb · · Score: 3, Interesting

      Throwing more technology on the pile won't help without a lot of user education, and if you had that you would not need the technology anyway...

      1) Create a rational naming convention and use that.

      Go no further than this. I've worked in office environments where we had dozens of editors and sub-editors proofing and editing tens of thousands of legal documents (legislation, judicial decisions and regulation), where even a single character out of place was unacceptable. After years of trial and error, the single most foolproof way of working with these documents was using the file system to define where they were in the editing process, and using filenames to indicate their status and ownership.

      It's primitively simple. But simple is an abundantly good thing in this context. Make some basic rules. Enforce them. Bob's your uncle.

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
  2. Take a look at Owl by LWATCDR · · Score: 3, Interesting

    http://www.doxbox.ca/
    It is a document management system

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  3. Business problem != technology problem by rgbe · · Score: 5, Insightful

    I agree it's a business problem. MS Office has some pretty good versioning support built into it and multiple people can edit a document at the same time, if you know how to set it up. There should be a technical person in your wife's company who understands how MS Office and other tools work. They should train the staff on the capabilities and the staff should come up with a process that works for everyone.

    With SharePoint you can have MS Office documents versioned; it is basic versioning, not like git where you can have branches and things like that. For other types of documents, it's a matter of defining a process and naming convention on how to keep track of items.

  4. Document Management System by kdekorte · · Score: 5, Insightful

    What you are looking for is a Document Management System, something like Documentum or FileNet, which are built for this specific purpose and include additional features like workflow and extra attributes that you can add to the content to make it easier to find. Web Content Management systems are not the same thing and will not work the way you want them to, so make sure you look at all the options out there.

  5. Pick an easy solution by hhawk · · Score: 4, Interesting

    I would recommend Google Docs, assuming there isn't any crazy formatting involved.

    #1) It is a single document, so you don't have to worry about the naming of it.
    #2) Google Docs has built-in version control, in that you can roll back to an earlier version of the document, and you can see who is editing, changing, etc. (assuming everyone has their own password).

    It's low tech, easy to use, and the only education is to keep on using the same file name.

    --
    http://www.hawknest.com/
    1. Re:Pick an easy solution by uncqual · · Score: 3, Insightful

      Some businesses are not comfortable putting their documents in the hands of another party due to security concerns. Some also are hesitant to rely on a service that may go away with relatively short notice.

      Google Docs would require additional training as well if they are already using Word/Excel and legacy documents would need to be maintained somewhere.

      Google Docs does not import a lot of Word and Excel documents adequately. I've rarely had it import a Word document with sufficient fidelity that I didn't find it necessary to at least touch it up. With Excel documents, I almost always have to do a lot more than "touch up" work to make it whole again. Therefore, it's likely switching to Google Docs would require a lot of effort if some of these documents are "living" documents that change from time to time.

      --
      Why is there an "insightful" mod and why isn't it "-1"? If I wanted insight, I wouldn't be reading /.
  6. OpenVMS by frooddude · · Score: 3, Funny

    It has an automatic versioning filesystem (Files-11)...

    Far as I can tell there isn't really a 'modern' filesystem that does this. Because what you need is for no one to have to think about doing it. Save the file, done. w/ Files-11 it gets a version number appended and if it's important enough to recover I'm sure someone would manage to figure out how to dig up the older revision that they want.

  7. Alfresco by Anonymous Coward · · Score: 3, Informative

    Alfresco has a versioning capability: http://docs.alfresco.com/4.0/concepts/versioning.html

  8. Don't forget the people side of the equation by Registered+Coward+v2 · · Score: 4, Insightful

    The greatest document version control solution will ultimately prove to be useless without considering the human, i.e. user, part of the solution. Unless you have clear procedures in place detailing how to maintain version control, teach people how to use the software, explain to them why version control is important (and yes that means you, Mr or Ms senior executive who doesn't have time or the need to follow procedures that are in place to prevent the last screwup you caused by ignoring them), and have someone who maintains the document library and keeps it in shape so it actually is easy to use, your solution will fail. Without that, people will download the latest, make edits, save a copy and upload the edited version. After a while they will simply edit the saved copy and, if you're lucky, upload it as a new document. Others will download a document, make edits, save a copy and send it out without ever checking the document back in so no one else can edit it; those people will find an older version and simply edit it.

    I've been there and seen it done very poorly and very well; the key difference is those who do it well have someone who knows how to make it work, can educate people and convince them why it is important, and actually makes it work. Those where it fails simply put in a technology solution and then wonder why it didn't work as they search for the next technology solution.

    --
    I'm a consultant - I convert gibberish into cash-flow.
  9. Document Version Control by Bacon+Bits · · Score: 4, Informative

    There are dozens of document management and document version control systems, and many enterprise content management systems have document management as a component. The most well known is probably Microsoft SharePoint, but there are open source alternatives like LogicalDOC, OpenKM, Plone, Nuxeo, Alfresco, etc. as well as other commercial offerings like IBM Enterprise Content Management and others.

    However, the technology won't replace poor training or users determined to do their own thing.

    --
    The road to tyranny has always been paved with claims of necessity.
  10. Re:Use GIT by DarkDust · · Score: 4, Informative

    I've been using SubVersion since it was in beta and have used it at work and in private in multi-gigabyte projects. SubVersion was always rock-solid for us, and it handles binary files very well (which was the prime reason we decided to switch to SVN back then in about 2003). Git is an excellent tool for us developers, but I feel it's way too complicated for non-technical people who don't need these bells and whistles.

  11. You are asking the wrong question... by bobbied · · Score: 5, Insightful

    The problem you have is a "process" problem. If everybody is editing documents all over the place at the same time on shared drives, you simply cannot avoid the *real* problem and that is a process one. CVS or RCS, or any other "version control system" cannot fix the process problem.

    You need to think about why the "process" allows multiple people to be editing the same document at the same time. If you continue to allow this practice, your issue becomes a question of "how to merge" all this input back into ONE document. Unfortunately, merging is pretty much *always* guaranteed to be a hard problem, especially when you are merging things that are complex in structure. Source code is bad enough, but you are dealing with stuff that most revision control systems just store as binary blobs and can usually only tell you that copy x is different than copy y, but not what the changes actually are.
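
    To illustrate the "binary blob" point, here is a minimal sketch (not any particular tool's method) of what a byte-level comparison can and cannot say. The helper names `file_digest` and `same_bytes` are made up for this example; the only thing it can report is whether two copies are identical, not what changed between them.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large scans or spreadsheets are not loaded at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(a: Path, b: Path) -> bool:
    """True only if both files are byte-for-byte identical.

    A False result says the copies differ, but nothing about where or how.
    """
    return file_digest(a) == file_digest(b)
```

    Something like this is enough to spot exact duplicates saved under different names, but it says nothing useful once two people have each changed their own copy.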

    So, your FIRST responsibility here is to solve the problem with your process that leads to multiple editors having the file open at once and pare that down to the minimum number of editors you can (hopefully ONE at a time) and then deal with the difficult merge task that's left. I'll warn you that you may need to enforce the process using file permissions, only giving limited people write access to the file on the share so only they can change it. Everybody else has to go through them.

    THEN, you can implement just about ANY revision management system you want, or if your access controls are well enough established, just keep everything on a common share that everybody can read, but only by going through the process can they change things... If you *must* have revision management, go with something that can parse the internal changes of the files you store as much as possible. For Office documents, I would assume Microsoft has tools for that, beyond just SharePoint...

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  12. Re:I'd avoid Subversion by AikonMGB · · Score: 4, Informative

    I'd avoid SVN for anything that isn't a flat text file, otherwise it becomes a pain to merge or determine what the actual difference between two files is. I'm not aware of anything that will make viewing diffs for Word documents human readable.

    TortoiseSVN already does this. It uses the hooks in Office to create what is basically a "track-changes" copy, where the previous version is the base and the new version is what you get if you accept all changes. This is about as good as it gets for diffing Word files, and flows logically with how they were intended to be used in businesses anyway. It will do the same for Excel, but it's... a monster that should never be allowed to live.

  13. Re:perforce by monkeyzoo · · Score: 3, Insightful

    LOL. I was going to say Perforce too. But as a joke!
    I'm sure these teachers will love the process for creating "changesets" before they can check in any documents. Perforce is awesome, but not really for laymen.

  14. This can help by fyngyrz · · Score: 4, Informative

    One decidedly low-tech thing that can be done without any other changes is to have your users start saving documents with sortable times in the filenames, updated as to the time they are doing the save:

    client1-document-20150217114003.doc

    YYYYMMDDhhmmss

    If that's done with a save-as, they get the previous version safety they seem to like just by using "save as" intelligently, and they get latest version sorted using just alpha sort, so it cuts down on the confusion factor.

    It isn't much effort, but it's surprisingly effective.
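
    If typing the stamp by hand is a worry, a small helper can build the name. This is only a sketch under assumptions: the function name `stamped_name` is hypothetical, and it simply appends the current time in the format above to whatever base filename it is given.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def stamped_name(original: str, now: Optional[datetime] = None) -> str:
    """Return 'name-YYYYMMDDhhmmss.ext' for a given 'name.ext'.

    `now` is injectable so the result is predictable in tests;
    by default the current local time is used.
    """
    now = now or datetime.now()
    p = Path(original)
    return f"{p.stem}-{now.strftime('%Y%m%d%H%M%S')}{p.suffix}"
```

    For example, `stamped_name("client1-document.doc", datetime(2015, 2, 17, 11, 40, 3))` gives `client1-document-20150217114003.doc`, matching the example filename above.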

    --
    I've fallen off your lawn, and I can't get up.
    1. Re:This can help by Anonymous Coward · · Score: 3, Insightful

      This is a terrible idea.

      a) In practice people will make typos in the 14-char datetime string, miss leading zeroes, etc., resulting in a mess of similarly named files in the folder

      b) even if miraculously this format was followed rigorously by the users for every file, you'd still have people forgetting to sort the directory files by time (or thinking it is sorted when it isn't) and opening the wrong file etc

      Timestamping files with a 14 char string is the kind of thing computers are good at and people are not.
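
      As a rough sketch of letting the computer do that part, the snippet below parses the 14-digit stamp out of each filename, ignores names where the stamp is malformed, and picks the newest. The names `parse_stamp` and `latest` and the exact pattern (a hyphen, 14 digits, then the extension) are assumptions made for this example, not something from the parent post.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed convention: "<anything>-YYYYMMDDhhmmss.<ext>"
STAMP = re.compile(r"-(\d{14})\.[^.]+$")


def parse_stamp(name: str) -> Optional[datetime]:
    """Return the timestamp embedded in a filename, or None if it is missing or invalid."""
    m = STAMP.search(name)
    if not m:
        return None
    try:
        return datetime.strptime(m.group(1), "%Y%m%d%H%M%S")
    except ValueError:
        # 14 digits, but not a real date/time (e.g. month 13).
        return None


def latest(names: Iterable[str]) -> Optional[str]:
    """Return the filename with the newest valid timestamp, or None if there is none."""
    valid = [(ts, n) for n in names if (ts := parse_stamp(n)) is not None]
    return max(valid)[1] if valid else None
```

      Names with a dropped digit or an impossible date are skipped rather than silently sorted into the wrong place, which is the failure mode described in (a).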

    2. Re:This can help by FatdogHaiku · · Score: 3, Insightful

      ...another radical alternative is google docs. yes, sheesh, but better than office.

      My fear is that Google docs has attained the level of usability and popularity that often precedes a Google project, service, or feature being shut down...

      --
      You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
  15. Re:I'd avoid Subversion by StatureOfLiberty · · Score: 3, Interesting

    If you are using Windows PCs and using the TortoiseSVN client, you are able to diff Word docs just fine. The diff view is displayed in Word itself.

  16. Re:Backup? by pahles · · Score: 4, Informative

    Which is not a version control system at all. Time Machine does periodic backup once per hour, and only once per hour. Not on each save. Not multi-user. No mod points for you.

    --
    Sig?
  17. Re:Use GIT by Jaime2 · · Score: 3, Insightful

    Git and SVN are different products. SVN is centralized and git is distributed. If you want to create a centralized repository and only allow people to have access to certain parts of it, SVN is a much better fit for that workflow. Neither allows the user to browse the document repository without first checking it out. Well, they both have web interfaces, but those don't support a good editing workflow.

  18. Re:Use GIT by DarkDust · · Score: 5, Interesting

    Now you've got me into rant mode, sorry. I really hope non-technical people are never forced to actually type in commands but use a GUI instead, no matter which VCS they use. But especially with Git. I think Git is a very powerful tool and have come to like it for its features, but I still hate it for its commands and what I feel are inconsistencies and "fuck how other VCS are naming it, we use something different".

    For example, discard changes on a single file: "git reset foo.bar". Discard changes on all files: "git checkout -- .". WTF? Just a few days ago, I wanted Git to give me the diff of a specific commit, the equivalent to "svn diff -c revision" or "hg diff -c revision". In git? "git diff revision^ revision" or "git diff revision^!" (which I overread when I was reading the man page and needed to look it up on Ye Olde Interweb). Or "git diff-tree -p revision" or "git whatchanged -m -n 1 -p revision" since why not? And "git add" both adds a new file to the repository but also picks a modified file to be included in the next commit (but only the parts that have not changed between add and commit). The add behaviour does make sense when you think "from the inside" of the VCS, but I was confused at first and I'm a technical guy. Normal people will have trouble with this stuff. Seriously, I've been using various VCS in the last two decades and still am doing a lot in the shell, especially VCS stuff since I feel more in control this way. But Git is the first VCS that I use almost exclusively in a GUI because its CLI is too cumbersome.

  19. Microsoft SharePoint by KermodeBear · · Score: 3, Informative

    I know that Microsoft products aren't hipster and all, but the OP mentioned Word and Excel documents. SharePoint supports version control. I don't know how well it works for scanned images, but for documents and spreadsheets it works just fine.

    --
    Love sees no species.
  20. Re:DO NOT Use GIT by DarkDust · · Score: 4, Interesting

    With SubVersion, you can check out subtrees instead of the whole repository (even non-recursively, so you can check out a directory "in the middle"). That's something that Git or Mercurial can't do by design; IIRC it's because the always-complete-repository approach makes merging and other tasks much, much easier. In your SVN working copy, only the data of the commit you've checked out is stored. For everything else SVN needs to contact the server, which, depending on the requirements and workflow, is either a good or bad thing. On the other hand, Git and Mercurial do have the complete history locally, which allows them to perform a lot of tasks without contacting a server that SubVersion could not do (simple example: get log history of a file).

    But it's actually beside the point: all of these things won't matter to an office user. Ease-of-use and chances-to-screw-up do.

  21. Re:perforce by malacandrian · · Score: 3, Informative

    Won't merge? Word has built-in merge and diff tools https://support.microsoft.com/...

  22. Re:Use GIT by DarkDust · · Score: 3, Interesting

    I've worked on a SubVersion project for several years where the smallest useful checkout was 5GB (it was an in-house Linux distribution I've built and maintained). On a local network, SubVersion works pretty well for these things but you're right, I wouldn't want to do this over a poor Internet connection. It's pretty space efficient with binary files and handles things like copies and renames very well, so if you need to deal with them a lot then SubVersion is a good choice. Git and SubVersion work very differently; each has features the other doesn't have, by design. Believe it or not, SubVersion was also *designed* for large projects, but different use-cases. I really, really wouldn't want to maintain my distribution with Git. Now that I'm a "normal" developer again, we're using Mercurial and Git since they're better suited for these tasks, handling source/text files with lots of branching and merging.

  23. I used this and it works. by serviscope_minor · · Score: 4, Interesting

    You can set up Apache to serve files over WebDAV. WebDAV is mountable as a network FS on Windows, OSX and Linux. Apache can store the webdav files in an SVN repository, so you get file versioning built into the mounted filesystem that is completely transparent to the user.

    You can also set up Apache to allow normal browsing of the SVN repo, so you can browse it online without mounting and also access old versions.

    So basically you get transparently versioned files. Native read/write access. Access to old versions via a web browser. No tools required on the clients for it to work.

    Also all free and open source and the data is not stored in an obnoxious format that is opaque: it's a regular SVN repo and works just as well with commandline tools.

    --
    SJW n. One who posts facts.
  24. Re:Use GIT by diamondmagic · · Score: 3, Interesting

    Git has the "stage", right. The stage is just the next commit. It's a little hidden filesystem (git tree, actually) that's already processed and ready to be attached to a commit message once you run `git commit`.

    For example, discard changes on a single file: "git reset foo.bar". Discard changes on all files: "git checkout -- .". WTF?

    `git checkout` is about your working directory. Use "git checkout -- foo.bar" if you made a modification and you don't want to commit it, just erase it. Or better yet, `git checkout -p`

    `git reset` is about unstaging changes, it doesn't touch the filesystem. (It also has `git reset -p`)

    Just a few days ago, I wanted Git to give me the diff of a specific commit, the equivalent to "svn diff -c revision" or "hg diff -c revision". In git? "git diff revision^ revision" or "git diff revision^!" (which I overread when I was reading the man page and needed to look it up on Ye Olde Interweb). Or "git diff-tree -p revision" or "git whatchanged -m -n 1 -p revision" since why not?

    You want to see the changes that one commit introduced, so of course you ask git: "What were the changes from parent-of-'$revision' through '$revision'?"

    You're probably looking for `git show $revision`

    And "git add" both adds a new file to the repository but also picks a modified file to be included in the next commit (but only the parts that have not changed between add and commit).

    `git add` copies a file from the working tree to the stage (index). What happens when you use `cp` and the target file doesn't exist? It gets created. (Since you can't copy a nonexistent file, there's also `git rm` to remove files from the stage.)

  25. MediaWiki. by sbaker · · Score: 4, Interesting

    My wife and I use MediaWiki! Seems kinda silly - but you can configure it to accept all kinds of file types - and you have all of the nice stuff like discussion pages and categories to help you to organize them.

    The huge advantage is that it's insanely easy to use. Super-light on features also...but, hey...it's a thought, right?

        -- Steve

    --
    www.sjbaker.org
  26. Been there, done that. Here's how: by Qbertino · · Score: 4, Interesting

    My Gig currently is with a classic marketing agency. Very nice folks - a breath of fresh air when it comes to my history with agencies - but breathtakingly clueless with IT - as usual in this industry. I'm basically the only IT/dev guy in a shop of 30. Has its ups and downs. ... Whatever.

    They asked me on board as a webdev, to establish a pipeline and introduce versioning. I'm using Git on a VMed central linux system and SourceTree as client. Our outside SSH port is mapped to that VM, so the people on a project can commit docs or code on the go.

    Sidenote: I wouldn't use anything other than Git, it's just not worth it. Git has won the versioning thing. End of story. ... Bazaar might be an alternative, if you need the same click-ui on windows, mac *and* linux, but that is probably a very rare case.

    As a client we use SourceTree on both Mac and Windows, so all UIs look more or less the same. No Tortoise, for that exact reason! I show them where to click to see the entire file-tree as in finder or explorer, so nobody is confused and explain the difference between a commit and a push. In a pinch, the windows and mac folks can help each other out if I'm not around, since they’re all using SourceTree. And it keeps this "Versioning" thing nice and secluded. That's also a reason.

    I want to get them to use versioning, so I tell them the #1 obstacle is always fear of using it. I tell them not to worry, it's practically impossible to break anything (one of the advantages of Git). I tell them to version often and comment their commits, even if it's just smalltalk. The point is getting used to commenting. We don't use branches, just master. I also tell them to try and logically group commits, but not kill themselves if it goes wrong. It happens - with me as well. No harm done.

    Once everyone is pro in versioning, we might change the branching policy.

    As for all the other buttons in SourceTree, I just tell them to ignore them and that they are for later. I do tell them the meaning of "Stash" and how nifty that is when you've forgotten to pull before starting your work, but only those who need and want to know. ... As soon as they get a pull conflict, they usually do want to know, so no problem here.

    I've established a naming-standard with ProjectFolderName/git-repo for local clones, so everyone has a space where they can fiddle for the project without needing to immediately version if they just want to try out a new tool or salvage an older Photoshop template or something. Project docs go into /docs, developer stuff goes into /code (mostly complete wordpress installs or some other thing), DB dumps into /db, graphics, layout, DTP files and videos and other raw material usually goes into /assets, etc. ... You get the picture.
    We're/I'm not too strict with dir-policy and let it grow a little too. No project is like another.

    Important:
    I put my agency behind versioning, because right now it's Filename-02122014-final-extra-specialEdit-Peter.doc on a central drive and shit. Especially with the editorial team. Not good. I did a neat presentation and help everyone who comes into versioning to get familiar with the concept. Installing SourceTree, doing a few demo commits, have them do it, show them the red numbers, looking at the history log and file-changes and stuff.

    A few months in and the online team is starting to get used to versioning on some projects. Once everyone there is on board we’ll move into other departments. My PM for one large online project is using versioning regularly now, as are the students helping out. That the bosses are behind all this helps.

    Sidenote: More than half of the team is ladies, as is my PM, btw.

    I tell everyone that they can ask me everything a million times and call me at 2 o’clock in the morning if it’s a versioning problem and they need my advice or some handholding. Very import

    --
    We suffer more in our imagination than in reality. - Seneca
  27. Re:Use GIT by DarkDust · · Score: 3, Interesting

    I'd love to see you explain all this to an average office lady :-)

  28. Re:perforce by monkeyzoo · · Score: 3, Informative

    As someone who trained people on my team in the video game industry in how to use Perforce who were already familiar with version control concepts, I would reiterate that I don't see any of the above as viable solutions for this bloke. It's exactly the point that all of these tools are going to require non-trivial training, and do you think this guy is going to be able to tell his wife... "Hey! I asked Slashdot, and they recommended Perforce (or git or SVN or CVS or VCS or PVCS or whatever!), so just teach that to your colleagues and you're all set!"

    No. What they can perhaps nominally hope for is to get everyone to switch to Google Docs which does version control and concurrent editing and merging without you asking. Heck even the built-in MS Office does versioning, but again, that is going to require team training and buy-in. Meh

    To reiterate, I like Perforce a lot and found the reward for spending the time to understand its core concepts worthwhile, but its learning curve is steeper than other tools out there. And if git ever got decent GUI tools, it would beat its pants off.