Ask Slashdot: Taming a Wild, One-Man Codebase?

Posted by timothy on Thursday September 20, 2012 @06:27AM from the seeks-same dept.

New submitter tavi.g writes "Working for an ISP, along with my main job (networking) I get to create some useful code (Bash and Python) that's running on various internal machines. Among them: glue scripts, Cisco interaction / automatization tools, backup tools, alerting tools, IP-to-Serial OOB stuff, even a couple of web applications (LAMPython and CherryPy). Code has piled up — maybe over 20,000 lines — and I need a way to reliably work on it and deploy it. So far I used headers at the beginning of the scripts, but now I'm migrating the code over to Bazaar with TracBzr, because it seems best for my situation. My question for the Slashdot community is: in the case of single developer (for now), multiple machines, and a small-ish user base, what would be your suggestions for code versioning and deployment, considering that there are no real test environments and most code just goes into production ? This is relevant because lacking a test environment, I got used to immediate feedback from the scripts, since they were in production, and now a versioning system would mean going through proper deployment/rollback in order to get real feedback."

first thought: by Tastecicles · 2012-09-20 06:31 · Score: 5, Interesting

rectify the testbed lack.
'cos there's nothing more likely to cause immediate termination of your employment than a bit of rogue code taking down the bread of the business.
Test it first.

--
Operation Guillotine is in effect.
1. Re:first thought: by Anonymous Coward · 2012-09-20 06:55 · Score: 2, Interesting
  
  Yes, by all means test. Then, deploy your tests into production by mistake, like on Wall Street or something, LOL. Seriously though, testing is good and I unit test right from the start; but there are no silver bullets.
2. Re:first thought: by ILongForDarkness · 2012-09-20 06:57 · Score: 4, Interesting
  
  rectify the testbed lack, like Yoda is it. I agree you need a testbed. Heck, run a few VMs on a workstation. If you can't build a VM to test something, it shouldn't be deployed IMHO.
3. Re:first thought: by Stiletto · 2012-09-20 07:36 · Score: 4, Insightful
  
  It's not a silver bullet, but lack of a test environment is sure to eventually cause disaster. It's by far the biggest problem mentioned above, even more of a problem than lack of version control.
4. Re:first thought: by Anne+Thwacks · 2012-09-20 07:43 · Score: 2
  
  And spend a week or two reading http://thedailywtf.com/
  
  --
  Sent from my ASR33 using ASCII
5. Re:first thought: by rvw · 2012-09-20 08:32 · Score: 2
  
  "It's not a silver bullet, but lack of a test environment is sure to eventually cause disaster. It's by far the biggest problem mentioned above, even more of a problem than lack of version control."
  I would start with a versioning system. That's a lot easier to get working. You could get that working in one day. And it doesn't need a test environment. Yes it should, but it's not a requirement. You can use the trunk as the production codebase. The big advantage is that you can rollback easily. You can even code on the server itself, and then update the codebase from there. No, not the wisest thing to do, but it's possible and probably a lot wiser than coding on the server without versioning. And use comments for each version update!
6. Re:first thought: by luis_a_espinal · 2012-09-20 09:28 · Score: 4, Insightful
  
  The scripts are irrelevant if not ran on the real environment,
  
  Well, that's an oxymoron. Any program, large or small, is irrelevant if it never runs on the intended target platform. That's no excuse for not having a test server, however feeble compared to production it might be.
  
  the test environment would have to be a clone of the production environments.
  
  A clone does not have to be equivalent in terms of hardware or data. A good example is a test db box for testing your SQL scripts. Such a box can have the exact same software, OS and patches, and with equivalent database configuration and schemas, but on lower-cost hardware and with a fraction of the data. As long as a test bench can provide a reasonable, objective measure of comfort of your code, that is all you need. You do not need an absolute guarantee (as there is never one anyways.)
  
  Good luck with that with the described environment!
  
  Yeah, because the task is so hard, he might as well give up, right, right, right? Let's do the paralysis-by-analysis chicken dance, shall we?
  
  He could test each piece of the scripts in testing - which he probably does - but that only gets you so far
  
  Which is better than nothing, and it is always better to carry out tests, however small they might be, on a test/sacrificial box than on production. It's not rocket science, man.
  
  and tells you that there's no typos.
  No. It can also tell you that it will not do something bad, like deleting all records in a table, or initiating a shutdown, or filling up the /tmp partition. Better to detect such things on a mickey mouse test box than on the real thing. It might not catch bugs that are triggered by the actual characteristics present in a production environment, but it will most likely catch bugs (annoying or fatal) that are not dependent on such characteristics.
  Ideal? No. Better than nothing? Hell yeah.
Code versioning and deployment? by MetalliQaZ · 2012-09-20 06:32 · Score: 5, Insightful

I don't understand how code versioning has to be coupled with deployment? You have no test environment, as you said... so just make releases and deploy them manually. Since you are going straight to production, you had better be there in person to roll it back if you screwed up. Right? So, SVN should be all you need...

--
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
1. Re:Code versioning and deployment? by dgatwood · 2012-09-20 07:26 · Score: 2, Insightful
  
  Git is a cleaner model in a lot of ways. In particular, the fact that you have a local copy of the entire repository makes it easier to roll back mistakes you make while editing the code. This isn't always important, but if you decide you're going to do a significant rewrite of some piece of code (and in particular, if you are ever remote while you're doing so), it helps a lot.
  
  --
  Check out my sci-fi/humor trilogy at PatriotsBooks.
2. Re:Code versioning and deployment? by inKubus · 2012-09-20 07:51 · Score: 2
  
  Here's what I did, pre-git:
  Create svn repo, e.g. svn.company.lan/systems
  Create structure ./trunk, ./branches, ./tags
  Create a directory for each hostname e.g. ./trunk/sql1, ./trunk/web1, ./trunk/web2, etc.
  Then you can svn import configuration directories on the host into the repo, e.g. svn import /etc svn.company.lan/trunk/sql1/etc
  Then check out svn co svn.company.lan/trunk/sql1/etc /etc
  From that point forward if you make changes locally you can svn ci OR you can make them externally (i.e. in a test environment) then svn up to update your local conf
  I keep the same directory structure, so if I have some tomcat conf like /opt/jira/tomcat/conf it will be in svn as svn.company.lan/trunk/web1/opt/jira/tomcat/conf
  With some scripts, I automated the process and since then it's been really easy to maintain. I understand that cfengine is quite a bit more complex and can do a lot more, like verifying your configuration and that sort of thing, but for a small shop this is good enough to prevent Oh Shit moments with minimal extra work and almost no maintenance.
  Need to make a change? First, check in to make sure repo has latest version. Make your changes, restart your daemons..if it works, check in. If it doesn't work you can keep working or svn revert back to the previous version.
  With git, you'd have a similar thing but the repo would be local and you'd have to find a way to back it up, or you could have something like stash running to be a central hub. DO NOT use github to store configs out of habit, because sometimes conf files have private keys and stuff and it is extremely likely that github will be targeted by crackers at some point. Svn is real easy to set up on a random utility server or even a workstation...
  
  --
  Cool! Amazing Toys.
It's too late by Antipater · 2012-09-20 06:34 · Score: 5, Funny

Given the situation you describe, it won't be long before the whole system falls into corruption. Your only hope is to save two lines from every script on a USB stick, then flood the rest.

--
Everything is better with chainsaws.
Simple answer by girlintraining · 2012-09-20 06:35 · Score: 4, Insightful

My question for the Slashdot community is: in the case of single developer (for now), multiple machines, and a small-ish user base, what would be your suggestions for code versioning and deployment, considering that there are no real test environments and most code just goes into production ?

The simple answer is, "Whatever works best for you." You're the only developer for these projects. Unless your manager is giving you direction on a specific process or requirements, it's your ball game. You know how you work best -- pick your tools accordingly.

--
#fuckbeta #iamslashdot #dicemustdie
A few things by jlechem · 2012-09-20 06:36 · Score: 5, Informative

1. Buy or get a machine to host SVN for version control. I work on my wife's company website and some basic management tools. SVN has saved my bacon multiple times when I thought I had lost some code.
2. Get a pre-production server and test your code! Sounds like you're living in the wild west and that shit flies until something goes horribly wrong and you're the guy who gets blamed.

--
Hold up, wait a minute, let me put some pimpin in it
1. Re:A few things by jellomizer · 2012-09-20 06:40 · Score: 5, Insightful
  
  If you can't get the hardware, try to virtualize a test environment with something like VMware or VirtualBox.
  At least you have something to play in before you put it out in the open.
  
  --
  If something is so important that you feel the need to post it on the internet... It probably isn't that important.
2. Re:A few things by KingMotley · 2012-09-20 07:58 · Score: 3, Insightful
  
  Not sure why you think you need a separate server just to host the repo. Just host it on the same machine.
  Sure at the office we have a server that hosts the repo, but at my house, I have the repo running on the same machine I develop on. Of course the repo is on a RAID-6, and my local copy I develop on is on a RAID-0, but I didn't need to buy another machine just to host the repo.
No real change by chthon · 2012-09-20 06:37 · Score: 4, Informative

You can still change everything in place. Then you can run the script and get feedback. When it works, you commit. When it doesn't, you remove the problem, check and commit.
Or you can make your changes, review them and commit them, then do a run. When you have a problem, you commit again.
It is not because you use a versioning system that you need extra formality. You can still work the way you used to, but now you have an extra safety measure due to the versioning system.
Using trac is a way to better organise your problems. The main thing I can say about using trac effectively is that you always need to have a browser window open on it, and when you have an idea, or notice something, or have a problem, then enter it immediately. Afterwards, take your time to look at new and open problems, classify them and process them.
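To make the "change in place, run, commit when it works" loop above concrete, here is a minimal Python sketch. It assumes the scripts live in a Bazaar working tree (the submitter's choice) and relies on `bzr commit -m`; the names `run_then_commit` and `shell_runner` are made up for illustration, and the command runner is passed in so the logic can be tried without touching a real repository.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command (list of strings) and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def shell_runner(cmd: Sequence[str]) -> int:
    """Run a command for real and return its exit status."""
    return subprocess.call(list(cmd))


def run_then_commit(script: str, message: str, run: Runner = shell_runner) -> bool:
    """Run an edited script; commit it only if it exits with status 0.

    Hypothetical helper: on failure nothing is committed and the working
    copy is left alone, so the problem can be fixed in place and retried.
    """
    if run([script]) != 0:
        return False
    # Assumes `script` is already versioned in a Bazaar working tree.
    return run(["bzr", "commit", "-m", message, script]) == 0
```

Something like `run_then_commit("./backup.sh", "fix log rotation")` (a hypothetical script and message) would then keep the old habit of immediate feedback, with the commit as the safety net.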
proper deployment/rollback by turbidostato · 2012-09-20 06:37 · Score: 4, Insightful

You say that "now a versioning system would mean going through proper deployment/rollback in order to get real feedback."
But then, no, it wouldn't.
Storing your code in a versioning system means just that: you store your code in a versioning system, nothing more, nothing less.
I'm starting to be an old fart so you can believe me when I tell you I've already been in your position.
Back then I used CVS and it didn't change my deployment procedures in the slightest, only that I had all those scripts in a single convenient place and I could look at past history when I found a regression or wanted to see how I had done something in the past.
The most naive approach is that you just go on working the way you do now, only that when you are confident in a script or set of scripts you check them in for posterity. You mainly develop on your own desktop and you push your scripts to the servers with an rsync-based script. A step above this, you use a CM tool (say, puppet), so instead of pushing to the servers you push to the puppetmaster and then run a `puppet agent --test` on the servers: that way configuration becomes code and, therefore, repeatable.
One could almost write a novel about this, but the basic idea is just the same: SCM is SCM is SCM; nothing more, nothing less.
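As a rough illustration of the rsync-based push mentioned above, here is a small Python sketch. The host names, directories and function names are assumptions for the example, not anything from the original setup; it only uses the standard rsync options `-a`, `-z` and `--dry-run`, and defaults to a dry run so nothing is copied by accident.

```python
import subprocess
from typing import List, Sequence


def rsync_command(src_dir: str, host: str, dest_dir: str, dry_run: bool = False) -> List[str]:
    """Build the rsync invocation that copies src_dir's contents to host:dest_dir."""
    cmd = ["rsync", "-az"]
    if dry_run:
        cmd.append("--dry-run")
    # Trailing slash on the source: copy the directory's contents,
    # not the directory itself.
    cmd += [src_dir.rstrip("/") + "/", f"{host}:{dest_dir}"]
    return cmd


def push(src_dir: str, hosts: Sequence[str], dest_dir: str, dry_run: bool = True) -> List[str]:
    """Push to each host in turn and return the hosts where rsync failed."""
    failed = []
    for host in hosts:
        if subprocess.call(rsync_command(src_dir, host, dest_dir, dry_run)) != 0:
            failed.append(host)
    return failed
```

A call such as `push("scripts", ["web1", "sql1"], "/opt/scripts")` would only show what would change; passing `dry_run=False` does the real copy from a checked-out working tree.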
1. Re:proper deployment/rollback by turbidostato · 2012-09-20 06:47 · Score: 4, Informative
  
  Oh, by the way, you really should listen to those who tell you that you *need* some development environment.
  Again, I've already been there, so I know your pain: even for the silliest development the developers will have their development environment, but for us systems people it's expected that everything just fits in place at first try, no second chances. Of course, the next heavy refurbishment will be next to impossible: being a good professional allows for more or less "clean" kaizen-style development, but anything a bit dangerous is almost impossible for lack of test environments (with luck, the next "heavy test window" will come in three or four years, when all the servers are decommissioned and new ones come in place). But that's the way it is, take it or leave it.
  The good news is that, while not a panacea, virtualization, even at desktop level (you surely need to have a look at vagrant[1]), allows for a lot of testing that was impossible in the age of "real-iron only".
  [1] http://www.vagrantup.com/
2. Re:proper deployment/rollback by SQLGuru · 2012-09-20 07:14 · Score: 3, Insightful
  
  Another benefit of a versioning system is that you don't have to keep large chunks of commented out code. If it needs to go, delete it. It's in the history if you need to go back to it. This alone will clean up most of the spaghetti that a one-coder shop faces.
Rename the files f1, f2, f3, etc. by Maximum+Prophet · 2012-09-20 06:46 · Score: 4, Funny

Quick! Rename all the files f1, f2, f3 etc, rename all the variables i1, i2, i3, etc and remove all whitespace.

Keep a translation sheet on you at all times. Suddenly, you're irreplaceable.

(:-) for the humor impaired. This is actually a riff on a joke from WKRP, when an engineer said he was replacing all the color-coded wiring with black wires for job security. (B.t.w. the engineer was played by one of the writers of the show)

--
All ideas^H^H^H^H^Hprocesses in this post are Patent Pending. (as well as the process of patenting all postings)
The Story of Mel is instructive here. by Anonymous Coward · 2012-09-20 06:46 · Score: 5, Interesting

Most of you who have seen this probably read it in the Jargon File. It's relevant. The short answer is "you don't":
The Story of Mel, a Real Programmer
This was posted to USENET by its author, Ed Nather (utastro!nather), on May 21, 1983.
A recent article devoted to the *macho* side of programming made the bald and unvarnished statement:
Real Programmers write in FORTRAN.
Maybe they do now,
in this decadent era of
Lite beer, hand calculators, and "user-friendly" software
but back in the Good Old Days,
when the term "software" sounded funny
and Real Computers were made out of drums and vacuum tubes,
Real Programmers wrote in machine code.
Not FORTRAN. Not RATFOR. Not, even, assembly language.
Machine Code.
Raw, unadorned, inscrutable hexadecimal numbers.
Directly.
Lest a whole new generation of programmers
grow up in ignorance of this glorious past,
I feel duty-bound to describe,
as best I can through the generation gap,
how a Real Programmer wrote code.
I'll call him Mel,
because that was his name.
I first met Mel when I went to work for Royal McBee Computer Corp.,
a now-defunct subsidiary of the typewriter company.
The firm manufactured the LGP-30,
a small, cheap (by the standards of the day)
drum-memory computer,
and had just started to manufacture
the RPC-4000, a much-improved,
bigger, better, faster --- drum-memory computer.
Cores cost too much,
and weren't here to stay, anyway.
(That's why you haven't heard of the company,
or the computer.)
I had been hired to write a FORTRAN compiler
for this new marvel and Mel was my guide to its wonders.
Mel didn't approve of compilers.
"If a program can't rewrite its own code",
he asked, "what good is it?"
Mel had written,
in hexadecimal,
the most popular computer program the company owned.
It ran on the LGP-30
and played blackjack with potential customers
at computer shows.
Its effect was always dramatic.
The LGP-30 booth was packed at every show,
and the IBM salesmen stood around
talking to each other.
Whether or not this actually sold computers
was a question we never discussed.
Mel's job was to re-write
Revision Control and Deployment by MrSenile · 2012-09-20 06:51 · Score: 5, Insightful

Before it gets out of hand, I'd look to set up four things.

1. Set up a proper split environment. Even if you don't have the hardware for it, set it up in such a way that when the hardware becomes available, you can move it appropriately. That being, a standard dev -> qa -> stress -> prod infrastructure.
2. Set up a good revision control. I've started to really enjoy using GIT for this, as there's other software like gitolite that can give you fine-grained access control to your repositories. However, feel free to use subversion or any other well contained revision control platform.
3. Set up a good method for deployment. My suggestion? Try puppet. It's free, and it's powerful, and if you get it configured, adding new systems to it is exceedingly easy to do.
4. Packaging for your deployment. If you are installing a bunch of software (scripts, job control, etc) package it and give it a revision, then it's easy to upgrade systems with the 'new package', or revert it to the 'previous package' instead of having to manually copy around files or (re)editing them.

Hope that helps.
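Point 4 above (install a revisioned package, revert to the previous one) can be approximated without a real package manager. The following Python sketch is one simple stand-in, not the poster's method: each release is unpacked into its own directory and a `current` symlink selects the active one. The `releases/<version>` layout and the function names are assumptions, and the atomic swap relies on POSIX rename semantics.

```python
import os


def activate(base_dir: str, version: str) -> None:
    """Point base_dir/current at base_dir/releases/<version>."""
    target = os.path.join("releases", version)
    if not os.path.isdir(os.path.join(base_dir, target)):
        raise FileNotFoundError(f"no such release: {version}")
    link = os.path.join(base_dir, "current")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    # Relative link target, so base_dir itself can be moved later.
    os.symlink(target, tmp)
    # Renaming over the old link swaps versions in one step on POSIX.
    os.replace(tmp, link)


def active_version(base_dir: str) -> str:
    """Return the version name that base_dir/current points at."""
    return os.path.basename(os.readlink(os.path.join(base_dir, "current")))
```

Upgrading is `activate(base, "1.1")` and reverting is `activate(base, "1.0")`; cron jobs and users always call scripts through the `current` path.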
Hmm by jameshofo · 2012-09-20 06:52 · Score: 2

Yeah, that's interesting actually, I just ran into this myself. We're putting a project together and when something breaks I end up doing small fixes and losing the changes across deployments (we only have 3 active), so it's very small. But I feel your pain, I'm not totally convinced that a full SVN system is necessary but once you break down the problems it likely is. Given your closed infrastructure you may want to consider adding some phone home features to your scripts, something intelligent enough to auto update smoothly in an automated way or manually. Make things easy for yourself so they're not difficult to work with and you will be encouraging yourself (and others) to use it.

The absolute best advice I can give is keep it simple, there are a million different ways to do it, try not to do a massive migration of everything all at once or you may find out later that some minute bug is hindering everything you do.

Lastly plan what you want it to look like and how, it will save you weeks of work.

--
Good leaders run toward problems, bad leaders hide from them.
Documentation by Hjalmar · 2012-09-20 06:57 · Score: 3, Informative

Yes, set up a test environment. And implement some kind of versioning system, even if it's just "cp current_code old_code". You should always be able to fall back if you have a botched deployment.
But one of the best things you can do is to start writing documentation. I like to write my documentation assuming it will be my replacement reading it, and so I try to include everything. Justify every unusual implementation detail, explain why each task was done the way it was. List bugs, and any code you had to write to work around them. The best part of documenting your project will be that as you work through it, you'll find things that no longer make sense and make them better.
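The bare-minimum "cp current_code old_code" fallback mentioned above could look roughly like this in Python. It is only a sketch under assumptions: the function names and the timestamped `.bak` naming are invented for the example, and two deployments within the same second would reuse the same backup name.

```python
import os
import shutil
import time


def deploy_with_backup(new_file: str, live_file: str):
    """Copy live_file aside, then overwrite it with new_file.

    Returns the backup path, or None if there was no live file yet.
    """
    backup = None
    if os.path.exists(live_file):
        # Timestamped copy, e.g. alert.sh.20120920-063100.bak
        backup = f"{live_file}.{time.strftime('%Y%m%d-%H%M%S')}.bak"
        shutil.copy2(live_file, backup)
    shutil.copy2(new_file, live_file)
    return backup


def fall_back(backup: str, live_file: str) -> None:
    """Restore the live file from a backup made by deploy_with_backup."""
    shutil.copy2(backup, live_file)
```

This is no substitute for a real versioning system (no history, no diffs), but it does guarantee there is something to fall back to after a botched deployment.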
Re:git by ThorGod · 2012-09-20 06:59 · Score: 4, Insightful

git
Yes!!! Create git repos of all those various parts on some central git server. Create backups of those repos periodically, like a sane person...
Git really doesn't require a ton of understanding to "just start using git" competently. It's not going to trash whatever you have in place; it's mathematically proven to *not* lose data.
Also, freaking set up a dev server already! (That's like 2 machines, or a private, 3rd party git repo (bitbucket is what I use) and a dedicated test/dev machine).

--
PS: I don't reply to ACs.
Git. by blackcoot · 2012-09-20 07:02 · Score: 2

A great deal of the version wrangling you are facing is best done with a tool like Git.
The bigger problem (development discipline) is much harder to fix.
Chef & Jenkins by terbeaux · 2012-09-20 07:03 · Score: 2

You want something to track changes, deploy changes, and test software. Bazaar will track your changes.
Chef is open source infrastructure management. The central server maintains a searchable database of your nodes and all of the scripts (recipes) that run on them. The nodes query this database and run the scripts that they are supposed to. This is similar to your environment now. You can also check your chef-repo into scm. This allows you to mess around with production and only commit back into scm when you are fairly certain that it works.
Jenkins has a similar setup but each node is ostensibly there to build and test software although we have used it for deployment and integration testing.
Chef & Jenkins can definitely help in deploying code and maintaining your infrastructure, but you will need to take responsibility for testing your code somewhere along the process, whether it be on commit with Jenkins or on deploy with unit or other tests. I definitely feel the value after investing time to learn these powerful tools.
I have a bunch of personal code that I tote around by Omnifarious · 2012-09-20 07:11 · Score: 2

I keep it in a Mercurial repository and use symlinks into the repository to deploy it. I also make free use of Mercurial's subrepo feature for tools that others wrote that are not yet found as packages on the Linux distributions I use.
Yes, there is still a testing issue. For most of this code it's not a big deal because I'm the only user. I test it as I write it with a few simple hand tests and then it's good to go.
If I were doing this for something where the code mattered to other people I would just add unit tests for various subsections as made sense. I would also start sectioning off the tools and making them into separate repositories of their own. I'd also make much sparer use of the sub-repo feature and instead have deployment scripts that handled making sure the correct version was in place.
You still need test environments though for integration testing. And as the code grows, ad-hoc test environments stop being very practical. You should dedicate a VM or two (or even a machine or two) to replicating miniature versions of the real-world setups the code is expected to work in.
Lastly, it's never too early to start using source control on your code. 98% of my code is under source control, even most stuff I think is 'throwaway' or ad-hoc.
I would also strongly recommend Mercurial (or git (if you must)) over Bazaar. It's faster, and the mental model those two tools encourage is a much more accurate representation of what they're really doing. Bazaar lets you pretend that branching is still a big deal and takes some effort to resolve. It lets you continue to think in the model of centralized source systems even though it's not. You will be doing yourself a huge favor in productivity (yes, even for a single developer) to not use it and go for something that doesn't let you pretend anymore. Of those tools, I think Mercurial has a far more carefully thought out and better set of commands and options than git does.

--
Need a Python, C++, Unix, Linux develop
Pretend you're a team by slim · 2012-09-20 07:31 · Score: 3, Insightful

Forget that you're a lone programmer. Set up a proper environment anyway.
This is going to seem like hard work, but once you've done the upfront effort, it will pay dividends.
Do *everything* that you'd do if you were a team. There are plenty of books / web sites on the subject.
Pick a version control system -- since you're starting from scratch, Git or Mercurial. Get your code into it.
Pick a continuous build system -- Jenkins is popular and free.
Write one unit test, and make Jenkins run it as part of the build process.
Decide on some sort of repository for your build artefacts.
Establish an integration testing box, and have your CI system deploy to that every build. Ideally use something like Puppet for this, and also use Puppet on your production machines.
Write one integration test, and make Jenkins run it after deployment.
You can dedicate a server to all of this, several servers, run it all on your laptop or in VMs; it really doesn't matter. But think ahead so that you can move it to dedicated machines later if you need to.
Lots of work, but now you have a nice, confidence inspiring build / code management system.
Once that's going, you can decide how to fix your lack of tests. One approach is to take a few weeks just writing tests. Another is to write tests as the need arises -- for new code as you write it; to demonstrate bugs before you fix them. Or somewhere in between.
Python isn't my area, but there is probably an ecosystem of pythonesque tools for a lot of this stuff. pyUnit, code coverage tools, etc.
You will have problems unit testing, since you won't have designed the code for testability. The choice is, live with fewer tests than might otherwise be possible, or refactor your design into something more unit testable. (IOC is unit testing's best friend)
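As a hedged illustration of the "write one unit test" and inversion-of-control advice above, here is a small Python example using the standard `unittest` module. The `disk_alert` function is invented for the example (it is not from the submitter's code); the point is that the disk query and the notification are passed in as parameters, so a test can swap in fakes instead of touching a real filesystem or mail server.

```python
import shutil
import unittest


def disk_alert(path, threshold, get_usage=shutil.disk_usage, send=print):
    """Send an alert if the filesystem holding path is fuller than threshold (0..1)."""
    usage = get_usage(path)
    fraction = usage.used / usage.total
    if fraction > threshold:
        send(f"{path} is {fraction:.0%} full")
        return True
    return False


class FakeUsage:
    """Stand-in for the result of shutil.disk_usage()."""

    def __init__(self, total, used):
        self.total = total
        self.used = used


class DiskAlertTest(unittest.TestCase):
    def test_alerts_when_over_threshold(self):
        sent = []
        fired = disk_alert("/tmp", 0.8,
                           get_usage=lambda p: FakeUsage(100, 90),
                           send=sent.append)
        self.assertTrue(fired)
        self.assertEqual(sent, ["/tmp is 90% full"])

    def test_quiet_when_under_threshold(self):
        sent = []
        fired = disk_alert("/tmp", 0.8,
                           get_usage=lambda p: FakeUsage(100, 10),
                           send=sent.append)
        self.assertFalse(fired)
        self.assertEqual(sent, [])
```

Saved as a module, this runs with `python -m unittest`, which is also what a Jenkins job would invoke. In production the defaults apply, so callers do not need to know about the injection points.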
GitHub by the+eric+conspiracy · 2012-09-20 07:39 · Score: 2

Just get one of the inexpensive commercial subs for GitHub. This solves all sorts of issues. Remote backup, robust version system, issue tracking etc.
Please pass this to your boss. (don't read it) by HornWumpus · 2012-09-20 07:52 · Score: 3, Informative

You need to fire this cowboy. He doesn't think he needs to test his scripts.
I know he seems irreplaceable. That should be a big red flag.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Most interesting coder by rfrenzob · 2012-09-20 07:56 · Score: 2

Proclaim yourself the most interesting coder thinkgeek style.
I don't often test my code, but when I do, I do it in production.
Well ... by gstoddart · 2012-09-20 08:21 · Score: 4, Insightful

My question for the Slashdot community is: in the case of single developer (for now), multiple machines, and a small-ish user base, what would be your suggestions for code versioning and deployment, considering that there are no real test environments and most code just goes into production ?
If I'm the people who run the company, I start firing people. If I'm the developer, I run like hell before anybody realizes what a complete mess I've made.
No versioning, no test environment, live changes in production ... these are warning signs of something which has been cobbled together, and which continues working by sheer dumb luck.
I had a developer once who edited a live production environment without telling anybody and broke it even worse -- he very quickly found himself with no access to the machines and being told that we no longer trusted him with a production environment.
Having worked in highly regulated industries where the stakes are really high, I've had it drilled into me that you simply have no room whatsoever to be doing this kind of thing that ad hoc.
Glad you're starting to use something. But the risk to your employer of all of your stuff tanking and becoming something you can't recover is just too great. From the sounds of it, if you get abducted by aliens or hit by a bus, your company would come to a screeching halt.

--
Lost at C:>. Found at C.
1. Re:Well ... by turbidostato · 2012-09-20 11:57 · Score: 3, Insightful
  
  "If I'm the people who run the company, I start firing people."
  Unless, of course and as is usually the case, the one running that small company is the one who set the policy to start with.
  "If I'm the developer, I run like hell before anybody realizes what a complete mess I've made."
  Unless, of course and as is usually the case, the guy is a professional who understands the trade-offs, and so does (more or less) the boss, who thinks the resulting mess is the most cost-effective way to run his business (and, up to a point, it usually is).
Devops and CI by dna_(c)(tm)(r) · 2012-09-20 08:29 · Score: 2
[...]the test environment would have to be a clone of the production environments. Good luck with that with the described environment![...]
There is stuff like Puppet (for declaratively deploying "services") and Vagrant to provision VirtualBox guests.
Downsides:
- It's only really efficient when your production environment can be provisioned with Vagrant/Puppet as well and no manual work is done on these guests. The way the question is formulated, I suppose that is not the situation.
- VirtualBox is only usable for desktop usage. I would love something similar and simple for KVM.
Please tell me what company / product this is for by bratmobile · 2012-09-20 09:05 · Score: 2

Because I never, ever want to rely on anything you build this way. You are headed for a disaster, unless you 1) set up a test environment, and 2) use a revision control system.
Really, anything less than that is just a complete waste of everyone's time.
Re:git by GoogleShill · 2012-09-20 09:10 · Score: 2

... it's mathematically proven to *not* lose data.
I love git and use it on a daily basis, but you can't mathematically prove that it won't lose data. It is written by humans, and I have encountered bugs in it. You also still have to deal with manual merges, which are error prone. I've also had my local repo get in weird states that are very difficult to get out of. When this happens, I always copy out all my changes because I'm afraid of losing anything.
Re:Scalability by turbidostato · 2012-09-20 11:17 · Score: 2

"how would a substantial fraction data representative of real data be created if the real data contains people's shipping addresses or other PII?"
Do you really have to ask? You either shuffle the relationships between the fields or scramble the fields themselves:
Exhibit A:
* John Doe | Lexington Av.
* Betty Lamarr | Main St.
becomes
* John Doe | Main St.
* Betty Lamarr | Lexington Av.
Exhibit B:
* John Doe | Lexington Av.
becomes
* Nhjo Ode | Aevtginon Lx.
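Both exhibits can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, rows are taken to be (name, address) pairs, and a random shuffle may leave some rows paired as before (with two rows, half the time). Scrambling the letters of a short field is also weak protection on its own, since it is just an anagram.

```python
import random


def shuffle_relationships(rows, rng):
    """Exhibit A: keep every name and address, but break the link between them."""
    names = [name for name, _ in rows]
    addresses = [address for _, address in rows]
    rng.shuffle(addresses)  # may leave some pairs unchanged by chance
    return list(zip(names, addresses))


def scramble_field(value, rng):
    """Exhibit B: shuffle the characters within a single field."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    rng = random.Random()  # pass a seed for repeatable test data
    rows = [("John Doe", "Lexington Av."), ("Betty Lamarr", "Main St.")]
    print(shuffle_relationships(rows, rng))
    print([(scramble_field(n, rng), scramble_field(a, rng)) for n, a in rows])
```

Passing the random generator in makes it possible to seed it, so the same scrubbed test data can be regenerated whenever the test box is rebuilt.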
Re:Scalability by theshowmecanuck · 2012-09-20 11:24 · Score: 4, Informative

testing code on a fraction has led to misconceptions about scalability to a far larger data set
This is real. The solution is to manage expectations. If people know that the tests just show functionality and not scalability, and that scalability testing is required (when warranted), you should be good. More importantly if the decision makers know this, you are good.

if the real data contains people's shipping addresses or other PII?
Scrub the data. Addresses are not personal information, though. The fact that a specific person lives there might be. Open a phone book (if you can find one nowadays). They have reams of addresses as well as phone numbers tied to real people. This is public knowledge. Personal information involves things more like name, age, finances, medical records, etc.
For the stuff that is real personal information, randomizing names to create fake people tied to real addresses is not hard at all (real addresses are often necessary when systems tie into others where shipping or location are requirements). You can take real information, put it in a can and scramble it to make fake people. I think testers should be proficient enough to be able to generate this kind of data.
As to one other comment made by the OP:

and now a versioning system would mean going through proper deployment/rollback in order to get real feedback.
Versioning systems do no such thing if you don't use them that way. If you want a "proper deployment and rollback cycle" you can do that. Or not. But at least you'll be able to go back in time to find the code that actually worked if you need to. No coder should work without the safety net of version control. Whether it be CVS, SVN, or Git, it matters less what it is than whether you have one or not. Pick one and use it.

--
-- I ignore anonymous replies to my comments and postings.
Re:Playing Devil's Advocate by slim · 2012-09-20 22:57 · Score: 2

> One guy of the caliber of a Stallman or a Torvalds will probably do much better than a team of Visual Source Safe users, even if that guy has no source control system.
Linus Torvalds, author of Git?
Richard Stallman, author of GNU diff; without which many revision control systems wouldn't work?