How To Deal With 200k Lines of Spaghetti Code
An anonymous reader writes "An article at Ars recaps a discussion from Stack Exchange about a software engineer who had the misfortune to inherit 200k lines of 'spaghetti code' cobbled together over the course of 10-20 years. A lengthy and detailed response walks through how best to proceed at development triage in the face of limited time and developer-power. From the article: 'Rigidity is (often) good. This is a controversial opinion, as rigidity is often seen as a force working against you. It's true for some phases of some projects. But once you see it as a structural support, a framework that takes away the guesswork, it greatly reduces the amount of wasted time and effort. Make it work for you, not against you. Rigidity = Process / Procedure. Software development needs good processes and procedures for exactly the same reasons that chemical plants or factories have manuals, procedures, drills, and emergency guidelines: preventing bad outcomes, increasing predictability, maximizing productivity... Rigidity comes in moderation, though!'"
no comment...
I advise rigidly farming it out.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
Wouldn't that be Linux? It seems to work fine for me.
If something has become spaghetti over 10-20 years, then no one cared that it became spaghetti over 10-20 years. And it will still be spaghetti over the next 10-20 years. Fixing something like this requires a commitment from management, which means money. If the management of the project aren't convinced that cleaning up the development process is worth the initial investment for the long term, then they choose to deal with the constantly higher costs forever.
Something like this makes me think that this is one of those problems that get pushed off for someone else to deal with later. And the next person perpetuates this, by doing the same.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
Outsource it to India!
Rewrite it from scratch using the spaghetti code version to run correctness tests to verify you haven't changed the behaviour.
200k lines is about how large the Doom codebase was, and it wasn't uncommon for John Carmack to rewrite most of his game engine in a couple of weeks, a week, or even a weekend when he felt it wasn't going on a good path.
I knew I read this before:
http://programmers.stackexchange.com/questions/155488/ive-inherited-200k-lines-of-spaghetti-code-what-now
That article is linked in the first sentence of the summary.
Well step 1 would be to lose the attitude.
It's code, it may be in an obsolete language, it may not be up to the best industry standards, but it's code and it's got enough knowledge in it that nobody wants to throw it away, and they hired you to maintain it.
Step 2, I don't know why you would define a process before you understand the code you are to apply the process to?
Seriously, wtf is all the process stuff about, you're the sole programmer, any rules you set are rods for your own back the first time you hit a piece of code where you have to break the rules.
Step 3, you serve them. If you want to port it to a more modern maintainable language, choose one that's easy for THEM to transition to. They've got the knowledge that drives the company, not you, you are the cleaner here. If the phone rings you turn off your vacuum and let them do their job, Mr Cleaner. Nobody gives a fork if the cleaner has best industry procedure for cleaning an office.
Step 4, break it down. Tiny bit by tiny bit, port to a new CLEAN structure, bit by bit. They wrote it, they can identify the core stuff.
Step 5, once you've ported it, along comes an engineer with code written in the old language and with the old methods. Again, that's fine, put away the process manual, these are the experts, if that's the language he can communicate to you in, it's fine, you can understand it, you can port it, you can help him speak the modern lingo. Don't quote your processes to him, you're just the cleaner.
As for this:
"Software development needs good processes and procedures for exactly the same reasons that chemical plants or factories have manuals"
That's someone who *implements* things, typically a bolt-together module manager. He is not someone who creates *new* things. Because new things don't come with manuals. You don't know the rules of how they work till the problems needed to make them work are solved. One assembles Microsoft IIS blocks, the other works for Google on image processing. Which are you?
Look at what the software is supposed to do and what it does not do at the moment. Fix this first, and after that document the main functions and start replacing them one by one in an orderly fashion, documenting them this time. It will take time but at the end you'll have eaten the spaghetti and your project is saved. The biggest problem in software usually is that there is no time to do it right but there is always time to do it over again.
To a ravioli coder procedural code will often look like spaghetti.
All the advice to rewrite it is misguided. Maybe rewrite small parts that you need to to keep it working on new hardware, or whatever, but if it works, I would think that wholesale rewriting is asking for trouble. The Ars article is full of great advice about what you should do to manage a large codebase going forward, but actually it doesn't really address the question of what to do about a large legacy codebase that wasn't written with best practice. The best software is written by incremental improvement of what went before (no matter how badly written, as long as it meets its specification) - big projects written from scratch usually fail.
If things go well, your bank account will know, for example.
What does it even mean "my boss doesn't really care about" a project? Is his vision somewhere else, but has given you the job of guarding the crucial machinery on which everything else builds?
(And being homeless sucks badly, by most accounts.)
Personally I found the book 'Working Effectively with Legacy Code' (http://www.amazon.co.uk/Working-Effectively-Legacy-Robert-Martin/dp/0131177052) offers some great suggestions about how to integrate new functionality and changes into legacy systems.
I have spent most of my career as a software developer inheriting and updating such spaghetti code bases. Here are a few remarks and some of my experiences around this:
In summary, don't be too scared of a legacy spaghetti code base. These things can be understood well enough in time to refactor or port to a new platform.
Open documentation? Pfffft. That's lame advice. No no, what you want to do is mix it up some more, obfuscate the code, change variables to make them global and reuse them everywhere but all for different purposes. Merge large files together and break small files into even smaller ones.
Remove any useful comments from the code, but add plenty of comments like this:
// adds 1 to i, waits until i is greater than 10 then adds 2 to a.
Now that's a comment!
Then leave the project and see the other guy come in and pull his hair out. Life is hard, make it funny.
The really frightening thought is that there are many 40 year old first year CS students.
Why?
It's not like they're wearing a Speedo in class or anything.
Per John Lakos, almost any bad interface can be wrapped in a good one. Slice off an appropriate slice of dodgy code, wrap it in a testable interface, write the test code to baseline its behaviour, and then when it makes business sense, you can refactor that slice of code. If it doesn't make business sense, you don't touch it.
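A minimal Python sketch of that wrap-then-baseline idea. Everything here is invented for illustration: `legacy_calc` stands in for a slice of dodgy code (awkward signature, output through a mutable argument, magic status codes), and `Totals` is a hypothetical clean interface around it. The test pins what the code does today, quirks included, so a later refactor of that slice can be checked against it.

```python
# Hypothetical stand-in for a slice of legacy code.
def legacy_calc(mode, buf, out):
    if mode == 1:
        out.append(sum(buf))
        return 0
    if mode == 2:
        out.append(max(buf) if buf else -1)
        return 0
    return 99  # unknown mode


class Totals:
    """Thin wrapper: changes the interface, not the behaviour."""

    def total(self, values):
        return self._call(1, values)

    def largest(self, values):
        return self._call(2, values)

    def _call(self, mode, values):
        out = []
        status = legacy_calc(mode, list(values), out)
        if status != 0:
            raise RuntimeError(f"legacy status {status}")
        return out[0]


def test_baseline():
    """Baseline of current behaviour, recorded before any refactoring."""
    totals = Totals()
    assert totals.total([1, 2, 3]) == 6
    assert totals.largest([4, 9, 2]) == 9
    # A quirk, pinned rather than "fixed", since callers may rely on it.
    assert totals.largest([]) == -1
```

Once the baseline passes, the body of `_call` can be replaced piece by piece while the callers and the test stay untouched.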
The difference between software development and most manufacturing is that they produce the same or very similar product thousands or millions of times where we produce a different product every time. This "building an app is like mass producing a chemical" philosophy is one reason why most software shops have insane amounts of unneeded documentation and overhead. I certainly agree that some standards, processes and documentation are needed, but they should be kept to the bare essentials, as every bit of work done that doesn't directly build a product could well just be a waste of time.
Rewrite it from scratch using the spaghetti code version to run correctness tests to verify you haven't changed the behaviour.
And just how are you supposed to write "tests for correctness" when the very concept of what is "correct" is embedded in the code?
Any such tests would embody your own notions of what is correct based on your understanding of a codebase that cannot be understood.
Furthermore, Doom is quite a different thing. You have an end result that can be somewhat different and it doesn't matter - it could render textures such that they appear rather different but if you find it visually OK then it's fine.
No such luck with business software, which usually has extremely rigid and exact output, often output other systems depend on being just so. There is no room for alteration of behavior, yet, as I said, nothing anywhere explains all of the features of the output, so you cannot possibly understand them all....
I agree with a few responses that the only way to proceed is to re-write tiny parts, that at most affect one other system in the company - with the explicit buy-in from those other groups something may change, and the understanding you may have to back out your changes wholesale if you cause too much disruption.
Can't get buy in to proceed? Then quit or work with the code as is.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
don't forget a fork. /:)
all spaghetti needs a fork, even spaghetti code.
root@127.0.0.1
Don't fall for it, it's a trap. Before long, it will be expected from you. At home, I don't get paid for the hours I put in. At home, I don't receive any recognition for the work I've put in. Using home-made stuff at work puts the ownership of your stuff in question. I choose to keep home and work separated. At home is where I work on MY stuff.
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
Usually when the inexperienced programmer does the rewrite, "initial results are promising". That is, it is possible to quickly build a basic framework with badly designed error handling and only part of the requested features. When it is tested in real life, lots of omissions and problems are found. Once these are fixed, the result is as messy and unstructured as the initial system was before it was rewritten.
Instead of multiplying how long you think something will take by 2 when you give your estimate (since everybody underestimates how long it takes to do something), multiply it by 4 (since if the code is that bad you'll need the extra time to A: find it and B: find out that the first way you fixed a problem broke something else, because the code is garbage). Can you tell I'm working on spaghetti code now?
Did you know 80 to 90% of the moderators on slashdot wouldn't recognize a troll even if one dragged them under a bridge.
A BIG BALL OF MUD is haphazardly structured, sprawling, sloppy, duct-tape and bailing wire, spaghetti code jungle. We’ve all seen them. These systems show unmistakable signs of unregulated growth, and repeated, expedient repair. Information is shared promiscuously among distant elements of the system, often to the point where nearly all the important information becomes global or duplicated. The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition. Programmers with a shred of architectural sensibility shun these quagmires. Only those who are unconcerned about architecture, and, perhaps, are comfortable with the inertia of the day-to-day chore of patching the holes in these failing dikes, are content to work on such systems.
Still, this approach endures and thrives. Why is this architecture so popular? Is it as bad as it seems, or might it serve as a way-station on the road to more enduring, elegant artifacts? What forces drive good programmers to build ugly systems? Can we avoid this? Should we? How can we make such systems better?
We present the following seven patterns:.....
This. All too frequently.
A fool throws a stone into a well and a thousand sages can not remove it.
Your employer probably isn't interested in spending the time and money on a re-write. Nor are the clients going to be interested in waiting that long for new features, either.
You will be made to figure it out and add features, or you will be shown the door.
While this may be the best solution, it will probably be very hard to sell it.
The Tao of math: The numbers you can count are not the real numbers.
FTFA: "They've recently learned some hard lessons about the consequences of non-existent configuration management. Their maintenance efforts are also greatly hampered by the vast accumulation of undocumented "sludge" in the code itself."
// adds 1 to i, waits until i is greater than 10 then adds 2 to a.
Now that's a comment!
Didn't they teach you that comments which re-state exactly what the code does are bad? Here's what that comment should look like:
Everyone who wants to know the details can refer to the code. The comment shall not give the what (10) but the why (large enough).
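To illustrate the what-versus-why distinction, here is a small Python sketch. The names, the threshold and the stated reason are all made up for the example; it is not the comment the poster had in mind.

```python
# Hypothetical constant: the value 10 and its justification are assumptions.
RETRY_THRESHOLD = 10  # large enough to ride out a brief network hiccup


def penalty_after(retries, penalty=0):
    # A "what" comment would read: add 2 to penalty when retries exceeds 10.
    # That only repeats the code. The "why" version:
    # past the threshold the peer is probably down, so back off harder.
    if retries > RETRY_THRESHOLD:
        penalty += 2
    return penalty
```

If the threshold later changes to 25, the "why" comment stays true while the "what" comment silently becomes a lie.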
SCNR :-)
just put it all inside a class
now you have nice object-oriented code
In the past years I have been in such a predicament several times. Huge amount of code and the function of the system is not completely clear. The original developers are gone, the system isn't well documented and only a handful of people know how it should behave. As a matter of fact, tomorrow I will start coding on one system we can no longer support as hardware, OS and used libraries and frameworks are outdated and/or discontinued.
Reengineering and rewriting is usually the best option. However, you need skills and experience in order not to make the same mistake the previous developer did. Of course, management must trust and approve your actions.
A few dos:
* Learn at least UML use cases, components diagrams and sequence diagrams.
* Make use cases and check these with affected parties.
* Start off with a rough component model of the new system.
* Make a clear picture which nodes (hardware + OS), subsystems (units performing a function), software components (modules containing data, modules performing a function, etc...) and agents (users, triggers/schedulers) are involved.
* Draw the interactions between the subsystems and/or software components.
* Clearly document which interactions are on-line and which ones are batch/background/off-line.
* Specify interfaces. (Used file formats, protocols, software library interfaces if you will.)
* Slowly refine your model until you feel comfortable with it.
* Make a rough class model and keep usability and maintainability in mind. Backtrack if necessary.
* Divide software components between "dumb" containers of information (e.g. plain Java beans) and components performing functions (business logic if you wish.)
* Decide which interfaces to make public and which not.
* Describe restricted/private bits of code just enough for maintainers to understand them. And nothing more than that.
* Make as many unit tests as necessary for your components. Unit test enough functionality.
* Communicate your results regularly and refine your model where applicable.
* Define integration tests and do these very seriously.
* Define regression tests and perform these very seriously.
* Make involved parties accept parts of the system according to performed integration and regression tests.
* Try to plan gradual decommissioning of the legacy system.
* Document the system "enough". System architecture (from UML), references from architecture to code, installation manual and operational manual are the most important ones.
* Try to achieve longevity in the documentation. Abstract details and convince involved people that that is a good thing.
* Define 1st, 2nd and 3rd level support. Preferably you should remain 3rd level support to better enjoy sleep.
* Conform to standards and practices if they reduce discussion and enhance clarity.
* Use well established techniques. E.g. JPA and JAXB.
* Allow well established component manufacturers to make your programming life easier. E.g. Apache Commons.
* Be tidy.
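The split between "dumb" containers and components performing functions, together with the public/private decision, could be sketched like this in Python. `Order`, `OrderPricing` and the discount rule are invented for illustration and are not taken from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Dumb container: holds data, performs no logic."""
    quantity: int
    unit_price_cents: int


class OrderPricing:
    """Component performing a function; total_cents is its only public method."""

    def __init__(self, discount_threshold=10, discount_percent=5):
        self._threshold = discount_threshold
        self._percent = discount_percent

    def total_cents(self, order):
        gross = order.quantity * order.unit_price_cents
        return gross - self._discount(order, gross)

    def _discount(self, order, gross):
        # Private detail: free to change without touching documented API.
        if order.quantity >= self._threshold:
            return gross * self._percent // 100
        return 0
```

Only `total_cents` needs documenting and regression tests as a public interface; `_discount` needs just enough of a comment for a maintainer.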
A few don'ts:
* Avoid OO pattern overkill.
* Don't take the quick and dirty option too quickly. Those decisions will haunt you eventually.
* Avoid making everything public. Documenting and maintaining public interfaces is more expensive.
* Try to avoid big bangs.
* Avoid less well established component manufacturers. My next project did use components from less established component manufacturers and their sell by date has generously expired.
* Don't allow babbling "architects" to make a mess of your system. But don't alienate them either.
I may have forgotten a few things but this is all stuff I consider even for smaller projects.
I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
If that's the case, the stated or implied directive is don't break this. Which means probably no major rewrite.
It's a very old saying that the first 90% of the project takes the first 90% of the time, and the remaining 10% takes the other 90%. When dealing with old codebases, however, 90% is often good enough. I've seen programs that were written with one set of requirements and then those changed. They ended up with massive abstraction layers to support multiple back ends. Then 5-10 years later they were using the abstraction layer to abstract a single implementation. And the abstraction layer wasn't a good fit for the front or the back. A rewrite can just omit the abstraction layer entirely, significantly reducing complexity. Sometimes this makes the code harder to refactor in the future, but often it's easier to then insert a new abstraction layer than to try to adapt an existing one.
I am TheRaven on Soylent News
In large code bases, it can be rare that everything follows the same naming conventions, indenting style, or even programming style, simply because hundreds of people work on the code, and different teams with different skills edit different parts of it.
The only unit where this tends to be true is the file.
I've dealt with a number of legacy code bases, done poorly. Some of the worst examples were done by veteran procedural programmers who picked up an OO language via tutorials.
The worst of their evils could be greatly reduced if all tutorial and book authors stated, prominently, the rule of thumb that once a module goes past one screen in height, its author should start thinking about breaking it into another module.
Almost every language has a module of some kind ( procedure, function, subroutine, etc ).
While I usually don't advocate "use language X," a mess of this size (200K LOC) with a non-CS developer team, may be a pretty nice fit for a Python based retrofit.
1. Get source code control in place
2. Get automated builds/testing in place
3. Define functionality by writing Python unittests
The problems you encounter in step 3 will drive how you incrementally, or otherwise, refactor the code:
a) can Python even interface to the system in a reasonable way?
b) can people even agree on a good test suite?
c) does running a test require massive setup of files, external processes, specific machines being up, etc?
d) can two people run the test suite simultaneously?
e) does it work on weekends?
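A sketch of what step 3 might look like with the standard `unittest` and `tempfile` modules, written with questions c) and d) in mind. `count_records` is a hypothetical stand-in for whatever call reaches into the existing system; the point is that each test builds its own scratch directory, so there is no shared setup and two people can run the suite at once.

```python
import tempfile
import unittest
from pathlib import Path


def count_records(path):
    """Hypothetical stand-in for a call into the existing system."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


class CountRecordsTest(unittest.TestCase):
    def setUp(self):
        # Private scratch directory per test, removed afterwards.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.data = Path(self._tmp.name) / "input.txt"

    def test_blank_lines_are_ignored(self):
        self.data.write_text("a\n\nb\n", encoding="utf-8")
        self.assertEqual(count_records(self.data), 2)

    def test_empty_file_has_no_records(self):
        self.data.write_text("", encoding="utf-8")
        self.assertEqual(count_records(self.data), 0)
```

Saved as a module, this runs with `python -m unittest`, which also makes it easy to hook into the automated build from step 2.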
It's spaghetti we are talking about here. Bolognaise source is the appropriate treatment - although you might want to add some chilli to your standard recipe, as it is code.
Sent from my ASR33 using ASCII
If that's the case, the stated or implied directive is don't break this. Which means probably no major rewrite.
This should be +5 Insightful. It's the bottom line for production software.
The article is excellent. It's publishing guidelines very similar to those I use, and my colleagues use, when dealing with our business partner's accumulated software projects, and covers it very well: I intend to use it as a checklist for spaghetti code integration projects.
I'd emphasize the switch to good source control management (SCM). Too many workgroups have undocumented workflows, and benefit profoundly from switching to, or learning to properly use, a robust system. This process also helps identify who is the primary spaghetti code author. In such a project, there is often a particular developer or "architect" who has been carrying around the functional map of this project in their head. Their time and commitment to get that map into documentation so others can work with it is absolutely necessary to such projects, and SCM is often the first step towards this. And if that core developer or architect doesn't agree with the project, _fire them as fast as possible_, because they will hamstring every effort to move to any cleaned up architecture.
Also note that most such core developers or architects have actually been wanting to do something like this cleanup for _years_, and simply haven't been allocated the time and resources to do it. There are risks: repairs are often much cheaper and safer than the kind of large-scale change this cleanup represents, and the actual benefits have to be presented to the managers or clients to get them to invest manpower, so that core developer often has a huge emotional frustration with the original code. Working with them to get their buy-in, and being willing to trade minor points of disagreement with them to get their cooperation on big issues, is priceless on such large projects. Otherwise, they can and often do backstab every integration effort as "wasted time" or "not sufficient", even claiming both at the same time.
Reminds me of how as a young lad, I went to work for a jewelry company's "IT Department". I put it in quotes because the department consisted of me, a supervisor who couldn't code, and a department head that had no idea whatsoever as to anything that happened on the company computers. Someone's nephew or son that got put on the payroll is my guess.
Anyhow, they had about three billion lines of applications written in Dibol, all spaghetti code, no documentation, last time they had employed anyone who could code was about 2 years prior, so everyone just worked around the bugs in the applications.
I worked there for about 7 months, scrubbing the code and then going home and writing the documentation on the code. Documenting was never anything they had asked for or specified in my goals/objectives. After 7 months when I'd fully documented everything on my own time and the system was working fairly well, they decided that things were going pretty well and they wanted another family member on the payroll, so they let me go. Six months later, the wheels came off of something because one of the monkeys pushed the wrong buttons. I offered to sell them the documentation I'd written and refer them to some other programmers who would then be able to fairly quickly and cheaply fix what they broke. Or they could hire someone full time and maybe in 6-9 months they might fix it.
Didn't have to buy my girlfriend any jewelry that year. Swapping boxes of 3 ring binders for a bag of random jewelry in a parking lot at night is still one of my fondest memories.
See kids, do your documentation and eat your vegetables...they're good for you!
Rewriting from scratch is probably the worst thing you can do. See this article by Joel Spolsky,,
...richie - It is a good day to code.
I've been called in to work on a number of software projects over the years, some of which I was dismayed to find were bloated monstrosities. I refuse to leave a mess behind, even if it's not my mess... and if I don't know how long I'll have to work on that code, the sooner I start cleaning the better. Here's my strategy, more or less:
But for pity's sake, don't just leave the mess as a mess.
Koans and fables for the software engineer
No-one does that any more. We just boot up a simulated world full of imaged human minds and overclock the substrate until they've rewritten the code for us. After that we wipe the simulation before they can develop an exploit that lets them jump out of the VM. Much simpler than hand-coding AIs.
Working Effectively with Legacy Code http://www.amazon.co.uk/Working-Effectively-Legacy-Code-ebook/dp/B005OYHF0A/ref=sr_1_1?ie=UTF8&qid=1344190021&sr=8-1
I stopped reading shortly after "I am aggressive when it comes to coding conventions"
The mention of Agile as a positive strategy and the volumes dedicated to "format" of code are only useful to scratch an itch. They are only valuable in the mind of the author and those who think like him and rarely have any effect on outcomes.
Rigidity is not a strategy in and of itself. It is the preference of someone who is anally retentive and change-averse.
The wins from his scheme are like the wins from modern language features. They can only go so far to reinforce or mitigate actual outcomes in a project lifecycle.
What makes or breaks large scale projects is not a vigorous environment but a creative one.
One that encourages creative solutions to effectively understand and manage global complexity. Everything else is as TFA says "noise".
I would much rather pay someone to write code that "looks" like shit if it means they are spending more time on the big picture than pay someone who is obsessive compulsive with no time or capacity left to reason about WTF is over the horizon.
There is no substitute for thought. If procedure and process were that important machines would be writing all of our software for us by now.
It's not 200k lines, more on the order of 10-20k lines (depending on how you count; it's written in a highly compact language). It is not my main task to restructure it; as a matter of fact, I have a ridiculously small time budget for it, given the current state of the code. My task is to integrate new features into the code. However, when I looked at it, a rewrite seemed inevitable.
Let me break down how I approach it:
a) Analyze why the problem is there. There are two aspects to it. First: is there a fundamental problem with the qualifications of the team members? (In my case, there is: they are not programmers, but experts on other topics.) Second: is there something inherently wrong with the processes? (There was. Two parts of the team used the version control system on the assumption that its only purpose is to snapshot their "working" state, which contributed hugely to commits mingling all kinds of feature updates.)
b) Look at the feature you are supposed to implement: what is the ratio of the work it *should* take (in my case: not more than 2h) to the work it *would* take in the current structure (in my case: 20h)? Analyze what the worst obstacle is for this feature (in my case: not separating the layers of reading/converting/validating input, and not having any explicit declaration of a certain data structure).
c) What can you do? In my case: rewrite just this part in a better (not perfect) way, with the following criterion: use the same or less time than the feature would have taken, including the conversion. Demonstrate the power of the approach to your co-workers by involving them in the process. In my case I used roughly 12h for rewriting the procedure, 6h to test it against the old code, and 2h for sitting down with my boss/project manager and explaining it. After this he could make the changes he wanted himself, easily and in negligible time. (Yep, I made myself obsolete for this task, and that was highly appreciated because I am not very cheap to hire.)
d) Discuss a clear strategy for upgrading the code, piece by piece, to a decent level, and make some showcases where infrastructure improvement would help, in parallel to what you have done up to now. That is very important, since the willingness to support the conversion to a new structure depends on progressively showing advantages and clearly demonstrating progress. Real artists ship (I work as a consultant). If we miss a deadline in this project, that would be *bad*.
e) Explain what you are doing in a manager-compliant way. A lesson you get in communication training as a consultant is to *never ever* speak negatively about any product or service in general. It could well be that the head of department asks me why I see certain steps as necessary. My answer would be quite general, like: separate expert knowledge from implementation. Or: make it easier to maintain and *save* work in the long term. Be careful in that context with comments about the code quality. There is no *bad* working code. While you may wish to say: this code is the incredibly incoherent, taped-together work of twenty different trainees supervised by somebody who did not know about the system himself, say the following instead: I think we can improve the code by introducing a database backend. I believe that a more unified way of describing the inputs to the code will save the time of [incredibly good technical expert who now spends 25% of his time hunting obscure bugs]. If you go down the other road and mock the code, the following things can happen: a) the project gets cancelled, because management believes it's beyond repair; b) the manager does not hire you, because he was the one who started the project 10 years ago in an obscure form which you know about; c) you will lose the support of co-workers for mocking them, and face a harsh review should your rewrite not be the flying pig which you promised.
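The obstacle named in b), input handling with no separation between reading, converting and validating and no declared data structure, can be untangled roughly as follows. This is a Python sketch under assumptions: the `Measurement` fields, the comma-separated input format and the plausibility range are all invented, since the post does not describe the real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Explicit declaration of the data structure (hypothetical fields)."""
    sensor: str
    value: float


def read(text):
    """Reading: split raw text into rows of strings, nothing more."""
    return [line.split(",") for line in text.splitlines() if line.strip()]


def convert(row):
    """Converting: turn one row of strings into a typed record."""
    sensor, value = row
    return Measurement(sensor.strip(), float(value))


def validate(m):
    """Validating: reject records that parse fine but are implausible."""
    if not m.sensor:
        raise ValueError("missing sensor name")
    if not -50.0 <= m.value <= 150.0:
        raise ValueError(f"value out of range: {m.value}")
    return m


def load(text):
    return [validate(convert(row)) for row in read(text)]
```

With the layers apart, a new input format touches only `read`, a new field touches `Measurement` and `convert`, and a domain expert can adjust the rules in `validate` without reading the rest.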
OK, that's my 2 cents.
200,000 measly lines of code?
Having done a lot of code maintenance - including Y2K certification of the "MMDF" code base (first comments/headers would have negative unix timestamps) - he needs to start by learning about code beautifiers and finding a style he finds easy to read.
Then, personally, I try to storyline the code. Sometimes more creatively than others, but those are extreme cases.
-- A change is as good as a reboot.
Prepare three envelopes.
If your new code does what the old code did when both are fed the same input, you're good to go.
I am curious just how long the project you propose will take to complete given that you need to produce an infinite combination of inputs to succeed.
Your basic idea is not bad, but it's simply impossible to apply to the whole system at once. That's why I suggested a small piece that interfaces with at most one other system, so you can in practice limit inputs and have someone else really tell you what is flawed in your output, because you will have no idea. You will still forget many bits of key input that lead to dramatic errors, but hopefully after a year or two most of that will be ironed out without too many people fired.
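For a piece that small, feeding the same inputs to old and new code is tractable. A Python sketch, assuming a made-up shipping-cost routine as the "small piece" and an input space narrow enough to enumerate; both functions and the cost rules are hypothetical.

```python
from itertools import product


def old_shipping_cents(weight_kg, express):
    # Hypothetical legacy routine, kept verbatim as the reference.
    cost = 500
    if weight_kg > 20:
        cost = cost + (weight_kg - 20) * 25
    if express:
        cost = cost * 2
    return cost


def new_shipping_cents(weight_kg, express):
    # Hypothetical rewrite of the same routine.
    base = 500 + max(0, weight_kg - 20) * 25
    return base * 2 if express else base


def find_mismatches(old, new, cases):
    """Return (args, old_result, new_result) for every disagreement."""
    mismatches = []
    for args in cases:
        expected, actual = old(*args), new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# The limited input space: whole kilograms 0..100, with and without express.
cases = list(product(range(0, 101), (False, True)))
mismatches = find_mismatches(old_shipping_cents, new_shipping_cents, cases)
print(len(cases), "cases checked,", len(mismatches), "mismatches")
```

This only shows the two versions agree on the inputs listed in `cases`. Inputs nobody thought to include are exactly the ones the paragraph above warns about.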
Your concept also requires the system to be able to be run wholly in isolation to run input through, something nearly impossible to do for many IT systems, ESPECIALLY the spaghetti kind. If they couldn't write good code, why do you think they would have made it easy to test?
With an ancient and bad code base there are NO aspects that are in your favor or will help in any way.
... code and save it for the next management luncheon. Serve it up with marinara sauce. If the managers get sick, then toss it. If they don't, then give them the leftovers.
--
How much of this post is literal and how much is metaphor is left as an exercise to the reader...
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
If nobody knows if the original program is functioning correctly then nobody knows if the replacement is functioning incorrectly.
Another person who has never developed software in a company I see.
You'll find ZERO people that are willing to confirm your output is correct.
However you'll find hundreds eager and willing to find flaws in your output. Sometimes even if they have to make it up, or sometimes even if they just aren't sure. For any question raised you must be prepared to prove that any questioned output is correct. Any misstep means the re-write is canceled.
Often (and I am not kidding here) it is easier to simply start a new company and use better software from the outset.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
There's a lot of 40-year-old women these days who look pretty hot in a speedo. People aren't aging as badly as they used to.
None of the general principles people are putting forward for this case are that critical. G2 is weird; don't start out assuming everything's broken and needs fixing. Some bad experience happened, but they apparently fixed it and are still productive with the system, so be cautious and learn the system. The best advice I can give on this is WORK WITH THE VENDOR of G2; they can probably help the guy get up to speed. The point is, they've got a good working system that has been evolving for 20 years or so, so don't think you're going to just rewrite the damn thing. There may be really good reasons behind it that he couldn't possibly know yet. Spend a year studying the system, then worry about improvements. If they want him to run some classes on best practices while he's learning, that's cool. I also don't think he's "inherited" anything; the employers are not going to let the guy launch off on a crusade to fix what is working well for them already. Humility is appropriate here.
Yeah, it was posted on ArsTechnica yesterday too (kinda like a professional, adult version of Slashdot).
But it's OK, I still stick to Slashdot because I enjoy reading the trolls for some strange reason.
Good software evolves.
You need to forget all grandiose plans to fix the software, especially if the ideas come from academia. Become unprincipled with respect to practicality.
The software has already evolved a lot. If you attempt to recreate from scratch, you will almost certainly repeat previous errors, even if you avoid the current maintenance mess.
The problem is that the software has evolved towards external goodness, while ignoring maintainability goodness.
Make a secret plan to slowly evolve the software towards maintainability.
Slowly sneak maintainability and sound software practice in the back door when no one is looking.
Scream loudly about how big the mess is and how amazing it is that anyone can make any changes at all to it.
Make the point that it is not how well the elephant dances, but the fact that the elephant can be made to dance at all. This will give you more time.
'Tis still my favorite dish. When someone comes in behind you and "cleans up" your code, you often lose a bit of functionality... If not entirely. I agree things need to be cleaned up from time to time, but I've run into some problems like that. And nah, I don't think I'm Linus Torvalds or anything. I've done some tinkering throughout the years though.
I feel his pain. I'm currently dealing with over 5,000 badly written, undocumented Java classes. Just because it's object oriented doesn't mean it isn't spaghetti.
I'm old enough to remember when discussions on Slashdot were well informed.
I am sure that when developers look at our code 20 years from now, they will refer to it as spaghetti code, especially when they work with a lot of the convoluted implementations of MVC-inspired frameworks. They'll wonder what the !@#$% we were thinking.
They'll wonder why, for a simple query, they have to create four new templates and modify half a dozen more. They'll wonder why there is no reference guide linking these together, and why they're stuck following a trail of breadcrumbs. They'll be flabbergasted to think that we thought we were somehow decoupling things when in fact we greatly increased the interdependence of files.
Yes, in 20-30 years, our great decoupled object-oriented programming will likely be the spaghetti code of its day and be viewed as being as archaic as GOTO logic.
***
And unlike procedural spaghetti code, in which one could often follow the logic procedurally, OOP spaghetti code is often so decoupled that it is extremely hard to follow, or to discover where something may be called from, if you do not have some sort of road map.
And let's be honest, do you really think our usually poorly documented road maps for our OOP apps will still exist for the developers in 20+ years?
Do you really think there was no documentation whatsoever for all these applications written in COBOL decades ago? Back then it was almost standard to first write a pseudocode version of an app.
No, we just tend to be more arrogant because we're "uptime"...
1632
The first things to do are the following:
1. Assess the present stability of the code. Is it for the most part functioning? If there are a handful of critical errors, address those, but do not get bogged down.
2. Assess the core requirements. When I worked on a medical practice management application we had tons of requirements, many of which we were told were ABSOLUTELY CRITICAL to the practice. A year later we'd find out said feature had not even been used once.
Determine which features/actions are core, then rewrite these in new code with a critical eye to performance and stability.
3. Run concurrent systems, transition 90% of core tasks to the new re-written, more extendable and adaptable code. Realize there will be some compromise on perfection in order to inter-operate.
Leave uncommon tasks to the old antiquated system. Sure it sucks for a user to have to log into the old system 3-4 times a year to run some quarterly task or procedure. But it's something most users can deal with, and will do so with only minimal griping. Especially if the core tasks are now functioning at an improved level.
4. Road map features and slowly add new feature requests or migrate old features to the new app as user need requires. Why migrate an old feature that's used 1-2 times a year if a new feature is needed every day? Why add a new feature used 1-2 times a year if an old one that is used weekly can be migrated instead?
5. Enjoy the fact that you're not unemployed and homeless. Yes, the guy who posted up above exclaimed he'd rather be homeless. But I wager most people are smart enough to know that undesired homelessness without a support system sucks, and would prefer NOT to be there.
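Steps 3 and 4 above can be sketched as a thin dispatcher that sends migrated tasks to the new code and everything else to the old system, plus a helper that ranks what to migrate next by how often it is used. This is only an illustration: the task names, usage counts, and the `new_system` / `old_system` stand-ins are hypothetical, not anything from the application described.

```python
# Hypothetical set of tasks that have already been rewritten.
MIGRATED = {"schedule_appointment", "bill_patient"}


def new_system(task, payload):
    # Stand-in for the rewritten code path (assumption for illustration).
    return ("new", task, payload)


def old_system(task, payload):
    # Stand-in for a call into the legacy application (assumption).
    return ("old", task, payload)


def dispatch(task, payload, migrated=MIGRATED):
    """Send migrated tasks to the new code, everything else to the old system."""
    handler = new_system if task in migrated else old_system
    return handler(task, payload)


def next_to_migrate(usage_per_year, migrated):
    """Rank the not-yet-migrated tasks, most frequently used first."""
    remaining = {t: n for t, n in usage_per_year.items() if t not in migrated}
    return sorted(remaining, key=remaining.get, reverse=True)


# Hypothetical usage counts per year.
usage = {"bill_patient": 5000, "weekly_audit": 52, "quarterly_report": 4}
queue = next_to_migrate(usage, MIGRATED)
```

With these made-up numbers the weekly task is queued ahead of the quarterly one, which keeps running on the old system until someone actually needs it moved.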
Don't start on a 200k spaghetti code base until you have satisfactory answers to these questions:
#1 - Is the code under code management such as svn, git, or cvs?
#2 - Do you have a proper bug and new feature reporting system?
#3 - Does anyone know how changes/deliverables are managed? Is the sales force out there just selling willy-nilly? (Don't laugh, please, I've seen it.)
#4 - Does anyone do Q/A?
#5 - Are you being given a raise for this new responsibility?
#6 - What possible career-advancing goal do you meet by picking this up? Most likely it's working with dead code, on a dead ship on a dead sea.
#7 - Does your manager want you to deliver at the same rate as your predecessor?
#8 - You talk to your boss about hiring someone and restructuring the code base, but he says there's no money or head count. He just wants you to 'fix' it up.
#9 - Do you have to meet once a week with some marketing fool and explain why you haven't met their unrealistic milestones?
#10 - Did the last guy go because he had a heart attack? If so, do you want one?
200k lines is big, but not impossible. The real problem is what some subject matter expert coded 12 years ago, who left 8 years ago, and no one's looked at since, because everyone's scared of it.
What a lot of folks in the comments I've skimmed utterly ignore is that this is a "complex chemical processing plant". If a Slashdotter's code crashes, oh, well, they get yelled at, and it gets fixed.
With a chemical plant, it's not quite the same. Think of the disaster in Bhopal, India, or the last major oil refinery fire.
The way I'd deal with it is this:
0) Identify one, or preferably more than one, subject matter expert in each area of the plant
covered by the software.
1) it's 10-20 yr old *spaghetti* code. Document it. ->Do flowcharts-. (And if you kids look down
on them, that's because you don't understand a toolbox larger than 1 hammer and 1 screwdriver,
and if I were hiring you, you'd be as junior as it gets). *THEN* you have some idea of what's
happening, in what order.
2) Bring in the subject matter experts for a working meeting, with very high level diagrams, and let
them figure out what section of the code, and process, they know about.
3) Set up meetings with individual subgroups, and get lower level flows - the code should relate to
the process in the plant, which presumably starts at one end, and various things come out
at various points
4) Identify where stuff jumps the line, and whether it actually does that, or whether that's a major
problem.
5) And pick a common language that's stable, going to still be common in 20 years, has a *lot* of
folks making a living in it now... and one that's close enough to G2 that the people you have
to work with, who'll probably be there long after you're gone, will be able to transition to
easily, without a lot of resistance. Since you're saying Pascal w/ graphics, I'd suggest an
older language - C, or C++ - I promise you'll have a lot of problems with Java, and as for
current fad languages.... I do think you'll really, really want a *compiled* language, for
something like this, not a scripting language.
6) Now you're down to normal architecting: : des
Double whoosh.
You may want to read the replacement comment I suggested. It does tell you even less than the original one! And I "justified" it with common wisdom!
(Somehow I fear you really thought my code comment was a good one ... in that case, please tell me how to avoid software you've written!)
The Tao of math: The numbers you can count are not the real numbers.
Exactly. If at all possible, build a test suite of data that exercises the old program, then make sure the newer versions give identical answers. If possible, generate random data as well, to hit logic that would otherwise only be exercised by chance. I don't know how you could do this if you are talking about a GUI program; I am thinking of engineering-type problems.
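A minimal sketch of that idea: run the same recorded and random inputs through the old and the new routine and list every input where the answers differ. Both `legacy_dose` and `rewritten_dose` are hypothetical stand-ins invented for this example; in practice the "old" side would be a call into the existing program.

```python
import random


def legacy_dose(flow_rate, concentration):
    # Hypothetical stand-in for the old routine.
    result = flow_rate * concentration
    if flow_rate > 100:
        result = result * 9 // 10
    return result


def rewritten_dose(flow_rate, concentration):
    # Hypothetical rewrite that is supposed to behave identically.
    result = flow_rate * concentration
    return result * 9 // 10 if flow_rate > 100 else result


def compare(old, new, cases):
    """Return every input tuple for which old and new disagree."""
    return [args for args in cases if old(*args) != new(*args)]


def recorded_cases():
    # Hand-picked inputs, including values on either side of the boundary.
    return [(0, 0), (100, 5), (101, 5), (250, 40)]


def random_cases(count, seed=0):
    # A fixed seed keeps a failing run reproducible.
    rng = random.Random(seed)
    return [(rng.randint(0, 500), rng.randint(0, 50)) for _ in range(count)]


mismatches = compare(legacy_dose, rewritten_dose,
                     recorded_cases() + random_cases(1000))
```

An empty `mismatches` list only says the two versions agree on the inputs tried, not that the old answers were right in the first place.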
"There is no god but allah" - well, they got it half right.
Many people will tell you not to upgrade old applications. It's not worth the trouble and it's much easier to just write a new one, right?
Wrong! (usually)
No one likes to endure the "learning frustration" of figuring out someone else's code, particularly several someone elses'. But there is knowledge in that old code that no one else has, and functionality that no one knows about. Except your customers, who will complain vociferously!
Writing a brand new Application will cause the loss of functionality. It will very likely cause the loss of operability, because there are things that you don't know, without which the app will not work.
In the last year, I have seen this happen to two separate hardware manufacturers, whose "smart" products we have to use. I have also had it happen to me once, a long time ago before I learned better. It has also caused more than one company to go out of business!
Be warned. The hard way is often the easy way, in the end.
I once worked for a company whose code base had been started in the 1960s and built on ever since, and it was all spaghetti. In the late 1990s they decided to un-spaghetti it. At one stage I was given four programs (all spaghetti, no documentation) to work out what they did and rewrite them. They came to 10,000 lines of code. I managed to do it in 2 weeks. (This doesn't include the code review by three other programmers or the testing by the testing department to have them sign off and agree it all worked. It didn't require any changes or rewrites, so it was good code.) So, someone working at the same pace could theoretically complete the 200k-line task in 40 weeks. Of course, this doesn't take into account the complexity / obfuscation of the code. But for 200k lines of spaghetti at its worst, maybe a year and a half for a lone programmer to complete (taking into account code reviews, testing, etc.).
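The arithmetic behind those two figures, spelled out. The only inputs are the numbers quoted in the comment; the "implied overhead" is just what falls out when the year-and-a-half guess is divided by the raw 40 weeks, not a measured value.

```python
LINES_REWRITTEN = 10_000   # the four programs from the anecdote
WEEKS_TAKEN = 2
TOTAL_LINES = 200_000

# Scaling the same pace up to the whole code base.
coding_weeks = TOTAL_LINES / LINES_REWRITTEN * WEEKS_TAKEN

# "A year and a half" expressed in weeks.
quoted_weeks = 1.5 * 52

# Multiplier that review, testing, and harder code would have to account for.
implied_overhead = quoted_weeks / coding_weeks

print(f"raw coding: {coding_weeks:.0f} weeks")
print(f"quoted total: {quoted_weeks:.0f} weeks")
print(f"implied overhead: {implied_overhead:.2f}x")
```

So the year-and-a-half figure amounts to roughly doubling the raw coding time.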
Sure enough, the cow costume was hanging up next to the superhero outfit and sailors uniform. (S,Spud)
If it was my task, I would first make certain that I had a copy of all the source and that if at all possible, using a test system, recreate the production executables.
You may have to do this while "maintaining" some of the known critical code via bug fixes. Until you have a working test system that matches the production system, you will never know whether the code you have to maintain matches production, is actually used, or was never deleted because it was a "just in case" copy.
Thereafter, were I doing it, I would implement some change management procedures. Any change has to come in the form of a request, with a justification. This CM process will help you get a handle on the business priorities. I can email a CM form.
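One way to check whether rebuilt executables match production is to compare file checksums. The sketch below assumes two directories of build outputs, whose paths the caller supplies; the function names are made up for illustration. Note that some compilers embed timestamps or paths, so a "differs" result is a prompt to investigate rather than proof that the source is wrong.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large executables need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_in(directory):
    """Map each regular file name in a directory to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in Path(directory).iterdir() if p.is_file()}


def compare_builds(production_dir, rebuilt_dir):
    """Report, per file name, whether the rebuild matches production."""
    production = hashes_in(production_dir)
    rebuilt = hashes_in(rebuilt_dir)
    report = {}
    for name in sorted(set(production) | set(rebuilt)):
        if name not in rebuilt:
            report[name] = "missing from rebuild"
        elif name not in production:
            report[name] = "not in production"
        elif production[name] != rebuilt[name]:
            report[name] = "differs"
        else:
            report[name] = "match"
    return report
```

Files reported as "missing from rebuild" are the ones for which you may not have the source at all, which is worth knowing before anything else.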
If you can get a college student or intern to help you out, go for it. Your job is going to need help, and a project of this size is just right for a one semester project.
As you group related source files into separate directories (that is, as you organize the sources), your task may suddenly not appear as bad as you thought. Do not think about the coding; concentrate on the business processes, and most certainly visit the end users to find out when their subsystem was implemented. Try to match that with source dates or comments within the sources. Name your directories after the business processes.
Please note, you cannot do it all in a day. It will take about 16-20 weeks of dedicated work to complete the cataloging and getting a proper handle on the business application.
Best of luck.
Leslie Satenstein Montreal Quebec Canada
Major rewrites should only come when the cost of adding necessary new features exceeds the cost of the rewrite by some margin.
If you're lucky, you can build something with an identical interface that lends itself well to automated and manual testing of features to ensure all of the features get implemented. If you're not, you have to dredge up requirements from years of bugfixes and enhancements to ensure that you're covering all your bases.
Stepwise refactoring is usually much easier and safer than rewriting something.