Slashdot Mirror


How To Deal With 200k Lines of Spaghetti Code

An anonymous reader writes "An article at Ars recaps a discussion from Stack Exchange about a software engineer who had the misfortune to inherit 200k lines of 'spaghetti code' cobbled together over the course of 10-20 years. A lengthy and detailed response walks through how best to proceed at development triage in the face of limited time and developer-power. From the article: 'Rigidity is (often) good. This is a controversial opinion, as rigidity is often seen as a force working against you. It's true for some phases of some projects. But once you see it as a structural support, a framework that takes away the guesswork, it greatly reduces the amount of wasted time and effort. Make it work for you, not against you. Rigidity = Process / Procedure. Software development needs good processes and procedures for exactly the same reasons that chemical plants or factories have manuals, procedures, drills, and emergency guidelines: preventing bad outcomes, increasing predictability, maximizing productivity... Rigidity comes in moderation, though!'"

37 of 236 comments

  1. ...no by Anonymous Coward · · Score: 4, Funny

    no comment...

    1. Re:...no by wmac1 · · Score: 3, Interesting

      In several iterations, disentangle and break the code into smaller and more understandable chunks.

      After breaking the code into smaller chunks clean them up (code conventions, algorithms, ...) and reorganize as needed.

    2. Re:...no by DJRumpy · · Score: 4, Insightful

      I disagree. For 200K lines of code, you immediately start a new project to produce the next major release of said code.

      200,000 lines of code is a large project, but very doable even with a small team or one person. Although you could go in and attempt to tighten up the code in smaller chunks, the very fact that this code was written over the course of many years, probably by many authors in many styles, means it probably follows no standard for general layout, declarations, etc. (hence the spaghetti).

      I would simply support what's there with only a break-fix policy, and immediately start documenting all aspects of its functionality to rebuild it from the bottom up. The very fact that this code would have so many styles would mean most of it would have to be re-written and documented anyway.

      Document the functionality, re-implement with standard code to guidelines, include any feature enhancements that may exist, release new version.

    3. Re:...no by kestasjk · · Score: 2

      It depends what sort of thing it is, how complex it is, whether the software is the kind of thing that can be tested easily or if the rules are embedded in the spaghetti, etc, etc.

      One person given 200KLOC of complex spaghetti to rewrite, though, with little documentation outside of the code, and software that doesn't lend itself to automated testing, where the spaghetti logic is of consequence to the business... that could be a very, very long project (years, easily).

      --
      // MD_Update(&m,buf,j);
    4. Re:...no by Rockoon · · Score: 3, Insightful

      Isn't this how Netscape died?

      --
      "His name was James Damore."
    5. Re:...no by DJRumpy · · Score: 2

      Yes and no. Part of the complexity of a new program is missing here. The full functionality is mapped out, albeit they will have to glean some functionality from the code and most from the end users. That's a huge boon as long as the end users can effectively communicate or demonstrate that functionality.

      In short, the map is already written and scope creep can be largely minimized with some proper expectations and management.

    6. Re:...no by DJRumpy · · Score: 4, Informative

      A break-fix policy is simply stating you will support code to fix any breaks in functionality, while denying any enhancement requests. In short, the only changes you make to the old code would be to fix production issues.

      It lets you focus efforts on implementing new code while avoiding supporting enhancements on the old code.

    7. Re:...no by eugene+ts+wong · · Score: 2

      That's a great policy to have. I wish that Linux distros and various apps were developed like that. I only upgrade for the bug fixes and the security fixes.

    8. Re:...no by ralphdaugherty · · Score: 4, Informative

      G2 is being called virtually obsolete. I looked up G2 in the Wikipedia comparison of programming languages http://en.wikipedia.org/wiki/Comparison_of_programming_languages and it is listed as:

      Language: G2
      Intended use: Application, inference, expert system
      Paradigms: common graphical development and runtime environment, event-driven, imperative, object-oriented

      Plus the search on G2 shows there is a G2++. So what does obsolete mean to those calling it obsolete?

      btw, I'm an RPG programmer and I've been writing tons of new business software every day for the last 23 years, the whole time the language has been declared obsolete.

      Now get off my lawn.

    9. Re:...no by crankyspice · · Score: 3, Informative

      Isn't this how Netscape died?

      According to Joel Spolsky...

      --
      geek. lawyer.
  2. "Cobbled together over 10-20 years . . . ?" by PolygamousRanchKid+ · · Score: 5, Insightful

    Wouldn't that be Linux? It seems to work fine for me.

    If something has become spaghetti over 10-20 years, then no one cared that it became spaghetti over 10-20 years. And it will still be spaghetti over the next 10-20 years. Fixing something like this requires a commitment from management, which means money. If the management of the project aren't convinced that cleaning up the development process is worth the initial investment for the long term, then they choose to deal with the constantly higher costs forever.

    Something like this makes me think that this is one of those problems that get pushed off for someone else to deal with later. And the next person perpetuates this, by doing the same.

    --
    Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
    1. Re:"Cobbled together over 10-20 years . . . ?" by gl4ss · · Score: 3, Insightful

      would fixing it to not be spaghetti be any better though? would fixing it to be really non-confusing depend on changing real world processes of clerks etc?

      spaghetti turned into OO is going to be spaghetti too, just with a lot more of sauce and different bits which look almost the same but turn out to be tomato or minced meat upon closer inspection and when you take plunge into the platter and eat something you don't really know what you ate or how it got there, what I'm trying to get at is that if it's spaghetti because of already re-using lots of the code in lots of places then the rewrite could end up being 2 million lines (commercial rewrites of sw for gov. have a habit of ending up like this..).

      at least he can probably feel good about it not being Knight Capital's trading system.

      but he's asking how to deal with it, with good development tools(that have good find/locate) and good memory. start exploring what the sw really does and guessing where you might be asked to do modifications.

      --
      world was created 5 seconds before this post as it is.
    2. Re:"Cobbled together over 10-20 years . . . ?" by Anonymous Coward · · Score: 3, Insightful

      When I was taking my 2nd degree in Computer Science years ago, there were quite a few mature students. They actually did well in class in spite of other distractions in life such as family.

      There is nothing wrong with people deciding to learn new things or switch careers later in life. Are you biased, or do you just like to discriminate against people for their age?

      I would think that mature people are better suited for computer science, as they are not as distracted by the "shiny" new thing that comes along and do not constantly have to redo things, not because they don't work, but because they are not the "latest and greatest".

  3. Easy! by blackicye · · Score: 4, Funny

    Outsource it to India!

    1. Re:Easy! by mrmeval · · Score: 2

      Huh, they're getting to be as expensive as in-house development with just as sloppy code depending on which business you go to. There is also a substantial language barrier in some cases: plenty of developers speak English, yet some concepts still do not come across. You then need a liaison who understands them and can translate your requests. That adds cost. We did find a very good software development company that is prompt, competent and who worked to understand our needs. We do save money but it's not nearly as much as it used to be. The costs won't go too high as India's success has several countries catching up to them.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
  4. Re:Farm out OP writing, too. by symbolset · · Score: 3, Interesting

    I advise printing it out and posting it in the hall in printed form, annotated with contributors. Paper the walls with it. You would be surprised how much more efficient that makes a coder even if it's not ever read.

    Also, for debugging sometimes you have to get down on the floor with 50 pages of fanfold and work it out. Decorum be damned. A PC only has so much vertical resolution and sometimes you need more than to peek at the code through a sliding window.

    --
    Help stamp out iliturcy.
  5. Re:stackexchange. by blackicye · · Score: 3, Informative

    I knew I read this before:
    http://programmers.stackexchange.com/questions/155488/ive-inherited-200k-lines-of-spaghetti-code-what-now

    That article is linked in the first sentence of the summary.

  6. 1. Lose the attitude by Anonymous Coward · · Score: 5, Insightful

    Well step 1 would be to lose the attitude.

    It's code, it may be in an obsolete language, it may not be up to the best industry standards, but it's code and it's got enough knowledge in it that nobody wants to throw it away, and they hired you to maintain it.

    Step 2, I don't know why you would define a process before you understand the code you are to apply the process to?
    Seriously, wtf is all the process stuff about, you're the sole programmer, any rules you set are rods to break your back when you first hit a piece of code where you have to break the rules.

    Step 3, you serve them. If you want to port it to a more modern maintainable language, choose one that's easy for THEM to transition to. They've got the knowledge that drives the company, not you, you are the cleaner here. If the phone rings you turn off your vacuum and let them do their job, Mr Cleaner. Nobody gives a fork if the cleaner has best industry procedure for cleaning an office.

    Step 4, break it down. tiny bit by tiny bit, port to a new CLEAN structure, bit by bit. They wrote it, they can identify the core stuff.

    Step 5, once you've ported it, along comes an engineer with code written in the old language and with the old methods. Again, that's fine, put away the process manual, these are the experts, if that's the language he can communicate to you in, it's fine, you can understand it, you can port it, you can help him speak the modern lingo. Don't quote your processes to him, you're just the cleaner.

    As for this:
    "Software development needs good processes and procedures for exactly the same reasons that chemical plants or factories have manuals"

    That's someone who *implements* things, typically a bolt together module manager. He is not someone who creates *new* things. Because new things don't come with manuals. You don't know the rules of how they work till the problems needed to make them work are solved. One assembles Microsoft IIS blocks, the other works for Google on image processing. Which are you?

    1. Re:1. Lose the attitude by gbjbaanb · · Score: 2, Insightful

      unfortunately, I think most devs (especially the kind to complain about someone else's "crufty" code) will spend months rewriting, refactoring and introducing today's "best practices" like IoC and Dependency Injection and come up with 300kloc of even worse spaghetti code, that now has extra bugs to be fixed too.

      A bit like how a discussion on stack overflow ended up discussed on ArsTechnica for some (probably advertising-related) purpose and now has come to Slashdot for further adoption. 4chan, you're next.

    2. Re:1. Lose the attitude by assertation · · Score: 2

      I don't mean any disrespect, but you seem to imply that "those people" don't know what they are talking about and have a bit of attitude. It seems like your post has a bit of an attitude.

      Those are good techniques you are criticizing.

      You could have brought up the very good point that programmers should ask themselves why the former devs did something a certain way.

  7. Don't touch it by GrahamCox · · Score: 4, Insightful

    All the advice to rewrite it is misguided. Maybe rewrite the small parts you must in order to keep it working on new hardware, or whatever, but if it works, I would think that wholesale rewriting is asking for trouble. The Ars article is full of great advice about what you should do to manage a large codebase going forward, but actually it doesn't really address the question of what to do about a large legacy codebase that wasn't written with best practice. The best software is written by incremental improvement of what went before (no matter how badly written, as long as it meets its specification) - big projects written from scratch usually fail.

    1. Re:Don't touch it by ed1park · · Score: 4, Insightful

      Beware of the second system effect.
      http://en.wikipedia.org/wiki/Second-system_effect

      Rewriting code can kill you in the short term.
      http://www.joelonsoftware.com/articles/fog0000000069.html

      Or help you in the long term.
      http://notes-on-haskell.blogspot.com/2007/08/rewriting-software.html

      I recall another similar article about a rewrite of MS Office, and what a mistake it was...

  8. Re:Don't put up with it. by rmstar · · Score: 3

    I'd rather be homeless than be in charge of a project my boss doesn't really care about. Talk about the fast track to nowhere. Even if by some miracle you do pull it together, nobody will know or care.

    If things go well, your bank account will know, for example.

    What does it even mean "my boss doesn't really care about" a project? Is his vision somewhere else, but has given you the job of guarding the crucial machinery on which everything else builds?

    (And being homeless sucks badly, by most accounts.)

  9. Re:Rewrite it by StripedCow · · Score: 4, Insightful

    200k isn't something you're going to rewrite in a couple of weeks. I think the absolute maximum you could get (for one very skilled person) would be about 5k-10k per week. Rewriting would take on the order of half a year.
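The arithmetic behind that estimate is easy to check. The sketch below assumes only the figures given in the comment (200k lines, 5k to 10k rewritten lines per week) plus an average month of 52/12 weeks; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the rewrite estimate above.
# Assumption: the only inputs are the comment's own figures.
TOTAL_LINES = 200_000
WEEKS_PER_MONTH = 52 / 12  # average month length in weeks, about 4.33


def weeks_needed(total_lines: int, lines_per_week: int) -> float:
    """Weeks required to rewrite total_lines at a steady weekly rate."""
    return total_lines / lines_per_week


for rate in (5_000, 10_000):
    weeks = weeks_needed(TOTAL_LINES, rate)
    months = weeks / WEEKS_PER_MONTH
    print(f"{rate} lines/week -> {weeks:.0f} weeks (~{months:.1f} months)")
```

That gives 20 to 40 weeks, so "half a year" sits near the optimistic end of the stated range, and only if the peak rate is sustained for the whole project.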

    --
    If Pandora's box is destined to be opened, *I* want to be the one to open it.
  10. 200K Lines not that much by PerlPunk · · Score: 4, Interesting

    I have spent most of my career as a software developer inheriting and updating such spaghetti code bases. Here are a few remarks and some of my experiences around this:

    1) 200K lines is not such a formidable size. If your average module size is 1000 lines of code, that's 200 separate modules. Or if the module size averages 2000 locs, that's 100 modules. Gradually getting your head around the modules is not as big a problem as it seems, even if there are many interdependencies between modules. However, if the average module size is something on the order of 10K or 20K, then you really are dealing with spaghetti code, and that's quite a bit harder to figure out than if the module size on average were around 2 or 3K.
    2) For the time being, treat the whole application like a black box, which means not worrying too much about how well it works until you have to fix some "bug". At that point, figure out how it works only insofar as you need to in order to get your bug fix in, and record your lessons learned in a wiki and in comments in the code. Refactor as you go along, if feasible.
    3) Being able to step through code is really helpful when trying to understand a poorly documented code base--even if the code is well structured. A number of technology platforms (like the Java JVM) offer remote debugging.
    4) You can reverse engineer these things and produce a set of business specs with which to port the application to a new platform. Right now, I'm on a project that is porting 125K lines of COBOL code that ran on OS/2 to an Apache/mod_perl technology stack. Our team consists of 2 COBOL developers, who are producing the specs from the code, and 3 Perl developers who are porting it. The key here is to capture the business requirements and the user interface behavior. Once you do that, how you implement it on the new platform is quite straightforward. HOWEVER, this approach is not advised unless your company or gov enterprise has lots of time, deep pockets, and a commitment to seeing the project through to its eventual success.

    In summary, don't be too scared of a legacy spaghetti code base. These things can be understood well enough in time to refactor or port to a new platform.

    1. Re:200K Lines not that much by wvmarle · · Score: 2

      1) 200K lines is not such a formidable size. If your average module size is 1000 lines of code, that's 200 separate modules. Or if the module size averages 2000 locs, that's 100 modules.

      You make a very big assumption here, and that is that this code is written in neatly separated modules. The submitter is asking for help and calls it "spaghetti code" that has been "cobbled together" over a long period, presumably by many different developers; those points make me doubt you can make such an assumption.

      It can very well be a single "module".

  11. NOT like a chemical plant by Troyusrex · · Score: 2

    The difference between software development and most manufacturing is that they produce the same or very similar product thousands or millions of times where we produce a different product every time. This "building an app is like mass producing a chemical" philosophy is one reason why most software shops have insane amounts of unneeded documentation and overhead. I certainly agree that some standards, processes and documentation are needed but they should be kept to the bare essentials as every bit of work done that doesn't directly build a product could well just be a waste of time.

  12. Inexact Results by SuperKendall · · Score: 5, Insightful

    Rewrite it from scratch using the spaghetti code version to run correctness tests to verify you haven't changed the behaviour.

    And just how are you supposed to write "tests for correctness" when the very concept of what is "correct" is embedded in the code?

    Any such tests would embody your own notions of what is correct based on your understanding of a codebase that cannot be understood.

    Furthermore, Doom is quite a different thing. You have an end result that can be somewhat different and it doesn't matter - it could render textures such that they appear rather different but if you find it visually OK then it's fine.

    No such luck with business software, which usually has extremely rigid and exact output, often output other systems depend on being just so. There is no room for alteration of behavior, yet, as I said, nothing anywhere explains all of the features of that output, so you cannot possibly understand them....

    I agree with a few responses that the only way to proceed is to re-write tiny parts, that at most affect one other system in the company - with the explicit buy-in from those other groups something may change, and the understanding you may have to back out your changes wholesale if you cause too much disruption.

    Can't get buy in to proceed? Then quit or work with the code as is.
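For readers who have not seen the approach being criticised here, "using the old version to run correctness tests" usually means a differential test: feed both implementations the same inputs and report every disagreement. The sketch below is an illustration only; both invoice functions and the bulk-discount rule are invented for the example, not taken from the system under discussion.

```python
from typing import Callable, Iterable, List, Tuple

Case = Tuple[int, int]  # (quantity, unit_price_cents)


def legacy_invoice_total(quantity: int, unit_price_cents: int) -> int:
    """Hypothetical stand-in for a routine buried in the old codebase."""
    total = quantity * unit_price_cents
    if quantity >= 10:
        # Undocumented business rule: 5% bulk discount, rounded down.
        total -= total // 20
    return total


def rewritten_invoice_total(quantity: int, unit_price_cents: int) -> int:
    """Hypothetical rewrite whose author never learned about the discount."""
    return quantity * unit_price_cents


def find_mismatches(
    old: Callable[[int, int], int],
    new: Callable[[int, int], int],
    cases: Iterable[Case],
) -> List[Tuple[Case, int, int]]:
    """Return (inputs, old_result, new_result) for every disagreement."""
    mismatches = []
    for case in cases:
        expected, actual = old(*case), new(*case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


cases = [(1, 250), (9, 250), (10, 250), (100, 199)]
mismatches = find_mismatches(legacy_invoice_total, rewritten_invoice_total, cases)
for case, expected, actual in mismatches:
    print(f"inputs={case}: legacy={expected} rewrite={actual}")
```

The objection above still stands: the comparison only covers the inputs someone thought to try. Drop the last two cases and the rewrite passes, even though it silently loses a rule that other systems may depend on.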

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  13. Re:Farm out OP writing, too. by Sulphur · · Score: 2

    I advise printing it out and posting it in the hall in printed form, annotated with contributors. Paper the walls with it. You would be surprised how much more efficient that makes a coder even if it's not ever read.

    Also, for debugging sometimes you have to get down on the floor with 50 pages of fanfold and work it out. Decorum be damned. A PC only has so much vertical resolution and sometimes you need more than to peek at the code through a sliding window.

    You can also flip through fanfold like a book and see two pages at a time with an option to turn several pages at once.

  14. Re:in my experience... by Anonymous Coward · · Score: 2, Informative

    Usually when the inexperienced programmer does the rewrite, "initial results are promising".
    That is, it is possible to quickly build a basic framework with badly designed error handling and
    only part of the requested features.

    When it is tested in real life, lots of omissions and problems are found.
    Once these are fixed, the result is as messy and unstructured as the initial system was before
    it was rewritten.

  15. Good luck with that by Anonymous Coward · · Score: 2, Insightful

    Your employer probably isn't interested in spending the time and money on a re-write. Nor are the clients going to be interested in waiting that long for new features, either.

    You will be made to figure it out and add features, or you will be shown the door.

  16. 200k lines isn't too much by SpaghettiPattern · · Score: 2

    Over the past years I have been in such a predicament several times. Huge amount of code and the function of the system is not completely clear. The original developers are gone, the system isn't well documented and only a handful of people know how it should behave. As a matter of fact, tomorrow I will start coding on one system we can no longer support as hardware, OS and used libraries and frameworks are outdated and/or discontinued.

    Reengineering and rewriting is usually the best option. However, you need skills and experience in order not to make the same mistake the previous developer did. Of course, management must trust and approve your actions.

    A few dos:
    * Learn at least UML use cases, components diagrams and sequence diagrams.
    * Make use cases and check these with affected parties.
    * Start off with a rough component model of the new system.
    * Make a clear picture which nodes (hardware + OS), subsystems (units performing a function), software components (modules containing data, modules performing a function, etc...) and agents (users, triggers/schedulers) are involved.
    * Draw the interactions between the subsystems and/or software components.
    * Clearly document which interactions are on-line and which ones are batch/background/off-line.
    * Specify interfaces. (Used file formats, protocols, software library interfaces if you will.)
    * Slowly refine your model until you feel comfortable with it.
    * Make a rough class model and keep usability and maintainability in mind. Backtrack if necessary.
    * Divide software components between "dumb" containers of information (e.g. plain Java beans) and components performing functions (business logic if you wish.)
    * Decide which interfaces to make public and which not.
    * Describe restricted/private bits of code just enough for maintainers to understand them. And nothing more than that.
    * Make as many unit tests as necessary for your components. Unit test enough functionality.
    * Communicate your results regularly and refine your model where applicable.
    * Define integration tests and do these very seriously.
    * Define regression tests and perform these very seriously.
    * Make involved parties accept parts of the system according to performed integration and regression tests.
    * Try to plan gradual decommissioning of the legacy system.
    * Document the system "enough". System architecture (from UML), references from architecture to code, installation manual and operational manual are the most important ones.
    * Try to achieve longevity in the documentation. Abstract details and convince involved people that that is a good thing.
    * Define 1st, 2nd and 3rd level support. Preferably you should remain 3rd level support to better enjoy sleep.
    * Conform to standards and practices if they reduce discussion and enhance clarity.
    * Use well established techniques. E.g. JPA and JAXB.
    * Allow well established component manufacturers to make your programming life easier. E.g. Apache Commons.
    * Be tidy.

    A few don'ts:
    * Avoid OO pattern overkill.
    * Don't take the quick and dirty option too quickly. Those decisions will haunt you eventually.
    * Avoid making everything public. Documenting and maintaining public interfaces is more expensive.
    * Try to avoid big bangs.
    * Avoid less well established component manufacturers. My next project did use components from less established component manufacturers and their sell by date has generously expired.
    * Don't allow babbling "architects" to make a mess of your system. But don't alienate them either.

    I may have forgotten a few things but this is all stuff I consider even for smaller projects.

    --

    I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
  17. Re:Don't put up with it. by __aaltlg1547 · · Score: 4, Insightful

    If that's the case, the stated or implied directive is don't break this. Which means probably no major rewrite.

  18. Re:Rewrite it by TheRaven64 · · Score: 2

    That depends on whether it actually needs to be that complex. Implementing a project that is 200KLoC is not the same as implementing something that is functionally equivalent to an existing codebase that has grown to 200KLoC. If the project is as described, I wouldn't be surprised if it could be implemented in under 20KLoC, possibly less if there are some existing libraries that can be used.

    --
    I am TheRaven on Soylent News
  19. Re:Don't put up with it. by ralphdaugherty · · Score: 2

    If that's the case, the stated or implied directive is don't break this. Which means probably no major rewrite.

    This should be +5 Insightful. It's the bottom line for production software.

  20. Re:Standard procedure by russotto · · Score: 2

    Per John Lakos, almost any bad interface can be wrapped in a good one. Slice off an appropriate slice of dodgy code, wrap it in a testable interface, write the test code to baseline its behaviour, and then when it makes business sense, you can refactor that slice of code. If it doesn't make business sense, you don't touch it.

    You've never seen real spaghetti code if you believe this. In a really nasty ball of spaghetti, there's nowhere to make the cut; any significant section of the codebase essentially depends on the entire codebase. Among other sins: code at low level will make decisions essentially based on which high-level caller ultimately called it.