Mapping/Understanding System Complexity?

Posted by ryuzaki0 on Thursday July 27, 2006 @11:50AM from the threads-and-tapestries dept.

thesandbender asks: "I've recently inherited a project to 'simplify' the application environment for a company that has 1600+ service offerings (many of these are a product 'foobar' with options such as 'Alpha', 'Bravo', 'Charlie', and so forth). I am trying to map out the applications' dependencies from a technological and a business standpoint. I would like to designate a group of applications as depending on concepts, technologies (like SAN, DB2 and AIX), specific customers (like 'Bravo' and 'Charlie') and legacy applications. Basically, I want to define any number of arbitrary dependencies and then be able to map them out in a graphical format. With those maps I can show the business-oriented staff how removing one application will affect other applications, and I can show the technically oriented staff how removing one system will affect other systems or applications. Has anyone in the Slashdot community run across such a tool? If you haven't, have you run across the need for such a tool? What would you want from it so that I can fashion a usable tool that addresses everyone's needs and not just my own?" "The most appropriate tool-sets I've found to date are 'mind mapping' or 'concept mapping' tools. All of the tools I've found so far only allow me to create any number of ideas or concepts and don't allow for arbitrary, searchable and/or mappable attributes (e.g. Application 'foo' maps to attributes 'SAN', 'Java', 'Solaris' and 'Buy-Side') that would allow me to create hard and soft groupings based on defined attributes (e.g. I could ask for a cloud of all objects that share a specific technical attribute, and another cloud of objects that share a specific business attribute)."
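
The "how removing one application will affect other applications" part of the question is a reverse-reachability query on a dependency graph. A minimal sketch, assuming the dependencies are available as (dependent, dependency) pairs; the names in `DEPENDS_ON` and the helper `impacted_by` are made up for illustration:

```python
from collections import defaultdict, deque

# Hypothetical edges: (dependent, dependency). Names are illustrative only.
DEPENDS_ON = [
    ("foo", "SAN"),
    ("foo", "Java"),
    ("bar", "foo"),
    ("baz", "bar"),
    ("qux", "DB2"),
]

def impacted_by(removed, edges):
    """Return everything that directly or indirectly depends on `removed`."""
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)
    seen, queue = set(), deque([removed])
    while queue:  # breadth-first walk "upwards" from the removed item
        node = queue.popleft()
        for d in dependents[node]:
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return seen

print(sorted(impacted_by("SAN", DEPENDS_ON)))
```

The same function works whether the removed item is an application, a technology or a customer, since all of them are just nodes.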

63 comments

Although not designed for that problem by jd · 2006-07-27 12:02 · Score: 3, Informative

I'd use a database modelling tool, like ERWin or Dezign for Databases. (Yes, they're both commercial - I've not seen a single good entity-relationship modelling tool that is also Open Source, although it's an obvious tool to write.)

These tools map attributes in records to other attributes in other records. They're designed to then turn these maps into SQL code, but that part isn't important here. What is important is that you can create a full relationship mapping between entities. If you then treat the direction of the relationship as showing the dependency, you can map all the dependencies in the system.
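
The idea of treating relationship direction as dependency can be tried without a modelling tool. A rough sketch, assuming an SQLite database (chosen only because it ships with Python) and a hypothetical two-table schema `entity` / `depends_on`; neither ERWin nor Dezign is involved here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (name TEXT PRIMARY KEY, kind TEXT);
    CREATE TABLE depends_on (
        dependent  TEXT REFERENCES entity(name),
        dependency TEXT REFERENCES entity(name)
    );
    INSERT INTO entity VALUES
        ('billing', 'application'), ('reports', 'application'),
        ('DB2', 'technology'), ('AIX', 'technology');
    INSERT INTO depends_on VALUES
        ('billing', 'DB2'), ('DB2', 'AIX'), ('reports', 'billing');
""")

def all_dependencies(conn, name):
    """Follow the relationship direction transitively, starting at `name`."""
    rows = conn.execute("""
        WITH RECURSIVE deps(n) AS (
            SELECT dependency FROM depends_on WHERE dependent = ?
            UNION
            SELECT d.dependency FROM depends_on d JOIN deps ON d.dependent = deps.n
        )
        SELECT n FROM deps ORDER BY n
    """, (name,)).fetchall()
    return [r[0] for r in rows]

print(all_dependencies(conn, "reports"))
```

Using `UNION` rather than `UNION ALL` drops duplicates, which also keeps the recursion from looping forever if the data contains a cycle.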

Managers like diagrams in a format that is familiar to them, so anything that is "better" from a technical standpoint but "less familiar" to managers is, in fact, not as good a solution.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
1. Re:Although not designed for that problem by smittyoneeach · 2006-07-27 12:31 · Score: 1
  
  graphviz might also be a way to go, and it's unchained.
  
  --
  Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
2. Re:Although not designed for that problem by John+Q.+Public · 2006-07-28 09:05 · Score: 1
  
  I've had good experiences with DBDesigner (GPL) for database modeling, though I wasn't doing anything extremely taxing and it may miss some blinkenlights the paid products have. It was more than adequate for what I needed...
  
  --
  UserAdvocate: The voice of the user
Re:degfbdsgf by donaggie03 · 2006-07-27 12:02 · Score: 3, Insightful

What is wrong with a guy doing a little research before implementing an assigned task? I assure you, when you converse with people about an issue, whether in person or online, and whether you actually know the PERSON or just their Slashdot handle, you are still conversing with colleagues. It sounds to me like you skipped the classes that stressed "teamwork" in your undergraduate curriculum.

--
Three days from now?? Thats tomorrow!! ~Peter Griffin
Graphviz by danpat · 2006-07-27 12:07 · Score: 5, Informative

You can probably draw the picture you want with GraphViz, found here http://www.graphviz.org/

To use it, you create a text file that defines all your dependencies. It'll look something like this:

digraph thingies {
  "app1" -> "SAN";
  "app1" -> "Java";
  "client1" -> "app1";
  // ...
}

You can then go on to group things together so they show up in meaningful locations on the diagram,
associate pictures with certain nodes, put labels on things, make things in colour, etc.

GraphViz takes care of the layout (where best to put nodes and edges). Sometimes it takes
a while to define everything in a format that gets drawn neatly, but the results are often impressive and
very useful.
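
With 1600+ items nobody wants to type the DOT file by hand, so it would probably be generated. A minimal sketch, assuming the inventory is already in memory as the hypothetical `KINDS` and `EDGES` structures below; subgraphs whose names start with `cluster` are drawn by GraphViz as boxed groups:

```python
# Hypothetical inventory: node -> kind, plus dependency edges.
KINDS = {"app1": "application", "client1": "customer",
         "SAN": "technology", "Java": "technology"}
EDGES = [("app1", "SAN"), ("app1", "Java"), ("client1", "app1")]

def to_dot(kinds, edges, name="thingies"):
    """Emit DOT text, putting nodes of the same kind in one cluster."""
    lines = [f"digraph {name} {{"]
    for kind in sorted(set(kinds.values())):
        lines.append(f'  subgraph "cluster_{kind}" {{')
        lines.append(f'    label="{kind}";')
        for node in sorted(n for n, k in kinds.items() if k == kind):
            lines.append(f'    "{node}";')
        lines.append("  }")
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(KINDS, EDGES))
```

Saving the output as, say, `deps.dot` and running `dot -Tsvg deps.dot -o deps.svg` should render it; the file names are arbitrary.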

On coming to a new job, I used it to draw all the dependencies between a collection of a couple
of hundred SQL stored procedures in our database. The locals were horrified to have what they all
knew in their gut depicted to them on 35 A3 sheets of paper on a wall :-) It was quite useful for
identifying things like recursive calls ("Oh, *that's* why that proc sometimes never ends...").
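
Recursive calls like that show up as cycles in the call graph, and they can also be found without staring at the wall. A small sketch using depth-first search, assuming the call graph has been extracted into a dict; `CALLS` and its procedure names are invented for the example:

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None. `graph` maps node -> callees."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:      # back edge: nxt is already on the path
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Hypothetical stored-procedure call graph.
CALLS = {"proc_a": ["proc_b"], "proc_b": ["proc_c"],
         "proc_c": ["proc_a"], "proc_d": []}
print(find_cycle(CALLS))
```

The recursive `visit` is fine for a few hundred procedures; a very deep graph would need an explicit stack to stay under Python's recursion limit.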
1. Re:Graphviz by Clover_Kicker · 2006-07-27 12:35 · Score: 1
  
  Nifty. Here's someone using graphviz for a similar situation - dependencies for apps in the FreeBSD ports system.
2. Re:Graphviz by joshetc · 2006-07-27 12:48 · Score: 1
  
  I would love to see one of those graphs with 1600 "things" on it.
  
  I'd imagine it would look something like this
3. Re:Graphviz by thesandbender · 2006-07-27 13:34 · Score: 1
  
  I'm the author and I appreciate you taking the time to point this application out.
  
  Thankfully I have access to 48x60 plotters and I know I'll be using them :)
4. Re:Graphviz by danpat · 2006-07-27 13:53 · Score: 1
  
  No worries. If you have access to a Mac, I'd also recommend http://www.pixelglow.com/graphviz/. The raster formats that
  can come out of GraphViz can be huge (e.g. 100,000x100,000 pixels) and quite difficult to deal
  with, but the GUI these guys have written handles even huge graphs pretty snappily.
5. Re:Graphviz by 4of12 · 2006-07-28 03:48 · Score: 1
  
  I've recently been using Graphviz with gprof and gprof2dot.awk to map out some really crufty old code. It's really illuminating.
  
  However, what I'd really like to be able to do is have it animated, showing nodes and edges in the graph as they become activated (slowed down from real time, of course and reducing super-repetitive actions logarithmically).
  
  A web application for this kind of graph layout and animation using SVG would be widely applicable, not just to visualizing call trees for computer programs, but also for understanding other phenomena that are visualized well with graphs, such as I/O flow between applications, industrial process equipment, organizational hierarchies, network traffic with hosts appearing and disappearing, etc.
  
  Unfortunately, everything I've seen so far is either cobbled together for a one-off spectacular presentation, specific to some particular application domain, or else locked into a much less general display mechanism (platform X graphics API).
  
  If anyone knows of anything close to a dynamic web-based graph visualization framework, I'd like to hear about it.
  
  --
  "Provided by the management for your protection."
6. Re:Graphviz by kwoff · 2006-07-28 06:36 · Score: 1
  
  I did this exact same thing with an application with 132 database tables, connecting them by foreign_key and so on. It didn't really give me a much better grasp on it than I already had, though. It's just a lot of information to digest (mine was only 20 pages - maybe I needed to make it bigger).
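
  For what it's worth, the table-to-table edges can be pulled from the database's own metadata. A sketch assuming SQLite (only because it ships with Python; other databases expose the same information through their own catalog views) and three made-up tables:

```python
import sqlite3

def foreign_key_edges(conn):
    """Return sorted (child_table, parent_table) pairs from SQLite metadata."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    edges = set()
    for table in tables:
        # PRAGMA foreign_key_list rows: id, seq, parent table, from, to, ...
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.add((table, row[2]))
    return sorted(edges)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customer(id));
    CREATE TABLE order_line (id INTEGER PRIMARY KEY,
                             order_id INTEGER REFERENCES orders(id));
""")
for child, parent in foreign_key_edges(conn):
    print(f'"{child}" -> "{parent}";')   # lines ready to paste into a digraph
```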
Relicore Clarity by gastr0pod · 2006-07-27 12:42 · Score: 1

I saw a demo of this product and it seemed neat. You install the daemons on your systems and it monitors all the socket & file opens. In this way it can map application dependencies on different machines or the same machine. I think it can scale to a few thousand machines. On the down side it's not free. When Symantec acquired Relicore the product was renamed to some bland, information free name like Configuration Manager. http://www.symantec.com/Products/enterprise?c=prod info&refId=1461
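
Whatever collects the observations, the mapping step is mostly aggregation. A rough sketch, assuming the agents report (host, process, remote host, remote port) tuples; that record layout and the host names are hypothetical and not Relicore's actual format:

```python
from collections import Counter

# Hypothetical observations an agent might report:
# (host, process, remote_host, remote_port). Not any product's real format.
OBSERVED = [
    ("web01", "httpd", "app01", 8080),
    ("web01", "httpd", "app01", 8080),
    ("app01", "java", "db01", 50000),
    ("app01", "java", "san01", 3260),
]

def dependency_edges(observations):
    """Collapse repeated connections into weighted host-to-host edges."""
    counts = Counter((host, remote) for host, _proc, remote, _port in observations)
    return dict(counts)

for (src, dst), n in sorted(dependency_edges(OBSERVED).items()):
    print(f'"{src}" -> "{dst}" [label="{n}"];')
```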
1. Re:Relicore Clarity by Miniluv · 2006-07-27 15:20 · Score: 1
  
  It seems like the core functionality wouldn't be that hard to reproduce. Obviously the management-friendly graphs, reports, and the config management portion are more work, but simply writing a tool that shows you what your box actually does sounds rather straightforward. It would also seem that the 90/10 rule comes into play, in that you'll get most of the benefit by simply getting the raw data and doing some basic analysis on it without needing to spend all that money.
  
  Thanks for pointing this out, now I can point my engineering team at it and see how fast they can whip up an equivalent.
Re:degfbdsgf by Anonymous Coward · 2006-07-27 12:46 · Score: 0

It's fine for him to ask people. But those people shouldn't be basically random individuals on the Internet, who he has never met, and whose skills and experience he knows nothing about. It's a dangerous thing to do, because the information he gets may be completely invalid, even if it sounds correct. When a business is at stake, such actions are unacceptable.
Why don't you just use a wiki? by aurelianito · 2006-07-27 12:48 · Score: 1

I recommend one that handles "backlinks" (i.e. which pages point to this page?).

Using a wiki, I would add one page per Application, describing it and linking to its neighbor applications. I would also add one page for each attribute (like: Java, or Windows, or something like that), one page for each software group (might be related to the one before) and one page for each type of category (for instance: platform, application, development group).

You can even use the ability of some wikis to draw graphs of the links between pages.
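
Backlinks are just an inverted index, which is also what the "cloud of objects sharing an attribute" query needs. A minimal sketch, assuming the link structure has been exported into a dict; the `LINKS` data is made up and no particular wiki's API is implied:

```python
from collections import defaultdict

# Hypothetical wiki pages: page title -> titles it links to.
LINKS = {
    "foo": ["SAN", "Java", "Solaris", "Buy-Side"],
    "bar": ["Java", "AIX"],
    "baz": ["SAN", "Buy-Side"],
}

def backlinks(links):
    """Invert page -> links into target -> set of pages that point to it."""
    index = defaultdict(set)
    for page, targets in links.items():
        for target in targets:
            index[target].add(page)
    return index

index = backlinks(LINKS)
print(sorted(index["SAN"]))                      # pages tagged with one attribute
print(sorted(index["SAN"] & index["Buy-Side"]))  # pages sharing both attributes
```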

Quoting Ward Cunningham: The simplest database that could possibly work
1. Re:Why don't you just use a wiki? by tweek · 2006-07-27 13:54 · Score: 1
  
  I second this. I'm in the process of switching to another company and one thing I never had time to do at the current place is document. It was always a low priority because we were always going.
  
  I installed a wiki for the helpdesk and decided to create another one for the SA team. The linking alone has helped us tremendously. I've also created a system profile template wiki entry that can be copied and pasted and used by operators to help us document the systems.
  
  Let me know if anyone wants a copy of the template. I like it but it could probably use additional input. It's currently geared towards linux servers.
  
  --
  "Fighting the underpants gnomes since 1998!" "Bruce Schneier knows the state of schroedinger's cat"
"Complexity kills" by Baldrson · 2006-07-27 12:48 · Score: 1, Interesting

This request brings to mind the now famous quote from Ray Ozzie regarding reorienting Microsoft to services: "Complexity kills".

There is a rigorous definition of complexity that is stated in terms of software: Kolmogorov complexity. The Kolmogorov complexity of a given output is the size, in bits, of the shortest program that produces precisely that output. This is a very useful way of viewing system complexity in the case of software. One way of viewing programming is as the compression of all the program's use cases into a program specification which, in theory, is executable. But there is more to it. As it happens -- unsurprising to many of us -- it is now a theorem of computer science that the closer the size of that "executable" gets to the Kolmogorov complexity of the use cases, the better.
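
Kolmogorov complexity itself is not computable, but any real compressor gives an upper bound on it (up to the fixed size of the decompressor). A small sketch using Python's standard `zlib` module; the sample data is invented and the result says nothing about the true shortest program:

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed form at the highest level."""
    return len(zlib.compress(data, 9))

# Highly repetitive input, so it should compress to a small fraction.
repetitive = b"foobar option Alpha\n" * 500
print(len(repetitive), compressed_size(repetitive))
```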

This theorem is a major breakthrough in CS and should be learned in every institute purporting to teach IT.

It's important enough that I've proposed a prize award to Ray Ozzie. What follows is my email to him. We'll see if it makes it through his gatekeepers and then gets his attention.

Hi Ray,

I've got a simple and powerful idea that I think, based on your statement "Complexity kills," you'll find interesting.

It is a prize competition that I'm tentatively calling "The C-Prize" that rewards the most succinct representation of a major knowledge base.

Unsurprisingly to some of us, Ockham's Razor is more than a mere rule of thumb -- it turns out to be the foundation of intelligence. Marcus Hutter of the University of Lugano, Switzerland, recently provided a mathematical proof of this link between simplicity and intelligence which has withstood peer review. Dr. Hutter believes the C-Prize to be "an excellent idea".

The criterion is easy enough to state:

Let anyone submit a program that produces, with no inputs, one of the major natural language corpora (such as a Wikipedia snapshot) as output.

S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).

Award monies:

Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))
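
As a worked example of the payout rule above, with purely made-up figures: a fund of $50,000, a previous record ratio of 4 and a new ratio of 5 give X = 1, so the winner receives one fifth of the fund, $10,000. The helper name `payout` is an assumption for illustration:

```python
from fractions import Fraction

def payout(fund, r0, r1):
    """Winner's share: Z * (X / (R0 + X)) with X = R1 - R0."""
    if r1 <= r0:
        return Fraction(0)        # no new record, no award
    x = Fraction(r1) - Fraction(r0)
    return Fraction(fund) * x / (Fraction(r0) + x)

print(float(payout(50000, 4, 5)))
```

Since R0 + X is just R1, each record pays out the fraction of the fund equal to the relative improvement over the new ratio, so the fund is never fully drained by a single win.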

The compression program and decompression program are made open source.

At present there is a small prize being administered by Leonid A. Broukhis based on the relatively tiny Calgary Corpus. Matthew Mahoney of the University of Miami has proposed a larger $50,000 compression prize to the National Science Foundation, but experience has shown it is like pulling teeth to get government agencies to fund prize awards -- they generally have to see private parties are doing so first. Marcus Hutter has put up a few small prizes based on some mathematics problems he needs solved for advancement of his theory of intelligence.

Any of these individuals could be a credible locus of control for the C-Prize -- or Microsoft itself could be the locus of control.

--
Seastead this.
1. Re:"Complexity kills" by Anonymous Coward · 2006-07-27 13:22 · Score: 0
  
  This is one of the more subtle trolls I've read today (sci.math.* and sci.logic notwithstanding), so mod parent funny. :-)
  
  Hint: It's completely off-topic, and it's actually a tired old idea that nobody seems to care about anymore, but on first read it seems both novel and on topic. Then you realize it's a troll and slap yourself on the forehead.
  
  Basically it's another form of "get rich quick if you can find an approximate solution to the question of whether or not there exists a smaller input that produces the desired output." It's fairly trivial to show that this is NP hard, but it does not have to fall within NP because it's also easy to conceive of situations where evaluation of the program requires more than exponential time and/or polynomial space. If you wanted to strictly limit the problem to NP, you could add a further restriction that attempts to optimize for both space and decompression time. However, I think we've already got enough "solve the NP complete problem and win big prizes" types of problems.... ;-)
2. Re:"Complexity kills" by Baldrson · 2006-07-27 13:48 · Score: 1
  
  Typically, the Anonymous Coward's contentless sarcasm betrays his shallow grasp of reality. The relevance is clear: When you design your service suite and do not minimize complexity, you aren't just asking for trouble, you are, by definition, producing a low quality suite. You can, in fact, produce a compression of a natural language knowledge base without even using a compression program and have that be an important human accomplishment. Epistemology is virtually defined by such advances. So the fact that the problem is computationally hard is neither here nor there to first order. The important thing is quality of knowledge.
  
  --
  Seastead this.
3. Re:"Complexity kills" by Anonymous Coward · 2006-07-27 14:09 · Score: 0
  
  Holy nonsensical self-righteous pseudotech drivel, Batman. We've got a live one!
  
  Rebuttal: I do not believe there is any value added in further compressing a given "knowledge base" without a constraint on the decompression time. What if someone found a 1234-byte "program" that allowed you to output the N-th bit of a 7.6 Terabyte file every factorial(N) clock cycles? What if you just want to read the last chapter? Knowledge is most useful when we have random access to it.
  
  p.s. Baldrson, you've just jumped ahead of JSH (the usenet troll famous for his "research" on factoring) on my list of favorite technology trolls. :-)
4. Re:"Complexity kills" by Baldrson · 2006-07-27 14:30 · Score: 1
  
  A time/space constraint on the C-Prize is a given due to the halting problem. So what?
  
  My statement about the quality of knowledge being defined by its approximation to the Kolmogorov complexity of the world it represents holds.
  
  A program specification is executable "in theory" (I did put it precisely that way for a reason). Programmers still have jobs to do and the closer they come to the Kolmogorov complexity of the specification within the computation constraints, the better quality their implementation.
  
  PS: A.C. snipers are typically suffering from low self-esteem and go on about "trolls" mainly as a method of projecting their own problems with real discussions. How about just coming out of the shadows and providing your identity to show you aren't afraid what you say is horseshit?
  
  --
  Seastead this.
5. Re:"Complexity kills" by sgt_doom · 2006-07-27 15:06 · Score: 1
  
  Outstanding! This is an Ultramax post! I had completely forgotten about old Andrey.
  
  I just have a problem with the architect of Lotus Notes proclaiming: "Complexity kills".
6. Re:"Complexity kills" by Anonymous Coward · 2006-07-27 15:08 · Score: 0
  
  1. The halting problem doesn't limit the search time. The "in a nutshell" reason it's unsolvable on Turing machines is that it asks a nondeterministic question with essentially unbounded time/space: "given this program, do ALL finite inputs lead to halting configurations after finite execution times?" or "given this program, is there at least one finite input that reaches a non-halting state that it previously encountered (including all the bits it can still access in the unbounded tape)?"
  
  2. The nondeterministic search for a higher compression ratio for a single problem has almost nothing to do with programming ability. After only a few iterations, finding a better compression would get as hard as factoring RSA numbers, so people would just throw it into a "wasting electricity @ home" type screensaver.
  
  3. I've been posting here for 6 years, and I probably have as many +5's as you. I don't have an account, and I don't want one.
7. Re:"Complexity kills" by QuantumFTL · 2006-07-27 15:58 · Score: 2, Informative
  
  The problem with the concept of complexity based on program length is that program length is highly dependent on the system of encoding used to represent the program. Short programs in functional languages can be quite long in imperative languages, and lambda calculus functions for even simple things are sometimes so long as to be impossible to represent in a published paper.
  
  I've yet to see any "cannnonical" representation that can be used for this purpose. "Kolmogorov complexity" is not useful for these things for that reason. A more interesting metric, often used in software engineering, is cyclomatic complexity.
8. Re:"Complexity kills" by Anonymous Coward · 2006-07-27 16:36 · Score: 0
  
  Short programs in functional languages can be quite long in imperative languages... I've yet to see any "cannnonical" representation that can be used for this purpose.
  The canonical (note: just two n's, not 3 or 4; the root word is canon) representation is simply a Turing machine that functions as an interpreter for the given programming language, plus the input program for that language, encoded in the alphabet of the given Turing machine. Thus the "real" size includes the size of the language interpreter and the size of any libraries used.
9. Re: "Complexity kills" by Black+Parrot · 2006-07-27 16:52 · Score: 1
  
  > As it happens -- unsurprising to many of us -- it is now a theorem of computer science that the closer the size of that "executable" gets to the Kolmogorov complexity of the use cases, the better. This theorem is a major breakthrough in CS and should be learned in every institute purporting to teach IT.
  
  Could you give us a source on that theorem, please?
  
  --
  Sheesh, evil *and* a jerk. -- Jade
10. Re: "Complexity kills" by Black+Parrot · 2006-07-27 18:16 · Score: 2, Insightful
  
  > Typically, the Anonymous Coward's contentless sarcasm betrays his shallow grasp of reality. The relevance is clear: When you design your service suite and do not minimize complexity, you aren't just asking for trouble, you are, by definition, producing a low quality suite. You can, in fact, produce a compression of a natural language knowledge base without even using a compression program and have that be an important human accomplishment. Epistemology is virtually defined by such advances. So the fact that the problem is computationally hard is neither here nor there to first order. The important thing is quality of knowledge.
  
  Ignoring the fact that your post was merely a bit of self-aggrandizement unrelated to the Ask Slashdot question, you're chasing a will-o-the-wisp. There is no universal compression algorithm.
  
  It should be immediately obvious that, when using the same symbol set for plaintext strings and their compressed form, any compression algorithm that makes some strings shorter must make some other strings longer.[*]
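
  The counting behind that claim can be checked directly: there are 2^n binary strings of length n but only 2^n - 1 strings that are shorter, so no lossless scheme can shorten all of them. A brute-force sketch for n = 8 (the function names are just for illustration):

```python
from itertools import product

def strings_of_length(n):
    """All binary strings of exactly n symbols."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def shorter_strings(n):
    """All binary strings with fewer than n symbols, including the empty one."""
    return [s for k in range(n) for s in strings_of_length(k)]

n = 8
# One more input than there are shorter outputs: at least one string cannot shrink.
print(len(strings_of_length(n)), len(shorter_strings(n)))
```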
  
  Thus the design goal of any useful compression algorithm is to bias it toward the expected properties of the input strings. The algorithm that compresses English text the best probably doesn't compress Latin text the best. The algorithm that compresses Slashdot best probably doesn't compress the New American Standard Bible best. The algorithm that compresses Slashdot stories on astronomy best probably doesn't compress Slashdot stories on biotech best. The algorithm that compresses your post best probably doesn't compress my post best.
  
  What do you expect to accomplish with a prize for best compression of some pre-specified corpus, other than finding out who can do the best job of tuning their algorithm to that corpus?
  
  You certainly won't learn anything about artificial intelligence. Nor will it help thesandbender with his IT question.
  
  [*] You can get ahead by using different symbol sets for the strings and their compressions, but if you are going to process them with a binary computer and/or store them on binary media, you're stuck with {0,1} under the hood, regardless of what superficial symbol sets you specify.
  
  --
  Sheesh, evil *and* a jerk. -- Jade
11. Re:"Complexity kills" by doc+modulo · 2006-07-27 23:05 · Score: 1
  
  I somehow feel there are situations where you are better off sacrificing Kolmogorov efficiency for a program structure that better interfaces with the human brain.
  
  An analogy: I read that every Rubik's Cube position is solvable in at most N steps (24 moves or something), but that humans have to use more elaborate chains of steps so they can get their heads around the solution.
  
  A smaller program is better just because there's less for the human brain to comprehend, but sometimes smallness creates its own complexity from the perspective of the human mind. There's simpler from the standpoint of a smaller structure, and simpler from the standpoint of easier model/structure information transfer from reality to the human mind.
  
  --
  - -- Truth addict for life.
12. Re:"Complexity kills" by Anonymous Coward · 2006-07-27 23:49 · Score: 0
  
  text = "";
  while (MD5(text) != WIKI_MD5)
  {
    text = get_random_string(WIKI_LEN);
  }
  
  I win.
13. Re:"Complexity kills" by Baldrson · 2006-07-28 02:43 · Score: 1
  
  It turns out that the choice of universal machine is not very important to the compression ratio. It is different by a constant -- which you can see by virtue of the fact that U1 can emulate universal machine U2 by a fixed sized program.
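
  For reference, the usual statement of this (generally called the invariance theorem), with the constant depending on the two machines but not on the string $x$:

```latex
K_{U_1}(x) \le K_{U_2}(x) + c_{U_1,U_2} \quad \text{for all } x
```

  Here $c_{U_1,U_2}$ is the length of a program for $U_1$ that emulates $U_2$. The bound is additive, so it matters less as $K(x)$ grows and says little about short strings.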
  
  --
  Seastead this.
14. Re: "Complexity kills" by Baldrson · 2006-07-28 02:49 · Score: 1
  
  First, the goal of the C-Prize isn't necessarily a compressor -- although that would be nice. The goal is an optimal compression of a relatively broad knowledge base. This optimal compression has its own value -- think about an ontology that optimally codes a broad range of knowledge and you can see the means of compression is really secondary to the value.
  
  Second, if you followed the links you'll see that I did discuss using the multi-lingual aspect of Wikipedia to discover language independent properties of the knowledge base -- although this is likely to happen even without using multi-lingual samples from Wikipedia.
  
  As for the relevance to web services -- I guess you consider it irrelevant that the new architect for Microsoft said "complexity kills" as part of his manifesto on reorienting Microsoft toward services.
  
  --
  Seastead this.
15. Re:"Complexity kills" by Baldrson · 2006-07-28 03:05 · Score: 1
  
  If the use case set (the thing being compressed into a spec) isn't human readable, what good is it? The spec can be expanded to render it readable by humans.
  Rubik's Cube isn't really a very good example of the kinds of problems people solve in the real world. It has too many "group symmetries". I like symmetry as much as the next guy -- believe me, more actually -- but be reasonable.
  But even in those special instances where one has a Rubik's Cube kind of problem, the proof that it is solvable in at most N steps is relevant only if the use to which the cube is being put requires such a small number of steps. If it does, then that is the use case and you can't get around it. If there is no constraint on the number of steps (unlikely), then the sole criterion is the simplest specification of a universal solution.
  
  --
  Seastead this.
16. Re:"Complexity kills" by Anonymous Coward · 2006-07-28 03:41 · Score: 0
  
  It turns out that the choice of universal machine is not very important to the compression ratio. It is different by a constant -- which you can see by virtue of the fact that U1 can emulate universal machine U2 by a fixed sized program.
  
  Outside of academia, constants matter. If you want to bridge the gap between toy theories and the real world, you need to account for constants. You need to deal with reality.
  
  I don't know how many academics I've seen crash and burn on this simple mistake. It doesn't matter what you can or can't do on a theoretical machine that no one can ever build. It doesn't matter what the "complexity" of a theoretical program no one can write or understand is.
  
  You've got stronger lower bounds; they're called "the limitations of existing hardware and human psychology". Deal with those, instead of pie in the sky, and perhaps your algorithms and ideas will actually be useful in the real world.
  
  End of rant.
17. Re:"Complexity kills" by Baldrson · 2006-07-28 07:13 · Score: 1
  
  Real world compression of large text (gigabytes) files will come down to hundreds of megabytes. Virtually all real world emulators are megabytes or less, not hundreds of megabytes nor even tens of megabytes. The ratio isn't changed much in the real world.
  
  --
  Seastead this.
18. Re:"Complexity kills" by Jamie+Lokier · 2006-07-31 07:41 · Score: 1
  
  For small programs, yes.
  
  But for large programs, the size of the encoding is not significant, and decreases to irrelevant with increasing program size.
  
  Put simply, if you have a really efficient encoding ("A") of some program, you can use it to represent the program in any other encoding that's required ("B") by using your efficient encoding as data, input to an interpreter for that encoding (you might also view the interpreter as a decompressor).
  
  You mention lambda calculus as something that's way too long a representation.
  
  Consider a variant of lambda calculus that includes basic string processing, e.g. a simple Lisp.
  
  Then you can write the major part of a large program as compressed text (or Perl!) if you think that's more compact, simply by doing ((lambda (string-to-decompress) (decompression-algorithm-here ...)) "agewkjhewljhwlkhjlkjhlkjdh....").
  
  Such a program may be ugly and impractical for a person to read, but it takes about the same space whatever language you're asked to write it in, provided the language has basic string processing.
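
  The same trick sketched in Python rather than Lisp, using the standard `zlib` and `base64` modules: a fixed-size stub plus a compressed payload. The helper name `self_extracting` and the sample program are assumptions for illustration only:

```python
import base64
import zlib

def self_extracting(source: str) -> str:
    """Wrap `source` in a small Python stub that decompresses and runs it."""
    payload = base64.b64encode(
        zlib.compress(source.encode("utf-8"), 9)).decode("ascii")
    # The stub text is constant; only the payload grows with the program.
    return (
        "import base64, zlib\n"
        f"exec(zlib.decompress(base64.b64decode('{payload}')).decode('utf-8'))\n"
    )

original = "result = sum(range(10))\n" * 200   # deliberately redundant program
stub = self_extracting(original)
print(len(original), len(stub))
```

  The stub's overhead stays the same however large the wrapped program is, which is the sense in which the choice of language stops mattering for large programs.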
  
  Not all languages do, of course, and that's where your mention of canonical representation comes in. What you really need, with Kolmogorov complexity, is an appropriate measure of the number of bits in each symbol of the encoding. If the encoding is a language with arbitrary strings allowed as part of the program, then the bits per character will depend on the character range you can put in a string (e.g. ASCII (nearly 7 bits per character), or all 8-bit bytes, or Chinese (more than 8 bits per character) etc). If the encoding is limited to just parentheses and symbols, or another restricted syntax, like pure lambda calculus or SK supercombinators, then the number of bits per character or symbol is considerably less. Thus you can still compress and encode programs compactly in pure lambda calculus, to about the same size as languages with strings (for large programs), it's just that "size" doesn't mean the same number of printed characters, it means that after adjusting for the different number of bits per character/symbol/whatever unit you like.
  
  You mentioned cyclomatic complexity. This is almost the opposite of Kolmogorov complexity in that cyclomatic complexity disfavours good compression schemes and nested, obscure structures. It is quite a different meaning of the word "complexity". Reducing cyclomatic complexity roughly translates as "making the program easier to understand and work with". Reducing Kolmogorov complexity roughly translates as "finding a better way to compress the program".
  
  Kolmogorov complexity is really a measure of the information contained in the program, while cyclomatic complexity is roughly a measure of the ability to understand parts of the program well enough for people to work with them, without mistakes. They are quite different measures, and used for quite different purposes.
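
  To make the cyclomatic side concrete, here is a simplified estimate for Python source using the standard `ast` module: decision points plus one. It is a sketch, not a full McCabe implementation (comprehensions, `match` statements and the like are ignored), and the `SAMPLE` function is invented:

```python
import ast

# Node types counted as decision points in this simplified sketch.
DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: decision points plus one."""
    count = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISIONS):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra branches.
            count += len(node.values) - 1
    return count

SAMPLE = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
"""
print(cyclomatic_complexity(SAMPLE))
```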
Simple by jlarocco · 2006-07-27 13:10 · Score: 4, Funny

When I get an assignment like this, I try to take a proactive stance. First, I add the project to my action item list. Then I formulate a list of stakeholders in the project. Then I call a meeting to open a dialog between the stakeholders and myself. After drilling down and making sure we're all on the same page, I draft a scope document. When I'm satisfied with the scope document, I hold a sidebar meeting to touch base with the shareholders and verify the document meets their requirements.

Usually by that time the project gets assigned to someone else.

--
Maybe not
Freemind by Centurix · 2006-07-27 13:33 · Score: 1

On the topic of mind mapping software, I always take a copy of Freemind with me wherever I go. It's simple, quick and open source.

--
Task Mangler
1. Re:Freemind by 3dr · 2006-07-28 02:48 · Score: 1
  
  Thanks for posting the link to Freemind. I hadn't heard of "mind mapping" per se but this particular technique may help with some current projects.
Visio by p!ssa · 2006-07-27 13:38 · Score: 1

Visio has always met my needs; you can "wow" the business types with the right stencil sets. Researching and identifying the categories and dependencies will be the hard part regardless of the tool used. If you have additional people (preferably experts with the various systems), I would define your key categories, systems, packages etc. and assign the work of detailing the breakout tasks to each subject matter expert. Once you have this set of tasks completed, bring everyone together to map the dependencies and watch all of the dumbfounded shock and amazement as people realize all of the redundant work etc. that has been performed over the years. This should help to get buy-in from your peers, as well as group motivation to keep it updated going forward.
1. Re:Visio by guruevi · 2006-07-27 14:36 · Score: 1
  
  I'm sorry, but for complex things, Visio is just what you don't need. Sure it's nice and fancy on the output, but when you're trying to define a whole process it gets clogged and difficult on the input or later to change something.
  
  I think the user is looking more for a simple programming language or layout specification (like you do in TeX) in which he can write his stuff easily and orderly. Then the output gets automagically and dynamically generated and all he needs to do is make it look nice (for managers, everything has to be very simple and there has to be lots of whitespace) and print it out. I am looking for something like that too, just didn't find it yet.
  
  --
  Custom electronics and digital signage for your business: www.evcircuits.com
bollocks by weierstrass · 2006-07-27 14:06 · Score: 1

"It's fairly trivial to show that this is NP hard"

this is utter crap. you know little to nothing about what you're talking about.

the troll, and i agree it is a troll, asks for a compressed form of a given large natural language text. nothing is said about compressing or analysing for compression any given input. this is not NP-hard, and indeed has nothing to do with NP. there is no input, no input size and no reference at all made to computation cost.

any CS student with some programming knowledge and a textbook on compression could compress a wikipedia snapshot by ~50%. winzip probably achieves ~75%. the 'proposed competition' merely offers a reward for a best-yet compression ratio. this needn't require any technical leaps, just slightly better (more tailormade for the job?) compression algos.

the completely different problem you describe in your reply is a certainly difficult problem, but i would like to see your 'trivial' proof that it is NP-hard. it's probably more accurate to say that it is unsolvable, since perfect compression can never be attained.

in future, don't counter drivel with nonsense.

--
my password really is 'stinkypants'
1. Re:bollocks by Anonymous Coward · 2006-07-27 14:10 · Score: 0
  
  mod parent up.
  
  -AC
Cytoscape by Jello7 · 2006-07-27 14:35 · Score: 1

A better tool than GraphViz for this application is Cytoscape - http://cytoscape.org/ It was originally designed for biological applications, but is a general network visualizer with a very simple input format e.g.

ModelA-Charlie dependsOn ModelQ
ModelQ requires Java
etc.

You can visualize, interact with and edit the resulting network and even do some advanced analyses, such as network clustering which would tell you which families of projects you have.

It is written in Java and is LGPL.
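
The three-column lines above are also easy to work with before they ever reach a visualizer. A rough Python sketch, assuming whitespace-separated `source relation target` lines like the example; the helper names `parse_edges` and `impacted_by` are made up for illustration and are not part of Cytoscape:

```python
from collections import defaultdict, deque

def parse_edges(text):
    """Parse 'source relation target' lines into (source, relation, target) tuples."""
    edges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            edges.append(tuple(parts))
    return edges

def impacted_by(edges, removed):
    """Return everything that directly or indirectly depends on `removed`."""
    dependents = defaultdict(set)  # target -> set of things pointing at it
    for source, _relation, target in edges:
        dependents[target].add(source)
    seen, queue = {removed}, deque([removed])
    while queue:  # breadth-first walk up the dependency chain
        node = queue.popleft()
        for dep in dependents[node]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen - {removed}

sample = """
ModelA-Charlie dependsOn ModelQ
ModelQ requires Java
"""
edges = parse_edges(sample)
print(sorted(impacted_by(edges, "Java")))  # ['ModelA-Charlie', 'ModelQ']
```

This treats every relation type the same way ("the left side needs the right side"), which is an assumption; a real map would probably want to filter by relation.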
Re: NP hard by Anonymous Coward · 2006-07-27 14:48 · Score: 0

GP AC here:

The limit of the GGP's proposed competition reduces to the question of whether or not there exists a shorter program that produces the desired output (if one exists, it's still possible to win some prize money). The desired output (knowledge base) is one of the inputs to the problem of finding a shorter program capable of winning any prize money.

If we simplify the problem so that it's limited to sigma-2, then the reduction from subcircuit isomorphism (the problem of determining if there exists a subcircuit that produces the same output) is really too obvious to bother explaining. Subcircuit isomorphism is sigma-2 complete (harder than NP complete -- sigma2 is "Ex Ay f(x,y) == true" vs NP is "Ex f(x) == true" ); therefore, this restriction of the problem is trivially NP hard. How about them apples?
How About System Tools Like Causal Loop Modeling? by Marble+Titan · 2006-07-27 15:18 · Score: 1

If it were me, I would redefine the goal slightly. It is not just the dependencies. It is the behaviour of the overall system, the collection of application components, that you need to be able to explain. In fact, if you want to know which components you can remove or tweak, you actually need to understand the business processes they are supposed to be supporting. Theoretically, the raw business processes of each company will represent less complexity than the system of application components you need to describe. These business processes would provide "overlays" that could help you to see the extraneous components of your "application system".
Causal Loop Modeling (CLM) might be the right tool, specifically because you want to establish how a change in one part of the "application system" affects everything else. CLM is a successor to Stock and Flow modeling, but you can use it by itself to explain system behaviour. It works especially well when there are feedback loops present that create complex behaviours such as goal seeking, cyclical or seasonal variation, and sudden growth or decline. However, you could simplify the CLM approach if you leave out the notations that define positive and negative forces.
Essentially, you create a chart that connects every component to its upstream and downstream components. What you are left with is a diagram showing each component with many inflows and outflows. To use it, first note the components that have the most inflows. These are the "driven" components, the ones most affected by change. The components with the most outflows have the most influence on the others. Changes to these components have a significant impact on the overall system. Armed with this information you can really start to establish where you can trim and where you cannot. You can also start to identify where small changes can have a big influence. Like a vaccination: small change, big effect.
In effect, you find the points of leverage in the system. That is where it starts to get really interesting. Hope this helps to get you started, or at least to provide a different perspective. If this interests you, post questions and I will try to answer them.
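
The inflow/outflow counting described above could be sketched roughly like this in Python. It assumes the chart has already been reduced to a list of (upstream, downstream) pairs; the component names and the `flow_counts` helper are hypothetical, purely for illustration:

```python
from collections import Counter

def flow_counts(links):
    """Count inflows and outflows per component from (upstream, downstream) pairs."""
    inflows, outflows = Counter(), Counter()
    for upstream, downstream in links:
        outflows[upstream] += 1    # upstream component influences one more thing
        inflows[downstream] += 1   # downstream component is driven by one more thing
    return inflows, outflows

# Hypothetical components, not taken from any real system.
links = [
    ("billing", "reporting"),
    ("orders", "billing"),
    ("orders", "reporting"),
    ("orders", "shipping"),
]
inflows, outflows = flow_counts(links)
most_driven = max(inflows, key=inflows.get)         # most affected by change
most_influential = max(outflows, key=outflows.get)  # candidate point of leverage
print(most_driven, most_influential)  # reporting orders
```

Raw counts ignore the positive/negative notation and any feedback loops, so this only covers the simplified version of the approach.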
you fucking idiot by Anonymous Coward · 2006-07-27 15:36 · Score: 0

there is only one input to the problem. the 'desired output' stays the same. it is effectively not an input. you do not have to solve it for every text, just for the one that has been determined in advance.

it is not 'NP-hard' to solve the travelling salesman problem for one given weighted graph. it can be done deterministically in constant time.

you did not read and understand either the GP or the OP.

you are a fucking idiot.

worse, your reply shows that you are an educated fucking idiot.
Re: reading and understanding by Anonymous Coward · 2006-07-27 15:56 · Score: 0

parent wrote:
there is only one input to the problem. the 'desired output' stays the same. it is effectively not an input. you do not have to solve it for every text, just for the one that has been determined in advance.
Bzzt. Thanks for playing. The problem to be solved is a feedback loop that uses the desired output and the current best compression. The decision problem is: exists_better_compression? (resource_limit, desired_output, best_known_compression). If there is a better compression that fits in the resource limit, then you can solve it directly by iteratively guessing the bits of the partial input exists_better_compression_from_partial_input? (resource_limit, desired_output, best_known_compression, partial_input). (*Hint: resource_limit can be a function of the other inputs.) The optimization problem (obtaining the most money from the "C-prize") iterates this process until there is no better compression.
Re:degfbdsgf by QuantumFTL · 2006-07-27 16:10 · Score: 1

What is wrong with a guy doing a little research before implementing an assigned task?

Agreed - slashdot may not be a good (or sane) place for legal advice, but if there's FOSS out there to do a task, someone on /. knows about it (or wrote it).
String Theory Does it for me! by Anonymous Coward · 2006-07-27 18:12 · Score: 0

Typical statement by string theorist:
"Take type IIB string theory on AdS5 x S5 or its pp-wave limit. Both of them have the maximal number of 32 supercharges. Is there some interesting generalization of these two geometries?

The answer is: yes, there is..."

Tom Jones on string theory:
If you wanna see me do my thing,
Baby, pull my string!
The Method by Anonymous+MadCoe · 2006-07-27 19:01 · Score: 1

Don't focus on tools too much; one option is to get hold of some IBM documentation on "The Method". This gives you a great framework to model complex systems in such a way that it is actually useful.

beware though, it's a method, not a bible, use from it what works for you.

Can't help you with links; find a buddy who works there.
Graph Visualization & Layout problem by bunions · 2006-07-27 19:23 · Score: 1

This is a fairly obvious application of graph visualization tools. Unfortunately, since the topic is so difficult, all the good solutions tend to be commercial. GraphViz is nice up to a point, but is pretty crummy when you get into thousands of nodes and edges. JGraph is another open source tool, and it's reasonably good, but as with other free software, it starts to choke on larger graphs.

Some of the better commercial packages are from Tom Sawyer, AiSee, Ilog and yWorks.

--
there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.
DI Diver by Anonymous Coward · 2006-07-27 21:49 · Score: 0

Once you've got your data, I'd try looking at DI Diver. It's particularly well suited to doing dynamic "what if" queries into large datasets. http://www.dimins.com/
Tideway Foundation by Anonymous Coward · 2006-07-27 22:20 · Score: 0

I work for Tideway Systems. Our Tideway Foundation product is probably exactly what you need.

It lets you define patterns that it uses to automatically map instances of your applications, and uses automatic discovery techniques to keep the model up to date. It can then tell you which servers, clusters, switches, operating systems, other applications, and other technologies and resources each of your applications depends on. It can also tell you low-level information about your environment, like how much RAM each of your servers has.

Our data store is extremely flexible, so you can add your own knowledge to the model. We're currently using Tom Sawyer's tool for visualization.
ITIL : CMDB by Dem_Gnomes · 2006-07-28 00:50 · Score: 1

The problem you are trying to solve is one of the areas addressed by the ITIL. In this particular case, mapping dependencies, you might want to look at a CMDB tool. There are a bunch of vendors in the space, and I'm guessing that you don't want to draw this diagram and forget it, but would actually like to have the information stay around for a while and be up to date. Your applications, hardware, customers, technologies, etc. are business entities. The relationships between them are what you are currently trying to map out. In general, if you can't get the tool to draw the diagrams for you, you can certainly get it to export the data (via a report option) in a format that you can feed to the graphics utility of your choice. Good Luck
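
As a rough illustration of the "export, then feed to a graphics utility" step, here is a Python sketch that turns exported relationships into Graphviz DOT text. The three-column CSV layout (source, relation, target) and the `csv_to_dot` helper are assumptions made for the example, not the export format of any particular CMDB product:

```python
import csv
import io

def csv_to_dot(csv_text):
    """Turn 'source,relation,target' rows into a Graphviz DOT digraph string."""
    lines = ["digraph dependencies {"]
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:  # skip blank or malformed rows
            continue
        source, relation, target = (field.strip() for field in row)
        # Assumes names contain no double quotes; escape them first if they might.
        lines.append(f'  "{source}" -> "{target}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)

# Made-up export rows, reusing names from the original question.
export = """foo,runs on,AIX
foo,stores data in,DB2
DB2,uses,SAN
"""
dot = csv_to_dot(export)
print(dot)
```

Saving the output to a file and running it through `dot -Tpng` should then produce the picture.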
1. Re:ITIL : CMDB by pagati · 2006-07-30 21:40 · Score: 1
  
  Correct. If you have some spare time, you could also have a look at the software we have built to deal with ITIL in our organization (Comune di Udine - City Council of Udine): http://www.cmdbuild.org/ (sorry, the site is only in Italian). The software is open source (GPL), web based, written in Java, and uses PostgreSQL as its database.
Mapping Complexity by Don+Philip · 2006-07-28 01:00 · Score: 1

What you're doing is a form of network analysis, and there are programs that will assist with this, but there is also a learning curve. First, while there are other network analysis programs, I am familiar with social network analysis tools, so I will recommend two here. There are probably other slashdotters who can recommend other programs. I've also given a reference to a book that will get you going, but any math text on graph theory will do just as well.
Programs:
Agna (Benta, I. (2002, 2003). Agna Project. Retrieved April 16, 2004, from http://www.geocities.com/imbenta/agna/index.htm). This is a free product and is excellent.
NetMiner. This is available from Cyram software (Korea). It costs, but is well worth it.
Reference:
Degenne, A., & Forsé, M. (1999). Introducing Social Networks (A. Borges, Trans.). London: SAGE Publications. This is about social networks, but the basic method of analysis is similar for all networks.
Been there, done that, don't wanna go back, but: by JetScootr · 2006-07-28 01:06 · Score: 1

I suggest two ways: A high level "bizniz map" and a detailed "data trail".
My company is a polyglot conglomerated transmogrification of several gov't contracts, divisions of previous contractors whose contracts were consolidated, and organizational divisions on both the gov't and corp sides. We had the same problem as budget cuts forced us to compress.
The data trail identifies sets of data from their origin points thru every app and person that touches the data, making special note of decision points. A decision point is where a single person (not group, not "CM" or "accounting") decides to change the direction of the data flow, or makes a go/nogo decision (cancels a software request, decides to not merge data sets x and y this month, etc).
The bizniz map will map the needs from the business viewpoint: organizations-> "bizniz needs" concepts -> applications -> data.
By connecting apps by the bizniz need they met, we were able to categorize:
> Which needs had more than one app? (nearly all)
> Which apps were good enuf (scalable/maintainable) to take on the entire enterprise? (Consolidate other app's data and needs into the good ones)
> Which apps were only useful in the data they contained? (export the data to the good apps, and delete the weak app)
> Which apps overlapped in requirements? Some apps we split down the middle - moved some of the requirements (and data) to the upscale apps and "stub out" the no-longer-needed capability of lesser app.
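
The first question in that list ("which needs had more than one app?") is easy to automate once the bizniz map exists as data. A minimal Python sketch, assuming the map has been flattened to (app, need) pairs; the app names and both helper functions are invented for illustration:

```python
from collections import defaultdict

def apps_by_need(pairs):
    """Group application names under the business need each one serves."""
    grouped = defaultdict(set)
    for app, need in pairs:
        grouped[need].add(app)
    return grouped

def overlapping_needs(pairs):
    """Return needs served by more than one app, with the apps sorted by name."""
    return {
        need: sorted(apps)
        for need, apps in apps_by_need(pairs).items()
        if len(apps) > 1
    }

# Invented inventory, not real application names.
inventory = [
    ("TrackIt", "change requests"),
    ("ReqSheet.xls", "change requests"),
    ("PayrollMF", "payroll"),
]
print(overlapping_needs(inventory))
# {'change requests': ['ReqSheet.xls', 'TrackIt']}
```

Deciding which of the overlapping apps is "good enuf" to keep is still a judgment call that no script will make for you.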

One thing that kept coming up - a surprising number of functional organizations had spreadsheets, sets of .Doc templates, etc, to meet their requirements, and these created an astounding amount of work to move up to the corporate-chosen apps. The old data had to be kept, of course, and it wasn't in any kind of good shape. Sometimes, data format and content changed from one month to the next, no versioning, no docs, no controls, no nothin' but the MS app - .MDE, .XLS, .PPT even.
However you organize the work, don't forget "the little guys" like this - The people that use and maintain these micro-apps have these things in common:
> The micro-app was likely created out of frustration and desperation to get a handle on a "minor" bizniz need that corp. IT didn't prioritize.
> Even the possibility of being overlooked is likely to be a sore spot - and will get a highly vocal (politically negative) outcry.
> The micro-apps contain data that is crucial to the functioning of a relatively small part of the organization.
> Created by one person (at most two), who is now very well-liked and respected by the organization's boss - who also understands the importance of the micro-app to his function.

Overlooking the micro-apps would be a major mistake as you map all this out.

One particular process I analyzed and tracked went from Excel to .csv file to mainframe DBS to print file, printed and faxed to someone who then entered the data in another Excel spreadsheet in order to organize it for export to a .csv so that it could be read by a realtime engineering simulator.
The two Excel spreadsheets were on the same LAN. At least nine functional points (people) were involved in this. But in years past, before Excel and LANs existed, this all made sense. We shortened the process path to ONE point - and eliminated more than a month from the schedule for processing the data. This is the kind of thing that gets functional managers (supervisors and their bosses) to support you. Just show them the inefficiencies in the existing micro-app structure, and how it can be improved.

--
Pavlov wouldn't be so famous if he'd used a can opener instead of a bell.
APM by procrastitron · 2006-07-28 04:41 · Score: 1

Disclaimer: This is exactly what my company does, so I am by no means impartial...

This is called application portfolio management (APM). As other people have said in the thread, it basically involves mapping the applications and their dependencies onto a graph. Then you can use the graph to build reports, perform searches, visualize the structure, etc. The company I work for (http://www.metallect.com/) does exactly this, and the website has much more info about it.
Mod parent up... by mlibby · 2006-07-28 07:34 · Score: 1

Where are the moderators on this discussion? Two +5s and that's it?
Mindmapping and concept mapping by Argey · 2006-07-28 14:22 · Score: 1

I've used info mapping techniques for a long time when I want to get my head around complexity - or even start thinking about a new project for the first time. There are dozens of software implementations. Here are some I know of:
3D Topicscape; Aviz Thought Mapper; Axon Idea Processor; BrainMine; Claro Concepts; Concept X7; ConceptDraw Mindmap; Conception & InterModeller; Freemind (SourceForge); Headcase; Hypersoft-net; I-Navigation; InfoPro (ZPAY); InfoRapid KnowledgeMap 2005e; Inspiration Software Inc.; jKSImapper; LifeMap; Map it!; Mind Manager; Mind Mapper; Mind Mapper; Mind Mapping Tool; MindChart; MindFull; ModellingSpace; MyMind; Novamind; Openmind 2; PersonalBrain; SoftNeuron Mindmapping SW; VisiMap Professional 4.0 (once called InfoMap Lite)
I have myself worked on a problem very much as described by thesandbender, and did use ERwin as one poster suggested, but much later in the project. I started by mapping out how all the apps were connected, collecting info about the interfaces between them and who was responsible for each, to have an overall picture first. Then I worked up a physical data model for each and normalized them. That gave me a load of logical models. From that info, others went on to build an enterprise model.
dunno if that helps...