Mapping/Understanding System Complexity?
thesandbender asks: "I've recently inherited a project to 'simplify' the application environment for a company that has 1600+ service offerings (many of these are product 'foobar' that has options (like 'Alpha', 'Bravo', 'Charlie', and so forth) available). I am trying to map out the applications' dependencies from a technological and a business standpoint. I would like to designate a group of applications as depending on concepts, technologies (like SAN, DB2 and AIX), specific customers (like 'Bravo' and 'Charlie') and legacy applications. Basically, I want to define any number of arbitrary dependencies and then be able to map them out in a graphical format. With those maps I can show the business oriented staff how removing one application will affect other applications, and I can show the technically oriented staff how removing one system will affect other systems or applications. Has anyone in the Slashdot community run across such a tool? If you haven't, have you run across the need for such a tool? What would you want from it so that I can fashion a usable tool that addresses everyone's needs and not just my own?"
"The most appropriate tool-sets I've found to date are 'mind mapping' or 'concept mapping' tools. All of the tools I've found so far only allow me to create any number of ideas or concepts and don't allow for arbitrary, searchable and/or mappable attributes (e.g. Application 'foo' maps to attributes 'SAN', 'Java', 'Solaris' and 'Buy-Side') that would allow me to create hard and soft groupings that were based on defined attributes (e.g. I could ask for a cloud of all objects that share a specific technical attribute, and another cloud of objects that share a specific business attribute)."
These tools map attributes in records to other attributes in other records. They're designed to then turn these maps into SQL code, but that part isn't important here. What is important is that you can create a full relationship mapping between entities. If you then treat the direction of the relationship as showing the dependency, you can map all the dependencies in the system.
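Treating relationship direction as dependency makes the "what breaks if I remove X?" question a simple graph walk. What follows is only a minimal sketch: the edge list and the function name affected_by are made up for illustration and do not come from any of the tools discussed here.

```python
from collections import defaultdict, deque

# Hypothetical edges, read as "left depends on right".
DEPENDS_ON = [
    ("client1", "app1"),
    ("app1", "SAN"),
    ("app1", "Java"),
    ("app2", "SAN"),
]

def affected_by(edges, removed):
    """Return everything that directly or transitively depends on `removed`."""
    dependents = defaultdict(set)
    for src, dst in edges:
        dependents[dst].add(src)  # reverse the arrow: who depends on dst?
    seen, queue = set(), deque([removed])
    while queue:
        node = queue.popleft()
        for dep in dependents[node]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    seen.discard(removed)  # only relevant if the graph contains a cycle
    return seen

print(sorted(affected_by(DEPENDS_ON, "SAN")))
```

The same walk works whether the removed node is a technology, a customer or an application, since all of them are just nodes in the graph.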
Managers like diagrams to be in a format that is familiar to them, so anything that is "better" from a technical standpoint but "less familiar" to managers from an experience standpoint is, in fact, not as good a solution.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
What is wrong with a guy doing a little research before implementing an assigned task? I assure you, when you converse with colleagues about an issue, whether in person or online, and whether you actually know the PERSON or just their Slashdot handle, you are still conversing with colleagues. It sounds to me like you skipped the classes that stressed "teamwork" in your undergraduate curriculum.
Three days from now?? Thats tomorrow!! ~Peter Griffin
You can probably draw the picture you want with GraphViz, found here http://www.graphviz.org/
To use it, you create a text file that defines all your dependencies. It'll look something like this:
digraph thingies {
    "app1" -> "SAN";
    "app1" -> "Java";
    "client1" -> "app1";
}
You can then go on to group things together so they show up in meaningful locations on the diagram,
associate pictures with certain nodes, put labels on things, make things in colour, etc.
GraphViz takes care of the layout (where best to put nodes and edges). Sometimes it takes
a while to define everything in a format that gets drawn neatly, but the results are often impressive and
very useful.
On coming to a new job, I used it to draw all the dependencies between a collection of a couple
of hundred SQL stored procedures in our database. The locals were horrified to have what they all
knew in their gut depicted to them on 35 A3 sheets of paper on a wall... :-) It was quite useful for
identifying things like recursive calls ("Oh, *that's* why that proc sometimes never ends...").
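With 1600+ offerings nobody will want to type the digraph by hand, so here is a rough sketch of generating it from data. The group names, node names and the helper to_dot are all hypothetical; the only GraphViz features assumed are quoted node names and the convention that a subgraph whose name starts with "cluster" is drawn as a box by dot.

```python
def quote(name):
    """Wrap a name in double quotes for DOT, escaping embedded quotes."""
    return '"' + name.replace('"', '\\"') + '"'

def to_dot(graph_name, groups, edges):
    """Render a digraph with one boxed cluster per group."""
    lines = [f"digraph {graph_name} {{"]
    for i, (label, members) in enumerate(sorted(groups.items())):
        lines.append(f"  subgraph cluster_{i} {{")
        lines.append(f"    label={quote(label)};")
        for member in sorted(members):
            lines.append(f"    {quote(member)};")
        lines.append("  }")
    for src, dst in edges:
        lines.append(f"  {quote(src)} -> {quote(dst)};")
    lines.append("}")
    return "\n".join(lines)

# Made-up example data.
GROUPS = {"Technology": {"SAN", "Java"}, "Customers": {"client1"}}
EDGES = [("app1", "SAN"), ("app1", "Java"), ("client1", "app1")]

DOT_TEXT = to_dot("thingies", GROUPS, EDGES)
print(DOT_TEXT)
```

Assuming GraphViz is installed, saving the output as deps.dot and running "dot -Tpng deps.dot -o deps.png" should produce the picture.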
I saw a demo of this product and it seemed neat. You install the daemons on your systems and they monitor all the socket and file opens. In this way it can map application dependencies across different machines or on the same machine. I think it can scale to a few thousand machines. On the downside, it's not free. When Symantec acquired Relicore, the product was renamed to some bland, information-free name like Configuration Manager. http://www.symantec.com/Products/enterprise?c=prod info&refId=1461
It's fine for him to ask people. But those people shouldn't be basically random individuals on the Internet, who he has never met, and whose skills and experience he knows nothing about. It's a dangerous thing to do, because the information he gets may be completely invalid, even if it sounds correct. When a business is at stake, such actions are unacceptable.
Using a wiki, I would add one page per application, describing it and linking to its neighbor applications. I would also add one page for each attribute (like Java, or Windows, or something like that), one page for each software group (which might be related to the previous one) and one page for each type of category (for instance: platform, application, development group).
I recommend one that handles "backlinks" (i.e., which pages point to this page?).
You can even use the ability of some wikis to draw graphs between pages.
Quoting Ward Cunningham: The simplest database that could possibly work
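To make the backlink idea concrete, here is a small sketch of how backlinks can be derived from page text. The page names and the [[Page Name]] link syntax are assumptions for illustration; real wikis differ in their markup and usually compute this for you.

```python
import re
from collections import defaultdict

# Hypothetical wiki pages using [[Page Name]] link syntax.
PAGES = {
    "AppFoo": "Runs on [[Java]] and [[Solaris]]; feeds [[AppBar]].",
    "AppBar": "Runs on [[Java]]; stores data on the [[SAN]].",
    "Java": "Technology page.",
}

# Capture the link target up to the closing brackets or a "|" display label.
LINK = re.compile(r"\[\[([^\]|]+)")

def backlinks(pages):
    """Map each link target to the set of pages that point at it."""
    result = defaultdict(set)
    for title, text in pages.items():
        for target in LINK.findall(text):
            result[target.strip()].add(title)
    return result

for target, sources in sorted(backlinks(PAGES).items()):
    print(target, "<-", sorted(sources))
```

The backlinks of an attribute page such as "Java" are then exactly the applications that depend on it.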
This request brings to mind the now famous quote from Ray Ozzie regarding reorienting Microsoft to services: "Complexity kills".
There is a rigorous definition of complexity that is stated in terms of software: Kolmogorov complexity. The Kolmogorov complexity of a given output is the size, in bits, of the shortest program that produces precisely that output. This is a very useful way of viewing system complexity in the case of software. One way of viewing programming is as the compression of all the program's use cases into a program specification which, in theory, is executable. But there is more to it. As it happens -- unsurprising to many of us -- it is now a theorem of computer science that the closer the size of that "executable" gets to the Kolmogorov complexity of the use cases, the better.
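Kolmogorov complexity itself cannot be computed in general, but any off-the-shelf compressor gives a rough upper bound: the compressed size (plus the fixed size of the decompressor) is the length of one program that reproduces the data, though not necessarily the shortest. A minimal sketch using zlib, with made-up sample data:

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data: an upper bound, not the true complexity."""
    return len(zlib.compress(data, 9))

# Highly regular data has a short description; random bytes do not.
repetitive = b"app1 -> SAN\n" * 1000
random_like = os.urandom(len(repetitive))

print(len(repetitive), compressed_size(repetitive))
print(len(random_like), compressed_size(random_like))
```

The repetitive input shrinks to a small fraction of its size, while the random input barely shrinks at all.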
This theorem is a major breakthrough in CS and should be learned in every institute purporting to teach IT.
It's important enough that I've proposed a prize award to Ray Ozzie. What follows is my email to him. We'll see if it makes it through his gatekeepers and then gets his attention.
Hi Ray,
I've got a simple and powerful idea that I think, based on your statement "Complexity kills," you'll find interesting.
It is a prize competition that I'm tentatively calling "The C-Prize" that rewards the most succinct representation of a major knowledge base.
Unsurprisingly to some of us, Ockham's Razor is more than a mere rule of thumb -- it turns out to be the foundation of intelligence. Marcus Hutter of the University of Lugano, Switzerland, recently provided a mathematical proof of this link between simplicity and intelligence which has withstood peer review. Dr. Hutter believes the C-Prize to be "an excellent idea".
The criterion is easy enough to state:
Let anyone submit a program that produces, with no inputs, one of the major natural language corpora (such as a Wikipedia snapshot) as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).
Award monies:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))
The compression program and decompression program are made open source.
At present there is a small prize being administered by Leonid A. Broukhis based on the relatively tiny Calgary Corpus. Matthew Mahoney of the University of Miami has proposed a larger $50,000 compression prize to the National Science Foundation, but experience has shown it is like pulling teeth to get government agencies to fund prize awards -- they generally have to see private parties are doing so first. Marcus Hutter has put up a few small prizes based on some mathematics problems he needs solved for advancement of his theory of intelligence.
Any of these individuals could be a credible locus of control for the C-Prize -- or Microsoft itself could be the locus of control.
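For readers who want to check the award arithmetic in the email above, here is the formula worked through in code. The fund size and ratios are invented numbers, not figures from the proposal.

```python
def payout(fund, previous_ratio, new_ratio):
    """Winner's share: Z * (X / (R0 + X)), where X = R1 - R0."""
    improvement = new_ratio - previous_ratio  # X
    if improvement <= 0:
        return 0.0  # no new record, no award
    return fund * improvement / new_ratio     # R0 + X equals R1

# Hypothetical: a fund of $1000 and a record improved from ratio 4.0 to 5.0.
print(payout(1000, 4.0, 5.0))
```

With those numbers X = 1.0, so the winner receives one fifth of the fund. Because the share is X/R1, each further improvement of the same size pays a smaller fraction of whatever is left in the fund.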
Seastead this.
When I get an assignment like this, I try to take a proactive stance. First, I add the project to my action item list. Then I formulate a list of stakeholders in the project. Then I call a meeting to open a dialog between the stakeholders and myself. After drilling down and making sure we're all on the same page, I draft a scope document. When I'm satisfied with the scope document, I hold a sidebar meeting to touch base with the shareholders and verify the document meets their requirements.
Usually by that time the project gets assigned to someone else.
Maybe not
On the topic of mind mapping software, I always take a copy of FreeMind with me wherever I go. It's simple, quick and open source.
Task Mangler
Visio has always met my needs, and you can "wow" the business types with the right stencil sets. Researching and identifying the categories and dependencies will be the hard part regardless of the tool used. If you have additional people (preferably experts with the various systems), I would define your key categories, systems, packages, etc. and assign the work of detailing the breakout tasks to each subject matter expert. Once you have this set of tasks completed, bring everyone together to map the dependencies and watch all of the dumbfounded shock and amazement as people realize all of the redundant work etc. that has been performed over the years. This should help to get buy-in from your peers, as well as group motivation to keep it updated going forward.
"It's fairly trivial to show that this is NP hard"
this is utter crap. you know little to nothing about what you're talking about.
the troll, and i agree it is a troll, asks for a compressed form of a given large natural language text. nothing is said about compressing or analysing for compression any given input. this is not NP-hard, and indeed has nothing to do with NP. there is no input, no input size and no reference at all made to computation cost.
any CS student with some programming knowledge and a textbook on compression could compress a wikipedia snapshot by ~50%. winzip probably achieves ~75%. the 'proposed competition' merely offers a reward for a best-yet compression ratio. this needn't require any technical leaps, just slightly better (more tailormade for the job?) compression algos.
the completely different problem you describe in your reply is a certainly difficult problem, but i would like to see your 'trivial' proof that it is NP-hard. it's probably more accurate to say that it is unsolvable, since perfect compression can never be attained.
in future, don't counter drivel with nonsense.
my password really is 'stinkypants'
A better tool than GraphViz for this application is Cytoscape - http://cytoscape.org/ It was originally designed for biological applications, but is a general network visualizer with a very simple input format e.g.
ModelA-Charlie dependsOn ModelQ
ModelQ requires Java
etc.
You can visualize, interact with and edit the resulting network and even do some advanced analyses, such as network clustering which would tell you which families of projects you have.
It is written in Java and is LGPL.
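If your data is already in that "source relation target" form, you can get a crude idea of project families before even opening a GUI. The sketch below parses such lines and groups nodes into connected components. This is a simple stand-in for illustration, not the clustering Cytoscape performs, and the third sample line is invented.

```python
from collections import defaultdict

SIF_TEXT = """\
ModelA-Charlie dependsOn ModelQ
ModelQ requires Java
ModelZ requires DB2
"""

def parse_sif(text):
    """Yield (source, relation, target) triples from whitespace-separated lines."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            source, relation, *targets = parts
            for target in targets:
                yield source, relation, target

def families(triples):
    """Group nodes into connected components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for source, _, target in triples:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups

for group in families(parse_sif(SIF_TEXT)):
    print(sorted(group))
```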
GP AC here:
The limit of the GGP's proposed competition reduces to the question of whether or not there exists a shorter program that produces the desired output (if one exists, it's still possible to win some prize money). The desired output (knowledge base) is one of the inputs to the problem of finding a shorter program capable of winning any prize money.
If we simplify the problem so that it's limited to sigma-2, then the reduction from subcircuit isomorphism (the problem of determining if there exists a subcircuit that produces the same output) is really too obvious to bother explaining. Subcircuit isomorphism is sigma-2 complete (harder than NP complete -- sigma2 is "Ex Ay f(x,y) == true" vs NP is "Ex f(x) == true" ); therefore, this restriction of the problem is trivially NP hard. How about them apples?
If it were me, I would redefine the goal slightly. It is not just the dependencies. It is the behaviour of the overall system, the collection of application components, that you need to be able to explain. In fact, if you want to know which components you can remove or tweak, you actually need to understand the business processes they are supposed to be supporting. Theoretically, the raw business processes of each company will represent less complexity than the system of application components you need to describe. These business processes would provide "overlays" that could help you to see the extraneous components of your "application system".
Causal Loop Modeling (CLM) might be the right tool, specifically because you want to establish how a change in one part of the "application system" affects everything else. CLM is a successor to Stock and Flow modeling, but you can use it by itself to explain system behaviour. It works especially well when there are feedback loops present that create complex behaviours such as goal seeking, cyclical or seasonal variation, and sudden growth or decline. However, you could simplify the CLM approach if you leave out the notations that define positive and negative forces.
Essentially, you create a chart that connects every component to its upstream and downstream components. What you are left with is a diagram showing each component with many inflows and outflows. The way this is useful is to first note the components that have the most inflows. These are the "driven" components, the ones most affected by change. The components with the most outflows have the most influence on the others. Changes to these components create significant impact on the overall system. Armed with this information you can really start to establish where you can trim and where you cannot. You can also start to identify where small changes can have a big influence. Like a vaccination: small change, big effect.
In effect, you find the points of leverage in the system. That is where it starts to get really interesting. Hope this helps to get you started, or at least provides a different perspective. If this interests you, post questions and I will try to answer them.
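Counting inflows and outflows is easy to automate once the chart exists as a list of edges. A minimal sketch, with invented component names and edges read as "left influences right":

```python
from collections import Counter

# Hypothetical edges, read as "left influences right".
FLOWS = [
    ("billing", "reporting"),
    ("orders", "billing"),
    ("orders", "reporting"),
    ("orders", "shipping"),
]

def rank_flows(edges):
    """Return (inflow ranking, outflow ranking), highest counts first."""
    inflows = Counter(dst for _, dst in edges)   # most "driven" components
    outflows = Counter(src for src, _ in edges)  # most influential components
    return inflows.most_common(), outflows.most_common()

driven, influential = rank_flows(FLOWS)
print("most driven:", driven[0])
print("most influential:", influential[0])
```

In this toy data, "reporting" is the most driven component and "orders" is the point of leverage.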
there is only one input to the problem. the 'desired output' stays the same. it is effectively not an input. you do not have to solve it for every text, just for the one that has been determined in advance.
it is not 'NP-hard' to solve the travelling salesman problem for one given weighted graph. it can be done deterministically in constant time.
you did not read and understand either the GP or the OP.
you are a fucking idiot.
worse, your reply shows that you are an educated fucking idiot.
Bzzt. Thanks for playing. The problem to be solved is a feedback loop that uses the desired output and the current best compression. The decision problem is: exists_better_compression? (resource_limit, desired_output, best_known_compression). If there is a better compression that fits in the resource limit, then you can solve it directly by iteratively guessing the bits of the partial input exists_better_compression_from_partial_input? (resource_limit, desired_output, best_known_compression, partial_input). (*Hint: resource_limit can be a function of the other inputs.) The optimization problem (obtaining the most money from the "C-prize") iterates this process until there is no better compression.
"What is wrong with a guy doing a little research before implementing an assigned task?"
Agreed - Slashdot may not be a good (or sane) place for legal advice, but if there's FOSS out there to do a task, someone on /. knows about it (or wrote it).
Tom Jones on string theory:
Don't focus on tools too much. One option is to get hold of some IBM documentation on "The Method". This gives you a great framework to model complex systems in such a way that it is actually useful.
Beware though, it's a method, not a bible. Use from it what works for you.
Can't help you with links; find a buddy who works there.
This is a fairly obvious application of graph visualization tools. Unfortunately, since the topic is so difficult, all the good solutions tend to be commercial. GraphViz is nice up to a point, but is pretty crummy when you get into thousands of nodes and edges. JGraph is another open source tool, and it's reasonably good, but as with other free software, it starts to choke on larger graphs.
Some of the better commercial packages are from Tom Sawyer, AiSee, Ilog and yWorks.
there is no need to sign your posts. this isn't usenet. your username is right there above your post. stop it.
Once you've got your data, I'd try looking at DI Diver. It's particularly well suited to doing dynamic "what if" queries into large datasets. http://www.dimins.com/
I work for Tideway Systems. Our Tideway Foundation product is probably exactly what you need.
It lets you define patterns that it uses to automatically map instances of your applications, and uses automatic discovery techniques to keep the model up to date. It can then tell you which servers, clusters, switches, operating systems, other applications, and other technologies and resources each of your applications depends on. It can also tell you low-level information about your environment, like how much RAM each of your servers has.
Our data store is extremely flexible, so you can add your own knowledge to the model. We're currently using Tom Sawyer's tool for visualization.
The problem you are trying to solve is one of the areas addressed by ITIL. In this particular case, mapping dependencies, you might want to look at a CMDB tool. There are a bunch of vendors in the space, and I'm guessing that you don't want to draw this diagram and forget it, but would actually like to have the information stay around for a while and be kept up to date. Your applications, hardware, customers, technologies, etc. are business entities. The relationships between them are what you are currently trying to map out. In general, if you can't get the tool to draw the diagrams for you, you can certainly get it to export the data (via a report option) in a format that you can feed to the graphics utility of your choice. Good luck.
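As an illustration of the export route, here is a sketch that turns a CSV report into GraphViz DOT text. The column names (source, relationship, target) and the rows are assumptions; a real CMDB export will have its own layout, and names containing double quotes would need escaping.

```python
import csv
import io

# Hypothetical report export; the column names are assumptions.
EXPORT = """\
source,relationship,target
app1,runs_on,AIX
app1,stores_data_on,SAN
client1,uses,app1
"""

def csv_to_dot(text, graph_name="cmdb"):
    """Convert relationship rows into a DOT digraph with labelled edges."""
    rows = csv.DictReader(io.StringIO(text))
    lines = [f"digraph {graph_name} {{"]
    for row in rows:
        lines.append(
            f'  "{row["source"]}" -> "{row["target"]}" '
            f'[label="{row["relationship"]}"];'
        )
    lines.append("}")
    return "\n".join(lines)

print(csv_to_dot(EXPORT))
```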
Programs:
Agna (Benta, I. (2002, 2003). Agna Project. Retrieved April 16, 2004, from http://www.geocities.com/imbenta/agna/index.htm). This is a free product and is excellent.
NetMiner. This is available from Cyram software (Korea). It costs money, but is well worth it.
Reference:
Degenne, A., & Forsé, M. (1999). Introducing Social Networks (A. Borges, Trans.). London: SAGE Publications. This is about social networks, but the basic method of analysis is similar for all networks.
I suggest two ways: A high-level "bizniz map" and a detailed "data trail".
My company is a polyglot conglomerated transmogrification of several gov't contracts, divisions of previous contractors whose contracts were consolidated, and organizational divisions on both the gov't and corp sides. We had the same problem as budget cuts forced us to compress.
The data trail identifies sets of data from their origin points thru every app and person that touches the data, making special note of decision points. A decision point is where a single person (not group, not "CM" or "accounting") decides to change the direction of the data flow, or makes a go/nogo decision (cancels a software request, decides to not merge data sets x and y this month, etc).
The bizniz map will map the needs from the business viewpoint: organizations -> "bizniz needs" concepts -> applications -> data.
By connecting apps by the bizniz need they met, we were able to categorize:
> Which needs had more than one app? (nearly all)
> Which apps were good enuf (scalable/maintainable) to take on the entire enterprise? (Consolidate other apps' data and needs into the good ones)
> Which apps were only useful in the data they contained? (export the data to the good apps, and delete the weak app)
> Which apps overlapped in requirements? Some apps we split down the middle - moved some of the requirements (and data) to the upscale apps and "stubbed out" the no-longer-needed capability of the lesser app.
One thing that kept coming up - a surprising number of functional organizations had spreadsheets, sets of .Doc templates, etc, to meet their requirements, and these created an astounding amount of work to move up to the corporate-chosen apps. The old data had to be kept, of course, and it wasn't in any kind of good shape. Sometimes, data format and content changed from one month to the next, no versioning, no docs, no controls, no nothin' but the MS app - .MDE, .XLS, .PPT even.
However you organize the work, don't forget "the little guys" like this - The people that use and maintain these micro-apps have these things in common:
> The micro-app was likely created out of frustration and desperation to get a handle on a "minor" bizniz need that corp. IT didn't prioritize.
> Even the possibility of being overlooked is likely to be a sore spot - and will get a highly vocal (politically negative) outcry.
> The micro-apps contain data that is crucial to the functioning of a relatively small part of the organization.
> Created by one person (at most two), who is now very well-liked and respected by the organization's boss - who also understands the importance of the micro-app to his function.
Overlooking the micro-apps would be a major mistake as you map all this out.
One particular process I analyzed and tracked went from Excel to .csv file to mainframe DBS to print file, printed and faxed to someone who then entered the data in another Excel in order to organize it for export to a .csv so that it could be read by a realtime engineering simulator.
The two Excel spreadsheets were on the same LAN. At least nine functional points (people) were involved in this. But in years past, before Excel and LANs existed, this all made sense. We shortened the process path to ONE point - and eliminated more than a month from the schedule for processing the data. This is the kind of thing that gets functional managers (Supervisors and their bosses) to support you. Just show them the inefficiencies in the existing micro-app structure, and how it can be improved.
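The first bizniz-map question ("which needs had more than one app?") is easy to answer mechanically once each app is tagged with the need it meets. A minimal sketch with invented app and need names:

```python
from collections import defaultdict

# Hypothetical map of each app to the bizniz need it meets.
APP_NEEDS = {
    "TimeTrackXLS": "timekeeping",
    "CorpTime": "timekeeping",
    "PartsDB": "inventory",
    "ShopFloor.mde": "inventory",
    "Payroll": "payroll",
}

def apps_by_need(app_needs):
    """Group app names under the need they serve."""
    grouped = defaultdict(list)
    for app, need in app_needs.items():
        grouped[need].append(app)
    return grouped

def overlapping_needs(app_needs):
    """Needs served by more than one app: the consolidation candidates."""
    return {
        need: sorted(apps)
        for need, apps in apps_by_need(app_needs).items()
        if len(apps) > 1
    }

for need, apps in overlapping_needs(APP_NEEDS).items():
    print(need, "->", apps)
```

Deciding which of the overlapping apps is "good enuf" to absorb the others is still a human judgment call; the code only finds the candidates.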
Pavlov wouldn't be so famous if he'd used a can opener instead of a bell.
Disclaimer: This is exactly what my company does, so I am by no means impartial...
This is called application portfolio management (APM). As other people have said in the thread, it basically involves mapping the applications and their dependencies onto a graph. Then you can use the graph to build reports, perform searches, visualize the structure, etc. The company I work for (http://www.metallect.com/) does exactly this, and the website has much more info about it.
Where are the moderators on this discussion? Two +5s and that's it?
I've used info mapping techniques for a long time when I want to get my head around complexity - or even start thinking about a new project for the first time. There are dozens of software implementations. Here are some I know of:
3D Topicscape; Aviz Thought Mapper; Axon Idea Processor; BrainMine; Claro Concepts; Concept X7; ConceptDraw Mindmap; Conception & InterModeller; Freemind (SourceForge); Headcase; Hypersoft-net; I-Navigation; InfoPro (ZPAY); InfoRapid KnowledgeMap 2005e; Inspiration Software Inc.; jKSImapper; LifeMap; Map it!; Mind Manager; Mind Mapper; Mind Mapping Tool; MindChart; MindFull; ModellingSpace; MyMind; Novamind; Openmind 2; PersonalBrain; SoftNeuron Mindmapping SW; VisiMap Professional 4.0 (once called InfoMap Lite)
I have myself worked on a problem very much like the one described by thesandbender, and did use ERwin as one poster suggested, but much later in the project. I started by mapping out how all the apps were connected, collecting info about the interfaces between them and who was responsible for each, to have an overall picture first. Then I worked up a physical data model for each and normalized them. That gave me a load of logical models. From that info, others went on to build an enterprise model.
dunno if that helps...