Programming Environment For "Event Correlation"?
sireenmalik asks: "Of late I have become interested in this field of research namely Event Correlation on a Distributed Network System. The more I read about it, the more ignorant I feel. There is so much to it: distributed network systems, databasing, artifical intelligence (neural networks, baysian belief networks, rule based,etc.), software engineering, computer science, telecommmunication....etc. If I were to really attack it from a programming point of view, can somebody tell me what tools and languages should I use? I suppose it will be a realtime environment. Academicians support ADA but I can't figure how the artificial intelligence part will be done. If I use PROLOG/LISP I get into HEAP management business which really is a dragon for realtime systems. C/C++ .. Java....? To add the list I also know about the diverse implementations using JIRO (from SUN), ECDL (from HP), RAPIDE (from Stanford.edu), JAVA Management API, ELAVA, GEM Language, MODEL Language, IF/PROLOG......and the list goes on and on and on! It's interesting as well as confusing (I can't help but agree here). Let's talk about it. Maybe something useful happens here?"
Unless you have an awful lot of processor power to spare, why would you even think about doing this processing in real-time?
Theres several advantages to this approach:
- you don't have to have such a fast machine
- the data collection software can be *simple*
- you don't alter the data collection software when you alter your analysis
- you have the raw data to hand for applying more analysis if you need to do a second pass.
For real-time processing I would look at using an offline analysis to generate state machines for recognizing events. And I would get these machines to *generate* events into the stream as well. That way you can build your analysis hierarchically by recognizing subpatterns and building patterns from them.
In any case, from a practical standpoint 'real-time' processing would not spot some of the most interesting things - such as an event pattern recurring close to a regular period of minutes,hours,days,weeks... - eg network failures due to load and due to incorrect scheduled jobs have a differnt appearance - both occur regularly but the schedule failure would have a more precisely regular period. Unless you plan to accumulate state over long periods of time and watch for such things I reckon you'll miss a lot of important recurrences.
DARPA archives should help you a lot.
[glad I'm out of THAT arena]
/(o\ I'm not a medievalist - I just play one on weekends!