Programming Environment For "Event Correlation"?

Posted by Cliff on Sunday January 28, 2001 @05:24PM from the figuring-out-more-about-it dept.

sireenmalik asks: "Of late I have become interested in a field of research, namely Event Correlation on a Distributed Network System. The more I read about it, the more ignorant I feel. There is so much to it: distributed network systems, databases, artificial intelligence (neural networks, Bayesian belief networks, rule-based systems, etc.), software engineering, computer science, telecommunication, etc. If I were to really attack it from a programming point of view, can somebody tell me what tools and languages I should use? I suppose it will be a realtime environment. Academicians support ADA, but I can't figure out how the artificial intelligence part would be done. If I use PROLOG/LISP I get into the HEAP management business, which really is a dragon for realtime systems. C/C++? Java? To add to the list, I also know about the diverse implementations using JIRO (from SUN), ECDL (from HP), RAPIDE (from Stanford.edu), JAVA Management API, ELAVA, GEM Language, MODEL Language, IF/PROLOG... and the list goes on and on and on! It's interesting as well as confusing (I can't help but agree here). Let's talk about it. Maybe something useful happens here?"

realtime collection, offline analysis. by Bazzargh · 2001-01-28 21:34 · Score: 3

Unless you have an awful lot of processor power to spare, why would you even think about doing this processing in real-time?

There are several advantages to collecting the data in real time and analyzing it offline:
- you don't have to have such a fast machine
- the data collection software can be *simple*
- you don't alter the data collection software when you alter your analysis
- you have the raw data to hand for applying more analysis if you need to do a second pass.
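
A rough sketch of that split, in Python. The JSON-lines log format and the names `record_event` and `count_by_kind` are assumptions made for illustration, not something from the post: the collector only appends raw events, and any analysis is a separate pass over the stored lines that can be changed or rerun without touching the collector.

```python
import json
import time
from collections import Counter


def record_event(log_file, source, kind, timestamp=None):
    """Collector side: append one raw event as a JSON line, nothing more."""
    event = {
        "t": time.time() if timestamp is None else timestamp,
        "source": source,
        "kind": kind,
    }
    log_file.write(json.dumps(event) + "\n")


def count_by_kind(lines):
    """Analysis side: one possible offline pass, counting events per kind.

    A second pass with different logic can reuse the same raw lines.
    """
    return Counter(json.loads(line)["kind"] for line in lines if line.strip())
```

In use, `log_file` would be a file opened in append mode on the collecting machine, and `count_by_kind` would run later over the same file, possibly on a different machine.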

For real-time processing I would look at using an offline analysis to generate state machines for recognizing events. And I would get these machines to *generate* events into the stream as well. That way you can build your analysis hierarchically by recognizing subpatterns and building patterns from them.
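
One way to picture the hierarchical idea is the minimal sketch below. The class `SequenceRecognizer`, the function `correlate`, and the event names are hypothetical. The machines here are written by hand rather than generated by an offline analysis, and the matching is a naive restart-on-mismatch, so it can miss overlapping matches. Each machine ignores event kinds outside its own pattern, and whatever it emits is pushed back into the stream so that a higher-level machine can match on it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class SequenceRecognizer:
    """Tiny state machine: matches `pattern` in order among the kinds it
    cares about, ignores all other kinds, and emits `emits` on a match."""
    pattern: tuple
    emits: str
    state: int = 0

    def feed(self, kind):
        if kind not in self.pattern:
            return None  # not part of this machine's alphabet
        if kind == self.pattern[self.state]:
            self.state += 1
            if self.state == len(self.pattern):
                self.state = 0
                return self.emits
        else:
            # Naive restart: the current event may begin a new match.
            self.state = 1 if kind == self.pattern[0] else 0
        return None


def correlate(events, recognizers):
    """Run every event through every recognizer. Derived events are put
    back at the front of the stream, right after the event that completed
    them. Assumes the recognizers do not emit events in a cycle."""
    pending = deque(events)
    seen = []
    while pending:
        kind = pending.popleft()
        seen.append(kind)
        derived = [d for r in recognizers if (d := r.feed(kind)) is not None]
        pending.extendleft(reversed(derived))
    return seen
```

For example, a first machine could turn `link_down` followed by `link_up` into a `flap` event, and a second could turn two `flap` events into `unstable_link`, without the second machine knowing anything about the raw link events.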

In any case, from a practical standpoint 'real-time' processing would not spot some of the most interesting things, such as an event pattern recurring at close to a regular period of minutes, hours, days, or weeks. For example, network failures due to load and failures due to incorrectly scheduled jobs have a different appearance: both occur regularly, but the schedule failure would have a more precisely regular period. Unless you plan to accumulate state over long periods of time and watch for such things, I reckon you'll miss a lot of important recurrences.
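
As a hedged illustration of telling those two cases apart offline, the sketch below looks at the gaps between successive occurrences of an event and reports their mean and coefficient of variation (standard deviation divided by mean). The function name `period_profile` and the choice of this statistic are assumptions, not something the post prescribes. A value near zero suggests a precisely regular period, as a scheduled job would produce; a larger value suggests a looser, load-driven rhythm.

```python
from statistics import mean, pstdev


def period_profile(timestamps):
    """Return (mean gap, coefficient of variation) for one event's
    occurrence times, given in any order and in any consistent unit."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three occurrences")
    avg = mean(gaps)
    if avg == 0:
        raise ValueError("all occurrences share one timestamp")
    return avg, pstdev(gaps) / avg
```

This needs the accumulated history for each event, which is exactly what the stored raw data provides. What threshold separates "scheduled" from "load-driven" would have to be tuned on real logs.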