Dealing with different timing models in CEP

March 20, 2010

Time can be a subtle thing to understand in CEP systems.

Following, we have a scenario that yields different results depending on how one interprets time.

First, let’s define the use-case:

Consider a stream of events that has a rate of one event every 500 milliseconds. The event contains a single value property of type int. We want to calculate the average of this value property for the last 1 second of events. Note how simple is the use-case.

Is the specification of this use-case complete? Not yet, so far we have described the input, and the event processing (EP) function, but we have not specified when to output. Let’s do so: we want to output every time the calculated average value changes.

As an illustration, the following CQL implements this EP function:

ISTREAM( SELECT AVG(value) AS average FROM stream [RANGE 1 second] )

The final question is: how should we interpret time? Say we let the CEP system timestamp the events as they arrive using the wall clock (i.e. CPU clock) time. We shall call this the system-timestamped timing model.

Table 1 shows the output of this use-case for a set of input events when applying the system-timestamped model:

What’s particularly interesting in this scenario is the output event o4. A CEP system that supports the system-timestamped model can progress time as the wall clock progresses. Let’s say that our system has a heart-beat of 300 milliseconds, what this means is that at time 1300 milliseconds (i.e. i3 + heart-beat) the CEP system is able to automatically update the stream window by expiring the event i1. Note that this only happens when the stream window is full and thus events can be expired.

Next, let’s assume that the time is defined by the application itself, and this is done by including a timestamp property in the event. Let’s look what happens when we input the same set of events at exactly the same time as if we were using the wall clock time:

Interesting enough, now we only get four output events, that is, event o4 is missing. When time is defined by the application, the CEP system itself does not know how to progress time, in other words, even though the wall clock may have progressed several minutes, the application time may not have changed at all. What this means is that the CEP system will only be able to determine that time has moved when it receives a new input event with an updated timestamp property. Hence we don’t get a situation like in the previous case where the CEP system itself was able to expire events automatically.

In summary, in this simple example we went through two different timing models, system-timestamped and application timestamped. Timing models are very important in CEP, as it allows great flexibility, however you must be aware of the caveats.

Best-case analysis is bogus

June 21, 2009

One deals with worst-case analysis of algorithms in a day-to-day basis, especially if one is working with systems that have real-time constraints, such as CEP. As I’ve written in the past, real-time systems do not necessarily need to run fast, that is, they can take days to finish, but if the real-time system states that it will finish a task in 1 hour, then it must. To be able to make such time assertions with certainty, the real-time system developer must know the worst-case time for all algorithms that make up such a task.

As an example, what is the worst-case cost of inserting 1 element in a sorted linked list of size n? A naive approach is to start at the first position and compare the elements in the linked list until you find one that is larger than the element being inserted, in which case then you insert the new element in that position and move the remaining elements back. The worst-case scenario is when the element to be inserted is larger than any other element in the list, thus the cost being the cost of n node traversals (i.e. n*c1) and the cost of the actual insertion at the end (i.e. c2). Or, eliminating the constants, we would arrive at the asymptotic worst-case cost of n.

What about the best-case analysis of this algorithm?

Well, I believe that best-case analysis are rather bogus. It has very limited analytical and statistical value, as it not only adds no insight to the analysis of the algorithm, but it also rather hardly presents a real-world case.

However, the biggest problem of best-case algorithm analysis is that the result can be manufactured. To test for best-case scenarios means that you can choose your best input, which allows the algorithm developer to hard-code the algorithm for this well known best input. For example, the algorithm could do the following:

if (bestInput) then return x;

// otherwise do proper work

In other words, you could have a very complex algorithm whose worst-case cost is n square, but whose best-case cost for this one selected input is 1!

Is this valid? Well, it is a valid algorithm, but, in my opinion, a bogus analysis.

A Short Dissertation on EDA

November 13, 2006

There is a lot of literature on EDA, event stream processing, CEP, etc; that is, on event and event processing technologies. Although all of them are very good, it can get a little overwhelming. Following, I attempt to describe EDA and how EDA relates to other technologies, such as SOA, real-time, and Java, in a pragmatic form.

Event-driven architecture is an architectural style composed of decoupled applications that interact by exchanging events. These applications are called event-driven applications. Event-driven applications may play the role of an emitter of events, and of a responder or processor of events.

Event-driven architecture is important, because the real-world is event-driven. One example is the financial world, in which trader applications react to events (or changes) made to the financial exchange market. Event-driven situations should be modeled by event-driven architecture.

Event driven applications are sense-and-respond applications, that is, applications that react to and process events.

Events are state changes that are meaningful to an observer. Generally, events are in the form of a protocol message. Events may be simple or complex. Simple events contain no meaningful member events. Complex events contain meaningful member events, which are significant on their own. An example of a simple event is a stock bid event, and a stock offer event; an example of a complex event is a stock trade event, which includes both a bid event and an offer event.

Events may be delivered through different mediums, two of which are channels and streams. Channels are non-active virtual pipes, that is, a producer component is responsible for inserting data into one side of the pipe and another consumer component is responsible for removing the data at the other side of the pipe. The data is stored in the channel as long as it is not removed by a component. Of course, channels may be bound, in which case it may stop accepting new data or purging existing data as it sees fit. Examples of channels are JMS queues and topics. In the contrary, streams are active virtual pipes, that is, they support a continuous flow of data. If a producer component does not directly listen to the stream, it is likely to miss some data. Because streams do not need to store data, streams are able to support a high-volume of streaming data flowing through them. An example of a stream is the of the air TV broadcast.

Having received events, the next task of an event-driven application is to process the
events. Event Processing is defined as a computation stage that consumes and optionally generates events. Currently, as specified by Roy Schulte, there are four ways to categorize event processing:

  • Event passing:
    Events are simply handled off between components, there is
    mostly no processing, and it generally deals only with simple events. Event-passing applications are asynchronous, staged, and trigged by the arrival of one event from a single event stream or channel. Sometimes they are referenced as message-driven or document-driven applications. Examples are simple pub-sub applications.
  • Event mediation (or brokering):
    Events are filtered, routed (e.g. content-based), and transformed (e.g. enriched). Event mediators are stateless, and deal with both simple and complex events; however they do not synthesize new complex events of their own, that is, event mediators cannot combine (i.e. aggregate) simple events into complex events, mostly due to the fact that they do not keep state. Generally, there is a single event stream or channel fan-in, and multiple event
    streams or channels fan-out. Examples are integration brokers.
  • Complex Event Processing (CEP):
    Events are processed by matching for complex patterns, and for complex relationships, such as causality, timing, correlation and aggregation. CEP applications are state-full; simple and complex events are received from several event streams and new complex events may be synthesized. CEP applications must be able to handle a very high volume of events, and hence generally only using streams.
  • Non-linear Complex BPM:
    Event-based business processes modeling non-linear complex work flows. The business process is able to handle unpredictable situations, including complex patterns, and complex event relations.

Event Stream Processing (ESP) is event processing solely on streams, as opposed to channels. Hence, CEP is always part of ESP; however ESP includes other event processing types, such as event passing and event mediation, when those are performed on streams, rather than on channels.

An event-driven application may play the roles of event source, event sink, or both. An event source generates events to event sinks. Note that event sources do not necessarily create the event, nor events sinks are necessarily the consumer of events. Furthermore, event sources and event sinks are completely decoupled from each other:

  • An event source does not pass control to event sinks, which is the case of service consumers delegating work to providers; and
  • Event sinks do not provide services to event sources, which is the case of consumers that initiate and consume work from providers; and
  • One can add and remove event sources and sinks as needed without impacting other event sources and sinks.

How does EDA compare to SOA? That depends on how the loosely term SOA is defined. If SOA is defined as an architecture that promotes re-use of modular, distributed components, then EDA is a type of SOA. If SOA is defined as an architecture where modules provide services to consumer modules, then EDA is not SOA.

The concepts previously described are based upon work from Roy Schulte, Mani Chandy, David Luckham, and others.

Next, let’s focus on real-time concepts.

Real-time is the capability of a system on being able to ensure the timely and predictable execution of code. In another words, if a developer specifies that an object must be executed in the next 100 milliseconds (or in the next 100 minutes for that matter), a real-time infrastructure will guarantee the execution of this object within this temporal constraint.

Event-driven architectures are suitable for real-time. Event-driven applications are generally implemented using asynchronous mechanisms; this lack of synchronicity improves resource usage, which in turn helps guarantee real-time quality of service.

Objects that have temporal constraints are named schedulable objects. The system measures how well the temporal constraints are being met by means of a particular metric, for example, the number of missed deadlines. Schedulers order the execution of schedulable objects attempting to maximize these metrics. Schedulers make use of different algorithms or policies to do this, one of which is the Rate Monotonic Analyze (RMA). RMA relies on thread priority as a scheduling parameter and determines that the highest priority should be associated to the shortest tasks.

Let’s re-consider CEP. CEP allows one to specify temporal constraints in the processing of events. For example, one can specify to match for an event that happens within 100 milliseconds of another event. Hence, CEP rules (e.g. queries) are essentially a type of schedulable object, and therefore a CEP agent must be a real-time agent.

In a very loosely form, CEP can be further characterized by two functions, a guarding function, and an action function. The former determines whether an event should trigger a response, and the latter specifies the responses to be taken if the guard is satisfied.

Consider a system that supports CEP agents whose action functions are coded in Java. This implies that the system must support the development, and deployment of Java applications, and hence, in this regards, it must be to some extent a Java application server, or rather as we have concluded previously, a real-time Java application server.

To be more exact, CEP Java action functions do not need the full services of a complete application server, for instance, part of the transactional, persistence, and security container services may not be needed. What is needed is a minimal-featured application server. This minimalist aspect is also applicable to the real-time capability. We do not need a full set of real-time features that enables the development of any type of applications, but rather a minimal set of real-time features that enables the support of CEP agents.

A system that supports CEP also supports other event processing types, such as event passing and event mediation. Therefore, a light-weight real-time Java application server that is able to host CEP agents is a good overall solution for achieving EDA.

Memory management for real-time applications in Java

May 22, 2006

One of the main advantages of using Java is not to have to worry about disposing objects [1], that is, to let the Java runtime take care of the memory management of Java objects.

This is done by letting the Java runtime garbage collect Java objects that are no longer being used.

Garbage collection is a relatively complicated process. Generally, the Java runtime will traverse the heap, checking for objects that are no longer being referenced by any other objects, and thus can be safely deleted.

However, as garbage collection uses CPU cycles, it may impact the execution of application code. That is, if during the execution of the application code, garbage collection is performed, the application code may take more time to respond. This causes the latency of the user transaction to increase. Even worse, as it is unknown to the user when a garbage collection may occur, the latency increase is unpredictable.

Real-time applications have strict timing requirements, that is, they have to execute application code under some determined, known latency. Thus the unpredictable latency increase that may be caused by the garbage collection becomes a problem.

What are the solutions to this problem? One obvious solution is not to use Java for real-time applications. This is a poor solution. Java brings a lot as a programming language and as a development platform; we should be able to solve this problem in Java.

Another solution is to use a different memory management approach in Java instead of garbage collectors. RTSJ, the Real-Time Specification for Java, defines the concept of immortal memory, and scoped memory. Immortal memory is memory that is never garbage collected; it lives forever until the JVM is brought down. Scoped memory is memory that is allocated and released in chunks. That is, the user explicitly creates a scope of memory where the objects will live and the objects are released when the scope is exited or destroyed. In both cases of immortal memory and scoped memory, there is no need for garbage collection. However, there is a drawback, the onus of managing the memory has again moved back to the user, as it is the case for C/C++ applications. This still seems too high of a price to pay. Can we do better?

Ok, so let’s re-consider garbage collection, the main problem with garbage collection is the unpredictable latency spikes it causes. Can we avoid this unpredictable behavior? Or rather, can we limit (i.e. bind) this unpredictable behavior? Yes, by doing garbage collection more often and consistently, we can bind the maximum latency pause time. This is the approach taken by WLRT. Thus, garbage collection becomes a predictable task, with a known cost, which can be considered and modeled by the real-time developer as needed. And, most importantly, we don’t sacrifice Java’s easy of use.

[1] it should be noted that Java programs can still have memory leaks, in spite of garbage collection or any other memory management technique. For instance, by not removing component objects from containers, when they are no longer needed.