Dealing with different timing models in CEP

Time can be a subtle concept in CEP systems.

The following scenario yields different results depending on how one interprets time.

First, let’s define the use-case:

Consider a stream of events that arrives at a rate of one event every 500 milliseconds. Each event contains a single value property of type int. We want to calculate the average of this value property over the last 1 second of events. Note how simple the use-case is.

Is the specification of this use-case complete? Not yet. So far we have described the input and the event processing (EP) function, but we have not specified when to output. Let’s do so: we want to output every time the calculated average value changes.

As an illustration, the following CQL implements this EP function:

ISTREAM( SELECT AVG(value) AS average FROM stream [RANGE 1 second] )

The final question is: how should we interpret time? Say we let the CEP system timestamp the events as they arrive using the wall clock (i.e. CPU clock) time. We shall call this the system-timestamped timing model.

Table 1 shows the output of this use-case for a set of input events when applying the system-timestamped model:

What’s particularly interesting in this scenario is the output event o4. A CEP system that supports the system-timestamped model can progress time as the wall clock progresses. Let’s say that our system has a heart-beat of 300 milliseconds. This means that at time 1300 milliseconds (i.e. i3 + heart-beat) the CEP system is able to automatically update the stream window by expiring the event i1, even though no new input event has arrived. Note that this only happens when the stream window is full and thus events can be expired.

Next, let’s assume that time is defined by the application itself, which is done by including a timestamp property in the event. Let’s look at what happens when we input the same set of events at exactly the same times as when we were using the wall clock time:

Interestingly enough, now we only get four output events; that is, event o4 is missing. When time is defined by the application, the CEP system itself does not know how to progress time. In other words, even though the wall clock may have progressed several minutes, the application time may not have changed at all. This means that the CEP system can only determine that time has moved when it receives a new input event with an updated timestamp property. Hence we don’t get a situation like the previous case, where the CEP system itself was able to expire events automatically.
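The difference between the two models can be sketched with a small simulation. This is only an illustration in Python, not how any CEP engine is implemented: the run function, the event values (10, 20, 30, 40), the single heart-beat at 1300 milliseconds, and the inclusive lower bound of the window are all assumptions chosen to mirror the narrative above, not data taken from the tables.

```python
from collections import deque


def run(events, heartbeats=(), range_ms=1000):
    """Return [(time, average)], one entry each time the windowed average changes.

    events:     (timestamp_ms, value) pairs, already ordered by timestamp.
    heartbeats: extra times at which the engine advances its clock with no
                input event (system-timestamped model only; hypothetical).
    """
    # Merge events and heart-beats into one timeline; a heart-beat has no value.
    timeline = sorted(
        [(ts, value) for ts, value in events] + [(ts, None) for ts in heartbeats],
        key=lambda item: item[0],
    )
    window = deque()
    last_avg = None
    outputs = []
    for now, value in timeline:
        if value is not None:
            window.append((now, value))
        # Expire events that fell out of the range. Keeping ts == now - range_ms
        # in the window is an assumption made to match the narrative above.
        while window and window[0][0] < now - range_ms:
            window.popleft()
        if not window:
            continue
        avg = sum(v for _, v in window) / len(window)
        if avg != last_avg:  # ISTREAM-like behaviour: emit only on change
            outputs.append((now, avg))
            last_avg = avg
    return outputs


# Hypothetical input: one event every 500 ms.
events = [(0, 10), (500, 20), (1000, 30), (1500, 40)]

system = run(events, heartbeats=[1300])  # the clock can advance between events
application = run(events)                # the clock advances only on new events

print(len(system), system)
print(len(application), application)
```

With these assumed inputs, the system-timestamped run emits an extra output at 1300 milliseconds, when the heart-beat expires the first event, while the application-timestamped run skips straight from the output at 1000 to the one at 1500.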

In summary, in this simple example we went through two different timing models: system-timestamped and application-timestamped. Timing models are very important in CEP, as they allow great flexibility; however, you must be aware of the caveats.

This entry was posted on Saturday, March 20th, 2010 at 10:53 pm and is filed under CEP, CQL, real-time.

5 Responses to Dealing with different timing models in CEP

Sachin says:

April 12, 2010 at 2:12 am

Very useful paper. In fact, I have been struggling with a problem related to application timestamps for the last 3 days and have not been able to find an exact solution.

Martin Lai says:

May 24, 2010 at 11:46 pm

I am new to CEP. This is a very good and concise explanation of the two timing models.

In the application timestamp model, must the application ensure that events are fed to the CEP system in timestamp order? If so, how is this achieved in different systems?

- Alexandre Alves says:
  
  May 27, 2010 at 8:07 pm
  
  Hi,
  
  Yes, in the application timestamped model, the application must guarantee that the events are ordered in time.
  
  This is because the receiving CEP system does not know when the next event will arrive. For example, 1 application-time unit could be equivalent to 1 nanosecond or 1 hour; there is no way to tell.
  
  Although this may seem hard to achieve in the application, the application time usually maps to some other logical concept of the application, like an order id, a logical clock, or a network clock; hence the application just needs to map it through directly.
  
  Best rgds,
  
Sanjeev Batta says:

June 27, 2010 at 10:09 pm

Are you aware of any CEPs that allow the application timestamp to drive the CEP temporal functions and pattern matching, e.g. if 3 events of type X happen in 6 minutes, raise another event? Most CEP products we have evaluated so far don’t seem to like the idea of an application-driven timestamp.

- Alexandre Alves says:
  
  June 28, 2010 at 8:46 am
  
  Hi,
  
  I didn’t quite understand your example, but Oracle CEP does fully support application-timestamped events. For example, one can specify ‘DURATION 10’ for a no-event pattern match, where ‘10’ maps to 10 units of application time, and not to any particular time granularity such as seconds or minutes.
  
  Best regards,
  Alex
  

A World of Events