New book on Event Processing and EPL theory

July 4, 2013

In March of this year, I had the pleasure of publishing my second book: Getting Started with Oracle Event Processing 11g.

As I learned the hard way, authoring a book is really hard stuff, so this time around I had the satisfaction of working with two co-authors, Robin Smith and Lloyd Williams, both outstanding product managers at Oracle.

Although the book focuses primarily on Oracle’s Event Processing product (version 11g), it also includes a lot of material on the underlying fundamentals and concepts of the (complex) event processing technology employed by the product. This should make it interesting to event processing enthusiasts in general, particularly those interested in learning the theory behind event processing languages.

The first two chapters provide a very good landscape of the market for event processing, along the way describing a few important use-cases that are addressed by this technology. Chapters 3 and 4 describe how to get events in and out of the system, and how to model the system using the concept of an Event Processing Network (EPN).

Chapters 5, 8, and 11 provide a very deep description of Oracle’s CQL (Continuous Query Language). Amongst other things, they get into several interesting and advanced topics:

  • The conceptual differences between streams and relations;
  • How different timing models, such as application-based and system-based time models, influence the results;
  • A formal explanation of how relational algebra in SQL is extended to support streams;
  • How to implement interesting pattern matching scenarios, such as that of missing events, and the W pattern; and
  • How CQL is extended to support JDBC, Java, and Spatial technology, allowing one to process events not only in time, but also in terms of location.

Chapters 6, 7, and 9 describe how to manage the overall system, in terms of both configuration and performance, and how to scale up and scale out, particularly explaining how a Data Grid can be used in conjunction with event processing to greatly improve scalability and fault tolerance. Finally, chapters 10 and 12 tie everything together with a case study and discuss future directions.

I hope you will have as much fun reading this book as I had writing it.

If you have any questions along the way, feel free to send them to me.


Answer to CEP quiz: streams and relations

January 4, 2012

Sorry for the long delay in posting an answer to last week’s, or rather, last year’s quiz. It is funny how time seems to stretch out sometimes and a few days turn into weeks.

Well, let’s revisit the queries from the last post:

  1. SELECT * FROM C1[NOW]
  2. ISTREAM (SELECT * FROM C1[NOW])
  3. DSTREAM (SELECT * FROM C1[NOW])

In the first query, the answer is that at time t = 2 the CACHE is empty. Why empty, rather than containing the entry 1?

To understand this, consider the following sequence of events:

  • At time t = 0, the NOW (window) operator creates an empty relation.
  • At time t = 1, the NOW operator converts the stream with the single event { p1 = 1 } to a relation containing a single entry { p1 = 1 }. The result of the NOW operator is a relation, and hence in the absence of other conversion operators, the query likewise outputs a relation. As CEP deals with continuous queries, the best way to represent the difference between the empty relation at time t = 0 and the relation at time t = 1 is to output the insertion of the entry { p1 = 1 }, or in other words, an insert event { p1 = 1 }. The CACHE receives this insert event and stores the entry { p1 = 1 }.
  • At time t = 2 (or more precisely at the immediate next moment after t = 1), the NOW operator outputs an empty relation, as the event e1 has moved on from the input stream. The difference between the relation at t = 1 and the relation at t = 2 is the deletion of the entry { p1 = 1 }, therefore the query outputs the delete event { p1 = 1 }. The CACHE receives this delete event, and consistently removes the entry { p1 = 1 }, leaving the cache empty.
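The sequence above can be sketched in a few lines of Python. This is only an illustrative model, not Oracle CEP code: the names `now_window`, `diff`, and `apply_to_cache` are hypothetical, time is simplified to integer ticks, and each event is represented by its p1 value.

```python
from collections import Counter

def now_window(stream, t):
    """Relation produced by [NOW] at tick t: only the events stamped exactly t."""
    return [event for (ts, event) in stream if ts == t]

def diff(previous, current):
    """Express the change between two relation states as insert/delete events."""
    prev = Counter(previous)
    curr = Counter(current)
    deletes = [("delete", e) for e in (prev - curr).elements()]
    inserts = [("insert", e) for e in (curr - prev).elements()]
    return deletes + inserts

def apply_to_cache(cache, deltas):
    """A cache keyed on p1: inserts become puts, deletes become removes."""
    for kind, p1 in deltas:
        if kind == "insert":
            cache[p1] = p1
        else:
            cache.pop(p1, None)

# Single event e1 = { p1 = 1 } sent at t = 1 (assumed encoding: (timestamp, p1)).
stream = [(1, 1)]
cache = {}
previous = []
history = {}
for t in range(0, 3):
    current = now_window(stream, t)
    apply_to_cache(cache, diff(previous, current))
    history[t] = dict(cache)  # snapshot of the cache after tick t
    previous = current

print(history)  # {0: {}, 1: {1: 1}, 2: {}}
```

The snapshot for t = 1 holds the entry, and the snapshot for t = 2 is empty again, which mirrors the insert event followed by the delete event.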

Next, let’s consider the second query. In this case, the answer is that at the end the CACHE contains a single entry with the value of 1.

Let’s explore this. In this query, we are using an ISTREAM operator after the NOW operator. The ISTREAM operator converts the relation into a stream by keeping only the insert events. This means that at time t = 1, the insert event being output from the NOW operator is converted into a stream containing the single event { p1 = 1 }. The CACHE receives this event and stores it. Next, at time t = 2, the delete event output from the NOW operator is ignored (dropped) by the ISTREAM (convert) operator and never makes it into the CACHE.

The answer for the third query is likewise that at the end the CACHE contains the single entry of 1. The rationale is similar to that of the previous case, although shifted by one time tick.

At time t = 1, the insert event being output from the NOW operator is ignored by the DSTREAM operator; however, the delete event output at time t = 2 is used and converted to a stream. The conversion is simple: the delete event from the relation becomes an insert event in the stream, as streams only support inserts anyway. The CACHE then picks up this insert event and stores the event. Just keep in mind that for this third query this happens at time t = 2, rather than at time t = 1 as is the case for the second query.
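As a rough illustration of the three outcomes, the sketch below feeds the two relation events produced by the NOW window through hypothetical `istream` and `dstream` helpers. It is a simplified Python model rather than CQL or any Oracle API, and the tuple layout `(time, kind, p1)` is an assumption made for the example.

```python
def istream(deltas):
    """Keep relation inserts as stream inserts; drop relation deletes."""
    return [(t, "insert", p1) for (t, kind, p1) in deltas if kind == "insert"]

def dstream(deltas):
    """Turn relation deletes into stream inserts; drop relation inserts."""
    return [(t, "insert", p1) for (t, kind, p1) in deltas if kind == "delete"]

def run(events):
    """Apply events to a cache keyed on p1; the value records the tick of the put."""
    cache = {}
    for t, kind, p1 in events:
        if kind == "insert":
            cache[p1] = t
        else:
            cache.pop(p1, None)
    return cache

# Assumed output of SELECT * FROM C1[NOW] for e1 = { p1 = 1 } sent at t = 1.
relation_deltas = [(1, "insert", 1), (2, "delete", 1)]

query1 = run(relation_deltas)            # no conversion operator
query2 = run(istream(relation_deltas))   # ISTREAM
query3 = run(dstream(relation_deltas))   # DSTREAM

print(query1)  # {}
print(query2)  # {1: 1}
print(query3)  # {1: 2}
```

In this model the cache value is the tick at which the entry was put, so the second and third results differ only in that timestamp.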

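Here is a quick summary:

  • Query 1 (no conversion operator): the CACHE is empty at time t = 2.
  • Query 2 (ISTREAM): the CACHE contains the entry { p1 = 1 }, inserted at time t = 1.
  • Query 3 (DSTREAM): the CACHE contains the entry { p1 = 1 }, inserted at time t = 2.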

I would like to thank those people who pinged me, some of them several times, to gently remind me to post the answer.


A CEP Quiz: streams and relations

October 28, 2011

Last week, we were training some of Oracle’s top consultants and partners on CEP.

The training was held in Reading, UK (near London). The weather was beautiful, contrary to what I was told is the common English predicament for October.

At the end, we gave a quiz, which I am reproducing here:

Consider an input channel C1 of the event type E1, defined as having a single property called p1 of type Integer.

Examples of events of type E1 are { p1 = 1 } and { p1 = 2 }.

This input channel C1 is connected to a processor P, which is then connected to another (output) channel C2, whose output is sent to cache CACHE. Assume CACHE is keyed on p1.

Next, consider three CQL queries as follows which reside on processor P:

  1. SELECT * FROM C1[NOW]
  2. ISTREAM (SELECT * FROM C1[NOW])
  3. DSTREAM (SELECT * FROM C1[NOW])

Finally, send a single event e1 = { p1 = 1 } to C1.

The question is: what should be the content of the cache at the end for each one of these three queries?

To answer this, a couple of points need to be observed.

First, as I have mentioned in the past, CEP deals with two main concepts: that of a stream and that of a relation.

A stream is a container of events that is unbounded and only supports inserts. Why only inserts? Well, because there is no such thing as a stream delete. Think about it: how could we delete an event that has already happened?!

A relation, in contrast, is a container of events that is bounded by a certain number of events. A relation supports inserts, deletes, and updates.
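To make the contrast concrete, here is a minimal Python sketch of the two containers. The class names and the choice to key relation entries are assumptions made for illustration only; they are not part of CQL or of any Oracle API.

```python
class Stream:
    """Unbounded, append-only sequence of timestamped events (inserts only)."""

    def __init__(self):
        self._events = []

    def insert(self, timestamp, event):
        self._events.append((timestamp, event))

    def __len__(self):
        return len(self._events)


class Relation:
    """Bounded collection of entries that supports insert, update, and delete."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, row):
        self._rows[key] = row

    def update(self, key, row):
        if key not in self._rows:
            raise KeyError(key)
        self._rows[key] = row

    def delete(self, key):
        del self._rows[key]

    def __len__(self):
        return len(self._rows)


s = Stream()
s.insert(1, {"p1": 1})
s.insert(2, {"p1": 2})

r = Relation()
r.insert(1, {"p1": 1})
r.update(1, {"p1": 10})
r.delete(1)

print(len(s), len(r))        # 2 0
print(hasattr(s, "delete"))  # False
```

The stream deliberately has no delete or update method: once an event has happened, it can only be followed by more events.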

Second, remember that a cache is treated like a table, or more precisely, like a relation, and therefore supports insert, delete, and update operations. If the query outputs a stream, then the events inserted into the stream are mapped to inserts (or puts) into the cache. If the query outputs a relation, then inserts into the relation are likewise mapped to puts into the cache, whereas a delete on the relation becomes a remove of an entry in the cache.

Third, keep in mind that the operations ISTREAM (i.e. insert stream) and DSTREAM (i.e. delete stream) convert relations to streams. The former converts relation inserts into stream inserts, but ignores the relation updates and deletes. The latter converts relation deletes into stream inserts, and ignores relation inserts and updates (in reality, things are a bit more complicated, but let’s ignore the details for the time being).

Fourth, we want the answer as if time has moved on from ‘now’. For all purposes, say we are measuring time in seconds, we sent event e1 at time 1 second, and we want the answer at time 2 seconds.

I will post the answer in a follow-up post next week.

The crucial point of this exercise is to understand the difference between two of the most important CEP concepts: that of a STREAM and RELATION, and how they relate to each other.


Oracle OpenWorld 2011

September 10, 2011

For those attending OOW this year, I will be co-presenting two sessions:

  • Complex Event Processing and Business Activity Monitoring Best Practices (Venue / Room: Marriott Marquis – Salon 3/4, Date and Time: 10/3/11, 12:30 – 13:30)

In this first session, we talk about how to best integrate CEP and BAM. BAM (Business Activity Monitoring) is a great fit to CEP, as it can serve as the CEP dashboard for visualizing and acting on complex events that are found to be business related.

  • Using Real-Time GPS Data with Oracle Spatial and Oracle Complex Event Processing (Venue / Room: Marriott Marquis – Golden Gate C3, Date and Time: 10/3/11, 19:30 – 20:15)

In this second talk, we walk through our own and our customers’ real-world experience in using GPS together with Oracle Spatial and CEP. The combination of CEP and Spatial has become an important trend and a very useful scenario.

If you are in San Francisco at this time, please stop by to chat.


Event Processing Reference Architecture at DEBS 2010

July 29, 2010

Recently, the EPTS architecture working group, which I am part of, presented its reference architecture for event processing at DEBS 2010, which was held in Cambridge on July 12th. The presentation can be found at SlideShare.

The presentation first highlights general concepts around event processing and reference architecture models, the latter based upon IEEE. This is needed so that we can normalize the architectures of the different vendors and players into a cohesive set. Next, individual architectures were presented by University of Trento (Themis Palpanas), TIBCO (Paul Vincent), Oracle (myself), and IBM (Catherine Moxey).

Below, I include the functional view for Oracle’s EP reference architecture:

The pattern for each architecture presentation is to describe different views of the system, in particular a conceptual view, a logical view, a functional view, and a deployment view. Finally, a common event-processing use-case, specifically the Fast-Flower-Delivery use-case, was selected to be mapped to each architecture, thus showing how each architecture models and solves this same problem.

Having presented the different architectures, we then merge them all into a single reference model, which becomes the EPTS reference architecture for Event Processing.

What are the next steps?

We need to further select event processing use-cases and to continue applying them to the reference architecture, hopefully fine-tuning it and expanding it. In particular, I feel we should tackle some distributed CEP scenarios, in an attempt to improve our deployment models and validate the logical and functional views.

Furthermore, I should mention that at Oracle we are also working on a general EDA architecture that collaborates with SOA. More on this subject later.


New Point Release for Oracle CEP 11gR1

May 6, 2010

This week, Oracle announced the release of Oracle CEP 11gR1 11.1.1.3.

Even though it is a point release, there are noteworthy improvements and features:

Integration of CQL with Java

CQL (or any other event processing language) allows the authoring of event processing applications at a higher level of abstraction. This makes such languages less suitable for dealing with low-level tasks, such as String manipulation and other programming-in-the-small problems, and they lack the richness of the libraries of general-purpose programming languages (e.g. Java), which have been built over several years of usage.

In this new release of Oracle CEP, we solve this problem by fully integrating the Java programming language into CQL. This is done at the type-system level, rather than through User-Defined Functions or call-outs, allowing the usage of Java classes (e.g. constructors, methods, fields) directly in CQL in a blended form.

[Figure: CQL and Java]

In this example, we make use of the Java class Date, by invoking its constructor, and then we invoke the instance method toString() on the new object.

The JDK has several useful utility classes, such as Date, String, and the regular-expression classes, making it a perfect choice for CQL.

Integration of CQL and Spatial

Location tracking and CEP go hand-in-hand. One example of a spatial-related CEP application is automobile traffic monitoring, where the automobile location is received as a continuous event stream.

Oracle CEP now supports the direct usage of spatial types (e.g. Geometry) and spatial functions in CQL, as shown by the next example, which verifies if “the current location of a bus is contained within a pre-determined arrival location”.

 

[Figure: CQL and Spatial]

One very important aspect of this integration is that indexing of the spatial types (e.g. Geometry) is also handled in the appropriate form. In other words, not only is a user able to leverage the spatial package, but OCEP also takes care of using the right indexing mechanism for the spatial data, such as an R-tree instead of a hash-based index.
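The kind of containment check mentioned above can be illustrated outside CQL with plain geometry. The Python sketch below uses a standard ray-casting test; it does not use Oracle Spatial, and the function name `contains`, the rectangular arrival area, and the bus positions are all made up for the example. It also scans a single area directly, whereas a spatial index such as an R-tree would be used to narrow down candidate areas first.

```python
def contains(polygon, point):
    """Ray-casting test: is `point` inside the polygon given as (x, y) vertices?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical arrival area (a 4 x 3 rectangle) and a stream of bus positions.
arrival_area = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
bus_positions = [(5.0, 1.0), (4.5, 1.5), (3.0, 2.0)]

arrived = [contains(arrival_area, p) for p in bus_positions]
print(arrived)  # [False, False, True]
```

Only the last position falls inside the arrival area, which is the moment such a query would report the bus as having arrived.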

High-Availability Adapters

CEP applications are characterized by their quick response time. This also applies to highly available CEP applications, hence a common approach for achieving high availability in CEP systems is to use an active/active architecture.

In the previous release of OCEP, several APIs were made available for OCEP developers to create their active/active HA CEP solutions.

[Figure: HA OCEP app]

In this new release, we go a step further and provide several built-in adapters to aid in the creation of HA OCEP applications. Amongst these, there are HA adapters that synchronize time in the upstream portions of the EPN, and synchronize the events in the downstream portions of the EPN, as illustrated in the previous figure.

Much More…

There is much more. Please check the documentation for the full set of features and their details, but here are some other examples:

  • Visualizer’s Event Injection and Trace, which allows a user to easily and dynamically send and receive events into and from a running application without having to write any code
  • Manage the revision history of an OCEP configuration
  • Deploy OCEP application libraries, facilitating re-use of adapters, JDBC drivers, and event-types
  • Support for a WITHIN clause in CQL’s pattern matching to limit the amount of time to wait for a pattern to be detected
  • Create aliases in CQL, facilitating the creation and management of complex queries
  • Support for TABLE functions in CQL, thus allowing a query to invoke a function that returns a full set of rows (i.e. a table)
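To give a feel for what a time limit on pattern detection means, the Python sketch below looks for an A event followed by a B event and gives up on a pending A once too much time has passed. It is a conceptual model only: it is not CQL syntax and does not claim to reproduce the exact semantics of the WITHIN clause, and the function name `match_a_then_b` and the `(timestamp, kind)` event layout are assumptions. Events are expected in timestamp order.

```python
def match_a_then_b(events, within):
    """Return (t_a, t_b) pairs where a B follows an A no more than `within` ticks later."""
    matches = []
    pending_a = None  # timestamp of the most recent unmatched A, if any
    for t, kind in events:
        # Stop waiting for a B once the time limit for the pending A has passed.
        if pending_a is not None and t - pending_a > within:
            pending_a = None
        if kind == "A":
            pending_a = t
        elif kind == "B" and pending_a is not None:
            matches.append((pending_a, t))
            pending_a = None
    return matches

events = [(1, "A"), (3, "B"), (10, "A"), (20, "B")]
print(match_a_then_b(events, within=5))  # [(1, 3)]
```

The second A at t = 10 never matches, because its B arrives 10 ticks later, beyond the limit of 5.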