Concurrency at the EPN

January 3, 2011

Opher, from IBM, recently posted an interesting article explaining the need for an EPN.

As I have posted in the past, an Event Processing Network (EPN) is a directed graph that specifies the flow of the events from and to event processing agents  (EPAs), or, for short, processors.

Opher’s posting raised a valid question: why do we need an EPN at all? Couldn’t we just let all processors receive all events and drop those that are not of interest?

He pointed out two advantages of using an EPN: first, it improves usability, and second, it improves efficiency.

I believe there is a third advantage: the EPN allows one to specify concurrency.

For example, the following EPN specifies three sequential processors A, B and C. It is clear from the graph that processor C will only commence processing its events after B has finished, and B likewise only processes its events after they have been processed by A.

Conversely, in the following EPN, processor B and C execute in parallel, only after the events have been processed by A.
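To make the two topologies concrete, here is a minimal sketch in Java that models an EPN as a directed acyclic graph and groups processors into stages. The class `EpnGraph` and its methods are hypothetical names chosen for illustration; they are not part of any product API. For a given event, processors in the same stage have no dependency on each other and so could run in parallel, while each stage waits for the previous one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only: an EPN as a directed acyclic graph.
final class EpnGraph {
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    // Registers a processor if it is not yet known.
    EpnGraph processor(String name) {
        downstream.putIfAbsent(name, new ArrayList<>());
        inDegree.putIfAbsent(name, 0);
        return this;
    }

    // Adds a directed edge: events flow from 'from' to 'to'.
    EpnGraph connect(String from, String to) {
        processor(from);
        processor(to);
        downstream.get(from).add(to);
        inDegree.merge(to, 1, Integer::sum);
        return this;
    }

    // Kahn-style layering: a processor lands in a stage only after all of
    // its upstream processors have been placed in earlier stages.
    List<List<String>> stages() {
        Map<String, Integer> remaining = new LinkedHashMap<>(inDegree);
        List<String> current = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                current.add(entry.getKey());
            }
        }
        List<List<String>> result = new ArrayList<>();
        while (!current.isEmpty()) {
            result.add(current);
            List<String> next = new ArrayList<>();
            for (String name : current) {
                for (String target : downstream.get(name)) {
                    if (remaining.merge(target, -1, Integer::sum) == 0) {
                        next.add(target);
                    }
                }
            }
            current = next;
        }
        return result;
    }
}
```

With this sketch, `new EpnGraph().connect("A", "B").connect("B", "C").stages()` yields three stages of one processor each, whereas `new EpnGraph().connect("A", "B").connect("A", "C").stages()` yields A alone followed by a stage holding both B and C.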

Events are concurrent by nature, so being able to specify concurrency is a very important aspect of designing a CEP system. Granted, there are cases when the concurrency model can be inferred from the queries (i.e. rules) themselves by looking at their dependencies; however, that is not always the case, or rather, that may not in itself be enough.

By the way, do you see any resemblance between an EPN and a Petri net? This is not merely coincidental, but alas it is the subject of a later posting.

Event Processing Reference Architecture at DEBS 2010

July 29, 2010

Recently, the EPTS architecture working group, which I am part of, presented its reference architecture for event processing at DEBS 2010, which was held in Cambridge on July 12th. The presentation can be found at SlideShare.

The presentation first highlights general concepts around event processing and reference architecture models, the latter based upon IEEE. This is needed for us to be able to normalize the architectures of the different vendors and players into a cohesive set. Next, individual architectures were presented by University of Trento (Themis Palpanas), TIBCO (Paul Vincent), Oracle (myself), and IBM (Catherine Moxey).

Below is the functional view of Oracle’s EP reference architecture:

The pattern for each architecture presentation is to describe different views of the system, in particular a conceptual view, a logical view, a functional view, and a deployment view. Finally, a common event-processing use-case, specifically the Fast-Flower-Delivery use-case, was selected to be mapped to each architecture, thus showing how each architecture models and solves this same problem.

Having presented the different architectures, we then consolidate them all into a single reference model, which becomes the EPTS reference architecture for Event Processing.

What are the next steps?

We need to further select event processing use-cases and to continue applying them to the reference architecture, hopefully fine-tuning it and expanding it. In particular, I feel we should tackle some distributed CEP scenarios, in an attempt to improve our deployment models and validate the logical and functional views.

Furthermore, I should also mention that at Oracle we are also working on a general EDA architecture that collaborates with SOA. More on this subject later.

Oracle CEP 11gR1 – official support for CQL

July 1, 2009

Today Oracle announced the release of Oracle CEP version 11gR1, and I am glad to be part of the team behind it.

Oracle CEP 11gR1 is the next release after 10.3, which was the re-branded release of BEA’s WebLogic Event Server.

There are several new features in this release, but the flagship feature is the inclusion of CQL (Continuous Query Language) as the default event processing language.

CQL is important in several aspects:

  • It brings us closer to converging towards a standard language for event processing
  • It is based on a solid theoretical foundation, leveraging relational calculus and extending it to include the concept of stream of events
  • Full support for pattern matching, a new stream-only operator
  • Solid engineering, including several query plan optimizations; for example, an unbounded stream that uses the insert stream operator is converted to a ‘NOW’ window using the relation-stream operator
  • Implementation abstractions; for example, a relation can be bound to either an RDBMS table or a Coherence cache, without any query re-write.

Other features worth mentioning are:

And more coming soon…

Best Complex Event Processing Solution

July 15, 2008

Recently, Waters published its rankings of the best solutions and services for 2008.

I am very glad to see WebLogic Event Server, re-branded as Oracle CEP, as the winner of the best complex event processing (CEP) solution.

There is still plenty for us to do, but I do think we have come a long way in the past two to three years, and we have constantly tried to innovate, both at the container level and in the programming model.

Interestingly enough, there is a separate Best Streaming Data Management Solution category, which was awarded to the company StreamBase.

Personally, I do think there is an implementation difference between streaming data management systems (SDMS), whose roots lie deep in DBMS technology, and complex event processing (CEP) systems, a term which I believe was coined by David Luckham and which focuses on event relationships (e.g. causality, aggregation). The key phrase is implementation difference, as there is a large overlap in the use-cases that both address.

Regardless, I find it intriguing that Waters not only does not state the differences between the categories, but also uses the term CEP several times in the SDMS category.

I guess the verdict is that there is still confusion amongst the experts regarding event and stream processing… And that both products must be very good.

The EDA Programming Model

July 26, 2007

In the previous post, A Short Dissertation on EDA, I tried to describe what EDA is and why it is important.

Having established that EDA is indeed desirable, the next step is to determine how one actually authors an event-driven application. That is, what are the new abstractions, models, and design patterns that we should be using for EDA?

This is analogous to the problem that enterprises faced before Java EE came about. Before Java EE, developers had to go to great lengths to create enterprise applications in Java. EE brought the necessary abstractions to facilitate this, by defining, among other things, the concepts of Session Beans, Entity Beans, Message-Driven Beans, and Enterprise Archives (i.e. EAR), which packaged and assembled these entities together.

So what abstractions do we need for EDA? That is, what would be a good programming model for creating event-driven applications?

Not surprisingly, the needed abstractions for the EDA programming model are:

  • Event Sources and Event Sinks: application code that generates events and receives events, respectively
  • Streams: channels through which events flow; these channels don’t hold on to events, they actively stream them
  • Processors: agents capable of processing events; the processing function or capability varies per agent
  • Event Types: metadata defining the properties of events

Developers author event-driven applications by creating instances of these abstractions.

For example, consider a simple financial market pricing application. The goal of this pricing application is to determine the best price to quote to clients who wish to trade stocks. This event-driven application creates two event sources, each receiving stock tick events from a different exchange market. For the sake of simplicity, the stock tick event contains only two event properties: its stock symbol (e.g. BEAS) and the latest traded price of the stock. The application further defines a processor that is configured to calculate and output the price of a stock symbol as the average of the prices received from the two event sources. Finally, there is a single event sink that publishes the calculated average stock price to a well-known JMS destination. The event sources are connected to the processor by having the event sources send events to a common stream that the processor listens to. Likewise, the processor is connected to the event sink by sending its event, the average stock price, to a shared stream that the event sink listens to.

Figure: event-driven pricing application

The events flow from the two event sources, to the first stream, then to the processor, then to the second stream, and finally to the event sink. This flow of events across the EDA components forms an Event Processing Network (EPN).
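The pricing example can be sketched in plain Java as follows. This is only an illustration of the abstractions, not the WebLogic Event Server API: the names `StockTick`, `EventStream`, `AveragingProcessor`, and `PricingEpn` are hypothetical, the sink is any `Consumer` rather than a real JMS publisher, and the processor is assumed to output a running average of every tick it has seen for a symbol.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Event type: the properties of a stock tick (symbol and latest traded price).
record StockTick(String symbol, double price) {}

// Stream: hands each event straight to its listeners and does not store it.
final class EventStream<E> {
    private final List<Consumer<E>> listeners = new ArrayList<>();

    void addListener(Consumer<E> listener) {
        listeners.add(listener);
    }

    void send(E event) {
        for (Consumer<E> listener : listeners) {
            listener.accept(event);
        }
    }
}

// Processor: emits the running average price per symbol (an assumption of this sketch).
final class AveragingProcessor implements Consumer<StockTick> {
    private final Map<String, double[]> totals = new HashMap<>(); // symbol -> {sum, count}
    private final EventStream<StockTick> output;

    AveragingProcessor(EventStream<StockTick> output) {
        this.output = output;
    }

    @Override
    public void accept(StockTick tick) {
        double[] total = totals.computeIfAbsent(tick.symbol(), s -> new double[2]);
        total[0] += tick.price();
        total[1] += 1;
        output.send(new StockTick(tick.symbol(), total[0] / total[1]));
    }
}

// Assembles sources -> stream -> processor -> stream -> sink.
final class PricingEpn {
    // Returns the input stream that both event sources would send their ticks to.
    static EventStream<StockTick> assemble(Consumer<StockTick> sink) {
        EventStream<StockTick> input = new EventStream<>();
        EventStream<StockTick> output = new EventStream<>();
        input.addListener(new AveragingProcessor(output));
        output.addListener(sink);
        return input;
    }
}
```

Each event source simply calls `send` on the stream returned by `assemble`; neither source knows about the processor or the sink, which is the point of routing everything through streams.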

An EPN is another abstraction of the EDA programming model. Formally, it is defined as:

  • A directed graph of event sources, event sinks, streams, and processors, all collaborating towards fulfilling the function of an event-driven application. An EPN models horizontal composition and vertical layering of event processing.

Essentially, an event-driven application specifies an EPN, and the EPN assembles the EDA components (e.g. event sources, event sinks, processors, streams) together.

In the previous example, why do you need a stream to begin with? Couldn’t one just link together the event sources to the processor and then to the event sink? Actually, you could, but streams are useful for several reasons: 

  • Streams de-couple event sources from event sinks; this is similar to what a JMS destination does to JMS publishers and subscribers
  • Streams manage the flow of events; this is done by providing queuing capability, with different rejection policies, and by providing different dispatching mechanisms, such as synchronous and asynchronous dispatching
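As a rough illustration of the queuing side of this, the following Java sketch shows a bounded stream with two rejection policies. The names `QueuedStream` and `RejectionPolicy`, and the two policies themselves, are assumptions made for the example and do not describe how any particular product configures its streams. Producers call `offer` and return immediately; a consumer drains the queue later, which corresponds to asynchronous dispatching.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical policies for what to do when the queue is full.
enum RejectionPolicy { DROP_NEWEST, DROP_OLDEST }

// Hypothetical bounded stream: queues events and applies a rejection policy.
final class QueuedStream<E> {
    private final Deque<E> queue = new ArrayDeque<>();
    private final int capacity;
    private final RejectionPolicy policy;

    QueuedStream(int capacity, RejectionPolicy policy) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
        this.policy = policy;
    }

    // Returns false when the offered event itself was rejected.
    synchronized boolean offer(E event) {
        if (queue.size() == capacity) {
            if (policy == RejectionPolicy.DROP_NEWEST) {
                return false;
            }
            queue.pollFirst(); // DROP_OLDEST: make room by discarding the oldest event
        }
        queue.addLast(event);
        return true;
    }

    // Dispatches all queued events to the listener, in arrival order,
    // and returns how many were dispatched.
    synchronized int drainTo(Consumer<E> listener) {
        int dispatched = 0;
        E event;
        while ((event = queue.pollFirst()) != null) {
            listener.accept(event);
            dispatched++;
        }
        return dispatched;
    }
}
```

A synchronous variant would skip the queue and invoke the listener directly inside `offer`, trading the decoupling for lower latency.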

As long as we are defining a new programming model, let’s also consider some other lessons that we have picked up along the way. For instance, it is important that the specification of an EPN be declarative; in other words, we want to assemble the event-driven application by using some declarative mechanism, such as XML. Furthermore, it is equally important that we keep the business logic de-coupled from the technology. Finally, we would like to pay-as-you-go for functionality. This means that if you don’t need a service, for example persistence or security, then you should not need to configure, reference (e.g. implement some technology interface), or otherwise be impacted by this service that you don’t intend to use to begin with.

WebLogic Event Server (EvS) has native support for this EDA programming model.

In EvS, a user application is an EPN, and EvS has first-class support for creating event sources, event sinks, streams, processors, and event types.

Event sources and event sinks may be bound to different pluggable protocols, such as JMS. An event source or event sink that is bound to some specific protocol and is responsible for converting or passing along external events to and from the EPN is known as an Adapter. Processors support BEA’s Event Processing Language. Java Beans may be registered in the EPN as Event Types. Streams support dynamic configuration of queuing and concurrency parameters.

The EPN itself is specified in an XML configuration file, called the EPN assembly file.

To be able to support the de-coupling of the user code from (infrastructure) dependencies, we have created our own dependency injection container, supporting both setter and constructor injection…

Just kidding! The EPN assembly file is a custom extension of a Spring framework context XML configuration file. What this means is that we leverage Spring’s Inversion of Control (IoC) container in its entirety, thus allowing one to seamlessly use Spring beans (and any other Spring feature, such as AOP) in the assembly of an EPN. EvS defines its own custom tags for the EDA components, hence a developer does not need to understand how the Spring framework works to create event-driven applications. The EDA programming model extension to Spring is called Hot-Spring.

Back to our pricing application example: if you consider that the event sources and event sinks are re-using existing adapter implementations that understand the market exchange protocol and JMS, respectively, then the whole EDA application can be authored without the developer having to write a single line of Java code! The developer only has to specify the EPN assembly file and configure the processor and adapters that it is using, all done through XML files or through a command-line interface (CLI) Administration tool.

What if the developer needs to use some custom business logic somewhere in the EPN? Well, the developer can always create Java POJOs (Plain-Old-Java-Objects) functioning in the roles of event sources or event sinks and assemble them together in the EPN. This reflects a common motto of the Spring community, “simple things are easily done, complicated things are still possible”.

Finally, after having authored the EvS application, how do you deploy the application to EvS?

The EvS deployment unit is a Spring-OSGi bundle. What is this? To begin with, a bundle is a regular JAR file. The Spring aspect of it means that this JAR file must contain a Spring context configuration, which in the case of EvS is an EPN assembly file, within the directory META-INF/spring. The second aspect of this is OSGi. OSGi is a service-oriented, component-based backplane. Why do you care? Well, generally speaking the developer does not need to care about this. Essentially, an OSGi bundle contains special OSGi entries in its MANIFEST.MF file within the JAR file that specify, among other things, service dependencies and service advertisement. The fact that an EvS application is an OSGi bundle helps promote maintainability, re-use, and interoperability. The idea here is that we are bringing SOA directly to the code.

In summary, if you must remember only two things from this article, please remember:

  • In the same way that Java EE created a new programming model for server-side Java enterprise development, there is a need for a new EDA programming model
  • The EDA programming model must not only abstract and provide first-class support for the EDA concepts, but it must also promote re-use, openness, and dependency de-coupling

We have tried to achieve these in WebLogic Event Server. Please let us know how we have done.

A Short Dissertation on EDA

November 13, 2006

There is a lot of literature on EDA, event stream processing, CEP, etc.; that is, on events and event processing technologies. Although much of it is very good, it can get a little overwhelming. Below, I attempt to describe EDA and how EDA relates to other technologies, such as SOA, real-time, and Java, in a pragmatic form.

Event-driven architecture is an architectural style composed of decoupled applications that interact by exchanging events. These applications are called event-driven applications. Event-driven applications may play the role of an emitter of events, and of a responder or processor of events.

Event-driven architecture is important because the real world is event-driven. One example is the financial world, in which trader applications react to events (or changes) made to the financial exchange market. Event-driven situations should be modeled by event-driven architecture.

Event-driven applications are sense-and-respond applications, that is, applications that react to and process events.

Events are state changes that are meaningful to an observer. Generally, events are in the form of a protocol message. Events may be simple or complex. Simple events contain no meaningful member events. Complex events contain meaningful member events, which are significant on their own. Examples of simple events are a stock bid event and a stock offer event; an example of a complex event is a stock trade event, which includes both a bid event and an offer event.
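The distinction between simple and complex events can be sketched with a few Java types. The names `MarketEvent`, `Bid`, `Offer`, and `Trade` are hypothetical and chosen only to mirror the stock example; the sketch assumes that an event is complex exactly when it has member events.

```java
import java.util.List;

// Hypothetical event abstraction: every event can list its member events.
interface MarketEvent {
    List<MarketEvent> members();

    // An event is complex when it contains at least one member event.
    default boolean isComplex() {
        return !members().isEmpty();
    }
}

// Simple event: a stock bid with no member events.
record Bid(String symbol, double price) implements MarketEvent {
    public List<MarketEvent> members() {
        return List.of();
    }
}

// Simple event: a stock offer with no member events.
record Offer(String symbol, double price) implements MarketEvent {
    public List<MarketEvent> members() {
        return List.of();
    }
}

// Complex event: a trade made up of a bid and an offer, each meaningful on its own.
record Trade(Bid bid, Offer offer) implements MarketEvent {
    public List<MarketEvent> members() {
        return List.of(bid, offer);
    }
}
```

Here a `Trade` still exposes its `Bid` and `Offer`, so a consumer interested only in bids can inspect the member events rather than the trade as a whole.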

Events may be delivered through different mediums, two of which are channels and streams. Channels are non-active virtual pipes, that is, a producer component is responsible for inserting data into one side of the pipe and another consumer component is responsible for removing the data at the other side of the pipe. The data is stored in the channel as long as it is not removed by a component. Of course, channels may be bounded, in which case a channel may stop accepting new data or purge existing data as it sees fit. Examples of channels are JMS queues and topics. In contrast, streams are active virtual pipes, that is, they support a continuous flow of data. If a consumer component does not directly listen to the stream, it is likely to miss some data. Because streams do not need to store data, streams are able to support a high volume of streaming data flowing through them. An example of a stream is an over-the-air TV broadcast.

Having received events, the next task of an event-driven application is to process the events. Event Processing is defined as a computation stage that consumes and optionally generates events. Currently, as specified by Roy Schulte, there are four ways to categorize event processing:
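The difference in behaviour can be sketched in a few lines of Java. Both classes are hypothetical and single-threaded for brevity: `PassiveChannel` stores data until a consumer removes it, while `ActiveStream` only pushes data to whoever is listening at that moment, so a listener that attaches late misses earlier data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Channel: a non-active pipe that holds data until a consumer removes it.
final class PassiveChannel<T> {
    private final Queue<T> stored = new ArrayDeque<>();

    void put(T data) {
        stored.add(data);
    }

    // Returns the oldest stored item, or null when the channel is empty.
    T take() {
        return stored.poll();
    }
}

// Stream: an active pipe that stores nothing and delivers only to current listeners.
final class ActiveStream<T> {
    private final List<Consumer<T>> listeners = new ArrayList<>();

    void listen(Consumer<T> listener) {
        listeners.add(listener);
    }

    void emit(T data) {
        for (Consumer<T> listener : listeners) {
            listener.accept(data);
        }
    }
}
```

Emitting on an `ActiveStream` with no listeners silently drops the data, whereas a `PassiveChannel` keeps it available for a later `take`.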

  • Event passing:
    Events are simply handed off between components; there is mostly no processing, and it generally deals only with simple events. Event-passing applications are asynchronous, staged, and triggered by the arrival of one event from a single event stream or channel. Sometimes they are referred to as message-driven or document-driven applications. Examples are simple pub-sub applications.
  • Event mediation (or brokering):
    Events are filtered, routed (e.g. content-based), and transformed (e.g. enriched). Event mediators are stateless, and deal with both simple and complex events; however, they do not synthesize new complex events of their own, that is, event mediators cannot combine (i.e. aggregate) simple events into complex events, mostly due to the fact that they do not keep state. Generally, there is a single event stream or channel fan-in, and multiple event streams or channels fan-out. Examples are integration brokers.
  • Complex Event Processing (CEP):
    Events are processed by matching for complex patterns, and for complex relationships, such as causality, timing, correlation and aggregation. CEP applications are stateful; simple and complex events are received from several event streams and new complex events may be synthesized. CEP applications must be able to handle a very high volume of events, and hence generally use only streams.
  • Non-linear Complex BPM:
    Event-based business processes modeling non-linear complex work flows. The business process is able to handle unpredictable situations, including complex patterns, and complex event relations.
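To illustrate the second category, here is a minimal Java sketch of a stateless event mediator with a single fan-in and multiple fan-outs. The class `Mediator` and its parameters are hypothetical. The output lists merely stand in for downstream channels; the mediator consults no earlier events when it filters, transforms, or routes, which is what makes it stateless in the sense used above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical mediator: filter, transform, then route on content.
final class Mediator<E> {
    private final Predicate<E> filter;
    private final Function<E, E> transform;
    private final Function<E, String> router;
    // Stand-ins for downstream channels, keyed by channel name.
    private final Map<String, List<E>> outputs = new LinkedHashMap<>();

    Mediator(Predicate<E> filter, Function<E, E> transform, Function<E, String> router) {
        this.filter = filter;
        this.transform = transform;
        this.router = router;
    }

    // Each event is handled on its own, with no reference to earlier events.
    void onEvent(E event) {
        if (!filter.test(event)) {
            return; // filtered out
        }
        E transformed = transform.apply(event);
        String channel = router.apply(transformed); // content-based routing
        outputs.computeIfAbsent(channel, name -> new ArrayList<>()).add(transformed);
    }

    List<E> output(String channel) {
        return outputs.getOrDefault(channel, List.of());
    }
}
```

A CEP agent differs precisely in that it would need to remember earlier events, for example to aggregate several simple events into one complex event.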

Event Stream Processing (ESP) is event processing solely on streams, as opposed to channels. Hence, CEP is always part of ESP; however ESP includes other event processing types, such as event passing and event mediation, when those are performed on streams, rather than on channels.

An event-driven application may play the roles of event source, event sink, or both. An event source sends events to event sinks. Note that event sources do not necessarily create the events, nor are event sinks necessarily the final consumers of events. Furthermore, event sources and event sinks are completely decoupled from each other:

  • An event source does not pass control to event sinks, unlike service consumers, which delegate work to providers; and
  • Event sinks do not provide services to event sources, unlike providers, whose work is initiated and consumed by consumers; and
  • One can add and remove event sources and sinks as needed without impacting other event sources and sinks.

How does EDA compare to SOA? That depends on how the loose term SOA is defined. If SOA is defined as an architecture that promotes re-use of modular, distributed components, then EDA is a type of SOA. If SOA is defined as an architecture where modules provide services to consumer modules, then EDA is not SOA.

The concepts previously described are based upon work from Roy Schulte, Mani Chandy, David Luckham, and others.

Next, let’s focus on real-time concepts.

Real-time is the capability of a system to ensure the timely and predictable execution of code. In other words, if a developer specifies that an object must be executed in the next 100 milliseconds (or in the next 100 minutes for that matter), a real-time infrastructure will guarantee the execution of this object within this temporal constraint.

Event-driven architectures are suitable for real-time. Event-driven applications are generally implemented using asynchronous mechanisms; this lack of synchronicity improves resource usage, which in turn helps guarantee real-time quality of service.

Objects that have temporal constraints are named schedulable objects. The system measures how well the temporal constraints are being met by means of a particular metric, for example, the number of missed deadlines. Schedulers order the execution of schedulable objects attempting to optimize these metrics. Schedulers make use of different algorithms or policies to do this, one of which is Rate Monotonic Analysis (RMA). RMA relies on thread priority as a scheduling parameter and determines that the highest priority should be assigned to the tasks with the shortest periods.
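As a small illustration of the priority rule, the following Java sketch assigns priorities in rate-monotonic fashion, taking "shortest" to mean the shortest period, as in standard rate-monotonic scheduling. The names `PeriodicTask` and `RateMonotonic` are hypothetical, and the sketch assumes that a larger number means a higher priority; it covers only the priority assignment, not a schedulability test.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical schedulable object described by its name and period.
record PeriodicTask(String name, long periodMillis) {}

final class RateMonotonic {
    // Maps each task name to a priority; larger numbers mean higher priority
    // (an assumption of this sketch). The shortest period gets the highest value.
    static Map<String, Integer> assignPriorities(List<PeriodicTask> tasks) {
        List<PeriodicTask> sorted = new ArrayList<>(tasks);
        sorted.sort(Comparator.comparingLong(PeriodicTask::periodMillis));
        Map<String, Integer> priorities = new LinkedHashMap<>();
        int priority = sorted.size();
        for (PeriodicTask task : sorted) {
            priorities.put(task.name(), priority--);
        }
        return priorities;
    }
}
```

Given tasks with periods of 1000, 10, and 100 milliseconds, the 10 millisecond task receives the highest priority and the 1000 millisecond task the lowest.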

Let’s re-consider CEP. CEP allows one to specify temporal constraints in the processing of events. For example, one can specify to match for an event that happens within 100 milliseconds of another event. Hence, CEP rules (e.g. queries) are essentially a type of schedulable object, and therefore a CEP agent must be a real-time agent.
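A temporal constraint of this kind can be sketched as a small stateful matcher in Java. The names `TimedEvent` and `FollowedWithin` are hypothetical, timestamps are assumed to be carried by the events in milliseconds, and the two event types are assumed to be different. The matcher remembers the most recent event of the first type and reports a match when an event of the second type arrives within the window.

```java
// Hypothetical event carrying a type label and a timestamp in milliseconds.
record TimedEvent(String type, long timestampMillis) {}

// Hypothetical matcher: does a 'secondType' event follow the most recent
// 'firstType' event within 'windowMillis'?
final class FollowedWithin {
    private final String firstType;
    private final String secondType;
    private final long windowMillis;
    private Long lastFirstSeenAt; // state kept between events; null until a first-type event arrives

    FollowedWithin(String firstType, String secondType, long windowMillis) {
        this.firstType = firstType;
        this.secondType = secondType;
        this.windowMillis = windowMillis;
    }

    // Returns true when this event completes the pattern.
    boolean onEvent(TimedEvent event) {
        if (event.type().equals(firstType)) {
            lastFirstSeenAt = event.timestampMillis();
            return false;
        }
        if (event.type().equals(secondType) && lastFirstSeenAt != null) {
            long elapsed = event.timestampMillis() - lastFirstSeenAt;
            return elapsed >= 0 && elapsed <= windowMillis;
        }
        return false;
    }
}
```

The need to keep `lastFirstSeenAt` between calls is a small instance of why CEP agents are stateful, and the window is the temporal constraint that ties the rule to real-time behaviour.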

Very loosely, CEP can be further characterized by two functions, a guarding function and an action function. The former determines whether an event should trigger a response, and the latter specifies the response to be taken if the guard is satisfied.

Consider a system that supports CEP agents whose action functions are coded in Java. This implies that the system must support the development and deployment of Java applications, and hence, in this regard, it must be to some extent a Java application server, or rather, as we have concluded previously, a real-time Java application server.
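The guard and action pairing maps naturally onto two functional interfaces in Java. The class `CepRule` below is a hypothetical sketch, not an API of any CEP product: the guard is a `Predicate` and the action is a `Consumer`.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical rule: a guarding function paired with an action function.
final class CepRule<E> {
    private final Predicate<E> guard;
    private final Consumer<E> action;

    CepRule(Predicate<E> guard, Consumer<E> action) {
        this.guard = guard;
        this.action = action;
    }

    // Runs the action only when the guard is satisfied; returns whether it fired.
    boolean onEvent(E event) {
        if (!guard.test(event)) {
            return false;
        }
        action.accept(event);
        return true;
    }
}
```

For instance, `new CepRule<Double>(price -> price > 100.0, alerts::add)` would record only prices above 100 in a list named `alerts`, where `alerts` is any `List<Double>` supplied by the caller.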

To be more exact, CEP Java action functions do not need the full services of a complete application server, for instance, part of the transactional, persistence, and security container services may not be needed. What is needed is a minimal-featured application server. This minimalist aspect is also applicable to the real-time capability. We do not need a full set of real-time features that enables the development of any type of applications, but rather a minimal set of real-time features that enables the support of CEP agents.

A system that supports CEP also supports other event processing types, such as event passing and event mediation. Therefore, a light-weight real-time Java application server that is able to host CEP agents is a good overall solution for achieving EDA.