Change or die: model queries on evolving models

...the only constant

Modeling and model queries were discussed in my previous blog posts. Now it is time to turn our attention to a great challenge: that of evolving models.

In a model-driven engineering process, models usually do not exist as static, immutable truths; they undergo constant evolution. This implies that any previously conducted model analysis must be re-evaluated, and the impact can extend to other models derived from the changed one as well. The evolution may be caused by requirement changes (potentially as late as years after the delivery of the system), by the creation of ever newer model versions on a shorter time scale in iterative development methodologies, or simply by fixing problems detected by model validation. In fact, model editing itself consists of a sequence of small, atomic manipulation operations; this can also be regarded as continuous evolution of the model, during which immediate feedback from model validation, for example, would be useful.

Spare any change? The cost of evolution

Repeatedly processing a (large-scale) model after each small change can lead to significant performance issues. It can be more advantageous to apply incremental evaluation techniques that take the evolving nature of the model into account. In certain use cases (e.g. well-formedness checks), incremental queries have a great performance advantage [see benchmarks].

Source incrementality is the property of model processing that only the changed parts of the source model are re-evaluated. One of the central topics of my thesis and my current research is the efficient evaluation of queries against evolving models by providing source incrementality.

Target incrementality, on the other hand, means that when an evolving model is processed or transformed, only the necessary parts of the result (which is a potentially large model) are modified; there is no need to recreate the result from scratch. Beyond direct gains in performance, this property has the benefit that connections and references between the result model and other external models are left intact and need not be recreated. Moreover, if the target model contains pieces of information that do not stem from the source model (such as platform-specific design decisions in a PSM mapped from a PIM), then the lack of target incrementality would lead to outright information loss.
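The information-loss argument can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: models are plain dictionaries, the "transformation" merely upper-cases a name, and `table_name` stands in for a hand-made platform-specific decision. Note that the sketch is target-incremental only; it still scans the whole source.

```python
def regenerate(source):
    """Non-incremental: rebuild the whole target model from the source."""
    return {key: {"derived": value.upper()} for key, value in source.items()}


def sync_incrementally(source, target):
    """Target-incremental: modify only orphaned, missing, or stale entries."""
    # Remove target entries whose source element no longer exists.
    for key in list(target):
        if key not in source:
            del target[key]
    # Create missing entries and refresh stale derived values in place,
    # leaving any other information stored in the entry untouched.
    for key, value in source.items():
        entry = target.setdefault(key, {})
        derived = value.upper()
        if entry.get("derived") != derived:
            entry["derived"] = derived
    return target


source = {"order": "order", "customer": "customer"}
target = regenerate(source)
target["order"]["table_name"] = "T_ORDERS"  # decision made by hand in the target

source["customer"] = "client"    # the source model evolves
source["invoice"] = "invoice"

rebuilt = regenerate(source)                  # hand-made decision is gone
synced = sync_incrementally(source, target)   # hand-made decision survives
```

Here `rebuilt` no longer contains `table_name`, while `synced` keeps it and still reflects the renamed and the newly added source elements.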

After model evolution, the traditional approach restores the logical correspondence between source and target models by re-executing the transformation (which is efficient if it is both source and target incremental). A live transformation, however, is continuously active, immediately reacting to changes of the source model by keeping the target model in sync. In this case, both source and target incrementality are strongly recommended.

Incremental model queries: an example

In the case of model queries, the challenge is source incrementality. Let's see how the example from the previous post (multiple students with the same name in the same class) can be evaluated incrementally. First, we need to evaluate the query on the model once; this can be done as usual. The query results will be stored, so that they can be cheaply retrieved again. So far, so good. The interesting part comes when we change the model.
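The initial, one-off evaluation might look like the following minimal sketch. The data layout is an assumption for illustration (a dictionary mapping a student id to a name and a class), not the metamodel from the previous post.

```python
from collections import defaultdict
from itertools import combinations


def find_name_conflicts(students):
    """Batch evaluation: scan the whole model, return all conflicting pairs.

    `students` maps a student id to a (name, class) tuple.
    """
    # Group student ids by (class, name); any group of two or more is a conflict.
    groups = defaultdict(list)
    for sid, (name, cls) in students.items():
        groups[(cls, name)].append(sid)

    conflicts = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            conflicts.add((a, b))
    return conflicts


students = {
    "s1": ("Anna", "9A"),
    "s2": ("Anna", "9A"),
    "s3": ("Bence", "9A"),
    "s4": ("Anna", "9B"),
}

# Evaluate once and keep the result, so later reads are cheap.
cached_conflicts = find_name_conflicts(students)
```

With this sample data, the only stored conflict is the pair of the two Annas in class 9A; the Anna in 9B is in a different class and does not count.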

What happens if a student is transferred from one class to another? We have to check whether any students in the new class share their name with the transfer student; if so, they are reported as new name conflicts in the query result. But we are not done yet: the old class may have had other students with the same name. These name conflicts no longer hold after the transfer, so the obsolete conflicts have to be removed from the query results.

What happens if a student changes her name, for example due to marriage? We have to add any new conflicts with students in the same class who share the new name. Analogously, conflicts with students in the class who shared the old name no longer hold and have to be removed.

What happens if a new homeroom teacher is assigned to supervise the class? Fortunately, nothing; this kind of change has no impact on our query results.
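The three cases above can be sketched by hand as follows. This is only an illustration of manual incremental maintenance under assumed names (`NameConflictIndex` and its methods are hypothetical); it is not how EMF-IncQuery works internally.

```python
from collections import defaultdict


class NameConflictIndex:
    """Keeps the set of same-name, same-class student pairs up to date."""

    def __init__(self):
        self.student = {}               # student id -> (name, class)
        self.groups = defaultdict(set)  # (class, name) -> student ids
        self.teacher = {}               # class -> homeroom teacher
        self.conflicts = set()          # frozenset({id_a, id_b})

    def _leave(self, sid):
        # Remove the student from the old group and drop obsolete conflicts.
        name, cls = self.student.pop(sid)
        group = self.groups[(cls, name)]
        group.discard(sid)
        for other in group:
            self.conflicts.discard(frozenset((sid, other)))

    def _join(self, sid, name, cls):
        # Report a new conflict with every student already in the new group.
        group = self.groups[(cls, name)]
        for other in group:
            self.conflicts.add(frozenset((sid, other)))
        group.add(sid)
        self.student[sid] = (name, cls)

    def add_student(self, sid, name, cls):
        self._join(sid, name, cls)

    def transfer(self, sid, new_cls):
        name, _ = self.student[sid]
        self._leave(sid)
        self._join(sid, name, new_cls)

    def rename(self, sid, new_name):
        _, cls = self.student[sid]
        self._leave(sid)
        self._join(sid, new_name, cls)

    def assign_teacher(self, cls, teacher):
        # Irrelevant to the query: the conflict set is not touched.
        self.teacher[cls] = teacher


index = NameConflictIndex()
index.add_student("s1", "Anna", "9A")
index.add_student("s2", "Anna", "9A")
index.add_student("s3", "Anna", "9B")
index.transfer("s2", "9B")   # s1/s2 conflict removed, s2/s3 conflict added
index.rename("s3", "Kata")   # s2/s3 conflict removed
index.assign_teacher("9B", "Ms. Kovacs")
```

Each update only visits the students who share a class and a name with the changed student, rather than the whole model. Even in this tiny case, the bookkeeping for removing obsolete conflicts is easy to get wrong.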

Incremental model queries over EMF

The previous short example was enough to demonstrate that providing incremental maintenance for query results is a time-consuming and error-prone task to implement. Fortunately, if one uses a declarative query specification, it is possible, at least in theory, to lift this burden off the shoulders of the developer.

One of the key selling points of EMF-IncQuery (see previous post) is that it provides source incrementality out-of-the-box, in a completely automatic way, for queries over EMF models formulated using graph patterns.

On the other hand, the industrially well-known formalism of OCL supports only batch evaluation by default. There are some solutions [Cabot], however, that aim to provide incremental evaluation. The goal of my research is to provide such a solution, where a large subset of OCL is translated into graph patterns, to be evaluated incrementally by EMF-IncQuery.

Acknowledgement: this research was realized in the frames of TÁMOP 4.2.4. A/1-11-1-2012-0001 „National Excellence Program – Elaborating and operating an inland student and researcher personal support system”. The project was subsidized by the European Union and co-financed by the European Social Fund.