Providing incremental updates for queries, part I

After discussing the importance of modeling and the concept of model queries (with OCL and EMF-IncQuery), one of the introductory blog posts presented the challenge posed by model evolution. Incremental query evluation was suggested then as a solution, without going into the details of how it can be achieved for a query language such as the one of EMF-IncQuery. Taking advantage of the previously discussed foundations in formal logic, the applied incremental strategy will be outlined here.

Elementary model queries

In analogy to elementary query statements introduced during the discussion of the propositional logic-based language subset, elementary model queries are the following:

  • retrieve the model elements of a certain type,
  • retrieve pairs of model elements, where the first one points to the second one using a reference of a given type,
  • retrieve pairs of a model element and an attribute value, the the given attribute of the model element takes the associated value

The first task is obtaining the results to these elementary queries. This involves traversing the entire model, and gather each object or edge that conforms to the types indicated in the queries. The results sets can then be cached for later use.

The second problem is that of providing incremental updates to the results of elementary queries. This requires us to register update notification listeners for the entire model. Whener the model is modified somehow, no matter how the change is performed and what initiated it, the listeners will be notified of this change. The result can be transformed and filtered, to find out which of the result caches should be updated and how.

For example, the elementary query "retrieve all school classes and students where the student is a member of the class" will be associated with a cache of class-student pairs. When a student is moved from one class to another, the model releases change notifications which are translated by the listeners into the action of removing the obsolete entry from the cache, and adding the updated one.

Note that equality and inequality were mentioned before as atomic formulas alongside these elementary model queries. Their results sets, however, can not and need not be cached in advance; they will be used as mere filters, as shown in the next example.

Connectives

Once incrementally maintained caches have been established for simpler queries, the results of more complex queries constructed from them using conjunctive, disjunctive and negative connectives, as well as quantification. The initial cache population is performed by applying set operations on the results of simlper queries: union for discjunction, relational join for conjunction, difference for negation, and relational projection for quantification.

Similarly to how elementary query caches reacted to changes of the model, the result of connectives has to be updated whenever the results of the constituent simple queries are.

For example, the set of all pairs of classmates is expressed as the following query:

pattern classmates(s1: Student, s2: Student, cl: SchoolClass) = {
  SchoolClass.students(cl, s1);
  SchoolClass.students(cl, s2);
  s1 != s2;
}

Note that this query is a conjunction of two (equivalent) elementary queries, and an inequality statement. The match set of the pattern is therefore derived using a relational operation called "join" from the set of school class - student pairs, with an inequality filtering prepended. And when elementary query results for "SchoolClass.students" are updated by an insertion (i.e. a new student joins a class), other students from the same class are looked up from the cached query results, and thus new matches of the pattern are produced, ready to be stored in the cached match set (after filtering out the singular case where the inequality statement does not hold).

Attributes

It has been mentioned before that attribute expressions are outside the domain of the fundamental formal logic underpinning the query language, therefore they cannot be, in general, evaluated incrementally. However, all such attribute expressions are required to be deterministic, which enables the incremental evaluator to reuse previously computed values.

For example, consider the following pattern:

pattern courseTuitionFee(c: Course, fee) = {
    Course.weight(c, weight);
    fee == eval(Math.max(450.0, 100.0 + weight * 40.0));
}

When the value of the weight attribute is assigned for a new course, the tuition fee will be computed using a complex formula, and the match (the pair formed of the course and its newly computed fee) will be stored in the incremental cache. When the weight of the course is changed, the course-weight pair will be removed from the cache of the constituent elementary query; the fee that was once computed for course can be looked up, and the result removed from the match set cache.

Acknowledgement: this research was supported by the European Union and the State of Hungary, co-financed by the European Social Fund in the framework of TÁMOP 4.2.4. A/-11-1-2012-0001 ‘National Excellence Program’.