Defining Model Queries using EMF-IncQuery

Outdated content

The contents of this page have been superseded by the following page on the Eclipse Wiki: 

http://wiki.eclipse.org/EMFIncQuery/UserDocumentation/QueryDevelopment

 

 

 

 

 

 

 

 

Overview

EMF-IncQuery provides an intuitive user interface to create, edit and debug queries. It assists users with various wizards, supports the editing of query definitions (patterns) in an Xtext-based editor and also provides query debugging features with the Query Explorer view.

The main components:

Use cases

New EMF-IncQuery project

EMF-IncQuery project is the container of all EMF-IncQuery related artifacts. The wizard can be found under the EMF-IncQuery category.

New EMF-IncQuery query definition

The wizard first guides you through the creation of the query container (package) which is similar to the creation of a Java class. The second page helps you create an initial pattern within your new .eiq file. Here you can name your pattern, select the packages that you want to use in your patterns and add pattern parameters with type specification. After finishing with the wizard a new .eiq file will be created with the pattern specified on the second page.

New EMF-IncQuery generator model

If you wish to work with an EMF domain that is present as a source project in your workspace, you have to create an EMF-IncQuery generator model that will contain a reference to the .genmodel file of your EMF domain. Inside the file you must specify the resource URI of the genmodel file.

Editing queries

It is best to start with the built-in pattern template (available as a code completion feature). The details of the EMF-IncQuery Query Language are discussed on the language documentation page. The School example might also provide useful starting points.

Executing and debugging queries

The Query Explorer View can be used to execute and debug the queries on EMF instance models. The main features of the Query Explorer view are:

Advanced use cases

Pattern Annotations supported by the Query Explorer

The IncQuery Language allows to use annotations for pattern definitions to fine-tune the behavior of a query. The following annotations are supported by the Query Explorer:

See the hover help (provided by the query editor) for more details on annotations.

Parameterized queries

In the Query Explorer you can define filters for the registered patterns. This specification is, however, instance model and pattern-specific (does not apply for the same pattern under different instance model). The aim of this facility is to narrow down the match sets of specific patterns when debugging. The following steps are required to specify a filter:

See the following picture for an example filtering from the school example.

Wildcard mode

Settable on the EMF-IncQuery preference settings page: in wildcard mode, every aspect of the EMF model is automatically indexed, as opposed to only indexing model elements and features relevant to the currently registered patterns; thus patterns can be registered and unregistered without re-traversing the model. This is typically useful during query development. Turn off wildcard mode to decrease the memory usage while working with very large models.

Query Language

Overview

Language concepts

For the query language, we reuse the concepts of graph patterns (which is a key concept in many graph transformation tools) as a concise and easy way to specify complex structural model queries. These graph-based queries can capture interrelated constellations of EMF objects, with the following benefits:

  • the language is expressive and provides powerful features such as negation or counting,
  • graph patterns are composable and reusable,
  • queries can be evaluated with great freedom, i.e. input and output parameters can be selected at run-time,
  • some frequently encountered shortcomings of EMF’s interfaces are addressed:
    • easy and efficient enumeration of all instances of a class regardless of location,
    • simple backwards navigation along all kinds of references (even without eOpposite)
    • finding objects based on attribute value.

Basic documentation

The current version of the IncQuery Graph Pattern language (IQPL) owes many of its syntax to the VTCL language of model transformation framework VIATRA2. If you would like to read more on the foundations of the new language, we kindly point you to our ICMT 2011 paper (important note: the most up-to-date IncQuery language syntax differs slightly from the examples of the ICMT paper).   

Short syntax guide

See also the language tutorial and the School example.

  1. Import declarations are required to indicate which EMF packages are referenced in the query definitions.
    1. Use the registered namespace URI for import declarations. Content assist is available:
      import "http://my.own.ePackage.nsUri/1.0"
  2. Enclose pattern definitions in a package:
    package my.own.patterns
  3. Introduce a pattern by the pattern keyword, a pattern name, and a list of parameter variables. Then enclose in curly braces a list of constraints that define when the pattern should match.
    pattern myPattern(A,B,C) = {...pattern contraints...}
  4. Disjunction ("or") can be expressed by linking several pattern bodies with the or keyword:
    pattern myPattern(A,B,C) = {... pattern contraints ...} or {... pattern constraints ...}
  5. The most basic pattern constraints are type declarations: use EClasses, ERelations and EAttributes. The data types should also be fine.
    1. An EClass constraint expressing that the variable MyEntityVariable must take a value that is an EObject of the class MyClass (from EPackage my.own.ePackage, as imported above) looks like this:
      MyClass(MyEntityVariable);
    2. A relation constraint for the EReference MyReference expressing that MyEntityVariable is of eClass MyClass and its MyReference EReference is pointing to TheReferencedEntity (or if MyReference is many-valued, then it is one of the target object contained in the EList), as seen below:
      MyClass.MyReference(MyEntityVariable, TheReferencedEntity);
    3. A relation constraint for an EAttribute, asserting that TheAttributeVariable is the String/Integer/etc. object that is the MyAttribute value of MyEntityVariable, looks exactly the same as the EReference constraint:
      MyClass.MyAttribute(MyEntityVariable, TheAttributeVariable);
    4. Such reference navigations can be chained; the last step may either be a reference or attribute traversal:
      MyClass.MyReference.ReferenceFromThere.AnotherReference.MyAttribute(MyEntityVariable, MyString);
    5. Pattern parameters can be suffixed by a type declaration, that will be valid in each pattern body. Here is an alternative way to express the type of variable B:
      pattern myPattern(A,B : MyClass,C) = {...pattern contraints...};
    6. (You will probably not need this, but EDatatype type constraints can be applied on attribute values, with a syntax similar to that used for EObjects, and with the additional semantics that the attribute value must come from the model, not just any int/String/etc. computed e.g. by counting):
      MyDatatype(MyAttributeVariable);
      or for the built-in datatypes (import the Ecore metamodel):
      EString(MyAttributeVariable);
  6. By default, each variable you define may be equal to each other variable in a query. This is especially important to know when using attributes or multiple variables with the same type (or supertype).
    1. For example, if you have two variables MyClass(SomeObj1), MyClass(SomeObj2), SomeObj1 and SomeObj2 may match to the same EObject.
    2. If you want to declare that two variables mustn't be equal, you can write:
      SomeObj1 != SomeObj2;
    3. If you want to declare, that two variables must be the same, you can write:
      SomeObj1 == SomeObj2;
  7. Pattern composition: you can reuse a previously define pattern by caling it in a different pattern's body:
    find otherPattern(OneParameter, OtherParameter, ThirdParameter);
  8. You can express negation with the neg keyword. Those actual parameters of the negative pattern call that are not used elsewhere in the calling body will be quantified; this means that the calling pattern only matches if no substitution of these calling variables could be found. See examples in order to understand. The below constraint asserts that for the given value of the (elsewhere defined) variable MyEntityVariable, the pattern neighborPattern does not match for any values of OtherParameter (not mentioned elsewhere).
    neg find neighborPattern(MyEntityVariable, OtherParameter);
  9. In the above constraints, wherever you could use an (attribute) variable in a pattern body, you can also use:
    1. Constant literals of primitive types, such as 10, or "Hello World".
    2. Constant literals of enumeration types, such as MyEEnum::MY_LITERAL
    3. Aggregation of multiple matches of a called pattern into a single value. Currently match counting is supported, in a syntax analogous to negative pattern calls:
      HowManyNeighbors == count find neighborPattern(MyEntityVariable, OtherParameter);
    4. Attribute expression evaluation: coming soon.
  10. Additional attribute constraints using the check() construct:
    check((A as Integer) > (S as String).length());
  11. One can also use the transitive closure of binary patterns in a pattern call, such as the transitive closure of pattern friend:
    find friend+(MyGuy, FriendOfAFriendOfAFriend);

Limitations (as of IncQuery 0.6)

  1. Meta-level queries (instanceOf etc.) will not currently work (although Ecore models can be processed as any other model). 
  2. Make sure that the result of the check() expressions can change ONLY IF one of the variables defined in the query changes.
    1. In practice, a good rule of thumb is to only use attribute variables or other scalar values in a check(), no EObjects.
    2. In particular, do not call non-constant methods of EObjects in a check(). Use attribute values instead,  if necessary converted to the native type using (SomeInt as Integer) and co, so as to help the type inference.
      1. For example, you CAN use check((Name as String).contains(SomeString as String)).
      2. But You MUSTN'T use check((SomeObject as MyClass).name.contains((SomeString as String))  as the name of SomeObject can change without SomeObject changing!
  3. Use only well-behaving derived references or attributes. Better yet, reimplement the derived feature using queries. Regular derived features are not supported in patterns (except the ones in the Ecore metamodel, which are well-behaving by default) as they can have arbitrary Java implementations and EMF-IncQuery is unable to predict when their value will change.

See advanced issues for additional topics, such as attribute handling.

Step-by-step tutorial

Note: the built-in cheat sheet of EMF-IncQuery should also provide help in getting started.

Creating your first query 

First, create an "EMF-IncQuery project" using the new project wizard (New -> Project...) in the EMF-IncQuery category. After setting the desired name for the project, it is created with the following source folders:
  •  src (for manually written code)
  •  src-gen (for generated code)
 
Use the "EMF-Incquery query definition" new file wizard (New -> Other...) in the EMF-IncQuery category to create a query definition. New queries must be put into a source folder, e.g. into src. After setting the package, file name and (optionally) a pattern name for the first pattern, the .eiq file is created with an empty pattern.

In order to develop queries using a given Ecore model, it has to be available either as a registered package in the running Eclipse, or included in a workspace project as a resource. In the second case, you can only test your queries dynamically, the generated code may not work. In the .eiq file, the Ecore metamodels are imported using the "import" keyword after the package declaration:

If you are using a metamodel that is not installed as a bundle but is available as a workspace project, you can use the EMF-IncQuery Generator Model (generator.eiqgen) to point to the genmodel files of the used metamodels. Once you add the resource URI to the eiqgen file, you can import with the nsUri in the eiq file.
 
Once some metamodels are imported, it is possible to fill out the body of the pattern. The pattern below will match every EObject in an EMF model:

Save then right click in the editor area and select "EMF-IncQuery -> Register patterns in query explorer" to dynamically load pattern definitions into the Query Explorer view (it should be open by default at the bottom right part in the Plug-in Development perspective).
 
Open any .ecore model in the "Ecore Model Editor" and click on the "Load Model" button (green triangle) in the Query Explorer view (with the editor with the model active) to initiate the pattern matching of the registered patterns in the opened model. You should see (in the Query Explorer) a tree view with the list of registered patterns each containing a list of matches. Click on any of the matches, and the parameter-value pairs are loaded into the table on the right side.

Using other structural constraints

Apart from type constraints, you can also write path expressions between variables. The following pattern will show each attribute of the classes in an Ecore metamodel:

Path expressions can include more than one part, e.g. if you would instead need the names of each attribute in a class:

Note that the EClass.eAttributes(Cls,Attr) expression will imply that Cls must be an EClass, therefore, the first type constraint is not neccessary, and similarly name points to an EString:

Note, that currently this will cause type inference for the generated Match class to fail, so the parameters Cls and Name will be returned as Objects.

Finally, you can add type restrictions to your parameters as well, which behave the same way as declaring a simple type constraint:

Check expressions and constants

It is possible to add constraints for the values of attribute variables in a query, these are called check expressions in EMF-IncQuery. For example, if you would like to ensure, that the name of the selected EClass is "MyClass", it can be done as follows:

Similarly, you can use other methods inside a check, for example compare numbers (after similar casting) or even use methods of the attribute values, such as String.contains(). Note, that you should only use methods which (1) have deterministic results, so they return the same value for the same EObject and (2) their results only change when the EObject itself changes, otherwise incremental updates will not work correctly.

If you only need equality checks for attribute values, you can simply replace the variable with the constant value, e.g.:

Injectivity constraints

By default, if you have two variables in the same query with the same type, then these variables may be matched to the same EObject during evaluation. This behavior is called injectivity and can be demonstrated with the following example:

In this case, if the MyClass EClass has an attribute, then the query will have a match where Cls=MyClass and Cls2=MyClass.

If you want to explicitely declare, that two variables must always take on the same value, you can write:

Although in such cases you can just remove Cls2 and use Cls in each occurrence.

It is a more interesting case, when you declare, that two variables must never be the same. This can be written like this:

Query composition

It is possible to create query compositions, where a complex query only matches if other queries have matches with given parameters. This allows the separation of concerns in a complex query and even increases the performance of the evaluation if done correctly.

For example, the examples from the previous section can be implemented by reusing queries from earlier sections, e.g.:

It is important, that you can use the same variable in multiple find expressions, e.g.:

Negative application conditions

Finally, you can use negative application conditions (NAC), for indicating that if a given query matches with the given parameters, then the query containing the NAC must not match. For example:

Here, the MyClass EClass must not have any attributes, since that would mean a match for the negative application condition. Note that the Attr variable in the NAC is not used anywhere else in the query. In such cases, the Attr variable will not have a concrete value inside the query, as no positive constraints for its existence is given. For illustration, the following query is NOT equal to the previous one:

Note, that here, Attr must exist, but it must not be an eAttribute of Cls, since that would mean the NAC matches.

Non-existence of attributes

Unset or null-valued attributes (or references) simply won't match, as there is no referenced EObject or attribute value to substitute in the target pattern variable. If you are especially looking for these, use a negative application condition, e.g.:

Aggregation of match sets

In some cases, you may want to check that there is a given number of matches for a query and you are not interested what exactly the matches are. In these cases, you can use aggregation to assign the number of matches to a variable. For example, you can count the number of attributes for a class:

The result of count is always an Integer, so you can use it as a regular variable in other parts of the query, e.g. in a check:

More complex Ecore query example

For a more complex example on Ecore queries, see our dedicated example.