1340 words

Index Definition

Querying without indexes requires iteration over entity objects in the database one by one.
This may take a significant amount of time if many entity objects have to be examined. Using proper indexes the iteration can be avoided and complex queries over millions of objects can be executed quickly. Index management introduces overhead in terms of maintenance time and storage space, so deciding which fields to define with indexes should be done carefully.

Single Field Index

JPA does not define a standard method for declaring indexes, but JDO does. The following entity definition uses JDO’s @Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
and @Uniquejavax.jdo.annotations.UniqueJDO annotationAnnotation for a database unique constraint.
See JavaDoc Reference Page...
annotations to define indexes:

@Uniquejavax.jdo.annotations.UniqueJDO annotationAnnotation for a database unique constraint.
See JavaDoc Reference Page...
represents a unique index that prevents duplicate values in the indexed field.
A PersistenceException is thrown on commit (or flush) if different entities have the same value in a unique field (similar to how primary keys behave).

@Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
represents either an ordinary index with no unique constraint or a unique index if unique="true" is specified (the default is false).

The optional name attribute has no specific role but might be presented in the ObjectDB Explorer and in logging.

When an entity object is stored in the database every indexed field must contain either null or a value of one of the following persistable types:

  • Primitive types: boolean, byte, short, char, int, long, float, double.
  • Equivalent wrapper classes from package java.lang:
    Byte, Short, Character, Integer, Long, Float, Double.
  • java.math.BigInteger, java.math.BigDecimal.
  • java.lang.String.
  • java.util.Date, java.sql.Date, java.sql.Time, java.sql.Timestamp.
  • Any enum type.
  • Reference to an entity object.
  • Arrays and collections that contain values of the above types (including null).

Indexes can only be defined on ordinary persistent fields (not on primary key / version fields).

Composite Index

A composite index is an index on more than one persistent field. It is defined by specifying multiple fields in the membersIndex.membersannotation elementMember (field and property) names that compose this index.
See JavaDoc Reference Page...
attribute of the the @Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
or @Uniquejavax.jdo.annotations.UniqueJDO annotationAnnotation for a database unique constraint.
See JavaDoc Reference Page...
annotations:

When indexed fields are specified explicitly in the members attribute, as shown above, the @Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
or @Uniquejavax.jdo.annotations.UniqueJDO annotationAnnotation for a database unique constraint.
See JavaDoc Reference Page...
annotation can be attached to either the class or to any persistent field in the class.

Multiple @Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
annotations can be wrapped with an @Indicesjavax.jdo.annotations.IndicesJDO annotationAnnotation for a group of index constraints.
See JavaDoc Reference Page...
annotation:

Similarly, the @Uniquesjavax.jdo.annotations.UniquesJDO annotationAnnotation for a group of unique constraints.
See JavaDoc Reference Page...
annotation can wrap multiple @Uniquejavax.jdo.annotations.UniqueJDO annotationAnnotation for a database unique constraint.
See JavaDoc Reference Page...
annotations.

As shown above, the members attribute can also be used for a single field index. This is useful if you want centralize your index definitions at the top of the class and is equivalent to omitting members and attaching the @Indexjavax.jdo.annotations.IndexJDO annotationAnnotation for a database index.
See JavaDoc Reference Page...
annotation directly to the indexed field.

Multi Part Path Index

The members attribute is also required in order to define indexes on multi part paths:

Indexes must always refer to values that are stored as part of the entity. Therefore, indexes on multi part paths are only allowed when using embeddable classes as fields of embedded object are stored as part of the containing entity.

Composite indexes on multi part paths are also allowed:

Notice that the paths include a collection, so multiple values will be maintained by the index for every entity.

Multi part paths in a composite index must have the same length. Therefore, the following index definition is invalid:

Indexes in Queries

ObjectDB manages a BTree for every index. A BTree is an ordered map data structure that ObjectDB maintains in the file system rather than in memory. The keys of the BTree are all the unique values in the indexed field (or arrays of values in composite indexes), in all the entities of that class (including subclasses). Every key is associated with a list of references to the entities that contain that value.

Indexes require maintenance time and consume storage space. Therefore, using indexes wisely requires an understanding of how ObjectDB uses indexes to accelerate query execution.

Indexes are especially efficient in lookup and range queries:

  SELECT p FROM Point p WHERE p.x = 100
  SELECT p FROM Point p WHERE p.x BETWEEN 50 AND 80
  SELECT p FROM Point p WHERE p.x >= 50 AND p.x <= 80

By using an index on field x, ObjectDB can find the results using a range scan, which is very efficient because only branches of the BTree that are relevant are iterated.

A composite index on fields x and y can also be used for the queries above. In this case the order of the fields in the composite index definition is important, because it determines which field is used as the BTree's primary sort key and which field is used as a secondary sort key. If x is specified first, the BTree is ordered by x values, so a range scan on x values is supported. On the other hand, if y is specified first, the BTree is ordered by y values, so a full BTree scan is required to find all the objects with relevant x values. A full index scan is less efficient than a range scan, but might be still more efficient than iteration over the entities data.

A composite index on fields x and y enables quick execution of the following queries:

  SELECT p FROM Point p WHERE p.x = 100 AND p.y = 100
  SELECT p FROM Point p WHERE p.x = 100 AND p.y BETWEEN 50 AND 80

For the second query above, the index order should be x first and y second.

Indexes on collections are useful in JOIN queries:

  SELECT d FROM Document d JOIN d.words w WHERE w = 'JPA'

The Document entity class contains a words field whose type is List<String>. The query retrieves the documents that contain "JPA". An index on words will manage for every word all the documents that contain it. Therefore, using such index, the query above can be executed by a quick index range scan.

The same collection index can also be used for executing the following query:

  SELECT d FROM Document d JOIN d.words w WHERE LENGTH(w) >= 10

But this time a full index scan is required because the index uses the lexicographic order of the words and is not ordered by the length of the words.

The bottom line is that if there is an index that contains all the fields in the WHERE clause of a given query that query will run faster as it will be able to, at the very least, utilize a full index scan. Even better, if the field order as defined in the index matches the field order used in the WHERE clause of the query a more efficient range scan can be performed.

ObjectDB also uses indexes for sorting results and for projection:

  SELECT MIN(p.x) FROM Point p WHERE p.x < p.y ORDER BY p.y

In the above example, a composite index on fields x and y would cover all the fields in the query (including the WHERE and ORDER BY clauses) saving the need to access the entities themselves. Instead, because of the comparison in the WHERE clause, a full index scan would be performed. Additionally, if field y was the first field defined in the index the results would already be produced in the requested order (the order of the scan) thus eliminating the need for a separate sort.

Finally, indexes are also used in MIN and MAX queries:

  SELECT MIN(p.x), MAX(p.x) FROM Point p

Given an index on field x ObjectDB can simply return the first and last key in the BTree, without any iteration.


This documentation explains how to use JPA in the context of the ObjectDB Object Database but mostly relevant
also for ORM JPA implementations, such as Hibernate (and HQL), EclipseLink, TopLink, OpenJPA and DataNucleus.
ObjectDB is not an ORM JPA implementation but an Object Database (ODBMS) for Java with built in JPA 2 support.