Index Definition

Querying without indexes requires sequential iteration over entities in the database. If many entities must be examined, this process can take a significant amount of time. Using indexes avoids this full iteration, allowing complex queries over millions of objects to execute quickly. However, index management introduces overhead in maintenance time and storage space. Therefore, you should choose which fields to index carefully.

Single-field index

The following entity definition uses JDO's @Index and @Unique annotations to define indexes. Note that JPA's @Index annotation cannot be applied to fields.

import javax.jdo.annotations.Index;
import javax.jdo.annotations.Unique;
@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
public class EntityWithSimpleIndex {
    @Index String indexedField1;
    @Index(unique="true") int indexedField2; // unique
    @Index(name="i3") int indexedField3;
    @Unique Integer indexedField4; // unique
    @Unique(name="u2") Date indexedField5; // unique
}

The @Unique annotation defines a unique index, which prevents duplicate values in the indexed field.
If you try to commit or flush a transaction where different entities have the same value in a unique field, ObjectDB throws a PersistenceException. This behavior is similar to that of primary keys.

The @Index annotation defines either a non-unique index or a unique index if you specify unique="true". The default value for unique is false.

The optional name attribute has no specific function but might appear in the ObjectDB Explorer and in log files.

When an entity is stored in the database, every indexed field must contain either null or a value of one of the following persistable types:

  • Primitive types: boolean, byte, short, char, int, long, float, double
  • Equivalent wrapper classes from the java.lang package:
    Byte, Short, Character, Integer, Long, Float, Double
  • java.math.BigInteger, java.math.BigDecimal
  • java.lang.String
  • java.util.Date, java.sql.Date, java.sql.Time, java.sql.Timestamp
  • Any enum type
  • A reference to an entity
  • Arrays and collections that contain values of the above types, including null

You can define indexes only on ordinary persistent fields, not on primary key or version fields.

Composite index

A composite index is an index on more than one persistent field. Define a composite index by specifying multiple fields in the members attribute of the @Index or @Unique annotation:

@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
@Index(members={"lastName","firstName"})
public class EntityWithCompositeIndex {
    String firstName;
    String lastName;
}

When you specify indexed fields in the members attribute, you can attach the @Index or @Unique annotation to either the class or any persistent field in the class.

You can wrap multiple @Index annotations with an @Indices annotation:

@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
@Indices({
    @Index(members={"lastName","firstName"}),
    @Index(members={"firstName"}, unique="true")
})
public class EntityWithCompositeIndex {
    String firstName;
    String lastName;
}

Similarly, the @Uniques annotation can wrap multiple @Unique annotations.

You can also use the members attribute for a single-field index. This approach is useful for centralizing index definitions at the class level. It is equivalent to omitting the members attribute and attaching the @Index annotation directly to the indexed field.

Multi-part path index

The members attribute is also required to define indexes on multi-part paths:

@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
@Index(members={"address.city"})
public class Employee {
    String firstName;
    String lastName;
    Address address;
     :
}
@Embeddable
jakarta.persistence.EmbeddableDeclares a type whose instances are stored as an intrinsic part of an owning entity, sharing the identity of the entity.class Address {
    String street;
    String city;
     :
}

Indexes must always refer to values that are stored as part of the entity. Therefore, indexes on multi-part paths are allowed only when using embeddable classes, because the fields of an embedded object are stored as part of the containing entity.

Composite indexes on multi-part paths are also allowed:

@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
@Index(members={"addresses.city,addresses.street"})
public class Employee {
     :
    List<Address> addresses;
     :
}

Note that the paths include a collection, so the index maintains multiple values for each entity.

Multi-part paths in a composite index must have the same length. Therefore, the following index definition is invalid:

@Entityjakarta.persistence.EntityDeclares that the annotated class is an entity.
@Index(members={"lastName", "address.city"}) // INVALID
public class Employee {
    String firstName;
    String lastName;
    Address address;
     :
}

Indexes in queries

ObjectDB manages a B-tree for every index. A B-tree is an ordered map data structure that ObjectDB maintains in the file system, not in memory. The B-tree keys are all the unique values from the indexed field (or arrays of values for composite indexes) across all entities of that class, including its subclasses. Each key is associated with a list of references to the entities that contain that value.

Indexes require maintenance time and consume storage space. Therefore, to use indexes effectively, you must understand how ObjectDB uses them to accelerate query execution.

Indexes are especially efficient for lookup and range queries:

  SELECT p FROM Point p WHERE p.x = 100
  SELECT p FROM Point p WHERE p.x BETWEEN 50 AND 80
  SELECT p FROM Point p WHERE p.x >= 50 AND p.x <= 80

By using an index on field x, ObjectDB can find the results with a range scan. This operation is very efficient because it iterates only over the relevant branches of the B-tree.

A composite index on fields x and y can also be used for the queries above. In this case, the order of the fields in the composite index definition is important because it determines which field is the B-tree's primary sort key and which is the secondary sort key. If x is specified first, the B-tree is ordered by x values, which supports a range scan on x. Conversely, if y is specified first, the B-tree is ordered by y values. In this case, a full B-tree scan is required to find all the objects with relevant x values. A full index scan is less efficient than a range scan but might still be more efficient than iterating over the entity data.

A composite index on fields x and y enables fast execution of the following queries:

  SELECT p FROM Point p WHERE p.x = 100 AND p.y = 100
  SELECT p FROM Point p WHERE p.x = 100 AND p.y BETWEEN 50 AND 80

For the second query above, the index order should be x first and y second.

Indexes on collections are useful for JOIN queries:

  SELECT d FROM Document d JOIN d.words w WHERE w = 'JPA'

The Document entity class contains a words field whose type is List<String>. The query retrieves all documents that contain the word "JPA". An index on the words field maps each word to a list of all the documents that contain it. Therefore, by using this index, the query can be executed with a fast index lookup.

The same collection index can also execute the following query:

  SELECT d FROM Document d JOIN d.words w WHERE LENGTH(w) >= 10

However, this query requires a full index scan because the index is sorted lexicographically by word, not by word length.

In summary, if an index contains all the fields in a query's WHERE clause, the query will run faster because it can, at a minimum, use a full index scan instead of iterating over all entities. Furthermore, if the field order in the index definition matches the field order in the WHERE clause, ObjectDB can perform a more efficient range scan.

ObjectDB also uses indexes for sorting results and for projection:

  SELECT MIN(p.x) FROM Point p WHERE p.x < p.y ORDER BY p.y

In this example, a composite index on fields x and y covers all the fields in the query, including the WHERE and ORDER BY clauses. This type of index is called a covering index, and it eliminates the need to access the entities themselves. Because of the comparison in the WHERE clause (p.x < p.y), a full index scan is required. Additionally, if field y is the first field in the index definition, the scan produces results in the requested order, which eliminates the need for a separate sorting step.

Finally, indexes are also used for MIN and MAX aggregate queries:

  SELECT MIN(p.x), MAX(p.x) FROM Point p

With an index on field x, ObjectDB can return the first and last keys in the B-tree without any iteration.