ObjectDB

[ODB1] Chapter 4 - JDO Metadata

4.1  JDO Metadata Files

During JDO enhancement and later at runtime, ObjectDB determines whether or not each class is persistent. It searches for a JDO metadata description of each class in several .jdo files in a predefined order. If a metadata description is found, the class is persistent; if not, the class is transient.

Metadata for class a.b.X (a.b is the package name, X is the class name), whose class file is a/b/X.class, is searched in the following paths (in the order shown):

  • META-INF/package.jdo
  • WEB-INF/package.jdo
  • package.jdo
  • a/package.jdo
  • a/b/package.jdo
  • a/b/X.jdo

Metadata for class X in the default package is searched in the following paths (in the order shown):

  • META-INF/package.jdo
  • WEB-INF/package.jdo
  • package.jdo
  • X.jdo

A metadata file with the name X.jdo must be dedicated to a single class whose name is X. Metadata for multiple classes can be specified in a package.jdo file located in META-INF, WEB-INF or in any other path at the level of the classes or above. Determining where to put the metadata of every persistent class is your responsibility. When the metadata for a class is found, the search stops. Therefore, only the first metadata found for a class, in the search order specified above, takes effect.
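The search order above can be sketched in a few lines of Java. This is only an illustration of the documented order, not ObjectDB's actual lookup code; the class name MetadataSearchPath and the method candidates are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the .jdo files that would be consulted
// for a class, in the order given in section 4.1.
final class MetadataSearchPath {

    static List<String> candidates(String className) {
        List<String> paths = new ArrayList<>();
        // Locations that are checked first for every class.
        paths.add("META-INF/package.jdo");
        paths.add("WEB-INF/package.jdo");
        paths.add("package.jdo");

        // One package.jdo per package level, from the outermost inward.
        String[] parts = className.split("\\.");
        StringBuilder dir = new StringBuilder();
        for (int i = 0; i < parts.length - 1; i++) {
            dir.append(parts[i]).append('/');
            paths.add(dir + "package.jdo");
        }

        // Finally, the file dedicated to the class itself.
        paths.add(dir + parts[parts.length - 1] + ".jdo");
        return paths;
    }

    public static void main(String[] args) {
        candidates("a.b.X").forEach(System.out::println);
    }
}
```

Because the search stops at the first match, an entry for a.b.X in META-INF/package.jdo would hide any entry for the same class in a/b/X.jdo.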

4.2  Metadata for Classes

We start with a basic JDO metadata file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="">
    <class name="A" />
  </package>
  <package name="test">
    <class name="B" />
    <class name="C" persistence-capable-superclass="B" />
    <class name="D" requires-extent="false" />
  </package>
</jdo>

A JDO metadata file is an XML file with a single root element - <jdo>. The <jdo> root has one or more <package> sub elements. Each <package> element has one or more <class> sub elements. Both <package> and <class> elements have a required name attribute. The metadata above defines class A (in the default package) and classes B, C and D (in package test), as persistent.

In addition to the required name attribute, a <class> element can have one or more of the following optional attributes:

persistence-capable-superclass

The persistence-capable-superclass attribute usually specifies the direct super class if it is also persistent. In the above metadata example, class C is probably defined as a subclass of class B (using extends). But there is also another possibility. Class C might be a subclass of a non persistent class X, which is a subclass of class B. That is also a valid structure because JDO enables declaring a persistent class as a subclass of a non persistent class. Of course, in that case the fields of class X are not persistent fields, and when an instance of class C is stored, only persistent fields from classes B and C are stored. The closest persistent super class in the inheritance hierarchy must be specified using the persistence-capable-superclass attribute. This attribute is omitted only if there is no persistent super class anywhere in the inheritance hierarchy. When the super class is in the same package the package name can be omitted. Otherwise the full name of the super class, which includes the package name, has to be specified.
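The rule that the closest persistent super class must be named can be illustrated with plain Java reflection. The sketch below assumes the hierarchy described above (C extends X, X extends B, with X not listed in the metadata); the class SuperclassLookup and its method are hypothetical and are not part of ObjectDB or JDO.

```java
import java.util.Set;

// Hypothetical illustration: finds the value that would go into
// persistence-capable-superclass for a given class.
final class SuperclassLookup {

    static class B { }              // persistent
    static class X extends B { }    // not in the metadata, so transient
    static class C extends X { }    // persistent

    // Classes that are declared in the metadata in this example.
    static final Set<Class<?>> PERSISTENT = Set.of(B.class, C.class);

    // Walks up the inheritance chain and returns the first persistent
    // ancestor, or null when the attribute should be omitted.
    static Class<?> closestPersistentSuperclass(Class<?> type, Set<Class<?>> persistent) {
        for (Class<?> s = type.getSuperclass(); s != null; s = s.getSuperclass()) {
            if (persistent.contains(s)) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // X is skipped because it is not persistent, so B is reported for C.
        System.out.println(closestPersistentSuperclass(C.class, PERSISTENT).getSimpleName());
    }
}
```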

requires-extent (true | false)

By default, JDO manages an extent for every persistent class. An extent enables iteration over persistent instances of a class (including or excluding instances of subclasses) as well as execution of queries against the class instances. However, maintaining an extent for a class has some overhead in terms of time and storage space. When extent management is not needed, it can be omitted by specifying requires-extent="false", as shown above for class D. If requires-extent="false" is specified for a class, it must also be specified for its super class as declared in the persistence-capable-superclass attribute.

identity-type and objectid-class

The identity-type and objectid-class attributes, which are defined in the JDO specification, are ignored by ObjectDB if specified (as a pure object database, ObjectDB always uses datastore identity with its own object-id class).

4.3  Metadata for Fields

Unlike persistent classes, which must be listed in the JDO metadata, persistent field descriptions can often be omitted. In most cases, the default management of fields by ObjectDB is adequate. Metadata for fields is required only for changing the default. Therefore, only fields with modified behavior should be specified in the metadata.

The following file demonstrates JDO metadata for fields:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="test">
    <class name="A">
      <field name="f0" persistence-modifier="persistent" />
      <field name="f1" persistence-modifier="none" />
      <field name="f2" persistence-modifier="transactional" />
      <field name="f3" default-fetch-group="true" />
      <field name="f4" default-fetch-group="false" />
      <field name="f5" embedded="true" />
      <field name="f6" embedded="false" />
      <field name="f7" null-value="exception" />
      <field name="f8" null-value="none" />
      <field name="f9" null-value="default" />
    </class>
  </package>
</jdo>

A single persistent class, test.A, is declared by a <class> element containing <field> sub elements for 10 of its fields (there may also be other persistent fields in class A that are not specified in the metadata). Every field can have zero or more <field> elements. A <field> element may have several attributes (not just two as demonstrated above), but the name attribute, which specifies the name of a field in the class, is always required.

persistence-modifier (persistent | none | transactional)

The default rules for determining if a field is persistent are explained in section 3.2. The persistence-modifier attribute makes it possible to change the default. Specifying a persistent value, as demonstrated by field f0, changes a field that is transient by default to persistent. This is useful, for example, for a field with a transient modifier in the Java source (so that the field is transient in serialization and persistent in JDO), or for a field whose declared type is java.lang.Object or some interface but which holds only values of persistent types at runtime. Specifying a none value, as demonstrated by field f1, changes a field that is persistent by default to transient, as an alternative to the Java transient modifier (for example, when a field has to be persistent in serialization). A field that is declared as transactional, like f2 above, behaves similarly to a transient field because its value is never stored in the database. The main difference is that, on transaction rollback, it automatically returns to the value it had at the beginning of the transaction.
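The serialization half of the f0 case can be shown with the JDK alone. The sketch below only demonstrates what the Java transient modifier does during standard serialization; it does not involve JDO or ObjectDB, and the class Account and its fields are hypothetical names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {

    // Hypothetical class: 'sessionToken' is skipped by Java serialization.
    // In JDO metadata it could still be made persistent with
    // persistence-modifier="persistent".
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner = "alice";
        transient String sessionToken = "secret";
    }

    // Serializes the object to a byte array and reads it back.
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account());
        System.out.println(copy.owner);         // the regular field survives
        System.out.println(copy.sessionToken);  // the transient field is null
    }
}
```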

default-fetch-group (true | false)

The default-fetch-group attribute indicates that a field should be managed in a group with other fields. When a persistent object is retrieved from the database its fields are not ready yet. Only when the program accesses a field is the field value loaded automatically by ObjectDB from the database. If the field belongs to the default fetch group, values for all the fields in the group are also loaded. The default fetch group should include fields that are needed often and are relatively small. By default, the group contains all the fields with primitive types (e.g. int), types defined in java.lang (e.g. String and Integer), types defined in java.math (e.g. BigInteger), and java.util.Date. Collections, arrays and references to user defined classes are excluded by default. The default-fetch-group attribute can change the default, as demonstrated by fields f3 and f4.
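The default membership rule described above can be written as a small predicate. This is a literal reading of the paragraph, not ObjectDB's implementation; the class DefaultFetchGroupRule and the method inGroupByDefault are hypothetical names, and the sketch requires Java 9 or later for Class.getPackageName().

```java
import java.util.Date;

// Hypothetical predicate that mirrors the documented default:
// primitives, java.lang types, java.math types and java.util.Date are in
// the default fetch group; arrays, collections and other classes are not.
final class DefaultFetchGroupRule {

    static boolean inGroupByDefault(Class<?> fieldType) {
        if (fieldType.isArray()) {
            return false;               // arrays are excluded by default
        }
        if (fieldType.isPrimitive()) {
            return true;                // int, long, boolean, ...
        }
        String pkg = fieldType.getPackageName();
        return pkg.equals("java.lang")  // String, Integer, ...
            || pkg.equals("java.math")  // BigInteger, BigDecimal
            || fieldType == Date.class; // java.util.Date
    }

    public static void main(String[] args) {
        System.out.println(inGroupByDefault(String.class));
        System.out.println(inGroupByDefault(java.util.List.class));
    }
}
```

A field for which this predicate gives the wrong answer for your access pattern is a candidate for an explicit default-fetch-group attribute.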

embedded (true | false)

The embedded attribute is relevant for persistent reference fields. It indicates whether or not the content of the referred object should be stored as part of the referring object, as an embedded object. Embedded objects can reduce storage space and improve efficiency, but they do not have an object ID and cannot be shared by references from multiple objects. In addition, embedded objects of persistent classes are not included in the extents of their classes, so they cannot be queried directly. When the embedded attribute is not specified, ObjectDB embeds objects by default for all fields except fields whose type is a user defined persistent class or java.lang.Object. To use embedded objects for fields of user defined persistent classes, metadata has to be specified, as demonstrated by field f5. Wrapper objects, strings, dates, collections and arrays are embedded by default. To store them as non embedded objects (useful when the field is very large or rarely used), metadata has to be specified, as demonstrated by field f6.

null-value (exception | none | default)

The null-value attribute is also intended for persistent reference fields. It indicates whether the field can accept null values or not. If an exception value is specified, as demonstrated by field f7, a JDOUserException is thrown on any attempt to store a persistent object with a null value in that field. If the null-value attribute is not specified or specified with a none value, as demonstrated by field f8, null values are allowed. If a default value is specified, as demonstrated by field f9, null values are replaced at storage time with default values (e.g. new Integer(0) for a java.lang.Integer field, "" for a java.lang.String field, empty collection, 0 size array, and so on).
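The three null-value settings can be modeled as a small function applied at storage time. This is a conceptual sketch only: the enum Mode and the method apply are hypothetical, the default value is passed in by the caller rather than derived from the field type, and IllegalStateException stands in for the JDOUserException that ObjectDB throws.

```java
// Hypothetical model of the three null-value settings.
final class NullValuePolicy {

    enum Mode { EXCEPTION, NONE, DEFAULT }

    // Returns the value that would be stored for a field.
    static <T> T apply(T value, Mode mode, T defaultValue) {
        if (value != null) {
            return value;               // non-null values are never changed
        }
        switch (mode) {
            case EXCEPTION:
                // Stand-in for JDOUserException in this sketch.
                throw new IllegalStateException("null is not allowed in this field");
            case DEFAULT:
                return defaultValue;    // e.g. "" for a String field
            default:
                return null;            // NONE: null is stored as is
        }
    }

    public static void main(String[] args) {
        System.out.println(apply(null, Mode.DEFAULT, "").isEmpty());
        System.out.println(apply(null, Mode.NONE, ""));
    }
}
```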

primary-key

The primary-key attribute, defined in the JDO specification, is irrelevant when using datastore identity, which is the only object identity supported by ObjectDB. To enforce unique values in a field you can define a unique index as explained in section 4.5.

4.4  Arrays, Collections and Maps

Special XML sub elements are available for array, collection and map fields:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="test">
    <class name="B">
      <field name="f0" embedded="true">
        <array embedded-element="true" />
      </field>
      <field name="f1" embedded="true">
        <collection embedded-element="true" />
      </field>
      <field name="f2" embedded="true">
        <map embedded-key="true" embedded-value="true" />
      </field>
    </class>
  </package>
</jdo>

A <field> element representing a persistent field whose type is collection, map or array, can have a <collection>, <map> or <array> sub element, respectively.

embedded-element, embedded-key, embedded-value (true | false)

The embedded-element attribute indicates whether objects in a collection or array should be stored as embedded objects or not. To understand the difference between embedded and embedded-element, think about a collection field containing instances of a user defined persistent class. Specifying embedded="true" in the <field> element (which is the default for collections and arrays anyway) indicates that the contained references are embedded, but not the referenced objects themselves, because persistent class instances are not embedded by default. Specifying embedded-element="true" in the <collection> sub element indicates that the objects are also embedded, not just their references. If embedded="false" is specified in the <field> element, the objects are never embedded regardless of the embedded-element attribute, and even the references are stored externally. System types are embedded by default, so a collection of strings or dates is fully embedded by default. Specifying embedded-element="false" changes this so that every String or Date is stored as a non embedded object (with a unique object ID). The embedded-key and embedded-value attributes indicate whether the keys and the values of a map should be stored as embedded objects or not, similar to embedded-element, which applies only to collections and arrays.

element-type, key-type, value-type

The key-type and value-type attributes defined in the JDO specification for the <map> element are ignored by ObjectDB. The element-type attribute of the <collection> element has an effect only in index declarations, as explained in the next section.

4.5  Index Definition

Querying a large extent without indexes may take a significant amount of time because it requires iteration over all the class instances, one by one. With proper indexes, the iteration can be avoided, and complex queries over millions of objects can be executed quickly. Index management introduces overhead in terms of maintenance time and storage space, so deciding which fields to index should be done carefully. Indexes are not supported by the free database edition.

The JDO specification does not define a standard method of index declaration, so the declaration syntax is specific to ObjectDB. Every <class> element can have <extension> sub elements declaring indexes. The JDO standard provides <extension> for vendor specific declarations, so the metadata file remains JDO portable. Because index declarations are specified directly in the <class> element, <field> elements are not necessarily required for the fields for which indexes are declared.

The following metadata declares simple indexes for two fields. Field f0 has an ordinary index and field f1 has a unique index. A field that has a unique index must have unique values among persistent instances of the class. An exception is thrown on any attempt to store an object with a value that already exists in another persistent instance of the same class (or its subclasses).

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="test">
    <class name="A">
      <extension vendor-name="objectdb" key="index" value="f0" />
      <extension vendor-name="objectdb" key="unique-index" value="f1" />
    </class>
  </package>
</jdo>

Simple indexed fields (as shown above) can have any of the following types:

  • embedded value types:
    • primitive type (boolean, byte, short, char, int, long, float or double)
    • embedded wrapper (Boolean, Byte, Short, Character, Integer, Long, Float or Double)
    • embedded java.lang.String
    • embedded java.util.Date
  • external (non embedded) reference of any type

There are two types of indexes: value indexes for value type fields, and reference indexes for non embedded reference fields. Value indexes manage a reversed list of persistent objects for every value in the indexed field. Both equality (==, !=) and comparison (<, <=, >, >=) queries are supported by value indexes. Reference indexes manage a reversed list of persistent objects for every reference in the indexed field. Only equality (==, !=) queries are supported by reference indexes.
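Why a value index can answer comparison queries can be seen from a toy model: a sorted map from each field value to the list of objects holding that value. The sketch below is conceptual only and says nothing about how ObjectDB stores its indexes; the class ValueIndexSketch, its methods, and the use of strings as object IDs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy value index: field value -> IDs of the objects that hold it.
// Keeping the keys sorted is what makes range queries possible.
final class ValueIndexSketch {

    private final TreeMap<Integer, List<String>> index = new TreeMap<>();

    // Registers an object (by its ID) under its indexed field value.
    void put(int fieldValue, String objectId) {
        index.computeIfAbsent(fieldValue, k -> new ArrayList<>()).add(objectId);
    }

    // Equality query: a single lookup, no iteration over all objects.
    List<String> equalTo(int value) {
        return index.getOrDefault(value, Collections.emptyList());
    }

    // Comparison query: visits only the keys below the bound.
    List<String> lessThan(int bound) {
        List<String> result = new ArrayList<>();
        for (List<String> ids : index.headMap(bound, false).values()) {
            result.addAll(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        ValueIndexSketch idx = new ValueIndexSketch();
        idx.put(5, "a");
        idx.put(10, "b");
        idx.put(5, "c");
        System.out.println(idx.equalTo(5));
        System.out.println(idx.lessThan(10));
    }
}
```

A reference index can be pictured the same way with an unordered map keyed by object ID, which is why it supports equality queries but has no meaningful order for comparisons.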

Fields of value types (primitive types, wrapper types, String and Date) are embedded by default, but if, for instance, a String field is defined as non embedded, an index on that field is a reference index. In that case equality queries will check reference equality (as == operator in Java) and not value equality (as the equals(...) method).
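The difference between the two kinds of equality is the ordinary Java distinction between == and equals(...), shown here with the JDK alone. The class EqualityDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.Objects;

final class EqualityDemo {

    // Reference equality, as used by a reference index.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // Value equality, as used by a value index.
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        // Two distinct String objects with the same characters.
        String s1 = new String("abc");
        String s2 = new String("abc");
        System.out.println(sameReference(s1, s2)); // false
        System.out.println(sameValue(s1, s2));     // true
    }
}
```

With a non embedded String field, a query for "abc" would therefore match only objects that refer to the very same stored String object, not every object whose string has those characters.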

A reference field cannot have a direct index if it is defined as embedded (because the referenced objects are embedded and do not have object IDs). However, in this case, the fields of the embedded object can have indexes, as shown in the following metadata:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="test">
    <class name="B">
      <field name="f0" embedded="true" />
      <extension vendor-name="objectdb" key="index" value="f0.x" />
      <field name="f1" embedded="true" />
      <extension vendor-name="objectdb" key="index" value="f1.y.z" />
    </class>
  </package>
</jdo>

Field f0 holds a reference to a persistent object containing a persistent field named x. Because an x value is stored directly in every instance of B (field f0 is embedded), field x can have an index as a direct field of B (reference index or value index according to its type). This can be extended further to nested embedded objects as demonstrated by field f1.

Indexes can also be applied to embedded array and collection fields to accelerate contains(...) queries. Arrays of value types (including embedded wrappers, strings and dates) get value indexes, and arrays of reference types get reference indexes. The same is true of collections, except that the types of the elements must be declared explicitly using element-type attributes because they are not specified in the Java code.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jdo SYSTEM "http://java.sun.com/dtd/jdo_1_0.dtd">

<jdo>
  <package name="test">
    <class name="C">
      <field name="f0">
        <collection element-type="int" />
      </field>
      <extension vendor-name="objectdb" key="index" value="f0" />
      <field name="f1">
        <collection element-type="java.lang.String" />
      </field>
      <extension vendor-name="objectdb" key="unique-index" value="f1" />
      <field name="f2">
        <collection element-type="B" />
      </field>
      <extension vendor-name="objectdb" key="index" value="f2" />
      <field name="f3">
        <collection element-type="B" embedded-element="true" />
      </field>
      <extension vendor-name="objectdb" key="index" value="f3.x" />
    </class>
  </package>
</jdo>

For fields f0 and f1, which are collections of values, value indexes are declared. For field f2, which is a collection of external references, a reference index is declared. Field f3, which is a collection of embedded objects, cannot have a direct index, but fields of its embedded elements, like x, can have an index, as demonstrated above.

The Index Rebuilder Tool

Setting indexes for a new class that does not have persistent instances stored in the database yet requires only JDO metadata declarations. However, if the class has persistent instances stored in the database, the index becomes active only by running the Index Rebuilder tool on the database (in order to collect index information for the existing class instances), otherwise it is ignored by ObjectDB. You can run the Index Rebuilder from the Explorer, as explained in section 9.4, or from your application by calling the Utilities.startIndexBuilder(...) static method:

import com.objectdb.Utilities;
    : 
    : 
  // Rebuild the index on field 'x' of class 'Point': 
  Thread thread1 = Utilities.startIndexBuilder(pm, Point.class, "x");

  // Rebuild all the indexes of class 'Point': 
  Thread thread2 = Utilities.startIndexBuilder(pm, Point.class, null);

  // Rebuild all the indexes of all the classes: 
  Thread thread3 = Utilities.startIndexBuilder(pm, null, null);

The first argument to Utilities.startIndexBuilder(...) (pm in the code above) is a PersistenceManager instance that represents the database whose index or indexes should be rebuilt. The Thread instance returned from Utilities.startIndexBuilder(...) may be used to manipulate the index rebuilder thread (e.g. to change its priority, to wait until it finishes, and so on).
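Working with the returned thread uses only the standard java.lang.Thread API. The sketch below lowers the thread's priority and waits for it with a timeout. So that it runs without ObjectDB on the classpath, an empty stand-in thread takes the place of the one returned by Utilities.startIndexBuilder(...); the class RebuildWaiter and its method are hypothetical.

```java
final class RebuildWaiter {

    // Lowers the priority of the given thread, waits up to timeoutMillis
    // (which should be positive) for it to finish, and reports whether it did.
    static boolean lowerPriorityAndWait(Thread rebuilder, long timeoutMillis) {
        rebuilder.setPriority(Thread.MIN_PRIORITY);
        try {
            rebuilder.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        return !rebuilder.isAlive();
    }

    public static void main(String[] args) {
        // Stand-in for: Thread t = Utilities.startIndexBuilder(pm, null, null);
        Thread standIn = new Thread(() -> { });
        standIn.start();
        System.out.println(lowerPriorityAndWait(standIn, 5_000));
    }
}
```

Waiting with a timeout rather than an unbounded join() lets the application report progress or give up if rebuilding a large database takes longer than expected.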