In the meantime we have produced a similar test program, which demonstrates the issue and several possible solutions:
import java.util.*;
import javax.jdo.*;
import javax.persistence.*;

public class F2302 {

    public static void main(String[] args) {
        EntityManagerFactory emf =
            Persistence.createEntityManagerFactory("F2302.odb");
        EntityManager em = emf.createEntityManager();
        //populateDatabase(em);
        scanDatabase(em);
        em.close();
        emf.close();
    }

    private static void populateDatabase(EntityManager em) {
        int valueCount = 0;
        em.getTransaction().begin();
        StepBlockCollection blockCollection = new StepBlockCollection();
        blockCollection.stepBlocks = new ArrayList<StepBlock>();
        for (int i = 0; i < 100; i++)
        {
            StepBlock stepBlock = new StepBlock();
            blockCollection.stepBlocks.add(stepBlock);
            stepBlock.steps = new ArrayList<Step>();
            for (int j = 0; j < 100; j++)
            {
                Step step = new Step();
                stepBlock.steps.add(step);
                step.values = new ArrayList<Value>();
                for (int k = 0; k < 100; k++)
                {
                    Value value = new Value();
                    value.data = new byte[128];
                    step.values.add(value);
                    valueCount++;
                }
            }
        }
        em.persist(blockCollection);
        em.getTransaction().commit();
        System.out.println(valueCount + " values persisted.");
    }

    private static void scanDatabase(EntityManager em) {
        long startTime = System.currentTimeMillis();
        int valueCount = 0;
        StepBlockCollection collection =
            em.find(StepBlockCollection.class, 1);

        // // Original code: Holds the entire tree in memory:
        // for (StepBlock sb : collection.stepBlocks) {
        //     for (Step s : sb.steps) {
        //         valueCount += s.values.size();
        //     }
        // }

        // // Solution #1: remove (without committing) step blocks after reading:
        // Iterator<StepBlock> itr = collection.stepBlocks.iterator();
        // while (itr.hasNext()) {
        //     StepBlock sb = itr.next();
        //     for (Step s : sb.steps) {
        //         valueCount += s.values.size();
        //     }
        //     itr.remove();
        // }

        // // Solution #2: Use a temporary queue and free the root collection:
        // Queue<StepBlock> queue = new LinkedList<StepBlock>(collection.stepBlocks);
        // collection = null;
        // while (!queue.isEmpty())
        // {
        //     StepBlock sb = queue.poll();
        //     for (Step s : sb.steps) {
        //         valueCount += s.values.size();
        //     }
        // }

        // Solution #3: Use JDO's evict:
        PersistenceManager pm = em.unwrap(PersistenceManager.class);
        for (StepBlock sb : collection.stepBlocks) {
            for (Step s : sb.steps) {
                valueCount += s.values.size();
            }
            pm.evict(sb);
        }

        long elapsedTime = System.currentTimeMillis() - startTime;
        System.gc();
        Runtime runtime = Runtime.getRuntime();
        System.out.println(valueCount + " values read in " +
            elapsedTime + "ms, heap size: " +
            (runtime.totalMemory() - runtime.freeMemory()));

        // Uncomment to stop before exit for taking a heap dump:
        //try {
        //    System.out.println("Press any key to exit.");
        //    System.in.read();
        //}
        //catch (java.io.IOException x) {
        //}
    }

    @Entity public static class StepBlockCollection {
        @Id int id = 1;
        @OneToMany(cascade=CascadeType.ALL)
        List<StepBlock> stepBlocks;
    }

    @Entity public static class StepBlock {
        @OneToMany(cascade=CascadeType.ALL)
        List<Step> steps;
    }

    @Entity public static class Step {
        List<Value> values;
    }

    @Embeddable
    public static class Value {
        byte[] data;
    }
}
Providing a minimal test program in this format in future cases (as also described in the posting instructions) may help accelerate their handling.
The problematic loop, which retains strong references to all the entity objects:
for (StepBlock sb : collection.stepBlocks) {
    for (Step s : sb.steps) {
        valueCount += s.values.size();
    }
}
Output:
1000000 values read in 1733ms, heap size: 239995944
In solution #1, step blocks are removed from the parent collection after processing:
Iterator<StepBlock> itr = collection.stepBlocks.iterator();
while (itr.hasNext()) {
    StepBlock sb = itr.next();
    for (Step s : sb.steps) {
        valueCount += s.values.size();
    }
    itr.remove();
}
Output:
1000000 values read in 1275ms, heap size: 25624704
Although it works well, you have to be very careful not to run this code inside an active transaction and then commit, because in that case the removals would actually be applied to the database.
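As a side note, the memory effect behind solution #1 does not depend on JPA at all. The same drain-while-iterating pattern can be sketched with plain collections (the nested lists below are hypothetical stand-ins for the entity tree, not part of the test program):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class DrainDemo {

    // Process each block, then drop it from the parent list, so that the
    // parent no longer holds a strong reference to it and the block
    // becomes eligible for garbage collection.
    static int drain(List<List<Integer>> blocks) {
        int count = 0;
        Iterator<List<Integer>> itr = blocks.iterator();
        while (itr.hasNext()) {
            List<Integer> block = itr.next();
            count += block.size();
            itr.remove(); // releases the reference held by the parent list
        }
        return count;
    }

    public static void main(String[] args) {
        // A small stand-in tree: 10 blocks of 10 items each.
        List<List<Integer>> blocks = new ArrayList<List<Integer>>();
        for (int i = 0; i < 10; i++) {
            List<Integer> steps = new ArrayList<Integer>();
            for (int j = 0; j < 10; j++)
                steps.add(j);
            blocks.add(steps);
        }
        System.out.println(drain(blocks) + " items processed, " +
            blocks.size() + " blocks still referenced.");
    }
}
```

Because the parent list is emptied while it is scanned, in the real program this translates into removing entity objects from a managed collection, which is exactly why it must not run in a transaction that is later committed.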
Solution 2 moves the step blocks for processing to a separate queue and then discards the root collection entity:
Queue<StepBlock> queue = new LinkedList<StepBlock>(collection.stepBlocks);
collection = null;
while (!queue.isEmpty())
{
    StepBlock sb = queue.poll();
    for (Step s : sb.steps) {
        valueCount += s.values.size();
    }
}
Output:
1000000 values read in 1285ms, heap size: 30928312
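This caveat of solution #1 does not apply here, since the managed collection itself is never modified. The mechanics of solution #2 can also be sketched with plain collections (again, the nested lists are hypothetical stand-ins for the entities): once the children are copied into the queue and the root reference is cleared, each block is reachable only through the queue and becomes collectable as soon as it is polled.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class QueueDrainDemo {

    // Count items while draining the queue; each polled block becomes
    // unreachable (and therefore collectable) once processed.
    static int drainQueue(Queue<List<Integer>> queue) {
        int count = 0;
        while (!queue.isEmpty()) {
            List<Integer> block = queue.poll();
            count += block.size();
        }
        return count;
    }

    public static void main(String[] args) {
        // A small stand-in tree: 10 blocks of 10 items each.
        List<List<Integer>> root = new ArrayList<List<Integer>>();
        for (int i = 0; i < 10; i++) {
            List<Integer> steps = new ArrayList<Integer>();
            for (int j = 0; j < 10; j++)
                steps.add(j);
            root.add(steps);
        }

        // Move the children to a temporary queue and drop the root
        // reference, as solution #2 does with the entity collection.
        Queue<List<Integer>> queue = new LinkedList<List<Integer>>(root);
        root = null;

        System.out.println(drainQueue(queue) + " items processed.");
    }
}
```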
Solution 3 is more elegant. It uses the evict operation on processed objects to return them to the hollow state (as they were before their content was accessed and loaded from the database):
PersistenceManager pm = em.unwrap(PersistenceManager.class);
for (StepBlock sb : collection.stepBlocks) {
    for (Step s : sb.steps) {
        valueCount += s.values.size();
    }
    pm.evict(sb);
}
Output:
1000000 values read in 1207ms, heap size: 25166920
Since the evict operation is not supported by JPA, we have to unwrap JDO's PersistenceManager from the JPA EntityManager instance; the PersistenceManager works like the JPA EntityManager but provides some additional operations.
The processing cache size in these runs was 16MB and the data cache was disabled.