Listing large number of complicated objects with paging.

#1

Hi,

For last few hours I was working on a problem with bad performance and memory problems with such a simple query:

select prod from Product prod order by prod.id ASC

Right now I have 55541 'Product' (attached to this post) entities in database and I'm trying to speed listing them with paging. The problem is, that running code that looks like this:

TypedQuery<Product> icQuery = em.em.createQuery("select prod from Product prod order by prod.id ASC", Product.class);
prodList = icQuery.setFirstResult(start).setMaxResults(count).getResultList();

is quite slow. Well, after first request and "warming up" database file, query runs for about 1.5 (best) to 6 (worst) sec:

2011-11-15 19:30:59.578 [http-bio-8080-exec-3] DEBUG pl.hplxtool.dao.impl.ProductDAOImpl - Query: [select prod from Product prod order by prod.id ASC]
2011-11-15 19:30:59.578 [http-bio-8080-exec-3] DEBUG pl.hplxtool.dao.impl.ProductDAOImpl - Executing query
2011-11-15 19:30:59.578 [http-bio-8080-exec-3] DEBUG pl.hplxtool.dao.impl.ProductDAOImpl - Start: 250, count: 50
2011-11-15 19:31:03.219 [http-bio-8080-exec-3] DEBUG pl.hplxtool.dao.impl.ProductDAOImpl - Query executed
2011-11-15 19:31:03.220 [http-bio-8080-exec-3] DEBUG pl.hplxtool.dao.impl.ProductDAOImpl - Returning data

Query manager builds plan like this:

[2011-11-15 19:30:59 #269 query.manager]
<queryRequest query="select prod from Product prod order by prod.id ASC" args="null" transactionId="-1" />

[2011-11-15 19:30:59 #270 query.manager]
<newQueryProgram>
    <orderPlan plan="sort(fetch(index(-279[all])))" eval="9.3782" var="1" prm="0" reg="0" order="prod.id">
        <fatchPlan plan="fetch(index(-279[all]))" eval="7.4751" var="1" prm="0" reg="0">
            <btreePlan plan="index(-279[all])" eval="6.174" var="1" prm="0" reg="0" variable="prod" />
        </fatchPlan>
    </orderPlan>
</newQueryProgram>

[2011-11-15 19:30:59 #271 query.manager]
<newQueryResponse isUnique="false" resultSize="50" fetchSize="55541">
    <SimpleResultBuilder name="prod" type="pl.hplxtool.model.Product" />
</newQueryResponse>

ODB server in newest (2.3.3_04) version is running with bigger heap size - now it's 384MB.

I was trying to build wrapper object ProductTO that contains only fields that I need in listing and the use new ProductTO() in query. Didn't help - I always get OutOfMemory exception. The interesting part is that running this query needs lots of memory at tomcat side (attached screenshot) and after running it 3-4 times in a row cause java.lang.OutOfMemoryError: Java heap space exception. The same query in Explorer runs blazing fast (22ms) but I think it's because Explorer lazy loads all that objects and load their content on demand.

ODB is running as server and I'm using loopback to connect to this server.

So, how else can I speed up this type of query? Or maybe I just need to add some more heap space for ODB server (database is 560MB large)?

 

#2

The query itself is very simple (and therefore completes in 22 ms in the Explorer).

The problem is probably with the extensive EAGER setting in Product, and particularly with eager fetch of inverse (mapped by) collections. Massive inverse fetch is currently inefficient and has to be improved in future ObjectDB versions.

Could you please provide more details about the effect of restricting the result range? How many products are you trying to retrieve in every query execution (setMaxResults). Does that number affect?

You may try running the query with a lazy fetch hint (as done by the Explorer).

Another solution may be to use report queries, i.e. instead of retrieval of complete Product instances, you may specify the fields in Product that you need in the SELECT clause.

ObjectDB Support
#3

Yes, I do remember that there is a problem with extensive EAGER use and mapped by collections, but I can't repair this right now, so I'm stuck with such a model.

I'm trying to fetch 50 records at a time. The funny thing is that in log from query manager fetchSize is like all objects in database:

<newQueryResponse isUnique="false" resultSize="50" fetchSize="55541">

In the case of using transfer object I did use contractor with only needed fields and query was like this:

select new ProductListTO(prod.id,prod.name,prod.symbol,prod.wwwURL,prod.type,prod.listOfStores,prod.done,prod.inspDone) from Product prod order by prod.id asc

As I mention earlier - that didn't work - there was OutOfMemory exception thrown by database - maybe by creating 55k ProductListTO() objects?

I will try lazy fetch hint, but as I understand it will run 50 queries find(Product) while iterating over product list, right?

 

#4

OK, I just set lazy hint. Query plan is a little bit different in this case:

[2011-11-15 23:25:20 #297 query.manager]
<queryRequest query="select prod from Product prod order by prod.id ASC" args="null" transactionId="-1" />

[2011-11-15 23:25:20 #298 query.manager]
<cachedQueryProgram>
    <orderPlan plan="sort(index(-279[all]))" eval="8.0771" var="1" prm="0" reg="0" order="prod.id">
        <btreePlan plan="index(-279[all])" eval="6.174" var="1" prm="0" reg="0" variable="prod" />
    </orderPlan>
</cachedQueryProgram>

[2011-11-15 23:25:21 #299 query.manager]
<newQueryResponse isUnique="false" resultSize="50" fetchSize="0">
    <SimpleResultBuilder name="prod" type="pl.hplxtool.model.Product" />
</newQueryResponse>

Cold database: 1sec

Hot database: 600ms

Hot database - refresh of the same page: 160ms.

Memory usage - no more than 35MB.

And everything works as it should...

#5

I could repeat this problem with the following test:

import java.util.*;

import javax.jdo.annotations.*;
import javax.persistence.*;

public class T577 {
    public static void main(String[] args) {
       
        EntityManagerFactory emf =
            Persistence.createEntityManagerFactory(
                "objectdb:$objectdb/db/test.tmp;drop");
        EntityManager em = emf.createEntityManager();
       
        em.getTransaction().begin();
        for (int i = 0; i < 1000; i++)
            em.persist(new MyEntity());
        em.getTransaction().commit();

        TypedQuery<MyEntity> query = em.createQuery(
            "select e from MyEntity e order by e.id ASC", MyEntity.class);
        List<MyEntity> resultList =
            query.setFirstResult(20).setMaxResults(10).getResultList();
        System.out.println("Results: " + resultList.size());
       
        emf.close();
    }
   
    @Entity
    public static class MyEntity {
        @GeneratedValue @Id long id;
        @Index int value;
        byte[] data = new byte[1500];
    }
}

Your observation is correct. There is a fetch optimization that collects objects during query processing. This optimization is disabled now (build 2.3.3_06) for range queries, since before sorting and cutting the requested range it is unknown which objects are really needed.

Thank you for this bug report.

 

ObjectDB Support
#6

New build works great! Thanks for quick fix :)

Reply