Speed of queries that return many objects

#1

Hi,

I have a DB of 146MB - with 1_000_000 elements.

Loading the contents (objectDB query in java) takes about 10sec.

I am using a recent MacBook (SSD blabla - disk I/O speed is about 400MB/sec - according to blackMagicTest app for mac) - so I was expecting the query to take less than one sec (146/400 = 0.35sec.. + some overhead).

I was wondering if this kind of speed is "normal", or if this is something I should investigate.

So I created a database with 1_000_000 dummy entities (example below) - the DB size is only about 8MB - and it takes about 3sec to load the 1mio entities... still seems slow.

Thanks

EKK

import java.io.Serializable;
import javax.persistence.Entity;

@Entity
public class BenchEnt implements Serializable {
    private static final long serialVersionUID = 1L;
    public BenchEnt() {
    }  
}
        String d = System.getProperty("user.dir");
        EntityManagerFactory emf = Persistence.createEntityManagerFactory(d + "/T1.odb");
        EntityManager em = emf.createEntityManager();
       
        if( false){
            em.getTransaction().begin();
            for( int i = 0; i < 1_000_000; i++){
                BenchEnt ent = new BenchEnt();
                em.persist(ent);
            }
            em.getTransaction().commit();
            em.clear();
        }
        if( true){
            long t1 = System.currentTimeMillis();
            TypedQuery<BenchEnt> query = em.createQuery("SELECT a FROM BenchEnt a", BenchEnt.class);
            List<BenchEnt> L = query.getResultList();
            long t2 = System.currentTimeMillis();
            long dt = t2-t1;
            System.out.println("LIST " + L.size() + " dt= " + dt);
        }
#2

3 seconds for this test case seems to be normal, although with enhancement and tuning it could be in the range of 1-2 seconds.

The overhead of preparing entity objects is more than you may expect and much slower than just reading the data from the disk. Using embedded objects or report queries (which return fields rather than entity objects) may help.

If you can load the data lazily then the query can return results much faster:

    TypedQuery<BenchEnt> query = em.createQuery(
        "SELECT a FROM BenchEnt a", BenchEnt.class);
    query.setHint("objectdb.result-fetch", "LAZY");
    List<BenchEnt> L = query.getResultList();

but of course, this may not help if you need all the objects in memory at once.

ObjectDB Support
#3

Thanks, the problem is that I can't do lazy loading as I need the full content of each object.

There is still something I don't understand.

I measure how long it takes to instantiate the objects - I get something around 50ms for 1mio objects.

The way I understand how retrieving something from a DB works is as follows:

1/ load from the disk - the odb file size for the example above is around 8MB. So it should take 20ms to load the file into memory.

2/ then there is some magic to instantiate the objects - should take around 50ms.

so overall it should take about 70ms.

Can you give some information of what I am missing to reach the 3 sec?

Also when I profiled the loading of the 1mio object - 50% of the time is spent in java.lang.System.identityHashCode(Object).

The rest is split between seek0, readBytes, writeBytes etc... (see screenshot attached).

--

I am using ObjectDB for two applications - and in both cases - loading from the ObjectDB is the bottleneck of my two applications. I have 100+ DBs - and each one is about 4GB. So in the end - loading the DBs takes hours.

Thanks

 

 

#4

Unfortunately your expectations are unrealistic. Managing entity objects involves a significant overhead which is non proportional to the cost of object instantiation, so the 50ms estimate is irrelevant. 20ms for the I/O operations is also unrealistic when we have to load one million objects. It is not copying a file sequentially.

Apparently your use case is not a standard database (or object database) activity. The main idea of a database is that large amount of data is stored in a secondary storage and only portion of that data is retrieved to memory at a time. You want to load the entire database into memory at once, and an ODBMS (or maybe DBMS in general) may not be the most efficient tool for this purpose.

Although an ODBMS may not be the right tool, you may still try using it in a non standard way. For example, by using fewer but larger objects (optionally using embedded objects) or by retrieving the data without the entity objects using "report queries" (using projection). Of course, you will lose much of the advantages of using ODBMS if you use such methods.

There is no way for the System.identityHashCode method to take 50% of the total execution time. This must be one of these cases in which profiling output is distorted (e.g. is the result of excessive JVM / profiling preparations during one of the calls to that method).

ObjectDB Support
#5

I have the same issue - slow performance on the Mac. Latest JDK 1.8 , and newest Mac

Attached is the sample application. Instructions:

- Point to a local folder in com.example.demo.db.Db.java class

- Build using maven clean install -DskipTests

- Start Spring Boot application in target folder: java -jar demo-0.0.1-SNAPSHOT.jar

- Generate sample file (about 70k records) with http://localhost:8080/hello

- Run query with URL: http://localhost:8080/1/age

=> If you try this on Windows, last call is working blazing fast (100ms). On Mac is very very slow (3sec).

#6

Are you sure that this is the same issue of a slow query that returns many objects? In that case, the same answer at #4 above still applies.

Please provide more details about your issue. The original post doesn't discuss fast execution on Windows and slow execution only on Mac. If your issue is different please create a new forum thread and close this one.

Finally, please provide a simple minimized console application (see these posting instructions).

ObjectDB Support

Reply