Insertion Speed Rate and Batch Load

#1

1. I get about 10K inserts/second. Is that normal? I would expect more, closer to 100K/second. I can definitely do 80K inserts/second on Oracle with batch insert on a table with one primary key.

2. Is batch update/insert operation supported?

I attached the files.

#2

This test from the Memory Leak thread uses a separate transaction, and moreover, a new EntityManager per stored entity object. A batch load should reuse the EntityManager and persist a large number of objects per transaction. This is explained in the Storing JPA Entity Objects page in the manual.

Attached is a modified version of your TestObjectDB.java file that commits once per 10,000 entity objects. On an Intel Core 2 Quad Q6600, with enhancement by the Java Agent, this version persists 7,000 Device instances per second.

But Device is a large object with an array of 100 strings (by the way, List<String> is preferred over String[] when using JPA). With 3 strings the rate is about 80,000 Device instances per second, and with null in the String[] array the rate goes up to 200,000 entity objects per second:

[THREAD-0] Wed May 04 01:25:45 IDT 2011 Persisted 980000 objects.
[THREAD-0] Wed May 04 01:25:45 IDT 2011 Persisted 990000 objects.
[THREAD-0] Wed May 04 01:25:45 IDT 2011 Persisted 1000000 objects.
[THREAD-0] Wed May 04 01:25:45 IDT 2011 Added 1000000 objects, elapsed time: 4985ms

Here is a simple program that tests batch load of an entity with only a primary key:

import javax.persistence.*;

public final class InsertTest
{
    public static void main(String[] args)
    {
        EntityManagerFactory emf =
            Persistence.createEntityManagerFactory("$objectdb/db/speed.odb");
        EntityManager em = emf.createEntityManager();
       
        int count = 1000000;
        long startTime = System.currentTimeMillis();

        em.getTransaction().begin();
        for (int i = 1; i <= count; i++)
        {
            em.persist(new MyEntity(i));
            if ((i % 10000) == 0)
            {
                em.getTransaction().commit();
                em.getTransaction().begin();
            }
        }
        em.getTransaction().commit();

        long time = System.currentTimeMillis() - startTime;
        // Integer division runs first, so the reported rate is truncated to a multiple of 1000.
        long rate = count / time * 1000;
        System.out.println("Persisted " + rate + " objects per second.");

        em.close();
        emf.close();
    }
   
    @Entity
    static final class MyEntity
    {
        @Id int id;
        MyEntity(int id) {
            this.id = id;
        }
    }
}

On an Intel Core 2 Quad Q6600 (with enhancement) it persists 340,000 objects per second:

Persisted 340000 objects per second.
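A note on the rate arithmetic: the expression count / time * 1000 in InsertTest divides first, so the printed rate is always a multiple of 1000. The sketch below compares it with a multiply-first variant. The class and method names are hypothetical, and the 4985ms elapsed time is borrowed from the Device log above purely as sample input.

```java
final class RateCalc
{
    // Same expression as in InsertTest: integer division runs first,
    // so the fractional part is lost before scaling to seconds.
    static long truncatedRate(int count, long millis)
    {
        return count / millis * 1000;
    }

    // Multiply before dividing to keep the precision.
    static long preciseRate(int count, long millis)
    {
        return count * 1000L / millis;
    }

    public static void main(String[] args)
    {
        int count = 1000000;
        long millis = 4985; // sample input taken from the log above
        System.out.println(truncatedRate(count, millis)); // prints 200000
        System.out.println(preciseRate(count, millis));   // prints 200601
    }
}
```

Either form is fine for a rough benchmark figure; the difference only matters when comparing runs that are close to each other.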

Finally, here is the batch insert speed comparison in the JPA benchmark:

https://jpab.org/Basic/Persist/Many.html

ObjectDB Support
#3

So, my close-to-reality use case is actually one insert per transaction. For business reasons I can't batch.

I tested reusing the EntityManager with one transaction per insert and found it to be much slower (500/second, and the Java GC is very busy collecting). Then I tested with a new EntityManager for every transaction, one persist per transaction: 70K/second.

em = emf.createEntityManager();
em.getTransaction().begin();
for (int i = 0, j = 0; i < deviceCount; i++) {
    if (i % threadCount != mod) {
        continue;
    }
    j++;
    attributes = generateAttributes(i);
    Device d = new Device();
    d.setDeviceId(i);
    d.setDeviceName("METER-" + i);
    d.setSerialNumber("SERIAL-" + i);
    d.setAttributes(attributes);

    em.persist(d);
    // toggle these 2 lines for either 1 or 10000 persists per transaction.
    //if (j % 10000 == 0) {
    if (true) {
        em.getTransaction().commit();
        // uncomment the following 2 lines to test one EM per transaction
        //em.close();
        //em = emf.createEntityManager();
        em.getTransaction().begin();
    }
    if (j % 10000 == 0) {
        logger("Persisted " + j + " objects.");
    }
}
if (em.getTransaction().isActive()) {
    em.getTransaction().commit();
}

Also, reusing the EntityManager for 10,000 persists per transaction doesn't really help me. The first 200,000 (20 commits) are fine, but after that the system runs Full GC all the time. My JVM has 4GB and I did use your attached modified code. However, with -javaagent set to objectdb.jar I see no problem at all, and the speed is about 150K/second.

It seems something is not right with reusing the EntityManager without -javaagent enhancement.

#4

You are right - to reuse an EntityManager in this scenario you have to empty it after commit:

    em.getTransaction().commit();
    em.clear(); // empty the persistence context

This is especially essential without enhancement. Otherwise ObjectDB has to hold all the previously persisted objects using strong references, preserve an image of each, and compare every object to its image on every commit in order to identify updates that have to be propagated to the database.

In the last batch, in addition to the new 10,000 objects, 990,000 objects are checked for changes...
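To put numbers on this, here is a toy model of the effect. It is not ObjectDB's implementation: ToyContext is a hypothetical stand-in that keeps every persisted object in a list and counts how many objects it would have to examine on each commit, with and without a clear after commit.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitCheckModel
{
    // Hypothetical stand-in for a persistence context that holds managed
    // objects by strong reference and examines all of them on every commit.
    static final class ToyContext
    {
        private final List<int[]> managed = new ArrayList<>();
        long checked; // total objects examined across all commits

        void persist(int[] entity) { managed.add(entity); }
        void commit() { checked += managed.size(); }
        void clear() { managed.clear(); }
    }

    // Persists count objects, committing every batchSize objects, and
    // returns the total number of object checks performed at commit time.
    static long run(int count, int batchSize, boolean clearAfterCommit)
    {
        ToyContext ctx = new ToyContext();
        for (int i = 1; i <= count; i++)
        {
            ctx.persist(new int[] { i });
            if ((i % batchSize) == 0)
            {
                ctx.commit();
                if (clearAfterCommit) ctx.clear();
            }
        }
        return ctx.checked;
    }

    public static void main(String[] args)
    {
        System.out.println("without clear: " + run(1000000, 10000, false)); // 50500000
        System.out.println("with clear:    " + run(1000000, 10000, true));  // 1000000
    }
}
```

Under this simplified model, loading 1,000,000 objects in batches of 10,000 without clearing examines about 50 times as many objects as the same load with a clear after each commit, and the managed list keeps every object reachable, which is consistent with the heavy GC activity reported above.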

Enhancement solves this problem, but invoking clear still improves performance slightly.

ObjectDB Support
#5

By the way, ObjectDB is faster than the alternatives also when using small (no batch) transactions:

https://jpab.org/Basic/Persist/Few.html

ObjectDB Support
#6

EntityManager.clear() does the trick. I agree with your statement that one transaction per persist is much faster on ObjectDB compared to Oracle (which requires a network round trip).

BTW, thank you so much for your speedy replies! I'm evaluating ObjectDB for possible use in one of our smart grid data collection products.

#7

I ran the code from #2 and the result is not good. The machine is a Xeon 2.40GHz with 2GB of memory.

Persisted 13000 objects per second.

Any idea?

#8

Your classes are probably not enhanced.

ObjectDB Support
#9

I changed the objectdb.conf to:

 <entities>
  <enhancement agent="true" reflection="warning" />
  <cache ref="weak" level2="0" />
  <cascade-persist always="auto" on-persist="false" on-commit="true" />
  <dirty-tracking arrays="false" />
 </entities>

Total 1000000 objects in 26.952999 seconds.

Persisted 37101.62 objects per second.

#10

I ran this code with the above enhancement setting on another PC (Intel Celeron 550 @ 2GHz, 3GB memory):

Total 1000000 objects in 17.516001 seconds. 
Persisted 57090.656 objects per second.
Error opening zip file or JAR manifest missing: /E:/projects/ObjectDB/bin/objectdb.jar

Maybe the transaction is too slow, and that causes emf.close() to hit an error.

The speed is still not good.

#11

Try running with the Enhancer Java Agent:

java -javaagent:c:\objectdb\bin\objectdb.jar InsertTest

In addition, try adding a clear in the loop:

        for (int i = 1; i <= count; i++)
        {
            em.persist(new MyEntity(i));
            if ((i % 10000) == 0)
            {
                em.getTransaction().commit();
                em.clear(); // added to keep transaction small
                em.getTransaction().begin();
            }
        }

 

ObjectDB Support
#12

Is the Enhancer part of the JPA standard, or is it specific to ObjectDB?

Is it similar to memcache?

 

#13

Other JPA providers have similar tools.

The Enhancer is not similar to memcache - please read the documentation.

ObjectDB Support
