Bulk Delete and Update - best practice?

#1

The attached test creates 1000 large objects and then attempts to delete them all in one go using a DELETE statement. I'm trying to run with -Xmx512m but the test runs out of memory during the delete.

The attachment also contains an update test which has similar memory problems. I'm running both tests with the default objectdb.conf.

In general is an UPDATE/DELETE statement the most efficient way to update/remove a large number of objects?

I've had success with the "drop" keyword in the connection string in other circumstances but I need a method that can work in both embedded and server based databases (in server mode the database is locked so drop can't be used).
The only method I've found which works consistently is to load and remove objects (one by one or small batches) but this is painfully slow when there is a high number of large objects.

#2

Currently transactions are limited to the heap size, so this test that deletes and updates more than 500MB of data in one transaction requires a larger heap size.

This limitation may be removed in future releases, but even then, using smaller UPDATE / DELETE queries, of about 20MB-30MB each, would be more efficient.

drop in the connection url should work also in server mode if the database is not in use by another user.

ObjectDB Support
ObjectDB - Fast Object Database for Java (JPA/JDO)
#3

Ok, perhaps I'm not using drop properly for the server database. In the following test I'm expecting the second row count to be 0 if the database is dropped  but that doesnt seem to be the case. Could you advise?

The removal of UPDATE/DELETE limitations would definitely get my vote on the feature request list.

public class TestDropTable3 {

@Test
public void testDrop() {
  EntityManagerFactory emf =
           Persistence.createEntityManagerFactory(
               "objectdb://localhost/testDrop.tmp;user=admin;password=admin");
 
  EntityManager em = emf.createEntityManager();
  em.getMetamodel().entity(CacheData.class);
  em.close();

  String payload = "somedata";
 
  em = emf.createEntityManager();
  CacheData data = new CacheData();
  data.setHash(payload.hashCode());
  data.setData(payload);
 
  em.getTransaction().begin();
  em.persist(data);
  em.getTransaction().commit();
  em.clear();
 
  TypedQuery<Long> q = em.createQuery("SELECT COUNT(d) from CacheData d", Long.class);
  long rows = q.getSingleResult();
  System.out.println(String.format("Found %s rows", rows));
 
  em.close();
 
  emf.close();
 
  emf = Persistence.createEntityManagerFactory(
               "objectdb://localhost/testDrop.tmp;drop;user=admin;password=admin");
 
  em = emf.createEntityManager();
  em.getMetamodel().entity(CacheData.class);
  em.close();
 
  em = emf.createEntityManager();
  q = em.createQuery("SELECT COUNT(d) from CacheData d", Long.class);
  rows = q.getSingleResult();
  System.out.println(String.format("Found %s rows", rows));
 
  em.close();
  emf.close();
}

@Entity
    public static class CacheData {
        @GeneratedValue @Id long id;
        @Index int hash;
        String data;
       
        public long getId() { return id; }
       
        public int getHash() { return hash; }
        public void setHash(int value) { hash = value; }
       
        public String getData() { return data; }
        public void setData(String value) { data = value; }
    }
}
#4

You are right. Please try build 2.3.3_07.

ObjectDB Support
ObjectDB - Fast Object Database for Java (JPA/JDO)

Post Reply

To post a reply and/or subscribe to update notifications - please login