After doing some memory-related research in my program, I found a possible enhancement for ObjectDB:
My program has several entities which contain many String fields. Often some of these Strings are empty (""). To avoid unnecessary memory consumption, the String fields within the entities are initialized like this:
String firstName = "";
In this case every "empty" attribute shares the same String object, because the literal "" is interned by the JVM. But after reloading the entities from ObjectDB, every empty String has become a new (unique) String object, which is a waste of memory (see http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/memory-efficient-java-tutorial.pdf, e.g. page 26). I assume ObjectDB does a "new String()" for every String object loaded from the database, while for empty Strings returning "".intern() would be more memory efficient.
I wrote a little SSCCE to demonstrate this effect. The SSCCE contains two programs:
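To illustrate the difference between a shared and a unique empty String, here is a minimal standalone sketch. It does not use ObjectDB at all; the class name EmptyStringIdentity and the helper sameObject are made up for this illustration, and new String("") only stands in for whatever the database does internally when it materializes a String.

```java
public class EmptyStringIdentity {

    // Returns true when both references point to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literalA = "";                 // string literals are interned
        String literalB = "";                 // same interned object as literalA
        String copy = new String("");         // always allocates a new String object
        String canonical = copy.intern();     // returns the interned empty String

        System.out.println(sameObject(literalA, literalB));   // true
        System.out.println(sameObject(literalA, copy));       // false
        System.out.println(sameObject(literalA, canonical));  // true
        System.out.println(literalA.equals(copy));            // true, equal content
    }
}
```

The second line of output is the situation after reloading: equal content, but a separate object per field.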
Program a (creating the entities):
    import java.util.ArrayList;

    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Id;
    import javax.persistence.Persistence;

    public class CreateEntites {

        public static void main(String[] args) {
            EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("sscce.odb");
            EntityManager entityManager = entityManagerFactory.createEntityManager();
            ArrayList<MyEntity> entities = new ArrayList<MyEntity>();

            // create 200000 entries, one transaction per entity
            for (int i = 0; i < 200000; i++) {
                MyEntity entity = new MyEntity(i);
                entities.add(entity);
                entityManager.getTransaction().begin();
                entityManager.persist(entity);
                entityManager.getTransaction().commit();
                entityManager.clear();
            }

            entityManager.close();
            entityManagerFactory.close();

            System.out.println("let's create a heap dump");

            // used to have some time to create a heap dump
            try {
                Thread.sleep(15000);
            } catch (InterruptedException e) {
            }

            // touch the list after the sleep so that the GC cannot collect it too early
            for (MyEntity entity : entities) {
            }
        }

        @Entity
        public static class MyEntity {
            @Id
            int id;
            String firstName = "";
            String lastName = "";
            String street = "";
            String city = "";

            public MyEntity(int id) {
                this.id = id;
            }
        }
    }

While this program was in Thread.sleep, I took a heap dump with VisualVM and analyzed the memory consumption:
Class       Instances   Used memory (bytes)
MyEntity      200,000            10,400,000
char[]          7,154               428,532
String          7,089               226,848
...
Program b (loading these entities from the database):

    import java.util.List;

    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Id;
    import javax.persistence.Persistence;
    import javax.persistence.TypedQuery;

    public class ReadEntries {

        public static void main(String[] args) {
            EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("sscce.odb");
            EntityManager entityManager = entityManagerFactory.createEntityManager();

            List<MyEntity> entities;
            TypedQuery<MyEntity> query =
                    entityManager.createQuery("SELECT myentity FROM MyEntity myentity", MyEntity.class);
            entities = query.getResultList();

            entityManager.close();
            entityManagerFactory.close();

            System.out.println("let's create a heap dump");

            // used to have some time to create a heap dump
            try {
                Thread.sleep(15000);
            } catch (InterruptedException e) {
            }

            // touch the list after the sleep so that the GC cannot collect it too early
            for (MyEntity entity : entities) {
            }
        }

        @Entity
        public static class MyEntity {
            @Id
            int id;
            String firstName = "";
            String lastName = "";
            String street = "";
            String city = "";

            public MyEntity(int id) {
                this.id = id;
            }
        }
    }

While this program was in Thread.sleep, I again took a heap dump with VisualVM and analyzed the memory consumption:
Class       Instances   Used memory (bytes)
char[]        807,363            13,237,950
String        807,291            25,833,312
MyEntity      200,000            10,400,000
...
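As you can see, the total memory consumption of this little program has grown from ~10 MB to ~50 MB. The roughly 800,000 additional String instances (and their char[] arrays) correspond to the four empty String fields in each of the 200,000 loaded entities.
Possible solution: whenever an empty String is loaded from the database, ObjectDB should not create a new empty String; it should reuse the interned empty String.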
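The idea could look roughly like the following sketch. I do not know how ObjectDB materializes Strings internally, so canonicalizeEmpty is a purely hypothetical helper and not an ObjectDB API; it only shows the check I have in mind for the moment a String value has been read.

```java
public class EmptyStringCanonicalizer {

    // Hypothetical hook: called with a String value that was just read from storage.
    // Empty values are replaced by the JVM's single interned empty String;
    // every other value (including null) is passed through unchanged.
    static String canonicalizeEmpty(String loaded) {
        if (loaded != null && loaded.isEmpty()) {
            return "";
        }
        return loaded;
    }

    public static void main(String[] args) {
        String fromStorage = new String("");          // stands in for a freshly loaded empty value
        String fixed = canonicalizeEmpty(fromStorage);

        System.out.println(fromStorage == "");        // false, a separate object
        System.out.println(fixed == "");              // true, the shared object
        System.out.println(canonicalizeEmpty("Martin"));  // Martin
    }
}
```

The check costs one length comparison per loaded String and needs no lookup in the intern table, which is why I think it would be cheap enough to apply to every String.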
One further possible enhancement: implement a setting where the developer can specify which Strings should be loaded via String.intern(). Motivation for this enhancement: when loading a large list of persons, many of the first names are identical (here they are Martin, Michael, Thomas, ...). Instead of instantiating every first name as a new String, the user should be able to specify that this field is loaded via String.intern() (with all its advantages and disadvantages).
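Until something like this exists, an application can deduplicate such fields itself after loading. The following is only a sketch under my own assumptions: FieldStringPool is a made-up helper class, not part of ObjectDB or JPA, and it uses a plain HashMap instead of String.intern() so that the pool can be thrown away when it is no longer needed.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldStringPool {

    // Maps each distinct value to the first instance that was seen for it.
    private final Map<String, String> pool = new HashMap<>();

    // Returns one shared instance per distinct value; null is passed through.
    String canonical(String value) {
        if (value == null) {
            return null;
        }
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        FieldStringPool pool = new FieldStringPool();

        // new String(...) stands in for first names freshly loaded from the database
        String[] firstNames = {
            new String("Martin"), new String("Michael"), new String("Martin"),
            new String("Thomas"), new String("Michael")
        };
        for (int i = 0; i < firstNames.length; i++) {
            firstNames[i] = pool.canonical(firstNames[i]);
        }

        System.out.println(firstNames[0] == firstNames[2]);  // true, both "Martin" share one object
        System.out.println(pool.size());                     // 3 distinct names
    }
}
```

Replacing pool.canonical(...) with value.intern() would give the same sharing through the JVM-wide intern table, at the price of not being able to release the pooled values explicitly.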
Thanks for looking into this
Kind regards
Manuel Laggner