Mapped By Fetch is very Slow

#1

Well! I think I have found a HUGE performance issue with the mapped-by attribute (yeah, again :( ).

Here is the test case :

1- Create 100 000 entities, each with 1 child entity referenced through @OneToOne(mappedBy attribute)

2- Retrieve only 10 000 for testing

3- Wait ... On my AMD FX 8350, it took 60s

Now, go to the MyEntity class and remove "mapped-by".

The query fetch then takes 1 second at most.

So the performance issue in my application is due more to this problem than to the left join one. (Note: the left join index problem still exists; I will test with your optimisation disabled later.)

So, the test case, enjoy :

The entity :

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToOne;

@Entity
public class MyEntity {

@Id
private String name;

public MyEntity(String name) {
  this.name = name;
}

public String getName() {
  return name;
}

public void setName(String name) {
  this.name = name;
}

private MyEntityChild entityChild = null;
 
@OneToOne(targetEntity = MyEntityChild.class, cascade = CascadeType.ALL, mappedBy = "myEntity") // Test by adding / removing "mappedBy"
public MyEntityChild getEntityChild() {
  return entityChild;
}
 
public void setEntityChild(MyEntityChild entityChild) {
  this.entityChild = entityChild;
}

@Entity
public static class MyEntityChild {
 
   @Id
   private String name;
 
   private MyEntity myEntity;
 
   public MyEntity getMyEntity() {
    return myEntity;
   }
 
   public void setMyEntity(MyEntity myEntity) {
    this.myEntity = myEntity;
   }
 
   MyEntityChild(String name) {
    this.name = name;
   }
}
}

The test case :

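import java.util.Date;
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.persistence.TypedQuery;

public class ObjectDbTest {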

public static void main(String[] args) {

  EntityManagerFactory emf = Persistence.createEntityManagerFactory("objectdb:$objectdb/db/test.tmp;drop");

  EntityManager em = null;

  em = emf.createEntityManager();
  if (!em.getTransaction().isActive()) {
   em.getTransaction().begin();
  }

  // Write 100 000 entities with 1 child for each
  for (int i=0;i<100000;++i) {
   MyEntity e1 = new MyEntity("parent" + i);
   MyEntity.MyEntityChild child2 = new MyEntity.MyEntityChild("child" + i);
   child2.setMyEntity(e1);
   e1.setEntityChild(child2);
   em.merge(e1);
  }

  em.getTransaction().commit();
  em.clear();
  em.close();

  em = emf.createEntityManager();

  Long start = new Date().getTime();
 
  TypedQuery<MyEntity> query = em.createQuery("SELECT m FROM MyEntity m", MyEntity.class);
 
  // Retrieve only 10 000
  List <MyEntity> entities = query.setMaxResults(10000).getResultList();

  Long end = new Date().getTime();

  Long duration = (end - start) / 1000;
  System.out.println("Duration : " + duration + " seconds");
 
  em.close();
  emf.close();
 
  // Be sure we retrieve child
  if (entities.get(0).getEntityChild() == null) {
   System.out.println("FAILED TO FETCH !");
  }
}
}

Regards, 
Xirt

#2

Thank you for this report. Please check build 2.6.2_08 that should fix this issue.

Some information about this issue:

The mapped by reference from MyEntity to MyEntityChild is eager. Therefore, retrieval of 10,000 MyEntity instances is followed by retrieval of the 10,000 referencing MyEntityChild instances. This is done by a separate automatic query:

SELECT $$owner, $$inv
FROM MyEntityChild $$owner
JOIN $$owner.myEntity $$inv
WHERE $$inv in ?1

where ?1 is the collection of 10,000 MyEntity instances.

Before build 2.6.2_08 execution of such queries with large collections (e.g. 10,000) was very inefficient.

Note:

  1. Removing mapped-by can still improve performance because there is no need for this additional query.
  2. Setting an index on the owner side of the relationship (MyEntityChild.myEntity) may improve the execution of this query.
  3. Using the Enhancer can improve performance.
ObjectDB Support
#3

Support,

When you fetch 10 000 parent entities (MyEntity), you already have the link to the children.
Why does ObjectDB (and others?) try to retrieve the parent?

I understand the need to retrieve the parent when you are fetching only children, but that is not the case in this test.

#4

In your test the relationship is bidirectional and the child is the owner. Therefore, only the child contains a reference to the parent. Navigation from the parent to the child is slower and requires an "inverse" query using the mappedBy attribute.

In bidirectional relationships you should try to make the side from which navigation is more likely the owner, by not using mappedBy on that side. You can also use two unidirectional relationships by not using mappedBy at all. That will make navigation in both directions faster but will require updating both sides on every change (keeping them synchronized).
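
As a rough illustration of what "keeping them synchronized" can mean with two unidirectional relationships, here is a minimal plain-Java sketch. The names BidirectionalSyncSketch, Parent and Child are hypothetical stand-ins (not part of the test case above), and the JPA annotations are only mentioned in comments so the sketch runs without any persistence library. It is one possible approach, not a required pattern.

```java
// Hypothetical sketch: Parent and Child stand in for MyEntity and MyEntityChild.
final class BidirectionalSyncSketch {

    static final class Parent {
        private final String name;
        // In a real mapping this would be @OneToOne WITHOUT mappedBy,
        // so the reference is stored on the parent side as well.
        private Child child;

        Parent(String name) { this.name = name; }

        String getName() { return name; }

        Child getChild() { return child; }

        // Updates this side and then fixes up the other side if needed.
        void setChild(Child newChild) {
            if (child == newChild) return;          // nothing to do, stops recursion
            Child old = child;
            child = newChild;
            if (old != null && old.getParent() == this) old.setParent(null);
            if (newChild != null && newChild.getParent() != this) newChild.setParent(this);
        }
    }

    static final class Child {
        private final String name;
        // Also @OneToOne without mappedBy in a real mapping.
        private Parent parent;

        Child(String name) { this.name = name; }

        String getName() { return name; }

        Parent getParent() { return parent; }

        // Mirror image of Parent.setChild.
        void setParent(Parent newParent) {
            if (parent == newParent) return;        // nothing to do, stops recursion
            Parent old = parent;
            parent = newParent;
            if (old != null && old.getChild() == this) old.setChild(null);
            if (newParent != null && newParent.getChild() != this) newParent.setChild(this);
        }
    }

    public static void main(String[] args) {
        Parent p = new Parent("parent0");
        Child c = new Child("child0");
        p.setChild(c);                              // one call updates both references
        System.out.println(c.getParent().getName());
    }
}
```

With setters like these, application code changes one side and the other side follows, which is the extra bookkeeping that mappedBy otherwise avoids.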

ObjectDB Support
#5

Thanks, support, your explanation is very clear.

I understand the inverse query better now. Why did it take so long in ObjectDB?

Also, I have another question: is the following case possible?

P contains C

C has a mapped-by reference to P

The mapped-by reference to P is a primary key.

If the mapped-by reference to P is a non-existent column (tell me if I am wrong), then we can't put a primary key on it, right?

#6

The implementation of the operator IN was based on iteration in a loop, which is not efficient for large collections. The new implementation uses a HashSet.
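
To illustrate the difference in general terms, the following self-contained sketch compares a loop-based membership test with a HashSet-based one. It is not ObjectDB's internal code; the class and method names (InOperatorSketch, filterWithLoop, filterWithHashSet) and the data sizes are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not ObjectDB's implementation.
final class InOperatorSketch {

    // Loop-based IN: each candidate is compared against the collection one
    // element at a time, so the work grows with candidates * values.
    static <T> List<T> filterWithLoop(List<T> candidates, List<T> values) {
        List<T> matches = new ArrayList<>();
        for (T candidate : candidates) {
            for (T value : values) {
                if (candidate.equals(value)) {
                    matches.add(candidate);
                    break;
                }
            }
        }
        return matches;
    }

    // Hash-based IN: the collection is hashed once, then each membership
    // test is a single lookup.
    static <T> List<T> filterWithHashSet(List<T> candidates, List<T> values) {
        Set<T> lookup = new HashSet<>(values);
        List<T> matches = new ArrayList<>();
        for (T candidate : candidates) {
            if (lookup.contains(candidate)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> candidates = new ArrayList<>();
        List<String> values = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) candidates.add("parent" + i);
        for (int i = 0; i < 10_000; i++) values.add("parent" + i);

        long t0 = System.nanoTime();
        int loopMatches = filterWithLoop(candidates, values).size();
        long t1 = System.nanoTime();
        int hashMatches = filterWithHashSet(candidates, values).size();
        long t2 = System.nanoTime();

        System.out.println("loop:    " + loopMatches + " matches, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("hashset: " + hashMatches + " matches, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods return the same matches; only the cost of each membership test differs, which is why the gap widens as the collection passed to IN grows.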

Please use new forum threads for new questions (and that specific question should be clarified by an example, i.e. at least a code fragment if not a runnable test). Your questions may be relevant to other users, and if they do not match the title of the forum thread it is virtually impossible to reach them by search later.

ObjectDB Support
