Left join fetch behaviour doesn't retrieve children?

#1

Hello,

I'm a little bit confused about a left join fetch behaviour.

In this test case, i try to fetch my 3 children using a left join fetch.
But my collection is null after closing the em.

It was expected to be fetched before closing the em, no ?

In fact, that is the objective of the fetch purpose (i guess ?).

import java.util.HashSet;
import java.util.Set;

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Id;
import javax.persistence.ManyToMany;
import javax.persistence.Persistence;
import javax.persistence.TypedQuery;

public class ObjectDbTest {

    public static void main(String[] args)  {
     
        EntityManagerFactory emf = Persistence.createEntityManagerFactory(
                "objectdb:$objectdb/db/test.tmp;drop");

        EntityManager em = null ;

        // Merge parent1
        em = emf.createEntityManager();
        if (! em.getTransaction().isActive()) {
            em.getTransaction().begin();
        }

        MyEntity e1 = new MyEntity("parent1");
        e1.addChild(String.valueOf("child1"));
        e1.addChild(String.valueOf("child2"));
        e1.addChild(String.valueOf("child3"));
        em.merge(e1);

        // Ok, seems to be well merged
        em.getTransaction().commit();
        em.clear();
        em.close();

        // Try to retrieve parent and children using left join fetch
        em = emf.createEntityManager();

        TypedQuery<MyEntity> query =
               em.createQuery("SELECT DISTINCT m FROM MyEntity m JOIN FETCH m.entityChildren", MyEntity.class);
      
        MyEntity myEntity = query.getSingleResult();
       
        // Ok my em is now closed, let's check if children are retrieved
        em.close();
        emf.close();

        if (myEntity.getEntityChildren() == null) {
         System.out.println("Children not fetched !"); // We fall in this case, it is excepted ?
        } else {
         System.out.println("Number of children = " + myEntity.getEntityChildren().size());
        }
    }

    @Entity
    public static class MyEntity {

        @Id
        private String name;

        public String getName() {
   return name;
  }

  public void setName(String name) {
   this.name = name;
  }

  // Also tested with List
        private Set <MyEntityChild> entityChildren = new HashSet <MyEntityChild> ();

        MyEntity(String name) {
            this.name = name;
        }

        public void addChild(String name) {
         entityChildren.add(new MyEntityChild(name));
        }
       
        @ManyToMany(targetEntity = MyEntityChild.class, cascade = CascadeType.ALL)
        public Set<MyEntityChild> getEntityChildren() {
            return this.entityChildren;
        }
       
        public void setEntityChildren(Set <MyEntityChild> entityChildren) {
         this.entityChildren = entityChildren ;
        }
    }

    @Entity
    public static class MyEntityChild {

        @Id
        private String name;

        MyEntityChild(String name) {
            this.name = name;
        }
    }
}

Regards,

Xirt

#2

Thank you for this report. Build 2.6.2_03 should fix this issue.

ObjectDB Support
#3

Thx you support !

That's perfect !

Don't you have some unit tests to test every night these kind of features ?

#4

Unit tests are automatically run on every build as part of the build process, but unfortunately in this case the test verified that the data is fetched with the query results (which worked, no additional access to the database was required to access the data, if done before closing the EntityManager) but not that the relationship is activated. The unit tests were also fixed following your report.

ObjectDB Support
#5

Hello Support,

Another strange bug ...

I just added an attribute with mappedBy to the parent.

This just break the test ! 

Test by yourself by adding / removing the mappedBy on the OneToOne annotation.

Note that i don't create any entity on that attribute.

The test case :

public class ObjectDbTest {

public static void main(String[] args) {

  EntityManagerFactory emf = Persistence.createEntityManagerFactory("objectdb:$objectdb/db/test.tmp;drop");

  EntityManager em = null;

  // Merge parent1
  em = emf.createEntityManager();
  if (!em.getTransaction().isActive()) {
   em.getTransaction().begin();
  }

  MyEntity e1 = new MyEntity("parent1");
  e1.addChild(String.valueOf("child1"));
  e1.addChild(String.valueOf("child2"));
  e1.addChild(String.valueOf("child3"));

  em.merge(e1);

  // Ok, seems to be well merged
  em.getTransaction().commit();
  em.clear();
  em.close();

  // Try to retrieve parent and children using left join fetch
  em = emf.createEntityManager();

  TypedQuery<MyEntity> query = em
    .createQuery(
      "SELECT DISTINCT m FROM MyEntity m JOIN FETCH m.entityChildren",
      MyEntity.class);

  MyEntity myEntity = query.getSingleResult();

  // Ok my em is now closed, let's check if children are retrieved
  em.clear();
  em.close();
  emf.close();

  if (myEntity.getEntityChildren() == null) {
   System.out.println("Children not fetched !");
  } else {
   System.out.println("Number of children = "
     + myEntity.getEntityChildren().size());
  }
}

@Entity
public static class MyEntity {

  @Id
  private String name;

  public String getName() {
   return name;
  }

  public void setName(String name) {
   this.name = name;
  }

  private List<MyEntityChild> entityChildren = new ArrayList<MyEntityChild>();

  private MyEntityChild2 entityChild2 = null;

  MyEntity(String name) {
   this.name = name;
  }

  // @OneToOne(targetEntity = MyEntityChild2.class)
  @OneToOne(targetEntity = MyEntityChild2.class, mappedBy = "myEntity") // This mapped break the test
  public MyEntityChild2 getEntityChild2() {
   return entityChild2;
  }

  public void setEntityChild2(MyEntityChild2 entityChild2) {
   this.entityChild2 = entityChild2;
  }

  public void addChild(String name) {
   entityChildren.add(new MyEntityChild(name));
  }

  @ManyToMany(targetEntity = MyEntityChild.class)
  public List<MyEntityChild> getEntityChildren() {
   return this.entityChildren;
  }

  public void setEntityChildren(List<MyEntityChild> entityChildren) {
   this.entityChildren = entityChildren;
  }
}

@Entity
public static class MyEntityChild {

  @Id
  private String name;

  MyEntityChild(String name) {
   this.name = name;
  }
}

@Entity
public static class MyEntityChild2 {

  @Id
  private String name;

  private MyEntity myEntity;

  public MyEntity getMyEntity() {
   return myEntity;
  }

  public void setMyEntity(MyEntity myEntity) {
   this.myEntity = myEntity;
  }

  MyEntityChild2(String name) {
   this.name = name;
  }
}

}
#6

Support,

Should i post an issue for that ?

#7

Thank you for this report. Please try build 2.6.2_04 that should fix this issue.

ObjectDB Support
#8

Well, seems to be ok support :)

Thx for your work and your speed to fix it.

But i'm curious ... why "mappedBy" broke the left join fetch ?

#9

It does seem weird (and probably one of the more bizarre issues we ever had). It is difficult to explain the details, but there was a collision between two different fetch operations.

ObjectDB Support
#10

Support,

I have a question for you ...

I have implemented a lazy left join fetch feature for all my collection beans.

But ... in some cases doing left join (+ distinct) retrieve less parent entities than no left join fetch.

Example :

Doing count(*) of MyEntity => 50000 entities
Doing list(MyEntity) => 50000 entities

Doing list(MyEntity).leftJoinFetch(Collection1) => 40000 entities

Depending of collection1 items number, we might sometimes loose the parent entity record.

Well, tell me if am wrong :

a)
MyEntity contains 10 items of collection1;
Doing leftJoinFetch + distinct bring me 1 MyEntity with a list of 10 items.Ok great

b)
MyEntity contains 0 item of collection1;
Doing leftJoinFetch + distinct should bring me 0 MyEntity or 1 MyEntity with 0 collection1 items ? ( in my opinion 1 Entity with 0 collection1 items ).

But this behaviour doesn't always happens.

In my test case, i can't reproduce the issue, it only happen in my server.

Can you confirm me that we MUST retrieve at least 1 MyEntity item no matter how much collection1 items we have in the bag ? ( and no matter kind of JPA annotation @OneToMany, @ManyToMany we have )

If Yes, we should inspect the problem in a short test case.

Thank You,
Xirt

#11

> Can you confirm me that we MUST retrieve at least 1 MyEntity item no matter how much collection1 items we have in the bag? ( and no matter kind of JPA annotation @OneToMany, @ManyToMany we have )

This is correct. If you can isolate the issue with a test case it may be very helpful.

ObjectDB Support
#12

Ok support, i have found the cause of that issue but it may be difficult to provide you a test case.

In fact, when we are doing a left join on an attribute which is also indexed we may have some weird behaviour.

Sometime (a broken index ?) has for consequences a bad retrieve of MyEntity records. So some records just disappears !

I don't know if my index has been broken ( may i have some indications to test that ? ) or if that behaviour happen also on right indexes.

In my short test case with index on left joined attribute, i haven't succeed to reproduce the issue.

Support, can you confirm that :

1- Broken index could happen ?

2- How to check if index is broken ?

3- If previous anwser if "test with doctor", have you an other manner to check without doctor ( work bad with my database ) ?

4- Will you plan to invest more in that problem and provide a solution ?

Thanks,

Xirt

Note : Removing index resolve the issue but i need it.

#13

1- There are currently no known issues regarding indexes that become broken, but everything could happen. It is more likely that the index is fine and a bad optimization causes this in a specified case.

2- Using the Doctor. You can also run it in repair mode and then the index will be rebuilt cleanly.

3- You can try "SELECT e.indexedAttr, e.id FROM MyEntity e ORDER BY e.indexedAttr" and check the results.

4- Sure. But unfortunately it seems that a test case will be needed.

 

ObjectDB Support
#14

Well ... that's enough for me, i give up.

I have spent too many hours to find a test case and if i find it, it will be the third bug about the "join fetch" feature.
3 bugs discovered in few days ... i'm afraid to find more in the future, so let's stop the investigation.

I switched back to ALL EAGER ( seems to be faster ... ) and stoped all kind of lazy initialization.

Left join is not enough mature in objectdb to use in a production environment.

I'm waiting for JPA entity graph to be implemented in ObjectDb.

Regards,
Xirt

#15

If ALL EAGER is faster in your application then you should use it anyway. We will take into account your reports regarding JOIN FETCH. Obviously more extensive tests are required in order to cover this feature better.

ObjectDB Support
#16

Although you are moving back to ALL EAGER, will you be able to check JOIN FETCH with some optimization disabled? This could help us understand where is the issue.

ObjectDB Support
#17

More than simple unit test case, I think you should realise some complex test case ( multi join fetch + various jpa collection combinaison + various cascade combinaison + merge / persist parents and childs + add indexes ) to bere sure all is fine no matter the case complexity. ( watch the mapped-by + left join test case, that was so weird, but could be prevented by complete scenario )

> Although you are moving back to ALL EAGER, will you be able to check JOIN FETCH with some optimization disabled? This could help us understand where is the issue.

Yes, without index on the collection, all if fine, but performances are weak.

Xirt

#18

Agreed. Note that we have nearly 100% coverage with unit tests. However, some bugs require complex tests to be discovered, and even the test that you described may miss some...

Build 2.6.2_07 includes two new system properties that disable some query optimization:

  • objectdb.temp.no-btree-batch
  • objectdb.temp.no-multivar-batch

Setting them as "true" on the server (or in embedded mode but not on the client side in client server) will disable two query optimizations, e.g. with JVM args:

java ... -Dobjectdb.temp.no-btree-batch=true -Dobjectdb.temp.no-multivar-batch=true

Please report if disabling these optimizations makes any difference.

ObjectDB Support
#19

Note: This post was moved to a separate forum thread.

Well ! I think i pointed out a HUGE performance issue with mapped-by attribute (yeah again :( ).

Here is the test case :

1- Create 100 000 entities with for each 1 entity with @OneToOne(mappedBy attribute)

2- Retrieve only 10 000 for testing

3- Wait ... On my AMD FX 8350, it took 60s

Now, go the MyEntity class and remove "mapped-by".

You obtain 1 second max of query fetch.

So my performance issue in my application is more due to that problem than the left join one. ( Note : The left join index pb still exists, i will test your optimisation disabled latter )

So, the test case, enjoy :

The entity :

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToOne;

@Entity
public class MyEntity {

@Id
private String name;

public MyEntity(String name) {
  this.name = name;
}

public String getName() {
  return name;
}

public void setName(String name) {
  this.name = name;
}

private MyEntityChild entityChild = null;
 
@OneToOne(targetEntity = MyEntityChild.class, cascade = CascadeType.ALL,mappedBy = "myEntity") // Test by adding / remove "mappedBy"
public MyEntityChild getEntityChild() {
  return entityChild;
}
 
public void setEntityChild(MyEntityChild entityChild) {
  this.entityChild = entityChild;
}

@Entity
public static class MyEntityChild {
 
   @Id
   private String name;
 
   private MyEntity myEntity;
 
   public MyEntity getMyEntity() {
    return myEntity;
   }
 
   public void setMyEntity(MyEntity myEntity) {
    this.myEntity = myEntity;
   }
 
   MyEntityChild(String name) {
    this.name = name;
   }
}
}

The test case :

public class ObjectDbTest {

public static void main(String[] args) {

  EntityManagerFactory emf = Persistence.createEntityManagerFactory("objectdb:$objectdb/db/test.tmp;drop");

  EntityManager em = null;

  em = emf.createEntityManager();
  if (!em.getTransaction().isActive()) {
   em.getTransaction().begin();
  }

  // Write 100 000 entities with 1 child for each
  for (int i=0;i<100000;++i) {
   MyEntity e1 = new MyEntity("parent" + i);
   MyEntityChild child2 = new MyEntityChild("child" +i);
   child2.setMyEntity(e1);
   e1.setEntityChild(child2);
   em.merge(e1);
  }

  em.getTransaction().commit();
  em.clear();
  em.close();

  em = emf.createEntityManager();

  Long start = new Date().getTime();
 
  TypedQuery<MyEntity> query = em.createQuery("SELECT m FROM MyEntity m", MyEntity.class);
 
  // Retrieve only 10 000
  List <MyEntity> entities = query.setMaxResults(10000).getResultList();

  Long end = new Date().getTime();

  Long duration = (end - start) / 1000;
  System.out.println("Duration : " + duration + " seconds");
 
  em.close();
  emf.close();
 
  // Be sure we retrieve child
  if (entities.get(0).getEntityChild() == null) {
   System.out.println("FAILED TO FETCH !");
  }
}

Regards, 
Xirt

Note: This post was moved to a separate forum thread.

#20

Support,

I have done some benchmarks ( after having removed the bugged mapped-by attribute )

I have 193152 entities in my database

LEFT JOIN FETCH (5 Collections)
Start : Wed Jun 17 22:41:35 CEST 2015
End : Wed Jun 17 22:43:14 CEST 2015
145020 entities retrieved ( fetch bug ! 193152 entities should be retrieved )
Duration : ~2 minutes

ALL EAGER ( 5 collections )

Start : Wed Jun 17 22:44:05 CEST 2015
End : Wed Jun 17 22:44:27 CEST 2015
193152 entities retrieved
Duration : 22 seconds

LAZY INIT on call size() on each collection ( 5 collections )

Start : Wed Jun 17 22:55:25 CEST 2015
End : Wed Jun 17 22:55:42 CEST 2015
193152 entities retrieved
Duration : 17 seconds

LEFT JOIN is just catastrophic on objectdb.

I think the combo Hibernate + H2 works better ( see http://zeroturnaround.com/rebellabs/how-to-use-jpa-correctly-to-avoid-complaints-of-a-slow-application/ ). Indeed for H2 + Hibernate that's the best solution ! And the worst for ObjectDb.

Also ...

In objectDb the N+1 problem (mentionned in the website) seems to not exist. Performances are also better than EAGER (need to be confirmed, it seems suspicious). In facts, looping in each object to lazy load each collection is the best solution.

To finish, eager fetch all collection is a bit slower than lazy load, that's surprising so ...

Support, if you can provide some tests or give some benchmarks, it could be great.

So, in brief :

-> Left join fetch is buggy, doesn't fetch some entities in some cases ( haven't tested no optimization yet )

-> Left join fetch performances are very very bad, i think hibernate + h2 work really better ( see benchmark )

-> Mapped by attribute is really buggy !!! In some case, it's 60x slower with it.

-> Lazy load of all collections seems to be better than All eager strategy (well here, this is very suspicious :/).

Support, you should add some benchmarks to your test case, because buggy implementations could make the test right but slow.

It's would also be really great to put these new benchmarks in your existing charts.

Regards,

Xirt

#21

Interesting results. Different methods are faster/slower in different conditions so there are no rules that match every situation, and the only way to find out what is best for a specific application is to benchmark the alternatives.

Regarding lazy initialization with size(), it looks strange that it is faster than EAGER if referenced objects are fetched. But check that the referenced objects are indeed retrieved, because ObjectDB can return a collection size and initialize the collection with hollow referenced objects.

Your report in #19 above was moved to a separate forum thread.

ObjectDB Support

Reply