ObjectDB

Issue #630: NullPointer on query

Type: Bug Report | Version: 2.3.6 | Priority: Critical | Status: Fixed | Replies: 10
#1

I'm getting a null pointer when running the following code:

TypedQuery<ObjectDbMessagePayload> query = em.createQuery("SELECT m FROM ObjectDbMessagePayload m WHERE m.id = :id", ObjectDbMessagePayload.class);
query.setParameter("id", msg.getId());
List<ObjectDbMessagePayload> loaded = query.getResultList();

Stack trace follows:

rbccm.felix.framework.ApplicationException: Error taking message from ObjectDB queue
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.take(Unknown Source)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.take(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.objectdb.o.InternalException: Unexpected internal exception
at com.objectdb.o.JPE.h(JPE.java:163)
at com.objectdb.o.ERR.f(ERR.java:69)
at com.objectdb.o.OBC.onObjectDBError(OBC.java:1493)
at com.objectdb.jpa.JpaQuery.getResultList(JpaQuery.java:695)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.getMessage(Unknown Source)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.takeWithRetry(Unknown Source)
... 4 more
Caused by: java.lang.NullPointerException
at com.objectdb.o.OBI.t(OBI.java:377)
at com.objectdb.o.OBI.r(OBI.java:264)
at com.objectdb.o.OBI.Vj(OBI.java:232)
at com.objectdb.o.BQI.Vs(BQI.java:144)
at com.objectdb.o.PRG.ag(PRG.java:734)
at com.objectdb.o.PRG.ae(PRG.java:663)
at com.objectdb.o.PRG.ad(PRG.java:539)
at com.objectdb.o.QRM.U5(QRM.java:259)
at com.objectdb.o.MST.U5(MST.java:947)
at com.objectdb.o.WRA.U5(WRA.java:290)
at com.objectdb.o.WSM.U5(WSM.java:113)
at com.objectdb.o.QRR.g(QRR.java:232)
at com.objectdb.o.QRR.b(QRR.java:151)
at com.objectdb.jpa.JpaQuery.getResultList(JpaQuery.java:686)
... 6 more

The code usually works fine, so I guess there is a problem with the object or the database. ObjectDbDoctor reports errors, so I've uploaded the database to the FTP site (uatsrtlonw342-WorkflowService-Existing_Instance-2.rar).

#2

The database file is corrupted. A post mortem analysis reveals that the last change to the only ObjectDbMessagePayload instance, in transaction #72,053, was not completely applied to the database (some pages have been updated and some have not).

This could be the result of killing the database process while it was writing to the file, but then there should be a recovery file that should be used to fix the database automatically. Did you get this error after a process kill? Do you have a recovery file that you can upload?

Unfortunately, assuming you were using build 2.3.5_05, the other possibility is that there is another bug in page management that could cause this and has not been fixed yet. If this is the case and your test can cause it, posting the test could be very helpful.

ObjectDB Support
#3

I'm afraid the exception occurs during normal running.

Based on your description, it's possible that the db file corruption is unrelated and occurred when the process was shut down manually after the error occurred, but it's hard to be certain.

The error has occurred in 2.3.5_05 and 2.3.5_07. It is recreatable within the application, but only towards the end of a 6hr batch! It's odd, as this is one of our "queue" databases and many similar message objects have passed through it before the problem occurs. As yet, I haven't found a pattern, but I will post a cut-down test as soon as I can. The same batch did not produce the error in versions earlier than 2.3.5_05, but I don't know whether this is down to chance or to something that changed in that version.

#4

I tried reverting to 2.3.5_04 and the error does occur there as well (see below). Interestingly, the exception is thrown from the find method rather than the query. The application is set to try find and then fall back to the query if null is returned. This is a legacy from when I had some trouble with eager loading, but either way there should definitely be an object there. No exceptions are thrown on the original persist/commit.

Full code is:

ObjectDbMessagePayload payload = em.find(ObjectDbMessagePayload.class, msg.getId());
if (payload == null) {
    // if we didn't find anything, try again with a query
    TypedQuery<ObjectDbMessagePayload> query = em.createQuery("SELECT m FROM ObjectDbMessagePayload m WHERE m.id = :id", ObjectDbMessagePayload.class);
    query.setParameter("id", msg.getId());
    List<ObjectDbMessagePayload> loaded = query.getResultList();
    payload = loaded.isEmpty() ? null : loaded.get(0);
}

Exception from 2.3.5_04 is:

com.objectdb.o.InternalException: null
com.objectdb.o.InternalException
at com.objectdb.o.BYR.s(BYR.java:113)
at com.objectdb.o.BYR.A(BYR.java:206)
at com.objectdb.o.MUT.readAndAdjust(MUT.java:388)
at com.objectdb.o.UMR.readAndAdjust(UMR.java:632)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePayload.__odbReadContent(Unknown Source)
at com.objectdb.o.MMM.ag(MMM.java:1046)
at com.objectdb.o.UTY.aI(UTY.java:1253)
at com.objectdb.o.UTY.aH(UTY.java:1225)
at com.objectdb.o.ENH.b(ENH.java:102)
at com.objectdb.o.LDR.x(LDR.java:444)
at com.objectdb.o.LDR.UV(LDR.java:669)
at com.objectdb.o.MST.aT(MST.java:522)
at com.objectdb.o.MST.aS(MST.java:454)
at com.objectdb.o.MST.U2(MST.java:427)
at com.objectdb.o.WRA.U2(WRA.java:248)
at com.objectdb.o.LDR.w(LDR.java:355)
at com.objectdb.o.LDR.v(LDR.java:293)
at com.objectdb.o.LDR.s(LDR.java:211)
at com.objectdb.o.OBC.aM(OBC.java:1058)
at com.objectdb.o.OBC.aK(OBC.java:971)
at com.objectdb.jpa.EMImpl.find(EMImpl.java:551)
at com.objectdb.jpa.EMImpl.find(EMImpl.java:474)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.getMessage(Unknown Source)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.takeWithRetry(Unknown Source)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.take(Unknown Source)
at rbccm.felix.objectdb.messaging.ObjectDbMessagePipe.take(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)
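
The find-then-query fallback described in this post can be sketched in isolation as follows. This is only an illustration, not the application's code: the class FindWithFallback and the method load are hypothetical names, and the two lookups are passed in as suppliers so the sketch runs without a database. It shows why the fallback never runs in the trace above: the query is tried only when the first lookup returns null, and an exception from the first lookup simply propagates.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper (not part of ObjectDB or the application in this thread).
public final class FindWithFallback {

    private FindWithFallback() {
    }

    /**
     * Runs the primary lookup first and uses the fallback query only when
     * the primary lookup returns null. Exceptions are not caught, so a
     * failure in the primary lookup propagates without trying the fallback.
     */
    public static <T> T load(Supplier<T> primaryLookup, Supplier<List<T>> fallbackQuery) {
        T found = primaryLookup.get();
        if (found != null) {
            return found;
        }
        List<T> results = fallbackQuery.get();
        return results.isEmpty() ? null : results.get(0);
    }

    public static void main(String[] args) {
        // Primary lookup succeeds: the fallback supplier is never invoked.
        String hit = FindWithFallback.<String>load(
                () -> "from-find",
                () -> Collections.singletonList("from-query"));

        // Primary lookup returns null: the first query result is used.
        String viaQuery = FindWithFallback.<String>load(
                () -> null,
                () -> Collections.singletonList("from-query"));

        // Both lookups come back empty: the result is null.
        String missing = FindWithFallback.<String>load(
                () -> null,
                () -> Collections.<String>emptyList());

        System.out.println(hit + " " + viaQuery + " " + missing);
    }
}
```

With the code from this post, the first supplier would wrap the em.find call and the second would wrap query.getResultList().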

 

#5

This is probably a problem that has existed for a long time (and not a regression).

If you can share a test that causes it, that could help.

In this case the test can be complex since the bad result is very clear. Your last test was very helpful even though I didn't look at the source at all - it was used only for running and reproducing the problem.

ObjectDB Support
#6

Yep - desperately trying to find a way of recreating it. At the moment it only seems to occur in a real application batch, and this is too heavy to send as an example (6hrs runtime backed up by a large grid...). My attempts to recreate the issue in a cut-down version of the app have been unsuccessful so far.

Is there anything you can suggest which would help narrow down the issue? Should the runtime make any difference?

#7

Maybe if you enable recording it would be possible to recreate the problem by replaying the recorded transactions. At least it might be worth trying.

ObjectDB Support
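
For readers following the recording suggestion above, a minimal sketch of what enabling it might look like in the ObjectDB configuration file (objectdb.conf). The recording element and its attributes are written from memory of the default configuration, so treat the exact attribute names and values as assumptions and check them against the manual for the build in use.

```xml
<!-- Sketch only: attribute names and values are assumptions to verify against the manual. -->
<database>
    <!-- Recording is typically disabled by default; enabled="true" turns it on. -->
    <!-- path="." is assumed to place the recording files next to the database file. -->
    <recording enabled="true" sync="false" path="." mode="write" />
</database>
```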
#8

I just looked again at the page cache fix (issues #116, #119, #121) and apparently the fix is incomplete. In version 2.3.6, newer pages can still be purged from the cache, leaving older pages (from previous transactions).

Hopefully build 2.3.6_01 completes the fix.

Please check the new build on a new clean database.

ObjectDB Support
#9

Great news! I'll let you know how it goes.

In other news, I tried a run with recorded transactions, but the error did not occur.

#10

Initial tests are very positive - the batches that were consistently failing are now running OK. I'll keep a close eye on it over the coming weeks, but it looks like this issue can finally be closed. Thanks!

#11

Good. Thank you for the update and thank you again for this very important report.

ObjectDB Support