Issue #563: ArrayIndexOutOfBoundsException on flush

Type: Bug ReoprtVersion: 1.4.0Priority: NormalStatus: ClosedReplies: 11
#1

The following exception was thrown when my app attempted to flush an objectdb database:

Caused by: rbccm.felix.framework.ApplicationException: Error flushing graph container
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.flush(Unknown Source)
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.<init>(Unknown Source)
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.<init>(Unknown Source)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:100)
... 32 more
Caused by: com.objectdb.o.InternalException: Unexpected internal exception
at com.objectdb.o.JPE.h(JPE.java:163)
at com.objectdb.o.ERR.f(ERR.java:69)
at com.objectdb.o.OBC.onObjectDBError(OBC.java:1485)
at com.objectdb.jpa.JpaQuery.executeUpdate(JpaQuery.java:737)
... 40 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at com.objectdb.o.SEM.c(SEM.java:77)
at com.objectdb.o.PBI.q(PBI.java:98)
at com.objectdb.o.OBI.Vi(OBI.java:236)
at com.objectdb.o.BQI.Vr(BQI.java:144)
at com.objectdb.o.PRG.ah(PRG.java:875)
at com.objectdb.o.PRG.ad(PRG.java:607)
at com.objectdb.o.PRG.ac(PRG.java:536)
at com.objectdb.o.QRM.U4(QRM.java:259)
at com.objectdb.o.MST.U4(MST.java:947)
at com.objectdb.o.WRA.U4(WRA.java:290)
at com.objectdb.o.WSM.U4(WSM.java:113)
at com.objectdb.o.QRR.g(QRR.java:232)
at com.objectdb.o.QRR.b(QRR.java:151)
at com.objectdb.jpa.JpaQuery.executeUpdate(JpaQuery.java:722)
... 40 more

The code at this point is performing a delete query for all entity objects that the database contains:

EntityManager em = _emf.createEntityManager();
  em.getTransaction().begin();
  try {
   Query query = em.createQuery("DELETE FROM ExecutionGraph g");
   query.executeUpdate();
  
   query = em.createQuery("DELETE FROM WorkflowInstance w");
   query.executeUpdate();
  
   query = em.createQuery("DELETE FROM TaskInstance t");
   query.executeUpdate();
  
   query = em.createQuery("DELETE FROM TaskNode n");
   query.executeUpdate();
  
   em.getTransaction().commit();
  } catch(Exception e) {
   if(em.getTransaction().isActive())
    em.getTransaction().rollback();
   throw new ApplicationException("Error flushing graph container", e);
  } finally {
   em.close();
  }

I've not had any problems with this operation before today. Looking at the database in explorer it looks like some of the "WorkflowInstance" objects have failed to persist (explorer lists an exception message rather than the object).

The following error message from my application monitor seems to be related but unfortunately I dont have a full stack trace as the app went into a commit retry loop which blew out the logs:

Attempt to commit a transaction after failure in writing to database file 'F:\calculationservice\objectdb\dbs\workflow\uatsrtlonw342-WSAgent-1-Graphs.odb'

I've uploaded the database in question and the objectdb logfile via ftp in case it can shed any additional light on the problem. The filename is ArrayIndexOutOfBoundsException.zip

I appreciated that this may not be enough information to go on - if that is the case feel free to close the issue until I can recreate in a controlled environment.

 

#2

The full exception from the error message I mentioned above is this:

[ObjectDB 2.3.2_08] javax.persistence.PersistenceException
Attempt to commit a transaction after failure in writing to database file 'F:\calculationservice\objectdb\dbs\workflow\uatsrtlonw342-WSAgent-1-Graphs.odb' (error 144)
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.putIfAbsent(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.processGraph(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.processWorkflows(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.processWorkflow(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowCompleteForSomeTasks(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.objectdb.o.UserException: Attempt to commit a transaction after failure in writing to database file 'F:\calculationservice\objectdb\dbs\workflow\uatsrtlonw342-WSAgent-1-Graphs.odb'
at com.objectdb.o.MSG.d(MSG.java:74)
at com.objectdb.o.SFL.ad(SFL.java:823)
at com.objectdb.o.MST.ad(MST.java:1339)
at com.objectdb.o.MST.U9(MST.java:1229)
at com.objectdb.o.WRA.U9(WRA.java:355)
at com.objectdb.o.WSM.U9(WSM.java:146)
at com.objectdb.o.OBM.bL(OBM.java:945)
at com.objectdb.o.OBM.bI(OBM.java:818)
at com.objectdb.o.OBM.flush(OBM.java:739)
... 10 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at com.objectdb.o.HTB.t(HTB.java:448)
at com.objectdb.o.PGC.B(PGC.java:401)
at com.objectdb.o.SFL.ah(SFL.java:991)
at com.objectdb.o.SFL.run(SFL.java:899)
... 1 more
#3

Unfortunately the database that you have uploaded is totally corrupted (as shown by running the Doctor) so using it would generate various exceptions, but these exceptions cannot help in finding the problem.

Of course, if you can reproduce it with a test case it would be very helpful, but I understand that this might not be possible at the moment.

Please provide more details:

  1. Is this a one time problem or can you cause it again in your environment (with a new clean database)?
  2. Have you used replication? Is it the master database or a replicated slave database?
  3. Did you move or copy the database file before this happened?
  4. Did you have a database recovery before this happened?
  5. Do you have the full transaction history (recording) in an odr directory? If you have it the corruption may be reproduced by using the Replayer.
ObjectDB Support
#4

To answer your questions:

1) The problem is re-producable but only at the application level at the moment. I will try to put together a test that replicates it but we're not clear what is happening yet

2) No replication is in use - the database in question is embedded

3) The database was created in place by objectdb (i.e. on a clean file system). It hasn't been moved or copied

4) No recovery - the database is new & empty upon startup

5) No but I'll switch this on if you think it would be useful

The current theory is that this may be related to the size of the objects that are being stored. The process in test was working ok until the start of this week at which point data sizes increased substantially.

#5

Recording transactions could help if replaying them reproduce the problem and you can share these logs.

Large objects (> ~2KB) are handled differently and many problems in early versions of ObjectDB 2.x where related to managing large objects.

Do you use update queries? Update queries are new and maybe they do not handle large objects correctly.

ObjectDB Support
#6

I've uploaded the odr directory to the ftp site, filename is Issue87_odr.zip

The first error to occur in the application was:

[ObjectDB 2.3.2_04] javax.persistence.EntityNotFoundException
Entity is not found: rbccm.digest2.execution.TaskNode#397 (error 631)
at rbccm.digest2.execution.TaskNode.__odbSet__status(Unknown Source)
at rbccm.digest2.execution.TaskNode.setStatus(Unknown Source)
at rbccm.digest2.execution.ExecutionGraph.workflowCompleteTaskExecution(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowCompleteForSomeTasks(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)

Closely followed by:

[ObjectDB 2.3.2_04] javax.persistence.PersistenceException
Failed to locate primary key (rbccm.digest2.execution.TaskNode:393) (error 644)
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.remove(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.cleanup(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowCompleteforThisRunner(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowError(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowCompleteForSomeTasks(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.objectdb.o.UserException: Failed to locate primary key (rbccm.digest2.execution.TaskNode:393)
at com.objectdb.o.MSG.d(MSG.java:61)
at com.objectdb.o.PPW.aq(PPW.java:92)
at com.objectdb.o.PGW.am(PGW.java:210)
at com.objectdb.o.UPT.B(UPT.java:134)
at com.objectdb.o.UPT.l(UPT.java:110)
at com.objectdb.o.TSK.i(TSK.java:146)
at com.objectdb.o.TSK.f(TSK.java:95)
at com.objectdb.o.UPT.r(UPT.java:157)
at com.objectdb.o.PGT.q(PGT.java:109)
at com.objectdb.o.UPT.B(UPT.java:121)
at com.objectdb.o.URT.l(URT.java:171)
at com.objectdb.o.TSK.i(TSK.java:146)
at com.objectdb.o.TSK.f(TSK.java:95)
at com.objectdb.o.TSM.e(TSM.java:87)
at com.objectdb.o.UTT.z(UTT.java:362)
at com.objectdb.o.UTT.l(UTT.java:203)
at com.objectdb.o.TSK.i(TSK.java:146)
at com.objectdb.o.TSK.f(TSK.java:95)
at com.objectdb.o.MST.U9(MST.java:1230)
at com.objectdb.o.WRA.U9(WRA.java:355)
at com.objectdb.o.WSM.U9(WSM.java:146)
at com.objectdb.o.WRA.U9(WRA.java:355)
at com.objectdb.o.WSN.U9(WSN.java:615)
at com.objectdb.o.OBM.bL(OBM.java:941)
at com.objectdb.o.OBM.bI(OBM.java:814)
at com.objectdb.o.OBM.flush(OBM.java:735)
... 10 more

(Btw, i have reverted to 2.3.2_04 to rule out the build 08 being the cause - i will reapply this update as the same problem seems to occur...)

#7

I updated to 2.3.3 and get the same errors as above.

The entity referred to in the first exception is stored in a HashMap object within the parent entity:

@OneToMany(orphanRemoval=true, cascade=CascadeType.ALL)
private Map<String, TaskNode> _nodes;

The TaskNode contains a further Entity object - TaskInstance - which is the object that will be holding the large data. The data itself is already compressed (fortunately this aspect of the application doesnt deal with the uncompressed data) and is help as a string. The size is certainly larger than 2k and is likely to be in MB.

The second exception is a cleanup response to the first. An attempt to remove the object tree from the database is made as follows:

    IExecutionGraph loadedgraph = em.find(ExecutionGraph.class, id);
     if(loadedgraph != null) {
         em.remove(loadedgraph);
         em.flush();
     }

Transactions are maintained at the thread level rather than operation but in this cleanup the transaction commit will be the next action to occur

#8

Don't know if this will be of any further use but in the course of making changes for the investigation I'm now seeing the following exception:

[ObjectDB 2.3.3] Unexpected exception (Error 990)
  Generated by Java HotSpot(TM) 64-Bit Server VM 1.6.0_26 (on Windows 2003 5.2).
Please report this error on http://www.objectdb.com/database/issue/new
com.objectdb.o.InternalException: java.lang.ArrayIndexOutOfBoundsException: 27210
java.lang.ArrayIndexOutOfBoundsException: 27210
at com.objectdb.o.BYR.A(BYR.java:892)
at com.objectdb.o.BYR.A(BYR.java:207)
at com.objectdb.o.VUT.j(VUT.java:332)
at com.objectdb.o.PGT.q(PGT.java:133)
at com.objectdb.o.RST.A(RST.java:93)
at com.objectdb.o.RTT.l(RTT.java:132)
at com.objectdb.o.RST.l(RST.java:24)
at com.objectdb.o.RTT.C(RTT.java:179)
at com.objectdb.o.RST.r(RST.java:110)
at com.objectdb.o.PGT.q(PGT.java:109)
at com.objectdb.o.RST.A(RST.java:93)
at com.objectdb.o.RTT.l(RTT.java:132)
at com.objectdb.o.RST.l(RST.java:24)
at com.objectdb.o.TSK.i(TSK.java:146)
at com.objectdb.o.TSK.f(TSK.java:95)
at com.objectdb.o.TSM.e(TSM.java:87)
at com.objectdb.o.RTT.C(RTT.java:177)
at com.objectdb.o.RST.r(RST.java:110)
at com.objectdb.o.PGT.q(PGT.java:109)
at com.objectdb.o.RST.A(RST.java:93)
at com.objectdb.o.RTT.l(RTT.java:132)
at com.objectdb.o.RST.l(RST.java:24)
at com.objectdb.o.TSK.i(TSK.java:146)
at com.objectdb.o.TSK.f(TSK.java:95)
at com.objectdb.o.MST.aT(MST.java:582)
at com.objectdb.o.MST.U2(MST.java:541)
at com.objectdb.o.WRA.U2(WRA.java:260)
at com.objectdb.o.WRA.U2(WRA.java:260)
at com.objectdb.o.WSN.U2(WSN.java:337)
at com.objectdb.o.LDR.w(LDR.java:381)
at com.objectdb.o.LDR.v(LDR.java:292)
at com.objectdb.o.LDR.s(LDR.java:210)
at com.objectdb.o.OBC.aM(OBC.java:1056)
at com.objectdb.o.OBC.aK(OBC.java:969)
at com.objectdb.jpa.EMImpl.find(EMImpl.java:551)
at com.objectdb.jpa.EMImpl.find(EMImpl.java:474)
at rbccm.felix.objectdb.workflow.ObjectDbGraphContainer.get(Unknown Source)
at rbccm.digest2.execution.Workflow2Runner.workflowCompleteForSomeTasks(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.gridservice.workflow.WorkflowService.process(Unknown Source)
at rbccm.felix.framework.service.ServiceRunner.run(Unknown Source)
at java.lang.Thread.run(Thread.java:662)
#9

Unfortunately the additional stack traces do not provide more info and replaying the recorded transactions fail.

I am afraid a test program is required to solve this problem.

ObjectDB Support
#10

Ok, will attempt to put a test together but we're taking a slightly different root at the application level to work around this so it is less of a problem now. Feel free to close the issue for now and I will reopen if I can get the test operational.

#11

I hope you still have the code that caused these exceptions.

Please keep it for checking against future ObjectDB versions (until this problem is fixed).

There is a chance that a synchronized block that was added recently may solve the problem, so if you can try your code with the last build of ObjectDB (with a new database) and report the results - it may help.

ObjectDB Support
#12

This issue will be closed since the problem is not related to anything specific that is discussed here but to working with a corrupted database file. A new issue should be created if necessary.

ObjectDB Support

Reply