Issue #1228: Negative snapshot user count exception

Type: Bug Report | Version: 2.5.2 | Priority: Normal | Status: Fixed | Replies: 8
#1

I found the following stack trace in my log files; it's cropped up a number of times:

[2013-08-23 14:05:23 #187 *]
[ObjectDB 2.5.2] Unexpected exception (Error 990)
  Generated by Java HotSpot(TM) Server VM 1.6.0_27 (on Linux 2.6.35.14-95.38.amzn1.i686).
Please report this error on http://www.objectdb.com/database/issue/new
com.objectdb.o.InternalException: Negative snapshot user count
com.objectdb.o.InternalException: Negative snapshot user count
        at com.objectdb.o.SNP.D(SNP.java:344)
        at com.objectdb.o.SFL.aa(SFL.java:785)
        at com.objectdb.o.MST.Va(MST.java:1745)
        at com.objectdb.o.PRG.au(PRG.java:1168)
        at com.objectdb.o.PRG.ag(PRG.java:718)
        at com.objectdb.o.PRG.af(PRG.java:553)
        at com.objectdb.o.QRM.U6(QRM.java:265)
        at com.objectdb.o.MST.U6(MST.java:933)
        at com.objectdb.o.WRA.U6(WRA.java:293)
        at com.objectdb.o.WSM.U6(WSM.java:114)
        at com.objectdb.o.STC.r(STC.java:450)
        at com.objectdb.o.SHN.aj(SHN.java:489)
        at com.objectdb.o.SHN.K(SHN.java:156)
        at com.objectdb.o.HND.run(HND.java:132)
        at java.lang.Thread.run(Thread.java:662)

[2013-08-23 14:05:23 #188 *]
transaction Id = 506009, new file size = 297009152, dirty pages = 1, update lists = 2, dirty page map = 1,

 

#2

Thank you for this report. It indicates an unexpected ObjectDB state but the cause is unclear.

There is some clue that may indicate that this is related to extending the database (allocating new database pages).

Could you please provide more details? In particular, when did this start? Is there anything that you think may be related (a change in your application or in the ObjectDB version)? How often does it happen?
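One rough way to answer the "how often" question is to count the matching entries in the server log. The following is a minimal sketch, not an ObjectDB tool: the class name `SnapshotErrorScan` and its methods are hypothetical, and it assumes every log entry starts with a header line shaped like the `[2013-08-23 14:05:23 #187 *]` line in the excerpt in #1.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SnapshotErrorScan {
    // Assumed entry header format, e.g. "[2013-08-23 14:05:23 #187 *]".
    private static final Pattern HEADER = Pattern.compile(
        "^\\[(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}) #(\\d+) ([^\\]]*)\\]\\s*$");

    /** Returns the timestamp of every log entry whose body contains the marker text. */
    public static List<String> findEntries(List<String> lines, String marker) {
        List<String> hits = new ArrayList<>();
        String currentTimestamp = null;
        boolean counted = false; // ensures an entry is reported once, even if the marker repeats
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                currentTimestamp = m.group(1);
                counted = false;
            } else if (currentTimestamp != null && !counted && line.contains(marker)) {
                hits.add(currentTimestamp);
                counted = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a server log file (the location depends on your configuration).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        List<String> hits = findEntries(lines, "Negative snapshot user count");
        System.out.println(hits.size() + " matching entries");
        for (String timestamp : hits) {
            System.out.println("  " + timestamp);
        }
    }
}
```

Listing the timestamps next to server restarts and deployments may help show whether the error correlates with a specific change.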

If the problem is related to extending the database, a workaround could be to store many large objects and then to delete them, creating more space and avoiding resizing. If you try this, please let us know if you see any change.

ObjectDB Support
#3

Do you still see this error message after eliminating the memory and index issues?

ObjectDB Support
#4

Yes, I am continuing to see this issue, as well as the mismatch client-server protocol prefix failure, and some other problems (see attached log file).

Last week I had inadvertently left the database running with inadequate memory, which I am sure precipitated additional index problems; I resolved the memory configuration on Sunday night.  I re-ran the database doctor on Monday evening PDT (circa 9/10 @02:40 UTC).  

After restarting the database (see log entry [2013-09-10 02:44:10 #1 server]), this error shows up a number of times, and today at 10:14 AM PDT we started getting the mismatch client-server protocol prefix failure again.

#5

Since your system ran with ObjectDB for a long time (I think 2 years?) without these issues, two possible factors may be relevant:

  • Your move from ObjectDB 2.3.x to 2.5.2.
  • Corruption of the database that cannot be fixed by the Doctor (due to the OutOfMemoryError, the upgrade, or any other cause).

Can you try the previous ObjectDB version for several days (with a clean database of course)?

If you can share the database (e.g. by providing a link in a private support ticket) we can try getting some hint.

ObjectDB Support
#6
On our Dev server we were running with 2.5.x for some time, including a few weeks against a copy of the production data, without incident. The problems only occurred after running on the production server. Reverting that server to a clean database is not an option, because we have live customer data. Rolling back to 2.3.x would be a problem because development work I did this year relies on fixes and enhancements you implemented in the interim. I will get you a link to the database this evening.
#7

As far as I can tell, the database seems to be healthy. Possibly, the errors logged before the restart at [2013-09-10 03:44:10 #1 server] were related to the corrupted index, but they are not seen in the log after that restart, so hopefully the Doctor fixed these index issues.

In that case, the "Negative snapshot user count" and the "mismatch client-server protocol" errors could be caused by a bug in ObjectDB that is not related to problems with the database file. A similar "Negative snapshot user count" report (issue #556) was fixed 2 years ago. Possibly this is another bug that produces a similar exception.

The stack trace in the server log indicates that the exception is thrown during query execution on a flushed transaction, i.e. there are uncommitted changes in the transaction. If you can check the client side logs and find the specific query and its context on the client side, it could help. Maybe this is new code that was added recently.

ObjectDB Support
#8

There was not much in the client side logs; the only thing that showed up was the ClassCastException related to the database corruption.  I've tried enabling a finer level (both 'trace' and 'debug') on the client side, but I'm not getting any new output.  Any suggestions to help capture more useful debug information?

#9

Discussion was moved to a private support ticket.

Update: Apparently this exception is the result of an OutOfMemoryError, and allocating sufficient RAM solves the problem. We should check how future versions of ObjectDB can at least produce a better error message in this case.
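Since the fix here was to allocate sufficient RAM, it can be useful to confirm how much heap a JVM actually gets. The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapCheck` and the 1024 MiB default threshold are hypothetical, not an ObjectDB recommendation. It reports the limit of the JVM that runs it, so it only reflects the server when launched with the same JVM options (for example the same `-Xmx` value).

```java
public class HeapCheck {
    /** Converts a byte count to whole mebibytes (rounded down). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** True when the given maximum heap size is below the required size in MiB. */
    public static boolean isBelow(long maxHeapBytes, long requiredMiB) {
        return toMiB(maxHeapBytes) < requiredMiB;
    }

    public static void main(String[] args) {
        // Hypothetical threshold; pick a value that suits your own data volume.
        long requiredMiB = args.length > 0 ? Long.parseLong(args[0]) : 1024L;

        // maxMemory() returns Long.MAX_VALUE when the JVM has no inherent limit.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();

        System.out.println("Max heap: " + toMiB(maxHeapBytes) + " MiB");
        if (isBelow(maxHeapBytes, requiredMiB)) {
            System.out.println("Warning: max heap is below " + requiredMiB + " MiB");
        }
    }
}
```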

Update: The original error message "Negative snapshot user count" is also reported in issue #1407 and that was fixed in build 2.5.5_12.

ObjectDB Support
