Issue #2561: database corrupt

Type: Bug Report

Version: 2.7.6_05

Priority: Critical

Status: Active

Replies: 5

hgzwicker

Joined on 2014‑04‑09

We have a major shutdown problem at a company with 450 employees that uses our system to manage its entire production.

we have the following state of the system:

- Doctor run around 2 months ago

- ran our application from 20200429; suddenly entries like those seen in #2559 appear

- Doctor run today (part of the results in the attached log)

How is this possible? We are running into severe problems at our customers with this kind of stability. Please advise.

log.rtf (9.3 KB)

support

Joined on 2010‑05‑03

This is obviously a bad situation. These issues recur once in a while in your applications, and the cause is still unclear. There are no similar reports from other users, and the stack traces are not very useful because they show that a corruption exists but not what caused it, which happened earlier.

One possible cause is not using sync in objectdb.conf, which we may have discussed in the past. Using sync=false is faster but less safe (at least with some types of hardware) if ObjectDB is stopped abruptly in the middle of busy transactions.

The immediate solution is probably to fix the database using the Doctor to get the system back in operation.

In the long term, we can try working with you on diagnosis of this issue, but we will need more information. For example, does it always happen with the same customer? What are the hardware details?

Due to the repeating issues, you should probably also consider switching to another product. This is a bad result and you will be the first customer (to the best of our knowledge) to abandon ObjectDB because of database corruption issues (the usual reason is closing a company / project), but unfortunately it may be unavoidable.

ObjectDB Support

hgzwicker

Joined on 2014‑04‑09

First of all, we love your database, but we want to get rid of this kind of problem as we are now starting our worldwide rollout. Please advise what we should change in our configuration.

server: 12 CPUs, 96 GB RAM, SSD drive, Windows Server 2012 R2

</objectdb>

support

Joined on 2010‑05‑03

The one setting that might make the difference (hopefully) is:

It should force a physical write to the recovery file before any writing to the database file starts. Although the default, sync="false" (which is faster), works well for other users thanks to other protections, it could cause issues with some hardware optimizations.
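For reference, here is a sketch of where this setting sits in objectdb.conf. Only sync="true" follows from the discussion above; the other attribute values are assumed from the documented defaults and may differ from your configuration:

```xml
<objectdb>
  <database>
    <!-- sync="true": wait until the recovery file is physically written
         before the database file is updated (slower, but safer) -->
    <recovery enabled="true" sync="true" path="." max="128mb" />
  </database>
</objectdb>
```

This is a fragment, not a complete configuration file; the rest of the existing objectdb.conf stays as it is.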

A possible way to diagnose issues that are not repeatable may be to enable recording:

This is rarely used, may slow your application, and consumes a lot of disk space, so it is unclear whether it is practical on your production server. It records all the transactions, so replaying them later may reproduce the issue and enable debugging. It requires starting with a duplicate of the database, which is later used to replay the transactions.
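A recording setup might look roughly like the following sketch. It assumes the <recording> element of objectdb.conf with its documented default attributes, changing only enabled:

```xml
<objectdb>
  <database>
    <!-- enabled="true": record every transaction for later replay.
         path="." is assumed to place the recording next to the database file -->
    <recording enabled="true" sync="false" path="." mode="write" />
  </database>
</objectdb>
```

As with the recovery setting, this is a fragment to merge into the existing configuration rather than a complete file.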

One important question is whether these issues always happen with the same system, i.e. are there other installations of your application that never had these issues?

Another thing that can be done but will require some time is that the ObjectDB Doctor will be upgraded to run while the database is active, i.e. fix such issues as they happen. This may not solve the issue completely but will avoid shutdown.

ObjectDB Support

hgzwicker

Joined on 2014‑04‑09

Thanks for the update, we will change our configuration accordingly.

This issue always happens at the same customer. As this customer has more than 1000 of our apps plus around 100 connected automation cells up and running, it is the worst case we could imagine, but it is important for us as proof of our concept.

The server is a virtual server in a fairly new virtual infrastructure. We can gather more information if needed.

The 'on the fly' Doctor would be great for such sensitive applications.

support

Joined on 2010‑05‑03

A quick search on Google:

https://www.google.com/search?q=windows+server+2012+database+corruption

finds various discussions regarding database corruption on Windows Server 2012, which may be relevant, for example:

https://learn.microsoft.com/en-us/archive/msdn-technet-forums/7336d31b-6c24-468a-9c47-750244ae3a8c (old link)

If you use ObjectDB in embedded mode (similarly to Access, which is discussed in that thread) then you must make sure that the ObjectDB database file is never accessed concurrently by two processes. ObjectDB protects against this by using a file lock (e.g. you probably see an error message when you open the Explorer while the database is open in another process), but whether this Java file lock is respected is system dependent, and it might not work on some systems (e.g. if the file system is shared). Accessing ObjectDB in embedded mode directly from two processes is fatal.

Regarding the solutions in that thread, search for example for:

"Since disabling SMB3/2, no file corruption occurred any more. Worked with two different servers with two different applications. Problem solved."

and also:

"Might be you want to disable oplocks."

although this may be completely irrelevant to your case.

The important questions are:

Is it always the same system with the corruptions?
Do other installations of your application with ObjectDB work properly?

UPDATE: You already answered these questions just now.

ObjectDB Support
