After upgrading to 2.9.0 (because of the other problems in former versions), the database gets corrupted and becomes unusable. See the Doctor report and current log. Urgent help required: production at a 500-person manufacturing site is shut down.
After the upgrade to 2.9.0 the database gets corrupted. Urgent issue at our most important customer, production shutdown.
The Doctor report indicates indexes that are not synchronized with the data. This could be caused by certain unsupported schema changes (if you had schema changes in addition to upgrading to 2.9.0).
If this is the only issue, then running the Doctor in repair mode and switching to the newly created database should solve it.
If it doesn't, and you share the database, we will examine the issue further.
We are currently uploading the repaired database and the logs; as soon as they are available we will send a link. We did not make any schema changes.
Looking again in the log file, there are also serious problems in the structure of the database (orphan pages and pages with multiple parents), which are unlikely to be solved by the Doctor.
Is it a log for the database before or after the attempt to fix it?
Unfortunately it is possible that the database is corrupted in a way that requires switching to a backup database. However, if you send the original database, not the fixed one, we can try fixing it, if the data is there.
Note that there were no changes in the engine that manages the database storage, at least in the last 2 years, so there is nothing in version 2.9.0 (or 2.8.7 or above) that could be easily suspected as related to this problem.
This log is the Doctor output. Attached are other logs that should be from before the Doctor run. The Doctor was run on 28.6.24. The database from before the Doctor run is still downloading.
There are errors in the log file on 17.6.2024 of the form:
java.io.IOException: The process cannot access the file because another process has locked part of the file
Normally there should be only one process that accesses the database file. Accessing a database from multiple processes concurrently can cause exactly this kind of structural damage (orphan pages and pages with multiple parents). ObjectDB provides some file locking protection, but unfortunately it is incomplete in some environments, as the actual capabilities depend on the OS.
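As an illustration of the single-process rule, here is a minimal sketch of an application-level guard based on a sentinel lock file. It is not part of ObjectDB: the class name `DatabaseGuard` and the idea of a separate guard file are assumptions made for this example, and the guard only helps if every program that opens the database (for instance a launcher script around the Explorer) acquires it first.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not an ObjectDB API): exclusive guard on a sentinel file. */
final class DatabaseGuard implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private DatabaseGuard(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    /**
     * Tries to take an exclusive lock on the sentinel file.
     * Returns null if another holder (this JVM or another process) already has it.
     */
    static DatabaseGuard tryAcquire(Path guardFile) throws IOException {
        FileChannel ch = FileChannel.open(guardFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock l = null;
        try {
            l = ch.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            // The lock is already held inside this JVM.
        } finally {
            if (l == null) {
                ch.close(); // do not leak the channel when the lock was not obtained
            }
        }
        return l == null ? null : new DatabaseGuard(ch, l);
    }

    /** Releases the lock and closes the sentinel file. */
    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

An application would call `DatabaseGuard.tryAcquire(...)` before opening the database and refuse to continue if it returns null. As with any file lock, the behavior depends on the OS and the file system, so this reduces the risk rather than eliminating it.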
Maybe when you upgraded to 2.9.0 there was a situation in which more than one process accessed the database? Again, version 2.9.0 uses exactly the same storage engine as in previous versions, so although it may introduce new regression bugs as any new version, it is unlikely to introduce new data corruption issues (a new engine is under development for versions 3.x, but versions 2.x are frozen regarding the storage engine to preserve stability).
More information on what exactly happened around 17.6.2024 21:30 may help.
this is the database before the doctor
https://www.dropbox.com/scl/fi/uf1o7hkv76hk42t6giwpd/coreSystemDb__.7z?rlkey=63nr8o89wydf4klpg843a6rf9&dl=0
The locking problem arises as soon as we open the Explorer. The Explorer just opens the database file without checking whether it is already in use by our embedded system.
As it seems that the Doctor fixes the database successfully with no evidence for data loss, I assume you use the repaired database now and the production system was recovered yesterday and is working fine.
It is highly recommended to have a script that runs an automatic backup daily (and if possible several times a day) and then checks the backup copy with the Doctor. With this arrangement you should find issues when they start (in this case, probably on 17.6.2024), before they develop into a system crash.
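A rough sketch of such a backup-and-check step is shown below. The class and method names are hypothetical. It assumes the database is not open while the file is copied, and that the Doctor can be started from the command line as `java -cp objectdb.jar com.objectdb.Doctor <file>`; please verify that invocation against the documentation of your ObjectDB version before relying on it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

/** Hypothetical backup helper; names and layout are assumptions for illustration. */
final class BackupSketch {
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    /** Copies the (closed) database file to backupDir under a timestamped name. */
    static Path copyToBackup(Path database, Path backupDir, LocalDateTime now)
            throws IOException {
        Files.createDirectories(backupDir);
        Path target = backupDir.resolve(STAMP.format(now) + "-" + database.getFileName());
        return Files.copy(database, target, StandardCopyOption.COPY_ATTRIBUTES);
    }

    /** Builds the assumed Doctor command line for checking one backup copy. */
    static List<String> doctorCommand(Path objectdbJar, Path backupCopy) {
        return List.of("java", "-cp", objectdbJar.toString(),
                "com.objectdb.Doctor", backupCopy.toString());
    }

    /** Runs the Doctor on the backup copy and writes its output to reportFile. */
    static int runDoctor(Path objectdbJar, Path backupCopy, Path reportFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(doctorCommand(objectdbJar, backupCopy))
                .redirectErrorStream(true)           // merge stderr into the report
                .redirectOutput(reportFile.toFile()) // keep the report for later review
                .start();
        return process.waitFor();
    }
}
```

A scheduler (cron, Windows Task Scheduler, or a timer inside the application) would call `copyToBackup` and then `runDoctor`. The report file should be inspected for reported errors rather than relying only on the exit code, since this sketch makes no assumption about what exit code the Doctor returns.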
Please report if you have new information to share or if this issue happens again.
We are using the repaired database, but there are persistent errors and we have to shut down again and again. The conflict with the Explorer is a pain: every time we start it, it connects directly to the database and we get the access error that you see in the log. Why is this? Please have a look at the attached error log.
Regarding using the explorer in embedded mode, have you enabled embedded-server? Are you using the same ObjectDB version for the production database and the Explorer? Do you start seeing other database errors (in addition to the lock error) just after using the Explorer?
Regarding the merge errors, did you start getting them only after fixing the database? According to the Doctor the repaired database is clean of such issues. Did it also happen with previous ObjectDB versions? If not, then you may have to revert to version 2.8.9, if that solves this issue, and the issues that forced you to upgrade will be addressed in a different way, at least until this problem is diagnosed and fixed.
We use exactly the same version of ObjectDB and the Explorer. There are no other errors than the ones you can see in the database. The errors have been the same for years; the hope was that switching to 2.8 would solve these problems (that was the advice of ObjectDB). See the log; at that time we used 2.8.
It happens at all our customers.
These are not exactly the same errors but maybe related.
Errors that stop (even for a while) after a restart and are not reported by the Doctor are not corruption errors but are related to the cache, and will not cause data loss. These cache errors have been addressed by fixes in recent versions. Since we cannot reproduce them in our tests and there are no similar reports from other users, we cannot diagnose the exact situation. If you can share a test that reproduces the issue then obviously we can work on fixing it.
By checking backup copies using the Doctor you can at least tell whether there is a database corruption, as was discovered yesterday.
when can we expect to have version 3 available
The release date is not yet known. If you would like to be a beta tester then please indicate whether you need client-server access or only embedded mode access and we will get back to you in due course.
It would be very good to have that at an early stage. We need only embedded mode.