Optimization Question

#1

One of our uses for your ObjectDB software is maintaining the current state of a large number of data objects called tracks. These tracks change very frequently (an update every 10-15 seconds is not unreasonable), but they are also very large, complex data structures, so it is extremely difficult to identify exactly which fields have changed from one update to the next.

Currently the following sequence executes when a track update comes in:

- If it's a new track, simply add it to the database.

- If it's an update to an existing track, delete the entire previous entry and then add the new one.

This completely removes the old entry and all of its subcomponents before adding the new one. That has always seemed very inefficient to me, but the other option would be some potentially error-prone code to identify exactly what has changed and replace only those items, and that would also require maintaining the OID for everything. What are your thoughts on this? I have a feeling there is something more efficient we could be doing, but I can't figure out exactly what. Any suggestions?

P.S. Really like the new website design, good work!

#2

I am afraid it is impossible to predict which method is faster without trying; it also depends on many factors. For example, if your classes have many indexes, smart comparison might be faster. If there are no indexes and the changes in every update are significant, I can see situations in which rewriting from scratch might be faster.

You may also consider improving performance by:

  • Using more embedded objects whenever possible.
  • Delaying the delete operations to times with less or no activity (and if you can use one transaction to delete many superseded entries, that can also improve performance).
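The second point could be sketched roughly as follows using the standard JPA API that ObjectDB implements. This is only an illustration; the class name, field names, and the idea of collecting OIDs in a list are assumptions, not something from your application:

```java
import javax.persistence.EntityManager;
import java.util.ArrayList;
import java.util.List;

public class DeferredDeletes {
    // OIDs of superseded track entries, collected during busy periods
    private final List<Object> pendingDeletes = new ArrayList<>();

    public void scheduleDelete(Object oid) {
        pendingDeletes.add(oid);
    }

    // Run during a quiet period: remove all superseded entries in ONE transaction
    public void flushDeletes(EntityManager em, Class<?> entityClass) {
        em.getTransaction().begin();
        for (Object oid : pendingDeletes) {
            Object stale = em.find(entityClass, oid);
            if (stale != null) {
                em.remove(stale);
            }
        }
        em.getTransaction().commit();
        pendingDeletes.clear();
    }
}
```

Committing many removals in one transaction amortizes the per-commit overhead across the whole batch.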

If you still want to try updating only the modified objects, maybe the new merge feature of ObjectDB 2.0 RC1 can help. ObjectDB performs such comparisons internally to determine whether a merged object has been modified. You will still have to manage the object IDs in order to merge a graph of objects that reuses existing object IDs. This feature is new and has never been tested in a scenario like the one you described, but it might be worth trying.
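For reference, the merge-based approach would look something like the sketch below, using the standard JPA merge() operation. The Track entity and applyUpdate method are illustrative assumptions; in a real model the relationships to subcomponents would need cascade=MERGE for the whole graph to be merged:

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

@Entity
public class Track {
    @Id private String uuid;   // application-managed ID, reused across updates
    // ... hundreds of fields and subcomponent references ...
}

class TrackUpdater {
    // On each incoming update: merge the detached copy instead of delete + persist.
    // ObjectDB compares the merged graph against the stored state and writes
    // only the objects it detects as modified.
    void applyUpdate(EntityManager em, Track incoming) {
        em.getTransaction().begin();
        em.merge(incoming);    // requires that IDs match the existing entries
        em.getTransaction().commit();
    }
}
```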

Thank you for your kind words regarding the new site.

ObjectDB Support
#3

Yeah, that's pretty much what I figured as well. It would take a lot of work just to do the analysis to determine which approach is better, and with about 100 other things to work on, this falls into the category of "if it ain't broke, don't fix it," because what we're currently doing works.

Currently updates come in via RMI, so the application receives a serialized copy of the original. That copy is placed on a BlockingQueue so the RMI call can return immediately. Our database processing thread then pulls all of the updates off of the BlockingQueue and applies all database modifications in one transaction. Each track maintains a Java UUID identifier, and a mapping from that UUID to the ObjectDB OID is used to delete the existing track data before persisting the new data.
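The queue-and-drain pattern described above can be sketched with plain JDK types. The class and method names (and the use of String as the update payload) are illustrative, not from the actual application:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class UpdatePipeline {
    // RMI handler threads offer updates here and return immediately
    private final BlockingQueue<String> updates = new LinkedBlockingQueue<>();

    public void onRmiUpdate(String trackUpdate) {
        updates.add(trackUpdate);
    }

    // Single database thread: block for one update, then drain the rest,
    // so the whole batch can be applied in one transaction.
    public List<String> nextBatch() throws InterruptedException {
        List<String> batch = new ArrayList<>();
        batch.add(updates.take());   // wait for at least one update
        updates.drainTo(batch);      // grab everything else already queued
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        UpdatePipeline p = new UpdatePipeline();
        p.onRmiUpdate("track-1:v1");
        p.onRmiUpdate("track-1:v2");
        p.onRmiUpdate("track-2:v1");
        List<String> batch = p.nextBatch();
        System.out.println(batch.size());
        System.out.println(batch.get(1));
    }
}
```

Because a single consumer drains a FIFO queue, two updates for the same track are always applied in arrival order, which is the ordering guarantee the single-thread design relies on.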

I'm only using a single thread for the processing because I've been concerned about a race condition in which two updates for the same track arrive very close together and get persisted to the database in the wrong order, causing a period of time in which the current set of data is incorrect. I could add synchronization between threads to ensure only one transacts with the database at any given time, but that would defeat the whole point of having multiple threads.

So yeah, without some intense analysis (we're talking hundreds of fields with many layers of subcomponents), it's probably best to just stick with what's working.

#4

FYI, a new ObjectDB 2 version that has just been released (2.0 RC3) significantly improves the performance of insert/update/delete operations. It might boost performance, especially in a scenario such as the one you described.

ObjectDB Support