ObjectDB ObjectDB

Slow searching with two indexes

#1

I'm running queries on a large database (tens of millions of objects), and I'm running into a performance problem using two indexes.  An entity I'm querying has two indexed fields - one is the primary key (id) and so is different for each object, and the other is an object reference that only has a few tens of different values.   When I query by just the primary key, it's (of course) extremely fast, but when I also query using the other key, queries can take minutes.  Looking at the database log, the way the query is run is that all values of the object reference key are scanned to get candidates for querying the primary key.  This is the slow part.  

I'm looking for a way to optimise this querying.  What I'm trying to fetch in my queries are a limited number of the most recent objects, like this:

select from Thing where reference == :objectReference order by id desc 

With the query limited to (say) 10 results.

Is there any obvious way to speed up this querying?  Taking minutes is not acceptable.

edit
delete
#2

Could you please post the query execution plan from the log?

The following query:

select from Thing where reference == :objectReference order by id desc

when an index is defined on the reference field should use that index and avoid a full scan.

Does it run fast if the ORDER BY is removed?

ObjectDB Support
edit
delete
#3

I have investigated further, and I have the query logs from using the Database Explorer:

select from NmeaBatch where receiver == :r order by id desc
Query Plan Description
======================

[Step 1]
Scan index com.spiffymap.sealog.model.NmeaBatch[receiver]
locating NmeaBatch (this) instances that satisfy:
  (this.receiver==:r).

[Step 2]
Order the results of step 1 by:
  desc(any(any(this.id))).

[Step 3]
Apply selection and prepare final results.

For 10 results - 573 milliseconds

select id,timeReceived from NmeaBatch where receiver == :r order by id desc
Query Plan Description
======================

[Step 1]
Scan index com.spiffymap.sealog.model.NmeaBatch[receiver]
locating NmeaBatch (this) instances that satisfy:
  (this.receiver==:r).

[Step 2]
Retrieve fields in NmeaBatch (this) instances.

[Step 3]
Order the results of step 2 by:
  desc(this.id).

[Step 4]
Apply selection and prepare final results.

For 10 results - 460384 milliseconds.

If ORDER BY is removed, the queries run very fast

edit
delete
#4

The second query is slow because of the need to retrieve timeReceived value for a lot of objects. Unfortunately the fact that only 10 results are needed doesn't help, as all the results are prepared for ordering. Selecting the 10 first results happens later after sort if ORDER BY is specified. This is something that ObjectDB should do better.

Anyway, the best solution may be defining a composite index on receiver and timeReceived. This will allow execution of this query purely with the new index with no need for further data retrieval.

Another option that can be checked is running the query with lazy fetch.

ObjectDB Support
edit
delete

Reply

To post on this website please sign in.