447 words

Slow searching with two indexes

#1
2016-11-15 12:01

I'm running queries on a large database (tens of millions of objects), and I'm running into a performance problem using two indexes.  An entity I'm querying has two indexed fields - one is the primary key (id) and so is different for each object, and the other is an object reference that only has a few tens of different values.   When I query by just the primary key, it's (of course) extremely fast, but when I also query using the other key, queries can take minutes.  Looking at the database log, the way the query is run is that all values of the object reference key are scanned to get candidates for querying the primary key.  This is the slow part.  

I'm looking for a way to optimise this querying.  What I'm trying to fetch in my queries are a limited number of the most recent objects, like this:

select from Thing where reference == :objectReference order by id desc 

With the query limited to (say) 10 results.

Is there any obvious way to speed up this querying?  Taking minutes is not acceptable.

sjzlondon
sjzlondon's picture
Joined on 2013-04-21
User Post #34
#2
2016-11-16 13:34

Could you please post the query execution plan from the log?

The following query:

SELECT FROM Thing WHERE reference == :objectReference ORDER BY id DESC

when an index is defined on the reference field should use that index and avoid a full scan.

Does it run fast if the ORDER BY is removed?

ObjectDB Support
ObjectDB - Fast Object Database for Java (JPA/JDO)
support
support's picture
Joined on 2010-05-03
User Post #2,641
#3
2016-11-17 05:20

I have investigated further, and I have the query logs from using the Database Explorer:

SELECT FROM NmeaBatch WHERE receiver == :r ORDER BY id DESC
Query Plan Description
======================
 
[Step 1]
Scan index com.spiffymap.sealog.model.NmeaBatch[receiver]
locating NmeaBatch (this) instances that satisfy:
  (this.receiver==:r).
 
[Step 2]
Order the results of step 1 by:
  desc(any(any(this.id))).
 
[Step 3]
Apply selection and prepare final results.

For 10 results - 573 milliseconds

SELECT id,timeReceived FROM NmeaBatch WHERE receiver == :r ORDER BY id DESC
Query Plan Description
======================
 
[Step 1]
Scan index com.spiffymap.sealog.model.NmeaBatch[receiver]
locating NmeaBatch (this) instances that satisfy:
  (this.receiver==:r).
 
[Step 2]
Retrieve fields in NmeaBatch (this) instances.
 
[Step 3]
Order the results of step 2 by:
  desc(this.id).
 
[Step 4]
Apply selection and prepare final results.

For 10 results - 460384 milliseconds.

If ORDER BY is removed, the queries run very fast

sjzlondon
sjzlondon's picture
Joined on 2013-04-21
User Post #35
#4
2016-11-17 07:48

The second query is slow because of the need to retrieve timeReceived value for a lot of objects. Unfortunately the fact that only 10 results are needed doesn't help, as all the results are prepared for ordering. Selecting the 10 first results happens later after sort if ORDER BY is specified. This is something that ObjectDB should do better.

Anyway, the best solution may be defining a composite index on receiver and timeReceived. This will allow execution of this query purely with the new index with no need for further data retrieval.

Another option that can be checked is running the query with lazy fetch.

ObjectDB Support
ObjectDB - Fast Object Database for Java (JPA/JDO)
support
support's picture
Joined on 2010-05-03
User Post #2,644

Post Reply

Please read carefully the posting instructions - before posting to the ObjectDB website.

  • You may have to disable pop up blocking in order to use the toolbar (e.g. in Chrome).
  • Use ctrl + right click to open the browser context menu in the editing area (e.g. for using a browser spell checker).
  • To insert formatted lines (e.g. Java code, stack trace) - select a style in the toolbar and then insert the text in the new created block.
  • Avoid overflow of published source code examples by breaking long lines.
  • You may mark in paragraph code words (e.g. class names) with the code style (can be applied by ctrl + D).
  • Long stack traces (> 50 lines) and complex source examples (> 100 lines) should be posted as attachments.
Attachments:
Maximum file size: 32 MB
Cancel