Issue #2941: performance issues, database after adding statistics, after doctor

Type: DiscussionVersion: 2.9.0_05Priority: HighStatus: ActiveReplies: 1
#1

we face performance issues on queries and therefore compared the times of 2 (same) databases, 2.9.0_05, one after adding statistics and one after additional doctor, here are the results:

Query : select distinct o from ObjectNode o where (o.classIdentifier = '(RG)' and o.properties.name = 'noCapacityPlanning' and o.properties.doubleValue = 0  and o.linkedObjects.classIdentifier = '(CA)' and o.linkedObjects.type = 1 and o.linkedObjects.state = 0)

database after statistics added: around 159 seconds

database after additional Doctor: around 700 ms

download links:

https://www.dropbox.com/s/6rd2458aohc89s0/coreSystemDb-afterDoctor.odb?dl=0

https://www.dropbox.com/s/4gyhqbdqpjzlayc/coreSystemDb-afterStatistics.odb?dl=0

 

 

 

 

 

#2

Thank you for this report (if possible, please compress large sample files in the future for faster download).

Your report demonstrates well a limitation of using the current statistics implementation, where better results are obtained with no statistics available (apparently the Doctor creates a new database file with no statistics).

In the current implementation ObjectDB estimates the number of index entries to iterate over, given an exact key, as the number of total entries in the index divided by the number of different keys. ObjectProperty has 47,581,595 objects and 397,009 different doubleValue values (keys in the index), about 120 entries per value, so starting with the doubleValue  index on doubleValue seems reasonable, based on that statistics. This may work well for other values but very bad for doubleValue = 0, as there are 22,134,587 properties with that value.

Your report will accelerate the need to address this issue. Queries are compiled for reuse with different parameters, so usually the exact values are not taken into account during query plan selection. A quick obvious solution could be to follow the worst case, and in this case avoid the doubleValue index. A more advanced solution should be to use different query plans based on specific provided search values.

As a temporary workaround you can use a query hint to request using the classIdentifier index for this query, or remove the doubleValue index, if not needed elsewhere.

ObjectDB Support

Reply