Issue #2517: queries more than 10 times slower using 2.8.1 versus 2.7.6

Type: Bug Reoprt

Version: 2.8.1_01

Priority: Normal

Status: Closed

Replies: 16

hgzwicker

Joined on 2014‑04‑09

- we restarted to try to use 2.8.1 instead of 2.7.6

- we did a doctor on the database that you can download from www.hummingbird-systems.com, login/pw your database name, main menu objectdb

- we did not do any statistics generate, just tried to use the new version

- our reference query that takes in 2.7.6 around 5-10 seconds takes in 2.8.1 around 150-200 seconds:

SELECT DISTINCT o.parentNode FROM ObjectNode o WHERE o.classIdentifier = "(OP)" AND (o.linkedObjects.classIdentifier = "(US)" and o.linkedObjects.objectNameUpper = "ORG" and o.linkedObjects.type = 1 and o.linkedObjects.state = 0) AND (o.parentNode.properties.name = "confirmed" AND o.parentNode.properties.doubleValue = 0) and o.parentNode.classIdentifier = "(ME)" order by o.parentNode.modified desc

support

Joined on 2010‑05‑03

Please advise whether updating the statistics, as explained on issue #2417 post#15 helps.

It seems that running your query on your sample database with version 2.8.1 is slower if statistics is not updated.

ObjectDB Support

support

Joined on 2010‑05‑03

Note that we haven't tested it on your new database but on a old sample database for which statistics was already updated as part of the work on issue #2417. Hopefully updating statistics will also help with the new database. If not, we can investigate it further.

ObjectDB Support

hgzwicker

Joined on 2014‑04‑09

isn't it possible that you create the statistics on that new database directly to avoid to again and again transfer this huge database, we are a bit under time pressure

support

Joined on 2010‑05‑03

Since we already checked the database from issue #2417 and the problem that you describe here is also demonstrated by that older database, i.e. statistics must be updated to avoid 10x or more performance lose with this query, there is a good chance that the same applies to your newer database (which we have not downloaded).

Please try running statistics update on it and it may solve the issue. If not, then we can download the new database and run statistics update, so you won't have to upload another file.

ObjectDB Support

hgzwicker

Joined on 2014‑04‑09

results after adding the statistics:

query 1: "select o from ObjectNode o join o.linkedObjects l1 where o.classIdentifier = '(SI)' and o.type = 0 and o.state = 0 and (o.linkedObjects.classIdentifier = '(TC)' and o.linkedObjects.objectNameUpper = 'CAM-MILLING') and (l1.classIdentifier = '(SY)' and l1.objectNameUpper = 'TECHNOLOGY')"

2.7.6, 3 seconds

2.8.1, 16 seconds

query 2: "select distinct o from ObjectNode o join o.properties p1 join o.properties p2 join o.properties p10 join o.properties p20 join o.properties p21 where o.classIdentifier = '(OP)' and (o.linkedObjects.classIdentifier = '(TC)' and o.linkedObjects.objectNameUpper = 'CAM-MILLING' and o.linkedObjects.type = 1 and o.linkedObjects.state = 0) and (((p2.name = 'scheduledEndSlot' and p2.doubleValue <= 2603) or (p1.name = 'scheduledSlot' and p1.doubleValue <= 2603)) and (p10.name = 'scheduledState' and p10.doubleValue = 2) and (o.properties.name = 'feedbackState' and o.properties.doubleValue != 13) and (p21.name = 'globalFeedback' and p21.doubleValue = 0) and (p20.name = 'roughPlanning' and p20.doubleValue = 0))"

2.7.6, 2 seconds

2.8.1, 16 seconds

support

Joined on 2010‑05‑03

OK. We will download and check. What about the query from #1 above?

ObjectDB Support

hgzwicker

Joined on 2014‑04‑09

the #1 query is around half of the speed compared to 2.7.6

support

Joined on 2010‑05‑03

Here is an analysis of query 1 of post #6 above:

select o from ObjectNode o join o.linkedObjects l1
where o.classIdentifier = '(SI)' 
  and o.type = 0
  and o.state = 0
  and (o.linkedObjects.classIdentifier = '(TC)'
  and o.linkedObjects.objectNameUpper = 'CAM-MILLING')
  and (l1.classIdentifier = '(SY)'
  and l1.objectNameUpper = 'TECHNOLOGY')

The issue is again suboptimal selection of a query plan. Versions 2.7.6 and 2.8.1 have more or less the same collection of query plans to select from. Version 2.8.1 is expected to make a better choice in most cases, but apparently in this case the selection of 2.8.1 was not optimal.

There are only 237 ObjectNode instances that satisfy the constraint:

(o.classIdentifier = '(SI)' and o.type = 0 and o.state = 0)

Therefore, the optimal query starts by using the composite index "ctso" to quickly reduce the examined nodes. This is actually what version 2.7.6 does.

However, version 2.8.1 selection algorithm is different. Since there are 11,170,730 ObjectNode instances, and there are 269 different combinations of (classIdentifier, type, state), it guesses that:

(o.classIdentifier = '(SI)' and o.type = 0 and o.state = 0)

will have about 11,170,730 / 269 = ~41,527 objects rather than 237, and therefore it doesn't select this option.

We will try to improve this in future versions of ObjectDB. However, as a quick solution you can force ObjectDB to use the query plan that uses the index "ctso".

For example in the Explorer:

select o from ObjectNode o join o.linkedObjects l1
where o.classIdentifier = '(SI)'
  and o.type = 0
  and o.state = 0
  and (o.linkedObjects.classIdentifier = '(TC)'
  and o.linkedObjects.objectNameUpper = 'CAM-MILLING')
  and (l1.classIdentifier = '(SY)'
  and l1.objectNameUpper = 'TECHNOLOGY')
[[objectdb.query-plan-text=ctso]]

Or in the code:

 query.setHint("objectdb.query-plan-text", "ctso");

See also the discussion in this thread.

ObjectDB Support

#10

hgzwicker

Joined on 2014‑04‑09

the query hint does not help, measures:

version 2.8.1

query:

select distinct o from ObjectNode o join o.properties p1 join o.properties p2 join o.properties p10 join o.properties p20 join o.properties p21 where o.classIdentifier = '(OP)' and (o.linkedObjects.classIdentifier = '(TC)' and o.linkedObjects.objectNameUpper = 'CAM-MILLING' and o.linkedObjects.type = 1 and o.linkedObjects.state = 0) and (((p2.name = 'scheduledEndSlot' and p2.doubleValue <= 2604) or (p1.name = 'scheduledSlot' and p1.doubleValue <= 2604)) and (p10.name = 'scheduledState' and p10.doubleValue = 2) and (o.properties.name = 'feedbackState' and o.properties.doubleValue != 13) and (p21.name = 'globalFeedback' and p21.doubleValue = 0) and (p20.name = 'roughPlanning' and p20.doubleValue = 0))

180 results in 7263 milliseconds

added:

[[objectdb.query-plan-text=ctso]]

180 results in 34304 milliseconds

please advice, how can we proceed ?

#11

support

Joined on 2010‑05‑03

As explained in answer #9 the hint is for your first query in post #6.

Does it help with that query?

In post #10 you only mention the second query from post #6 that we haven't analysed, and therefore cannot suggest the particular recommended hint for that query.

Obviously forcing ObjectDB to use a particular index is not suitable for every query, so the answer in #9 demonstrates it with one query in order for you to proceed and try and use hints as necessary in all your other queries. In the ObjectDB log (if you set <logger name="query" level="debug" />) you can see all the top query plans and try alternative query plans manually using hints, or see which query plan was used in older version of ObjectDB, such as 2.7.6, and adopt it manually in 2.8.1 using hints.

ObjectDB Support

#12

hgzwicker

Joined on 2014‑04‑09

Such a process is not possible to implement. We have a huge amount of customer specific queries and it would kill our flexibility.

how about a way to fall back to the old mechanism till you have something that works better ?

#13

support

Joined on 2010‑05‑03

> how about a way to fall back to the old mechanism till you have something that works better ?

Maybe it could may be possible to some extent, but it would be difficult to guarantee exactly the same behaviour, as different things have changed.

Is there a specific reason for upgrading from version 2.7.6? because the best way to guarantee permanent fall back to the old query plans is to stay with the old ObjectDB version.

> Such a process is not possible to implement. We have a huge amount of customer specific queries and it would kill our flexibility.

Maybe this could be automated? We may be able to implement a new Trainer utility, that given a set of queries, it tests different query plans and chooses and remembers the best query plan. Any other solution may still be kind of guessing. The best way to choose the best query plan is simply to try the 10 leading query plans.

However, for this to work queries must repeat, i.e. the query string of a new query must be the same as of a trained query, except the parameter values.

ObjectDB Support

#14

hgzwicker

Joined on 2014‑04‑09

we would like to benefit from the new features/fixes but we can wait until you have something that works better

#15

support

Joined on 2010‑05‑03

OK. If it is possible for you to provide a complete set of queries that work on your last database and covers the needs of your application, then we may be able to use it for tuning, hoping to get better results in general, but particularly better results with your applications.

ObjectDB Support

#16

hgzwicker

Joined on 2014‑04‑09

we startet to implement the query hints, question: is it possible to set a sequence of hints but not only the hint for the first index to use

#17

support

Joined on 2010‑05‑03

> is it possible to set a sequence of hints but not only the hint for the first index to use

The "objectdb.query-plan-text" hint is a single string, which when specified, ObjectDB selects a query plan that contains that string in its description, if there is such a query plan. Therefore, it is a single string and not a list of strings. However, after finding the best query (manually) you can use as a hint any string in the description of the query plan, not just a name of an index, so you should be able to cover more than one index by a longer string.

ObjectDB Support

Issue #2517: queries more than 10 times slower using 2.8.1 versus 2.7.6

Reply