we update to the latest version from 2.7.6. Now we have the problem that we have numerous logs indicating deep query plan checks and an overall significant performance loss in all queries.
is there a possibility to switch of the new feature ?
Type: Bug Reoprt | Version: 2.7.6_04 | Priority: Normal | Status: Fixed | Replies: 14 |
we update to the latest version from 2.7.6. Now we have the problem that we have numerous logs indicating deep query plan checks and an overall significant performance loss in all queries.
is there a possibility to switch of the new feature ?
There are several changes since version 2.7.6 that could possibly effect query execution. Could you please explain or post some of these logs that show deep query plan checks (and are new in 2.7.6_04)?
In addition, if you can identify the exact build in which these new issues started (_01, _02, _03 or _04) it could help.
currently we have no chance to really test that with all the sub versions. What we can report is:
using a reference query (you have our test db)
select distinct o from ObjectNode o join o.properties p1 where o.classIdentifier = "(OP)" and ((o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-LINIE-1") or (o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-GROSS")) and (o.properties.name = "feedbackState" and o.properties.doubleValue != 13) and ((p1.name = "plannedState" and p1.doubleValue = 4 and p1.state = 4) or (p1.name = "startableState" and p1.doubleValue = 0))
2.7.6
takes around 4 seconds
2.7.6_04
takes > 200 seconds and reports in query plan:
Query plan 1/2 description
============================
Step 1: Process ObjectNode (o) instances
----------------------------------------
[Step 1a]
Scan index com.agile.hummingbird.ObjectNode[classIdentifier]
locating ObjectNode (o) instances that satisfy:
(o.classIdentifier='(OP)').
[Step 1b]
Retrieve fields in ObjectNode (o) instances.
Step 2: Process ObjectNode (v$1) instances
(for every result of step 1)
------------------------------------------
[Step 2a]
Scan type com.agile.hummingbird.ObjectNode
locating ObjectNode (v$1) instances that satisfy:
(o.linkedObjects join v$1).
[Step 2b]
Evaluate fields in ObjectNode (v$1) instances.
[Step 2c]
Filter the results of step 2b
retaining only results that satisfy:
or(and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')),and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)'))).
Step 3: Process ObjectProperty (v$2) instances
(for every result tuple of the steps above)
----------------------------------------------
[Step 3a]
Scan type com.agile.hummingbird.ObjectProperty
locating ObjectProperty (v$2) instances that satisfy:
(o.properties join v$2).
[Step 3b]
Evaluate fields in ObjectProperty (v$2) instances.
[Step 3c]
Filter the results of step 3b
retaining only results that satisfy:
(v$2.doubleValue!=13).
[Step 3d]
Filter the results of step 3c
retaining only results that satisfy:
(v$2.name='feedbackState').
Step 4: Process ObjectProperty (p1) instances
(for every result tuple of the steps above)
---------------------------------------------
[Step 4a]
Scan type com.agile.hummingbird.ObjectProperty
locating ObjectProperty (p1) instances that satisfy:
(o.properties join p1).
[Step 4b]
Evaluate fields in ObjectProperty (p1) instances.
[Step 4c]
Filter the results of step 4b
retaining only results that satisfy:
or(and((p1.doubleValue=0),(p1.name='startableState')),and((p1.state=4),(p1.doubleValue=4),(p1.name='plannedState'))).
Step 5: Apply selection
-----------------------
Apply selection and prepare final results.
<multiVarPlan plan="o:extract(o,index(-104['(OP)'->'(OP)':notNull])) <x> v$1:filter(extract(v$1,type(ObjectNode[set])),or(and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')),and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)')))) <x> v$2:filter(filter(extract(v$2,type(ObjectProperty[set])),(v$2.doubleValue!=13)),(v$2.name='feedbackState')) <x> p1:filter(extract(p1,type(ObjectProperty[set])),or(and((p1.doubleValue=0),(p1.name='startableState')),and((p1.state=4),(p1.doubleValue=4),(p1.name='plannedState'))))" eval="15.1373">
<multiVarPlan plan="o:extract(o,index(-104['(OP)'->'(OP)':notNull])) <x> v$1:filter(extract(v$1,type(ObjectNode[set])),or(and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')),and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)')))) <x> v$2:filter(filter(extract(v$2,type(ObjectProperty[set])),(v$2.doubleValue!=13)),(v$2.name='feedbackState'))" eval="12.7982">
<multiVarPlan plan="o:extract(o,index(-104['(OP)'->'(OP)':notNull])) <x> v$1:filter(extract(v$1,type(ObjectNode[set])),or(and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')),and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)'))))" eval="10.7135">
<extractPlan plan="extract(o,index(-104['(OP)'->'(OP)':notNull]))" eval="8.4168" variable="o">
<btreePlan plan="index(-104['(OP)'->'(OP)':notNull])" eval="6.4168" variable="o" />
</extractPlan>
<filterPlan plan="filter(extract(v$1,type(ObjectNode[set])),or(and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')),and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)'))))" eval="1.1483">
<extractPlan plan="extract(v$1,type(ObjectNode[set]))" eval="1.0" variable="v$1">
<btreePlan plan="type(ObjectNode[set])" eval="1.0" variable="v$1" />
</extractPlan>
</filterPlan>
</multiVarPlan>
<filterPlan plan="filter(filter(extract(v$2,type(ObjectProperty[set])),(v$2.doubleValue!=13)),(v$2.name='feedbackState'))" eval="1.0424">
<filterPlan plan="filter(extract(v$2,type(ObjectProperty[set])),(v$2.doubleValue!=13))" eval="1.0212">
<extractPlan plan="extract(v$2,type(ObjectProperty[set]))" eval="1.0" variable="v$2">
<btreePlan plan="type(ObjectProperty[set])" eval="1.0" variable="v$2" />
</extractPlan>
</filterPlan>
</filterPlan>
</multiVarPlan>
<filterPlan plan="filter(extract(p1,type(ObjectProperty[set])),or(and((p1.doubleValue=0),(p1.name='startableState')),and((p1.state=4),(p1.doubleValue=4),(p1.name='plannedState'))))" eval="1.1695">
<extractPlan plan="extract(p1,type(ObjectProperty[set]))" eval="1.0" variable="p1">
<btreePlan plan="type(ObjectProperty[set])" eval="1.0" variable="p1" />
</extractPlan>
</filterPlan>
</multiVarPlan>
forgot, query plan for 2.7.6:
Query Plan Description
======================
Step 1: Process ObjectNode (v$1) instances
------------------------------------------
[Step 1a]
Scan index com.agile.hummingbird.ObjectNode[classIdentifier, type, state, objectNameUpper]
locating ObjectNode (v$1) instances that satisfy:
and((v$1.objectNameUpper='SENKEN-GROSS'),(v$1.classIdentifier='(TC)')).
[Step 1b]
Order the results of step 1a by:
v$1.
[Step 1c]
Scan index com.agile.hummingbird.ObjectNode[classIdentifier, type, state, objectNameUpper]
locating ObjectNode (v$1) instances that satisfy:
and((v$1.objectNameUpper='SENKEN-LINIE-1'),(v$1.classIdentifier='(TC)')).
[Step 1d]
Order the results of step 1c by:
v$1.
[Step 1e]
Merge the results of steps 1b, 1d using OR.
Step 2: Process ObjectNode (o) instances
(for every result of step 1)
----------------------------------------
[Step 2a]
Scan index com.agile.hummingbird.ObjectNode[linkedObjects]
locating ObjectNode (o) instances that satisfy:
(o.linkedObjects join v$1).
[Step 2b]
Retrieve fields in ObjectNode (o) instances.
[Step 2c]
Filter the results of step 2b
retaining only results that satisfy:
(o.classIdentifier='(OP)').
Step 3: Process ObjectProperty (v$2) instances
(for every result tuple of the steps above)
----------------------------------------------
[Step 3a]
Scan type com.agile.hummingbird.ObjectProperty
locating ObjectProperty (v$2) instances that satisfy:
(o.properties join v$2).
[Step 3b]
Evaluate fields in ObjectProperty (v$2) instances.
[Step 3c]
Filter the results of step 3b
retaining only results that satisfy:
(v$2.doubleValue!=13).
[Step 3d]
Filter the results of step 3c
retaining only results that satisfy:
(v$2.name='feedbackState').
Step 4: Process ObjectProperty (p1) instances
(for every result tuple of the steps above)
---------------------------------------------
[Step 4a]
Scan type com.agile.hummingbird.ObjectProperty
locating ObjectProperty (p1) instances that satisfy:
(o.properties join p1).
[Step 4b]
Evaluate fields in ObjectProperty (p1) instances.
[Step 4c]
Filter the results of step 4b
retaining only results that satisfy:
or(and((p1.doubleValue=0),(p1.name='startableState')),and((p1.state=4),(p1.doubleValue=4),(p1.name='plannedState'))).
Step 5: Group results
---------------------
Group the results of step 4 by:
o.
[Step 6]
Remove duplicates from the groups of step 5.
[Step 7]
Apply selection and prepare final results.
Still working on your report, hopefully will be able to update soon.
Update: The change in ObjectDB that affects this specific query was introduced in build 2.7.6_01 following issue #2379, as the number of combinations examined by the query planner was increased. Unfortunately, although increasing the number of query plans that are examined can usually help in finding a better query plan, in this specific case a new inferior query plan is considered and selected. Decreasing the number of query plans can solve the performance issue in this specific case (and we can produce a new build that does exactly that - if you need a quick solution), but we are working on a better solution that will solve the issue with keeping that increase.
thanks for the update, we can wait until you have a better solution, we did step back to 2.7.6 for the time being
This report as well as a previous report (issue #2379) demonstrate that in some cases ObjectDB fails to select the best query plan. Build 2.7.6_01 increased the number of query plans that are checked and unfortunately a new inferior examined query plan overtakes the previous (superior) selected query plan in this specific case. The problem is not with build 2.7.6_01 (as it is justified to examine additional query plans) but with the evaluation of these query plans, as also discussed in issue #2379.
As a workaround you can use a query hint to request a specific preferred query as discussed in issue #2379. However, during the work on this issue it was found that the new query plan hint feature that was introduced in 2.7.6_01 requires a fix to cover some types of selections. So please use build 2.7.6_06.
To see how it works you can try the following 2 queries in the Explorer (with build 2.7.6_06) :
Query 1:
select distinct o from ObjectNode o join o.properties p1 where o.classIdentifier = "(OP)" and ((o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-LINIE-1") or (o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-GROSS")) and (o.properties.name = "feedbackState" and o.properties.doubleValue != 13) and ((p1.name = "plannedState" and p1.doubleValue = 4 and p1.state = 4) or (p1.name = "startableState" and p1.doubleValue = 0))
Query 2:
select distinct o from ObjectNode o join o.properties p1 where o.classIdentifier = "(OP)" and ((o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-LINIE-1") or (o.linkedObjects.classIdentifier = "(TC)" and o.linkedObjects.objectNameUpper = "SENKEN-GROSS")) and (o.properties.name = "feedbackState" and o.properties.doubleValue != 13) and ((p1.name = "plannedState" and p1.doubleValue = 4 and p1.state = 4) or (p1.name = "startableState" and p1.doubleValue = 0)) [[objectdb.query-plan-text=index(ctso['(TC)'->'(TC)']]
The 2nd query requires (using a hint) to use a query plan with a specific index. It should be executed much faster.
The [[...]] format for hints is only in the Explorer. In the code you can write:
query.setHint("objectdb.query-plan-text", "index(ctso['(TC)'->'(TC)'");
This will solve this specific query. Upgrading to the new build will also enable improving another query that is discussed in issue #2379.
If you need help with selecting query plan hints to optimise other slow query plans we can help with that, and using query plan hints we can optimise the execution of your queries.
Question: even without using the hint, does the new version behave like 2.7.6? We will introduce hints later but want to use the new version
Without hints the execution of the query that you described in #3 above is slower in the new build.
Thats bad news we have queries like the above that are entered by customers on the fly, no chance to use hints here. We need in this case a possibility to switch off this ne behaviour
Work on a better solution proceeds.
If you can provide several different queries (that are known to be slow) it may help identifying a pattern and finding a solution. If the schema and indexes are the same or similar for all your customers, it is quite possible that the same hint (or maybe a few hints). i.e. requiring using a specific index of method will solve all of them.
> We need in this case a possibility to switch off this ne behaviour
It is easy to add a system property to disable the change in version 2.7.6_01, as a temporary solution, if you think it would be useful (although that change improves performance in other cases). What is the purpose of updating from 2.7.6? Is there any other change in a newer build that you currently need?
Currently there is no need for us to upgrade, we did not know if you continue to work on a hintless solution :)
Build 2.7.6_08 improves query plan evaluation and provides better selection of query plans by using additional statistic information. To benefit from these changes ObjectDB must collect that statistics. Currently this is done by running a special query:
public static void main(String[] args) throws Exception { EntityManagerFactory emf = Persistence.createEntityManagerFactory("my.odb"); EntityManager em = emf.createEntityManager(); Thread thread = em.createQuery( "objectdb statistics", Thread.class).getSingleResult(); thread.join(); em.close(); emf.close(); }
It may take some time for this process to complete (about 20-30 minutes on your sample database). It may be effective to run this again occasionally when the database structure or content is changed significantly.
With build 2.7.6_08 and after collecting statistics as explained above the query execution is fast with no need for query hints. This also helps in similar issues (e.g. issue #2379).