Index causes increased memory usage

#1

Hello,

In an effort to speed up queries against our embedded ObjectDB system, we added some unique String indices to the JDO file that defines our persistable objects.  While this improved query performance considerably, our database application now consumes MUCH more memory (~8 GB instead of ~3 GB) than it does without the indices.  What could be taking up that memory?  The data load was the same for both tests (with and without String indexing).  Is there some tuning we can do to get fast queries without soaking up all the memory on the server?

 

Thanks,

CVTSC

#2

Using indexes should not make that difference.

When indexes become available, different query plans may be used to execute queries, and for a specific query the newly selected plan may use too much memory. You can try decreasing the temporary file threshold, but that may slow performance.

To understand the problem better you will have to provide information from a heap dump of your application.

ObjectDB Support
#3

During our test, we weren't making any queries.  Our database operates in a cycle that involves _lots_ of database adds over a great period of time, and occasional querying.  No record in our database is ever updated, we simply add new data.

We're still a bit confused about what could have caused this increased memory usage, since the only change was including the indexes.  How are the indexes stored?  Is an index simply a pointer to the indexed field, or is there a data structure held in memory that assists in searching on indexes?  Are they cached in any way?


Here's a bit about our environment:

    OS is 64-bit Linux
    Java 7
    ObjectDB version 2.37_22.


Thanks in advance.

#4

There is no special cache for indexes. Pages that contain indexes are cached in the page cache, exactly as pages that contain data.

Try analyzing a heap dump and post information about the main ObjectDB classes that take the heap space.

ObjectDB Support
#5

Took a look at a heap dump after the application had been running for a while (1.5 - 2 hours). Here are the highest-ranking ObjectDB objects:

com.objectdb.o.PAG - 117mb, 1.62 million instances

com.objectdb.o.SLV - 28.3mb, 1.2 million instances

com.objectdb.o.RFV - 25.1mb, 1.04 million instances
 

I also took a look at the methods occupying the most CPU time and spotted these after letting it collect data for about 20 minutes:

com.objectdb.o.SFL.run() 41%

com.objectdb.o.LFL.r() 14%

Hopefully this helps?

#6

About 3.5GB are in use by the page cache (which has a size of 64MB per database by default).

Have you changed the cache size?

A separate cache is managed per open database. How many database files do you use concurrently?

ObjectDB Support
#7

Yes, we were tinkering with that value yesterday and I recorded some results.  I don't have the numbers with me, but I can post what we discovered using different processing cache (page) and query cache settings on Monday.  I'll have to take a look at the database file count Monday as well.

According to the documentation the page cache is an in-memory cache.  After data is stored to the file are the entries in the page cache cleared out?  Or do they remain?  Our jvisualvm memory graphs show a steady climb of used heap as though some cache is not being cleared out at all, and our memory profile rises steadily throughout run-time.

This climb is exacerbated by heavy querying of the database.  Once the querying stops, the memory usage never returns to its pre-query levels and continues to climb.
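One way to tell a cache that is still filling from unbounded growth is to sample the used heap right after a garbage collection request, because a raw heap graph also counts garbage that has not been collected yet. The following is only a minimal sketch using the standard java.lang.management API. The class name HeapSampler and the one-second interval are made up for illustration, and the GC call is only a request that the JVM may ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper for illustration; not part of ObjectDB.
class HeapSampler {
    // Returns the used heap in MB after requesting a GC.
    // MemoryMXBean.gc() is equivalent to System.gc(): a hint, not a guarantee.
    static long usedHeapMbAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc();
        return memory.getHeapMemoryUsage().getUsed() / (1024L * 1024L);
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; in a real run this would be logged alongside
        // the number of objects stored so far.
        for (int i = 0; i < 3; i++) {
            System.out.println("used heap after GC: " + usedHeapMbAfterGc() + " MB");
            Thread.sleep(1000);
        }
    }
}
```

If the post-GC figure levels off near the configured cache sizes, the climb is probably the caches filling up. If it keeps rising past them, a heap dump is the next step.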

Hope this helps!

 

PS - On another note, we are working with an 8-core processor.  The recommended thread count is "more than the number of cores but not too many."  What would be an appropriate number of threads for the processing thread pool in order to get good performance for our 8-core server?


 

#8

If you have 30 open databases with page cache size of 128MB (each) then this explains your numbers.
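The arithmetic behind that estimate can be sketched as follows. This assumes one page cache per open database, as described earlier in the thread; the class and method names are hypothetical.

```java
// Back-of-the-envelope estimate; names are illustrative only.
class PageCacheEstimate {
    static final long MB = 1024L * 1024L;

    // Worst-case heap held by page caches, assuming one full cache per open database.
    static long totalCacheBytes(int openDatabases, long cachePerDatabaseMb) {
        return openDatabases * cachePerDatabaseMb * MB;
    }

    public static void main(String[] args) {
        long bytes = totalCacheBytes(30, 128);
        System.out.println(bytes / MB + " MB");            // prints "3840 MB"
        System.out.println(bytes / (1024.0 * MB) + " GB"); // prints "3.75 GB"
    }
}
```

With the default of 64 MB per database, the same 3.5 GB would correspond to 56 open databases (3584 MB / 64 MB).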

The cache is not cleared after the data is stored, since the idea of the cache is to be ready for future requests, which could come anytime.
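As a rough illustration of that behavior (not ObjectDB's actual implementation), a bounded cache keeps entries after they are written and drops them only when its capacity is exceeded. The class BoundedPageCache and the 2048-byte page size below are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded page cache with least-recently-used eviction.
class BoundedPageCache extends LinkedHashMap<Long, byte[]> {
    private final int maxPages;

    BoundedPageCache(int maxPages) {
        super(16, 0.75f, true); // access order: the eldest entry is the least recently used
        this.maxPages = maxPages;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        // Evict only when the configured capacity is exceeded,
        // never merely because a page has been written out.
        return size() > maxPages;
    }

    public static void main(String[] args) {
        BoundedPageCache cache = new BoundedPageCache(3);
        for (long pageId = 0; pageId < 5; pageId++) {
            cache.put(pageId, new byte[2048]); // the page stays cached after it is "stored"
        }
        System.out.println(cache.keySet()); // prints "[2, 3, 4]"
    }
}
```

With a cache like this, heap usage rises until the cache is full and then stays there, which in a heap profile shows up mostly as byte arrays.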

Please use separate forum threads for different topics.

ObjectDB Support
#9

Hello,

As a follow-up to your last post, I have a few questions that would help us understand what is going on behind the scenes inside ObjectDB.

When you refer to having "30 open databases", are you talking about having 30 PersistenceManagers open?  Or 30 PersistenceManagerFactories?  Or is it one and the same?  We currently create a new PM for each transaction and access the DB from different threads.  It is not impossible to imagine that at any given time 30 or more persistence managers are active in the system and processing some data transactions.

Does each PersistenceManager have its own 64 MB processing cache?  During what part of operation is another 64 MB cache allocated to the system?

=================================

Today, we ran some more tests.  We wrote a very stripped down version of our embedded ObjectDB database application, that simply connects to the application that supplies our data and stores it in an .odb file.  This application created a single PersistenceManagerFactory which had a thread pool of 10 threads, each of which could be called to process data for storage by getting a new PersistenceManager from the factory and storing the persistable objects using the manager.  At the end of the thread's run, the PM was closed.

When we ran this test, only one thread was required to handle processing and storage of the data; a second thread was never requested from the pool.  Again we noted that when indexing was enabled on a unique String field, our memory usage was far higher than when an index was not specified.  During the runs of the code, the memory usage followed a linear trajectory until the application ran out of heap space and crashed.

During both runs, with indexing and without, the biggest memory user shown by JVisualVM was byte arrays, usually between 75% and 90% of the memory in use.  Is this normal?

=================================

Thank you again for your help, it is greatly appreciated.

#10

> When you refer to having "30 open databases", are you talking about having 30 PersistenceManagers open?  Or 30 PersistenceManagerFactories?

30 different databases (with 30 different PersistenceManagerFactory instances), not 30 database connections. You can use as many PersistenceManager instances as you like, since they are lightweight objects.

> We wrote a very stripped down version of our embedded ObjectDB database application, that simply connects to the application that supplies our data and stores it in an .odb file.

If you can post this test it could help.

ObjectDB Support
