
ObjectDB is opening too many sockets

#1

Hi, we are trying to switch from embedded mode to client-server to support database clustering, and we are now facing the following issue.

The ObjectDB Server is creating too many sockets.
Currently the ObjectDB Server process has 3895 open files, and of that number 3823 are open sockets.
I suspect this is caused by some mistake, so I would like your help diagnosing it instead of increasing system limits.
We are using Ubuntu 18.04.5 LTS, and after some time it causes the following error and ObjectDB is restarted automatically:
java[3545]: Too many open files

A few lines from /proc/<ODBServerPID>/fd:
lrwx------ 1 odb odb 64 Mar 19 11:34 995 -> 'socket:[507755621]'
lrwx------ 1 odb odb 64 Mar 19 11:34 996 -> 'socket:[507755622]'
lrwx------ 1 odb odb 64 Mar 19 11:34 997 -> 'socket:[507755623]'
lrwx------ 1 odb odb 64 Mar 19 11:34 998 -> 'socket:[507755624]'
lrwx------ 1 odb odb 64 Mar 19 11:34 999 -> 'socket:[507755625]'

We think this does not correspond to open DB connections, as we are monitoring them and they stay under 20.

#2

If you open connections (create EntityManager / PersistenceManager instances) without closing them, then the result could be many open sockets.
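
As a minimal illustrative sketch (the connection URL, credentials, and the work inside the try block are placeholders, not taken from this thread), the expected pattern is to close every EntityManager you create, and the factory on shutdown:

    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;

    // Illustrative pattern: closing the EntityManager lets its connection be
    // returned to the client-side pool instead of piling up as an open socket.
    EntityManagerFactory emf = Persistence.createEntityManagerFactory(
        "objectdb://localhost/test.odb;user=admin;password=admin"); // placeholder URL
    EntityManager em = emf.createEntityManager();
    try {
        // ... work with em ...
    } finally {
        em.close();
    }
    // On application shutdown:
    emf.close();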

ObjectDB Support
#3

I guess I will need more information from you about this :-(

1. I don't know what the purpose of those sockets is; I don't think it is related to the connections that much.

Let's look at the results of the lsof command for the ODB Server process. Of the 2 lines below, the first one is a TCP connection from the Explorer.
It disappears when the Explorer is closed (or File->Close Connection is used). In the same way, when our application is running, those connections are not created indefinitely (we are using up to 20), but the sockets stay there for a longer period of time.

java    17888  odb   14u  IPv6           31439922       0t0      TCP localhost:zabbix-agent->localhost:52842 (ESTABLISHED)
java    17888  odb   65u  sock                0,9       0t0 31404976 protocol: TCPv6

2. I did some testing with a running ObjectDB Server, making all connections from the Explorer only.
I was working with 2 DB files (firstdb and seconddb), and at the end of each line I wrote the number of sockets the ObjectDB Server currently had in use. I guess it kind of works correctly, as the sockets were not increasing indefinitely, but I still don't get why it sometimes opens 70 sockets at once, while the same operation on a freshly started ObjectDB Server did not open a single one.
The results below are steps done on 1 running ObjectDB Server without restarts; the Explorer was also a single instance, just closing and reopening the connection. So I assume that in a bad case, when opening more database files, it can easily exceed file system limits. I had 72 sockets open at one time using a single TCP connection.


The results were as follows:
01_firstdb open (connected using explorer) 22 sockets open (in total)
02_firstdb closed connection (File->Close connection) 22 sockets open
03_firstdb open second time 22 sockets open
04_firstdb closed connection 22 sockets open
05_seconddb open and closed 22 sockets open
06_seconddb open 22 sockets open
07_seconddb open entity in tree view 0 sockets open
08_firstdb open 0 sockets
08_2_firstdb closed 0 sockets
09_firstdb open 72 sockets open
09_2_firstdb closed 72 sockets open
10_firstdb open 72 sockets open
10_2_firstdb closed 72 sockets open
11_seconddb open 72 sockets open
12_seconddb open entity in tree view 46 sockets open
12_seconddb closed 46 sockets open
13_seconddb open 46 sockets open
14_seconddb open entity in tree mode 47 sockets open
15_firstdb open 47 sockets open
16_firstdb open entity in tree view 47 sockets open
16_2_firstdb closed 47 sockets open
17_seconddb open and also open entity in tree view 1 socket open
18_firstdb open 1 socket open
18_2 firstdb closed 1 socket open
19_firstdb open 1 socket open
19_2firstdb closed
20_firstdb open and open entity in tree view 1 socket open
20_2firstdb closed
21_seconddb open 1 socket open
22_seconddb open entity in tree view 0 sockets open
22_2 seconddb closed
23_firstdb open 0 sockets open
24_seconddb open entity in tree view 1 socket open

#4

This is an interesting test, thanks for sharing. It is difficult to understand and explain some of the data, such as the 22 open sockets on just opening the first database at the beginning.

In principle sockets are associated with connections. Every connection requires an open socket. Connections (and their sockets) are not released immediately when you close them, due to the connection pool mechanism, which keeps some sockets open for reuse. This is a commonly used optimization when working with databases in general. You can limit the size of the connection pool (the default in ObjectDB is currently 50) by using the JDO method setMaxPool.

ObjectDB Support
#5

Sorry, I did not get that

You can limit the size of the connection pool (the default in ObjectDB is currently 50), by using the JDO method setMaxPool

How can I do that for the ObjectDB Server? Shouldn't that be in some config, or set by a command-line argument?

#6

Actually the connection pool is on the client side and the client holds the sockets. On the server you can set an expiration time for unused connections. See post #2 on this related forum thread.

ObjectDB Support
#7

I am sorry, but I did not find a way to set it up.
I did not find setMaxPool anywhere in the documentation.
I was also searching in PersistenceManagerFactory and PersistenceManager with no success.

Can you explain in more detail how to call setMaxPool?

#8

You can specify the max pool size when you initialize the EntityManagerFactory:

    import java.util.HashMap;
    import java.util.Map;
    import javax.persistence.*;

    Map<String, String> properties = new HashMap<>();
    properties.put("javax.jdo.option.MaxPool", "10");
    EntityManagerFactory emf =
        Persistence.createEntityManagerFactory("test.odb", properties);
    EntityManager em = emf.createEntityManager();
ObjectDB Support
#9

Thank you, I will try that.

I have some follow-up questions:
1. Is the thread pool shared somehow for both JDO and JPA?
We are using JDO for our application, except for backups, which are done using JPA.
I was playing with the server connection limit (set to 55, assuming the default pool of 50 on the client), and our app crashed after a few hours with a "Too many open connections (55)" exception.

2. Not so much a question yet, but we still have trouble with too many sockets: we raised the limit from 4096 to 102400 and it's still not enough.
We have 1 ODB Server started without any connections to it, and the number of sockets is slowly rising until it crashes. I am monitoring TCP connections through tcpdump and did not find anything significant there, but we are still looking; there were a few connections we could not identify, but only at one timestamp, which does not correspond to the times when the socket count increases.

Number of open sockets over time:
Thu Apr 15 15:11:01 CEST 2021 : 0
Thu Apr 15 19:03:01 CEST 2021 : 0
Thu Apr 15 19:04:01 CEST 2021 : 7008
Fri Apr 16 00:12:01 CEST 2021 : 14500
Fri Apr 16 05:15:01 CEST 2021 : 21434
Fri Apr 16 10:23:01 CEST 2021 : 28544
Fri Apr 16 15:33:01 CEST 2021 : 35670

I will continue investigating this, probably by tracing the whole process, as this is a blocker for production use.
 

#10

The ODB log prints nothing beyond the startup message:

[2021-04-16 18:26:25 #1 server]
Server on port 10054 has started by 26621@testing

tcpdump over all interfaces did not print anything (so presumably no network connections were made):

sudo tcpdump -ni lo port 10054

We also had strace running to print all network-related kernel calls from the process.
Based on that, twice during the day the ODB Server went crazy, created a couple of threads and started creating sockets in some kind of loop.
This took about a minute, so I will try to create a Java thread dump while it happens.

[pid 26650] 02:48:00.246356 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 46494<TCPv6:[332039994]>

[pid 26650] 02:48:00.250400 setsockopt(46494<TCPv6:[332039994]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 02:48:00.253569 getsockopt(46494<TCPv6:[332039994]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 02:48:00.256978 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 52186<TCPv6:[332039995]>

[pid 26650] 02:48:00.262088 setsockopt(52186<TCPv6:[332039995]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 02:48:00.265851 getsockopt(52186<TCPv6:[332039995]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 02:48:00.269934 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 55389<TCPv6:[332039996]>

[pid 26650] 02:48:00.274090 setsockopt(55389<TCPv6:[332039996]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
...
[pid 26650] 02:49:20.166210 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 80620<TCPv6:[332185641]>

[pid 26650] 02:49:20.168344 setsockopt(80620<TCPv6:[332185641]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 02:49:20.170243 getsockopt(80620<TCPv6:[332185641]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 02:49:20.172080 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 80621<TCPv6:[332185642]>

[pid 26650] 02:49:20.173949 setsockopt(80621<TCPv6:[332185642]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 02:49:20.175771 getsockopt(80621<TCPv6:[332185642]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 02:49:20.177615 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 80622<TCPv6:[332185643]>


The next occurrence was a couple of hours later.


strace: Process 33702 attached
[pid 26650] 22:12:27.202012 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 11<TCPv6:[284630924]>

[pid 26650] 22:12:27.206298 setsockopt(11<TCPv6:[284630924]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 22:12:27.208845 getsockopt(11<TCPv6:[284630924]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 22:12:27.211548 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 12<TCPv6:[284893387]>

[pid 26650] 22:12:27.216027 setsockopt(12<TCPv6:[284893387]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 22:12:27.220703 getsockopt(12<TCPv6:[284893387]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 22:12:27.225259 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 13<TCPv6:[284893388]>

[pid 26650] 22:12:27.227884 setsockopt(13<TCPv6:[284893388]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
....
[pid 26650] 22:13:17.024203 getsockopt(6783<TCPv6:[284977278]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 22:13:17.026705 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 6784<TCPv6:[284977279]>

[pid 26650] 22:13:17.029276 setsockopt(6784<TCPv6:[284977279]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 22:13:17.031247 getsockopt(6784<TCPv6:[284977279]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 26650] 22:13:17.033553 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 6785<TCPv6:[284977280]>

[pid 26650] 22:13:17.035586 setsockopt(6785<TCPv6:[284977280]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 26650] 22:13:17.037957 getsockopt(6785<TCPv6:[284977280]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0

Here is an example of the strace log during a standard connection made from our app:
[pid 26632] 12:04:35.478979 accept(7<TCPv6:[:::10054]>, {sa_family=AF_INET6, sin6_port=htons(37970), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>

[pid 26632] 12:04:35.485544 getsockname(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, {sa_family=AF_INET6, sin6_port=htons(10054), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 0
[pid 26632] 12:04:35.485673 getsockname(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, {sa_family=AF_INET6, sin6_port=htons(10054), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 0
[pid 26632] 12:04:35.490411 setsockopt(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 26632] 12:04:35.490627 setsockopt(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, 8) = 0
[pid 26632] 12:04:35.670835 setsockopt(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 26632] 12:04:35.670983 setsockopt(80623<TCPv6:[::ffff:127.0.0.1:10054->::ffff:127.0.0.1:37970]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, 8) = 0

I tried starting the ObjectDB Server on my work computer with macOS for a few hours without connecting to it, and it is doing the same thing.


The lsof output from macOS looks different:


java    16328 thommy   16u   IPv6 0x88d67f9ab22f022b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   17u   IPv6 0x88d67f9ab743c22b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   18u   IPv6 0x88d67f9ab743fbab       0t0                 TCP *:* (CLOSED)
java    16328 thommy   19u   IPv6 0x88d67f9ab743f54b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   20u   IPv6 0x88d67f9ab743eeeb       0t0                 TCP *:* (CLOSED)
java    16328 thommy   21u   IPv6 0x88d67f9ab743e88b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   22u   IPv6 0x88d67f9ab743e22b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   23u   IPv6 0x88d67f9ab7441bab       0t0                 TCP *:* (CLOSED)
java    16328 thommy   24u   IPv6 0x88d67f9ab744154b       0t0                 TCP *:* (CLOSED)
java    16328 thommy   25u   IPv6 0x88d67f9ab7440eeb       0t0                 TCP *:* (CLOSED)
java    16328 thommy   26u   IPv6 0x88d67f9ab744088b       0t0                 TCP *:* (CLOSED)

To me those are closed connections which are still wasting system resources, and I have 8000 of them after a few hours.

Please try it yourself and see whether it behaves the same way for you.
For me it's sufficient to just start the server and leave it running for several hours.

#11

Based on that, twice during the day the ODB Server went crazy, created a couple of threads and started creating sockets in some kind of loop.
This took about a minute, so I will try to create a Java thread dump while it happens.

The thread dumps are attached.

#12

> 1. Is the thread pool shared somehow for both JDO and JPA?

Yes, it is the same ObjectDB implementation, and it can be accessed using JPA and JDO alternately.

Regarding the sockets issue, maybe there is some network activity on your computer, such as port scanning, maybe from external sources (if the port is not protected by a Firewall), which causes these issues. Anyway, a simple test of running ObjectDB Server idle on Windows 10 for almost 24 hours doesn't show this reported issue.

Regarding the thread dumps, most of the threads are Garbage Collector (GC) threads and no abnormal ObjectDB activity is observed. A heap dump of the server (i.e. a snapshot of the used memory) may reveal more information than the thread dumps.

ObjectDB Support
#13

Regarding the sockets issue, maybe there is some network activity on your computer, such as port scanning, maybe from external sources (if the port is not protected by a Firewall)

I don't think so; I tried to monitor all network traffic and did not see any.

Anyway, a simple test of running ObjectDB Server idle on Windows 10 for almost 24 hours doesn't show this reported issue

That seems to be the same on our side. We tried on 2 Windows computers and did not see any issue, or we cannot diagnose it properly.
However, I also tried it in a virtual machine with a fresh Ubuntu install, and the issue happened there. SSL must be enabled; without SSL enabled, no sockets were appearing.

Please also do some tests on Linux.

I am attaching heap dumps created during socket creation.
They were created at 2-second intervals, so it should be enough to analyze just one of them.

#14

The heap dump shows 1860, 1520, and 1149 instances of SSLSocketImpl and similar classes (in the 3 hprof files). However, trying to find the GC roots that hold these instances (using Java VisualVM) shows that these instances are not in use and nothing prevents the JVM from removing them. Actually, it seems that during these 4-5 seconds the JVM released some of these objects, reducing the number from 1860 to 1149.

The ObjectDB server listens for new connections using the accept method of a ServerSocket, and when SSL is in use, it is an SSLServerSocketImpl. The accept method has a timeout of one second, and after that timeout another accept is called in a loop (the gap is to check for server stop commands, etc. between accept calls, but possibly this could be implemented in a different way). Accordingly, when the server is idle, accept is called about 3600 times every hour. This is the same with and without SSL, but maybe the Java SSL implementation generates some temporary objects even when no new connections are created.
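
As an illustration only, the accept-with-timeout pattern described above looks roughly like the following schematic sketch (this is not ObjectDB's actual source; the class, flag and handler names are invented):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    // Schematic accept-with-timeout loop; not ObjectDB code.
    public class PollingAcceptServer {
        private volatile boolean stopRequested; // set from another thread to stop

        public void serve(int port) throws IOException {
            try (ServerSocket serverSocket = new ServerSocket(port)) {
                serverSocket.setSoTimeout(1000); // wake up about once per second
                while (!stopRequested) {
                    try {
                        // Per the discussion above, each accept call may allocate a new
                        // socket object (especially with SSL) that is only reclaimed by GC.
                        Socket client = serverSocket.accept();
                        handle(client);
                    } catch (SocketTimeoutException e) {
                        // No client within the timeout; re-check stopRequested and loop.
                    }
                }
            }
        }

        private void handle(Socket client) throws IOException {
            client.close(); // placeholder for real request handling
        }
    }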

Were there any connections to the server in this test? What happened in these 4-5 seconds except taking the heap dumps?

It is possible that by trying to isolate the problem using a server that does nothing you see some normal behavior that does not reflect the real issue that you have.

ObjectDB Support
#15

The heap dump shows 1860, 1520, and 1149 instances of SSLSocketImpl and similar classes (in the 3 hprof files). However, trying to find the GC roots that hold these instances (using Java VisualVM) shows that these instances are not in use and nothing prevents the JVM from removing them. Actually, it seems that during these 4-5 seconds the JVM released some of these objects, reducing the number from 1860 to 1149.

Yes, it seems you are right about this. I forced the JVM to GC with the jcmd GC.run command, and it also freed the stuck sockets.
But it still doesn't sound right to me. I think that instances which are "waiting" to be collected should not waste system resources like sockets.
Our app also uses SSL connections, and I have never seen unused sockets like this there. Not a single one.

Accordingly, when the server is idle, accept is called about 3600 times every hour. This is the same with and without SSL, but maybe the Java SSL implementation generates some temporary objects even when no new connections are created.

But the sockets are not created every second. I was tracing it using strace and there are messages like this:

[pid 34741] 16:34:38.348869 accept(7<TCPv6:[:::10054]>, 0x7fec6e6c0500, [28]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 34741] 16:34:40.349338 accept(7<TCPv6:[:::10054]>, 0x7fec6e6c0500, [28]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 34741] 16:34:40.349602 accept(7<TCPv6:[:::10054]>, 0x7fec6e6c0500, [28]) = -1 EAGAIN (Resource temporarily unavailable)

Which corresponds with what you are saying. But this does not create the unwanted sockets.
Besides that, every once in a while there are messages like this:

[pid 34777] 16:24:32.919950 getsockopt(79<TCPv6:[397019269]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 34777] 16:24:32.922710 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 80<TCPv6:[397019270]>

[pid 34777] 16:24:32.925362 setsockopt(80<TCPv6:[397019270]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
[pid 34777] 16:24:32.928017 getsockopt(80<TCPv6:[397019270]>, SOL_SOCKET, SO_LINGER, {l_onoff=0, l_linger=0}, [8]) = 0
[pid 34777] 16:24:32.930710 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 81<TCPv6:[397019271]>

[pid 34777] 16:24:32.933341 setsockopt(81<TCPv6:[397019271]>, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0

Those are the ones that create the sockets which are left there until GC. And this is what I used in the detection script that takes the heap dumps.

Were there any connections to the server in this test? What happened in these 4-5 seconds except taking the heap dumps?

I don't think so. This server has been idle for days for observation, so no connections. The heap dump was taken by a script which reads the strace log, as I wrote above. But I am under the impression that creating a heap dump also triggers GC, as I saw today that some sockets were released after creating a heap dump.

It is possible that by trying to isolate the problem using a server that does nothing you see some normal behavior that does not reflect the real issue that you have.

I don't think so, as we have more ODB servers running, and the numbers of sockets used were similar on idle instances and on those used for testing. That's why we decided to investigate this further on the idle instances.

#16

I forgot to attach the heap dumps:
idle2_4_min - The server had been started for 4 minutes and was idle. I saw 40 instances of SSLSocketImpl in the heap dump, but there were no unused sockets open.

idle2_4h - The same server instance after 4 hours, with about 2000 unused sockets open.

idle2_4h_after_GC_trigger - I ran jcmd GC.run three times in a row, which cleared all unused sockets.

#17

Please try build 2.8.5_06 with the new logging enabled:

  <logger name="server" level="debug" />

During sudden events of socket creation, information on the new sockets should be written to the ObjectDB log file.

2 questions:

  1. If the sockets are released when the GC works, what is the actual problem? Why do you think that this is the problem that stops you in production?
  2. Are you sure that the new Linux node that you created for this test is protected from external port scanning?
ObjectDB Support
#18

Thank you, I will try that.
1. Well, we could live with that now that we know the sockets are freed by GC. But the problem is that we increased the open-file limit for the ODB process from 4096 to 102400, and the instance we left idle was still crashing twice a week because it reached that limit. I suppose this is because, since Java 9, the GC is much less aggressive when there is a lot of free memory. So we might need to run some script periodically to force a GC. But we will see.

2. Yes, it's a virtual machine on my work computer with networking disabled. Also, the server from which the heap dumps were taken has its ports closed in the firewall.

#19

The logging works:

[2021-04-28 09:56:26 #61 server]
Socket type: 1

[2021-04-28 09:56:26 #62 server]
New connection to server on port 10054 from /127.0.0.1:39598

[2021-04-28 09:56:26 #63 server]
Socket type: 1

[2021-04-28 09:56:26 #64 server]
New connection to server on port 10054 from /127.0.0.1:39600

[2021-04-28 09:56:26 #65 server]
Socket type: 1

 

 

So now you can see that there are no real connections, but the sockets still keep increasing until the crash. Below is the whole file content from the log archive; I will also add it as an attachment:

[2021-05-12 09:10:05 #1 server] Server on port 10054 has started by 21550@testing

[2021-05-15 18:34:17 #2 server]
java.io.IOException: Too many open files
    at java.base/sun.nio.ch.Net.accept(Native Method)
    at java.base/sun.nio.ch.NioSocketImpl.timedAccept(NioSocketImpl.java:711)
    at java.base/sun.nio.ch.NioSocketImpl.accept(NioSocketImpl.java:752)
    at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:684)
    at java.base/java.net.ServerSocket.platformImplAccept(ServerSocket.java:650)
    at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:605)
    at java.base/sun.security.ssl.SSLServerSocketImpl.accept(SSLServerSocketImpl.java:271)
    at com.objectdb.o.SMR.n(SMR.java:296)
    at com.objectdb.o.SMR.f(SMR.java:179)
    at com.objectdb.o.TOL.run(TOL.java:117)
    at com.objectdb.Server.runCommand(Server.java:200)
    at com.objectdb.Server.run(Server.java:104)
    at com.objectdb.Server.main(Server.java:63)

[2021-05-15 18:34:17 #3 server]
Server on port 10054 has stopped by 21550@testing

We will try to decrease Xmx, as it now uses the default value, which is 32 GB.

 

#20

OK. Thank you for the update. This may confirm the suspicion that the issue is related to calling ServerSocket's accept with a timeout of 2 seconds, in a loop, in order to check for server stop requests. Looking at the source code of Java's ServerSocket class, it does seem that every call to accept allocates a socket/port/thread/file, which may not be released until garbage collection.

There are two main methods to stop ServerSocket listening:

    https://stackoverflow.com/questions/2983835/how-can-i-interrupt-a-serversocket-accept-method

As discussed, ObjectDB used a loop of calling accept with a timeout.

Build 2.8.5_07 now switches to the other method of calling accept with no timeout and stopping by closing the ServerSocket instance from another thread. Please check if this solves the issue.
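
For reference, a schematic sketch of that second approach (again illustrative only, not the actual ObjectDB implementation; the class and method names are invented): accept blocks with no timeout, and a stop request closes the ServerSocket from another thread, which makes the blocked accept call throw a SocketException.

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.SocketException;

    // Schematic blocking-accept server stopped by closing the ServerSocket; not ObjectDB code.
    public class BlockingAcceptServer {
        private final ServerSocket serverSocket;

        public BlockingAcceptServer(int port) throws IOException {
            serverSocket = new ServerSocket(port);
        }

        public void serve() throws IOException {
            while (true) {
                try {
                    Socket client = serverSocket.accept(); // blocks with no timeout
                    handle(client);
                } catch (SocketException e) {
                    return; // stop() closed the socket, interrupting accept
                }
            }
        }

        public void stop() throws IOException {
            serverSocket.close(); // called from another thread to stop the server
        }

        private void handle(Socket client) throws IOException {
            client.close(); // placeholder for real request handling
        }
    }

Since there is no periodic wake-up in this pattern, no throwaway socket objects should be created while the server is idle.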

ObjectDB Support
#21

Hello,

It works like a charm.
Sockets open: 0.
Files open: 39.

The server has now been running for almost 2 days.

Thank you very much.

#22

Thanks for the update and for helping to improve ObjectDB.

ObjectDB Support
