General Questions

Thick client and Blocked system-critical thread has been detected.

  • 1.  Thick client and Blocked system-critical thread has been detected.

     
    Posted 15 days ago
    Edited by Dmitriy Shubin 15 days ago
    Good day!

    I have a test cluster of two nodes with a distributed cache between them. Persistence is enabled.
    I want to write a thick client in Java (to get data directly from the nodes).
    I wrote this code:

    package com.bercut.thick;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    import java.util.Arrays;

    public class ThickClientTest {

        public static void main(String[] args) {
            TcpDiscoverySpi spi = new TcpDiscoverySpi();
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
            ipFinder.setAddresses(Arrays.asList("192.168.16.31", "192.168.16.32"));
            spi.setIpFinder(ipFinder);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setClientMode(true);
            cfg.setDiscoverySpi(spi);

            Ignite ignite = Ignition.start(cfg);

            IgniteCache<String, String> cache = ignite.getOrCreateCache("bercut");

            for (int i = 1; i <= 100; i++) {
                cache.put(Integer.toString(i), "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890");
            }

            ignite.close();
        }
    }

    But at startup, the following error occurs:

    [15:29:55] __________ ________________
    [15:29:55] / _/ ___/ |/ / _/_ __/ __/
    [15:29:55] _/ // (7 7 // / / / / _/
    [15:29:55] /___/\___/_/|_/___/ /_/ /___/
    [15:29:55]
    [15:29:55] ver. 8.7.6#20190704-sha1:6449a674
    [15:29:55] 2019 Copyright(C) Apache Software Foundation
    [15:29:55]
    [15:29:55] Ignite documentation: http://gridgain.com
    [15:29:55]
    [15:29:55] Quiet mode.
    [15:29:55] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
    [15:29:55] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
    [15:29:55]
    [15:29:55] OS: Windows 10 10.0 amd64
    [15:29:55] VM information: Java(TM) SE Runtime Environment 1.8.0_181-b13 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.181-b13
    [15:29:55] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
    [15:29:55] Initial heap size is 256MB (should be no less than 512MB, use -Xms512m -Xmx512m).
    [15:29:55] Configured plugins:
    [15:29:55] ^-- None
    [15:29:55]
    [15:29:55] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
    [15:30:02] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
    [15:30:02] Security status [authentication=off, tls/ssl=off]
    [15:30:03] REST protocols do not start on client node. To start the protocols on client node set '-DIGNITE_REST_START_ON_CLIENT=true' system property.
    Nov 20, 2019 3:26:06 PM org.apache.ignite.logger.java.JavaLogger error
    SEVERE: Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=52s]
    Nov 20, 2019 3:26:06 PM java.util.logging.LogManager$RootLogger log
    SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1574252714044]]]
    class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1574252714044]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1832)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1827)
    at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:232)
    at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:296)
    at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:220)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)

    Can you tell me what is wrong with the client or with the settings (client/server)?

    Log from Server:
    [13:07:15,939][INFO][grid-nio-worker-tcp-comm-2-#26%L2Cache%][TcpCommunicationSpi] Established outgoing communication connection [locAddr=/192.168.16.31:37982, rmtAddr=shubin--note.office.bercut.ru/192.168.7.137:47100]
    [13:07:15,940][INFO][rest-#50%L2Cache%][TcpCommunicationSpi] TCP client created [client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=2, bytesRcvd=78173, bytesSent=6623, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-2, igniteInstanceName=L2Cache, finished=false, heartbeatTs=1574255235935, hashCode=793398, interrupted=false, runner=grid-nio-worker-tcp-comm-2-#26%L2Cache%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=2e17f9cf-4332-43f9-920c-47918e838b8f, consistentId=2e17f9cf-4332-43f9-920c-47918e838b8f, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 127.0.0.1, 172.25.195.129, 192.168.7.137], sockAddrs=[/172.25.195.129:0, shubin--note.office.bercut.ru/192.168.7.137:0, /0:0:0:0:0:0:0:1:0, /10.0.75.1:0, /127.0.0.1:0], discPort=0, order=25, intOrder=14, lastExchangeTime=1574255199069, loc=false, ver=8.7.6#20190704-sha1:6449a674, isClient=true], connected=false, connectCnt=2, queueLimit=4096, reserveCnt=8, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=2e17f9cf-4332-43f9-920c-47918e838b8f, consistentId=2e17f9cf-4332-43f9-920c-47918e838b8f, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 127.0.0.1, 172.25.195.129, 192.168.7.137], sockAddrs=[/172.25.195.129:0, shubin--note.office.bercut.ru/192.168.7.137:0, /0:0:0:0:0:0:0:1:0, /10.0.75.1:0, /127.0.0.1:0], discPort=0, order=25, intOrder=14, lastExchangeTime=1574255199069, loc=false, ver=8.7.6#20190704-sha1:6449a674, isClient=true], connected=false, connectCnt=2, queueLimit=4096, reserveCnt=8, pairedConnections=false], super=GridNioSessionImpl 
[locAddr=/192.168.16.31:37982, rmtAddr=shubin--note.office.bercut.ru/192.168.7.137:47100, createTime=1574255235935, closeTime=0, bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1574255235935, lastSndTime=1574255235935, lastRcvTime=1574255235935, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@6ce1c90b, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]], super=GridAbstractCommunicationClient [lastUsed=1574255235935, closed=false, connIdx=0]], duration=30902ms]
    [13:07:26,668][INFO][grid-nio-worker-tcp-comm-3-#27%L2Cache%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/192.168.16.31:47100, rmtAddr=/192.168.7.137:54577]
    [13:07:26,669][INFO][grid-nio-worker-tcp-comm-3-#27%L2Cache%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=86060ccb-bc15-4f16-be8d-8a65036491a0, locNodeOrder=1, rmtNode=2e17f9cf-4332-43f9-920c-47918e838b8f, rmtNodeOrder=25]
    [13:07:27,511][INFO][tcp-disco-sock-reader-#20%L2Cache%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.7.137:54546, rmtPort=54546
    [13:07:33,035][INFO][db-checkpoint-thread-#77%L2Cache%][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointBeforeLockTime=1ms, checkpointLockWait=0ms, checkpointListenersExecuteTime=1ms, checkpointLockHoldTime=2ms, reason='timeout']
    [13:07:35,195][SEVERE][tcp-disco-msg-worker-#2%L2Cache%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=18s]
    [13:07:35,196][WARNING][tcp-disco-msg-worker-#2%L2Cache%][G] Thread [name="tcp-comm-worker-#1%L2Cache%", id=52, state=RUNNABLE, blockCnt=0, waitCnt=2093]

    [13:07:35,196][SEVERE][tcp-disco-msg-worker-#2%L2Cache%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=L2Cache, finished=false, heartbeatTs=1574255236207]]]
    class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=L2Cache, finished=false, heartbeatTs=1574255236207]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1832)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1827)
    at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:232)
    at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:296)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2792)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7519)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2854)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7457)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61)
    [13:07:35,197][WARNING][tcp-disco-msg-worker-#2%L2Cache%][FailureProcessor] No deadlocked threads detected.
    [13:07:35,331][WARNING][tcp-disco-msg-worker-#2%L2Cache%][FailureProcessor] Thread dump at 2019/11/20 13:07:35 GMT
    Thread [name="sys-#8483%L2Cache%", id=8557, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@434d908c, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="sys-#8482%L2Cache%", id=8556, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@434d908c, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    ...

    Thread [name="srvc-deploy-#82%L2Cache%", id=142, state=WAITING, blockCnt=0, waitCnt=23]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@63d03a91, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    ...

    Thread [name="db-checkpoint-thread-#77%L2Cache%", id=133, state=TIMED_WAITING, blockCnt=0, waitCnt=2809]
    Lock [object=o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer@1579520c, ownerName=null, ownerId=-1]
    at java.lang.Object.wait(Native Method)
    at o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.waitCheckpointEvent(GridCacheDatabaseSharedManager.java:3632)
    at o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3179)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="wal-segment-syncer-#74%L2Cache%", id=130, state=TIMED_WAITING, blockCnt=0, waitCnt=167730]
    at java.lang.Thread.sleep(Native Method)
    at o.a.i.i.util.IgniteUtils.sleep(IgniteUtils.java:7656)
    at o.a.i.i.processors.cache.persistence.wal.filehandle.FileHandleManagerImpl$WalSegmentSyncer.body(FileHandleManagerImpl.java:595)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="wal-file-archiver%L2Cache-#73%L2Cache%", id=129, state=WAITING, blockCnt=0, waitCnt=1]
    Lock [object=o.a.i.i.processors.cache.persistence.wal.aware.SegmentCurrentStateStorage@3a9f69df, ownerName=null, ownerId=-1]
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at o.a.i.i.processors.cache.persistence.wal.aware.SegmentCurrentStateStorage.awaitSegment(SegmentCurrentStateStorage.java:72)
    at o.a.i.i.processors.cache.persistence.wal.aware.SegmentCurrentStateStorage.waitNextSegmentForArchivation(SegmentCurrentStateStorage.java:89)
    at o.a.i.i.processors.cache.persistence.wal.aware.SegmentAware.waitNextSegmentForArchivation(SegmentAware.java:78)
    at o.a.i.i.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1749)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="tcp-disco-sock-reader-#7%L2Cache%", id=118, state=RUNNABLE, blockCnt=1, waitCnt=1]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked java.io.BufferedInputStream@5468f8f2
    at o.a.i.marshaller.jdk.JdkMarshallerInputStreamWrapper.read(JdkMarshallerInputStreamWrapper.java:52)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2663)
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2679)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at o.a.i.marshaller.jdk.JdkMarshallerObjectInputStream.<init>(JdkMarshallerObjectInputStream.java:42)
    at o.a.i.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:136)
    at o.a.i.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:93)
    at o.a.i.i.util.IgniteUtils.unmarshal(IgniteUtils.java:9967)
    at o.a.i.spi.discovery.tcp.ServerImpl$SocketReader.body(ServerImpl.java:6535)
    at o.a.i.spi.IgniteSpiThread.run(IgniteSpiThread.java:61)

    Thread [name="rest-#54%L2Cache%", id=106, state=TIMED_WAITING, blockCnt=10, waitCnt=27040]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@b5e3941, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    Thread [name="rest-#53%L2Cache%", id=105, state=TIMED_WAITING, blockCnt=9, waitCnt=27045]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@b5e3941, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="rest-#52%L2Cache%", id=104, state=TIMED_WAITING, blockCnt=38, waitCnt=27035]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@b5e3941, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Thread [name="rest-#51%L2Cache%", id=103, state=TIMED_WAITING, blockCnt=49, waitCnt=27040]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@b5e3941, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Thread [name="rest-#50%L2Cache%", id=102, state=WAITING, blockCnt=33, waitCnt=27038]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2956)
    at o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2758)
    at o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2717)
    at o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1655)
    at o.a.i.i.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1730)
    at o.a.i.i.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1435)
    at o.a.i.i.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:665)
    at o.a.i.i.processors.task.GridTaskWorker.body(GridTaskWorker.java:537)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at o.a.i.i.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:809)
    at o.a.i.i.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:533)
    at o.a.i.i.IgniteComputeImpl.executeAsync0(IgniteComputeImpl.java:487)
    at o.a.i.i.IgniteComputeImpl.executeAsync(IgniteComputeImpl.java:467)
    at o.a.i.i.v.compute.VisorGatewayTask$VisorGatewayJob.execute(VisorGatewayTask.java:461)
    at o.a.i.i.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
    at o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6843)
    at o.a.i.i.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
    at o.a.i.i.processors.job.GridJobWorker.body(GridJobWorker.java:490)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at o.a.i.i.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1124)
    at o.a.i.i.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1419)
    at o.a.i.i.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:665)
    at o.a.i.i.processors.task.GridTaskWorker.body(GridTaskWorker.java:537)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at o.a.i.i.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:809)
    at o.a.i.i.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:533)
    at o.a.i.i.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:513)
    at o.a.i.i.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:226)
    at o.a.i.i.processors.rest.handlers.task.GridTaskCommandHandler.handleAsync(GridTaskCommandHandler.java:162)
    at o.a.i.i.processors.rest.GridRestProcessor.handleRequest(GridRestProcessor.java:318)
    at o.a.i.i.processors.rest.GridRestProcessor.access$100(GridRestProcessor.java:99)
    at o.a.i.i.processors.rest.GridRestProcessor$2.body(GridRestProcessor.java:174)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Locked synchronizers:
    java.util.concurrent.ThreadPoolExecutor$Worker@6364cf3c
    Thread [name="rest-#49%L2Cache%", id=101, state=TIMED_WAITING, blockCnt=10, waitCnt=27033]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@b5e3941, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    ...



    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------


  • 2.  RE: Thick client and Blocked system-critical thread has been detected.

     
    Posted 15 days ago
    Try generating the cluster config from the Web Console; then, in ClientNodeCodeStartup.java -> main, write this:

    public static void main(String[] args) throws Exception {
        Ignite ignite = Ignition.start(ClientConfigurationFactory.createConfiguration());

        IgniteCache<String, String> cache = ignite.getOrCreateCache("bercut");

        int rec_count = 100000;

        for (int i = 0; i < rec_count; ++i) {
            cache.put(String.valueOf(i), "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890");
        }
    }

    Same problem :(

    Nov 20, 2019 5:06:57 PM org.apache.ignite.logger.java.JavaLogger error
    SEVERE: Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=11s]
    Nov 20, 2019 5:06:57 PM java.util.logging.LogManager$RootLogger log
    SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=L2Cache, finished=false, heartbeatTs=1574258805441]]]
    class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=L2Cache, finished=false, heartbeatTs=1574258805441]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1832)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1827)
    at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:232)
    at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:296)
    at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2960)
    at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2899)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)

    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------



  • 3.  RE: Thick client and Blocked system-critical thread has been detected.

    Posted 15 days ago
    Hi Dmitriy,

    According to the logs, the server node (192.168.16.31) rejected the connection from the client node (192.168.7.137) because the server node was already in the process of establishing a connection to the client node, and it seems it was unable to do so. Could you please check your network configuration and ensure that the server node has access to the client node and is able to establish a connection?
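    If the client machine has several network interfaces (the client node in the server log advertises 10.0.75.1, 127.0.0.1, 172.25.195.129 and 192.168.7.137), one thing worth trying is to pin the client's communication SPI to the one address the servers can actually reach. A minimal sketch, assuming 192.168.7.137 is that address (the class and method names are made up for illustration):

    ```java
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

    public class ClientCommAddress {

        static IgniteConfiguration clientConfig() {
            TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
            // Bind and advertise only the address the server nodes can reach;
            // 192.168.7.137 is an assumption taken from the logs above.
            commSpi.setLocalAddress("192.168.7.137");
            commSpi.setLocalPort(47100);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setClientMode(true);
            cfg.setCommunicationSpi(commSpi);
            // Discovery SPI configuration stays the same as in the original example.
            return cfg;
        }
    }
    ```

    Alternatively, `IgniteConfiguration.setLocalHost(...)` restricts both discovery and communication to a single interface.
    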

    ------------------------------
    Igor Belyakov
    Software Engineer
    GridGain
    ------------------------------



  • 4.  RE: Thick client and Blocked system-critical thread has been detected.

     
    Posted 15 days ago
    I see the client node (192.168.7.137) in the Web Console:

    [screenshot]

    and sometimes 1-2 rows are added:

    [screenshot]

    but the client node (192.168.7.137) is not in the baseline. Maybe I need to add it? But how? The consistent ID changes on every client start :(


    And when I use this code (a thin client), everything works well:
    // Imports and class wrapper added for completeness:
    import org.apache.ignite.Ignition;
    import org.apache.ignite.client.ClientCache;
    import org.apache.ignite.client.ClientException;
    import org.apache.ignite.client.IgniteClient;
    import org.apache.ignite.configuration.ClientConfiguration;

    public class ThinClientTest {

        public static void main(String[] args) {
            ClientConfiguration cfg = new ClientConfiguration().setAddresses("192.168.16.31:10800", "192.168.16.32:10800");
            try (IgniteClient client = Ignition.startClient(cfg)) {
                ClientCache<String, String> cache = client.getOrCreateCache("bercut");

                for (int i = 0; i < 10000; ++i) {
                    cache.put(String.valueOf(i), "123456789012345678901234567890123456789012345678901234567890");
                }
            }
            catch (ClientException e) {
                System.err.println(e.getMessage());
            }
            catch (Exception e) {
                System.err.format("Unexpected failure: %s%n", e);
            }
        }
    }




    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------



  • 5.  RE: Thick client and Blocked system-critical thread has been detected.

     
    Posted 14 days ago
    When I try to run the client on 192.168.16.31 (node 1), everything is OK.
    When I try to run the client on 192.168.16.32 (node 2), the connection is rejected.

    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------



  • 6.  RE: Thick client and Blocked system-critical thread has been detected.

    Posted 14 days ago
    A client node shouldn't be part of the baseline since it doesn't store data. More information about baseline topology can be found here:
    https://www.gridgain.com/docs/latest/developers-guide/persistence/native-persistence#baseline-topology-and-cluster-activation

    Also, do I understand correctly that, when you specify the server nodes' ports in ClientConfiguration, the client connects to the cluster successfully and put operations work properly?

    ------------------------------
    Igor Belyakov
    Software Engineer
    GridGain
    ------------------------------



  • 7.  RE: Thick client and Blocked system-critical thread has been detected.

     
    Posted 14 days ago
    It connects with the thin client to the first node and works correctly.

    The thick client works only when I start it on the server hosting the first started node.
    If I start the thick client on 192.168.16.31 (first node) and connect to the first or second node (192.168.16.31 or 192.168.16.32), I see in the log:

    [16:42:55,613][INFO][grid-nio-worker-tcp-comm-3-#27%Cluster%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/172.17.0.1:47100, rmtAddr=/192.168.16.31:36054]
    [16:42:55,664][INFO][grid-nio-worker-tcp-comm-0-#24%Cluster%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/172.17.0.1:47100, rmtAddr=/192.168.16.31:36056]
    ...

    and I see it only in the log of the first node!

    I think the first node opens the connection not on the 192.168.16.31 interface... and works like a proxy: the client goes to the proxy, and the proxy then takes the data from the other node, like a thin client. But I need a client that can go directly to the node holding the data, without a proxy.

    P.S. Can I write in Russian on this forum? If so, I can express myself more precisely :)

    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------



  • 8.  RE: Thick client and Blocked system-critical thread has been detected.

    Posted 13 days ago
    A thin client works because it requires a connection in only one direction; a thick client, on the other hand, requires a bidirectional connection between the client and server nodes.
    Since 192.168.16.31 is first in the address list, the client tries to connect to that address both when you run it on the first node and when you run it on the second node. But in the second case, it seems the server node on 192.168.16.31:47100 is unable to establish a connection back to the client node on 192.168.16.32:47101.
    Could you please check that communication ports 47100 and 47101 are open and reachable on both machines?
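    To check reachability from code rather than with `lsof`, a tiny JDK-only helper (hypothetical, not part of Ignite) can probe the communication ports from each machine:

    ```java
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class PortCheck {

        // Returns true if a TCP connection to host:port succeeds within timeoutMs.
        static boolean isReachable(String host, int port, int timeoutMs) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), timeoutMs);
                return true;
            } catch (IOException e) {
                return false;
            }
        }

        public static void main(String[] args) {
            // Ignite communication ports to verify; the addresses are the
            // server nodes from this thread.
            String[] hosts = {"192.168.16.31", "192.168.16.32"};
            int[] ports = {47100, 47101};

            for (String host : hosts) {
                for (int port : ports) {
                    System.out.printf("%s:%d reachable: %b%n",
                            host, port, isReachable(host, port, 1000));
                }
            }
        }
    }
    ```

    Run it from each node against the other; every port a thick client may bind (47100, 47101, ...) has to be reachable in both directions.
    
    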


    ------------------------------
    Igor Belyakov
    Software Engineer
    GridGain
    ------------------------------



  • 9.  RE: Thick client and Blocked system-critical thread has been detected.

     
    Posted 13 days ago
    Port 47100 is open on 192.168.16.31 and 192.168.16.32:
    java 27603 bercut 50u IPv6 1787850 0t0 TCP *:47100 (LISTEN)
    java 27603 bercut 1135u IPv6 1791913 0t0 TCP 192.168.16.32:47100->192.168.16.31:59376 (ESTABLISHED)

    Port 47101 is not open.

    ------------------------------
    Dmitriy Shubin
    DB Developer
    Bercut
    ------------------------------


