Thanks Ilya,
I attached the Ignite logs (only): one at level Info and another at level Debug.
We embed Ignite, so the configuration is done in code:
IgniteConfiguration [igniteInstanceName=test-imac.accorto.com, pubPoolSize=8, svcPoolSize=null, callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=8, dataStreamerPoolSize=8, utilityCachePoolSize=8, utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, igniteHome=/Users/jorg/ignite, igniteWorkDir=null, mbeanSrv=null, nodeId=null, marsh=null, marshLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000, netCompressionLevel=1, sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0, marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false], segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, commSpi=null, evtSpi=null, colSpi=null, deploySpi=null, indexingSpi=null, addrRslvr=null, encryptionSpi=null, clientMode=false, rebalanceThreadPoolSize=1, rebalanceTimeout=10000, rebalanceBatchesPrefetchCnt=2, rebalanceThrottle=0, rebalanceBatchSize=524288, txCfg=TransactionConfiguration [txSerEnabled=false, dfltIsolation=REPEATABLE_READ, dfltConcurrency=PESSIMISTIC, dfltTxTimeout=0, txTimeoutOnPartitionMapExchange=0, deadlockTimeout=10000, pessimisticTxLogSize=0, pessimisticTxLogLinger=10000, tmLookupClsName=null, txManagerFactory=null, useJtaSync=false], cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=CONTINUOUS, p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100, timeSrvPortRange=100, failureDetectionTimeout=60000, sysWorkerBlockedTimeout=null, clientFailureDetectionTimeout=30000, metricsLogFreq=1800000, hadoopCfg=null, connectorCfg=ConnectorConfiguration [jettyPath=null, host=null, port=11211, noDelay=true, directBuf=false, sndBufSize=32768, rcvBufSize=32768, idleQryCurTimeout=600000, 
idleQryCurCheckFreq=60000, sndQueueLimit=0, selectorCnt=4, idleTimeout=7000, sslEnabled=false, sslClientAuth=false, sslCtxFactory=null, sslFactory=null, portRange=100, threadPoolSize=8, msgInterceptor=null], odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration [seqReserveSize=1000, cacheMode=REPLICATED, backups=1, aff=null, grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration [sysRegionInitSize=41943040, sysRegionMaxSize=104857600, pageSize=0, concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=default, maxSize=6871947673, initSize=268435456, swapPath=null, pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, metricsEnabled=true, metricsSubIntervalCount=5, metricsRateTimeInterval=60000, persistenceEnabled=false, checkpointPageBufSize=0], dataRegions=null, storagePath=null, checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL, walHistSize=20, maxWalArchiveSize=1073741824, walSegments=10, walSegmentSize=67108864, walPath=db/wal, walArchivePath=db/wal/archive, metricsEnabled=true, walMode=FSYNC, walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@183559f9, metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, walCompactionEnabled=false, walCompactionLevel=1, checkpointReadLockTimeout=null], activeOnStart=true, autoActivation=true, longQryWarnTimeout=3000, sqlConnCfg=null, cliConnCfg=ClientConnectorConfiguration [host=null, port=10800, portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, handshakeTimeout=60000, jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false, 
useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null], mvccVacuumThreadCnt=2, mvccVacuumFreq=5000, authEnabled=false, failureHnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], commFailureRslvr=null]
In short: no persistence, and the NoOpFailureHandler (for debugging).
Nevertheless, this does not run in debug mode.
Cheers,
Jorg
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setClientMode(false);
String igniteHome = IgniteUtil.getIgniteHome();
cfg.setIgniteHome(igniteHome);
m_instanceName = Sys.get().getName();
cfg.setIgniteInstanceName(m_instanceName);
cfg.setAuthenticationEnabled(IgniteUtil.isSecurityEnabled());
cfg.setDeploymentMode(DeploymentMode.CONTINUOUS);
cfg.setMetricsLogFrequency(IgniteConfiguration.DFLT_METRICS_LOG_FREQ * 30);
configLog(cfg);
System.setProperty(IgniteSystemProperties.IGNITE_JVM_PAUSE_DETECTOR_DISABLED, "true");
System.setProperty(IgniteSystemProperties.IGNITE_JVM_PAUSE_DETECTOR_PRECISION, String.valueOf(DurationUtil.MS_1MIN));
System.setProperty(IgniteSystemProperties.IGNITE_JVM_PAUSE_DETECTOR_THRESHOLD, String.valueOf(DurationUtil.MS_1HOUR));
cfg.setFailureHandler(new NoOpFailureHandler());
cfg.setFailureDetectionTimeout(IgniteConfiguration.DFLT_FAILURE_DETECTION_TIMEOUT * 6);
ClientConnectorConfiguration clientConfig = cfg.getClientConnectorConfiguration();
clientConfig.setHandshakeTimeout(ClientConnectorConfiguration.DFLT_HANDSHAKE_TIMEOUT * 6);
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.setWalMode(WALMode.FSYNC);
DataRegionConfiguration regionCfg = storageCfg.getDefaultDataRegionConfiguration();
regionCfg.setMetricsEnabled(true);
if (Environment.isOnProduction()) {
    regionCfg.setPersistenceEnabled(true);
}
storageCfg.setMetricsEnabled(true);
cfg.setDataStorageConfiguration(storageCfg);
cfg.setCacheConfiguration(cacheCache());
AtomicConfiguration atomicCfg = cfg.getAtomicConfiguration();
atomicCfg.setCacheMode(CacheMode.REPLICATED);
cfg.setDiscoverySpi(getTcpDiscovery());
cfg.setIncludeEventTypes(
    EventType.EVT_CACHE_OBJECT_PUT, EventType.EVT_CACHE_OBJECT_READ, EventType.EVT_CACHE_OBJECT_REMOVED,
    EventType.EVT_TASK_STARTED, EventType.EVT_TASK_FINISHED, EventType.EVT_TASK_TIMEDOUT,
    EventType.EVT_TASK_SESSION_ATTR_SET, EventType.EVT_TASK_REDUCED
);
return cfg;
------------------------------
Jorg Janke
CTO Accorto, Inc.
------------------------------
Original Message:
Sent: 11-15-2019 04:12 AM
From: Ilya Kasnacheev
Subject: Unable to perform handshake within timeout
Hello!
I don't think this is a port scan. It is just the creation of an outbound socket, for which the operating system assigns a random port.
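To illustrate the point about outbound sockets, here is a minimal sketch using only the JDK (no Ignite involved). The class name `EphemeralPortDemo` and its `Ports` holder are hypothetical, and the listener binds to port 0 only so that the sketch runs anywhere; in the thread the listener would be the fixed client connector port. The port the server sees as `remoteAddr` is simply the local port the OS picked for the client.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Ignite or GridGain.
final class EphemeralPortDemo {

    /** Ports observed for a single loopback connection. */
    static final class Ports {
        final int listenerPort;           // port the server socket listens on
        final int clientLocalPort;        // port the OS assigned to the outbound socket
        final int remotePortSeenByServer; // what the server would log as the remote port

        Ports(int listenerPort, int clientLocalPort, int remotePortSeenByServer) {
            this.listenerPort = listenerPort;
            this.clientLocalPort = clientLocalPort;
            this.remotePortSeenByServer = remotePortSeenByServer;
        }
    }

    /** Opens one loopback connection and reports the ports on both ends. */
    static Ports connectOnce() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             // The client does not choose its local port; the OS assigns one.
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            return new Ports(server.getLocalPort(), client.getLocalPort(), accepted.getPort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Each new connection typically gets a different OS-assigned local port.
        for (int i = 0; i < 3; i++) {
            Ports p = connectOnce();
            System.out.println("listener=" + p.listenerPort
                + " clientLocal=" + p.clientLocalPort
                + " remoteSeenByServer=" + p.remotePortSeenByServer);
        }
    }
}
```

Under this reading, many distinct remote ports in the warnings would indicate many separate connections to one listener port, not a scan across listener ports.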
Can you provide more of the log? I don't understand why nodes would connect to the Client listener (for ODBC/JDBC/Thin client) and not to the Communication port. Can you provide your Discovery, Communication and Connector configurations?
Regards,
------------------------------
Ilya Kasnacheev
Community Support Specialist
GridGain
Original Message:
Sent: 11-14-2019 07:37 PM
From: Jorg Janke
Subject: Unable to perform handshake within timeout
We just moved from Apache Ignite 2.7.6 to GridGain 2.7.7
and now receive more than 10,000 warnings like
2019-11-14 23:45:00.721 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54164]
2019-11-14 23:45:00.866 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54166]
2019-11-14 23:45:00.866 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54168]
in test environments.
We embed Ignite with our apps within Tomcat, and the tests bootstrap the entire environment for every test.
The application mainly uses JDBC to communicate with Ignite.
The first attempt was to increase the timeouts:
cfg.setFailureDetectionTimeout(IgniteConfiguration.DFLT_FAILURE_DETECTION_TIMEOUT * 6); // from 10sec to 1min
ClientConnectorConfiguration clientConfig = cfg.getClientConnectorConfiguration();
clientConfig.setHandshakeTimeout(ClientConnectorConfiguration.DFLT_HANDSHAKE_TIMEOUT * 6); // from 10sec to 1min
but the settings get lost when the ClientListenerNioListener is instantiated (the timeout is back to 10000).
It seems it scans addresses from 49154 to 65535 (with an error message for each port).
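One way to check the observed range is to pull the `remoteAddr` ports and the reported timeout out of the warnings. The following is a minimal sketch that assumes the log format shown in the three sample lines above; the class `HandshakeWarningPorts` and its methods are hypothetical helpers, not part of Ignite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for summarizing handshake-timeout warnings.
final class HandshakeWarningPorts {

    // Captures the timeout (group 1) and the remote port (group 2) of one warning.
    private static final Pattern WARNING = Pattern.compile(
        "Unable to perform handshake within timeout "
            + "\\[timeout=(\\d+), remoteAddr=/[^\\]]*:(\\d+)\\]");

    /** Returns the remote ports of all matching warnings, in log order. */
    static List<Integer> remotePorts(List<String> logLines) {
        List<Integer> ports = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = WARNING.matcher(line);
            if (m.find()) {
                ports.add(Integer.parseInt(m.group(2)));
            }
        }
        return ports;
    }

    /** Summarizes how many warnings matched and the port range they cover. */
    static String summarize(List<String> logLines) {
        List<Integer> ports = remotePorts(logLines);
        if (ports.isEmpty()) {
            return "count=0";
        }
        IntSummaryStatistics stats =
            ports.stream().mapToInt(Integer::intValue).summaryStatistics();
        return String.format("count=%d min=%d max=%d",
            stats.getCount(), stats.getMin(), stats.getMax());
    }

    public static void main(String[] args) {
        // The three sample lines quoted in this message.
        List<String> sample = Arrays.asList(
            "2019-11-14 23:45:00.721 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54164]",
            "2019-11-14 23:45:00.866 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54166]",
            "2019-11-14 23:45:00.866 WARN [grid-timeout-worker-#23%test-imac.accorto.com%] ClientListenerNioListener.warning: Unable to perform handshake within timeout [timeout=10000, remoteAddr=/127.0.0.1:54168]");
        System.out.println(summarize(sample));
    }
}
```

Run against the full log, the count and the min/max ports would show whether the warnings really cover every port in the range or only a subset.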
My guess is that our bootstrap code keeps the box so busy that the handshake fails.
Surprisingly, the application works correctly (once the server is done logging the thousands of warnings).
Questions:
(1) How do we fix this?
(2) Is it a bug that the Configuration settings are ignored?
(3) Where is the port range for scanning set? From what I saw, the default port range is usually 100, not 20,000+.
------------------------------
Jorg Janke
CTO Accorto, Inc.
------------------------------