Hello!
We will consider this change.
Have you tried using the data streamer by doing:
set streaming on allow_overwrite off
In allow_overwrite=off mode there should not be insert conflicts.
https://www.gridgain.com/docs/latest/developers-guide/data-streaming#overwriting-existing-keys
You can also split your INSERTs into smaller batches; there should be no serious performance hit if you're using streaming mode.
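As a sketch of what such a session could look like over JDBC or sqlline (table B, column names id/val, and the key ranges are placeholders; whether INSERT ... SELECT is accepted inside a streaming session may depend on your GridGain version, so please check the linked docs):

```sql
-- Enable streaming; with ALLOW_OVERWRITE OFF, rows whose keys
-- already exist are skipped rather than reported as conflicts.
SET STREAMING ON ALLOW_OVERWRITE OFF;

-- Split the one big INSERT into smaller key ranges instead of a
-- single "INSERT INTO B SELECT * FROM A".
INSERT INTO B (id, val) SELECT id, val FROM A WHERE id BETWEEN 1 AND 100000;
INSERT INTO B (id, val) SELECT id, val FROM A WHERE id BETWEEN 100001 AND 200000;

-- Turn streaming off to flush the streamer before querying B.
SET STREAMING OFF;
```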
Regards,
------------------------------
Ilya Kasnacheev
Community Support Specialist
GridGain
------------------------------
Original Message:
Sent: 11-30-2020 10:14 PM
From: Qiaoqiao Sun
Subject: Is there any way to limit the size of the error message queue?
Thanks for your reply!
We inevitably need to use "insert into" for BigFileBatch processing. Do you have any plans to improve this conflict mechanism?
Best Wishes,
------------------------------
Qiaoqiao Sun
office staff
ASUS Technology
Original Message:
Sent: 11-30-2020 01:12 AM
From: Ilya Kasnacheev
Subject: Is there any way to limit the size of the error message queue?
Hello!
As a workaround you can try MERGE INTO instead of INSERT INTO, since it would not produce and collect insert conflicts.
You may also use smaller batches of INSERTs, or increase the amount of available heap memory. Unfortunately, there is no mechanism to skip insert-conflict reporting for now.
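A sketch of the MERGE INTO workaround (table B and the column names are placeholders for your schema):

```sql
-- MERGE INTO performs an upsert: rows whose primary keys already
-- exist overwrite the cached entries, so no failed-keys list is
-- accumulated and no conflict exception string is built.
MERGE INTO B (id, val) SELECT id, val FROM A;
```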
Regards,
------------------------------
Ilya Kasnacheev
Community Support Specialist
GridGain
Original Message:
Sent: 11-29-2020 11:06 PM
From: Qiaoqiao Sun
Subject: Is there any way to limit the size of the error message queue?
Hello, thanks for your reply!
The logs from a server node are as follows:
------------------------------
Qiaoqiao Sun
office staff
ASUS Technology
Original Message:
Sent: 11-27-2020 03:27 AM
From: Andrei Alexsandrov
Subject: Is there any way to limit the size of the error message queue?
Hi Qiaoqiao Sun,
I don't think it's possible to limit the size of the error queue in your case.
If your information is correct, it should be investigated and fixed. Do you have logs from your server nodes that contain these exceptions and the out-of-memory errors? Could you share them with us?
BR,
Andrei
------------------------------
Andrei Alexsandrov
Developer
GridGain
Original Message:
Sent: 11-26-2020 01:22 AM
From: Qiaoqiao Sun
Subject: Is there any way to limit the size of the error message queue?
Hi,
When the broadcast method is used for BigFileBatch processing, executing "insert into ... select * from A" (A is a replicated-mode table) causes the JVM to throw "java.lang.OutOfMemoryError: Java heap space", which in turn brings the cluster down. By tracking JVM monitoring, we found that the real cause of the OutOfMemoryError is not the large amount of data, but repeated insertions that produce a large number of primary-key conflicts, resulting in a large number of failed keys and very long exception strings. Tracing the source code, we found that the following exception is thrown: "Failed to INSERT some keys because they are already in cache". The "sender.failedKeys()" call keeps concatenating the failed keys into one string when there are too many duplicate keys. The String type requires contiguous memory, and if no sufficiently large contiguous block is available, an OutOfMemoryError is thrown.
May I ask how to catch the exception thrown by large SQL statements of the "insert into ... select *" kind? Is there any way to limit the size of the error message queue? Or is there a useful solution to this problem?
Regards,
------------------------------
Qiaoqiao Sun
office staff
ASUS Technology
------------------------------