GridGain Community Edition


The calculation principles of the primary and backup nodes of replication and partition cache are different?

  • 1.  The calculation principles of the primary and backup nodes of replication and partition cache are different?

    Posted 12 days ago
    Hi ,
    As we all know, a partitioned cache uses the rendezvous hashing algorithm to calculate a weight for each node. As long as the cluster view is consistent, every server applies this algorithm and picks the N nodes with the highest weights as the primary node and the backup nodes.
     
    The implementation of a replicated cache is similar to that of a partitioned cache: each key has a primary copy, and there are backups on the other nodes in the cluster.
     
    But why, for the same cluster, does the number of primary and backup copies held by the same server node differ between replicated mode and partitioned mode (backups=2)? Which algorithm does replicated mode use to choose the primary node and the backup nodes?
     
    Thanks,


    ------------------------------
    Qiaoqiao Sun
    office staff
    ASUS Technology
    ------------------------------


  • 2.  RE: The calculation principles of the primary and backup nodes of replication and partition cache are different?

    Posted 12 days ago
    Hello!

    As far as my understanding goes, rendezvous affinity (or any other affinity function) returns an *order* of nodes in which they are eligible to become primary. Then, the first node in that order is primary, the second is 1st backup, the third is 2nd backup, etc.

    This ordering does not guarantee that every node is assigned the same number of partitions at each position (primary, 1st backup, 2nd backup, etc.).
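
    The ordering described above can be sketched in a few lines. This is only an illustration, not GridGain's implementation: the SHA-256 based `weight` function, the helper names, and the node names are all assumptions made for the example.

```python
import hashlib
from collections import Counter

def weight(node: str, partition: int) -> int:
    """Deterministic pseudo-random weight for a (node, partition) pair (assumed hash)."""
    digest = hashlib.sha256(f"{node}:{partition}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def node_order(nodes, partition):
    """Nodes by descending weight: index 0 is primary, 1 is 1st backup, and so on."""
    return sorted(nodes, key=lambda n: weight(n, partition), reverse=True)

def role_counts(nodes, partitions):
    """For each position in the order, count how many partitions each node holds."""
    counts = [Counter() for _ in nodes]
    for p in range(partitions):
        for position, node in enumerate(node_order(nodes, p)):
            counts[position][node] += 1
    return counts

nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
counts = role_counts(nodes, 1024)
for position, counter in enumerate(counts):
    label = "primary" if position == 0 else f"backup {position}"
    print(label, dict(counter))
```

    With 3 nodes and 1024 partitions the primary counts cannot be equal, because 1024 is not divisible by 3. The weights only make the split roughly even.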

    Regards,

    ------------------------------
    Ilya Kasnacheev
    Community Support Specialist
    GridGain
    ------------------------------



  • 3.  RE: The calculation principles of the primary and backup nodes of replication and partition cache are different?

    Posted 11 days ago
    Hello, thanks for your reply!
     
    I am sorry that I forgot to state a precondition in my previous question: assume that there are only 3 nodes in the cluster. Please forgive my inaccurate description.
     
    Please allow me to describe my problem again:
     
    In the same cluster, why is the data distribution different when a table's cache mode is partitioned with backups=N-1 (N is the number of cluster nodes) and when it is replicated?
     
    The official documentation states: "The implementation of the replication cache is similar to the partition cache, each key has a primary copy and there will be backups on other nodes in the cluster."
     
    Which algorithm does replicated mode use to choose the primary node and the backup nodes? If replicated mode chooses the primary and backup nodes of each partition with the rendezvous hashing algorithm, then its data distribution should be the same as in partitioned mode with backups=N-1 (N is the number of cluster nodes), right? But that is not what my experiment shows.
     
    Or is the data partitioned differently in partitioned mode and replicated mode? Or does replicated mode skip partitioning altogether and calculate the node weights directly for each key? If this guess is correct, the reason for the different data distribution can be summarized as follows: partitioned mode first divides the data into partitions by affinity key and then uses the rendezvous hashing algorithm to calculate the node weights for each partition, whereas replicated mode does not partition the data first and calculates the node weights per key. Is that right?
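
    A minimal sketch of the two-step mapping described above for partition mode (key to partition, then partition to nodes). The hash function, the modulo mapping, the helper names, and the sample key are assumptions for illustration and not GridGain's actual code. The sketch also shows what happens when the same two steps run with a different partition count.

```python
import hashlib

def stable_hash(value: str) -> int:
    """Stand-in hash (assumed); a real affinity function uses its own hashing."""
    digest = hashlib.sha256(value.encode()).digest()
    return int.from_bytes(digest[:8], "big")

def partition_of(key: str, partitions: int) -> int:
    """Step 1: key -> partition. Depends only on the key and the partition count."""
    return stable_hash(key) % partitions

def owners(nodes, partition, copies):
    """Step 2: rank nodes by weight for this partition and keep the top `copies`."""
    ranked = sorted(nodes, key=lambda n: stable_hash(f"{n}:{partition}"), reverse=True)
    return ranked[:copies]

nodes = ["node-a", "node-b", "node-c"]  # hypothetical 3-node cluster
key = "customer-42"                     # hypothetical key

# Same partition count: primary plus N-1 backups covers every node in the
# same order as a copy-on-every-node layout.
part = partition_of(key, 1024)
partitioned = owners(nodes, part, copies=3)
replicated = owners(nodes, part, copies=len(nodes))
print(part, partitioned, replicated)

# Different partition count: the key may land in a different partition,
# and that partition may have a different primary.
other_part = partition_of(key, 512)
print(other_part, owners(nodes, other_part, copies=len(nodes)))
```

    Under these assumptions both modes follow the same two steps, so any difference in layout has to come from the inputs, such as the number of partitions.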
     
    Best regards,

    ------------------------------
    Qiaoqiao Sun
    office staff
    ASUS Technology
    ------------------------------



  • 4.  RE: The calculation principles of the primary and backup nodes of replication and partition cache are different?

    Posted 11 days ago
    Hello!

    I think that replicated caches have 512 partitions by default, while the default affinity function for partitioned caches uses 1024. If you supply the replicated cache with an affinity function that has 1024 partitions, the layout should be identical between partitioned and replicated.
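
    To illustrate why the partition count matters, here is a rough sketch that counts primary partitions per node for 512 and for 1024 partitions. The `weight` function and node names are assumptions for the example; GridGain's real hashing differs.

```python
import hashlib

def weight(node: str, partition: int) -> int:
    """Deterministic pseudo-random weight for a (node, partition) pair (assumed hash)."""
    digest = hashlib.sha256(f"{node}:{partition}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def primary_counts(nodes, partitions):
    """Number of partitions for which each node has the highest weight."""
    counts = {n: 0 for n in nodes}
    for p in range(partitions):
        primary = max(nodes, key=lambda n: weight(n, p))
        counts[primary] += 1
    return counts

nodes = ["node-a", "node-b", "node-c"]  # hypothetical 3-node cluster

replicated_default = primary_counts(nodes, 512)    # 512 partitions
partitioned_default = primary_counts(nodes, 1024)  # 1024 partitions
print("512 partitions: ", replicated_default)
print("1024 partitions:", partitioned_default)

# With the same partition count the computation is identical, so the layout matches.
print(primary_counts(nodes, 1024) == partitioned_default)
```

    The per-node primary counts necessarily differ between the two runs because they sum to 512 in one case and to 1024 in the other.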

    Regards,

    ------------------------------
    Ilya Kasnacheev
    Community Support Specialist
    GridGain
    ------------------------------