Apache Ignite


Spark and Ignite use case

  • 1.  Spark and Ignite use case

     
    Posted 3 days ago
    Hello,

    I would like to know whether Spark and Ignite can be used for this use case:

    I have an Ignite table aaa (10,000 columns, 2M rows).

    Several Spark jobs must run concurrently, each updating its own range of columns, for example:
    Spark job 1 -> update all rows in columns col1 to col10 of aaa using a Hive DataFrame
    Spark job 2 -> update all rows in columns col11 to col100 of aaa using another Hive DataFrame

    Is this possible? If yes, what is the best configuration for it (standalone or shared deployment)?

    Thanks for your help

    ------------------------------
    pascal ka
    Senior Big Data Developer
    Bank Leumi
    ------------------------------


  • 2.  RE: Spark and Ignite use case

    Posted 2 days ago

    Hi,

    I am not sure a single table is the best fit for your use case, because it holds two separate subsets of data (col1 to col10 and col11 to col100). I think it would be easier to create two tables with these fields and then join them by ID.
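
    To make the two-table idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; it is not Ignite, and the table names aaa_part1 and aaa_part2, the id key, and the sample values are all hypothetical. Only one column per table is shown.

```python
import sqlite3

# sqlite3 is only a stand-in so the sketch runs anywhere; it is not Ignite.
conn = sqlite3.connect(":memory:")

# Hypothetical split: aaa_part1 would hold col1..col10, aaa_part2 col11..col100.
# One column per table is enough to show the shape.
conn.execute("CREATE TABLE aaa_part1 (id INTEGER PRIMARY KEY, col1 INTEGER)")
conn.execute("CREATE TABLE aaa_part2 (id INTEGER PRIMARY KEY, col11 INTEGER)")

conn.executemany("INSERT INTO aaa_part1 VALUES (?, ?)", [(1, 10), (2, 20)])
conn.executemany("INSERT INTO aaa_part2 VALUES (?, ?)", [(1, 110), (2, 220)])

# Each job updates only its own table, so the two writers touch different tables.
conn.execute("UPDATE aaa_part1 SET col1 = col1 + 1")
conn.execute("UPDATE aaa_part2 SET col11 = col11 + 1")

# Readers reassemble the wide row with a join on the shared ID.
rows = conn.execute(
    "SELECT p1.id, p1.col1, p2.col11 "
    "FROM aaa_part1 p1 JOIN aaa_part2 p2 ON p1.id = p2.id "
    "ORDER BY p1.id"
).fetchall()
print(rows)
```

    The same layout should carry over to Ignite SQL tables, but the DDL and key definition would need to be adapted to your cluster.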

    To answer your question, though: Ignite and GridGain support parallel access to the data. The Spark integration simply translates your requests into SQL commands, which can be executed in parallel.
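
    As a rough illustration of why the two jobs do not conflict at the column level, the sketch below builds one parameterized UPDATE per job and checks that the column sets are disjoint. The helper names column_range and build_update and the id key column are assumptions made for this example; the exact SQL that the Spark integration generates may look different.

```python
def column_range(first, last):
    """Return the column names col<first>..col<last>, inclusive."""
    return [f"col{i}" for i in range(first, last + 1)]


def build_update(table, columns, key="id"):
    """Build a parameterized UPDATE that sets only the given columns for one row."""
    assignments = ", ".join(f"{c} = ?" for c in columns)
    return f"UPDATE {table} SET {assignments} WHERE {key} = ?"


# Column ranges taken from the question: job 1 owns col1..col10, job 2 owns col11..col100.
job1_cols = column_range(1, 10)
job2_cols = column_range(11, 100)

# The two jobs write disjoint column sets, so neither overwrites the other's values.
overlap = set(job1_cols) & set(job2_cols)

job1_sql = build_update("aaa", job1_cols)
job2_sql = build_update("aaa", job2_cols)
print(len(job1_cols), len(job2_cols), overlap)
```

    Both statements still target the same rows of aaa, so how well they run side by side depends on the cache's atomicity and transaction settings; that is worth testing on a small data set first.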

    >>Is it possible? if yes : what is the best configuration to do that (standalone or shared deployment) ?

    I suggest using a distributed Spark deployment because it will speed up operations against Ignite/GridGain. Here are several links that may be useful:

    https://github.com/apache/ignite/tree/master/examples/src/main/spark/org/apache/ignite/examples/spark

    https://apacheignite-fs.readme.io/docs/ignite-for-spark

    https://www.gridgain.com/docs/latest/integrations/datalake-accelerator/load-data-spark

    https://www.gridgain.com/resources/blog/how-debug-data-loading-spark-ignite

    BR,
    Andrei



    ------------------------------
    Andrei Alexsandrov
    Developer
    GridGain
    ------------------------------