In a Spark cluster, multiple job runs process the same data and write it back. The cluster should cache that data so that, once one job run has processed it, subsequent runs can access it efficiently instead of reloading it from scratch each time.
Is the above use case suited to disk caching?