I have a question about thread pools in Spark. When we use a ThreadPoolExecutor, do all the threads run on the same node, i.e., only on the driver node? Or will it utilize all the workers in the cluster? Can you please clarify?
When you use a ThreadPoolExecutor, all the threads run on the same node, the driver, so you might run out of driver memory as well. To tackle your problem, try running each notebook as a separate process and creating a SparkContext within that process. You can use the "subprocess" module in Python to spawn a new process for each notebook.
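Here is a minimal sketch of that approach. The script names are hypothetical placeholders; it assumes each notebook has been exported as a standalone .py file that creates its own SparkSession when it starts:

```python
import subprocess

# Hypothetical notebook scripts, each exported as a standalone .py file
# that builds its own SparkSession/SparkContext on startup.
notebook_scripts = ["etl_notebook.py", "report_notebook.py"]

# Spawn one OS process per notebook so each gets its own Python
# interpreter (and its own Spark driver-side state) instead of
# sharing one driver process across threads.
procs = [subprocess.Popen(["python", script]) for script in notebook_scripts]

# Wait for every notebook process to finish and report failures.
for proc in procs:
    proc.wait()
    if proc.returncode != 0:
        print(f"Process {proc.pid} exited with code {proc.returncode}")
```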