Very happy to see some work in this direction. There have been a bunch of APIs I played around with, but having a proper set of JDK features for this would be amazing.
The first question from the audience is exactly what was on my mind too. The way Gary Frost describes moving data in and out of the GPU based on automatic code analysis assumes that the whole calculation is one single operation with multiple steps: the data is uploaded to the GPU, processed through multiple kernel calls, the result is downloaded, and the data on the GPU is discarded.
What if I have a scenario that accumulates results over a longer time period and normally never needs to move data out of the GPU, because it always reuses the data that was already uploaded? It's not a single operation; the data resides on the GPU for a long time, and only occasionally, on demand, is some of the relevant data read back.
It cannot be determined automatically after a kernel call which data should be read back and which data is temporary, because that depends on user actions. One kernel only updates data residing on the GPU, so at that point there is no result to read back; it's different code, running at a different time, that actually reads some of the data back.
I've used Aparapi, and sometimes it was better to run the kernel on the CPU instead of the GPU, because the data-transfer latency saved outweighed the faster computation on the GPU.
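Aparapi's explicit buffer mode is roughly the kind of control I mean. If I remember the API correctly (package names differ between the old AMD build and the newer com.aparapi fork), an untested sketch would look something like this: the array stays on the device across many execute() calls and is only fetched with get() when the user actually asks for it.

```java
import com.aparapi.Kernel;
import com.aparapi.Range;

public class ResidentDataSketch {
    public static void main(String[] args) {
        final float[] state = new float[1_000_000];

        Kernel accumulate = new Kernel() {
            @Override public void run() {
                int i = getGlobalId();
                state[i] += 1.0f;           // updates data that lives on the GPU
            }
        };

        accumulate.setExplicit(true);        // I manage transfers, not the runtime
        accumulate.put(state);               // upload once

        for (int step = 0; step < 1_000; step++) {
            accumulate.execute(Range.create(state.length));  // no transfers between calls
        }

        accumulate.get(state);               // read back only when actually needed
        System.out.println(state[0]);
        accumulate.dispose();
    }
}
```

An automatic scheme would have to infer this put/get placement from code it may not even see at kernel-launch time, which is exactly the problem.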
Quite an interesting presentation; it highlights a topic that isn't widely discussed. I liked the historical overview of how Java's approach to heterogeneous hardware has evolved, and there is obviously good progress! I wonder what comes next 🙂
Thanks, very interesting and very much needed. :)
Glad you enjoyed it!
Which JEP(s) cover the "Code Reflection" used by HAT?
There's no JEP yet, but more information will be published soon, including the JVMLS Code Reflection session.
I am very interested in this.
It would be very helpful to run the calculations on the GPU.
5:50 It would have been #1 in 2006, not 2012. But still, impressive :)
What about leaning on the WebGPU initiative and linking against Dawn (Google's implementation)?
This would allow Java to use a standard, cross-platform GPU API.
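For example, the FFM API (Panama) should be enough to reach Dawn's C API directly. A rough sketch of the idea; the library name and whether Dawn accepts a NULL default descriptor are guesses on my part:

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

public class WebGpuFfiSketch {
    public static void main(String[] args) throws Throwable {
        Linker linker = Linker.nativeLinker();
        // Library name is an assumption; Dawn exposes webgpu.h symbols like wgpuCreateInstance.
        SymbolLookup dawn = SymbolLookup.libraryLookup("webgpu_dawn", Arena.global());

        // C signature: WGPUInstance wgpuCreateInstance(const WGPUInstanceDescriptor*);
        MethodHandle createInstance = linker.downcallHandle(
                dawn.find("wgpuCreateInstance").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.ADDRESS));

        // Passing NULL to request a default instance (assuming Dawn allows it, as wgpu-native does).
        MemorySegment instance = (MemorySegment) createInstance.invoke(MemorySegment.NULL);
        System.out.println("WGPUInstance handle: " + instance);
    }
}
```

From there the rest of webgpu.h (adapters, devices, compute pipelines) could be bound the same way, or generated with jextract.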
I have been dabbling in AI/ML code in Python, and as much as I respect the work done by researchers using Python, the Python ecosystem is very immature and riddled with bad practices and security nightmares: flaky builds, projects that pull in third-party git repos as dependencies, which in turn pull in yet more repos as dependencies. Everything is held in place by hope and prayers. Java is very well placed to take on the GPGPU/accelerator market. Other players (beyond Nvidia) are emerging and they would like to standardize; this is where Java can shine. Python is currently filling the void left by the absence of any viable alternative.
First comment