Pushing Java to the Limits: Processing a Billion Rows in under 2 Seconds by ROY VAN RIJN
- Published: 30 Sep 2024
- For updates and more, join our community 👉 / devoxx-united-kingdom
Last January a challenge was posted online by Gunnar Morling:
How fast can you parse a file with one billion rows of weather data using Java?
Little did I know this deceptively simple question would lead me down a path that taught me all about parallelism, memory-mapped files, SWAR techniques (SIMD within a register), bit twiddling, branchless code, mechanical sympathy, Graal native compilation and finally... I even turned to the dark side: using sun.misc.Unsafe.
Join me in this deep dive where I'll explain all the code changes and tricks that took me from the reference implementation which processes the billion records in less than 4 minutes, to processing everything in under two seconds.
Who knew Java could be this fast?
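To give a feel for the SWAR idea mentioned above, here is a minimal sketch in Java of finding the first ';' in eight bytes at once using plain `long` arithmetic. It is not taken from the talk: the class name `SwarSketch` and the helpers `wordAt` and `firstSemicolon` are made up for illustration, and the sketch assumes the bytes are read in little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of SWAR (SIMD within a register), not the speaker's code.
class SwarSketch {
    // The byte ';' (0x3B) repeated in all eight lanes of a long.
    private static final long SEMICOLONS = 0x3B3B3B3B3B3B3B3BL;

    // Reads 8 bytes starting at offset as one little-endian long.
    // Assumes data.length >= offset + 8.
    static long wordAt(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
    }

    // Returns the index (0..7) of the first ';' byte in the word, or 8 if there is none.
    static int firstSemicolon(long word) {
        long x = word ^ SEMICOLONS; // bytes equal to ';' become 0x00
        // Classic "has zero byte" trick: sets the high bit of each zero byte.
        // Bits above the first hit may be spurious, so only the lowest one is used.
        long hit = (x - 0x0101010101010101L) & ~x & 0x8080808080808080L;
        return Long.numberOfTrailingZeros(hit) >>> 3; // 64 >>> 3 == 8 when no hit
    }

    public static void main(String[] args) {
        byte[] line = "Hamburg;12.0\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(firstSemicolon(wordAt(line, 0))); // prints 7
    }
}
```

The point of the trick is that one subtraction and two ANDs replace a loop of eight byte comparisons, with no data-dependent branch.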
Roy's talk has been featured in the last issue of Tech Talks Weekly newsletter 🎉 Congrats!
It's disheartening to think that in this day and age performant code still has to look this awful.
But the talk itself is great 😃
We should soon have the "inline" keyword. A huge effort is going into this in the JVM so that, for example, an array of inline objects will use contiguous memory. When iterating through it, you get huge speedups because you avoid all those cache misses (speedups of 30x have been seen).
Crazy engineering effort went into this challenge.
I think branchless programming is also fast because the CPU loads memory in bulk (from RAM into the L1/L2/L3 caches): it fetches the whole 64-byte cache line around the requested data, not just the requested bytes. Since the next section of the data usually sits right after it, that data is already loaded, which adds a further speedup.
Unsafe is deprecated now
It was fun and interesting until I saw Unsafe. After that it felt meaningless, an empty satisfaction, imho.
Same feeling, like using inline asm in C/C++ 😂
Why though?
You don’t *need* Unsafe to do all of this. For me it was actually a fun challenge to learn and use it.
This reminds me why I fell in love with CS and CPU Architecture. Though, it was probably easier to write this in ANSI C
As far as I remember, constant maths can be performed at compile time (for example, static final int foo = 1+1 would be written to the class file as static final int foo = 2). Could you trick the compiler into doing all of this heavy lifting and only produce a binary that contains the final results?
Volkswagen would love to hire you for their diesel division.
Great Talk!
It was fun, but it felt like they were inventing an assembler
So to process 1 billion rows in Java in 2 seconds you need to use C/C++. Great job anyway!
That's what I was thinking. Every level of abstraction that Java gives you is intentionally thrown away to get as close to the low-level details of the hardware as possible. 🤣
@gergonagy2733 I agree. Anyone designing a language like Java can't avoid providing some way to do low-level memory management. But can someone show me a Java dev who has committed code with direct memory access to a real-life commercial product? The 2-second solutions are still core Java, but they're not "vanilla" Java.
imagine using java in 2024...
Why not? Real persons do use it 😉😂 Pun intended.
Curious to know what you use?
Imagine saying something that brainwormed in any year
You don't need to imagine. It's the 2nd most popular programming language in the world after Python.