Imagine if you could autoscale simply by wrapping any existing app code in a function and having that block of code run in a temporary copy of your app. fly.io/blog/re...
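A minimal sketch of the idea, assuming the FLAME library's documented `FLAME.Pool` options and `FLAME.call/2` (`MyApp.FFMpegRunner` and `do_generate_thumbnail/1` are hypothetical names):

```elixir
# In your supervision tree: a pool of temporary runners.
# MyApp.FFMpegRunner is a hypothetical pool name.
{FLAME.Pool,
 name: MyApp.FFMpegRunner,
 min: 0,              # scale to zero when idle
 max: 10,             # at most ten runner instances
 max_concurrency: 5}  # concurrent calls per runner before booting another

# Anywhere in your app: wrap existing code in a function and it runs
# on a fresh copy of your app, booted on demand and shut down after.
def generate_thumbnail(video_path) do
  FLAME.call(MyApp.FFMpegRunner, fn ->
    # CPU-heavy work happens on the temporary machine, not the web server
    do_generate_thumbnail(video_path)
  end)
end
```

The result of the function is returned to the caller as if it had run locally, which is what makes it feel like plain application code rather than an RPC layer.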
For a long time, I've been thinking about how to safely run NIFs without fear of potential crashes, and it always boils down to shipping them off to a different BEAM. This looks like it might be a great way to achieve that. Amazing stuff. Thanks Chris, and Fly for allowing you to work on these things.
This is really quite revolutionary and beautiful.
Excellent work! This neatly packages and simplifies clustering.
I've never used elixir before, but I like this programmatic approach to scaling workloads.
If I want to dynamically scale anything - a Rust program, wasm, Python, a long-running Node program, etc. -
would this be a good tool to adopt for that, or is it geared toward existing Elixir infrastructure?
I need this super bad, thank you so much.
@ChrisMcCord - in addition to min and max for a pool - how about min_standby? That way it would always try to keep some number of workers ready to serve requests (up to the pool max).
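For reference, a warm floor can already be approximated with the pool's documented options; `min_standby` as proposed above is not an existing option, and this sketch only uses the `min`/`max`/`idle_shutdown_after` knobs (`MyApp.Runner` is a hypothetical pool name):

```elixir
# min: 2 keeps two runners booted and ready even when idle, so
# cold-start latency only appears once demand exceeds that floor.
{FLAME.Pool,
 name: MyApp.Runner,
 min: 2,
 max: 10,
 max_concurrency: 5,
 idle_shutdown_after: :timer.minutes(5)}  # runners above min shut down when idle
```

The difference from the proposal is that `min` is a static floor rather than a count of currently-free workers, so under full load there may be zero idle standbys.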
Really incredible - this is a game-changer for some of the things I'd like to use Phoenix for
In our project we will consider contributing an ECS backend for AWS. The idea of FLAME is really neat.
chris always makes lit stuff 💯🔥
Very, very cool
This is truly wonderful.🔥
lovely 👏
Correct me if I'm wrong, but it basically looks like a horizontal autoscaler that interferes with my application code. Why would I want to put auto-scaling logic into my application?
@@SpittingMage You don’t need to learn Kubernetes, but you still need to host your app somewhere. You can achieve auto scaling just by ticking one box in Google Cloud Run's or AWS App Runner's web UI. No need to change anything in your code; cloud providers will handle it automatically.
Because it's not a horizontal autoscaler. It's an abstraction for deploying short lived infrastructure (similar to a lambda).
For example, take my use case. I do CPU-intensive ML workloads. I do not want to run those on my app server, so I wrote an Absinthe application, put my workload behind it (because it's in Python, bleh), and call that server over websockets.
Instead, I can use FLAME to call my ML code directly over erlport. I can wipe out 5-8 files of server/client code in both my application codebase and my ML codebase. This handles that entire deployment abstraction, and I can run my workload with no ergonomic difference.
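A hedged sketch of that pattern, assuming the erlport package is a dependency and a hypothetical Python module `ml_model` with a `predict` function sits on the runner's Python path (`MyApp.MLRunner` is also a hypothetical pool name):

```elixir
def predict(input) do
  FLAME.call(MyApp.MLRunner, fn ->
    # Boot a Python interpreter on the temporary machine via erlport;
    # python_path points at the scripts bundled into the release.
    {:ok, py} = :python.start(python_path: ~c"priv/python")

    try do
      # Invoke ml_model.predict(input) over erlport and return the result
      # to the caller as if it had run locally.
      :python.call(py, :ml_model, :predict, [input])
    after
      :python.stop(py)
    end
  end)
end
```

The websocket server and its client code disappear: the only interface left is a function call, and the heavy Python process lives and dies with the temporary machine.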
@@yohan31 OK, but if you have continuous load, lambdas are not short-lived. They get reused. If you have some static variables, the state carries from one call to another. They scale to zero like some other autoscalers, but they are not short-lived by definition. On the other hand, they usually come with a huge downside, which is processing only one request at a time. I don’t think it was specified in the video, but if each instance can process only one request at a time, it’s even worse because of the huge waste of resources. The example shown in the video could easily process 3 requests on a single core.
I think I see it finally. It’s about autoscaling a specific part of a workflow on a separate deployment, to make sure the main deployment has its resources intact, while reusing the Docker image without a dedicated API (handled by some RPC). Right? It may be a use case.
Looks cool but unfortunately I don't run my apps on a PaaS like Fly and have really no interest for it either so I guess it's not really aimed for me :) Good job though!
It's built with backend adapters and ships with Local and Fly adapters.
You should be able to use this with any host whose API lets you spin up a new instance with your app code on it, and network it with the parent.
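A rough skeleton of what a custom backend (like the ECS idea above) might look like. The callback names here follow my reading of FLAME's backend behaviour and should be checked against the library's source before use; `MyApp.ECS.run_task_and_connect/1` is a purely hypothetical helper:

```elixir
# Sketch of a custom FLAME backend targeting AWS ECS.
# Callback names are assumptions based on FLAME's Local/Fly backends.
defmodule MyApp.ECSBackend do
  @behaviour FLAME.Backend

  @impl true
  def init(opts) do
    # Validate cluster/task-definition options and build backend state.
    {:ok, Map.new(opts)}
  end

  @impl true
  def remote_boot(_state) do
    # Would call the ECS API to run a task using the same image as the
    # parent, then wait for the booted runner to connect back, e.g.:
    # {:ok, terminator_pid} = MyApp.ECS.run_task_and_connect(state)
    {:error, :not_implemented}
  end

  @impl true
  def remote_spawn_monitor(_state, _func) do
    # Would spawn and monitor the given function on the remote node.
    {:error, :not_implemented}
  end

  @impl true
  def system_shutdown do
    # Terminate this runner instance when the pool tells it to go away.
    System.stop()
  end
end
```

The point is that everything cloud-specific is isolated behind this one module; pools and `FLAME.call/2` stay unchanged.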
@@chrismccord9211 Yeah, thanks, I got that from the video. The issue is that I run my apps on one big server from Hetzner, but if I ever need more, I know where to look 😊
OK, thx, bye 👋