SH: Let's build a data pipeline with Prefect!
- Published: Oct 6, 2024
- Want to run data transformations without building your own orchestration? Curious about data engineering but don't know where to start? Got some data in A and want it in B? Come learn Prefect with us!
In this workshop, Beege will guide you through building a data pipeline using Prefect. If you've heard of Airflow, Prefect is another open source alternative. We'll build a pipeline that pulls some data from a fake API and dumps it into some tables in PostgreSQL. No data engineering knowledge or experience necessary.
What you need:
Your laptop
Basic Python knowledge
Python already installed
Knowledge of how to set up a Python project (virtual environment, etc.)
Basic Docker knowledge
Docker already installed
Git
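For a rough idea of the shape of such a pipeline, here is a minimal sketch, not the workshop's actual code. Assumptions: the "fake API" is a hard-coded list, the standard library's sqlite3 stands in for PostgreSQL so the example runs without a database server, and all function and table names are hypothetical. If Prefect is not installed, the decorators fall back to no-ops.

```python
import sqlite3

try:
    from prefect import flow, task
except ImportError:
    # Assumption: no-op stand-ins so the sketch still runs without Prefect.
    def task(fn):
        return fn

    def flow(fn):
        return fn


@task
def fetch_users() -> list[dict]:
    """Stand-in for the fake API call; returns hard-coded records."""
    return [
        {"id": 1, "name": "Ada", "email": "ADA@example.com"},
        {"id": 2, "name": "Linus", "email": "linus@example.com"},
    ]


@task
def clean_users(users: list[dict]) -> list[tuple]:
    """Lowercase the emails and shape each record as a row tuple."""
    return [(u["id"], u["name"], u["email"].lower()) for u in users]


@task
def load_users(rows: list[tuple], db_path: str) -> int:
    """Upsert rows into a users table and return the table's row count."""
    # sqlite3 is a stand-in here; the workshop targets PostgreSQL.
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS users "
                "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
            )
            conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?)", rows)
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    finally:
        conn.close()


@flow
def users_pipeline(db_path: str = "users.db") -> int:
    """Extract, clean, and load; each step is tracked as a Prefect task."""
    users = fetch_users()
    rows = clean_users(users)
    return load_users(rows, db_path)
```

Calling users_pipeline() runs the three steps in order. With Prefect installed, each call shows up as a flow run with its task runs.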
========= SH: Saturday Hangouts =========
Saturday evenings, come learn about new tech or just socialize with other IT folks. Sometimes we'll do workshops; sometimes we'll sit and chat. This time is for a diverse array of topics and experimentation. Have an idea for an event? Let us know! This is an informal event and questions and problems are encouraged!
We'll go out for food together towards the end of the event. CodeSeoul will buy dinner, but we humbly ask for donations to help offset the cost. If you do not currently have income, please do not donate.
This event and all CodeSeoul events are FREE. We are a non-profit organization dedicated to providing tech education to the Seoul community.
If you'd like to donate, here's our info:
NongHyeop / NH / 농협은행 301 0275 2831 81 코드서울
📍 Location: Lunit Office (4th floor) (374 Gangnam-daero Gangnam-gu Seoul 서울 강남구 강남대로 374 4층)
💬 Want to chat? Join us on Discord! / discord
This was SO helpful for me, after lots of failed attempts to boil down how Prefect works into a ~1 hour lesson for my data science students! I was either making it too complicated to fit into 1 hour, or... not complicated enough to cover flows, deployments, tasks, and incorporate Pydantic as well. Thank you so much for making this available on YouTube! (Joining the Discord to say thank you there as well :D)
I’m really enjoying these videos. Watching & doing many of them will make me sound much better to employers. The other videos are gems. Anyone who watches them can’t be touched by those around them (LOL)
ayy! nate from prefect here, this is great content 👏
If dictionaries don’t support regex key search, put it in a function. That’s what I do. I guess dictionaries can also be subclassed, but I put it in a function
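A minimal sketch of the helper-function approach described above, assuming string keys. The function name find_by_key_pattern is hypothetical.

```python
import re


def find_by_key_pattern(data: dict, pattern: str) -> dict:
    """Return the items of data whose string keys match the regex pattern."""
    regex = re.compile(pattern)
    # Non-string keys are skipped, since a regex search needs a string.
    return {
        key: value
        for key, value in data.items()
        if isinstance(key, str) and regex.search(key)
    }


record = {"user_id": 1, "user_name": "ada", "order_id": 7}
user_fields = find_by_key_pattern(record, r"^user_")
```

Here user_fields keeps only the user_id and user_name entries. A dict subclass could expose the same lookup as a method, but a plain function works with any dictionary.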
amazing
can you provide the github link for this project?
Sorry it took so long to see this! Here's the link: github.com/CodeSeoul/intro-to-prefect
Can you showcase building a data pipeline from end to end using Spark?
Sorry we were so late seeing this. We're still figuring out social media. We'll add it to our backlog of workshops!
@BhakthiRUclips We're going to do Spark this weekend!
so the on-premise one is open source but the cloud one is paid?
yeah
Correct. The cloud offering also has more features, like authentication. One key thing is that your flows' code isn't transferred to the cloud when you use their cloud version. The cloud system just handles task management and such.
Note that you can deploy the open source version to your own cloud systems, but you must manage everything yourself. It is possible to deploy Prefect on Kubernetes infrastructure with autoscaling and all the other benefits Kubernetes provides. The biggest difficulty is the lack of authentication, but you could use some kind of gateway to accomplish this.
I am getting an error when trying to access the Prefect server. Any idea what might be the issue?
Resolved it. I had to start the server and then run the pipeline in a new terminal.
Sorry for not seeing this sooner. We're still staffing up to track all our social media. Glad you figured it out!
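For anyone hitting the same error, here is a small sketch of a check to run before starting the pipeline. It assumes the server was started in another terminal (for example with prefect server start) and is listening at the default local address http://127.0.0.1:4200/api with a health endpoint under it. Adjust the URL if your setup differs. The function name is hypothetical.

```python
import urllib.request


def prefect_server_is_up(
    api_url: str = "http://127.0.0.1:4200/api", timeout: float = 2.0
) -> bool:
    """Return True if the server's health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{api_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors (URLError).
        return False
```

If this returns False, start the server in one terminal first, then run the pipeline from a second one.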
You joke a lot…