Airflow XCom for Beginners - All you have to know in 10 mins

  • Published: 9 Jul 2024
  • Airflow XCom for Beginners - All you have to know in 10 mins to share data between tasks.
    👍 Smash the like button to become better at Airflow
    ❤️ Subscribe to my channel to become a master of Airflow
    🏆 Take my course : www.udemy.com/course/the-ulti... to join the legends of Airflow
    🚨 My Patreon: / marclamberti to support my work and be a friend for life
    The materials: www.notion.so/Airflow-XCOM-Al...
    The blog post: marclamberti.com/blog/airflow...
    1. Use case
    You have a data pipeline with 5 tasks. The first task downloads data, 3 tasks each train a machine learning model, and the last task chooses the best model. Each training task produces an accuracy, and the last task uses these accuracies to pick the best model. The question: how can you share the accuracies produced by the 3 training tasks with the last task? XComs!
    2. What is an XCom?
    XCom stands for cross-communication and lets tasks in a data pipeline share messages and small amounts of data. An XCom is composed of a key (identifier), a value (must be serializable), a timestamp (when it was created), an execution date (which DAG run the XCom belongs to), a task id (which task created the XCom), and a dag id (which DAG that task belongs to). XComs are stored in the database of Airflow.
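As a rough mental model of the fields listed above, an XCom can be pictured as a small record. The sketch below is illustrative only: `XComRecord` and `serialized_value` are hypothetical names, not Airflow's actual model class, and JSON is assumed as the serialization format purely for the example.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class XComRecord:
    """Illustrative record only; Airflow's real XCom model differs in detail."""

    key: str                  # identifier of the XCom
    value: Any                # must be serializable
    task_id: str              # task that created the XCom
    dag_id: str               # DAG that the task belongs to
    execution_date: datetime  # DAG run the XCom belongs to
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )                         # when the XCom was created

    def serialized_value(self) -> str:
        # json.dumps raises TypeError for values that are not JSON serializable.
        return json.dumps(self.value)


record = XComRecord(
    key="model_accuracy",
    value=0.87,
    task_id="training_model_A",
    dag_id="xcom_dag",
    execution_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(record.serialized_value())
```

Because the execution date is part of the record, two runs of the same task with the same key produce two separate records rather than one overwritten record.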
    3. How to push an XCom
    There are 2 ways: with the return keyword or with xcom_push. Any value returned from an operator is automatically pushed as an XCom with the key return_value. With xcom_push, you have to specify the key as well as the value. To use xcom_push, you have to access the task instance object corresponding to your task.
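The two ways of pushing can be sketched without a running Airflow installation. In the sketch below, `FakeTaskInstance` and `run_task` are hypothetical stand-ins that only imitate the behaviour described above; the two `_training_model_*` functions are shaped like callables you might pass to a Python-based operator, and the task ids and the key `model_accuracy` are assumptions made for the example.

```python
from random import uniform


class FakeTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's task instance (ti)."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # shared dict: (task_id, key) -> value

    def xcom_push(self, key, value):
        self.store[(self.task_id, key)] = value


def run_task(python_callable, ti):
    """Imitates how a returned value ends up under the key 'return_value'."""
    result = python_callable(ti)
    if result is not None:
        ti.xcom_push(key="return_value", value=result)


def _training_model_return(ti):
    accuracy = uniform(0.1, 10.0)
    return accuracy  # pushed implicitly with the key "return_value"


def _training_model_push(ti):
    accuracy = uniform(0.1, 10.0)
    ti.xcom_push(key="model_accuracy", value=accuracy)  # explicit key and value


store = {}
run_task(_training_model_return, FakeTaskInstance("training_model_A", store))
run_task(_training_model_push, FakeTaskInstance("training_model_B", store))
print(sorted(store))
```

The explicit form is useful when the default key would be ambiguous, for example when one task needs to publish more than one value.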
    4. How to pull an XCom
    With xcom_pull. Again, to call xcom_pull, you have to access the task instance object of your task. xcom_pull expects two arguments: the key and the task ids (a single task id or a list of them) to pull from.
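A minimal sketch of the pulling side, again without Airflow itself. `FakeTaskInstance` is a hypothetical stand-in whose `xcom_pull` only imitates the key and task_ids arguments described above; the task ids, the key `model_accuracy`, and the accuracy values are made up for the example, and returning None for a missing entry is a choice of the stand-in that may not match every Airflow version.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's task instance (ti)."""

    def __init__(self, store):
        self.store = store  # dict: (task_id, key) -> value

    def xcom_pull(self, task_ids, key="return_value"):
        # A single task id yields one value; a list yields a list of values.
        if isinstance(task_ids, str):
            return self.store.get((task_ids, key))
        return [self.store.get((task_id, key)) for task_id in task_ids]


def _choose_best_model(ti):
    # Pull the accuracy pushed by each of the three training tasks.
    accuracies = ti.xcom_pull(
        key="model_accuracy",
        task_ids=["training_model_A", "training_model_B", "training_model_C"],
    )
    return max(accuracies)


store = {
    ("training_model_A", "model_accuracy"): 0.81,
    ("training_model_B", "model_accuracy"): 0.93,
    ("training_model_C", "model_accuracy"): 0.77,
}
ti = FakeTaskInstance(store)
best = _choose_best_model(ti)
print(best)

# In this stand-in, a key that was never pushed comes back as None.
missing = ti.xcom_pull(task_ids=["training_model_A"], key="wrong_key")
print(missing)
```

The key used when pulling has to match the key used when pushing exactly, so a misspelled key is worth checking first when a pull returns nothing useful.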
    5. XCom limitations
    The XCom size limit differs according to the database used.
    XComs create implicit dependencies between your tasks.
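One common way to live with the size limit is to share a reference to the data instead of the data itself. The sketch below assumes that approach: `MAX_XCOM_BYTES` is an arbitrary illustrative threshold (not an Airflow setting), `to_xcom_value` is a hypothetical helper, and the local temporary directory stands in for storage that all workers could actually reach.

```python
import json
import tempfile
from pathlib import Path

# Arbitrary illustrative threshold; the real limit depends on the database.
MAX_XCOM_BYTES = 48 * 1024


def to_xcom_value(payload, work_dir):
    """Return the payload inline if small, otherwise a path to a file holding it."""
    encoded = json.dumps(payload).encode("utf-8")
    if len(encoded) <= MAX_XCOM_BYTES:
        return {"inline": payload}
    # Too large for an XCom: write it out and share only the location.
    path = Path(work_dir) / "payload.json"
    path.write_bytes(encoded)
    return {"path": str(path)}


with tempfile.TemporaryDirectory() as work_dir:
    small = to_xcom_value({"accuracy": 0.93}, work_dir)
    large = to_xcom_value(list(range(100_000)), work_dir)
    large_exists = Path(large["path"]).exists()
    print(sorted(small), sorted(large))
```

The downstream task would then pull the small dictionary and load the file itself, which keeps the metadata database small but does not remove the implicit dependency between the two tasks.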
    Enjoy!

Comments • 52

  • @GustavoCosta-jr1mh 2 years ago

    Thanks for all your videos, Marc. Brilliant content, very straightforward!

  • @guolunli1908 2 years ago

    Thank you so much! By following your videos closely I not only avoided bugs but also learned in detail how to set up an airflow pipeline!

  • @JavierBeneitoBarquero 3 years ago +7

    Quick and clear. With your Udemy courses and your YouTube videos, I'm learning a lot about Apache Airflow. From zero to hero in a few weeks. Thx.

  • @pyrojoke2 2 years ago

    Thank you! I spent a few hours trying to figure out how it works, but your tutorial did it in 10 minutes

  • @waodezalmawati460 1 year ago +1

    What a clear explanation! Thank you so much!
    Love from Indonesia!❤‍🔥

  • @Jmignet 3 years ago

    Pretty enlightening!
    Thank you very much!

  • @yunusemrahulucay4953 1 year ago

    Thank you so much. It was really quick and clear.

  • @shantanubugadi3188 1 year ago

    Very well explained...! Very awesome...! I am just enjoying the series

  • @victortelles9670 2 years ago

    Amazing tutorial as always. Love your channel.

  • @user-ql2zc2mh1k 3 years ago

    thank you for clear explanation

  • @luuquanghuy8193 1 year ago

    Thank u for your sharing, keep it up because u doing that right clearly and easy to understand. Love from Viet Nam

  • @anatolys9203 1 year ago

    great tutorial and explanation! thank you!

  • @user-mu4ty1gi8b 1 year ago

    thank you!
    as it gets very often, some detail is missed by me in order to use new functionality and that detail was ti
    thanks for explaining!

  • @Jamdat33 2 years ago

    Excellent thank you

  • @sodiqafolayan4921 3 years ago +1

    Thanks Marc as always

  • @quanminfeng8375 9 months ago

    Best xcom video I watched

  • @saritkumarsi4166 3 years ago +2

    Thanks Marc. Great info for a beginner like me :)

    • @MarcLamberti 3 years ago +1

      Great pleasure, let me know what you would like to see next :)

    • @saritkumarsi4166 3 years ago

      @MarcLamberti It would be awesome if you could take this tutorial further by adding (if possible) a new service and assigning it a task via DAGs within the Airflow container. Just to give you some background on what I am currently trying to do: similar to your compose file, I have Postgres, Airflow webserver and scheduler services defined in a docker compose file and running perfectly. However, I have an extra service (say an Ubuntu server) defined in the same compose file which basically takes an .sh command and runs it. I am trying to send an sh command via a DAG DockerOperator to this 4th container. I took help from your video on DockerOperator as well :). If I understood correctly, the Airflow DAG in that video is running on the host instead of from within a docker container. I searched for help on the internet, but haven't come across any. Do let me know if such a use case is possible with Airflow. Appreciate your help :)

  • @guyfridman4426 3 years ago +1

    Thank you marc, I did the course in udemy and it's very recommended

  • @wll9085 1 year ago

    useful!

  • @ramjanraeen4263 2 years ago

    Hi Marc, how to cross communication data (data sharing) between master_dag and child_dag(this is triggered by master dag)?

  • @padegalsaigiriraj3459 3 years ago

    Thanks, Marc. Great info
    Can You Please Confirm This
    I have Two Tasks in DAG1
    Task1 - Normal Function Which does some process
    Task2 - TriggerDagRunOperator (Trigger DAG2)
    Can I Use the return value/XCom of task1 in Task2 and use this value in DAG2?
    Thanks

  • @kipodiha 2 years ago

    Very useful video! after reading airflow documentation and example for me this XCOM was still not clear :( Now I m ok! Thanks a lot!!!

  • @bigkongenergy6054 2 years ago +2

    Commander, the aliens continue to make progress on the Avatar Project. If we're going to slow them, we need to move fast.

  • @kyuukev 1 year ago

    Nice video but is there a way to do it between 2 DockerOperator? And if not with the DockerOperator creating some kind of CustomDockerOperator to make it able to do it?

  • @user-ue2ig5gp2j 1 year ago

    Hi all! Please suggest how to implement the following scenario:
    Have a dag with sshoperator that executes shell script. Shell script returns different return codes. How to analyze which return code is received??

  • @prashantjoshi4691 3 years ago

    add provide_context = True in task which uses ti in older version of airflow

  • @thati2792 2 years ago

    can we pass xcom as params in redshiftsqloperator?

  • @BULLSHXTYT 2 years ago

    How to access xcom in subdag from parent dag?

  • @drakdragon 2 years ago

    Sir, I tried to do the needful to do good and nice Airflow in the XCom, but my hit chance was still saying 99% chance of to becoming the hit sir, but it still missed most of the time. Sir I do believe you should do the needful by improving the Airflow by using the Minecraft Mushroom Soup Windtunnel.

  • @antoniosaldivar4163 2 years ago

    I have the same xcom key from same task and it does not get replace or overwrite, instead it creates other xcom record with same key and task id

    • @MarcLamberti 2 years ago

      look at the execution date which should not be the same. That's why you see that. XCOMs are created based on their DAGRun too

    • @antoniosaldivar4163 2 years ago

      @MarcLamberti thank you for the info!! You are right, the execution dates (YYYY-mm-dd, HH:mm:ss) are different. I see it has to be executed at the same time to overwrite that XCOM.

    • @RaviPrakash-vu6hf 1 year ago

      Hi @MarcLamberti, I have the same issue. I do want to overwrite the value from the last run. I have a DAG that runs every 15 mins and pushes a key to XCom. Since this key is stored in the DB, the number of records will keep growing. Hence, I want to overwrite the value of the key from the previous DAG run. Please let me know if that is possible. Thanks!!

  • @djalan84 1 month ago

    what about sharing data between dags?

    • @MarcLamberti 1 month ago

      You can. Just specify the dag_id in the xcom_pull method.

  • @yerkhankabyl6284 2 years ago

    Does someone have the same issue like me? I got 3 “None” -> [None, None, None]

    • @yerkhankabyl6284 2 years ago

      Oops, I solved this problem, the main error was the name of “Xcom” 😅
      Thank u, man, u are awesome (:

  • @BM-uf4pp 3 years ago

    The three people who disliked this video are nifi fans

  • @g3rzin 1 year ago

    Please don't read the underscore!!!! Just read it as "choose model" and not "choose underscore model"

    • @MarcLamberti 1 year ago

      Okkkkkkkkkk!!!!!!!! Will do

    • @g3rzin 1 year ago

      @MarcLamberti ahahah great! And thanks for teaching me airflow!