Implementing Data Quality Checks in Airflow

  • Published: Sep 6, 2022
  • Data quality is key to the success of an organization’s data systems. In Apache Airflow, implementing data quality checks in DAGs is both easy and effective. With in-DAG quality checks, you can halt pipelines and alert stakeholders before bad data makes its way to a production lake or warehouse.
    In this webinar, we discuss the benefits of the Common SQL provider package, a consistent, easy-to-use, and versatile set of operators for implementing data quality checks in your pipelines. In particular, we focus on the SQLColumnCheckOperator and SQLTableCheckOperator, both part of the provider package, and how they can work with OpenLineage.
    The webinar shows you how to effectively use SQL for data quality checks, and answers questions like:
    - Why does the Common SQL provider exist and how does it work?
    - How do I implement column-level and table-level checks in my DAGs?
    - How does the Common SQL provider operate with OpenLineage?
    All of the sample code shown in this webinar can be found in this repo: github.com/astronomer/airflow...
  • Science
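
The description above mentions column-level and table-level checks with the SQLColumnCheckOperator and SQLTableCheckOperator. As a rough sketch of what that can look like (not the code from the webinar repo), the example below defines both kinds of checks and wires them into a DAG. The table name `orders`, its columns, the connection ID `my_warehouse`, and the DAG ID are all hypothetical placeholders, and the Airflow imports are deferred into the function so the check definitions can be read and tested without Airflow installed.

```python
from datetime import datetime

# Column-level checks: column name -> check name -> pass condition.
# "orders", "order_id" and "amount" are hypothetical names for illustration.
COLUMN_CHECKS = {
    "order_id": {
        "null_check": {"equal_to": 0},    # no NULL values allowed
        "unique_check": {"equal_to": 0},  # no duplicate values allowed
    },
    "amount": {
        "min": {"geq_to": 0},             # smallest value must be >= 0
    },
}

# Table-level checks: check name -> SQL expression that must evaluate to true.
TABLE_CHECKS = {
    "row_count_check": {"check_statement": "COUNT(*) > 0"},
}


def build_dag():
    """Build a DAG that runs the column checks, then the table checks."""
    # Imported here so this module loads even where Airflow is not installed.
    from airflow import DAG
    from airflow.providers.common.sql.operators.sql import (
        SQLColumnCheckOperator,
        SQLTableCheckOperator,
    )

    with DAG(
        dag_id="orders_quality_checks",   # hypothetical DAG ID
        start_date=datetime(2022, 9, 6),
        catchup=False,
    ) as dag:
        column_checks = SQLColumnCheckOperator(
            task_id="column_checks",
            conn_id="my_warehouse",       # hypothetical Airflow connection
            table="orders",
            column_mapping=COLUMN_CHECKS,
        )
        table_checks = SQLTableCheckOperator(
            task_id="table_checks",
            conn_id="my_warehouse",
            table="orders",
            checks=TABLE_CHECKS,
        )
        # A failed check fails its task, which halts downstream tasks.
        column_checks >> table_checks
    return dag
```

In an actual DAG file, Airflow would need to find the DAG object at module level, for example via `dag = build_dag()`.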

Comments • 6

  • @atalaykutlay6898
    1 year ago +1

    Thank you for another great video!

  • @amrutakothari9293
    1 year ago

    Hi, can I check values for the current date?
    I mean, I want to check the count and sum of a column for the current date.
    If yes, please tell me how. Thanks in advance!

    • @Astronomer
      1 year ago

      That would be a different SQL command depending on your data setup!
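
To make the reply above a little more concrete, here is a minimal sketch of a count-and-sum check restricted to the current date. It uses an in-memory SQLite database from the Python standard library rather than the Airflow operators, and the table `orders` with columns `order_date` and `amount` is a hypothetical schema, so the SQL would need adapting to your own data. In the check operators, a similar date filter could presumably be supplied through the `partition_clause` parameter, but that depends on your provider version and database.

```python
import sqlite3
from datetime import date


def check_today(conn, table, date_col, value_col, today):
    """Return (row_count, column_sum) for rows whose date equals `today`."""
    # Identifiers are interpolated directly, so they must come from trusted
    # code, never from user input. The date value is bound as a parameter.
    query = (
        f"SELECT COUNT(*), COALESCE(SUM({value_col}), 0) "
        f"FROM {table} WHERE {date_col} = ?"
    )
    return conn.execute(query, (today,)).fetchone()


# Hypothetical sample data: two rows for today, one old row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
today = date.today().isoformat()
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(today, 10.0), (today, 5.5), ("2000-01-01", 99.0)],
)

row_count, total = check_today(conn, "orders", "order_date", "amount", today)

# Fail loudly, as a check task would, if today's data looks wrong.
if row_count == 0 or total <= 0:
    raise ValueError(f"Quality check failed for {today}: {row_count} rows, sum {total}")
```

The thresholds here (at least one row, a positive sum) are placeholders; the right conditions depend on what "good" data looks like for your table.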

  • @xmagcx1
    1 year ago

    I like Dana