Pedro Holanda - DuckDB: Bringing analytical SQL directly to your Python shell

  • Published: Oct 3, 2024
  • PyData Eindhoven 2022
    In this talk, we will present DuckDB. DuckDB is a novel data management system that executes analytical SQL queries without requiring a server. DuckDB has a unique, in-depth integration with the existing PyData ecosystem. This integration allows DuckDB to query and output data from and to other Python libraries without copying it. This makes DuckDB an essential tool for the data scientist. In a live demo, we will showcase how DuckDB performs and integrates with the most used Python data-wrangling tool, Pandas.
    The talk is aimed primarily at data scientists and data engineers. It aims to familiarize users with the design differences between Pandas and DuckDB and with how to combine them to solve their data-science needs. We will give an overview of five main characteristics of DuckDB: 1) Vectorized Execution Engine, 2) End-to-end Query Optimization, 3) Automatic Parallelism, 4) Beyond Memory Execution, and 5) Data Compression. In addition, users will see a live demo of DuckDB and Pandas in a typical data science scenario, comparing their performance and usability while showcasing how they work together. The demo is most interesting for an audience familiar with Python, the Pandas API, and SQL.

Comments • 6

  • @holandacaua643
    1 year ago +2

    I started using DuckDB, and it is really very efficient!

  • @taxed825
    1 year ago

    DuckDB looks awesome. Will be trying it out this week!

  • @AndikaMcenroe85
    1 year ago

    Great presentation! Is the presentation shared anywhere?

  • @cboyda
    1 year ago

    Where can we find the link to the .ipynb demo file? Great examples!

  • @datasleek7950
    1 year ago

    Why another DB? ClickHouse also supports a columnar engine, can ingest large amounts of data, and runs fast queries. SingleStore does the same and can ingest data directly from S3, Kafka, and Azure. SingleStore is way ahead, with advanced support for JSON (faster than MongoDB) and much more.

  • @JuniorMarques79
    1 year ago

    Very enlightening.