Automate a Data Pipeline for RAG with GitHub Actions

  • Published: Jul 29, 2024
  • Data is a key part of any RAG system, and some applications need that data to stay current. A chatbot for financial reports, research, or news, for example, has to work from the most recent information. How to automate the data pipeline behind such a system is rarely discussed, so that's what I want to share in this video. I'll cover how to set up an ETL pipeline, introduce you to Supabase Vector, and show you how to automate the pipeline using GitHub Actions.
    Chapters:
    00:00:00 - Intro
    00:01:02 - Project Overview
    00:02:29 - ETL Pipeline Explanation
    00:05:00 - Set Up Database
    00:05:50 - Test the Pipeline (Insert Data into Supabase)
    00:06:25 - Set Up GitHub Actions
    00:09:30 - Save Environment Variables in GitHub Repository
    00:10:08 - Check the Results of the Automated Process in GitHub
    00:11:05 - Add Command in actions.yaml to Save Files After Process Runs (git push)
    00:11:30 - Add a Scheduler
    00:12:13 - Outro
    #etl #githubactions #python
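
    For readers who want a feel for the ETL pipeline described above before watching, here is a minimal sketch in Python. It is not the code from the video: the function names, the fixed-size character chunking, and the `insert_rows` callback are all assumptions made for illustration. In a real setup the callback would wrap a database insert (for example into a Supabase table), and an embedding step would normally sit between transform and load; both are left out here so the sketch runs without credentials.

```python
from typing import Callable, Iterable

# One row destined for the database; the column names are hypothetical.
Row = dict[str, object]


def extract(raw_documents: Iterable[str]) -> list[str]:
    """Extract step: trim whitespace and drop empty documents."""
    return [doc.strip() for doc in raw_documents if doc.strip()]


def transform(documents: list[str], chunk_size: int = 200) -> list[Row]:
    """Transform step: split each document into fixed-size character chunks."""
    rows: list[Row] = []
    for doc_id, doc in enumerate(documents):
        for start in range(0, len(doc), chunk_size):
            rows.append(
                {
                    "doc_id": doc_id,
                    "chunk_index": start // chunk_size,
                    "content": doc[start:start + chunk_size],
                }
            )
    return rows


def load(rows: list[Row], insert_rows: Callable[[list[Row]], None]) -> int:
    """Load step: hand the rows to a caller-supplied insert function.

    `insert_rows` is a stand-in (an assumption of this sketch) for whatever
    writes to the database, so the pipeline can be tested offline.
    """
    if rows:
        insert_rows(rows)
    return len(rows)


def run_pipeline(
    raw_documents: Iterable[str],
    insert_rows: Callable[[list[Row]], None],
    chunk_size: int = 200,
) -> int:
    """Run extract, transform, and load in order; return the row count."""
    return load(transform(extract(raw_documents), chunk_size), insert_rows)


if __name__ == "__main__":
    stored: list[Row] = []  # in-memory stand-in for a database table
    count = run_pipeline(
        ["  First report text.  ", "", "Second report."],
        stored.extend,
        chunk_size=10,
    )
    print(f"Loaded {count} chunks")
```

    Keeping the insert behind a callback means the same script can run locally against a list and, in a scheduled GitHub Actions job, against the real database with credentials read from environment variables.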
