Unstructured Data Processing Meetup NYC - 25 July 2024

  • Published: 12 Sep 2024
  • unstructuredmeetuptspann25July2024
    This is an in-person event! Registration is required to get in.
    Topic: Connecting your unstructured data with Generative LLMs
    What we’ll do:
    Have some food and refreshments. Hear three exciting talks about unstructured data and generative AI.
    5:30 - 6:00 - Welcome/Networking/Registration
    6:05 - 6:30 - Tim Spann, Principal DevRel, Zilliz
    6:35 - 7:00 - Chris Joynt, Senior PMM, Cloudera
    7:05 - 7:30 - Lisa N Cao, Product Manager, Datastrato
    7:30 - 8:30 - Networking
    Tech talk 1: Unstructured Data Processing From Cloud to Edge
    Speaker: Tim Spann, Principal Dev Advocate, Zilliz
    In this talk I will explain why you should add a cloud-native vector database to your data and AI platform. I will also give a quick introduction to Milvus, vector databases and unstructured data processing. By adding Milvus to your architecture you can scale out and improve your AI use cases through RAG, real-time search, multimodal search, recommendation engines, fraud detection and many more emerging use cases.
    As I will show, edge devices even as small and inexpensive as a Raspberry Pi 5 can handle machine learning, deep learning and AI use cases and be enhanced with a vector database.
    Tech talk 2: RAG Pipelines with Apache NiFi
    Speaker: Chris Joynt, Senior PMM, Cloudera
    Executing on a RAG architecture is not a set-it-and-forget-it endeavor. Unstructured or multimodal data must be cleansed, parsed, processed, chunked and vectorized before being loaded into knowledge stores and vector DBs. That needs to happen efficiently so that our GenAI always has fresh contextual data. The pipeline itself also has to change on an ongoing basis. For example, new data sources must be added, and experimentation will be necessary to find the ideal chunking strategy. Apache NiFi is well suited to building RAG pipelines that stream proprietary and external data into your RAG architectures. Come learn how to use this scalable and incredibly versatile tool to quickly build pipelines to activate your GenAI use case.
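To make the "chunking strategy" point concrete, here is a minimal Python sketch of fixed-size chunking with overlap, one of several strategies a team might experiment with. It is not taken from the talk and does not use Apache NiFi; the function name `chunk_text` and its parameters are hypothetical, chosen only for illustration.

```python
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share `overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk
    (provided it is shorter than the overlap). Hypothetical helper.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    sample = "Unstructured data must be cleansed, parsed, chunked and vectorized. " * 5
    # Compare how many chunks different (size, overlap) settings produce.
    for size, overlap in [(50, 0), (50, 10), (100, 20)]:
        pieces = chunk_text(sample, size, overlap)
        print(f"size={size} overlap={overlap} -> {len(pieces)} chunks")
```

Varying `chunk_size` and `overlap` and then measuring retrieval quality downstream is one simple way to run the kind of experiment the abstract mentions. Splitting on sentences or document structure is another common option.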
    Tech Talk 3: Metadata Lakes for Next-Gen AI/ML
    Speaker: Lisa N Cao, Datastrato
    Abstract: As data catalogs evolve to meet the new and growing demands of high-velocity, unstructured data, we see them taking a new shape as an emergent and flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI enablement in RAG pipelines, in response to the new demands of the ecosystem. We will also discuss Apache Gravitino (incubating) and its open-source-first approach to data cataloging across multi-cloud and geo-distributed architectures.
    Who should attend:
    Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
    When:
    July 25, 2024
    5:30PM
    Where:
    This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Sponsored by Zilliz, the maintainers of Milvus.
