- 15 videos
- 10,466 views
Jordan Nanos
Canada
Joined Jun 26, 2017
Machine Learning Architecture
TWRTW Ep #4 - Coding Assistants, SMCI shorts, NVIDIA DOJ Subpoena, OpenAI Lawsuits, Intel Spinoff
With backgrounds in the design and implementation of compute infrastructure from edge to cloud, sensor to tensor, Jordan Nanos and Hunter Almgren give their take on what’s new in enterprise technology - specifically, what they read this week.
Show notes:
x.com/peterberezinbca/status/1829343765653225639
x.com/mattshumer_/status/1831767014341538166
www.bloomberg.com/news/articles/2024-08-30/intel-is-said-to-explore-options-to-cope-with-historic-slump
futurism.com/the-byte/openai-copyrighted-material-parliament
x.com/RnaudBertrand/status/1831536755729952909
www.bloomberg.com/news/articles/2024-08-30/intel-is-said-to-explore-options-to-cope-with-historic-slump
147 views
Videos
Demo and Code Review for Text-To-SQL with Open-WebUI
613 views · a day ago
github.com/JordanNanos/example-pipelines
TWRTW Ep #3 - GenAI's Impact on Work/Privacy, Immersion Cooling, OpenStack is Back, Perplexity Ads
64 views · 21 days ago
With backgrounds in the design and implementation of compute infrastructure from edge to cloud, sensor to tensor, Jordan Nanos and Hunter Almgren give their take on what’s new in enterprise technology - specifically, what they read this week.
Show Notes:
Twitter taught Tay to be racist: www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
Digital twins for people doing interviews: w...
TWRTW Ep #2 - Nuclear Power, GPU Buildouts, Semi-Stateful Workloads, LLM security, GPT-5 Speculation
69 views · a month ago
With backgrounds in the design and implementation of compute infrastructure from edge to cloud, sensor to tensor, Jordan Nanos and Hunter Almgren give their take on what’s new in enterprise technology - specifically, what they read this week.
Show Notes:
Intel compared to Boeing: www.theregister.com/2024/08/09/opinion_column_intel/?td=rt-3a
Nuclear Power Roadblocks: www.cnbc.com/2024/08/10/why-...
Using Llama-3.1-405B as a Coding Assistant with Continue.Dev, Ollama, and NVIDIA GH200 Superchip
819 views · a month ago
Running Llama-3.1-405B in just 2U of rack space, under 3 kW, with an industry-standard server, the HPE ProLiant DL384 Gen12: www.hpe.com/ca/en/compute/pro...
Featuring 2x NVIDIA Grace Hopper GH200 superchips inside
Continue.Dev VSCode plugin for coding assistance: continue.dev/
Ollama for model serving: ollama.com/
Open-WebUI for the chat frontend: openwebui.com/
Running Llama-3.1-405B with Ollama and Open-WebUI: Introduction to the DL384 Gen12 Server
220 views · a month ago
Running Llama-3.1-405B in just 2U of rack space, under 3 kW, with an industry-standard server, the HPE ProLiant DL384 Gen12: www.hpe.com/ca/en/compute/proliant-dl384-gen12.html
Featuring 2x NVIDIA Grace Hopper GH200 superchips inside
Ollama for model serving, Open-WebUI for the chat frontend
TWRTW Ep. #1 - Intel issues, NVIDIA delays, JPY carry trades, Google antitrust
142 views · a month ago
With backgrounds in the design and implementation of compute infrastructure from edge to cloud, sensor to tensor, Jordan Nanos and Hunter Almgren give their take on what’s new in enterprise technology - specifically, what they read this week.
RSS: rss.com/podcasts/twrtw
Spotify: open.spotify.com/show/1FzBQv10cadDsQGOCudEXD
Apple: podcasts.apple.com/ca/podcast/things-we-read-this-week-jordan-nan...
Building Customized Text-To-SQL Pipelines in Open WebUI
2.1K views · a month ago
Accessing an LLM served by vLLM or ollama through open-webui
Using a text-to-SQL prompt within an open-webui pipeline
Connecting to a postgres database with an open-webui pipeline
Modifying SQL queries using the information in-context
All disconnected, running on a single server with 1x GPU
www.youtube.com/@jordannanos
x.com/JordanNanos
www.linkedin.com/in/jordannanos/
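To make the pipeline shape concrete, here is a minimal sketch of what an Open WebUI pipeline class for text-to-SQL might look like. This is an illustration, not Jordan's actual code (see github.com/JordanNanos/example-pipelines for that): the class name, the stubbed `_call_llm`, and the use of an in-memory SQLite database in place of Postgres are all assumptions made so the sketch runs anywhere without a GPU or a model server.

```python
import sqlite3


class TextToSQLPipeline:
    """Sketch of an Open WebUI-style pipeline for text-to-SQL.

    Assumptions: the real pipeline talks to Postgres and to an LLM served
    by vLLM or ollama; here the database is in-memory SQLite and the LLM
    call is a stub, so the overall flow can be demonstrated standalone.
    """

    def __init__(self, conn):
        self.name = "Database RAG Pipeline (sketch)"
        self.conn = conn

    def _schema(self):
        # Collect CREATE TABLE statements so the model sees the schema in-context.
        rows = self.conn.execute(
            "SELECT sql FROM sqlite_master WHERE type='table'"
        ).fetchall()
        return "\n".join(r[0] for r in rows)

    def _call_llm(self, prompt):
        # Stub: a real pipeline would send `prompt` to the vLLM/ollama
        # endpoint and parse the generated SQL out of the response.
        return "SELECT name, price FROM products ORDER BY price DESC"

    def pipe(self, user_message, model_id=None, messages=None, body=None):
        # Build a text-to-SQL prompt with the schema in-context,
        # "generate" SQL, execute it, and return both to the chat UI.
        prompt = (
            "Given the schema below, write one SQL query answering the question.\n"
            f"Schema:\n{self._schema()}\n"
            f"Question: {user_message}\nSQL:"
        )
        sql = self._call_llm(prompt)
        rows = self.conn.execute(sql).fetchall()
        return f"Query: {sql}\nResult: {rows}"
```

In the real setup, this file would be uploaded to the "pipelines" container and the model would then appear as a selectable option in the Open WebUI dropdown.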
Simple Overview of Text to SQL Using Open-WebUI Pipelines
3.1K views · a month ago
runs on a single GPU server
Overview of an Example LLM Inference Setup
2.8K views · a month ago
HPE Cray XD670 with 8x H100 80GB SXM GPUs
ollama, vLLM, open-webui docker containers
llama3.1, mistral, gemma models
Great Video! Can you tell me please how to create/generate API Key for llama_index?
@@Mohsin.Siddique llama-index is a Python package installed via pip; you don't need an API key. No API keys are required for this pipeline.
I'm running both Open WebUI and the pipelines in different docker containers, but there's an error whenever I try to connect the two. Your example that repeats the text back to the user works fine, but whenever I use libraries like LangChain or LlamaIndex it doesn't work and throws an HTTP connection error. Could you provide any help with this?
@@harsh90dem0 are the dependent packages installed in your pipelines container? You'll need to docker exec or kubectl exec into the container called "pipelines", then run:
pip install llama-cloud==0.0.13 llama-index==0.10.65 llama-index-agent-openai==0.2.9 \
  llama-index-cli==0.1.13 llama-index-core==0.10.66 llama-index-embeddings-openai==0.1.11 \
  llama-index-indices-managed-llama-cloud==0.2.7 llama-index-legacy==0.9.48.post2 \
  llama-index-llms-ollama==0.2.2 llama-index-llms-openai==0.1.29 \
  llama-index-llms-openai-like==0.1.3 llama-index-multi-modal-llms-openai==0.1.9 \
  llama-index-program-openai==0.1.7 llama-index-question-gen-openai==0.1.3 \
  llama-index-readers-file==0.1.33 llama-index-readers-llama-parse==0.1.6 \
  llama-parse==0.4.9 nltk==3.8.1
mind sharing your code please?
@@RickySupriyadi hi, code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
@@jordannanos wow cool, thank you.
What an awesome introduction to pipelines - thank you so much!
Hi, how do I execute the python lib installation on the pipeline server?
@@gilkovary2753 you’ll need to docker exec or kubectl exec into the container called "pipelines", then run:
pip install llama-cloud==0.0.13 llama-index==0.10.65 llama-index-agent-openai==0.2.9 \
  llama-index-cli==0.1.13 llama-index-core==0.10.66 llama-index-embeddings-openai==0.1.11 \
  llama-index-indices-managed-llama-cloud==0.2.7 llama-index-legacy==0.9.48.post2 \
  llama-index-llms-ollama==0.2.2 llama-index-llms-openai==0.1.29 \
  llama-index-llms-openai-like==0.1.3 llama-index-multi-modal-llms-openai==0.1.9 \
  llama-index-program-openai==0.1.7 llama-index-question-gen-openai==0.1.3 \
  llama-index-readers-file==0.1.33 llama-index-readers-llama-parse==0.1.6 \
  llama-parse==0.4.9 nltk==3.8.1
Thanks for sharing this awesome project! I tried running the 01_text_to_sql_pipeline_vLLM_llama.py file from your GitHub repo, but I'm having trouble uploading it on Open WebUI even though I've installed all the requirements. Do you have any idea what might be causing this issue? Thanks again!
Did you configure the pipeline correctly?
@@dj_hexa_official what do u mean with that ?
@@netixc9733 what error are you seeing? docker logs -f or kubectl logs -f your pipelines container and it may report an error
Lovely demo of the synergy between language models and databases.
First of all, great job Jordan. It would be really helpful if you could share the code on GitHub.
hi, code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
great video. hoping to see more soon. congrats.
hi, code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
Jordan, super good job. I'm trying to integrate Open WebUI into my CRM system, so that my employees can query the database for any of our product prices through chat. Could this RAG pipeline work that way, for example? Thank you for your answer.
hi, I think if you've got a db you should be able to query it. especially if you already know how using python. I posted another video. code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
@@jordannanos Thanks a lot Jordan . Super cool
Hi. Could you link us to the source code of the pipeline?
code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
Jordan thanks, I have a single-GPU runpod setup. Would you recommend just adding a docker postgresql to the existing pod? And is the python code using langchain stored in the pod pipeline settings? This sort of reminds me of AWS serverless Lambda, but simpler.
@@RedCloudServices if you’d like to save money I would run Postgres in docker on the same VM you’ve already got. That will also simplify networking. Over time you might want to start/stop those services independently in the event of an upgrade to docker or your VM. Or you might want to scale independently. In that case you might want a separate VM for your DB and a separate one for your UI. Or you might consider running kubernetes. Yes the python code is all contained within the pipelines container and uses llama-index not langchain (though you could use langchain too). Just a choice I made.
@@RedCloudServices in other words, you’ll need to pip install the packages that the pipeline depends on, inside the pipelines container. Watch the other video I linked for more detail on how to do this.
@@jordannanos yep! just watched it. I just learned openwebui does not allow vision-only models or multi-modal LLMs like Gemini. Was hoping to set up a pipeline using a vision model 🤷‍♂️ also it’s not clear how to edit or set up whatever vector db it’s using
Hi Jordan, thanks. I am missing the steps where you created the custom "Database Rag Pipeline with Display". From the Pipelines page you completed the database details and set the Text-to-sql Model to Llama3, but where do you configure the connection between the pipeline valves and the "Database Rag Pipeline with Display" to be an option to be selected?
@@martinsmuts2557 it’s a single .py file that is uploaded to the pipelines container. I’ll cover that in more detail in a future video
@@jordannanos Do create this video soon!
@@KunaalNaik @martinsmuts2557 just posted a video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html repo is here: github.com/JordanNanos/example-pipelines
hi
30k GC? 8 of them?
Thx for sharing and it's really interesting to learn more about the pipeline projects related to open webui.
Sweet rig. Is that your daily driver? 😀😀
The cost of such a setup is circa $500,000… amma get me 2 :)
Thank you Jordan! Great work, interesting to see how these new servers can really deliver performance. ARM / x86.. just works. Yours, Greg
Thank you Jordan.
Thank you for this. Can you share more info on the RAG pipeline along with code examples.
working on getting it to run on both vLLM + ollama endpoints with llama3.1 + mistral. prompt uses llamaindex for text-to-sql.
similar to this guide: docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/
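The flow in that guide boils down to two LLM calls: schema plus question in, SQL out; then question plus result rows in, natural-language answer out. Below is a hedged stdlib-only sketch of that flow, with the llama-index machinery replaced by a plain callable and SQLite standing in for Postgres. The function name and prompt wording are illustrative assumptions, not the library's API.

```python
import sqlite3


def text_to_sql_answer(conn, question, llm):
    """Two-step text-to-SQL flow, mirroring the llama-index SQLIndexDemo:
    1) schema + question -> SQL, 2) question + result rows -> answer.

    `llm` is any callable(prompt) -> str; in the video it would be
    llama3.1 or mistral served by vLLM/ollama. `conn` is a DB-API
    connection (SQLite here, Postgres in the real setup).
    """
    # Put the schema in-context so the model can write valid SQL.
    schema = "\n".join(
        r[0]
        for r in conn.execute("SELECT sql FROM sqlite_master WHERE type='table'")
    )
    sql = llm(f"Schema:\n{schema}\nQuestion: {question}\nSQL query only:")

    # Execute the generated SQL against the database.
    rows = conn.execute(sql).fetchall()

    # Second call: synthesize a natural-language answer from the rows.
    answer = llm(
        f"Question: {question}\nSQL result rows: {rows}\n"
        "Answer the question in one sentence:"
    )
    return sql, rows, answer
```

Splitting generation and synthesis like this is also what makes the pipeline easy to debug: the generated SQL can be logged and displayed to the user alongside the final answer.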
Great job can't wait to see more
@@jvannoyx4 hi, code is here: github.com/JordanNanos/example-pipelines video reviewing the code: ruclips.net/video/iLVyEgxGbg4/видео.html
Amazing video! I have a 4xA4000 GPU 128GB setup and I can only run the 405B Q2_K model, and it’s really slow. Amazing how the GH200 chips offer great token/sec performance!
- what are you using it for? - .... stuff
AI, apparently. (LLM = Large Language Model)
I feel jealous of that 8xH100 server. Currently using a 4x3090 at home. I actually use a pretty similar setup, with vLLM for the full-precision models and exllama or llama.cpp for quantized models, plus OpenwebUI as a frontend.
Why would you need more than that? Be glad for what you already have or you won't find happiness :)
Bitch i have a p40 and im over the moon. Being poor in ml is hard.
Word
Why not podman tho
Good discussion! Keep it up.
Great work, Jordan! Gonna start scraping the parts together...
I've been automating deployments with SkyPilot. It uses the cheapest spot instances and heals itself.
nice video. saw the link from twitter. my question is, is there a way to speed up the results after you ask it a question?
Yes, working to improve the LLM response and SQL query time