Videos: 74
Views: 7,763

EE837 (Fall 2024): Generative Multimodal Models are In-Context Learners

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Generative Multimodal Models are In-Context Learners
Presenter: Damin Yeom
Date: Nov. 12th, 2024

Videos

EE837 (Fall 2024): Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions

36:09

7 views, 12 hours ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
Presenter: Dahye Lee
Date: Nov. 7th, 2024

EE837 (Fall 2024): MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

27:18

10 views, 16 hours ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Presenter: Dahye Lee
Date: Oct. 24th, 2024

EE837 (Fall 2024): Compositional Chain-of-Thought Prompting for Large Multimodal Models

24:03

11 views, 16 hours ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Compositional Chain-of-Thought Prompting for Large Multimodal Models
Presenter: Seongyeop Kim
Date: Nov. 5th, 2024

EE837 (Fall 2024): Make Your LLM Fully Utilize the Context

25:50

12 views, 16 hours ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Make Your LLM Fully Utilize the Context
Presenter: Sangyun Chung
Date: Oct. 31st, 2024

EE837 (Fall 2024): Aligning Large Multimodal Models with Factually Augmented RLHF

24:05

31 views, 14 days ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Aligning Large Multimodal Models with Factually Augmented RLHF
Presenter: Byung-Kwan Lee
Date: Oct. 29th, 2024

EE837 (Fall 2024): RegionGPT: Towards Region Understanding Vision Language Model

30:56

23 views, 21 days ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: RegionGPT: Towards Region Understanding Vision Language Model
Presenter: Damin Yeom
Date: Oct. 22nd, 2024

EE837 (Fall 2024): MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

35:17

75 views, 28 days ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Presenter: Minho Park
Date: Oct. 15th, 2024

EE837 (Fall 2024): Language Model Beats Diffusion: Tokenizer is Key to Visual Generation

53:33

104 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Language Model Beats Diffusion: Tokenizer is Key to Visual Generation
Presenter: Byung-Kwan Lee
Date: Oct. 10th, 2024

EE837 (Fall 2024): Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

28:48

67 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
Presenter: Seongyeop Kim
Date: Oct. 8th, 2024

EE837 (Fall 2024): MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over

40:51

44 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
Presenter: Sangyun Chung
Date: Oct. 4th, 2024

EE837 (Fall 2024): GroundingGPT: Language Enhanced Multi-modal Grounding Model

35:19

24 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: GroundingGPT: Language Enhanced Multi-modal Grounding Model
Presenter: Minho Park
Date: Oct. 2nd, 2024

EE837 (Fall 2024): Auto-Encoding Morph-Tokens for Multimodal LLM

42:43

53 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Auto-Encoding Morph-Tokens for Multimodal LLM
Presenter: Damin Yeom
Date: Sep. 26th, 2024

EE837 (Fall 2024): AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

37:42

35 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
Presenter: Dahye Lee
Date: Sep. 24th, 2024

EE837 (Fall 2024): Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models

29:39

55 views, 1 month ago

EE837 Special Topics on Signal Processing: Multimedia Processing and Learning (Fall 2024) Student Presentation
Paper title: Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
Presenter: Seongyeop Kim
Date: Sep. 19th, 2024

EE837 (Fall 2024): DreamLLM: Synergistic Multimodal Comprehension and Creation

34:01

47 views, 2 months ago

EE837 (Fall 2024): LLM Pruning and Distillation in Practice: The Minitron Approach

35:37

102 views, 2 months ago

EE474 (Spring 2024): AI Chatbot "Warny" for Automotive Dashboard Warning Lights

29:34

12 views, 4 months ago

EE474 (Spring 2024): Generating Moving Emoticons from Webtoon

27:39

16 views, 4 months ago

EE474 (Spring 2024): MUSE: Multimodal Utility for Story Enhancement

29:28

15 views, 4 months ago

EE474 (Spring 2024): ALACen: Automatic Language-level Adjustment for Video Censorship

29:46

4 views, 4 months ago

EE474 (Spring 2024): Adaptation of Music

25:33

5 views, 4 months ago

EE474 (Spring 2024): Beyond the Page: AI-Powered Multimedia Books from Text for Children

29:25

9 views, 4 months ago

EE474 (Spring 2024): REMEDI: REpresenting Me using D-Id studio

27:35

20 views, 4 months ago

Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection

8:09

115 views, 7 months ago

Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection

2:00

48 views, 7 months ago

Mitigating Dataset Bias in Image Captioning through CLIP Confounder free Captioning Network (ICIP23)

10:04

35 views, 8 months ago

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding (ICCV23)

2:59

47 views, 8 months ago

Watch or Listen: Robust Audio-Visual Speech Recognition (CVPR23)

2:04

51 views, 8 months ago

Lip-to-speech Synthesis in the Wild with Multi-task Learning (ICASSP23)

1:16

41 views, 8 months ago

@rossvalence7292 2 months ago
This is very nice and helpful for our research thank you
@neverfok 7 months ago
You present very well. You seem to know the material thoroughly. Thank you for the great content!
@neverfok 7 months ago
Thank you for the great paper presentation!
@kavibharathi1547 8 months ago
Add english subtitles or audio
@TaeKimPiano 1 year ago
Hyunjun Kim is in insane form
@TylerMatthewHarris 7 years ago
Very cool

IVY & IVL Lab in KAIST
