257 videos
97,213 views

Computer Vision Meetup: Multiview Scene Graph

Motivated by how humans perceive scenes, we propose the Multiview Scene Graph (MSG) as a general topological scene representation. MSG constructs a place+object graph from unposed RGB images and we provide novel metrics to evaluate the graph quality. We combine visual place recognition and object association to build MSG in one Transformer decoder model. We believe MSG can connect dots across classic vision tasks to promote spatial intelligence and open new doors for topological 3D scene understanding.
Read the paper: arxiv.org/abs/2410.11187
About the Speaker:
Juexiao Zhang is a second-year PhD student in computer science at NYU Courant, advised by Professor Chen Feng. He is interested in l...

Videos

Computer Vision Meetup: Simple & Scalable Approach to Improve Vision Model Robustness to Corruptions

16:15

23 views, 4 hours ago

Deep neural networks perform exceptionally on clean images but face significant challenges with corrupted ones. While data augmentation with specific corruptions during training can improve model robustness to those particular distortions, this approach typically degrades performance on both clean images and corruptions not encountered during training. In this talk, we present a novel approach ...

Computer Vision Meetup: CLIP: Insights into Zero-Shot Image Classification with Mutual Knowledge

28:27

94 views, 4 hours ago

We interpret CLIP’s zero-shot image classification by examining shared textual concepts learned by its vision and language encoders. We analyze 13 CLIP models across various architectures, sizes, and datasets. The approach highlights a human-friendly way to understand CLIP’s classification decisions. Read the paper: arxiv.org/abs/2410.13016 Fawaz Sammani is a 2nd year PhD student at the Vrije ...

Computer Vision Meetup: Intrinsic Self-Supervision for Data Quality Audits

29:01

27 views, 4 hours ago

Benchmark datasets in computer vision often contain issues such as off-topic samples, near-duplicates, and label errors, compromising model evaluation accuracy. This talk will discuss SelfClean, a data-cleaning framework that leverages self-supervised representation learning and distance-based indicators to detect these issues effectively. By framing the task as a ranking or scoring problem, Se...

Computer Vision Meetup: Map It Anywhere: Empowering BEV Map Prediction using Public Datasets

26:58

43 views, 9 hours ago

Top-down Bird’s Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more ...

Computer Vision Meetup: Understanding Bias in Large-Scale Visual Datasets

22:15

32 views, 9 hours ago

Truly general-purpose vision systems require pre-training on diverse and representative visual datasets. The “dataset classification” experiment reveals that modern large-scale visual datasets are still very biased: neural networks can achieve excellent accuracy in classifying which dataset an image is from. However, the concrete forms of bias among these datasets remain unclear. In this talk, ...

Computer Vision Meetup: No "Zero-Shot" Without Exponential Data

26:05

59 views, 9 hours ago

Web-crawled pretraining datasets underlie the impressive “zero-shot” evaluation performance of multimodal models. However, it is unclear how meaningful the notion of “zero-shot” generalization is for such multimodal models, as it is not known to what extent their pretraining datasets encompass the downstream concepts targeted for during “zero-shot” evaluation. In this work, we ask: How is the p...

Computer Vision Meetup: An AI-Powered Teaching Assistant for Scalable and Adaptive Learning

11:40

51 views, 1 day ago

The future of education lies in personalized and scalable solutions, especially in fields like computer engineering where complex concepts often challenge students. This talk introduces Lumina (AI Teaching Assistant), a cutting-edge agentic system designed to revolutionize programming education through its innovative architecture and teaching strategies. Built using OpenAI API, LangChain, RAG, ...

Computer Vision Meetup: Scaling Semantic Segmentation with Blender

27:35

50 views, 1 day ago

Generating datasets for semantic segmentation can be time-intensive. Learn how to use Blender’s Python API to create diverse and realistic synthetic data with automated labels, saving time and improving model performance. Preview the topics to be discussed in this Medium post. About the Speaker Vincent Vandenbussche has a PhD in Physics, is an author, and Machine Learning Engineer with 10 years...

Computer Vision Meetup: WACV 2025 - Elderly Action Recognition Challenge

8:23

43 views, 1 day ago

Join us for a quick update on the Elderly Action Recognition (EAR) Challenge, part of the Computer Vision for Smalls (CV4Smalls) Workshop at the WACV 2025 conference! This challenge focuses on advancing research in Activity of Daily Living (ADL) recognition, particularly within the elderly population, a domain with profound societal implications. Participants will employ transfer learning techn...

Computer Vision Meetup: Using Machine Vision to Create Sustainable Practices in Fisheries

27:35

66 views, 1 day ago

Fishing vessels are on track to generate 10 million hours of video footage annually, creating a massive machine learning operations challenge. At AI.Fish, we are building an end-to-end system enabling non-technical users to harness AI for catch monitoring and classification both on-board and in the cloud. This talk explores our journey in building these approachable systems and working toward a...

Visual AI for Geospatial: Evaluating Earth Observation Foundation Models

31:32

121 views, 1 day ago

Geospatial and Earth Observation have benefited from the new advances in computer vision. In this talk we are going to evaluate the accuracy and ease of use of two of these great new models - the Satlas and Clay foundational models. The evaluation will look at distinct areas on the globe. Come see how this gift of foundational models improves your work in geospatial or Earth observat...

Visual AI for Geospatial: Earth Monitoring for Everyone with Earth Index

27:29

184 views, 1 day ago

Earth Index is an end-user-focused application that preprocesses global imagery through AI foundation models to enable rapid in-browser search and monitoring. Earth Genome builds Earth Index for critical applications in the environment, and is being used today to report on illegal airstrips built in the Peruvian Amazon, track cattle factory farms across the planet for emissions modeling, and exp...

Visual AI for Geospatial: Is AI Creating a Whole New Earth-Aware Geospatial Stack?

23:21

140 views, 1 day ago

The latest wave of AI innovation is profoundly changing many domains. In remote sensing, despite efforts like ours at Clay and others, it has been less so. In this talk we will share our experience as we realize, and explore, if geoAI represents a whole new stack to work with Earth data. About the Speaker Dr. Bruno Sanchez-Andrade Nuno is the executive director of the non-profit project Clay, an...

The Frontier of VisualAI in Medical Imaging

19:01

194 views, 21 days ago

Explore the transformative potential of Visual AI in medical imaging with Daniel Gural, Machine Learning and Developer Relations expert at Voxel51. 🤖🩺 In this forward-looking talk, Daniel dives into the intersection of medicine and AI in 2025, addressing critical industry and research challenges: - Key Industry Challenges: Rising costs, doctor shortages, and time-intensive processes. - Key Rese...

How to Unlock More in Self Driving Datasets

45:04

195 views, 21 days ago

Streamlined Retail Product Detection with YOLOv8 and FiftyOne

24:23

163 views, 1 month ago

Hands-On with Meta AI's CoTracker3: Parsing and Visualizing Point Tracking Output

14:48

101 views, 1 month ago

How We Built CoTracker3: Simpler and Better Point Tracking by Pseudo-Labeling Real Videos

20:46

246 views, 1 month ago

The NeurIPS 2024 Preshow: Creating SPIQA: Addressing the Limitations of Existing Datasets for VQA

35:17

479 views, 2 months ago

The NeurIPS 2024 Preshow: Zero-Shot Learning: A Misnomer?

26:38

324 views, 2 months ago

The NeurIPS 2024: A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis

26:17

144 views, 2 months ago

The NeurIPS 2024 Preshow NaturalBench Evaluating Vision Language Model on Natural Adversarial Sample

22:44

227 views, 2 months ago

Computer Vision Meetup: Do It Yourself LLMs

46:18

186 views, 2 months ago

The NeurIPS 2024 Preshow: A Label is Worth a Thousand Images in Dataset Distillation

17:47

415 views, 2 months ago

The NeurIPS 2024 Preshow: What matters when building vision-language models?

18:41

719 views, 2 months ago

ECCV 2024: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models

40:09

162 views, 2 months ago

ECCV Redux: Zero-shot Video Anomaly Detection: Leveraging LLMs for Rule-Based Reasoning

35:14

153 views, 2 months ago

ECCV 2024 Redux: Day 3- Closing the Gap Between Satellite & Street View Imagery Generative Models

24:10

116 views, 2 months ago

ECCV 2024 Redux: Day 3- High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians

33:34

438 views, 2 months ago

@cyberhard 15 hours ago
Excellent presentation! Thank you.
@RimaBegum-v3t 2 days ago
Hello sir, How are you? I am a regular viewer of yours. I follow your RUclips channel. I really like your videos. But your videos get very few views. Have you thought about this?
@gobajoseph5064 3 days ago
Great! Are these models available for the whole planet? Africa, for example?
@redforestx7371 2 months ago
Amazing work!
@redforestx7371 2 months ago
WOW. This looks so amazing. I can't wait to use this!
@adamaustad 2 months ago
Really interesting
@sylviaschmitt 2 months ago
Thank you for sharing the video. Does this plugin assume a vector engine like qdrant is used as backend?
@sylviaschmitt 2 months ago
Thank you for sharing this video on the Active Learning plugin. Is it possible to use the plugin for multi-class multi-label tasks as well?
@sai.sankarwork 3 months ago
When I try the dev install process in a git bash terminal, it fails at a point because of a package error. "Collecting shapely>=1.7.1 (from -r requirements\extras.txt (line 7)) Using cached shapely-2.0.6-cp312-cp312-win_amd64.whl.metadata (7.2 kB) ERROR: Could not find a version that satisfies the requirement open3d>=0.16.0 (from versions: none) ERROR: No matching distribution found for open3d>=0.16.0" How can this be solved?
@NishantRoy-h4d 3 months ago
Very interesting demo; would you mind sharing the Colab link?
@deemon101 4 months ago
ok, and how does one start it?
@AmeeliaK 4 months ago
This was very helpful! Llama Index grows so fast, it feels overwhelming for a beginner.
@BD_Gaming2013 5 months ago
!second comment
@SergeyPavlov-b4c 5 months ago
I want to work with my custom dataset. I'd like you to show me how to do it and which benefits I can get using your product. Examples, how can I refine my own data with fiftyone
@menghuitan1628 6 months ago
Isn't the "Grid Trick" similar to using ControlNet, a type of model for controlling image diffusion models by conditioning the model with an additional input image?
@ByTobys 6 months ago
Love your product!
@MohitAkhakharia 7 months ago
How do we execute the plugin logic in the code? This doesn't seem to work:
    logging.info("removing approximate duplicates")
    operator_uri = "@jacobmarks/image_deduplication/remove_all_approximate_duplicates"
    params = {
        "sim_choices": "sim",  # You may need to adjust this based on your similarity run key
        "threshold_value": 0.4
    }
    # Create an invocation request
    request = foe.InvocationRequest(operator_uri, params=params)
    # Create an executor and execute the request
    executor = foe.Executor(requests=[request])
    result = executor.trigger(operator_uri, params=params)
    print(result.to_json())
    # logging.info(f"Found approximate duplicates: {result.result}")
    return result
@alivirat6926 7 months ago
The video was great, thanks mate for the explanation.
@ashwinkumar5223 8 months ago
Wonderful 👍
@rishiraj2548 8 months ago
Great!
@HarisonRoberto 9 months ago
Are the search results only online images, or can they be local images?
@voxel51 8 months ago
you can drag and drop a local image in :)
@kai_harm942 9 months ago
Made that look *way* too easy. I spent a whole hour last night trying to get the first line of code to work! It was because my Python paths were thrown about the place
@ВладиславЛевчук-к3е 8 months ago
ahahahaha) me too)
@technologyencroyable 9 months ago
How do I build the JS part of the code to generate the umd.js file in the dist folder? I am building with yarn build, but the generated umd file is not working and does not open a new panel. Please help
@voxel51 9 months ago
Great question. Try `yarn install` as well. Make sure that the plugin is in your plugins directory. And when you want to change the plugin, make sure you use `yarn dev`. If you have more questions about FiftyOne Plugins, check out the #plugins channel in the FiftyOne community Slack! slack.voxel51.com/
@MyJunkEmail 10 months ago
great tutorial, can you use a local instance of SD?
@AlainPilon 1 year ago
It is great that you give us a list of next steps, but a link to each of these points would have been nice!
@ZixuWang-ul8hr 1 year ago
nice job!
@aimadnessbot 1 year ago
This is good! But I believe the data should also grab eye movement. Eye movement is crucial to map intention and will aid in robot navigation. Apple's headset has the hardware to monitor both eye direction and head direction.
@huynhphanngockhang5733 1 year ago
I have an idea for building an autonomous drone using computer vision to detect objects that were labeled with a GPS location beforehand.
@rezamahmoudi163 1 year ago
please slide share?
@SeedmancChitOKun 1 year ago
How does it select which images would be kept as "representatives" and which removed?
@aldem34 1 year ago
I want words like these intitle:"keyword" For better search efficiency for topics
@wata1991 1 year ago
Is it possible to use this and find the most similar image given user submitted photos? For example I'm trying to do something to detect trading cards, where the input would be photos of cards submitted by users.
@voxel51 1 year ago
absolutely!
@tyronetyrone2652 1 year ago
Hello, I downloaded and installed FiftyOne, but I don’t know how to use it. All your videos didn’t explain how to use it.
@ByTobys 6 months ago
There is lots of documentation online on their website, check it out! It's really not difficult to get it running, but it's "only" an API, so some Python experience is definitely helpful to get it running. :)
@davidgrayson181 1 year ago
I love this
@beiddouwang6643 1 year ago
good
@omarelsherif010 1 year ago
Thanks for clear explanation❤
@jasonwell5299 1 year ago
Thank you so much bro. Nice tutorial.
@divyanshnautiyal8110 1 year ago
getting Not Found
@robosergTV 1 year ago
Thanks, great overview
@ChrisWiggins1 1 year ago
Looks promising, I was going through your tutorial, and I was hoping to see how you can import your own database.
@ritagislason 1 year ago
🤩 Promo'SM
@vanessacrosbyfitzgerald 1 year ago
Can you perform the initial labeling on images that have not been annotated yet? On part 5 and I have not seen that information yet. Did I miss it?
@DigiDriftZone 1 year ago
Can you edit/correct or add/remove annotations directly in FiftyOne?
@sapsan1234 1 year ago
I am really excited about this product! Thank you for this hands-on video!
@akshayiitk4440 2 years ago
"Wow, this video is incredibly informative and well-produced! The speaker does a fantastic job of explaining the complex topic of speech recognition and the new Whisper model from OpenAI in a way that's easy to understand. Great job, highly recommended to anyone interested in this field!"
@magdalenakate6781 2 years ago
splendid 🙂✌️️️!! Find out how your competition ranks better = 'Promosm'!!
@AliHamza-ys8dt 2 years ago
how to add our own dataset into FiftyOne. I want to label my own data.
@ByTobys 6 months ago
As mentioned in the video, fiftyone isn't a classical annotation tool, but it provides hooks to do that with cvat, labelbox etc and then load the labeled data back into fiftyone. For me the cvat solution worked perfectly fine. Everything is perfectly documented on their website, check it out! :) If you want to load your annotation data which is in your own format, and not in a typical dataformat (COCO,...) you'll have to write a few lines of python codes yourself. For that purpose I have implemented a DatasetHandler-class. You'll have to convert into fiftyone-format by iterating through your data and turn them into fiftyone Detection-Objects: detections.append( fo.Detection(label=my_label, bounding_box=my_bbox) ) Fiftyone doesn't work "out of the box", but it's a great tool for working with CV-Data!
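The conversion described in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not the commenter's DatasetHandler class: the record layout (`filepath`, `image_size`, `objects`, `box_px`), the helper names, the dataset name, and the `ground_truth` field name are all hypothetical. It assumes custom boxes are stored as pixel corners, while FiftyOne expects `[x, y, width, height]` relative to the image size.

```python
def to_relative_bbox(box_px, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into
    [top_left_x, top_left_y, width, height] with values in [0, 1]."""
    x_min, y_min, x_max, y_max = box_px
    img_w, img_h = image_size
    return [
        x_min / img_w,
        y_min / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    ]


def build_dataset(records, name="my-custom-dataset"):
    """Build a FiftyOne dataset from records in a hypothetical custom format:
    {"filepath": str, "image_size": (w, h),
     "objects": [{"label": str, "box_px": (x0, y0, x1, y1)}, ...]}
    """
    import fiftyone as fo  # imported here so the helper above works without it

    samples = []
    for record in records:
        detections = [
            fo.Detection(
                label=obj["label"],
                bounding_box=to_relative_bbox(obj["box_px"], record["image_size"]),
            )
            for obj in record["objects"]
        ]
        sample = fo.Sample(filepath=record["filepath"])
        # "ground_truth" is an arbitrary field name chosen for this sketch
        sample["ground_truth"] = fo.Detections(detections=detections)
        samples.append(sample)

    dataset = fo.Dataset(name)
    dataset.add_samples(samples)
    return dataset
```

With a dataset built this way, `fo.launch_app(dataset)` should open it in the FiftyOne App for inspection.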

Voxel51

Comments