Voxel51
  • Videos: 229
  • Views: 85,303
ECCV 2024: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
In this talk, I will introduce our recent work on open-vocabulary 3D semantic understanding. We propose a novel method, namely Diff2Scene, which leverages frozen representations from text-image generative models, for open-vocabulary 3D semantic segmentation and visual grounding tasks. Diff2Scene gets rid of any labeled 3D data and effectively identifies objects, appearances, locations and their compositions in 3D scenes.
ECCV 2024 Paper: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
arxiv.org/abs/2407.13642
About the Speaker
Xiaoyu Zhu is a Ph.D. student at Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Her research inte...
Views: 80

Videos

ECCV Redux: Zero-shot Video Anomaly Detection: Leveraging LLMs for Rule-Based Reasoning
85 views, 14 hours ago
Video Anomaly Detection (VAD) is critical for applications such as surveillance and autonomous driving. However, existing methods lack transparent reasoning, limiting public trust in real-world deployments. We introduce a rule-based reasoning framework that leverages Large Language Models (LLMs) to induce detection rules from few-shot normal samples and apply them to identify anomalies, incorpo...
ECCV 2024 Redux: Day 3- Closing the Gap Between Satellite & Street View Imagery Generative Models
60 views, 16 hours ago
Closing the Gap Between Satellite and Street-View Imagery Using Generative Models. With the growing availability of satellite imagery (e.g., Google Earth), nearly every part of the world can be mapped, though street-view images remain limited. Creating street views from satellite data is crucial for applications like virtual model generation, media content enhancement, 3D gaming, and simulations...
ECCV 2024 Redux: Day 3- High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians
193 views, 16 hours ago
In just over a year, 3D Gaussian Splatting (3DGS) has made waves in computer vision for its remarkable speed, simplicity, and visual quality. Yet, even scenes of a single room can exceed a gigabyte in size, making it difficult to scale up to larger environments, like city blocks. In this talk, we’ll explore compression techniques to reduce the 3DGS memory footprint. We’ll dive deeply into our n...
ECCV 2024 Redux: Day 3- Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Seg.
40 views, 16 hours ago
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures. We present Skeleton Recall Loss, a novel loss function for topologically accurate and efficient segmentation of thin, tubular structures, such as roads, nerves, or vessels. By circumventing expensive GPU-based operations, we reduce computational overheads by up to 90% compared to the ...
ECCV 2024 Redux: Day 1 - Tree-of-Life Meets AI
50 views, 21 hours ago
A central challenge in biology is understanding how organisms evolve and adapt to their environment, acquiring variations in observable traits across the tree of life. However, measuring these traits is often subjective and labor-intensive, making trait discovery a highly label-scarce problem. With the advent of large-scale biological image repositories and advances in generative modeling, ther...
ECCV 2024 Redux: Day 1 - Robust Calibration of Large Vision-Language Adapters
51 views, 21 hours ago
We empirically demonstrate that popular CLIP adaptation approaches, such as Adapters, Prompt Learning, and Test-Time Adaptation, substantially degrade the calibration capabilities of the zero-shot baseline in the presence of distributional drift. We identify the increase in logit ranges as the underlying cause of miscalibration of CLIP adaptation methods, contrasting with previous work on calib...
ECCV 2024 Redux: Day 1 - Fast and Photo-realistic Novel View Synthesis from Sparse Images
45 views, 21 hours ago
Novel view synthesis generates new perspectives of a scene from a set of 2D images, enabling 3D applications like VR/AR, robotics, and autonomous driving. Current state-of-the-art methods produce high-fidelity results but require a lot of images, while sparse-view approaches often suffer from artifacts or slow inference. In this talk, I will present my research work focused on developing fast a...
Computer Vision Meetup: Deploying ML models on Edge Devices using Qualcomm AI Hub
80 views, a day ago
In this talk we address the common challenges faced by developers migrating AI workloads from the cloud to edge devices. Qualcomm aims to democratize AI at the edge, easing the transition to the edge by supporting familiar frameworks and data types. This is where Qualcomm AI Hub comes in. Developers can follow along, gaining knowledge and tools to efficiently deploy optimized models on real de...
Computer Vision Meetup: Human-in-the-loop: Practical Lessons for Building Comprehensive AI Systems
110 views, a day ago
AI systems often struggle with data limitations, data distribution shift over time, and a poor user experience. Human-in-the-loop design offers a solution by placing users at the center of AI systems and leveraging human feedback for continuous improvement. In this talk, we’ll dive deeply into a recent project at Merantix Momentum: An interactive tool for automatic rodent behaviour analysis in v...
Computer Vision Meetup: Curating Excellence: Strategies for Optimizing Visual AI Datasets
100 views, a day ago
In this talk Harpreet will discuss common challenges plaguing visual AI datasets, their impact on model performance, and share some tips and tricks for curating datasets to make the most of any compute budget or network architecture. Speaker: Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG...
Computer Vision Meetup: PostgreSQL for Innovative Vector Search
63 views, a month ago
There are a plethora of datastores that can work with vector embeddings. You are probably already running one that allows for innovative uses of data alongside your embeddings - PostgreSQL! This talk will focus on showing examples of how features already present in the PostgreSQL ecosystem allow you to leverage it for cutting edge use cases. Live demos and lively discussion will be the focus of...
Computer Vision Meetup: Pixels Are All You Need Utilizing 2D Image Representation in Robotics
227 views, a month ago
Many vision-based robot control applications (like those in manufacturing) require 3D estimates of task-relevant objects, which can be realized by training a direct 3D object detection model. However, obtaining 3D annotation for a specific application is expensive relative to 2D object representations like segmentation masks or bounding boxes. In this talk, Brent will describe how we achieve mo...
Computer Vision Meetup: Accelerating Machine Learning Research and Development for Autonomy
242 views, a month ago
At Oxa (Autonomous Vehicle Software), we designed an automated workflow for building machine vision models at scale from data collection to in-vehicle deployment, involving a number of steps, such as, intelligent route planning to maximise visual diversity; sampling of the sensor data w.r.t. visual and semantic uniqueness; language-driven automated annotation tools and multi-modal search engine...
Computer Vision Meetup: Using Elasticsearch Vector Search in FiftyOne
110 views, a month ago
In this short demo, Steve Pousty (Developer Advocate at Voxel51) shows you how to leverage Elastic’s vector search capabilities for computer vision use cases using the FiftyOne open source library. Not a Meetup member? Sign up to attend the next event: voxel51.com/computer-vision-ai-meetups/ Recorded on Oct 10, 2024 at the AI, Machine Learning and Computer Vision Meetup. #computervision ...
Computer Vision Meetup: Elastic is for the Birds: Identifying Embedding Images using Vector Search
100 views, a month ago
Computer Vision Meetup: RGB-X Model Development: Exploring Four Channel ML Workflows
109 views, a month ago
Computer Vision Meetup: How Renault Leveraged Machine Learning to Scale Electric Vehicle Sales
140 views, a month ago
Scaling Industrial AI with FiftyOne
96 views, 2 months ago
Computer Vision Meetup: GPUs at Scale - Trials of a GPUaaS Provider
82 views, 2 months ago
Visual AI in Healthcare: NVIDIA’s VISTA-3D and MedSAM-2 Medical Imaging Models
473 views, 2 months ago
Visual AI in Healthcare: Exploring Instance Imbalance in Medical Semantic Segmentation
94 views, 2 months ago
Visual AI in Healthcare: Advancing Comparative Computational AI in Veterinary Oncology
138 views, 2 months ago
Visual AI in Healthcare: Interpretable AI Models in Radiology
236 views, 2 months ago
Computer Vision Meetup: It's in the Air Tonight. Sensor Data in RAG
149 views, 2 months ago
Computer Vision Meetup: Data-Centric AI Competition on Hugging Face Spaces
56 views, 2 months ago
Computer Vision Meetup: Reducing Hallucinations in ChatGPT and Similar AI Systems
147 views, 2 months ago
Computer Vision Meetup: Accelerating Multimodal RAG Pipelines with NVIDIA and OSS Integrations
175 views, 2 months ago
Computer Vision Meetup: 5 Handy Ways to Use Embeddings, the Swiss Army Knife of AI
77 views, 2 months ago
Computer Vision Meetup: Agentic RAG in 2024
526 views, 2 months ago

Comments

  • @redforestx7371, 7 days ago

    Amazing work!

  • @redforestx7371, 7 days ago

    WOW. This looks so amazing. I can't wait to use this!

  • @ajwaus, 9 days ago

    Really interesting

  • @sylviaschmitt, 10 days ago

    Thank you for sharing the video. Does this plugin assume a vector engine like qdrant is used as backend?

  • @sylviaschmitt, 10 days ago

    Thank you for sharing this video on the Active Learning plugin. Is it possible to use the plugin for multi-class multi-label tasks as well?

  • @sai.sankarwork, 24 days ago

    When I try the dev install process in a git bash terminal, it fails at a point because of a package error. "Collecting shapely>=1.7.1 (from -r requirements\extras.txt (line 7)) Using cached shapely-2.0.6-cp312-cp312-win_amd64.whl.metadata (7.2 kB) ERROR: Could not find a version that satisfies the requirement open3d>=0.16.0 (from versions: none) ERROR: No matching distribution found for open3d>=0.16.0" How can this be solved?

  • @NishantRoy-h4d, a month ago

    Very interesting demo; would you mind sharing the Colab link?

  • @deemon101, a month ago

    ok, and how does one start it?

  • @AmeeliaK, 2 months ago

    This was very helpful! Llama Index grows so fast, it feels overwhelming for a beginner.

  • @BD_Gaming2013, 2 months ago

    !second comment

  • @SergeyPavlov-b4c, 3 months ago

    I want to work with my custom dataset. I'd like you to show me how to do it and which benefits I can get using your product. Examples, how can I refine my own data with fiftyone

  • @menghuitan1628, 3 months ago

    Isn't the "Grid Trick" similar to using ControlNet, a type of model for controlling image diffusion models by conditioning the model with an additional input image?

  • @ByTobys, 4 months ago

    Love your product!

  • @MohitAkhakharia, 4 months ago

    How do we execute the plugin logic in the code? This doesn't seem to work:

        logging.info("removing approximate duplicates")
        operator_uri = "@jacobmarks/image_deduplication/remove_all_approximate_duplicates"
        params = {
            "sim_choices": "sim",  # You may need to adjust this based on your similarity run key
            "threshold_value": 0.4
        }
        # Create an invocation request
        request = foe.InvocationRequest(operator_uri, params=params)
        # Create an executor and execute the request
        executor = foe.Executor(requests=[request])
        result = executor.trigger(operator_uri, params=params)
        print(result.to_json())
        # logging.info(f"Found approximate duplicates: {result.result}")
        return result

  • @alivirat6926, 5 months ago

    The video was great, thanks mate for the explanation.

  • @ashwinkumar5223, 5 months ago

    Wonderful 👍

  • @rishiraj2548, 5 months ago

    Great!

  • @HarisonRoberto, 6 months ago

    Are the search results only online images, or can they be local images?

    • @voxel51, 6 months ago

      you can drag and drop a local image in :)

  • @kai_harm942, 6 months ago

    Made that look *way* too easy. I spent a whole hour last night trying to get the first line of code to work! It was because my Python paths were thrown about the place

  • @technologyencroyable, 7 months ago

    How do I build the JS part of the code to generate the umd.js file in the dist folder? I am building with yarn build, but the generated umd file is not working and does not open a new panel. Please help.

    • @voxel51, 7 months ago

      Great question. Try `yarn install` as well. Make sure that the plugin is in your plugins directory. And when you want to change the plugin, make sure you use `yarn dev`. If you have more questions about FiftyOne Plugins, check out the #plugins channel in the FiftyOne community Slack! slack.voxel51.com/

  • @MyJunkEmail, 8 months ago

    great tutorial, can you use a local instance of SD?

  • @AlainPilon, 9 months ago

    It is great that you give us a list of next steps, but a link to each of these points would have been nice!

  • @ZixuWang-ul8hr, 9 months ago

    nice job!

  • @aimadnessbot, 10 months ago

    This is good! But i believe the data should also grab eye movement. Eye movement is crucial to map intention and will aid in robot navigation. Apple's headset has the hardware to monitor both eye direction and head direction.

  • @huynhphanngockhang5733, 10 months ago

    I have an idea for building an autonomous drone that uses computer vision to detect objects previously labeled with a GPS location.

  • @rezamahmoudi163, 11 months ago

    please slide share?

  • @SeedmancChitOKun, 11 months ago

    How does it select which images would be kept as "representatives" and which removed?

  • @aldem34, 11 months ago

    I want words like these intitle:"keyword" For better search efficiency for topics

  • @wata1991, a year ago

    Is it possible to use this and find the most similar image given user submitted photos? For example I'm trying to do something to detect trading cards, where the input would be photos of cards submitted by users.

  • @tyronetyrone2652, a year ago

    Hello, I downloaded and installed FiftyOne, but I don’t know how to use it. All your videos didn’t explain how to use it.

    • @ByTobys, 4 months ago

      There is lots of documentation online on their website, check it out! It's really not difficult to get it running, but it's "only" an API, so some Python experience is definitely helpful to get it running. :)

  • @davidgrayson181, a year ago

    I love this

  • @beiddouwang6643, a year ago

    good

  • @omarelsherif010, a year ago

    Thanks for clear explanation❤

  • @jasonwell5299, a year ago

    Thank you so much bro. Nice tutorial.

  • @divyanshnautiyal8110, a year ago

    getting Not Found

  • @robosergTV, a year ago

    Thanks, great overview

  • @ChrisWiggins1, a year ago

    Looks promising. I was going through your tutorial, and I was hoping to see how you can import your own database.

  • @ritagislason, a year ago

    🤩 Promo'SM

  • @vanessacrosbyfitzgerald, a year ago

    Can you perform the initial labeling on images that have not been annotated yet? On part 5 and I have not seen that information yet. Did I miss it?

  • @DigiDriftZone, a year ago

    Can you edit/correct or add/remove annotations directly in FiftyOne?

  • @sapsan1234, a year ago

    I am really excited about this product! Thank you for this hands-on video!

  • @akshayiitk4440, a year ago

    "Wow, this video is incredibly informative and well-produced! The speaker does a fantastic job of explaining the complex topic of speech recognition and the new Whisper model from OpenAI in a way that's easy to understand. Great job, highly recommended to anyone interested in this field!"

  • @magdalenakate6781, a year ago

    splendid 🙂✌️️️!! Find out how your competition ranks better = 'Promosm'!!

  • @AliHamza-ys8dt, a year ago

    how to add our own dataset into FiftyOne. I want to label my own data.

    • @ByTobys, 4 months ago

      As mentioned in the video, fiftyone isn't a classical annotation tool, but it provides hooks to do that with cvat, labelbox etc and then load the labeled data back into fiftyone. For me the cvat solution worked perfectly fine. Everything is perfectly documented on their website, check it out! :) If you want to load your annotation data which is in your own format, and not in a typical dataformat (COCO,...) you'll have to write a few lines of python codes yourself. For that purpose I have implemented a DatasetHandler-class. You'll have to convert into fiftyone-format by iterating through your data and turn them into fiftyone Detection-Objects: detections.append( fo.Detection(label=my_label, bounding_box=my_bbox) ) Fiftyone doesn't work "out of the box", but it's a great tool for working with CV-Data!
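The conversion step described in the reply above can be sketched roughly as follows. This is a minimal illustration, not the commenter's DatasetHandler class: it assumes the custom annotations store pixel-space corner boxes, and the helper `to_relative_xywh` and the `record` layout are hypothetical names introduced only for this example. The one FiftyOne-specific fact relied on is that `bounding_box` on a `fo.Detection` is given as `[top-left-x, top-left-y, width, height]` in relative coordinates between 0 and 1.

```python
def to_relative_xywh(box_xyxy, image_width, image_height):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to
    relative [top-left-x, top-left-y, width, height] values in [0, 1]."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, x_max, y_max = box_xyxy
    return [
        x_min / image_width,
        y_min / image_height,
        (x_max - x_min) / image_width,
        (y_max - y_min) / image_height,
    ]


# Hypothetical record in a custom annotation format (an assumption for this sketch)
record = {"label": "cat", "box": (50, 100, 250, 300), "width": 500, "height": 400}

my_label = record["label"]
my_bbox = to_relative_xywh(record["box"], record["width"], record["height"])
print(my_label, my_bbox)
```

The resulting `my_label` and `my_bbox` are the values that would then be passed to `fo.Detection(label=my_label, bounding_box=my_bbox)` as in the reply. Keeping the conversion in a plain function makes it easy to check without FiftyOne installed.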

  • @Himakarbavikaty, a year ago

    Hi I am getting the following error in colab and jupyter notebook with custom data and coco 2017 (default data) MalformedQueryException: Cannot attach/detach dataset to/from a batch project Kindly help me to solve this issue

  • @vernenfelcher6442, 2 years ago

    𝐩яⓞ𝓂𝓞Ş𝐦

  • @Kk-vx1id, 2 years ago

    Hi guys, how are you? How to change the font on the interface of fiftyone, I hope to get your reply!

  • @CannibalWarthog, 2 years ago

    An installation tutorial would be nice