- 257 videos
- 97,213 views
Voxel51
United States
Joined Jun 8, 2019
Voxel51 is bringing transparency and clarity to the world's data. We build software that enables developers, scientists, and organizations to build high-quality datasets and computer vision models that power some of today's most remarkable machine learning and artificial intelligence applications.
Get open source FiftyOne: github.com/voxel51/fiftyone
Learn about FiftyOne Teams: voxel51.com/fiftyone-teams/
Computer Vision Meetup: Multiview Scene Graph
Motivated by how humans perceive scenes, we propose the Multiview Scene Graph (MSG) as a general topological scene representation. MSG constructs a place+object graph from unposed RGB images, and we provide novel metrics to evaluate the graph quality. We combine visual place recognition and object association to build MSG in one Transformer decoder model. We believe MSG can connect the dots across classic vision tasks to promote spatial intelligence and open new doors for topological 3D scene understanding.
Read the paper: arxiv.org/abs/2410.11187
About the Speaker:
Juexiao Zhang is a second-year PhD student in computer science at NYU Courant, advised by Professor Chen Feng. He is interested in l...
60 views
Videos
Computer Vision Meetup: Simple & Scalable Approach to Improve Vision Model Robustness to Corruptions
23 views · 4 hours ago
Deep neural networks perform exceptionally on clean images but face significant challenges with corrupted ones. While data augmentation with specific corruptions during training can improve model robustness to those particular distortions, this approach typically degrades performance on both clean images and corruptions not encountered during training. In this talk, we present a novel approach ...
Computer Vision Meetup: CLIP: Insights into Zero-Shot Image Classification with Mutual Knowledge
94 views · 4 hours ago
We interpret CLIP’s zero-shot image classification by examining shared textual concepts learned by its vision and language encoders. We analyze 13 CLIP models across various architectures, sizes, and datasets. The approach highlights a human-friendly way to understand CLIP’s classification decisions. Read the paper: arxiv.org/abs/2410.13016 Fawaz Sammani is a 2nd year PhD student at the Vrije ...
Computer Vision Meetup: Intrinsic Self-Supervision for Data Quality Audits
27 views · 4 hours ago
Benchmark datasets in computer vision often contain issues such as off-topic samples, near-duplicates, and label errors, compromising model evaluation accuracy. This talk will discuss SelfClean, a data-cleaning framework that leverages self-supervised representation learning and distance-based indicators to detect these issues effectively. By framing the task as a ranking or scoring problem, Se...
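The SelfClean pipeline itself is more involved, but the core distance-based ranking idea can be illustrated with a short sketch (a generic illustration, not the paper's code; `embeddings` is assumed to come from any self-supervised encoder):

import numpy as np

def rank_near_duplicate_pairs(embeddings: np.ndarray, top_k: int = 20):
    """Rank image pairs by cosine distance; the smallest distances are near-duplicate candidates."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    dists = 1.0 - normed @ normed.T              # pairwise cosine distances
    i, j = np.triu_indices(len(dists), k=1)      # unique pairs, excluding self-pairs
    order = np.argsort(dists[i, j])[:top_k]      # ascending distance = most duplicate-like first
    return list(zip(i[order], j[order], dists[i, j][order]))

# Usage: pairs = rank_near_duplicate_pairs(embeddings); inspect the top-ranked pairs manually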
Computer Vision Meetup: Map It Anywhere: Empowering BEV Map Prediction using Public Datasets
43 views · 9 hours ago
Top-down Bird’s Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more ...
Computer Vision Meetup: Understanding Bias in Large-Scale Visual Datasets
32 views · 9 hours ago
Truly general-purpose vision systems require pre-training on diverse and representative visual datasets. The “dataset classification” experiment reveals that modern large-scale visual datasets are still very biased: neural networks can achieve excellent accuracy in classifying which dataset an image is from. However, the concrete forms of bias among these datasets remain unclear. In this talk, ...
Computer Vision Meetup: No "Zero-Shot" Without Exponential Data
59 views · 9 hours ago
Web-crawled pretraining datasets underlie the impressive “zero-shot” evaluation performance of multimodal models. However, it is unclear how meaningful the notion of “zero-shot” generalization is for such multimodal models, as it is not known to what extent their pretraining datasets encompass the downstream concepts targeted during “zero-shot” evaluation. In this work, we ask: How is the p...
Computer Vision Meetup: An AI-Powered Teaching Assistant for Scalable and Adaptive Learning
51 views · 1 day ago
The future of education lies in personalized and scalable solutions, especially in fields like computer engineering where complex concepts often challenge students. This talk introduces Lumina (AI Teaching Assistant), a cutting-edge agentic system designed to revolutionize programming education through its innovative architecture and teaching strategies. Built using OpenAI API, LangChain, RAG, ...
Computer Vision Meetup: Scaling Semantic Segmentation with Blender
50 views · 1 day ago
Generating datasets for semantic segmentation can be time-intensive. Learn how to use Blender’s Python API to create diverse and realistic synthetic data with automated labels, saving time and improving model performance. Preview the topics to be discussed in this Medium post. About the Speaker: Vincent Vandenbussche has a PhD in Physics and is an author and Machine Learning Engineer with 10 years...
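For a flavor of the approach, here is a minimal sketch (assumptions: it runs inside Blender via the bpy module; the object naming scheme, pass indices, and output path are placeholders, not from the talk) that renders an RGB image together with an object-index pass, which can be converted into segmentation labels:

import bpy

scene = bpy.context.scene
scene.use_nodes = True
scene.view_layers[0].use_pass_object_index = True   # enables the "IndexOB" pass

# Give each semantic class a distinct pass index (object naming is hypothetical)
for obj in scene.objects:
    if obj.type == "MESH":
        obj.pass_index = 1 if "product" in obj.name.lower() else 2

# Route the render layers into file outputs: RGB image + raw index pass
tree = scene.node_tree
tree.nodes.clear()
rl = tree.nodes.new("CompositorNodeRLayers")
comp = tree.nodes.new("CompositorNodeComposite")
out = tree.nodes.new("CompositorNodeOutputFile")
out.base_path = "//renders"                          # relative to the .blend file
out.format.file_format = "OPEN_EXR"                  # keeps integer-valued indices intact
out.file_slots.new("IndexOB")
tree.links.new(rl.outputs["Image"], comp.inputs["Image"])
tree.links.new(rl.outputs["Image"], out.inputs["Image"])
tree.links.new(rl.outputs["IndexOB"], out.inputs["IndexOB"])

bpy.ops.render.render(write_still=True)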
Computer Vision Meetup: WACV 2025 - Elderly Action Recognition Challenge
43 views · 1 day ago
Join us for a quick update on the Elderly Action Recognition (EAR) Challenge, part of the Computer Vision for Smalls (CV4Smalls) Workshop at the WACV 2025 conference! This challenge focuses on advancing research in Activity of Daily Living (ADL) recognition, particularly within the elderly population, a domain with profound societal implications. Participants will employ transfer learning techn...
Computer Vision Meetup: Using Machine Vision to Create Sustainable Practices in Fisheries
66 views · 1 day ago
Fishing vessels are on track to generate 10 million hours of video footage annually, creating a massive machine learning operations challenge. At AI.Fish, we are building an end-to-end system enabling non-technical users to harness AI for catch monitoring and classification both on-board and in the cloud. This talk explores our journey in building these approachable systems and working toward a...
Visual AI for Geospatial: Evaluating Earth Observation Foundation Models
121 views · 1 day ago
Geospatial and Earth Observation have benefited from recent advances in computer vision. In this talk we are going to evaluate the accuracy and ease of use of two of these great new models - the Satlas and Clay foundational models. The evaluation will look at distinct areas around the globe. Come see how this gift of foundational models improves your work in geospatial or Earth observat...
Visual AI for Geospatial: Earth Monitoring for Everyone with Earth Index
184 views · 1 day ago
Earth Index is an end-user-focused application that preprocesses global imagery through AI foundation models to enable rapid in-browser search and monitoring. Earth Genome builds Earth Index for critical applications in the environment, and it is being used today to report on illegal airstrips built in the Peruvian Amazon, track cattle factory farms across the planet for emissions modeling, and exp...
Visual AI for Geospatial: Is AI Creating a Whole New Earth-Aware Geospatial Stack?
140 views · 1 day ago
The latest wave of AI innovation is profoundly changing many domains. In remote sensing, despite efforts like ours at Clay and others, it has been less so. In this talk we will share our experience as we realize, and explore, whether geoAI represents a whole new stack to work with Earth data. About the Speaker Dr. Bruno Sanchez-Andrade Nuno is the executive director of the non-profit project Clay, an...
The Frontier of VisualAI in Medical Imaging
194 views · 21 days ago
Explore the transformative potential of Visual AI in medical imaging with Daniel Gural, Machine Learning and Developer Relations expert at Voxel51. 🤖🩺 In this forward-looking talk, Daniel dives into the intersection of medicine and AI in 2025, addressing critical industry and research challenges: - Key Industry Challenges: Rising costs, doctor shortages, and time-intensive processes. - Key Rese...
How to Unlock More in Self Driving Datasets
195 views · 21 days ago
Streamlined Retail Product Detection with YOLOv8 and FiftyOne
163 views · 1 month ago
Hands-On with Meta AI's CoTracker3: Parsing and Visualizing Point Tracking Output
101 views · 1 month ago
How We Built CoTracker3: Simpler and Better Point Tracking by Pseudo-Labeling Real Videos
246 views · 1 month ago
The NeurIPS 2024 Preshow: Creating SPIQA: Addressing the Limitations of Existing Datasets for VQA
479 views · 2 months ago
The NeurIPS 2024 Preshow: Zero-Shot Learning: A Misnomer?
324 views · 2 months ago
The NeurIPS 2024: A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
144 views · 2 months ago
The NeurIPS 2024 Preshow: NaturalBench: Evaluating Vision Language Models on Natural Adversarial Samples
227 views · 2 months ago
Computer Vision Meetup: Do It Yourself LLMs
186 views · 2 months ago
The NeurIPS 2024 Preshow: A Label is Worth a Thousand Images in Dataset Distillation
415 views · 2 months ago
The NeurIPS 2024 Preshow: What matters when building vision-language models?
719 views · 2 months ago
ECCV 2024: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
162 views · 2 months ago
ECCV Redux: Zero-shot Video Anomaly Detection: Leveraging LLMs for Rule-Based Reasoning
153 views · 2 months ago
ECCV 2024 Redux: Day 3 - Closing the Gap Between Satellite & Street View Imagery Generative Models
116 views · 2 months ago
ECCV 2024 Redux: Day 3 - High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians
438 views · 2 months ago
Excellent presentation! Thank you.
Hello sir, how are you? I am a regular viewer of yours and follow your YouTube channel. I really like your videos, but they get very few views. Have you thought about this?
Great! Are these models available for the whole planet? Africa, for example?
Amazing work!
WOW. This looks so amazing. I can't wait to use this!
Really interesting
Thank you for sharing the video. Does this plugin assume a vector engine like qdrant is used as backend?
Thank you for sharing this video on the Active Learning plugin. Is it possible to use the plugin for multi-class multi-label tasks as well?
When I try the dev install process in a git bash terminal, it fails because of a package error:
Collecting shapely>=1.7.1 (from -r requirements\extras.txt (line 7))
  Using cached shapely-2.0.6-cp312-cp312-win_amd64.whl.metadata (7.2 kB)
ERROR: Could not find a version that satisfies the requirement open3d>=0.16.0 (from versions: none)
ERROR: No matching distribution found for open3d>=0.16.0
How can this be solved?
Very interesting demo; would you mind sharing the Colab link?
ok, and how does one start it?
This was very helpful! Llama Index grows so fast, it feels overwhelming for a beginner.
!second comment
I want to work with my custom dataset. I'd like you to show how to do that and what benefits I can get from using your product. For example, how can I refine my own data with FiftyOne?
Isn't the "Grid Trick" similar to using ControlNet, a type of model for controlling image diffusion models by conditioning the model with an additional input image?
Love your product!
How do we execute the plugin logic in code? This doesn't seem to work:
logging.info("removing approximate duplicates")
operator_uri = "@jacobmarks/image_deduplication/remove_all_approximate_duplicates"
params = {
    "sim_choices": "sim",  # You may need to adjust this based on your similarity run key
    "threshold_value": 0.4,
}
# Create an invocation request
request = foe.InvocationRequest(operator_uri, params=params)
# Create an executor and execute the request
executor = foe.Executor(requests=[request])
result = executor.trigger(operator_uri, params=params)
print(result.to_json())
# logging.info(f"Found approximate duplicates: {result.result}")
return result
The video was great, thanks mate for the explanation.
Wonderful 👍
Great!
Are the search results only online images, or can they be local images?
you can drag and drop a local image in :)
Made that look *way* too easy. I spent a whole hour last night trying to get the first line of code to work! It was because my Python paths were thrown about the place
ahahahaha) me too)
How do I build the JS part of the code to generate the umd.js file in the dist folder? I am building with yarn build, but the generated umd file is not working and does not open a new panel. Please help.
Great question. Try `yarn install` as well. Make sure that the plugin is in your plugins directory. And when you want to change the plugin, make sure you use `yarn dev`. If you have more questions about FiftyOne Plugins, check out the #plugins channel in the FiftyOne community Slack! slack.voxel51.com/
great tutorial, can you use a local instance of SD?
It is great that you give us a list of next steps, but a link to each of these points would have been nice!
nice job!
This is good! But I believe the data should also capture eye movement. Eye movement is crucial for mapping intention and will aid in robot navigation. Apple's headset has the hardware to monitor both eye direction and head direction.
I have an idea for building an autonomous drone that uses computer vision to detect objects that were previously labeled with a GPS location.
Could you please share the slides?
How does it select which images would be kept as "representatives" and which removed?
I want words like these intitle:"keyword" For better search efficiency for topics
Is it possible to use this to find the most similar image given user-submitted photos? For example, I'm trying to detect trading cards, where the input would be photos of cards submitted by users.
absolutely!
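For anyone curious what that could look like in code, here is a rough sketch using the FiftyOne Brain similarity API (the dataset name, brain key, and file path are placeholders, and the CLIP zoo model is just one choice of embedding model):

import numpy as np
from PIL import Image

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = fo.load_dataset("trading-cards")   # your indexed card catalog (placeholder name)

# One-time step: build a similarity index over the catalog
fob.compute_similarity(dataset, model="clip-vit-base32-torch", brain_key="card_sim")

# Embed a user-submitted photo with the same model and query the index
model = foz.load_zoo_model("clip-vit-base32-torch")
query = model.embed(np.asarray(Image.open("/path/to/user_photo.jpg")))

matches = dataset.sort_by_similarity(query, k=5, brain_key="card_sim")
session = fo.launch_app(matches)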
Hello, I downloaded and installed FiftyOne, but I don’t know how to use it. All your videos didn’t explain how to use it.
There is lots of documentation on their website, check it out! It's really not difficult to get it running, but it's "only" an API, so some Python experience is definitely helpful to get it running. :)
I love this
good
Thanks for clear explanation❤
Thank you so much bro. Nice tutorial.
getting Not Found
Thanks, great overview
Looks promising. I was going through your tutorial and was hoping to see how you can import your own database.
Can you perform the initial labeling on images that have not been annotated yet? I'm on part 5 and have not seen that information yet. Did I miss it?
Can you edit/correct or add/remove annotations directly in FiftyOne?
I am really excited about this product! Thank you for this hands-on video!
"Wow, this video is incredibly informative and well-produced! The speaker does a fantastic job of explaining the complex topic of speech recognition and the new Whisper model from OpenAI in a way that's easy to understand. Great job, highly recommended to anyone interested in this field!"
How do we add our own dataset into FiftyOne? I want to label my own data.
As mentioned in the video, FiftyOne isn't a classical annotation tool, but it provides hooks to do that with CVAT, Labelbox, etc., and then load the labeled data back into FiftyOne. For me the CVAT solution worked perfectly fine. Everything is well documented on their website, check it out! :) If you want to load annotation data that is in your own format rather than a typical data format (COCO, ...), you'll have to write a few lines of Python code yourself. For that purpose I have implemented a DatasetHandler class. You'll have to convert to FiftyOne format by iterating through your data and turning it into FiftyOne Detection objects: detections.append( fo.Detection(label=my_label, bounding_box=my_bbox) ) FiftyOne doesn't work "out of the box", but it's a great tool for working with CV data!
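To make that loop concrete, here is a minimal sketch (the dataset name, field name "ground_truth", and the shape of my_data are made up for illustration; boxes are relative [x, y, width, height] as FiftyOne expects):

import fiftyone as fo

# Hypothetical stand-in for your own annotation format
my_data = [
    ("/path/to/img1.jpg", [{"label": "cat", "bbox": [0.1, 0.2, 0.3, 0.4]}]),
]

samples = []
for image_path, annotations in my_data:
    sample = fo.Sample(filepath=image_path)
    detections = [
        fo.Detection(label=ann["label"], bounding_box=ann["bbox"])  # bbox as relative [x, y, w, h]
        for ann in annotations
    ]
    sample["ground_truth"] = fo.Detections(detections=detections)
    samples.append(sample)

dataset = fo.Dataset("my-custom-dataset")
dataset.add_samples(samples)
session = fo.launch_app(dataset)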