- Videos 55
- Views 117,986
AIology
Canada
Joined 2 Mar 2018
I create content about recent deep learning algorithms.
Disclaimer: All content is my own and does not reflect my current, past, or future employers.
For more information:
scholar.google.com/citations?user=AzvrVEwAAAAJ&hl=en
www.linkedin.com/in/mr-mohebbian-1a236952/
10-minute paper (episode 32): A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Current Large Language Models (LLMs) are not only limited to some maximum context length, but also are not able to robustly consume long inputs. To address these limitations, ReadAgent is introduced as an LLM agent system that increases effective context length up to 20x.
Inspired by how humans interactively read long documents, ReadAgent (1) decides what content to store together in a memory episode, (2) compresses those memory episodes into short episodic memories called gist memories, and (3) takes actions to look up passages in the original text if ReadAgent needs to remind itself of relevant details to complete a task.
ReadAgent is evaluated against baselines using retrieval methods, u...
Views: 167
Videos
10-minute paper (episode 31): Self-Supervised Learning from Images I-JEPA
Views 937 · 11 months ago
In this groundbreaking paper, we unveil a cutting-edge approach to learning highly semantic image representations without the need for painstakingly crafted data augmentations. Introducing the Image-based Joint-Embedding Predictive Architecture (I-JEPA), a non-generative method for self-supervised learning from images, we unlock new possibilities in the world of computer vision.
Mojo - First impression
Views 7K · 1 year ago
Recently, the new language "Mojo" by Modular has been garnering significant interest. As of today, 7th September 2023, it has been publicly released. In the following video, I'll be coding to provide my initial impressions. Additionally, I'll discuss licensing concerns and hurdles Mojo must address to be a viable AI language option in the future. Lastly, I'll introduce Zig. While I'm not diving...
10-minute paper (episode 30): CoLT5 (Part 1): Faster, Long-Range Transformers
Views 454 · 1 year ago
Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. However, not all tokens are equally important, especially for longer documents. CoLT5, a long-input Transformer model that builds on this intuition by ...
10 minutes paper (episode 29): Table Extraction
Views 1.2K · 1 year ago
In recent advancements in machine learning, extracting structured tables from unstructured documents has seen significant progress. However, a key challenge has been creating large-scale datasets with accurate ground truth. To tackle this, the authors introduced PubTables-1M, a comprehensive dataset with nearly one million tables from scientific articles. It supports various input types, offers det...
10 minutes paper (episode 28): ALiBi; Train Short, Test Long
Views 213 · 1 year ago
ALiBi proposes to simply apply a static linear bias to the attention matrix. The authors show this is not only effective as a relative positional encoding, but also allows the attention net to extrapolate to longer sequence lengths than it was trained on, for autoregressive language models. In this video we walk through the paper, write some small code and look at the x-transformers library. Plea...
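As a rough illustration of the static linear bias described above, here is a minimal plain-Python sketch. It assumes a power-of-two number of heads and uses the geometric head slopes suggested in the ALiBi paper; the function names are made up for this example and are not taken from x-transformers. The causal mask is folded into the same matrix, since the description concerns autoregressive models.

```python
def alibi_slopes(num_heads):
    """Head slopes 2^(-8/n), 2^(-16/n), ... (assumes a power-of-two head count)."""
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """bias[h][i][j] = -slope_h * (i - j) for j <= i, and -inf for future positions."""
    neg_inf = float("-inf")
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        bias.append(head)
    return bias


# The bias is added to the query-key scores before the softmax.
# It depends only on the distance i - j, so it can be built for any seq_len.
bias = alibi_bias(num_heads=8, seq_len=4)
print(bias[0][3])
```

Because the bias is not learned and is defined for any distance, the same function can produce a matrix for a sequence longer than those seen in training, which is the extrapolation property the description refers to.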
Ray (Episode 4): Deploying 7B GPT using Ray
Views 369 · 1 year ago
In the video presentation, I will delve into the topic of Ray Serve, showcasing its capabilities in serving machine learning models, particularly focusing on causal language models and other models built using popular frameworks like Transformers, PyTorch, and TensorFlow. This technology provides an efficient and scalable solution for deploying and managing such models in production environments.
Ray (Episode 3): Memory management in Ray Object Store
Views 525 · 1 year ago
In this video, we show how to put large objects into Ray's object store and get them back, so that every actor can access them by reference, which reduces latency. Then we talk about how to free memory. At the end, we walk through some fun examples of reference and value management in Python and Ray.
Ray (Episode 2): Actor models
Views 278 · 1 year ago
Ray is an open-source distributed computing framework designed to make it easier to build and scale applications that require parallel and distributed processing. It provides a set of tools and libraries that help developers write efficient and high-performance applications across a variety of domains, from machine learning and data processing to reinforcement learning and beyond. In this seri...
Ray (Episode 1): Remote function
Views 446 · 1 year ago
Ray is an open-source distributed computing framework designed to make it easier to build and scale applications that require parallel and distributed processing. It provides a set of tools and libraries that help developers write efficient and high-performance applications across a variety of domains, from machine learning and data processing to reinforcement learning and beyond. In this seri...
10 minutes paper (episode 27): LLM powered autonomous agents
Views 1K · 1 year ago
Imagine a world where AI isn't just smart - it's genius. In this video we walk through the requirements for creating an autonomous agent powered by an LLM, following Lilian Weng's post. An agent should implement the following components as efficiently as possible: 📚 Planning: Agents excel at breaking down big tasks into manageable subgoals, and they're self-reflective learners, continuously improving by learning from their own...
10 minutes paper (episode 26): Multi-Grained Vision Language Pre-Training: X-VLM
Views 475 · 1 year ago
Introducing X-VLM, a groundbreaking method in vision language pretraining that overcomes the limitations of existing approaches. Traditional methods heavily rely on object-centric features extracted through object detection, which struggle to capture relationships among multiple objects. X-VLM introduces multi-grained alignments by locating visual concepts in images based on associated texts an...
10 minutes paper (episode 25): Low Rank Adaptation: LoRA
Views 958 · 1 year ago
In this video, we'll explore the challenges posed by full fine-tuning of massive language models, such as GPT-3 175B, and the prohibitive costs associated with deploying independent instances of such models. LoRA utilizes an innovative approach by freezing the pretrained model weights and introducing trainable rank decomposition matrices into each layer of the Transformer architecture. This ing...
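The idea of freezing the pretrained weights and training a low-rank update can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the matrix sizes, the `alpha` value, and the helper names are hypothetical, and the initialisation (small Gaussian `a`, zero `b`, update scaled by `alpha / rank`) follows the scheme described in the LoRA paper rather than the code shown in the video.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B A, where W stays frozen and only A, B are trained."""
    rank = len(a)
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / rank
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


random.seed(0)
d_out, d_in, rank = 6, 8, 2  # toy sizes, chosen only for illustration
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]     # frozen
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]   # trainable
b = [[0.0] * rank for _ in range(d_out)]                                  # trainable, zero init

w_eff = lora_effective_weight(w, a, b, alpha=4)
full_params = d_out * d_in           # parameters updated by full fine-tuning
lora_params = rank * (d_in + d_out)  # parameters updated by the low-rank adapter
print(full_params, lora_params)
```

With `b` initialised to zero, the effective weight equals the pretrained weight at the start of training, and the saving in trainable parameters grows as the layer dimensions become large relative to the rank.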
AI-Code-Mastery (Episode 8): Fine-Tuning MPT-7B by Single GPU | Open-Source and Commercializable
Views 26K · 1 year ago
I am excited to bring you a comprehensive step-by-step guide on how to fine-tune the newly announced MPT-7B model using just a single GPU. This remarkable model is not only open source but also commercializable, making it a valuable tool for a wide range of natural language processing (NLP) tasks. MPT's context length beats GPT-4's, and it also outperformed many available language models like G...
10 minutes paper (episode 24): ViperGPT
Views 555 · 1 year ago
ViperGPT is a framework that enables complex visual queries by combining vision and language models into subroutines. This framework uses code generation models to create Python code that is later executed, and utilizes a provided API to access available modules. ViperGPT achieves state-of-the-art results in various complex visual tasks without the need for further training. This approach offer...
10 minutes paper (episode 23): Unlocking Full Potential of Language Models with Chain-of-Thought
Views 703 · 1 year ago
AI-Code-Mastery (Episode 7): Text2Image using Diffusion Model and ControlNet
Views 673 · 1 year ago
AI-Code-Mastery (Episode 6): Explore the Best ML Challenge Websites and Datasets
Views 402 · 1 year ago
AI-Code-Mastery (Episode 5): Zero-Shot document question answering with Flan-ULv2
Views 6K · 1 year ago
AI-Code-Mastery (Episode 4): Torch Drug
Views 1.1K · 1 year ago
10 minutes paper (episode 22); Beyond neural scaling laws
Views 716 · 1 year ago
10 minutes paper (episode 21); LayoutReader
Views 563 · 1 year ago
10 minutes paper (episode 20); InstructGPT
Views 11K · 1 year ago
AI-Code-Mastery (Episode 2): alert, diffdir, conv, ocr
Views 189 · 1 year ago
10 minutes paper (episode 19); ConvNeXt: A ConvNet for the 2020s
Views 987 · 2 years ago
10 minutes paper (episode 18); Similarity of Neural Network Representations Revisited
Views 603 · 2 years ago
10 minutes paper (episode 17); Micro-Batch Training
Views 207 · 2 years ago
amazing video!
Your code is not TabNet, it is TabTransformer; they are different papers.
overall presentation needs improvement. Your accent isn't that hard to understand but your mic is tinny and I don't want to see your browser or desktop.
Thank you for sharing this short and concise video. It is helpful to understand knowledge distillation and for showing an example in code and for also walking through the code examples.
Here, are you tuning a text completion model? What about the instruct or chat models?
Thank you!!! I love every bit of the video. The fact that you didn't disrupt the lecture with "...Like, comment, share, and subscribe to the channel..." in 59 minutes till the end of the video shows how passionate you are to just share your knowledge. Thank you again. Could you cover the Molecule Generation Tasks? That will be very helpful to my current project.
Can you please share your jupyter notebook for tutoring/experimenting? Thanks.
Where can I find part2?
Great Cython introduction video! in the video you mentioned you're going to go in depth with Cython's features...is there a follow up Cython video to this one? Would love to see Cython's implementations either in data analytics (Pandas) or AI-related (pytorch, llamaindex, etc). Liked and subbed!
There are some strange screeching sounds whenever you speak...
well stated sir. thanks for brief explanation and few source you presented. i hope we can appreciate your work and time you spent here by expanding this course and knowledge. btw i think you are persian, so i might say "daste shoma dard nakone" :)
What if we don't have access to the teacher's data?
eigen vector ---> it is pronounced "aygen vektor".
Could you please explain the code of this paper as well
🎯 Key Takeaways for quick navigation:
00:00 📚 *Introduction to Probability Calibration*
- Brief overview of why probability calibration is important.
- Explanation of what probability calibration is.
- Mention of examples to be covered in Jupyter Notebook.
01:32 🤔 *Why Probability Calibration is Needed*
- Discussion on the unreliability of probability values from classifiers.
- Importance of probability calibration for cases where percentages matter.
- Example scenarios where calibrated probabilities are crucial.
03:30 📊 *Example: Calibration Process with 10 Data Points*
- Introduction to a dataset with 10 data points for calibration.
- Explanation of binning and cross-validation in probability calibration.
- Visualization of the calibration curve for the given dataset.
07:42 📈 *Example: Calibration Process with 100 Data Points*
- Overview of a dataset with 100 data points.
- Observation of calibration histogram peaks due to classifier assumptions.
- Discussion and visualization of the calibration plot for the 100-data-point dataset.
09:36 🔧 *Implementation of Probability Calibration in scikit-learn*
- Introduction to two ways of implementing probability calibration: with prefix and without prefix.
- Explanation of the code snippet for probability calibration implementation.
- Overview of the scikit-learn library for calibration.
12:17 🖥️ *Implementation in Jupyter Notebook*
- Importing necessary libraries for the Jupyter Notebook implementation.
- Creation of a synthetic dataset for demonstration.
- Setting up the dataset visualization using Seaborn.
25:38 📊 *Probability Calibration Plot in Jupyter Notebook*
- Implementation of probability calibration in Jupyter Notebook.
- Visualization of the probability calibration plot using Matplotlib.
- Comparison of calibrated and uncalibrated probability curves.
30:02 📉 *Evaluation Metrics Comparison*
- Calculation and comparison of evaluation metrics (Brier score, precision, recall, F1) for both calibrated and uncalibrated classifiers.
- Interpretation of the metrics to assess the effectiveness of probability calibration.
- Presentation of the results for the Gaussian Naive Bayes classifier.
34:44 📚 *Conclusion*
- Recap of the key points discussed in the video.
- Invitation for comments and suggestions from viewers.
- Announcement of the next and final video on supervised learning in scikit-learn.
Made with HARPA AI
How to download the train_clean_kalman.csv😁?
hi! thanks for the work, excellent explanation. notebook is not added?
Thank you for your sharing!
Could you please link the previous video you are talking about at the start?
Hi , Great content.. Could you please let me know how to copy values from table detected into a csv file?
Did you find the solution?
Thanks for the review, great job :). There is only one point about isotonic regression that I am not sure is correct: you mentioned that isotonic is appropriate for small data sets!? It is known to be the other way around. Would you please check? [time: 10:14]
Isotonic regression is often suitable for small to moderately sized datasets where you want to model a monotonic relationship between variables. It may not be as efficient for very large datasets due to its computational complexity.
@AIology2022 Thanks for the reply, although this topic could be confusing as we are not sure what size of dataset is considered small or large. Most resources mention that isotonic can be a good choice for datasets larger than 1000 samples, with no further information. Would you suggest a reference to help me learn more about it, in particular in the context of stream data? Thanks again and wish you success...
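For readers following this thread, the monotonic fit behind isotonic calibration can be sketched with the pool-adjacent-violators algorithm. The snippet below is a plain-Python illustration with made-up toy labels, not scikit-learn's implementation. On the dataset-size question, the scikit-learn documentation cautions that isotonic calibration tends to overfit when there are too few calibration samples (far fewer than about 1000), which matches the point raised in the comment above.

```python
def isotonic_fit(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y.

    y must already be ordered by increasing classifier score.
    """
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Toy 0/1 labels ordered by increasing predicted score (hypothetical data).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
calibrated = isotonic_fit(labels)
print(calibrated)
```

Each fitted value is the mean label of a pooled block, so with only a handful of samples the step function is determined by very few points, which is one way to see why small calibration sets are risky.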
Thanks for putting these out! I have a hard time fitting reading into my schedule so these are really valuable for helping me stay up to date
I was hoping you would start with Dear Fellow Scholars hahaha
Thanks for sharing this. I am planning to buy a GPU, what's the cheapest GPU you suggest for fine-tuning LLMs. Did you try Mistral 7B? Any LLM which we can finetune on CPU as I have domain data on which I want to finetune a model for QA. Tried plain gpt2 model with vector db but didn't perform well..
Can you make a video explaining some supervised training algorithm for SNN, I am trying to understand the available algorithms but can't get enough out of them, I need someone to explain.
Sure. Good idea
@AIology2022 thanks bro
How much is the download size?
Guys, are the features of the mojo extension working for you?
Nope
You imported pytorch or pandas as np but in the next line you are trying to call np.array which naturally results in exception.
Although you are right and I am sorry for stupid mistake, but I tried it now and it is still not importing pandas, and torch.
Wow, what a garbage language :D
Btw you couldn't import torch as there was an exception on line 19. You forgot to remove np.array. 😂
Oh shite! My bad:) This was by far the worst video I have recorded and still getting more views than ones I expected better 😅
I tried it with correct syntax and still was not able to import pandas, and torch
@AIology2022 It works without the try catch when you use the “raises” keyword in the fn declaration. So I just had “fn main() raises:” and it worked just fine.
Mojo sounds great, I didn't dl it though because their website uses google analytics. I don't like that.
I just realized that about 5 minutes of video is cut (maybe my fault during editing, but I am 90% sure I checked before uploading). In those five minutes I was talking about licensing and I had some criticism about it. Weird!
What is the licensing? I still can’t find details on it other than it has no licensing.
@experimenteeer It is licensed for non-commercial use, and that can change at any time to a paid license or to fully open source (which I doubt). You can find the license link on the download page.
@AIology2022 If they charge a fee for the programming language there will be backlash from the community, and not one semi-pro team will use their language for a big enough project. I want this language to be successful, plsplspls
@AIology2022 Care to elaborate on this: "can change anytime to a fee"
Is there any way to convert my Django project to Mojo or we aren't there yet? I wonder if this would be possible at all.
We aren’t there yet
That is awesome. It looks a lot like Kotlin. I mostly use Python for AI/ ML, Kotlin for Android Mobile Dev and C++ for Game Dev. And this language Mojo looks like the best from all those 3 languages, plus is fast. Definitely learning this.
waiting for windows release, for me WSL is not working sadly
Congratulations on doing the first YouTube video as the local release was made available for Linux. I am still waiting for the release on Apple M chips
Same here but I will test it using my virtual machine
what configuration did you use?
Default one
Since Mojo is closed source 🥺 Do I have to pay money 💸💵💵💸 to use Mojo 🔥
For your personal project that would be OK
By the end of the year they are open sourcing it.
@encapsulatio What is the evidence for that 🥺
Not at all.. They will open the source.. Don't worry..
They're initially closed source as they want to solidly define how the language should perform and grow. They've said it's going to be open source after that
how do you feel about the acquisition?
Interesting! Thank you AIology!
Another good one! Great new format. I still think your code tutorials are gold but this was great, thank you!
Excellent work, very clean. Could I have the notebook, PLEASE?
Thank you for this great video. It was really informative!
Can you share a better example to train ? this image example is confusing
What would we need to edit to run this with distributed inference? Assuming we just had a pc with 2 gpu cards??
Hey, I have one question: if I am fine-tuning it as in the example, do we have to do it token by token? For instance, for every input-output pair we would append the output tokens to the input and use the next output token as the prediction, but this would multiply the dataset max_output_length*total_pairs fold. Is there a better way to train? If I predict max_length tokens at once, not essentially token by token, and then use cross entropy, is that fine?
Hi. While token-level training is the standard approach, there are other techniques that can be applied to improve language model training, such as using subword units like byte-pair encoding (BPE) or SentencePiece, incorporating positional encodings, and applying various regularization methods. These techniques help enhance the model's performance and ability to generate high-quality text. When it comes to training language models like GPT-3.5, training is typically done at the token level rather than at the level of whole examples or sentences.
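On the question in this thread about multiplying the dataset: causal language models are commonly fine-tuned with teacher forcing, where each input-output pair becomes one sequence and the labels are the same tokens shifted by one position, so a single forward pass gives a loss term for every answer token. The sketch below shows only that data preparation in plain Python; the token ids are made up, `build_example` is a hypothetical helper, and `-100` is used as the ignored label because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE = -100  # label value skipped by the loss (PyTorch's default ignore_index)


def build_example(prompt_ids, answer_ids):
    """Build one training sequence per pair: position t predicts token t + 1."""
    tokens = prompt_ids + answer_ids
    inputs = tokens[:-1]
    targets = tokens[1:]
    # Mask targets that still belong to the prompt so only answer tokens are scored.
    n_masked = len(prompt_ids) - 1
    labels = [IGNORE] * n_masked + targets[n_masked:]
    return inputs, labels


inputs, labels = build_example(prompt_ids=[11, 12, 13], answer_ids=[21, 22])
print(inputs)
print(labels)
```

Under this setup the number of training sequences stays equal to the number of pairs; the causal attention mask is what keeps each position from seeing the tokens it is asked to predict.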
At min 17:47 shouldn't the skip layer be applied on the inputx rather than x, based on the architecture diagram? x is not the input to layer but an output of dilated conv & activations