AIology
  • Videos: 55
  • Views: 117,986
10-minute paper (episode 32): A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Current Large Language Models (LLMs) are not only limited to a maximum context length, but also cannot robustly consume long inputs. To address these limitations, ReadAgent is introduced: an LLM agent system that increases the effective context length by up to 20x.
Inspired by how humans interactively read long documents, ReadAgent (1) decides what content to store together in a memory episode, (2) compresses those memory episodes into short episodic memories called gist memories, and (3) takes actions to look up passages in the original text when it needs to remind itself of relevant details to complete a task.
ReadAgent is evaluated against baselines using retrieval methods, u...
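As a rough illustration of the three steps above, here is a minimal, hypothetical sketch of the gist-memory loop; `llm` stands in for any text-completion function, and the prompts are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch of the ReadAgent loop; `llm` is any completion
# function (prompt -> string) and the prompts are illustrative only.
def read_agent(pages: list[str], question: str, llm) -> str:
    # (1) + (2): compress each memory episode (here: one page) into a gist
    gists = [llm(f"Summarize this passage in two sentences:\n{p}") for p in pages]

    # (3): let the model decide which original pages to look up again
    index = "\n".join(f"[{i}] {g}" for i, g in enumerate(gists))
    picks = llm(
        f"Question: {question}\nGists:\n{index}\n"
        "Which page numbers should be re-read? Answer like: 0,3"
    )
    chosen = [int(t) for t in picks.split(",") if t.strip().isdigit()]
    context = "\n".join(pages[i] for i in chosen if i < len(pages))

    return llm(f"Use these passages to answer.\n{context}\nQuestion: {question}")
```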
Views: 167

Videos

10-minute paper (episode 31): Self-Supervised Learning from Images I-JEPA
Views: 937 · 11 months ago
In this groundbreaking paper, we unveil a cutting-edge approach to learning highly semantic image representations without the need for painstakingly crafted data augmentations. Introducing the Image-based Joint-Embedding Predictive Architecture (I-JEPA), a non-generative method for self-supervised learning from images, we unlock new possibilities in the world of computer vision.
Mojo - First impression
Views: 7K · a year ago
Recently, the new language "Mojo" by Modular has been garnering significant interest. As of today, 7th September 2023, it has been publicly released. In the following video, I'll be coding to provide my initial impressions. Additionally, I'll discuss licensing concerns and hurdles Mojo must address to be a viable AI language option in the future. Lastly, I'll introduce Zig. While I'm not diving...
10-minute paper (episode 30): CoLT5 (Part 1): Faster, Long-Range Transformers
Views: 454 · a year ago
Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive, not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. However, not all tokens are equally important, especially for longer documents. CoLT5 is a long-input Transformer model that builds on this intuition by ...
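To make the intuition concrete, here is a toy PyTorch sketch of conditional computation in the CoLT5 spirit (not the paper's implementation): every token passes through a cheap feedforward branch, and only a routed top-k subset also receives the expensive branch.

```python
import torch
import torch.nn as nn

class ConditionalFFN(nn.Module):
    """Toy CoLT5-style conditional computation: every token gets a cheap
    feedforward branch; only the top-k routed tokens also get the heavy one."""

    def __init__(self, d: int, k: int):
        super().__init__()
        self.router = nn.Linear(d, 1)   # learned importance score per token
        self.light = nn.Linear(d, d)    # cheap branch, applied to all tokens
        self.heavy = nn.Sequential(
            nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d)
        out = self.light(x)
        scores = self.router(x).squeeze(-1)               # (batch, seq)
        idx = scores.topk(self.k, dim=-1).indices         # routed positions
        gathered = torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        heavy_out = self.heavy(gathered)                  # (batch, k, d)
        # add the heavy branch's output back at the routed positions
        return out.scatter_add(1, idx.unsqueeze(-1).expand_as(heavy_out), heavy_out)
```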
10 minutes paper (episode 29): Table Extraction
Views: 1.2K · a year ago
In recent advancements in machine learning, extracting structured tables from unstructured documents has seen significant progress. However, a key challenge has been creating large-scale datasets with accurate ground truth. To tackle this, authors introduced PubTables-1M, a comprehensive dataset with nearly one million tables from scientific articles. It supports various input types, offers det...
10 minutes paper (episode 28): ALiBi: Train Short, Test Long
Views: 213 · a year ago
ALiBi proposes to simply add a static linear bias to the attention matrix. The authors show this is not only effective as a relative positional encoding, but also lets the attention mechanism extrapolate to longer sequence lengths than it was trained on, for autoregressive language models. In this video we walk through the paper, write some small code, and look at the x-transformers library. Plea...
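A small sketch of the bias in PyTorch; the head-slope formula follows the paper's geometric sequence (exact for power-of-two head counts), while the shapes and usage line are illustrative.

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Static, non-learned linear bias added to attention logits."""
    # Geometric head slopes: slope_h = 2^(-8 * h / num_heads) for h = 1..H.
    start = 2 ** (-8.0 / num_heads)
    slopes = torch.tensor([start ** (i + 1) for i in range(num_heads)])
    # distance[i, j] = j - i  (<= 0 for the causal positions j <= i)
    pos = torch.arange(seq_len)
    distance = (pos[None, :] - pos[:, None]).clamp(max=0)
    # Shape (num_heads, seq_len, seq_len): each head gets its own slope.
    return slopes[:, None, None] * distance

# Usage sketch: logits = q @ k.transpose(-2, -1) / d**0.5 + alibi_bias(h, n)
```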
Ray (Episode 4): Deploying 7B GPT using Ray
Views: 369 · a year ago
In this video presentation, I will delve into Ray Serve, showcasing its capabilities in serving machine learning models, particularly focusing on causal language models and other models built with popular frameworks like Transformers, PyTorch, and TensorFlow. This technology provides an efficient and scalable solution for deploying and managing such models in production environments.
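A minimal Ray Serve sketch of the pattern (Ray 2.x API); the model name, replica count, and request format here are placeholders, not the exact setup from the video.

```python
# Minimal Ray Serve deployment sketch; model and request shape are illustrative.
from ray import serve
from transformers import pipeline

@serve.deployment(num_replicas=1)  # add ray_actor_options={"num_gpus": 1} for GPU
class Generator:
    def __init__(self):
        # Loaded once per replica; swap in any causal-LM checkpoint.
        self.pipe = pipeline("text-generation", model="gpt2")

    async def __call__(self, request):
        prompt = (await request.json())["prompt"]
        return self.pipe(prompt, max_new_tokens=50)[0]["generated_text"]

app = Generator.bind()
# serve.run(app)  # then POST {"prompt": "..."} to http://localhost:8000/
```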
Ray (Episode 3): Memory management in Ray Object Store
Views: 525 · a year ago
In this video, we show how you can put and get large objects in Ray's object store so that every actor can access them by reference, reducing latency. Then we talk about how you can free memory. At the end, we bring some fun examples of reference and value management in Python and Ray.
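A small example of the put/get pattern described here; the array size and task are arbitrary.

```python
import numpy as np
import ray

ray.init()

# Put one large object into the shared object store; every worker can then
# access it by reference instead of receiving its own serialized copy.
big = np.zeros((1000, 1000))
ref = ray.put(big)

@ray.remote
def column_sum(arr):          # `arr` arrives already dereferenced
    return arr.sum(axis=0)

print(ray.get(column_sum.remote(ref)).shape)  # (1000,)

del ref  # dropping the last reference makes the object eligible for eviction
```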
Ray (Episode 2): Actor models
Views: 278 · a year ago
Ray is an open-source distributed computing framework designed to make it easier to build and scale applications that require parallel and distributed processing. It provides a set of tools and libraries that help developers write efficient and high-performance applications across a variety of domains, from machine learning and data processing to reinforcement learning and beyond. In this seri...
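To make the actor model concrete, the classic counter example (not taken from the video): an actor is a stateful worker process whose method calls execute serially against its own state.

```python
import ray

ray.init()

@ray.remote
class Counter:
    """A stateful worker process; method calls run serially on the actor."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

counter = Counter.remote()                 # starts the actor process
futures = [counter.increment.remote() for _ in range(3)]
print(ray.get(futures))                    # [1, 2, 3]
```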
Ray (Episode 1): Remote function
Views: 446 · a year ago
Ray is an open-source distributed computing framework designed to make it easier to build and scale applications that require parallel and distributed processing. It provides a set of tools and libraries that help developers write efficient and high-performance applications across a variety of domains, from machine learning and data processing to reinforcement learning and beyond. Int this seri...
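For reference, the smallest possible remote-function example; the function itself is arbitrary.

```python
import ray

ray.init()

@ray.remote
def square(x):
    return x * x

# .remote() returns immediately with a future (ObjectRef);
# ray.get blocks until the distributed tasks finish.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```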
10 minutes paper (episode 27): LLM powered autonomous agents
Views: 1K · a year ago
Imagine a world where AI isn't just smart - it's genius. In this video we walk through the requirements for creating an autonomous agent powered by an LLM, following Lilian Weng's post. An agent should implement the following components in a well-optimized way: 📚 Planning: agents excel at breaking down big tasks into manageable subgoals, and they're self-reflective learners, continuously improving by learning from their own...
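A toy sketch of such a loop; `llm` (prompt -> string) and the `tools` mapping are hypothetical placeholders, and a real agent would add reflection and long-term memory.

```python
# Toy plan-and-execute loop; `llm` and `tools` are hypothetical placeholders.
def run_agent(goal: str, llm, tools: dict) -> str:
    # Planning: decompose the goal into manageable subgoals.
    plan = [s for s in llm(f"List numbered subgoals for: {goal}").splitlines() if s]
    memory = []  # short-term scratchpad of (subgoal, result) pairs

    for step in plan:
        # Tool use: let the model pick an appropriate tool per subgoal.
        name = llm(f"Pick one tool from {sorted(tools)} for: {step}").strip()
        result = tools.get(name, lambda _: "no suitable tool")(step)
        memory.append((step, result))

    # Reflection omitted here; a real agent would critique and replan.
    return llm(f"Goal: {goal}\nObservations: {memory}\nWrite the final answer.")
```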
10 minutes paper (episode 26): Multi-Grained Vision Language Pre-Training: X-VLM
Views: 475 · a year ago
Introducing X-VLM, a groundbreaking method in vision language pretraining that overcomes the limitations of existing approaches. Traditional methods heavily rely on object-centric features extracted through object detection, which struggle to capture relationships among multiple objects. X-VLM introduces multi-grained alignments by locating visual concepts in images based on associated texts an...
10 minutes paper (episode 25): Low Rank Adaptation: LoRA
Views: 958 · a year ago
In this video, we'll explore the challenges posed by full fine-tuning of massive language models, such as GPT-3 175B, and the prohibitive costs associated with deploying independent instances of such models. LoRA utilizes an innovative approach by freezing the pretrained model weights and introducing trainable rank decomposition matrices into each layer of the Transformer architecture. This ing...
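The core trick fits in a few lines of PyTorch; a minimal sketch (rank, scaling, and init values are illustrative, not the paper's exact recipe):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where only A and B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init:
        self.scale = alpha / r                  # no change at start of training

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```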
AI-Code-Mastery (Episode 8): Fine-Tuning MPT-7B on a Single GPU | Open-Source and Commercializable
Views: 26K · a year ago
I am excited to bring you a comprehensive step-by-step guide on how to fine-tune the newly announced MPT-7B model using just a single GPU. This remarkable model is not only open source but also commercializable, making it a valuable tool for a wide range of natural language processing (NLP) tasks. MPT's maximum context length exceeds GPT-4's, and it also outperformed many available language models like G...
10 minutes paper (episode 24): ViperGPT
Views: 555 · a year ago
ViperGPT is a framework that enables complex visual queries by combining vision and language models into subroutines. This framework uses code generation models to create Python code that is later executed, and utilizes a provided API to access available modules. ViperGPT achieves state-of-the-art results in various complex visual tasks without the need for further training. This approach offer...
10 minutes paper (episode 23): Unlocking Full Potential of Language Models with Chain-of-Thought
Views: 703 · a year ago
AI-Code-Mastery (Episode 7): Text2Image using Diffusion Model and ControlNet
Views: 673 · a year ago
AI-Code-Mastery (Episode 6): Explore the Best ML Challenge Websites and Datasets
Views: 402 · a year ago
AI-Code-Mastery (Episode 5): Zero-Shot document question answering with Flan-UL2
Views: 6K · a year ago
AI-Code-Mastery (Episode 4): Torch Drug
Views: 1.1K · a year ago
10 minutes paper (episode 22): Beyond neural scaling laws
Views: 716 · a year ago
10 minutes paper (episode 21): LayoutReader
Views: 563 · a year ago
10 minutes paper (episode 20): InstructGPT
Views: 11K · a year ago
AI-Code-Mastery (Episode 3): Split
Views: 156 · a year ago
AI-Code-Mastery (Episode 2): alert, diffdir, conv, ocr
Views: 189 · a year ago
AI-Code-Mastery (Episode 1): Cython
Views: 950 · a year ago
10 minutes paper (episode 19): ConvNeXt: A ConvNet for the 2020s
Views: 987 · 2 years ago
10 minutes paper (episode 18): Similarity of Neural Network Representations Revisited
Views: 603 · 2 years ago
10 minutes paper (episode 17): Micro-Batch Training
Views: 207 · 2 years ago
Implement U-Net (PyTorch)
Views: 1.2K · 2 years ago

Comments

  • @Ronald_McColeman
    @Ronald_McColeman 18 days ago

    amazing video!

  • @shuangwang9886
    @shuangwang9886 3 months ago

    Your code is not TabNet, it is TabTransformer; they are different papers.

  • @helloyes2288
    @helloyes2288 5 months ago

    overall presentation needs improvement. Your accent isn't that hard to understand but your mic is tinny and I don't want to see your browser or desktop.

  • @jeromeeusebius
    @jeromeeusebius 6 months ago

    Thank you for sharing this short and concise video. It is helpful to understand knowledge distillation and for showing an example in code and for also walking through the code examples.

  • @tag_of_frank
    @tag_of_frank 8 months ago

    Here, are you tuning a text completion model? What about the instruct or chat models?

  • @aduasarebaffour759
    @aduasarebaffour759 8 months ago

    Thank you!!! I love every bit of the video. The fact that you didn't disrupt the lecture with "...Like, comment, share, and subscribe to the channel..." in 59 minutes till the end of the video shows how passionate you are to just share your knowledge. Thank you again. Could you cover the Molecule Generation Tasks? That will be very helpful to my current project.

  • @हरिःओमतत्सत्
    @हरिःओमतत्सत् 8 months ago

    Can you please share your jupyter notebook for tutoring/experimenting? Thanks.

  • @harrumnoor3802
    @harrumnoor3802 8 months ago

    Where can I find part2?

  • @pietraderdetective8953
    @pietraderdetective8953 8 months ago

    Great Cython introduction video! In the video you mentioned you're going to go in depth with Cython's features... is there a follow-up Cython video to this one? Would love to see Cython implementations either in data analytics (Pandas) or AI-related (PyTorch, LlamaIndex, etc). Liked and subbed!

  • @eliaweiss1
    @eliaweiss1 9 months ago

    There are some strange screeching sounds whenever you speak...

  • @MohammadTat-v9g
    @MohammadTat-v9g 9 months ago

    Well stated, sir. Thanks for the brief explanation and the few sources you presented. I hope we can repay the work and time you spent here by expanding this course and knowledge. By the way, I think you are Persian, so I might say "daste shoma dard nakone" :)

  • @auresdz701
    @auresdz701 9 months ago

    What if we don't have access to the teacher's data?

  • @neelarahimi1053
    @neelarahimi1053 9 months ago

    eigen vector ---> is pronounced "eye-gen vector."

  • @mubasharsaeed6044
    @mubasharsaeed6044 10 months ago

    Could you please explain the code of this paper as well?

  • @crater721
    @crater721 10 months ago

    🎯 Key Takeaways for quick navigation:
    00:00 📚 Introduction to Probability Calibration — why calibration matters, what it is, and the examples to be covered in a Jupyter Notebook.
    01:32 🤔 Why Probability Calibration is Needed — the unreliability of raw probability values from classifiers; scenarios where calibrated percentages are crucial.
    03:30 📊 Example: Calibration with 10 Data Points — binning and cross-validation in probability calibration; visualization of the calibration curve.
    07:42 📈 Example: Calibration with 100 Data Points — calibration histogram peaks due to classifier assumptions; discussion of the calibration plot.
    09:36 🔧 Implementing Probability Calibration in scikit-learn — the two ways to implement calibration, with and without "prefit"; overview of the library.
    12:17 🖥️ Implementation in Jupyter Notebook — imports, a synthetic dataset, and dataset visualization with Seaborn.
    25:38 📊 Probability Calibration Plot — Matplotlib visualization comparing calibrated and uncalibrated probability curves.
    30:02 📉 Evaluation Metrics — Brier score, precision, recall, and F1 for the calibrated vs. uncalibrated Gaussian Naive Bayes classifier.
    34:44 📚 Conclusion — recap of key points, invitation for comments, and announcement of the next and final video on supervised learning in scikit-learn.
    Made with HARPA AI
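For reference, a minimal sketch of the scikit-learn calibration flow summarized above; the dataset, classifier, and parameters are illustrative.

```python
# Minimal probability-calibration sketch with scikit-learn.
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Wrap the base classifier; cross-validation fits the calibrator.
calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5)
calibrated.fit(X_tr, y_tr)

prob = calibrated.predict_proba(X_te)[:, 1]
frac_pos, mean_pred = calibration_curve(y_te, prob, n_bins=10)  # reliability data
```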

  • @kevinjohn-x4e
    @kevinjohn-x4e 10 months ago

    How to download the train_clean_kalman.csv😁?

  • @wishswiss
    @wishswiss 10 months ago

    Hi! Thanks for the work, excellent explanation. Is the notebook not added?

  • @sakurasadkj5839
    @sakurasadkj5839 11 months ago

    Thank you for sharing!

  • @sandeepanand3834
    @sandeepanand3834 11 months ago

    Link to the previous video you are talking about at the start, please?

  • @shob_xyz
    @shob_xyz 11 months ago

    Hi, great content. Could you please let me know how to copy values from the detected table into a CSV file?

  • @sirvanparasteh8831
    @sirvanparasteh8831 11 months ago

    Thanks for the review, great job :). There is only one point about isotonic calibration that I am not sure is correct: you mentioned isotonic is appropriate for small datasets!? It is known to be the other way around. Would you please check? [time: 10:14]

    • @AIology2022
      @AIology2022 11 months ago

      Isotonic regression is often suitable for small to moderately sized datasets where you want to model a monotonic relationship between variables. It may not be as efficient for very large datasets due to its computational complexity.

    • @sirvanparasteh8831
      @sirvanparasteh8831 11 months ago

      @@AIology2022 Thanks for the reply, although this topic can be confusing, as we are not sure what size of dataset is considered small or large. In most resources it is mentioned that for datasets larger than 1000 samples, isotonic can be a good choice; no more info. Would you suggest a reference to help me learn more about it, in particular in the context of stream data? Thanks again, and I wish you success...

  • @DavidWalker-hw6co
    @DavidWalker-hw6co 11 months ago

    Thanks for putting these out! I have a hard time fitting reading into my schedule so these are really valuable for helping me stay up to date

  • @renanmonteirobarbosa8129
    @renanmonteirobarbosa8129 11 months ago

    I was hoping you would start with Dear Fellow Scholars hahaha

  • @vivekmishra69
    @vivekmishra69 11 months ago

    Thanks for sharing this. I am planning to buy a GPU; what's the cheapest GPU you suggest for fine-tuning LLMs? Did you try Mistral 7B? Is there any LLM we can fine-tune on a CPU? I have domain data on which I want to fine-tune a model for QA. I tried a plain gpt2 model with a vector db but it didn't perform well.

  • @stewardeastes8240
    @stewardeastes8240 a year ago

    promo sm

  • @AyushVerma-ui7re
    @AyushVerma-ui7re a year ago

    Can you make a video explaining some supervised training algorithms for SNNs? I am trying to understand the available algorithms but can't get enough out of them; I need someone to explain.

  • @mutamanabdalazeem7238
    @mutamanabdalazeem7238 a year ago

    How much is the download size?

  • @micaelcode8889
    @micaelcode8889 a year ago

    Guys, are the features of the mojo extension working for you?

  • @w24lp05
    @w24lp05 a year ago

    You imported pytorch or pandas as np, but in the next line you are trying to call np.array, which naturally results in an exception.

    • @AIology2022
      @AIology2022 a year ago

      You are right, and I am sorry for the stupid mistake, but I tried it now and it is still not importing pandas and torch.

  • @vicktorioalhakim3666
    @vicktorioalhakim3666 a year ago

    Wow, what a garbage language :D

  • @TheGoldenPro
    @TheGoldenPro a year ago

    Btw you couldn't import torch as there was an exception on line 19. You forgot to remove np.array. 😂

    • @AIology2022
      @AIology2022 a year ago

      Oh shite! My bad:) This was by far the worst video I have recorded and still getting more views than ones I expected better 😅

    • @AIology2022
      @AIology2022 a year ago

      I tried it with the correct syntax and still was not able to import pandas and torch.

    • @chaoticwagon
      @chaoticwagon a year ago

      @@AIology2022 It works without the try catch when you use the “raises” keyword in the fn declaration. So i just had “fn main() raises:” and it worked just fine.

  • @undeadpresident
    @undeadpresident a year ago

    Mojo sounds great, I didn't dl it though because their website uses google analytics. I don't like that.

  • @AIology2022
    @AIology2022 a year ago

    I just realized that about 5 minutes of video is cut (maybe my fault during editing, but I am 90% sure I checked before uploading). In those five minutes I was talking about licensing and I had some criticism about it. Weird!

    • @experimenteeer
      @experimenteeer a year ago

      What is the licensing? I still can't find details on it other than that it has no licensing.

    • @AIology2022
      @AIology2022 a year ago

      @@experimenteeer It is for non-commercial use and can change at any time to a fee or to fully open source (though I don't think so). You can find the license link on the download page.

    • @IqweoR
      @IqweoR a year ago

      @@AIology2022 If they charge a fee for the programming language there will be backlash from the community, and not one semi-pro team will use their language for a big enough project. I want this language to be successful, plsplspls

    • @decimatech5102
      @decimatech5102 a year ago

      @@AIology2022 Care to elaborate on this: "can change anytime to a fee"

  • @СергейГалиуллин-п9ю

    Is there any way to convert my Django project to Mojo, or are we not there yet? I wonder if this would be possible at all.

  • @vectoralphaSec
    @vectoralphaSec a year ago

    That is awesome. It looks a lot like Kotlin. I mostly use Python for AI/ML, Kotlin for Android mobile dev, and C++ for game dev. And this language Mojo looks like the best of all those 3 languages, plus it's fast. Definitely learning this.

  • @baldeagle6531
    @baldeagle6531 a year ago

    Waiting for the Windows release; WSL is not working for me, sadly.

  • @Ivoshevo
    @Ivoshevo a year ago

    Congratulations on doing the first YouTube video as the local release was made available for Linux. I am still waiting for the release on Apple M chips.

    • @LacksonMunthali
      @LacksonMunthali a year ago

      Same here but I will test it using my virtual machine

  • @teefus1
    @teefus1 a year ago

    what configuration did you use?

  • @إبراهيم-س9ف6ص
    @إبراهيم-س9ف6ص a year ago

    Since Mojo is closed source 🥺 Do I have to pay money 💸💵💵💸 to use Mojo 🔥

    • @AIology2022
      @AIology2022 a year ago

      For your personal project that would be OK

    • @encapsulatio
      @encapsulatio a year ago

      By the end of the year they are open sourcing it.

    • @إبراهيم-س9ف6ص
      @إبراهيم-س9ف6ص a year ago

      @@encapsulatio What is the evidence for that 🥺

    • @DavidRagazzi
      @DavidRagazzi a year ago

      Not at all.. They will open the source.. Don't worry..

    • @mattrs1
      @mattrs1 a year ago

      They're initially closed source as they want to solidly define how the language should perform and grow. They've said it's going to be open source after that

  • @mooncop
    @mooncop a year ago

    how do you feel about the acquisition?

  • @TailorJohnson-l5y
    @TailorJohnson-l5y a year ago

    Interesting! Thank you AIology!

  • @TailorJohnson-l5y
    @TailorJohnson-l5y a year ago

    Another good one! Great new format. I still think your code tutorials are gold but this was great, thank you!

  • @madanydiallo7573
    @madanydiallo7573 a year ago

    Excellent work, very clean. Could I have the notebook, PLEASE?

  • @mahyaahmadzadeh9855
    @mahyaahmadzadeh9855 a year ago

    Thank you for this great video. It was really informative!

  • @khushboodholi4237
    @khushboodholi4237 a year ago

    Can you share a better example to train? This image example is confusing.

  • @redgenAI
    @redgenAI a year ago

    What would we need to edit to run this with distributed inference? Assuming we just had a pc with 2 gpu cards??

  • @c.b.t6738
    @c.b.t6738 a year ago

    Hey, I have one question. If I am fine-tuning it as in the example, do we have to do it token by token? For instance, for every input-output pair we would append the output tokens to the input and use the next output token as the prediction, but this would multiply the dataset to max_output_length*total_pairs examples. Is there a better way to train, not necessarily token by token? If I predict max_length and then use cross entropy, is that fine?

    • @AIology2022
      @AIology2022 a year ago

      Hi. While token-level training is the standard approach, there are other techniques that can improve language model training, such as using subword units like byte-pair encoding (BPE) or SentencePiece, incorporating positional encodings, and applying various regularization methods. These techniques help enhance the model's performance and its ability to generate high-quality text. When it comes to training language models like GPT-3.5, training is typically done at the token level rather than at the level of whole examples or sentences.

  • @georgealexandruvlad7837
    @georgealexandruvlad7837 a year ago

    At 17:47, shouldn't the skip connection be applied to the input x rather than x, based on the architecture diagram? x is not the input to the layer but the output of the dilated conv & activations.