Firefox Has A PERFECT Use For AI Text Generation

  • Published: Jun 4, 2024
  • A lot of the use cases for AI have been replacing the thing we already do with AI to save some money but Mozilla Firefox actually has a use that I think is really good and a massive boon for accessibility, alt-text generation
    ==========Support The Channel==========
    ► Patreon: brodierobertson.xyz/patreon
    ► Paypal: brodierobertson.xyz/paypal
    ► Liberapay: brodierobertson.xyz/liberapay
    ► Amazon USA: brodierobertson.xyz/amazonusa
    ==========Resources==========
    Mozilla Mastodon Post: mozilla.social/@mozilla/11255...
    Mozilla Blog Post: hacks.mozilla.org/2024/05/exp...
    =========Video Platforms==========
    🎥 Odysee: brodierobertson.xyz/odysee
    🎥 Podcast: techovertea.xyz/youtube
    🎮 Gaming: brodierobertson.xyz/gaming
    ==========Social Media==========
    🎤 Discord: brodierobertson.xyz/discord
    🐦 Twitter: brodierobertson.xyz/twitter
    🌐 Mastodon: brodierobertson.xyz/mastodon
    🖥️ GitHub: brodierobertson.xyz/github
    ==========Credits==========
    🎨 Channel Art:
    Profile Picture:
    / supercozman_draws
    🎵 Ending music
    Track: Debris & Jonth - Game Time [NCS Release]
    Music provided by NoCopyrightSounds.
    Watch: • Debris & Jonth - Game ...
    Free Download / Stream: ncs.io/GameTime
    #Linux #Firefox #OpenSource #AI #LLM #GPT
    DISCLOSURE: Wherever possible I use referral links, which means if you click one of the links in this video or description and make a purchase I may receive a small commission or other compensation.
  • Science

Comments • 285

  • @FerroMeow
    @FerroMeow 23 days ago +348

    this is the kind of thing AI should automate - a useful, needed work that NO human wants to do. This is literally the most (use-case-wise) ethical AI application I've seen yet

    • @GSBarlev
      @GSBarlev 23 days ago +22

      I largely agree, but I always think use cases like this should be *opt-in* for two reasons:
      1. Environmental impact: if this is a feature the user doesn't need, then they're burning inference cycles and driving up their power bill (or depleting their battery) unnecessarily
      2. Intellectual property: *All* of these models are _extremely cagey_ about what datasets they were trained on. The reason is almost certainly because the training datasets contain copyrighted work. No matter how noble the use case, if you believe that artists should have a say in how their works are used, you shouldn't be tricked into being complicit in what is essentially theft.

    • @U1TR4F0RCE
      @U1TR4F0RCE 23 days ago +41

      @@GSBarlev they explain in the video it is opt-in.

    • @nullvoid3545
      @nullvoid3545 23 days ago +7

      I think it's worth looking at why people insist on being used as resources when robots could and should be used to improve our lives in every way they can.
      If we weren't deprived of basic necessities by default, maybe the fear of being replaced would subside, and instead we could spend our time looking for things to enjoy together.

    • @Denis-Maldonado
      @Denis-Maldonado 23 days ago +5

      Why the requirement of "No human wanting to do it"? We didn't halt new job creation when machinery was invented, despite some people wanting to preserve their manual jobs that the newly created machines were taking from them.
      Why should AI be different? Better to ask governments for a U.B.I. rather than to stop progress.
      Or learn to adapt and use AI in your workflow.

    • @OhhCrapGuy
      @OhhCrapGuy 23 days ago +7

      Here's another case: a YouTuber I know has a crawling text next to his narration, which took forever to manually synchronize. I used an AI model to listen to his narration, extract the timing of each paragraph, and automate creating the text crawl as a video he could simply import into Premiere.
      It was work he himself was already doing, and it just made his workflow simpler.

  • @NeftisIsHere
    @NeftisIsHere 23 days ago +222

    This one actually looks like a good idea, i love how firefox wants to handle it

  • @sebastianarmstrong5726
    @sebastianarmstrong5726 23 days ago +175

    it's telling that google comes out of the gate with buggy and misleading ai search, while firefox waits longer to test this reasonable and actually useful use of ai. this is part of why, even though mozilla isnt perfect, i will continue to use firefox over chrome / chromium based browsers. i still want to hear from people who are blind or already make use of screenreaders to see how helpful this feature will be before making a judgement though.

    • @MechanicaMenace
      @MechanicaMenace 22 days ago +3

      I'm not sure the buggy Google AI search summaries are real. The screenshots of it being bad are all 1) funny and 2) are cropped to cut out the sources section. I'm pretty certain a lot of them are just people using the F12 key.

    • @EwanMarshall
      @EwanMarshall 22 days ago

      Google was testing and waiting, openai and microsoft kind of pushed them to keep shareholders happy. Let me point out a few things, used google translate sometime since november 2016? Well, between 2016 and 2020 you were using Google Neural Machine Translation system (a system they started putting together in 2011), a precursor to Generative Pretrained Transformer (GPT), by 2020, well google behind the scenes switched out to a GPT based system.
      OpenAI managed to a) do a good job with training of a scaled up version and b) get lucky with said training. But they were working off of what others, mostly google, had already done.
      Still I prefer firefox too, but let us be clear, there is more here than it looks like.

    • @LautaroQ2812
      @LautaroQ2812 22 days ago

      Sentiment is great but Mozilla gets millions from Google so Google having buggy AI used by millions of users and potentially being funded by stealing all that data is what funds Mozilla as well. People always seem to forget this and they keep boasting the argument of using FF against Chromium browsers as if it's noble or something. I'm not saying DON'T use it. I'm saying let's be real about it.
      The tab grouping feature request has been done about 2 years ago with many comments in the (mozilla) forums and they are "still looking on it" meaning they don't give a shit and they won't do it - a feature that had 2 thousand upvotes which is extremely high considering how many people use FF and then how many of those use community forums to ask for a feature.
      Then, they suddenly come out with "articles" which a lot of people don't read and they always come off with the wrong impression. This is partially people's responsibility, yes. Thanks to Brodie we can have a video about it. However, if it happens so often then they should revise their communication strategy. This let alone with them shipping with telemetry which many people dislike and also having Google as default search engine, unlike before that it used to have DDG and have Ecosia and others as options right there.
      This, if it helps people with disability, it's GREAT. I'm ALL for it. But let's not keep pretending FF is a golden standard anymore. It's just another one that luckily, is not Chromium based, but might as well be and it'd be the same.

    • @LautaroQ2812
      @LautaroQ2812 22 days ago +1

      @hoovysimulator2518 I use it - depending on what and how, the AI summarizer is FINE. Is not perfect and SOME AI Answers seem to conflict on certain things because of the sources/info that pulls (at least for the queries I've done myself, maybe other topics are much better or much worse), but otherwise in general is pretty neutral which I like. It usually gives a general description, then pros and cons of something according to that info and then a conclusion as a summary. It's good enough imo.

    • @MechanicaMenace
      @MechanicaMenace 22 days ago

      @hoovysimulator2518 or the person who used F12 to fake it copied it from a Reddit post.

  • @kdcadet
    @kdcadet 23 days ago +161

    As a VIP, visually impaired person, I greatly appreciate this sort of feature. As a machine learning engineer I very much appreciate these sorts of features when implemented to run locally.

    • @RadikAlice
      @RadikAlice 23 days ago +10

      Agreed. Relying on the internet for it is so dumb, might as well be fully online at that point

    • @EwanMarshall
      @EwanMarshall 22 days ago +3

      As a web developer (among other things, the webdev is more of a side thing now) I do have a problem in that if I can't come up with good alt text to use, I doubt the AI will be able to either, but other than that, it is a great idea generally. Not all web developers are good at alt text consistency and client corporations tend to be atrocious if one allows them to directly add content.

  • @EQuivalentTube2
    @EQuivalentTube2 23 days ago +131

    Now we're getting somewhere.
    Edit: honestly, when the cause is clear and noble, I'd WANT to donate my data to be used in training. Please give me that option in the telemetry settings.

    • @icanonlysuffer
      @icanonlysuffer 23 days ago

      i would absolutely love opt-in telemetry for this purpose

    • @bleack8701
      @bleack8701 22 days ago +2

      The problem is that you can't know for sure they aren't breaking the law. Sometimes businesses make the calculation and decide that the fines wouldn't impact the profits enough for it to matter that they've broken the law

    • @chriss3404
      @chriss3404 22 days ago +3

      and have a setting to not share from private sessions 💀

  • @jort93z
    @jort93z 23 days ago +19

    Many social media sites don't even allow you to put alt text. In misskey for example, all the alt text is simply the name of the uploaded file. Which usually isn't helpful. So this seems very nice.

    • @felixfourcolor
      @felixfourcolor 22 days ago

      that's so stupid, why do they do that

    • @jort93z
      @jort93z 22 days ago +1

      @@felixfourcolor Laziness, i guess. Most people don't care about alt-text.

    • @SteinGauslaaStrindhaug
      @SteinGauslaaStrindhaug 21 days ago

      If you don't allow a real useful alt text you should actually not include the attribute at all. Having the screen reader exclaim "Image!" and nothing more is after all more useful than having it exclaim "Image twenty million two hundred forty thousand three hundred thirty underscore zero zero ay ex fifty three jay peg"; since both exclamations only tell you that the people who made the site were lazy and didn't bother with alt-texts but the first one does so with much less time and annoyance...
      It won't validate of course, that's because you are supposed to supply an alt-text, but supplying junk isn't doing anything except fooling a silly static code analysis tool.
      In fact fairly often the correct solution (which of course only works in an editorial setting, not in social media) is to include a deliberately empty alt-text. If the image is _purely_ decorative like some abstract patterns or geometric shapes or a completely irrelevant stock photo with _no_ real function to the context beyond breaking up a wall of text with some pretty colours; these purely decorative images should really be ignored by the screen reader, and by explicitly supplying an empty alt-text you're telling the screen reader to ignore it completely.
      Though; arguably if the image is completely meaningless (i.e. seeing visitors will also just completely ignore it while reading), you probably should not even include it. Why waste bandwidth and screen space with something that everyone will skip? I'd much prefer some SVG decoration or some dingbat or flourish character from a webfont with text colour as a colourful chapter divider, which doesn't cost many kilobytes at all, over a meaningless 1 megabyte stock photo that I have to waste resources downloading and then waste brain cycles figuring out if it's meaningful or not just to mentally discard it immediately.
      Btw, decorative SVG's or dingbats should also have an aria-hidden attribute to tell the screen reader to ignore it.
      If it also functions as a visual indication of a new chapter or section in the text; and this would be meaningful to indicate with a brief pause if you were reading it aloud; you should probably end a section tag before the symbol and start a new section after it, so that the screen reader can announce it (if configured to do so) and so that the screen reader user can use the list of sections on the page as landmarks to quickly skip back and forth in the text just like seeing visitors can use the symbol as a landmark.
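
The distinction drawn in the comment above (no alt attribute, a deliberately empty one, a file name used as alt text, and a real description) can be sketched in Python. This is only an illustrative heuristic: the function name `classify_alt`, the category labels, and the file-name pattern are assumptions made for this example and do not come from any browser or screen reader.

```python
import re
from typing import Optional

# Assumed heuristic: alt text that is nothing but a file name,
# e.g. "20240330_00ax53.jpg", carries no useful description.
FILENAME_LIKE = re.compile(
    r"^[\w\-. ]+\.(jpe?g|png|gif|webp|svg|bmp)$", re.IGNORECASE
)


def classify_alt(alt: Optional[str]) -> str:
    """Classify an <img> alt value; None means the attribute is absent."""
    if alt is None:
        return "missing"       # no alt attribute at all
    text = alt.strip()
    if text == "":
        return "decorative"    # deliberately empty alt: image can be skipped
    if FILENAME_LIKE.match(text):
        return "junk"          # a file name passed off as alt text
    return "descriptive"


if __name__ == "__main__":
    for value in (None, "", "20240330_00ax53.jpg", "A red fox sitting in snow"):
        print(repr(value), "->", classify_alt(value))
```

A check like this could flag the "junk" case separately from "missing", since the comment argues that junk alt text is worse than none.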

  • @davidh.4944
    @davidh.4944 23 days ago +24

    It would be useful for the model to add an additional line, something like "ai generated: 75% confidence" at the end of its output.

    • @deadoon
      @deadoon 23 days ago +21

      The problem is that those confidence values are often not very accurate, so are more misleading than anything. High confidence hallucinations would give people a false sense of security. "AI assessment, errors may exist" is a better one since it is vague.

    • @guyblack9729
      @guyblack9729 23 days ago +3

      I think that attaching a confidence percentage would get muddy, but you have a point. Maybe when this is expanded to the rest of the browser there is an option to distinguish AI generated alt text that came from the website. For sighted people this could be in the form of maybe a different color for the text that pops up, and for the visually impaired I'm assuming the screen reader could use a different tone of voice (I don't need to and haven't really used screen reader software so I may be totally wrong here).

  • @MechMK1
    @MechMK1 23 days ago +27

    A few years ago I was working with software that output a giant list of results with text like "Good", "Warning", "Error", etc..., but as an image so "it looked nice". Unfortunately, one of my co-workers was vision-impaired and the software did not provide alt-text. The at-the-time solution was to use a python script to manually modify the output and add alt-text tags for every image.
    But I can see how such a tool would be useful for vision-impaired people in an on-demand context. Something where they can configure when to generate descriptions, whether or not to wait for confirmation, etc...
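
A workaround like the one described above might look roughly like the following Python sketch. The original script is not shown in the comment, so everything here is an assumption: the file names in `STATUS_ALT`, the function name `add_status_alt`, and the regex-based approach, which only handles simple, well-formed `<img>` tags with double-quoted `src` values.

```python
import re

# Hypothetical mapping from status image file names to the text they depict.
STATUS_ALT = {"good.png": "Good", "warning.png": "Warning", "error.png": "Error"}

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r'\bsrc="([^"]*)"', re.IGNORECASE)
ALT_ATTR = re.compile(r"\balt\s*=", re.IGNORECASE)


def add_status_alt(html: str) -> str:
    """Add alt text to known status images that do not have any yet."""

    def fix(match: re.Match) -> str:
        tag = match.group(0)
        if ALT_ATTR.search(tag):
            return tag                      # keep existing alt text untouched
        src = SRC_ATTR.search(tag)
        if src is None:
            return tag
        file_name = src.group(1).rsplit("/", 1)[-1]
        alt = STATUS_ALT.get(file_name)
        if alt is None:
            return tag                      # unknown image: leave as-is
        end = "/>" if tag.endswith("/>") else ">"
        return f'{tag[:-len(end)].rstrip()} alt="{alt}"{end}'

    return IMG_TAG.sub(fix, html)


if __name__ == "__main__":
    print(add_status_alt('<td><img src="icons/good.png"></td>'))
```

For arbitrary HTML a real parser would be a safer choice than regular expressions; the regex version is used here only to keep the sketch short.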

  • @olliestudio45
    @olliestudio45 23 days ago +15

    I thought the firefox model's description was great but the human one was definitely very good. Both are probably better than anything I may have come up with.

  • @DryPaperHammerBro
    @DryPaperHammerBro 23 days ago +100

    Finally, some good fucking AI

    • @mskiptr
      @mskiptr 23 days ago +13

      It was there all along. Just without that much hype and usually called "machine learning" instead.
      The only big change in the past 2 years is we now have models that can generate highly intelligible text. And lots of bogus funding.

    • @U1TR4F0RCE
      @U1TR4F0RCE 23 days ago +3

      Jeff Geerling has another example in his video. He also mentions the fact that AI is kind of a buzzword for the situation since machine learning has been worked on for a while, but using a dedicated attachment to the Pi to improve its capabilities with machine learning for a camera to be used as a security camera is good.

    • @RadikAlice
      @RadikAlice 23 days ago +1

      Their page translation feature is also powered by locally-run AI. So this is a welcome expansion of that

    • @olemortensen3354
      @olemortensen3354 22 days ago +2

      The more specialized/small the AI the better the result.

    • @RadikAlice
      @RadikAlice 22 days ago

      @@olemortensen3354 It's like the old saying. Jack of all trades, master of none
      A lot of things would be dead or stagnant without people narrowing down their skill into a niche

  • @Burgo361
    @Burgo361 22 days ago +6

    Alt text is a fundamental basic requirement of making a website, it's insane that we have reached the point where this is needed.
    At least this is a valid use case of AI, and with it being an open source project I can buy that it will stay local.

  • @CjqNslXUcM
    @CjqNslXUcM 22 days ago +6

    I agree with the mozilla model of AI. Give me local models on demand and don't touch my data. I wish they would bundle a really good text to speech engine, even if you need a GPU.

  • @hadockzin
    @hadockzin 23 days ago +8

    This is a cool feature

  • @blarghblargh
    @blarghblargh 23 days ago +29

    Year of the Linux console

    • @anonymouscommentator
      @anonymouscommentator 23 days ago +15

      year of the linux browser

    • @ChrisWijtmans
      @ChrisWijtmans 23 days ago +4

      linux console was always superior.

    • @icanonlysuffer
      @icanonlysuffer 23 days ago

      year of the linux kernel panic logs

    • @GSBarlev
      @GSBarlev 23 days ago +2

      I'd love to see this running in lynx.

    • @anonymouscommentator
      @anonymouscommentator 22 days ago +1

      @@ChrisWijtmans tbh if you compare the stock experience between the windows terminal and the stock experience of something like konsole or the gnome terminal then id say the windows terminal wins. its just that there are many, better terminal emulators on linux and you can actually customize them which is their real power

  • @mble
    @mble 22 days ago +2

    Actually, on social media there are image-to-text models that add a description for an image automagically

  • @guiorgy
    @guiorgy 23 days ago +7

    I can imagine it being used to automatically generate alt texts for images, which are then automatically scraped by someone to train their own model, which is used to generate alt texts for images, and then scraped again, and so on. I wonder how much they will start to hallucinate 😅

    • @scarecat
      @scarecat 23 days ago +2

      why would scrapers scrape locally generated text?

    • @rkvkydqf
      @rkvkydqf 23 days ago +1

      Case in point: only leave its outputs locally, don't add similar stuff to WordPress so you can "comply" with accessibility guidelines, just like it is used by Firefox already.

  • @EpiX0R
    @EpiX0R 23 days ago +9

    As a web developer I think this is one of the few good uses for AI. I saw an article title that read "I want AI to do my dishes and laundry so I have time for art and writing NOT AI to do art and writing so I have time for dishes and laundry."
    AI should not replicate human behaviour, AI should assist human behaviour to improve our efficiency in things we don't want to do. Alt text is a great example of an important but tedious task that often gets forgotten. AI is a great fit here, provided that it is not taken for granted. AI today is not good enough to replace humans - we still need to check it to ensure it is not hallucinating.

  • @TechJolt3d
    @TechJolt3d 23 days ago +7

    I actually use firefox pdf editor, its super useful for me lol. Also Firefox is doing something good, that is opt in and local, and is being done for improving the firefox user experience??
    Hope the model is trained on stuff thats not in copyright (or was fully paid for and got permission to do so), because I think thats the only way its ethical.

    • @RottenFishbone
      @RottenFishbone 23 days ago

      It seems less unethical to train AI used to generate text from an image using copyrighted images than the opposite. When you're converting an image to text its hard to say you didn't transform it. When you're generating images using copyrighted images you can often see artifacts of watermarks in the images (given certain parameters) and arguing that it was transformative enough seems like wishful thinking from AI enthusiasts.

  • @LinuxinaBit
    @LinuxinaBit 23 days ago +6

    Somebody came up with an actually good usecase for AI? I actually can't believe it!

  • @opposite342
    @opposite342 23 days ago +3

    I didn't know this isn't already in production since you can probably technically have a more power consuming version run on gpu-acceleration for a while now (like probably even when YOLO came out years before ChatGPT). But I guess it makes more sense now that it is much less process-taxing.

  • @shApYT
    @shApYT 22 days ago +2

    They could also build in local translation for images like foreign signboard or, for you weebs, manga panels. There is also Phi-3 vision and paligemma for GPT4o style descriptions.

  • @ltxr9973
    @ltxr9973 22 days ago

    The nuanced middle position is - if you want to process a mass of data and only need a reasonable amount of good results, not all good results, then AI is great. If you want to process one specific thing to get one specific result you'll get frustrated and it'll take you longer than doing it manually. Generating alt text is pretty smart if it doesn't use too many resources.

  • @TravisNewton1
    @TravisNewton1 22 days ago +2

    This is why I love using Ice Cubes for Mastodon. I hate describing images in my posts. So I let AI describe the image for me. I’m all for AI doing tasks like this.

  • @Herr_U
    @Herr_U 22 days ago

    Sounds like they hit a reasonable compromise.
    This however reminds me of a thing I sorely miss from the olden days of browsers - the ability to easily (hotkey) turn images on and off. It was so much easier to check for missed alt-tags with that (also, it made a lot of pages easier to read).

  • @michaelk__
    @michaelk__ 23 days ago

    Good enough image descriptions when they are missing is finally an actually useful use for LLMs!
    Nice to see it implemented fully locally with a small model that hopefully will be able to run reasonably well even on just a CPU!

  • @today273
    @today273 23 days ago

    This one is so interesting! Thank you for reporting it. It's so interesting that I may have to check it out and apt install firefox-nightly

  • @ivailogeimara
    @ivailogeimara 22 days ago +1

    FE dev here. The problem with that approach is that the alt text is going to describe the image, while in my experience in most cases it's more helpful to describe the purpose of the image with alt-text rather than the image itself.
    For example I have a button with an image inside. The alt text in this case is what that button would say if it was a text button, not a description of the image itself. Or sometimes I have a card of an item and the image is based on the type of that item (it's a thing that could be one of a few types). So in that case the alt does not describe the image itself but is something like "An image for a of that ".
    It's a good thing I enforce all images having an `alt` attribute with an ESLint rule so FF won't be generating anything for our projects (when they introduce it for all pages not just PDFs).
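
The kind of check mentioned above, failing when an image has no `alt` attribute, can be approximated outside of ESLint with Python's standard `html.parser`. This is a minimal sketch and not the commenter's actual rule; the names `MissingAltFinder` and `find_missing_alt` are made up for the example. An empty `alt=""` counts as present here, since that is the deliberate "decorative image" case.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that have no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        # Each entry is (line, column, src) of an offending tag.
        self.missing: List[Tuple[int, int, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            line, column = self.getpos()
            self.missing.append((line, column, attr_map.get("src")))


def find_missing_alt(html: str) -> List[Tuple[int, int, Optional[str]]]:
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="a.png" alt=""></p>\n<img src="b.png">'
    for line, column, src in find_missing_alt(page):
        print(f"line {line}, col {column}: <img src={src!r}> has no alt attribute")
```

A check like this only proves that an `alt` attribute exists; whether the text describes the image's purpose, as the comment argues it should, still needs a human.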

    • @void_vale
      @void_vale 22 days ago +2

      This is true, and we obviously can't outsource alt text generation to AI from a dev perspective, but from the perspective of a user, in a situation where the page provides absolutely no information on what an image contains, or what its purpose is, I think it's inarguable that this feature represents an improvement to the user experience.
      Besides, any image that doesn't have an alt text represents a failure on part of the developer, so even if a FF generated alt text is wrong and a user gets confused or reports it as an issue, you're still at fault for not adding proper accessibility metadata in the first place.

  • @moltony
    @moltony 22 days ago +2

    Finally AI being put into good use

  • @Alice_Fumo
    @Alice_Fumo 22 days ago

    Seems great. I've been personally really impressed with the open image captioning models and think they're quite a bit better than useless on almost all images.

  • @hnasheralneam
    @hnasheralneam 23 days ago +1

    The Firefox pdf editor is pretty nice, but standard for a web browser

  • @RadikAlice
    @RadikAlice 23 days ago

    Haven't used it much, but just like with page translation. I like this, and they could even reuse the same interface
    already in place for languages of pages to translate. Don't need alt text, but I like that platforms like Mastodon and Bluesky
    offer a toggle to not let you post an image until you add it. Helps me not forget, think making a default would make a lot more people aware

  • @adamgarlow5347
    @adamgarlow5347 23 days ago +5

    The fact that it's Mozilla, and not bundled with a difficult opt-out (Microsoft), makes this acceptable

  • @ai-spacedestructor
    @ai-spacedestructor 23 days ago +2

    i wouldn't say that this is the only good use case but this falls under repetitive tasks where it has to do one thing only, and i think in general many of those can also be use cases for ai which have no drawbacks to them, assuming it's implemented correctly.

  • @spatiumowl
    @spatiumowl 22 days ago +1

    As someone, who hates LLMs with a passion, I think this is an actually good use case (actually helping people) with an implementation that doesn't raise horrible concerns for me

  • @deltamico
    @deltamico 22 days ago

    That's great, it even avoids publishing generated alt texts that would be used for training future models and thus keeps the quality of scraped data

  • @nani8ot
    @nani8ot 22 days ago

    I knew about and used the Firefox PDF Editor beforehand. It's great because I have my browser always open anyways and use it for reading PDFs (e.g. assignments found on the uni website).
    The Firefox AI features are awesome, e.g. I've been searching for private website translations for years and never found a proper alternative besides Google web translate addons. This alt text accessibility feature is a great continuation of this private AI work they've done.

  • @thingymcjig788
    @thingymcjig788 22 days ago +1

    I've always thought that small models like this were much cooler than huge clouded models like GPT4. Small programs that do one task pretty well hashtag unix philosophy.

  • @StarlordStavanger
    @StarlordStavanger 23 days ago +10

    I'm sure it's gone unnoticed, but I appreciate seeing Firefox in your videos now more instead of Brave. Good on ya mate, we can't keep supporting chromium for as long as google controls it

    • @LautaroQ2812
      @LautaroQ2812 22 days ago

      Mozilla is funded by Google. So...?

    • @atijohn8135
      @atijohn8135 22 days ago

      @@LautaroQ2812 only because Google is forced to do it, so that it can't have a monopoly. it certainly would be better if it didn't receive it, but Firefox forks like LibreWolf and Pale Moon also exist.

    • @renner0395
      @renner0395 22 days ago

      @@LautaroQ2812 It's the lesser evil.

    • @darkm007
      @darkm007 22 days ago

      @@LautaroQ2812 Yea but it's more of a:
      Google: Here's money for your foundation to make my reputation better
      Mozilla: Nice this will work great on that project that will make you look really stupid
      Google: What?
      Mozilla: What?

    • @Mario583a
      @Mario583a 21 days ago

      @@LautaroQ2812 The only thing Google funds is the default search contract deal.

  • @jaskij
    @jaskij 22 days ago +2

    Just about the only real criticism I can think of is that it should be part of the screen reader, not the browser.
    There's the matter of resource utilization and battery life, but that's honestly just a question of good defaults and visibility.

  • @luketurner314
    @luketurner314 22 days ago

    10:10 OCR (Optical Character Recognition) for example

  • @darkenblade986
    @darkenblade986 22 days ago +1

    Finally a good use for ai. Big props to Firefox.

  • @hummel6364
    @hummel6364 22 days ago

    I have already seen some automatically generated alt-text on websites, those used "simple" machine vision technologies and were very limited, and often bad. A good on-device neural network could strike a good balance, although the end of the text should always be "This image description was automatically generated and may not be accurate."

  • @knghtbrd
    @knghtbrd 23 days ago +1

    I'm legally blind. I have some vision, and I will still be using this myself. I don't trust Mozilla not to collect data from me, I trust them to tell me if they're doing it and to give me the chance to say no and respect my decision. And it's important that this be done, because it makes the web more accessible to all.
    (And yes I knew about Mozilla's PDF editing features. Firefox is my primary browser because of multiple account containers (which Chromes of any color don't have) and temporary containers (3rd party thing). The result is that every new tab either goes into a pre-defined container (which retains its cookies) or a temporary container which is effectively a private browsing window, gone the instant I close it.)

  • @chloe-sunshine7
    @chloe-sunshine7 21 days ago

    I want a loop where you provide text, it generates an image based on that text, it generates text based on the resulting image, etc. etc.

  • @4nyNoob
    @4nyNoob 23 days ago

    damn, that zoom out is a game change, production value go brrrrrr

  • @NickNackGus
    @NickNackGus 23 days ago

    Sounds like a good use to me. I wonder how it'll work when deployed to work on web images as well, in particular for images where there is already alt text that appears when hovered over, but does not describe the image - like the extra jokes on xkcd's comics.

  • @Clawthorne
    @Clawthorne 22 days ago

    Huh. Finally an actual use for the NPUs which will be on every CPU from now on. Good job Mozilla!

  • @lesh4357
    @lesh4357 23 days ago

    I find this use and how it is implemented OK.
    Looking at the very good description by ChatGPT-4 and the shortened version of the text, two tags should be part of W3C standards. Maybe Alt-text and Alt-text-long, if not already existing.
    The long version being particularly useful to visually impaired people.
    As hardware/software becomes more efficient, the sort of description provided in the ChatGPT-4 example may become available locally. Web developers could generate and vet text locally before publishing, same for social media content posters. As a last resort, empty tags could be filled in locally at the reader's end.

  • @vxer
    @vxer 22 days ago +1

    FF pdf editor is awesome. Been using it for a while.

  • @kuhluhOG
    @kuhluhOG 22 days ago

    8:52 Ok, that's pretty surprising.
    Because the PDF viewers and editors of browsers (doesn't matter if Firefox or Chromium based) at this point became so good that if somebody doesn't need a dedicated one for their day to day stuff (like work or school), I don't think they need to install one.

  • @guss77
    @guss77 21 days ago

    I expect the situation in social media to be better than the average website - for example Instagram have been using AI generated alt text since forever.

  • @SteinGauslaaStrindhaug
    @SteinGauslaaStrindhaug 21 days ago

    5:24 A random human content editor will quite often _also_ focus on the completely wrong thing in an image when writing a manual alt-text too. Especially if they are creating alt-text for a generic stock photo or generic press photo of some person without knowing the context the image will be used in.

  • @JessicaFEREM
    @JessicaFEREM 23 days ago +1

    It would be cool if websites could hook into an API and have an option to generate alt text for images on the fediverse automatically.

  • @gsestream
    @gsestream 20 days ago

    alt text can also be the popup text when mouse overing

  • @rpeetz
    @rpeetz 22 days ago

    This is a really nice concept and totally worth use of AI, not to mention how noble of a subject it deals with(accessibility)

  • @DorE3k
    @DorE3k 22 дня назад

    That's a great use of AI, saving work no one wants to do and helping people who otherwise would not be able to interact with the page fully

  • @olliestudio45
    @olliestudio45 23 дня назад +15

    So this guy got a haircut but left that beard totally untouched?? Good for him. This is Linuxland.

    • @l0gic23
      @l0gic23 23 дня назад +1

      That cut is sharp

    • @_Lumiere_
      @_Lumiere_ 22 дня назад +2

      He's still a few decades away from the unix beard tho

  • @F_Around_and_find_out
    @F_Around_and_find_out 22 дня назад

    Mozilla's blessing of Rust and this are 2 of Mozilla's recent Ws. I support the way they implement the AI specifically: totally optional, plus a future plan to give the user an interface to manage and install AI models. This is actually good stuff.

  • @l0gic23
    @l0gic23 23 дня назад

    Nice job, Firefox. Good reporting

  • @moetocafe
    @moetocafe 23 дня назад +1

    Very good topic and video on it.
    Such micro-AI can be very helpful in many situations.
    AI is a tool (more a range of tools), as such it is not "good" nor "evil" by itself. How one uses it - to do good or bad - is something related to humans, not to AI.

  • @nanopi
    @nanopi 22 дня назад

    Killer feature maybe, big w for moz

  • @AM-yk5yd
    @AM-yk5yd 20 дней назад

    When I heard several months ago that Mozilla had jumped on the AI hype train I was sceptical, but so far they seem to deliver. Considering they've crowdsourced and released the Common Voice dataset, the next model is going to be TTS.

  • @luketurner314
    @luketurner314 22 дня назад +1

    As someone that's against AI for literally anything and everything (I also have a bit of a hype aversion in general), I might actually spin up an open-source, self-host-able LLM for alt text generation while web dev-ing

  • @torspedia
    @torspedia 22 дня назад

    This sounds rather useful, especially for those images where you just don't know how to describe them. Sounds like it will give you a good starting point, before translating it into your own words.

  • @magi5587
    @magi5587 22 дня назад +1

    what are the best AI text and AI image generators to use at the moment?

  • @doce3609
    @doce3609 23 дня назад

    This is the only real and working normal use of AI I have ever seen.
    Amazing I love it. Even though I will probably never use it

  • @romancvijanovic7130
    @romancvijanovic7130 22 дня назад +1

    AI is definitely overhyped but it shouldn't be demonized. Many are being closed-minded about it, but I don't see the reason why.

  • @youkofoxy
    @youkofoxy 22 дня назад

    Is a good idea.

  • @chadmckean9026
    @chadmckean9026 22 дня назад

    Sounds like a good use. That said, I do not use a screen reader, so I would choose not to use it. If an image did not load I would find it useful, but I would assume that if the image does not load, alt text cannot be generated.

  • @NeilHaskins
    @NeilHaskins 22 дня назад

    Yes, generating alt text sounds like a great use for AI, as long as it's specified in the text that it's AI generated. "Hallucinations" will probably always be an issue, but users should be able to understand that.

  • @SkylerLinux
    @SkylerLinux 21 день назад

    In limited fairness to the Baseline Model, she's touching the chair and could be seen as holding it. So it's not a crazy hallucination

  • @bleack8701
    @bleack8701 22 дня назад

    HDR has been a "bug" in Firefox for years, but never mind that. It's time to work on AI and get the implementation done in just a few months!
    It's a good feature for accessibility. I'm not complaining about the feature itself. Just questioning why they're dragging their feet on HDR so much when they get a feature like this out so quickly

  • @t1m3f0x
    @t1m3f0x 23 дня назад +1

    I just hope that it won't overwrite alt-text that is intentionally set to be blank in the future. I have a weird use case in a userscript where I need to set [alt=""] on an img element.

    • @Poldovico
      @Poldovico 23 дня назад

      that should be pretty easy to implement. It's probably good to have empty alt-text on some less esoteric stuff, too: I imagine you don't want screen-readers going "the logo" several times on every page.

    • @t1m3f0x
      @t1m3f0x 23 дня назад +1

      @@Poldovico It is implemented, I was saying I hope it won't overwrite that in the future. My use case is kind of a hack job that makes the img element 0px x 0px if it's broken (instead of showing a broken image icon). It's not really the right way to do what I'm trying to do, but it's what I (a self-taught hobbyist) can do to modify an existing script to do what I need it to, and it's good enough so I don't care. But that is a good point, about there being cases where you wouldn't want alt-text.
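
The distinction this thread turns on, between an image with no alt attribute at all and one with an intentionally empty alt="", can be checked mechanically. Below is a minimal sketch using only Python's standard html.parser; the names AltAudit and classify_images and the three status labels are made up for illustration, and nothing here is claimed to be how Firefox decides when to generate text.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collect every <img> and classify its alt attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.results = []  # list of (src, status) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if "alt" not in attributes:
            status = "missing"      # no alt at all: a candidate for generated text
        elif not (attributes["alt"] or "").strip():
            status = "decorative"   # alt="" set on purpose: leave it alone
        else:
            status = "described"    # author-written text: never overwrite
        self.results.append((src, status))


def classify_images(html):
    """Return (src, status) for each <img> in the given HTML string."""
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.results
```

Under this sketch, a generator that only ever touches "missing" images would leave a userscript's alt="" untouched.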

  • @paulstubbs7678
    @paulstubbs7678 22 дня назад

    Ah, alt text: the text you feel compelled to supply because HTML has a spot for it. But, never seeing any results or benefits from it, you soon lose interest and, receiving no negative responses, choose to spend your time on something that does bring 'something' for your effort.

  • @__christopher__
    @__christopher__ 22 дня назад

    Does it detect and omit design elements? Those would probably get descriptions that are not very useful.
    But it definitely should label tracking pixels as such!
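
One way to picture the filtering asked about here is a small heuristic that runs before any description is generated. The sketch below is only an assumption about what such a filter could look like: the function name classify_image, the labels, and the 1x1 size rule are invented for illustration, and the video does not say whether Firefox does anything similar.

```python
def classify_image(attrs):
    """Guess whether an <img> is content, decoration, or a likely tracking pixel.

    attrs is a plain dict of the element's HTML attributes (hypothetical input).
    """
    # Authors can explicitly mark an image as decorative for assistive tools.
    if attrs.get("role") in ("presentation", "none"):
        return "decorative"
    if attrs.get("aria-hidden") == "true":
        return "decorative"
    try:
        width = int(attrs.get("width", ""))
        height = int(attrs.get("height", ""))
    except (TypeError, ValueError):
        return "content"  # size unknown: assume it is real content
    # A declared size of 1x1 (or smaller) is the classic tracking-pixel shape.
    if width <= 1 and height <= 1:
        return "possible tracking pixel"
    return "content"
```

A real implementation would need more signals (rendered size, CSS, the image URL), so this should be read as a starting point rather than a reliable detector.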

  • @sprinklednights
    @sprinklednights 22 дня назад +1

    Okay, but Firefox still doesn't support the XDG Base Directory specification.

  • @Puzzlers100
    @Puzzlers100 22 дня назад

    One thing I think it needs is a small disclaimer, something like "AI alt text", to denote that the alt text was AI generated and may not be completely accurate.

  • @derdodel7978
    @derdodel7978 21 день назад

    This is the kind of thing AI should be used for. People just hate that big corps are trying to replace creative jobs like art and writing with AI, and that just leads to uninspired garbage or, at worst, misinformation in Google's case

  • @tq1238
    @tq1238 23 дня назад +2

    Seeing Brodie not have Brave open 👀👀👀I wonder if this is a one time thing.

    • @renner0395
      @renner0395 22 дня назад

      I think he has been using FF for quite some time now. Since around when he made the first videos about Plasma 6, I think. But he still burns our retinas with those light themes.

    • @tq1238
      @tq1238 22 дня назад

      @@renner0395 I must have not noticed it in the last few months. I'm assuming it's cus of MV3

  • @blenderpanzi
    @blenderpanzi 23 дня назад

    Yeah, accessibility (and data analysis in science and perhaps medicine) is probably the one good use of this kind of AI.

  • @oserodal2702
    @oserodal2702 23 дня назад

    Wow, firefox actually does something ahead of chrome/ium.

    • @renner0395
      @renner0395 22 дня назад +1

      corrected: Wow, firefox actually does something

  • @MisterDevel
    @MisterDevel 23 дня назад

    Damn, the one good use of this billion dollar technology

  • @groovecrusader5770
    @groovecrusader5770 23 дня назад

    The Firefox PDF editor is fairly new (one or two months, I think), so that explains why you didn't know about it.

  • @yaroslavpanych2067
    @yaroslavpanych2067 21 день назад

    About 'legal': if there are no regulations in place, it cannot be illegal. If it is not explicitly forbidden, it is allowed, and cannot be punished

  • @doingwell5629
    @doingwell5629 22 дня назад

    I hope it is an opt-in feature.

  • @logicalfundy
    @logicalfundy 22 дня назад

    I'm still a bit wary about the ultimate consequences of AI, especially as there seems to be a push towards truly generalized AI (and quite frankly, we may already be there and not know about it).
    That said - I do like Firefox's approach. Look for legitimate cases where it can be genuinely useful, and build from there. This is far better than slapping an AI button on everything you can, which is what I see a lot in other products.

  • @pravculear
    @pravculear 23 дня назад

    this is the type of ai automation i can get behind.

  • @gr33nDestiny
    @gr33nDestiny 23 дня назад

    Hope they do the organise tabs with AI thing

  • @bleack8701
    @bleack8701 22 дня назад +2

    You didn't know Firefox had a PDF editor? Doesn't every browser have that?

  • @Blaineworld
    @Blaineworld 22 дня назад +1

    finally, a good use for ai!!

  • @jupitersky
    @jupitersky 23 дня назад

    I HAVE BEEN SAYING THIS SINCE IMAGE GEN CAME OUT LETS GOOOOOOO

  • @mixenne
    @mixenne 12 дней назад

    My concern around automated alt text generation is that it feels as though people start to think that because of all the automatic ways to generate it, they don't need to do it themselves any longer.
    This is kind of something that seems to have happened on Mastodon that was once great with alt text, but now it feels like people have given up because they think that things like gpt4o are a perfectly valid replacement, which they are not.
    It's a good alternative to having some alt text where there was previously none, not a replacement.
    When you mentioned that a certain version of alt text was too long, I disagree. I always prefer alt text that goes into finest detail and helps me imagine the setting and stuff, although this may be more appropriate for social media rather than every single image on a website being described this way.
    Hallucination is a huge issue with AI, and so is censorship of inappropriate things (or things that the model detects as inappropriate anyway), which means that your access is limited.
    Consider Gemini. I wanted to know what a social media post screenshot said, and it refused to tell me because the content isn't something that it was designed to say. Turns out it was a post by someone who was ranting about a book that discusses children as soldiers or shields in wars, and making a point that if you need to consider or ask if it's OK to hurt children in war, you are a bad person. It was nothing graphic, but even if it was, I think it should try to phrase it in such a way that still protects people, but doesn't limit access.
    I've always said that my problem with machine learning isn't machine learning itself, it's when you combine machine learning, capitalism, and disregard for consent and privacy.
    At this point the high performance computing that LLMs do, which costs an arm and a leg to run and relies on basically slavery, on top of being trained on non-consenting people's data, is misrepresenting the whole of machine learning IMO.
    An on-device, offline machine learning trained on legal/ethical data is great.
    Bonus points if it's also open source.
    Heck, even online machine learning that makes you train your own model and doesn't use potentially copyrighted data of non-consenting people I see no issue with.
    The fact that people are throwing a fit over Mozilla's on-device machine learning feels really meh as a screen reader user.
    Considering that Apple especially has been using on-device machine learning for a while now, including their noise reduction, Siri, TTS voices, etc. It just feels like the marketing term AI has made everyone (understandably) jumpy and we probably need a better term for legal/ethical machine learning.
    I can't help but think of the adult video industry when I talk about ethical and legal stuff. Very sad in both cases that those aren't respected by default instead of being something special.

  • @MrAlanCristhian
    @MrAlanCristhian 23 дня назад +1

    My negative opinions about AI are about exaggerations, overhype and overpromises. I'm also against strapping AI onto everything to solve problems that nobody has.
    For example, speech synthesis is a legit and useful case. Email templating is barely better than regular templates. AI on search engines is horrible.
    As you said, Firefox has found a good and actually useful way to use it.

  • @chlorobyte_projects
    @chlorobyte_projects 22 дня назад

    I remember when AI was simply a bunch of experiments, before it went mainstream, before it became the money-making fad. This feels like a call back to those days.

  • @tomaszgasior772
    @tomaszgasior772 23 дня назад +8

    There are cases where an empty alt text is expected, and in those cases the alt text should not be filled in automatically by the browser.
    For example, if you have a UI notification built from a "warning" icon image and a notification message, you want the alt text of that image set to something like "warning: " or "important: ". However, if the notification is designed as a "warning" icon image followed by the word "warning" and then the notification itself, the icon is expected to have an empty alt - the alt would not serve any purpose for a blind person, as the image is used only as a decorative element.

    • @SemiDoge
      @SemiDoge 23 дня назад +6

      True, but if I were hard-of-sight, I would totally take redundant alt-text like that over no alt-text at all on images that should have it.

    • @greembow
      @greembow 23 дня назад +4

      You can mark images as decorative which makes screen readers and other tools disregard the images

    • @Sierra410
      @Sierra410 23 дня назад +4

      UI elements generally don't use img tags, though. Almost universally img tags represent content. UI pictograms are generally done with svg tags, css (be it background-image or pure CSS shapes), or fonts (e.g. fontawesome).

    • @agsystems8220
      @agsystems8220 23 дня назад +1

      I would disagree. If you have an image saying warning, and text saying warning, it is emphasised for sighted people. The repetition applies a similar level of emphasis to a partially sighted person.
      We are a while away from that anyway; this is aimed at the PDF editor (which I kinda want to try now). As much as descriptions of images are useful, for browsing it really does need to take the context into account, so it wants to be a transformer on the whole page when we get to that point. Then we could fine-tune it to ignore redundant images anyway. "It might get it wrong" is not a good reason for criticising an assistant tool, particularly given it almost certainly will in unexpected and creative ways!
      Deliberate omission seems such a corner case (and unhelpful, because you can never know whether it was deliberate) that I don't see a good reason for pushback here. In those cases what you really want is an AI looking at the warning image, seeing it is followed by warning text, and replacing the missing alt text with an empty string. A blind person without assistance will have no idea what is in the image. They just see an image with missing alt text. It's probably a redundant warning image, but they actually have no idea.
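
The warning-icon cases argued over in this thread are easier to compare side by side. The sketch below is a deliberately simplified model of what gets read aloud; the function announce and the generic "image" placeholder for a missing alt are assumptions made for illustration, since real screen readers differ in how they handle an image with no alt attribute.

```python
def announce(parts):
    """Join the pieces a (simplified, hypothetical) screen reader would speak.

    parts is a list of ("img", alt) or ("text", string) pairs, where alt is
    None when the attribute is missing and "" when it is intentionally empty.
    """
    spoken = []
    for kind, value in parts:
        if kind == "text":
            spoken.append(value)
        elif value is None:
            spoken.append("image")  # missing alt: assumed generic placeholder
        elif value != "":
            spoken.append(value)    # non-empty alt is read like text
        # alt="" falls through: the image is skipped entirely
    return " ".join(spoken)


# Icon carries the label, message follows.
print(announce([("img", "Warning:"), ("text", "Disk almost full")]))
# Icon is decorative because the text already says "Warning".
print(announce([("img", ""), ("text", "Warning: disk almost full")]))
# Same layout with a redundant alt: the word is spoken twice.
print(announce([("img", "Warning"), ("text", "Warning: disk almost full")]))
# Same layout with no alt at all: the listener cannot tell what the image is.
print(announce([("img", None), ("text", "Warning: disk almost full")]))
```

The third and fourth cases are the two sides of the disagreement: redundancy that may read as emphasis, versus an unexplained image.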

  • @memesfromtheforsakenworlwi9218
    @memesfromtheforsakenworlwi9218 22 дня назад

    Ok, this is genius

  • @zxuiji
    @zxuiji 22 дня назад

    8:55 Yeah I basically use it exclusively for PDF reading now, what's the point in installing something else if the builtin one does me just fine? As for creating them I can just export from libreoffice.

  • @natrixnatrix
    @natrixnatrix 22 дня назад

    Instead of generating alt text it should recognise images with a missing or wrong alt text and not display them. Thus forcing everyone to comply with the html spec.

  • @jenbanim
    @jenbanim 23 дня назад

    Everyone who was freaking out about "Firefox adding AI" owes Mozilla an apology and needs to learn to wait until details are available to make judgements

  • @taith2
    @taith2 18 дней назад

    Makes me wonder if people from Danbooru are hired to tag these to train LLMs, as they go over the top with tagging