ElixirDaze 2016 - Processing 2.7 million images with Elixir (vs Ruby) by David Padilla

  • Published: Oct 4, 2024

Comments • 70

  • @johnjames282
    @johnjames282 8 years ago +49

    good talk, please consider not using white on yellow slides. Most of those slides were very unreadable.

  • @supernewuser
    @supernewuser 8 years ago +58

    Whoever told you bright yellow slides with white text were acceptable is acting against your best interests.

    • @labcrowd
      @labcrowd 7 years ago +4

      Agreed. It was like putting hot lava stones on my eyes..

  • @nonefvnfvnjnjnjevjenjvonej3384
    @nonefvnfvnjnjnjevjenjvonej3384 4 years ago +8

    Another option would be to resize the images using CSS and have a background job resize just those images, so next time they will be the right size. That way you do it in chunks.

  • @amerispunk
    @amerispunk 7 years ago +10

    Great video. I like how you walked us through every step as you're learning all this for yourself. Very entertaining, as well! The color of your slides in a few cases could have been better, though, as was previously mentioned.

  • @sureseam
    @sureseam 8 years ago +3

    Communicated very well - thank you!

  • @balavarikuti
    @balavarikuti 8 years ago +3

    Great talk. I watched it all the way through. I especially liked the last piece, an Erlang server handling other servers.

  • @jpmohan96
    @jpmohan96 7 years ago +8

    Why wasn't CSS used for resizing images?

    • @goodcyrus
      @goodcyrus 6 years ago

      That's what I thought from the beginning. I thought someone there would stop his talk, or bring it up in the comments. Only 2 people mentioned this.

    • @Itachi.Uchiha.Offical
      @Itachi.Uchiha.Offical 5 years ago +4

      1) Decrease loading time (browser)
      2) Save storage (some kB make a difference on 2.7m images)

    • @smonkey001
      @smonkey001 5 years ago +5

      CSS is the correct 1-day solution.

  • @ThePhan7em
    @ThePhan7em 4 years ago +6

    A better solution would have been to dynamically resize the images as they are requested. That way you only resize the images that users are actually requesting. Once the image has been resized, store that on S3 and serve that going forward. Still an informative presentation, thank you.

    • @Qrzychu92
      @Qrzychu92 3 years ago

      There is one problem with that: it takes 2s to do that per image, right? So, even if you did that concurrently, the images would still all appear 2s after the page loaded.

    • @ThePhan7em
      @ThePhan7em 3 years ago +2

      @@Qrzychu92 Even if that were the case, that would happen once. To avoid this, you could show a thumbnail in its place until the resized image is ready. That is, if you have a thumbnail (which you should) to show to begin with; if not, we are back to your point, which is valid.

    • @Qrzychu92
      @Qrzychu92 3 years ago

      @@ThePhan7em yes, once for every image :) If you were unlucky, you could be the user that triggers it on every image you see.
      It all depends on whether this is acceptable in the app context. In this case, where users rely heavily on images in decision making, I think parsing all images upfront was a good decision.

    • @ThePhan7em
      @ThePhan7em 3 years ago +1

      @@Qrzychu92 I don't agree tbh. You may end up paying for storing images that may only be seen once, then storing it at cost. Have a look at imgix.com. You can roll your own with AWS Lambda or the like with relative ease.

    • @Qrzychu92
      @Qrzychu92 3 years ago

      @@ThePhan7em then why even bother with scaling the images on upload when you can scale them on demand?
      It's possible that storage is cheaper than running AWS Lambda. Like I said, it all depends. Imagine Instagram scaling images when people first see them. Would everyone notice? Probably not.
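
The resize-on-first-request idea debated in this thread can be sketched in a few lines. This is only an illustration under stated assumptions: a plain dict stands in for the S3 bucket, `fake_resize` stands in for a real image library, and the `resized/<width>/<key>` naming scheme is hypothetical, not something from the talk.

```python
from typing import Callable, Dict


def resized_key(key: str, width: int) -> str:
    """Derive the storage key for a resized variant (naming scheme is hypothetical)."""
    return f"resized/{width}/{key}"


def get_resized(store: Dict[str, bytes], key: str, width: int,
                resize: Callable[[bytes, int], bytes]) -> bytes:
    """Return the resized image, creating and caching it on the first request."""
    cached_key = resized_key(key, width)
    if cached_key not in store:
        # First request for this size: pay the resize cost once and keep the result.
        store[cached_key] = resize(store[key], width)
    return store[cached_key]


# Stand-ins for illustration only: a dict plays the role of the bucket, and
# fake_resize records its calls instead of touching real image data.
calls = []


def fake_resize(data: bytes, width: int) -> bytes:
    calls.append(width)
    return data[:width]


bucket = {"listing/1.jpg": b"x" * 1000}
first = get_resized(bucket, "listing/1.jpg", 300, fake_resize)
second = get_resized(bucket, "listing/1.jpg", 300, fake_resize)
```

Only the first call pays the resize cost, which is the trade-off argued over above: the first viewer of each image waits, and every later viewer gets the stored copy.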

  • @AdrianoMitre
    @AdrianoMitre 7 years ago +11

    Wouldn't it be easier to simply have resorted to GNU parallel and ImageMagick?

    • @batlin
      @batlin 7 years ago

      Probably, but you'd have to wrap each ImageMagick invocation in calls to something like s3cmd get/put, and like he said in the talk, the overhead of starting and setting up s3cmd processes each time makes it slow.

    • @eafadeev
      @eafadeev 7 years ago

      He could have made the S3 put calls from threads, which would dramatically parallelize uploading of the resized images. Resizing can be done with multiprocessing to work around the GIL issue. I.e. Ruby (or Python or any other language x) would do the job just as fast.

    •  4 years ago +2

      He said he wanted to try Elixir.
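
The suggestion in this thread to issue the S3 puts from threads could look roughly like the sketch below. The `upload` function is a hypothetical stand-in that sleeps to mimic network latency; a real version would call an S3 client. Threads help here because the work is mostly waiting on I/O, during which the interpreter lock is released.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def upload(key: str) -> str:
    """Hypothetical stand-in for an S3 put; sleeps to mimic network latency."""
    time.sleep(0.05)
    return key


def upload_all(keys: List[str], max_threads: int = 16) -> List[str]:
    """Upload all keys using a pool of threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(upload, keys))


keys = [f"resized/{i}.jpg" for i in range(32)]
start = time.perf_counter()
uploaded = upload_all(keys)
elapsed = time.perf_counter() - start  # far less than 32 sequential sleeps
```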

  • @hvgpilaatkaiken2300
    @hvgpilaatkaiken2300 5 years ago +3

    Fun talk, but the story about time pressure is obviously BS. If you have to solve a trivially data parallel task asap, you don't spend 12 days learning a new language because it's designed for massive parallelism and run it on a single node for 4 days.
    You split the data into N batches, where N is the number of cores, and run N instances of whatever global interpreter locked language you're comfortable with and be done 12 days faster. Or split the data into 100 batches, make a shell script to kick up 100 instances in EC2, run your program on each and be done before lunch...
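
A minimal sketch of the "split into N batches, one process per batch" approach described in this comment. The function names are illustrative assumptions, and `process_batch` is a placeholder where the real download, resize, and upload work would go.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Sequence


def split_into_batches(items: Sequence[str], n: int) -> List[List[str]]:
    """Split items into n batches whose sizes differ by at most one."""
    return [list(items[i::n]) for i in range(n)]


def process_batch(batch: List[str]) -> int:
    """Hypothetical worker: would download, resize, and upload each key."""
    for key in batch:
        pass  # real per-image work would go here
    return len(batch)


def run_batches(items: Sequence[str], n: int) -> int:
    """Run one worker process per batch and return how many items were handled."""
    batches = split_into_batches(items, n)
    with ProcessPoolExecutor(max_workers=n) as pool:
        return sum(pool.map(process_batch, batches))
```

On a single machine, `run_batches(keys, os.cpu_count())` would be called under an `if __name__ == "__main__":` guard. The same `split_into_batches` output could instead be written to files and handed to separate machines, as the comment suggests.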

  • @nicanorperera
    @nicanorperera 8 years ago

    Thanks! Very good presentation.

  • @ferdinandenario9137
    @ferdinandenario9137 8 years ago +3

    Nice talk on Elixir! thanks

  • @Thetastygamers
    @Thetastygamers 7 years ago

    Perfect speech. Thank you

  • @morkhoudia9
    @morkhoudia9 4 months ago

    Great talk

  • @BorisBarroso
    @BorisBarroso 8 years ago +1

    Great, good presentation. I'm still learning Elixir and trying to switch from Ruby to Elixir :)

  • @ArquitectoR
    @ArquitectoR 1 year ago

    Mogrify is just a wrapper around command-line calls to ImageMagick, so it has the same issues as the first attempt to upload to S3.
    Bindings to the Magick++ library, or Vix, should be used instead.

  • @iansoulful
    @iansoulful 8 years ago +10

    I wonder if using AWS Lambda would have saved money, developer time, and download and upload time for the images. It would probably be cheaper and much faster. Nevertheless, great talk and information on Elixir :)

    • @marksargento840
      @marksargento840 8 years ago +2

      I guess it's not possible to use Lambda on Digital Ocean (they're not on EC2, from what I heard in the video), but I agree with you, they would have saved time and money. Great video though!

    • @omarlujan1809
      @omarlujan1809 8 years ago

      probs

    • @RistoNovik
      @RistoNovik 8 years ago +2

      Well, the images are already stored in AWS S3, so there is no problem spinning up Lambda to do the batch work. So there would be no need to waste extra time on downloading and uploading.

  • @christianrojasgar
    @christianrojasgar 8 years ago +1

    Awesome talk!, thanks

  • @정동호-i1r
    @정동호-i1r 5 years ago +1

    It's very helpful for me, as I'm new to Elixir.

  • @giftoflife6921
    @giftoflife6921 2 years ago

    Before you upload your videos it might help if you view it from beginning to end to see if it's good.
    My point is this. You displayed some white-colored text against a bright yellow background, which makes for difficult reading.

  • @RafidelisMaker
    @RafidelisMaker 7 years ago +2

    I'm a syntax guy too! Nice talk, thanks for sharing.

  • @elie2222
    @elie2222 8 years ago +1

    Why were you limited to 20 cores? Couldn't you have run the job on multiple machines? 4 * 20 for example (splitting up the images into 4 groups) and had it all done in a day?
    (I didn't watch the whole thing. Sorry if I missed the answer)

    • @vasumahesh3800
      @vasumahesh3800 8 years ago +6

      You are right, Elixir can be deployed to different machines and be interconnected as well. But I guess as per the video the guy didn't have much time. But still a wonderful video.

  • @emanuelquimper697
    @emanuelquimper697 7 years ago +5

    I want this color syntax :(

  • @InsanityNerve
    @InsanityNerve 7 years ago +1

    Great talk!

  • @ultort
    @ultort 4 months ago

    Really cool, I love Elixir, but using it for a one-shot image resize is really overkill, especially if, as stated, this was something needed ASAP. 3M records is manageable: extract them to a file, split it into N*k files, write a batch script that downloads the images, processes them and uploads them, send it to N machines, start k processes that run concurrently on each machine, and wait. If you have 5 machines with 20 cores, even at 1.6s/image (probably can be improved with batch processing), it will only take 13 hours; the next day your images would be ready. Enjoy your free time to learn more about Elixir.
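
The 13-hour figure in this comment can be checked with a short calculation. The sketch assumes the work divides evenly across all worker processes and ignores transfer and startup overhead; the function name is illustrative.

```python
def estimated_hours(images: int, seconds_per_image: float,
                    machines: int, processes_per_machine: int) -> float:
    """Wall-clock hours if the work divides evenly across all worker processes."""
    workers = machines * processes_per_machine
    return images * seconds_per_image / workers / 3600


# Figures from the comment: 3M records, 1.6 s/image, 5 machines with 20 cores each.
comment_estimate = round(estimated_hours(3_000_000, 1.6, 5, 20), 1)  # about 13.3 hours

# The 2.7 million images from the talk title, under the same assumptions.
title_estimate = round(estimated_hours(2_700_000, 1.6, 5, 20), 1)  # about 12 hours
```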

  • @haystackdmilith
    @haystackdmilith 8 years ago +2

    Like for that Ruby-Threading troll

  • @pookachu64
    @pookachu64 7 years ago

    Flaw in the logic of using Digital Ocean: you don't pay for transfer costs between S3 and EC2. Do the whole thing in AWS!

  • @lostcodex
    @lostcodex 8 years ago +3

    Did you open source your work on github?

    • @btc-btc-net
      @btc-btc-net 7 years ago +8

      github.com/dabit/elixir_images

  • @btc-btc-net
    @btc-btc-net 7 years ago +1

    Good talk.

  • @klancaster1957
    @klancaster1957 8 years ago +1

    Nice talk!

  • @michaelkohlhaas4427
    @michaelkohlhaas4427 3 years ago

    *Why doing it the easy way if we can make it complicated?*

  • @romenigld
    @romenigld 7 years ago

    Nice talk, congrats!

  • @leo11877
    @leo11877 7 years ago +2

    or maybe use responsive CSS images?

  • @ThrashAbaddon
    @ThrashAbaddon 7 years ago

    Good and fun talk. :)

  • @subratamajumdar13
    @subratamajumdar13 7 months ago

    It is HTTPoison not HTTPotion!

  • @blu3h4t
    @blu3h4t 1 year ago

    maybe if he typed how to upload to s3 with erlang instead of elixir he would find something, if you are already on the otp maybe was worth a try

    • @blu3h4t
      @blu3h4t 1 year ago

      ah ok i c i was too fast :D

  • @christopheverbinnen3626
    @christopheverbinnen3626 7 years ago +5

    Ever heard of Sidekiq?

    • @_tachyons
      @_tachyons 7 years ago

      Christophe Verbinnen AFAIK Sidekiq still uses a single CPU core unless you purchase their Enterprise version. Or am I missing something?

    • @christopheverbinnen3626
      @christopheverbinnen3626 7 years ago +1

      Unless you use JRuby or you start one Sidekiq process per core. Pretty straightforward to set up.

  • @lr5867
    @lr5867 7 years ago +1

    Arguably he might've done the same batch execution w/ C++, using a Ruby supervisor task to spawn as many small processes to accomplish the same task. Speed differential .... maybe gain another 33% - 50% in performance??

    • @lr5867
      @lr5867 7 years ago

      As for resizing images ... well, they'll load faster, but he could've just as well changed the CSS to scale the images instead. I think I could've figured a way to do it in a half hour but tell the boss it took 2 days ... :)

    • @batlin
      @batlin 7 years ago

      Yeah I was wondering why the images needed to be resized at all... maybe the quality of dynamically scaling them down in the browser is slightly less than you can get with ImageMagick.

    • @techorb5799
      @techorb5799 7 years ago

      This is exactly how I handled it with Ruby 1.9. In my case I was creating millions of animated 2-frame banner ads. I used MiniMagick to create the frames, then a command shell call out to the `convert` utility to animate the frames together. Each batch gets fed to a process which is forked and processes the batch sequentially. Get a big AWS instance with lots of cores to handle all the processes and start spawning forks until the machine falls over, then dial back a couple. It could generate millions of images in a matter of hours.

  • @juliolinarezescobar
    @juliolinarezescobar 11 months ago

    Now this 15:20 it is so different

  • @NikolajLepka
    @NikolajLepka 7 years ago

    I don't get why he explains elixir features at an elixir conference... you'd assume people at an elixir conference already know how to use elixir

    • @_tachyons
      @_tachyons 7 years ago +2

      Niko L Not really. In every conference there will be people with different levels of expertise. Especially with a new programming language like Elixir, there will be many people who came to explore the basics of Elixir.

  • @torvic99
    @torvic99 6 years ago

    Latin America #1

  • @edugonch2
    @edugonch2 7 years ago

    Next time you just use Cloudinary XD, nice talk by the way.

    • @pablocacaster
      @pablocacaster 7 years ago

      So you wanna pay 140 a month for a slow service?