The Value Of Big Data Engineering | Jesse Anderson In The Engineering Room Ep. 8

  • Published: 21 Dec 2024

Comments • 18

  • @dandogamer
    @dandogamer 2 years ago +3

    This was a great talk, really enjoyed listening to this as a software engineer who is starting my first data engineering role! I've just read the Martin Fowler data mesh article, and I'm glad Jesse mentioned that it is for big enterprise companies, because that's something the article missed, and I can see many companies just adopting it as the latest standard in DE without considering certain factors!

  • @martinnyolt173
    @martinnyolt173 2 years ago +2

    I work as a software engineer and have always had a programming-focused background. But I also wrote my PhD in AI (where I was working on inference algorithms), and thus also had to do some data science (in research still, not on "big data" for large companies). I think the hardest issue as a data scientist is really the exploratory research you are doing. Data scientists have to explore a lot of feature selection, algorithm selection, parameter tuning, etc. It is much less straightforward than implementing user stories. For the former, you have to back up much more often. On the flip side, if you have a good set of validation and test data, you can train and test your models without deploying to production often. So I find it hard to imagine having something akin to continuous delivery for data scientists. I know there are approaches, but you most certainly can't just use traditional practices.
    Once you have a working model and you just update your training data, of course you don't have these problems anymore. But then you also don't make many changes to your code; it's just updating the model with the latest data and validating that it still performs well.

    • @martinnyolt173
      @martinnyolt173 2 years ago

      Oh, and I think XML has to be one of the most misused formats for machine-to-machine communication.

    • @stephenhall1453
      @stephenhall1453 2 years ago

      Hi,
      I agree about the exploratory phase in data science. It is the most time-consuming part of all. That is also where the collaboration between data engineers and data scientists is vital. In an ideal world, the data scientist also does good data engineering. Otherwise the exploration grinds to a halt due to complexity in the data pipeline.
      That is a lot to ask of a data scientist. That is why good software design principles, small multidisciplinary teams, and pair programming are also vital in a data science project.

    • @jessetanderson
      @jessetanderson 2 years ago

      I talk quite a bit about these differences in my writing and talks. Glad you enjoyed it.

  • @brownhorsesoftware3605
    @brownhorsesoftware3605 2 years ago +2

    Thanks for yet another interesting, informative, and thought-provoking video.
    It made me think about my experience with teams and experts.
    The best and most successful teams I've worked on were small and everyone was an expert. If there was something outside our expertise we learned it or added another person as needed.
    I've also been an expert passed around to other teams. My experience with that was that no good deed goes unpunished. It was also a path to burnout: the product would ship and everyone would go on vacation except the expert, who was needed for some other project. I usually felt more like a software cleaning lady than an expert.

  • @GenaboyA
    @GenaboyA 2 years ago

    Very interesting show, thanks team.

  • @datasciencesolutions2361
    @datasciencesolutions2361 2 years ago

    Great stuff!

  • @MartinsTalbergs
    @MartinsTalbergs 2 years ago

    Can you please link the course that Jesse offers?

    • @ContinuousDelivery
      @ContinuousDelivery 2 years ago +2

      I've added the link in the video details. Jesse Anderson's Data Engineering Courses: www.jesse-anderson.com/courses/

  • @no_more_free_nicks
    @no_more_free_nicks 2 years ago

    It is funny, I now have two job offers: one for a Scala FP developer, and a second one for a Data Engineer in a large company that uses data mesh.

    • @centerfield6339
      @centerfield6339 2 years ago

      The former sounds better. Most data engineering is just rebadged ETL.

    • @Shapar95
      @Shapar95 2 years ago

      @centerfield6339 Interesting. Isn't there a lot more opportunity for data engineering?

    • @centerfield6339
      @centerfield6339 2 years ago

      @Shapar95 There is, but only because there are a lot of boring data jobs. There are about the same number of interesting jobs in each option, but the chances of the average Scala/FP job being cool are much higher.

  • @tomatohero8015
    @tomatohero8015 2 years ago +1

    1:02:19 - Jesse might be talking about the paper "Hidden Technical Debt in Machine Learning Systems" proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf