The Problem with Research Software Engineering

  • Published: Jul 1, 2024
  • A discussion about how to make research software engineering a bit better!
    Github sponsors (Patreon for code): github.com/sponsors/leios
    Twitch: / leioslabs
    Discord: / discord
    Github: github.com/leios
    Bibliography
    [1] joss.theoj.org/papers/10.2110...
    [2] journals.aps.org/prfluids/abs...
  • Science

Comments • 139

  • @LeiosLabs
    @LeiosLabs  4 years ago +37

    A bit of a different video today about something that's been on my mind. I know it's a bit of a rant and more or less a clip from my livestream, but I thought some people might benefit from it! Let me know if you like this type of content as well. If so, I am happy to do more "lecture-style" videos on various topics.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      What do you mean, exactly? Like a better code review process to go along with the peer review?

    • @PerpetualHope
      @PerpetualHope 4 years ago

      Have you tried putting this in the science twitter sphere? Could generate much-needed discussion there among researchers

    • @Crushnaut
      @Crushnaut 4 years ago

      TIL some people's rants have powerpoints

    • @gmt-yt
      @gmt-yt 4 years ago +1

      Hell, yes! Also media. The problem is particularly noticeable in libraries where one project's "issues" tend to infect other, "innocent" projects, who thought they would just be consuming an "API" and instead ingested a can of worms. Sci/media projects tend to be rife with platform and abandon-ware dependencies, idiosyncratic build frameworks, undocumented behavior, hard-coded constants, nonstandard object persistence, premature optimization, low-level programming abuse, non-adherence to coding standards, and any other offense to maintainability and code-safety imaginable. It's not your imagination.

    • @foobars3816
      @foobars3816 3 years ago

      Often the code I see coming from people with a scientific background also tends to use a lot of single-letter variable names instead of descriptive ones that could aid in understanding the code. Has this been your experience? I was surprised it wasn't mentioned.

  • @bartzijlstra3193
    @bartzijlstra3193 4 years ago +45

    Having recently started as a "real" software engineer after finishing my PhD, I recognize many of these problems. We did do version control and unit-testing for our research software, but I often passed up on good software documentation in favor of writing the actual research articles. I've also had many requests from colleagues to share my code for making high-quality graphs. Most of the time I had to reply with: "You can have my code, but it won't work directly on any other data than mine. Please take my code as-is, and use it as an example to try writing something of your own." I know I could have made my graphing tools much more modular and general, but at the end of the day I needed to have my thesis finished.

    • @LeiosLabs
      @LeiosLabs  4 years ago +8

      Honestly, using version control and doing proper testing is still pretty good! In my opinion, software is only really as useful as its documentation, but if the code was meant as a script for a single publication and not meant for reuse, I think it's acceptable to have less documentation... As long as people can still understand the code enough to replicate the results!

  • @rentristandelacruz
    @rentristandelacruz 4 years ago +41

    I worked as a research assistant in a chemistry laboratory that primarily deals with simulation. The lab head is still using a FORTRAN code for nucleation simulation. I believe that code is at least 20 years old. When I tried to read the code, I found variables like 'xxx' and 'yyy'.

    • @apurbabiswas7218
      @apurbabiswas7218 4 years ago +9

      I'm worried about working with an old code base myself. That seems dreadful

    • @LeiosLabs
      @LeiosLabs  4 years ago +9

      Haha. I've definitely had a similar experience! I actually like fortran, but bad code is bad code.

    • @altaroffire56
      @altaroffire56 4 years ago +12

      In my experience, Fortran is very convenient for that kind of work. (Fast, low-level, clean syntax, built-in support for matrices and complex numbers... it feels like the right tool for the job. With C, on the other hand, it often feels like working against the language to get it to do what I want.)
      A determined scientist can write unreadable code in any language, though.

    • @LeiosLabs
      @LeiosLabs  4 years ago +11

      @altaroffire56 A lot of people say Fortran is bad, but I 100% agree with you. Usually Fortran is seen as a bad programming language only because the people programming in it use bad programming practices.

    • @coryrobertson6367
      @coryrobertson6367 4 years ago +5

      @LeiosLabs Let's not forget that what are now bad programming practices may have been the best available at the time. Short identifiers due to length restrictions, terse programs due to small screen sizes, few inline comments due to file limitations, etc. It is the improvements in technology that have allowed us to write more human-readable code.

  • @AngryArmadillo
    @AngryArmadillo 4 years ago +55

    As someone who has worked in both pure software development and pure CS research positions, I completely agree. Especially when it comes to documentation and peer review of code, I'm shocked by the lack of standardization. Asking a researcher for access to their code is a true roll of the dice.

    • @LeiosLabs
      @LeiosLabs  4 years ago +10

      This has been my experience as well, but I definitely started on the academic side. The moment you leave the academic bubble, you start to realize how poor the software standards actually are in academia.

  • @zebulon220
    @zebulon220 4 years ago +28

    Congrats on your PhD, I completely agree with everything you say in this video.

    • @LeiosLabs
      @LeiosLabs  4 years ago +3

      Thanks! I'm hoping to start a discussion and maybe get researchers to think a bit more about their code.

  • @Axman6
    @Axman6 4 years ago +15

    My experience with researchers writing code was that the piece of software they needed most was git. So much version-control-via-making-copies-and-emailing-it-to-yourself.

    • @LeiosLabs
      @LeiosLabs  4 years ago +7

      Who needs version control when you have dropbox?

    • @Axman6
      @Axman6 4 years ago +6

      LeiosOS stop it, please 😭

  • @tallon3925
    @tallon3925 3 years ago +1

    found you through OIST's youtube channel, love your videos! thanks for sharing your passions

    • @LeiosLabs
      @LeiosLabs  3 years ago

      Oh, cool! Happy to see you are checking out OIST's content! They've really been doing their best to put out cool, compelling content recently!

  • @HatersGonnaHate4
    @HatersGonnaHate4 1 year ago

    This video has really made clear some issues I've noticed at my current (research focussed) job and it's very satisfying to hear it stated succinctly

  • @Pa_Nic
    @Pa_Nic 4 years ago +12

    Competitive programmers may be able to help. You can get relatively clean and simple code from very complex new algorithms if you ask competitive programmers. We are trained to code common algorithms really quickly and occasionally search for better (faster, more memory efficient, working online, etc.) algorithms to implement so we can use them as "secret weapons" during contests.
    As an example: given a tree graph of N nodes, it is widely known that you can find its centroid decomposition in O(N log N) time. However, a quick Google search will lead you to a paper demonstrating O(N) centroid decomposition which has no code. To verify, we usually just read the paper, code the algorithm ourselves, and stress test it against the verified slower algorithm with thousands of randomly generated cases.
    Might it be possible for researchers to get competitive programmers to verify their work?
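The stress-testing workflow described above can be sketched in a few lines. This is only a minimal illustration, not the O(N) centroid decomposition from the paper mentioned: it finds just the centroid(s) of a tree, and every name in it (`random_tree`, `centroids_slow`, `centroids_fast`, `stress_test`) is made up for this example. The idea is the same, though: compare a fast implementation against an obviously correct slow one on many small random inputs.

```python
import random


def random_tree(n, rng):
    """Edges of a random tree on nodes 0..n-1: each node attaches to an earlier one."""
    return [(rng.randrange(i), i) for i in range(1, n)]


def build_adjacency(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def largest_component_without(adj, removed):
    """Size of the largest connected piece left after deleting `removed`."""
    seen = [False] * len(adj)
    seen[removed] = True
    best = 0
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append(nxt)
        best = max(best, size)
    return best


def centroids_slow(adj):
    """Reference O(N^2) answer: try deleting every node in turn."""
    n = len(adj)
    return sorted(v for v in range(n) if largest_component_without(adj, v) <= n // 2)


def centroids_fast(adj):
    """O(N) answer from subtree sizes, using one iterative DFS rooted at node 0."""
    n = len(adj)
    parent = [-1] * n
    visited = [False] * n
    visited[0] = True
    order, stack = [], [0]
    while stack:
        node = stack.pop()
        order.append(node)
        for nxt in adj[node]:
            if not visited[nxt]:
                visited[nxt] = True
                parent[nxt] = node
                stack.append(nxt)
    size = [1] * n
    heaviest_child = [0] * n
    # Parents always precede their children in `order`, so walk it backwards.
    for node in reversed(order):
        p = parent[node]
        if p != -1:
            size[p] += size[node]
            heaviest_child[p] = max(heaviest_child[p], size[node])
    return sorted(v for v in range(n) if max(heaviest_child[v], n - size[v]) <= n // 2)


def stress_test(trials=500, max_nodes=30, seed=0):
    """Compare both implementations on many small random trees."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, max_nodes)
        adj = build_adjacency(n, random_tree(n, rng))
        fast, slow = centroids_fast(adj), centroids_slow(adj)
        assert fast == slow, f"mismatch on n={n}: fast={fast}, slow={slow}"
    return trials
```

The inputs are kept small on purpose: when a mismatch shows up, a failing tree with a dozen nodes is far easier to debug by hand than one with a million.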

    • @LeiosLabs
      @LeiosLabs  4 years ago +6

      I don't think that would be a bad idea. I think the best bet is to teach researchers to think more like competitive programmers in this case.

    • @fa-pm5dr
      @fa-pm5dr 4 years ago +3

      @LeiosLabs Competitive programming has one of the steepest learning curves I have seen.

    • @milobem4458
      @milobem4458 3 years ago

      How do you get them to work together? Academia has a very traditional structure. Software engineers are sometimes hired as "lab assistants", which means everyone ignores them until the last minute, when some bug shows up in a big mess of unreadable code. If they get accepted into an academic position, like PhD student or post-doc, they are pressured to publish their own work ASAP or get lost.

  • @apurbabiswas7218
    @apurbabiswas7218 4 years ago +25

    This was very helpful. I'm going to look more into JOSS.
    As a physicist interested in scientific computing, unit testing seems like almost a foreign concept, and I feel fairly inadequate compared to my computer science peers.
    I've had enough exposure to the importance of version control that it prompted me to learn git myself. For anyone else in a similar position, look at the MIT Missing Semester (Jan 2020 IAP) for similar computer-sciencey "filler" education.
    More videos about CliMA would be cool : )

    • @LeiosLabs
      @LeiosLabs  4 years ago +5

      I was in your exact position at the start of my PhD. I knew about unit testing, but never "needed" it for my code and used version control, but couldn't really get my peers to use it, so I was stuck. It was an uphill battle for me, but learning proper programming practices helped out my research tremendously!

    • @apurbabiswas7218
      @apurbabiswas7218 4 years ago +2

      @LeiosLabs Thanks for the tip. I feel fairly lucky in this regard, as there's so much to learn from online that hopefully I'll have it easier. Content like yours helps, so thank you once again.

  • @youtubereview8176
    @youtubereview8176 1 year ago

    Thank you so much for posting this video. What I've often heard about algorithms is that when a paper claims great performance, it's very likely that the implementation will be very costly and won't outperform the current solution. Of course, there are also some breakthroughs.

  • @brandonnelson8781
    @brandonnelson8781 3 years ago +3

    Thank you for posting this. Going through my PhD now, I experience many of these pains that you've clearly outlined here. If we could continue to grow this discussion and build a scientific community more embracing of software engineering practices, starting with git and code re-usability, the long term gains would certainly outpace the short term learning pains.

  • @IamLupo
    @IamLupo 4 years ago +1

    It's not a rant, my friend. You made a valid point! Keep doing what you're doing, buddy!

  • @lw4423
    @lw4423 3 years ago +4

    I wish someone would make a tutorial where the student can follow along and learn to make a Julia package that does something trivial. The point would be learning to make a package and put it on GitHub, plus all the documentation, tests, branches, and all that.

  • @alijassim7015
    @alijassim7015 4 years ago +3

    This video is so spot on! Publishers need to see this.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      I was hoping to start a discussion

  • @ProjectPhysX
    @ProjectPhysX 4 years ago +3

    Congrats on your PhD!
    Thank you for your perspective on research software engineering. I have never seen course offerings at my university for scientists on how to write good software, and in the end it comes down to teaching yourself.
    I work in the same field (PhD candidate in computational fluid dynamics with LBM / physics) and I've seen lots of bad code as well, due to the points you've discussed. But that's not always the case.
    The incentive to write clean code is given at least once you work on software as a team. We do refactoring on a regular basis and make sure every line is properly documented.
    Because of the teamwork, version control becomes a necessity as well.
    Testing code is actually most of the work. If code is not testable and the results are not reproducible, it is trash code, no matter in which field.
    The main incentive for our software project actually was hardware (GPU) efficiency and performance. No other software on the market is capable of comparable performance, so we had to write our own.
    Regarding job chances, research software engineering is not a dead end at all. If you really master scientific programming, you don't have to apply for a job because companies will apply for your time.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      I think we have almost the same perspective here. I'm happy to see more people writing the best code they can given the circumstances!
      As a note: I've been considering doing a video on heterogeneous computation (CPU / GPU) in Julia for HPC relatively soon.

    • @milobem4458
      @milobem4458 3 years ago +1

      I don't know about your team, but many scientists seem to think "testing" means running the whole 40-hour experiment and comparing the results with the last run. When software engineers talk about testing, they mean small and readable unit tests which take seconds to run and verify automatically. But again, we can't blame self-taught programmers for not knowing all the best practices.
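To make the contrast concrete, here is a rough sketch of what seconds-scale unit tests might look like for one piece of a simulation. The integrator (`velocity_verlet_step`) and the test names are hypothetical and chosen only for illustration; the point is that each test checks one small property with a known answer instead of rerunning the whole experiment.

```python
import math


def velocity_verlet_step(x, v, dt, accel):
    """Advance position x and velocity v by one time step dt.

    `accel` is a function mapping a position to an acceleration.
    """
    a_old = accel(x)
    x_new = x + v * dt + 0.5 * a_old * dt * dt
    a_new = accel(x_new)
    v_new = v + 0.5 * (a_old + a_new) * dt
    return x_new, v_new


def test_free_particle_moves_in_a_straight_line():
    # No force: position advances by v * dt, velocity is unchanged.
    x, v = velocity_verlet_step(1.0, 2.0, 0.5, lambda pos: 0.0)
    assert math.isclose(x, 2.0)
    assert math.isclose(v, 2.0)


def test_constant_acceleration_is_reproduced():
    # Constant force: the scheme matches x = a t^2 / 2 and v = a t.
    x, v = velocity_verlet_step(0.0, 0.0, 2.0, lambda pos: -9.81)
    assert math.isclose(x, -19.62)
    assert math.isclose(v, -19.62)


def test_harmonic_oscillator_energy_stays_nearly_constant():
    # Unit-mass, unit-frequency oscillator: energy should barely drift.
    x, v, dt = 1.0, 0.0, 0.01
    energy_start = 0.5 * v * v + 0.5 * x * x
    for _ in range(1000):
        x, v = velocity_verlet_step(x, v, dt, lambda pos: -pos)
    energy_end = 0.5 * v * v + 0.5 * x * x
    assert abs(energy_end - energy_start) < 1e-4
```

Functions named `test_*` like these can be collected by a runner such as pytest or simply called directly; either way the whole set finishes in well under a second.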

  • @monikaparmar2061
    @monikaparmar2061 4 years ago +2

    Great video. Thanks!

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Glad you liked it!

  • @KevinHorecka
    @KevinHorecka 4 years ago +1

    This video speaks to me so much. I was a software engineer/systems engineer before going back to grad school, and I was the only computational-focused person in my lab for Neuroscience. There were other folks who knew how to program (and some who couldn't do more than a stats script), but writing "good" code (as loaded as that is) was just not a priority because no one else was ever going to see it (because there was no avenue to share and no one wants to replicate results anyway).
    Lo and behold, my code ends up being pretty useful for some other work (related to TBI), and it is fortunately very documented so I was able to share it. It's far from perfect, and finding the balance of where to stop on it because it was good enough was a huge challenge. I would've loved to submit it to a journal and get it more polished, but there was no value in that (at least relative to the other priorities I had to graduate).
    I wish I knew how to help push the culture forward in this space. I left academia after graduating, so I'm afraid I'm not being very helpful. I've started publishing again recently around my volunteer work, so maybe that's my avenue to help.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      I'm glad you are still thinking about helping out! I think creating well-documented code is already a good step forward. If people can use your code more easily, then they will start to see the value in good programming practices.

    • @KevinHorecka
      @KevinHorecka 4 years ago +1

      @LeiosLabs Thanks! Keep up the great work! Your videos are always a joy to watch.

  • @AaronPM55
    @AaronPM55 3 years ago +2

    I work in the DSP field and we work closely with people in academia. I 100% agree with what you say. So much time could have been saved if the code handed to us was written better or even followed the paper.
    I think a big thing is that some older people in academia have the attitude of "if you used simulations, you didn't solve the problem." I personally think it's weird to see people not use software as a tool for verification on both generated and real data.

  • @yas-xk5wf
    @yas-xk5wf 4 years ago +2

    Excellent video. I empathise with the points you've made. They're not only relevant in academia but also within commercial settings, where there is pressure to release and not enough resources are committed towards building robust systems. Or, conversely, systems are engineered well but the solutions aren't scientifically rigorous.

    • @LeiosLabs
      @LeiosLabs  4 years ago +2

      I don't often see production code, but don't doubt there are similar issues there! In general, we need to do better to write high-quality software whenever possible!

  • @gavinpeng1976
    @gavinpeng1976 4 years ago +4

    Right on point. I am currently trying to refactor an old academic codebase consisting of Matlab, Python, Java, and C++ that are glued together using Matlab, and it is just a nightmare. And yes, Matlab is evil - you often see thousands of lines of code without encapsulation and a huge namespace. I genuinely think that many more people would have used the code if it had been written in a more professional fashion.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      That sounds awful! I have had similar experiences, though nowhere near *that* bad. On the other hand, at least they are giving you time to refactor!
      I really feel we need to be honest about the fact that software is how people conduct research. Poor documentation / software practices are precisely the same as keeping a bad lab notebook / sloppy methodologies.

  • @rifatahamed7052
    @rifatahamed7052 3 years ago +1

    This is an essential topic for research. More incentive should be given towards research software development. Much high-quality research depends on how well a simulation or model has been formulated and executed. Better programming practices in developing research software will lead to better research.

  • @felixrichter1100
    @felixrichter1100 4 years ago +2

    As a master's student working in a research group, I could not agree more with all the things you just said.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      Happy to hear I'm not alone!

  • @MarcelRobitaille
    @MarcelRobitaille 3 years ago

    I totally agree! So many more things need to be under version control.

  • @loutsauv5079
    @loutsauv5079 4 years ago +3

    Thank you! That's exactly why I did not go into research :(. I was shocked that in operational research, code was not standardized, shared, or reviewed! People are publishing results (sometimes modified or cherry-picked) that are impossible (or long and hard) to verify.
    Please, we need more people to pay attention to this problem, which in my opinion slows down research and hinders its credibility!
    It's also true in economics, especially with the infamous case of a wrong and badly reviewed research paper, with a badly written Excel file, driving most of modern policy on public debt based on false assertions :(!

    • @LeiosLabs
      @LeiosLabs  4 years ago

      I definitely feel your pain! It's sometimes incredible how poor research code can be!

  • @gz6616
    @gz6616 4 years ago +4

    I recently came across JOSS and made a submission to it. In so doing I found that there are lots of things I didn't know, including writing tests, documentation using Sphinx, and proper packaging of the code. At least I knew a bit of git, which I learned during my spare time when making side projects totally unrelated to the research projects. These things we have to learn by ourselves; the institution does not provide such training, and many of my colleagues don't care about these things at all.
    And I also dislike Matlab: arrays don't even start from 0.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      This is my experience as well. I like JOSS because it brings up a lot of topics academics typically ignore.

    • @shoam2103
      @shoam2103 4 years ago

      Arrays not starting from 0 isn't a problem. R, Julia, Lua, etc. all start with 1. In functional / array programming languages it doesn't matter. You'd basically be writing the same code in, e.g., Haskell (a 0-indexed functional language). Function composition instead of loops, always!
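As a small illustration of that last point, here is a sketch of index-free code built from composed functions. It is written in Python rather than any of the languages named above, and the helpers (`compose`, `normalize`, `squares`, `dot`) are invented for this example. No subscript appears anywhere, so the code would read the same in a 0-indexed or a 1-indexed language.

```python
from functools import reduce
from operator import mul


def compose(*functions):
    """compose(f, g, h)(x) is f(g(h(x)))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def normalize(xs):
    """Scale the values so that they sum to 1."""
    xs = list(xs)
    total = sum(xs)
    return list(map(lambda x: x / total, xs))


def squares(xs):
    return list(map(lambda x: x * x, xs))


def dot(xs, ys):
    """Dot product without ever touching an index."""
    return sum(map(mul, xs, ys))


# A pipeline assembled purely by composition: normalize, square, then sum.
sum_of_squared_weights = compose(sum, squares, normalize)

weights = normalize([1, 1, 2])
weighted_total = dot(weights, [4, 8, 2])
```

Whether this style is preferable to explicit loops is a matter of taste and language; the sketch only shows that the starting index never enters the picture.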

  • @SoopaPop
    @SoopaPop 4 years ago +5

    Thank you for this video. I'm a 3rd year doctoral student in Applied Math, and specifically the scientific computing subdisciplines you mention. I'm currently finalizing a moderate size (about 4000 lines of C) codebase to be open sourced along with a paper submission. There's a serious crunch-time feeling which is causing various holes in documentation as well as crappy inefficient fixes. You're definitely right, writing well-documented code feels impossible when one is also supposed to be pushing out theoretical breakthroughs of some flavor.
    On the other hand, it is also very hard to write code that works without a strong grounding in the theory of a subject.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      Right, exactly! You need domain knowledge and software knowledge. It's not an easy job at all! Good luck with the paper!

  • @soheilsolhjoo
    @soheilsolhjoo 4 years ago +1

    Hello James, congrats on your PhD, and thanks for bringing up this topic. As a computational scientist, I needed to write a lot of different code from my bachelor's project through my postdoc, yet I only learned about git very recently. Such a shame!
    Regarding publishing code and making it open access, I totally agree: the very least advantage would be the required documentation, besides the code itself.
    However, I'm not really sure whether reviewing the code is a good idea. Imagine that for a particular work you need to write in different languages, e.g. bash scripts for Linux, Matlab, and Fortran. That doesn't happen all the time, but it's possible, and expecting the reviewers to be familiar with all of these languages doesn't seem that realistic to me. Moreover, if a code is supposed to model a phenomenon, shouldn't its results be the main concern in the review process?
    Perhaps it would help if the publishing houses made it mandatory to publish the accompanying code (plus proper documentation) for each paper, even without a proper review, and let the readers/users decide for themselves.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      This is a great perspective and good discussion!
      I still say that we need some sort of review. If a reviewer doesn't know fortran (or the language the code is in) and is reviewing for a journal, that's totally fine, but *someone* should know fortran in the review process. A good solution would be to link most paper reviews to JOSS, so people can review the software independent of the scientific review.
      Part of the current problem is due to the fact that people don't know the languages / methods used in the field, but still review for that field. I would argue that if they can't read the code, they shouldn't review the paper. Obviously, this is unrealistic in practice, so for a good first step, we should at least publish code with the paper (like you said).

    • @MarcelloZucchi91
      @MarcelloZucchi91 3 years ago

      The issue is not about the correctness of the code, but rather the reproducibility of the results, which is (or maybe was) a pillar of the scientific process. You can always describe your algorithms with pseudocode, but that's not feasible for large projects. Moreover, people in academia usually want to keep their software for themselves, so as to retain an advantage over potential competitors. That's why services like CodeOcean are gaining popularity. They effectively provide an interface "shield" between your code and the outside world, allowing people to experiment with it as a black box (very important for scripting and interpreted languages).

  • @SapereAude1490
    @SapereAude1490 1 year ago

    Watching this video while working on a Matlab AppDesigner web app for a paper. PhD in chemistry. Everything you say is true.
    I've been watching Python tutorials lately. I hope to escape this landscape and become a proper developer, because I know full well this is not the proper way to do programming.

  • @danielchin1259
    @danielchin1259 4 years ago

    Thank you for your proposal.

  • @BradenEliason
    @BradenEliason 4 years ago +4

    I think there's a lot of researchers that see Matlab as a necessary evil. There's just such a developed ecosystem of tools and labs are reluctant to migrate.
    I would like to see some sort of bounty system for migrating Matlab code to Julia. In ultrasound physics there's packages like Field II and FOCUS for Matlab which I would like to see migrated and I'd be happy to chip in to some fund to make that happen.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Yeah, matlab for experimental work is one thing. If there are no other packages that allow users to connect with their experiments, then it's the only option.
      I would love to see Julia take over the role of Matlab, though. Have you gone on the Julia Slack or Discourse to see if there is development in the areas you need for your research?

    • @BradenEliason
      @BradenEliason 4 years ago +2

      @LeiosLabs I've looked, but nothing has anywhere close to feature parity with the Matlab packages I referenced. To dethrone Matlab you need to get a critical mass of labs to switch over, and that's not going to happen quickly if necessary packages are missing. Unfortunately, the same labs that are reluctant to switch are the ones that are best suited to migrate the libraries to Julia. It's a bit of a chicken-and-egg problem. What do you think of a bounty system to reward open source research programmers?

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      @BradenEliason I wish the bounty system would work! If there was funding for that, I would love to take a crack at it.

  • @TheMazyProduction
    @TheMazyProduction 4 years ago +3

    CONGRATS 🎉🎈🍾 on the PhD, well deserved.

  • @porschepanamera92
    @porschepanamera92 4 years ago +3

    And you're probably only talking about software engineering or at least from that perspective.
    However, as a PhD student in the field of structural engineering, I wasn't specifically trained to write code, but some/many problems can't be solved without a bit of coding. Now, imagine that horrible google-stackoverflow-slapped-together-frankenstein code. But most of the time, if it works it works and I'm more than happy I made something that actually does what it should. And indeed, usually, once the publication is done, the project is set aside, as well as the code. Fully agree though!

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      I tried to speak both from the academic perspective (where there is no incentive to write clean code) and the software engineering perspective (where there is no incentive for software engineers to stay in academia). I also left out some content that was particularly ranty about academia. I want to start a conversation, not an argument.

    • @porschepanamera92
      @porschepanamera92 4 years ago

      @LeiosLabs I hear you, and I wanted to acknowledge this from my own experience in academia. I'm curious how it will evolve over time, as programming will become increasingly important (I think).

  • @thej680
    @thej680 1 year ago +1

    Hello! I found your video through hearing of "research software engineers," and I am very curious. I understand your present concerns with your current career. I am looking for advice.
    I received a bachelor's degree with math and cs. I am personally very interested in getting into the HPC background and would like some recommendations because I am not sure where to even begin. I am considering going back to school too, but I am also not sure about funding and such. What would you recommend to do if you were me?
    Also, I would like to try out my own pet project to get some introduction to the subject. I've heard of OpenMPI and some other things for parallel computing. If you could suggest a beginner level project, what would you recommend?

    • @LeiosLabs
      @LeiosLabs  1 year ago +1

      So there are a bunch of different "ways to start" with RSE. The best way is probably to just e-mail people, state your background, and ask if they need some programming help. You will learn the tools along the way.
      As for a pet project, it's kinda hard to say. There's a difference between parallelism and *distributed* parallelism that you need for HPC. Going from parallel to distributed is not an easy step, but can only really be done if you have a cluster available to mess around with. You can still learn the tools necessary, though (MPI, mainly, but also CUDA for GPUs).
      I might recommend looking into the Julia ecosystem at this time as I know people are looking for help with their distributed setup and having your name associated with those tools will probably help out if people are looking for Julia positions. That said, most of the RSE code is either in C(++) or Fortran, so knowing those languages might be a bit more useful.

    • @thej680
      @thej680 1 year ago

      @LeiosLabs Thank you for the prompt response! I can definitely have a look into those languages and technologies you mentioned.
      When you say e-mailing people, do you mean university professors or people via LinkedIn? I imagine professors. One issue I came across was one professor couldn't take my assistance unless I was a student at their university. Can some professors be flexible regarding that?

    • @LeiosLabs
      @LeiosLabs  1 year ago +1

      @thej680 Not necessarily university professors. People who are writing papers or software you are interested in. Most of the code for RSE is open source and you can probably start collaborating on GitHub pretty quickly.

  • @milos_radovanovic
    @milos_radovanovic 4 years ago +2

    Have you had any GUI related problems with academically developed, commercial research software?
    I find that research software GUIs tend to follow anti-design-patterns similar to the ones encountered in other topic-specific software fields like music composing, as showcased by Tantacrul on his channel in his "Music Software & Interface Design" series.
    I watch his "diatribe" videos because I find it healing after long hours working with research software GUI.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      I don't typically come across research software with GUIs. Most of the stuff I do is CLI-only.
      The one time I did use a GUI, it was a bit of a mess.

  • @abrahamx910
    @abrahamx910 4 years ago +1

    Agreed, congrats on your PhD

  • @chetanvardhan2906
    @chetanvardhan2906 3 years ago +1

    May I ask about your experience at OIST? I was thinking about pursuing a graduate degree there… Would you recommend it?

    • @LeiosLabs
      @LeiosLabs  3 years ago

      OIST was great for me, but works best for students that are self-motivated.

  • @FaffyWaffles
    @FaffyWaffles 2 months ago

    PREACH

  • @user-pq9qz5zl7t
    @user-pq9qz5zl7t 4 years ago +2

    Dr Schloss, what software did you use to change the colour of your shirt?

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Ah, blender.
      Also, didn't you have a different channel before?

    • @user-pq9qz5zl7t
      @user-pq9qz5zl7t 4 years ago

      @LeiosLabs Sorry, I deleted the videos you liked last time

  • @abhinavadarshsood5759
    @abhinavadarshsood5759 4 years ago +2

    As a high school graduate, is it worthwhile economically to get into research in fields related to Machine Learning, Computer Science, and Software Engineering if very interested in the fields?

    • @LeiosLabs
      @LeiosLabs  4 years ago +2

      Yeah, definitely! If you want to go down the pure research route, there is funding and interesting research opportunities. If you want to do these fields in industry, there is also plenty of funding available.
      My point here is just that software engineering is not well integrated into research in all fields. It's getting better, and it's a good idea to ride the wave now while people are starting to see the need for better integration.

  • @atharvas4399
    @atharvas4399 3 years ago +1

    Peer-reviewing research code is next to impossible because you need a constant supply of academics with interdisciplinary knowledge: for example, someone who has a very good understanding of computational quantum mechanics for a specific experiment AND of simulation methods in Python, including familiarity with a particular tech stack.

  • @NicosLeben
    @NicosLeben 4 years ago +2

    Good video. I think we all have the right to access the source code, and in general all the work, that was funded by taxes. It's annoying to pay for papers when I have already paid for them.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      That is a great point from the public perspective. People want open research, so we should truly open the research!

  • @saurabh7532
    @saurabh7532 4 years ago +3

    This is something that troubles me and has made me feel weird about pursuing research long term (pursuing a PhD). I love it, but I feel it requires a lot of reform.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      The only way that reform will happen is if we target the areas in need of change and change them ourselves.

  • @hitoshiyamauchi
    @hitoshiyamauchi 4 years ago +1

    I see the research code problem. Additionally, the code is usually not maintained after publishing. So even if it was well developed, there is no guarantee it works when someone is interested in it. This is not a matter of 10 years, but of just a few years nowadays. (For instance, I suppose the CUDA version, compute architecture, and GPU architecture would be the issue in your case. Matlab code probably keeps running for longer.) But this is quite a hard problem, and I think it should not depend on an individual's effort. Ideally, there would be some systematic support. Thanks for focusing on a fundamental problem.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Totally agreed! I think a lot of researchers use the fact that software is constantly evolving as an excuse *not* to review it. The way I see it, maintenance of code is a huge issue as well and hard to do right. I think the best we can do is provide the version numbers and such at publication. This will allow the code to be run for at least a decade or so.
      At this stage, just reviewing code is already a big step forward.

  • @kapoioBCS
    @kapoioBCS 4 years ago +2

    As a theoretical physics researcher, it is hard to imagine software engineers complaining about funding; literally all the funding at my university goes either to software engineers or to bio/med researchers :p

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      That's interesting to hear / thanks for the input! I am sure a lot of funding goes into software engineering, but is it for research software or for general-purpose software?

  • @protocol6
    @protocol6 4 years ago +4

    This is only tangentially related but, as someone who learned to program long before learning any higher mathematics or physics, it has always irked me that variables and constants in mathematical formalism are even worse than Hungarian notation in programming. Too many papers fail to define all their domain-specific symbols and, if you read papers from 100 years ago, you have to chase down obsolete definitions. That's before you even get to all the situations that mix multiple definitions of the same symbol (like elementary charge and Euler's number) in the same equation. It's just begging for people to make mistakes. Mathematics could stand to import some best practices from software development.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Yeah! I completely agree with this! In some codes, theta is an angle. In others, it's temperature. It's hard to keep everything straight and is another complaint I have about a lot of research scripts! I am alright with it iff there is a comment somewhere in the code to some text that has similar notation or if there is a table of symbols somewhere available, but most people don't do this.

  • @EvilCherry3
    @EvilCherry3 4 years ago

    True

  • @scottdriggers8400
    @scottdriggers8400 4 years ago +2

    I once had a summer project improving someone's simulation software, and it was dreadful wading through the pages of nonsense code. (Also, nice shirt change at 6:37)

    • @LeiosLabs
      @LeiosLabs  4 years ago

      Yeah, there were 2 instances where I needed to record part of the video on another day because I missed key information on the first recording.

  • @diegofloor
    @diegofloor 4 years ago

    Yikes. Physicist here. I can confirm everything you say here. I've used code written in Fortran 20 years ago in my research, and it's constantly bugging out with every Fortran compiler update. Why? Because it was no one's priority to rewrite it in a more convenient language. When the project fell into my hands, I had some time to get results and then I moved on. Someone is going to inherit this code and probably do the same thing. Then there is my analysis code in Mathematica. It contains my own implementation of a few different algorithms described in papers, but there is no way to check if my implementation really is bulletproof. Collaborators don't really care as long as it produces publishable results.
    What gives me some peace of mind is that in an active field of research, a result wildly different from the expected will be under more scrutiny.

    • @MarcelloZucchi91
      @MarcelloZucchi91 3 years ago +1

      Fortran is a perfectly convenient language for scientific computing. You could even argue that it is the best for certain applications. Chances are that your code was written badly from the get-go (maybe using non-standard, compiler-specific extensions). Fortran is and always will be backward compatible with previous versions of the standard, but compilers issue warnings if you use obsolete constructs.

  • @Ddddddddddd381
    @Ddddddddddd381 4 years ago

    While learning, I found matlab nice for solving simple problems and writing simple scripts, but I wouldn’t want to do large projects in it.

    • @LeiosLabs
      @LeiosLabs  4 years ago +1

      Yeah, I tried to make sure people knew I was biased about that part. I have not found a single use-case in my own research where using matlab would be better than python/julia, and most of the time, it's flat-out infuriating, losing me hours of time.
      I appreciate that others have a different perspective and find the language useful, but I do not think I will ever be able to get over my biases about it.

    • @Ddddddddddd381
      @Ddddddddddd381 4 years ago

      @@LeiosLabs I completely agree, matlab has its quirks and aggravates me just as any other language would. Personally, I was taught it in my engineering degree, which had a math backbone, so matlab's syntax for math somewhat makes sense. That being said, counting from 1 is stupid.

    • @LeiosLabs
      @LeiosLabs  4 years ago +2

      @@Ddddddddddd381 Counting from 1 is the least aggravating thing, honestly. I also don't mind the formula specification because I was trained as a physicist. I genuinely hate how radically different its syntax is from everything else, but I could see why some people like that. It's everything else that bugs me, like how:
      - you can only have 1 function per file
      - loops are slow
      - structs / classes are poorly optimized
      - good luck with graph / tree methods
      - you cannot edit files outside of the IDE provided, otherwise matlab doesn't recognize the change
      - it's licensed and checks for that license on startup... Which is doubly annoying when running matlab on distributed nodes, because it then checks licenses n times, where n = number of nodes. The license here, combined with radically different syntax actually makes the software predatory because it locks users into a system that they *have* to pay for and cannot escape from easily.
      - it crashes almost every time I try to run anything reasonably complex.
      I mean, if it's an esoteric language, it's an esoteric language. It might have some form of historical precedent as well, but it still boggles my mind how people put up with it in 2020 when other, better languages exist for prototyping.

    • @Ddddddddddd381
      @Ddddddddddd381 4 years ago

      @@LeiosLabs What a fantastic explanation. Thank you so much! I haven't personally come across those issues because I haven't done anything really complex with it, but I really do respect your expert opinion.

  • @yogibairagi6354
    @yogibairagi6354 3 years ago

    How can one become a research software engineer? Possible with no PhD?

  • @AngryArmadillo
    @AngryArmadillo 4 years ago +3

    Matlab is the bane of my existence

    • @LeiosLabs
      @LeiosLabs  4 years ago +7

      Haha, I also wanted to make a language review for matlab where I just show every single flaw I can find. It's genuinely infuriating!

    • @shoam2103
      @shoam2103 4 years ago +2

      @@LeiosLabs Please do this! Even in a light-hearted way! I like array programming, but matlab's syntax and API are off-putting.

    • @iminni3459
      @iminni3459 4 years ago +1

      @@LeiosLabs I know nothing about Matlab but I want to watch this.

    • @LeiosLabs
      @LeiosLabs  4 years ago +2

      I guess that's 2 people interested, at least! I'll think about how I might do it!

    • @altaroffire56
      @altaroffire56 4 years ago +1

      @@LeiosLabs Make that 3. I learned Matlab in university and actually sort of like it, but I'm genuinely interested in what you have to say about its flaws.

  • @mossylikescake
    @mossylikescake 4 years ago +1

    I feel like this is a personal attack....

    • @LeiosLabs
      @LeiosLabs  4 years ago

      It definitely wasn't! I've had this discussion at least a dozen times with other folks in a similar position as I am, and we all kinda agreed on these points.
      I hope all is well!

  • @DucBanal
    @DucBanal 4 years ago +1

    As a junior research instrumentation engineer, I feel the same hatred towards LabVIEW...

    • @LeiosLabs
      @LeiosLabs  4 years ago

      Oh boy, I bet! That's another predatory licensed software that is entrenched into the research ecosystem.

    • @DucBanal
      @DucBanal 4 years ago +1

      @@LeiosLabs The problem is that you can't really defend any other solution because that's all researchers know and telling them to use anything else is seen as trying to "break something that works"

    • @LeiosLabs
      @LeiosLabs  4 years ago

      @@DucBanal Also agreed with this. As much as I hate the system, I don't think it's good to pull the rug up.

  • @wailrimouche1171
    @wailrimouche1171 4 years ago

    May I ask how old you are?

    • @LeiosLabs
      @LeiosLabs  4 years ago +2

      28 right now. Started doing research at 21, so I am still relatively new to the scene. I was doing software development well before that, though!
      In addition, the thoughts presented here came about from long discussions with my peers, so it's not like the arguments made were only from my own perspective!

    • @wailrimouche1171
      @wailrimouche1171 4 years ago +2

      @@LeiosLabs I was just asking because I was very impressed by your qualifications. Those are some great endeavors for someone as young as you.

  • @random_x_
    @random_x_ 4 years ago +2

    In my opinion, the fact that research code is so bad (for computer science research) is inexcusable. Many times a CS undergrad could write code that actually works, with better documentation, that can be run by anyone on any machine and reproduced. It's just laziness, partially incompetence, and I don't think the code should be written at all if it is going to be bad. It's just that you have people who aren't trained as software engineers (yet are in computer science!) writing code and it's complete spaghetti. Do researchers know what they're doing? Yes, when it comes to writing papers. Should researchers be writing code? In my opinion, no not really. It's a waste of time if you're terrible enough at software engineering because it won't be reproducible anyways. If researchers HAVE to write bad code though, I would really hope that those who do just put in the readme something along the lines of "we just did this so we wouldn't get yelled at by reviewer #2 for not having an implementation. Our software is terrible and probably doesn't even work, and we barely remember how to get it to work because we're writing another paper already. You're better off writing your own implementation." That way I can stop wasting my time with the code, because 99% of the time it's easier to implement something from scratch with good software engineering practices than it is to try to get something built with terrible software engineering practices to work.

    • @LeiosLabs
      @LeiosLabs  4 years ago

      I can only echo your frustration. This is exactly the problem with research code today.

  • @td4739
    @td4739 1 year ago

    Maybe academia should invest in Software Engineering Research.

  • @luismendo
    @luismendo 4 years ago

    Even if you have a "deep and fiery hatred of Matlab" (3:42) you can't leave it out of a list titled "Types of research software" (1:07). Come on

    • @LeiosLabs
      @LeiosLabs  4 years ago

      It was there, under "languages and frameworks." I just gave julia as my example instead of numpy or matlab. Its underlying libraries were also there under the same section with blas and lapack. Again, there is way too much research software out there, so it was not possible to list it all on that slide.

  • @woosix7735
    @woosix7735 3 years ago

    matlab lol

  • @platypusbox6479
    @platypusbox6479 6 months ago

    Academia doesn't incentivize people to make good software - true that. I'd argue it doesn't incentivize them to do good research either.