Deep Dive Into the Repository Design Pattern in Python

  • Published: 20 Sep 2024

Comments • 111

  • @ArjanCodes
    @ArjanCodes  7 months ago +1

    💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide

  • @yurykliachko1815
    @yurykliachko1815 7 months ago +24

    this is a good pattern when your entity (Post in this case) is stored partially in different storages (sql DB + cloud storage, sql db + nosql DB etc), it hides all the complexity. Thank you for this guide!

    • @ArjanCodes
      @ArjanCodes  7 months ago

      Glad you enjoyed the topic!

  • @notead
    @notead 7 months ago +4

    Hey Arjan,
    I just want to say thank you.
    I was able to land a job in data engineering thanks to your course and your videos on design patterns. Seeing your approach to building applications finally made it click for me that learning a language is the "easy" part, and that understanding _how to think about systems_ not only makes me a better developer - but is a super important, generalizable skill that goes beyond just programming. Maybe that's obvious for many, but I am really grateful for that insight.

    • @ArjanCodes
      @ArjanCodes  7 months ago +2

      It's an absolute pleasure hearing about your success story and your learning journey, thank you for letting me be part of it! Best of luck :)

  • @FernandoCordeiroDr
    @FernandoCordeiroDr 7 months ago +6

    I had to use this pattern recently. I was working on a Django app that had to work with both MongoDB and Postgres' PGVector. I created a repository for each and then a factory function that, based on environment variables, determines which repo to use. These repos are then used inside the methods of normal Django models. The main benefit is that adding an integration with another vector database is just a matter of creating a new repository.
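
A minimal sketch of the factory described in this comment, under stated assumptions: the variable name `VECTOR_BACKEND`, the `VectorRepository` interface, and both repository classes are hypothetical, and the classes are in-memory stand-ins rather than real MongoDB or PGVector clients.

```python
import os
from typing import Mapping, Optional, Protocol


class VectorRepository(Protocol):
    """Hypothetical interface; the comment does not show the real one."""

    def add(self, key: str, vector: list[float]) -> None: ...

    def get(self, key: str) -> Optional[list[float]]: ...


class _DictBackedRepository:
    """Shared in-memory storage so the sketch runs without a database."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def add(self, key: str, vector: list[float]) -> None:
        self._vectors[key] = vector

    def get(self, key: str) -> Optional[list[float]]:
        return self._vectors.get(key)


class MongoRepository(_DictBackedRepository):
    """Stand-in for a MongoDB-backed repository (no driver calls here)."""


class PgVectorRepository(_DictBackedRepository):
    """Stand-in for a Postgres/PGVector-backed repository."""


# Supporting another vector database means adding one entry here.
_BACKENDS = {"mongo": MongoRepository, "pgvector": PgVectorRepository}


def make_repository(env: Optional[Mapping[str, str]] = None) -> VectorRepository:
    """Pick the repository from VECTOR_BACKEND (a hypothetical variable name)."""
    env = os.environ if env is None else env
    backend = env.get("VECTOR_BACKEND", "pgvector")
    if backend not in _BACKENDS:
        raise ValueError(f"Unknown VECTOR_BACKEND: {backend!r}")
    return _BACKENDS[backend]()
```

Passing the environment as a parameter keeps the factory easy to test without touching the real process environment.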

  • @klmcwhirter
    @klmcwhirter 7 months ago +2

    A common misconception of design patterns is that the concrete implementations need to have the same method signatures (or implement the same interface). That simply is not true!
    The spirit of the Repository pattern is to decouple storage from business logic. If the storage strategy changes, then the business logic layer should not have to change. See the Open-Closed principle for details.
    That is hard to accomplish if every Repository in your code base has the same set of method signatures, i.e., implements the same interface (er, Protocol as you have taught us). The methods should implement a business function required by the layer calling into the storage layer instead.
    First of all, you NEVER should embed SQL statements in today's world. That is a huge design smell that will never pass code review in an enterprise context.
    Second, the functions in a Repository class should be elegantly "callable" from the business layer and not just implement CRUD methods. That is a wrong usage of the Repository design pattern.
    It is a misconception that an OR/M provides a Repository - that is not true! Session management and Repository implementations are different concerns and do not belong together. Except in "Hello, World" examples I guess. Don't do that. It just does not work when you have a complex data concept involving hundreds of tables. Yep, those are normal in real world use cases.
    Please think about the place where you may need to move functionality from a database to an API. That will provide you with a correct mental model of the Repository pattern. Just encapsulate the behavior needed for the underlying operational storage mechanism.
    At the end of the day, it is a specialized kind of an Adapter.
    I love your content @ArjanCodes, please keep doing what you are doing.
    But this one could have been presented better.
    I, as an educator myself, realize there are compromises that need to be made to simplify introduction of (potentially) new concepts. But you went too far this time. Sorry.

  • @prinsniels
    @prinsniels 7 months ago +7

    I use the pattern a lot, but in a more general way. I tend to write things on the basis of interfaces; combining that with dependency injection makes things easy to test and allows for composable programs and great flexibility.
    I tend to stay away from ORMs. For me they add an extra layer of complexity to programs, and in analytics it quickly ends in writing straight SQL to your ORM, so cutting out the middleman seems wise then 😅
    Thnxs for the video!

    • @oscarmulin114
      @oscarmulin114 7 months ago

      Agree with avoiding ORMs 100%.

    • @bachkhoahuynh9110
      @bachkhoahuynh9110 7 months ago +1

      In data-centric applications, you can stay away from ORMs, but if your team uses an object-oriented domain model, ORMs are especially useful.

  • @adjbutler
    @adjbutler 7 months ago +5

    I love your pattern videos! (I will even allow you to make up your own patterns)
    Or do PART 2, 3, 4 on previous patterns! Your videos are amazing! Thank you

    • @alexandarjelenic2880
      @alexandarjelenic2880 7 months ago

      Or combining patterns, or more example of solving the same issue with various approaches.

    • @notead
      @notead 7 months ago

      I agree! It would also be really cool to see more videos of him refactoring projects into using design patterns, especially hearing him discuss why he makes certain choices, the considerations and thoughts that cross his mind when making them.

  • @mhl1740
    @mhl1740 7 months ago +7

    I always use this kind of Repository, but I didn't know, that I follow a pattern 😀. Thank you.

    • @ArjanCodes
      @ArjanCodes  7 months ago +1

      Glad the video was helpful!

  • @Jakub1989YTb
    @Jakub1989YTb 7 months ago +5

    Why classmethods if you are not using them to create "instances" of the class?
    Didn't you mean staticmethods? This is very misleading.
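
To make the distinction raised here concrete, a small sketch follows. The `Post` class, its `table_name` attribute, and all three methods are hypothetical and not taken from the video; the point is only that `classmethod` earns its keep when the method uses `cls`, while a method that touches neither `cls` nor `self` fits `staticmethod` or a plain function.

```python
class Post:
    table_name = "posts"  # hypothetical class attribute

    def __init__(self, title: str) -> None:
        self.title = title

    @classmethod
    def from_row(cls, row: tuple) -> "Post":
        # Alternative constructor: builds an instance of the class it is called on.
        return cls(*row)

    @classmethod
    def create_table_sql(cls) -> str:
        # Reads class state, so a subclass can change the table name.
        return f"CREATE TABLE IF NOT EXISTS {cls.table_name} (title TEXT)"

    @staticmethod
    def normalize_title(title: str) -> str:
        # Uses neither cls nor self, so staticmethod is enough.
        return title.strip().title()


class DraftPost(Post):
    table_name = "draft_posts"
```

Calling `DraftPost.create_table_sql()` picks up the overridden table name, which a `staticmethod` could not do without hard-coding the class.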

  • @axeldelsol8503
    @axeldelsol8503 7 months ago +9

    This pattern is also very useful when you are wrapping a API offering CRUD routes for resources
    Great video !

    • @Nalewkarz
      @Nalewkarz 7 months ago

      It's much more suited for your usecase than his.

  • @SeliverstovMusic
    @SeliverstovMusic 7 months ago +1

    I use a repository on top of SQLAlchemy. I have a base repo class with CRUD functions. For every table, I create a new subclass, and all CRUD operations become available for the table. Magic =)
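
A rough sketch of the base-class idea from this comment. It swaps in the standard-library `sqlite3` module for SQLAlchemy so that it runs without extra dependencies, and the `table` and `columns` class attributes are an assumed convention, not the commenter's actual design.

```python
import sqlite3


class BaseRepository:
    """Generic CRUD; subclasses only declare their table and columns."""

    table = ""
    columns: tuple = ()

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def add(self, **values: object) -> int:
        # Table and column names come from class attributes, never from user input.
        cols = ", ".join(self.columns)
        marks = ", ".join("?" for _ in self.columns)
        cur = self.conn.execute(
            f"INSERT INTO {self.table} ({cols}) VALUES ({marks})",
            [values[c] for c in self.columns],
        )
        return cur.lastrowid

    def get(self, row_id: int):
        cols = ", ".join(("id",) + tuple(self.columns))
        return self.conn.execute(
            f"SELECT {cols} FROM {self.table} WHERE id = ?", (row_id,)
        ).fetchone()

    def update(self, row_id: int, **values: object) -> None:
        # Reject column names that the subclass did not declare.
        unknown = set(values) - set(self.columns)
        if unknown:
            raise ValueError(f"Unknown columns: {sorted(unknown)}")
        assignments = ", ".join(f"{c} = ?" for c in values)
        self.conn.execute(
            f"UPDATE {self.table} SET {assignments} WHERE id = ?",
            [*values.values(), row_id],
        )

    def delete(self, row_id: int) -> None:
        self.conn.execute(f"DELETE FROM {self.table} WHERE id = ?", (row_id,))


class PostRepository(BaseRepository):
    table = "posts"
    columns = ("title", "content")
```

Each new table then needs only a two-line subclass, which is presumably the "magic" the comment refers to.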

  • @dadoo94000
    @dadoo94000 7 months ago

    Thank you Arjan. I use this pattern in FastAPI: endpoint layer > services layer (logic etc.) > repository layer, with FastAPI dependencies between these layers. I like it. In practice, I often need more than simple CRUD operations and add them to my repository layer. I think that's not a good idea. Maybe we should create these different methods in services. But I like this pattern.
    When I need to call an external API, I also create a repository for that. For me, repo = access to data

  • @markasiala6355
    @markasiala6355 7 months ago

    I also have used this pattern without knowing it simply by focusing on decoupling and dependency injection.
    I have an abstract data class and an abstract FileIO class. Gives me flexibility on how I load data into the class or write it out. This helps me track changes in the data when I compare versions of the output data (e.g., I read in data from a user-friendly Excel file but write it out to pipe-delimited text, JSON, or YAML output where a simple diff tells me what changed).

  • @rrwoodyt
    @rrwoodyt 7 months ago +3

    I like the separation and abstraction. It would have been interesting to see you make a class that could handle a generic dataclass, but that's beyond the scope of what you were trying to show. Maybe next time...

  • @dalenmainerman
    @dalenmainerman 7 months ago +2

    I actually used this one unintentionally
    At the earliest stages of a project, all data was stored in a bunch of csv files (not my idea, not my decision)
    Implementing all data-related operations with this pattern allowed me to migrate to the real database almost effortlessly

    • @edgeeffect
      @edgeeffect 7 months ago +4

      "not my idea, not my decision" ... is the (sad) story of our lives!

    • @sharkpyro93
      @sharkpyro93 7 months ago +3

      i worked in a project of a national wide editors and magazines publisher company and they used some excel sheets as db, it was miserable

    • @Naej7
      @Naej7 7 months ago

      @@sharkpyro93 95% of the world's data is stored in Excel sheets…

    • @dalenmainerman
      @dalenmainerman 7 months ago +3

      @@sharkpyro93 I have to work with google sheets as a db on my current project. Annoying af, trying to teach my colleagues to use real databases, wish me luck

    • @sharkpyro93
      @sharkpyro93 7 months ago +2

      @@dalenmainerman why do i feel like i know how your colleagues look?

  • @ajflorido
    @ajflorido 6 months ago

    Using this pattern with SQLAlchemy, you can load different models dynamically and use the same repo to get the data for different DB engines. For example, we have models for Postgres, Oracle and MySQL; with SA some column definitions for the model are quite different, and we load the correct model dynamically within the repo itself, so you can also decouple this pattern into another step for different engines.
    Thanks Arjan!

  • @devilslide8463
    @devilslide8463 7 months ago +1

    I particularly appreciate the ease of mocking this repository. It's very convenient for testing the logic of services that utilize the repository class.

    • @ArjanCodes
      @ArjanCodes  7 months ago

      I'm glad you enjoyed this design pattern!

  • @muzafferckay2609
    @muzafferckay2609 7 months ago +1

    Implementing the repository pattern is not about switching from SQL to NoSQL or vice versa. It decouples the business logic from the persistence layer, which can be an ORM or SQL. As you mentioned, implementing the repository pattern limits querying, updating ... It is too hard to provide every feature that an ORM does. For example, you have to define comparison operators such as in, greater than, less than, etc. Your get method should take a set of relational fields to fetch, as it is going to be used in different places. You have to define or, and, and more complex queries. Basically you have to define your own query language step by step as you need it, and translate your queries to the ORM's.
    Otherwise you have to define too many different get methods for querying

    • @Naej7
      @Naej7 7 months ago

      It is about switching. By decoupling the business logic from the persistence layer, you can switch the persistence class (one for SQL, one for NoSQL)

  • @2006pizzaboy15
    @2006pizzaboy15 7 months ago +2

    You can also look at the Unit of Work pattern that often goes hand in hand with Repository.
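
For readers unfamiliar with Unit of Work, a deliberately small sketch follows: a context manager that commits when the block succeeds and rolls back when it raises. The class name `SqliteUnitOfWork` and the `posts` table are hypothetical, and a fuller implementation would usually also hand out the repositories that share the transaction.

```python
import sqlite3


class SqliteUnitOfWork:
    """Groups several repository writes into one transaction."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def __enter__(self) -> "SqliteUnitOfWork":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            self.conn.commit()  # every write in the block succeeded
        else:
            self.conn.rollback()  # undo all writes made in the block
        return False  # never swallow the exception


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT)")

with SqliteUnitOfWork(conn):
    conn.execute("INSERT INTO posts (title) VALUES (?)", ("kept",))

try:
    with SqliteUnitOfWork(conn):
        conn.execute("INSERT INTO posts (title) VALUES (?)", ("discarded",))
        raise RuntimeError("something failed after the insert")
except RuntimeError:
    pass

titles = conn.execute("SELECT title FROM posts").fetchall()
```

After this runs, `titles` contains only the row written in the successful block.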

  • @davidmasipbonet2508
    @davidmasipbonet2508 7 months ago +3

    Why do you need create_table to be a classmethod?

  • @SkielCast
    @SkielCast 7 months ago

    In this case you have only a couple of columns, but using row_factory = sqlite3.Row and casting to dict would have allowed you to use the **kwargs syntax, which is especially handy in this case. Maybe the code could be a little easier to follow that way
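
The suggestion above can be sketched as follows. The `Post` dataclass and the `posts` table are assumptions made for illustration; `sqlite3.Row` and `row_factory` are part of the standard library.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Post:
    id: int
    title: str
    content: str


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows now support access by column name
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")
conn.execute("INSERT INTO posts (title, content) VALUES (?, ?)", ("Hello", "World"))

row = conn.execute(
    "SELECT id, title, content FROM posts WHERE id = ?", (1,)
).fetchone()

# Column names match the dataclass fields, so the row unpacks directly.
post = Post(**dict(row))
```

This only works while the selected column names match the dataclass field names exactly, so listing the columns explicitly is safer than `SELECT *`.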

  • @bachkhoahuynh9110
    @bachkhoahuynh9110 7 months ago

    The repository pattern is not about switching from SQL to NoSQL. We call this switching effect the persistence ignorance principle. The main thing to consider when using the repository pattern is that you want to decouple domain logic from infrastructure logic. A repository is usually backed by an ORM because when you use raw SQL, you eventually implement some ORM features such as change tracking, proxies for lazy loading, ... I only use raw SQL for complex queries.

  • @sandeshgowdru8869
    @sandeshgowdru8869 7 months ago

    Thanks a lot for making videos. I was looking for an architecture for getting data from multiple sources, and I was looking into a combination of factory, strategy etc., but this pattern is perfect for my needs
    Once again thanks a lot for sharing this....

    • @ArjanCodes
      @ArjanCodes  7 months ago +1

      I'm glad this video was helpful for your current objectives :)

  • @rohailtaimourInc
    @rohailtaimourInc 7 months ago

    Hi @arjancodes, I’ve been really enjoying your videos and specifically how you always focus on how to test the code you demonstrate. Thank you for your content.
    I was wondering if you can cover testing functions that are decorated? They pose an interesting challenge and I didn’t find it to be straightforward to test such use cases

  • @edgeeffect
    @edgeeffect 7 months ago +2

    It's one of my favourite patterns and I so dearly wish we had it in the awful legacy app we've got at work.

  • @Djellowman
    @Djellowman 7 months ago +1

    Great no-nonsense video!

  • @BradleyBell83
    @BradleyBell83 7 months ago +4

    Any reason why ABC was used as opposed to Protocol?

    • @aimbrock
      @aimbrock 7 months ago

      Talk about coming full circle here...
      I went searching for this Protocol package you mention and found a blog article claiming that Protocol is better and everyone should abandon ABC. In that same blog article he links to an ArjanCodes video that maybe answers your question: ruclips.net/video/xvb5hGLoK0A/видео.html
      Having just discovered Repository Pattern and Unit of Work and now ABC and Protocol I have no perspective to offer but I thought it funny.

  • @ChrisBNisbet
    @ChrisBNisbet 7 months ago +4

    Hmm, do the tests you showed us test anything other than the mock class you created for the purpose of adding tests?

    • @joelffarthing
      @joelffarthing 7 months ago

      Imagine an application function or use case that depends on a repository; You can inject the 'fake' Repository in your test instead of the version that uses a real database. That way, you have something that implements the expected interface, but doesn't actually require a real database. He talked about this but didn't actually show an example. Architecture Patterns With Python is a great book that goes over this and other patterns in detail.

    • @ChrisBNisbet
      @ChrisBNisbet 7 months ago

      @@joelffarthing Yep, I get all that.
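
Since this thread notes that the video did not show the fake repository being injected into application code, here is one possible sketch. Every name in it (`Post`, `PostRepository`, `InMemoryPostRepository`, `titles_containing`) is hypothetical; the point is that the function under test is the business logic, and the fake only supplies data.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Post:
    id: int
    title: str
    content: str


class PostRepository(Protocol):
    def get_all(self) -> list[Post]: ...


class InMemoryPostRepository:
    """Fake used in tests; satisfies PostRepository without a database."""

    def __init__(self, posts: list[Post]) -> None:
        self._posts = list(posts)

    def get_all(self) -> list[Post]:
        return list(self._posts)


def titles_containing(repo: PostRepository, word: str) -> list[str]:
    """Business logic under test; it only knows the repository interface."""
    return [p.title for p in repo.get_all() if word.lower() in p.title.lower()]


fake = InMemoryPostRepository(
    [Post(1, "Python tips", "..."), Post(2, "Cooking", "..."), Post(3, "More python", "...")]
)
assert titles_containing(fake, "python") == ["Python tips", "More python"]
```

The assertion exercises `titles_containing`, not the fake itself, which addresses the concern about tests that only verify the mock.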

  • @obsidiansiriusblackheart
    @obsidiansiriusblackheart 7 months ago

    I find like most patterns, I have used this before but didn't know the name. Thanks for this awesome video! Your channel really helps me better understand coding and jargon in the field (I have ~10 years coding xp and 6/7 years work xp)

    • @ArjanCodes
      @ArjanCodes  7 months ago

      I'm really happy to hear that these types of videos have been useful! :)

  • @barefeg
    @barefeg 7 months ago

    Awesome. Maybe follow-ups could be how to define filters in your get_all that are not tied to SQL (e.g. specification pattern), as well as handling transactions with the unit of work pattern.
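
One way the filter idea might look, as a sketch only: specifications are modelled here as plain predicates over a hypothetical `Post` dataclass, and the repository is an in-memory one. A SQL-backed repository would instead have to translate each specification into a WHERE clause, which this example does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Post:
    title: str
    published: bool


# A specification is any predicate over Post; it carries no SQL.
Specification = Callable[[Post], bool]


def is_published(post: Post) -> bool:
    return post.published


def title_contains(word: str) -> Specification:
    return lambda post: word.lower() in post.title.lower()


def all_of(*specs: Specification) -> Specification:
    # Combine several specifications with a logical AND.
    return lambda post: all(spec(post) for spec in specs)


class InMemoryPostRepository:
    def __init__(self, posts: list[Post]) -> None:
        self._posts = list(posts)

    def get_all(self, spec: Optional[Specification] = None) -> list[Post]:
        if spec is None:
            return list(self._posts)
        return [post for post in self._posts if spec(post)]
```

A call such as `repo.get_all(all_of(is_published, title_contains("python")))` then reads as a business rule rather than a query.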

  • @FolkOverplay
    @FolkOverplay 7 months ago +2

    Is there a special reason why the tests were not refactored to use parameterize?

  • @Nalewkarz
    @Nalewkarz 7 months ago +2

    You are not limited to Python 3.12 with this. You can also do it with older versions. Just use T = TypeVar("T") and then inherit from Generic[T] in the repository. But I'll allow myself some criticism. This pattern is not very useful without a stricter Port/Adapter pattern where the repository is the implementation of a concrete interface. For simple CRUD you can just go with an ORM; it's not worth the effort.
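
The pre-3.12 spelling mentioned in this comment might look like the sketch below. The method names mirror the `get` and `get_all` discussed elsewhere in the thread, and `InMemoryRepository` is a hypothetical concrete class added so the example runs.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


class Repository(ABC, Generic[T]):
    """Equivalent in spirit to `class Repository[T](ABC)` in Python 3.12."""

    @abstractmethod
    def get(self, item_id: int) -> T: ...

    @abstractmethod
    def get_all(self) -> list[T]: ...


class InMemoryRepository(Repository[T]):
    def __init__(self, items: dict[int, T]) -> None:
        self._items = dict(items)

    def get(self, item_id: int) -> T:
        return self._items[item_id]

    def get_all(self) -> list[T]:
        return list(self._items.values())


# A type checker treats `names` as a repository of str.
names: Repository[str] = InMemoryRepository({1: "Ada", 2: "Grace"})
```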

  • @user-pz3wg6ch9b
    @user-pz3wg6ch9b 7 months ago +1

    Why classmethod when it's not accessing any class instance variable also not returning the class? Can be a staticmethod right.

  • @SkielCast
    @SkielCast 7 months ago

    Wouldn't it make more sense for the PostRepository methods to take a Post object rather than kwargs? That way we could leverage typing

  • @dankprole7884
    @dankprole7884 7 months ago

    I use this for reading and writing dataframes. csv, parquet or pickle, local storage or s3. Same interface 😊

  • @CottidaeSEA
    @CottidaeSEA 7 months ago

    The repository is one of my favorites, because I really don't like it when database queries are tightly coupled with logic or the entities themselves.

  • @kristofferjohansson3768
    @kristofferjohansson3768 24 days ago

    When in doubt, create an interface as Fowler would say.

  • @johnabrossimow
    @johnabrossimow 7 months ago +1

    I wrote a class to access the filepaths in the project repository my app creates.

  • @peterlogg5576
    @peterlogg5576 6 months ago

    Is there a reason to make the parent `Repository` an `ABC` rather than a `Protocol`? We generally use `ABC` for our repositories but I'm curious if there's a reason not to implement it as a Protocol instead?
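
The thread does not answer this question, but the practical difference can be shown in a few lines. In this sketch all class names are hypothetical: an `ABC` requires explicit inheritance and refuses to instantiate incomplete subclasses, while a `Protocol` accepts any class with matching methods.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class RepositoryABC(ABC):
    @abstractmethod
    def get_all(self) -> list[str]: ...


@runtime_checkable
class RepositoryProtocol(Protocol):
    def get_all(self) -> list[str]: ...


class ListRepository:
    """Inherits from nothing, yet has the method the protocol asks for."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    def get_all(self) -> list[str]:
        return list(self._items)


class IncompleteRepository(RepositoryABC):
    """Forgets to implement get_all, so it cannot be instantiated."""


repo = ListRepository(["a", "b"])
matches_protocol = isinstance(repo, RepositoryProtocol)  # structural check
matches_abc = isinstance(repo, RepositoryABC)  # nominal check
```

Note that `runtime_checkable` only checks that the method exists, not its signature; full checking of a Protocol is the job of a static type checker.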

  • @tihon4979
    @tihon4979 7 months ago +3

    Cool! What about Unit of work pattern? ;)

    • @edgeeffect
      @edgeeffect 7 months ago +1

      Yeah.... this is all starting to sound a little bit like "Doctrine" ... but that's PHP???????

  • @VashdyTV
    @VashdyTV 7 months ago

    Great guide as always!

    • @ArjanCodes
      @ArjanCodes  7 months ago

      Thank you so much!

  • @mustafabozkaya3658
    @mustafabozkaya3658 2 months ago

    Very awesome

  • @vikingthedude
    @vikingthedude 7 months ago +1

    This looks like the strategy design pattern, applied to storage. Here, the SQLite storage is a specific strategy. Another strategy could be a remote storage. Am I understanding this right?

    • @Naej7
      @Naej7 7 months ago +1

      I understand what you mean, but I can’t really say it is exactly the same thing. It does use the same mechanism though, which is essentially dependency injection

    • @thomaseb97
      @thomaseb97 7 months ago +1

      most patterns are conceptually similar, at least within the same category, there is very little difference between them, they just tackle somewhat specific tasks
      if its easier for you to imagine it as strategy pattern go for it

  • @basedmuslimbooks
    @basedmuslimbooks 7 months ago

    I love this - can you expand your repository design patterns to other databases ? Mongodb is something im struggling with. Or graph databases

  • @TheOnlyEpsilonAlpha
    @TheOnlyEpsilonAlpha 7 months ago

    I used it lately without knowing that I used it and without the decorators. Wrote a file-handling .py for CRUD that way without specifying the content, so it can handle CRUD operations and is not tied to a specific content type

  • @Alticroo
    @Alticroo 1 month ago

    10:40 is this a good use case for "repository" mixins then?
    Have a BaseRepository that implements basic CRUD
    Then have mixins that enable more distinct but asymmetric operations - is this possible to do whilst respecting the generics from the BaseRepository?

  • @Vijay-Yarramsetty
    @Vijay-Yarramsetty 7 months ago

    thanks

  • @broomva
    @broomva 7 months ago

    And how would you abstract away the SQL queries in the repository definition, so that different types of repository implementations could be made by changing something like a SQL objects template?

  • @feldinho
    @feldinho 7 months ago +1

    Using this pattern, how would you deal with N+1 problems?
    Imagine you have posts with authors; it's easy to get an author with no posts or a post with no authors, but what about retrieving both together? Calling each other's repository would lead to infinite recursion while using joins in both repos would lead to duplicate logic. How would you solve this?

    • @NotNullReference
      @NotNullReference 7 months ago +1

      You can add as many methods as you need. The repository pattern is just a way to abstract and decouple the data access logic from the business logic.
      So, if you need the Posts and Authors, you create a query that holds both items. Which repository you create this method in depends on the "dependent" side of the query, because it is different to say:
      - "I need the author of this book": meaning that you need a BookWithAuthor class, with the author as the dependent side, so a join from book to author with a where on BookId
      - "I need the books of this author": meaning that you need an AuthorBook class, with the book as the dependent side, so a join from author to book with a where on AuthorId
      Whichever is your case, you create the method on the strong side: in the first case, that is the BookRepository; in the second, the AuthorRepository

    • @Nalewkarz
      @Nalewkarz 7 months ago

      It's not very useful without the rest of the hexagonal architecture building blocks. Basically you just won't do it like you think. You must have some facade like "use cases" or "service", then pack database objects into entities, which in that case would be aggregates because they will consist of two different related types of objects. Just imagine a DAO with prefetched related objects. I can recommend a very good book about such an implementation: "Implementing the Clean Architecture" by Sebastian Buczynski.

    • @betopolione.laura.gil.1
      @betopolione.laura.gil.1 7 months ago +1

      Hey man, your question is very common. But it is also very simple to answer. Repository is NOT made for "queries" or "getting data performantly". Repository should be used to persist the state of an "aggregate". In my point of view, you should have only the necessary methods to retrieve the Aggregate, then you modify it, and send it again to the repository asking to persist it. To avoid this kind of confusion about how to use repos, take a look at the CQRS concept.
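
The join-based approach suggested earlier in this thread could be sketched like this. The schema (`posts`, `authors`, `author_id`), the `PostWithAuthor` class, and the method name are all assumptions; the sketch uses the standard-library `sqlite3` module and does not address the aggregate or CQRS view raised in the last reply.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PostWithAuthor:
    post_id: int
    title: str
    author_name: str


class PostRepository:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def get_all_with_author(self) -> list[PostWithAuthor]:
        # One JOIN instead of one author lookup per post (the N+1 pattern).
        rows = self.conn.execute(
            "SELECT p.id, p.title, a.name "
            "FROM posts AS p JOIN authors AS a ON a.id = p.author_id "
            "ORDER BY p.id"
        ).fetchall()
        return [PostWithAuthor(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
)
conn.execute("INSERT INTO authors (name) VALUES (?)", ("Ada",))
conn.executemany(
    "INSERT INTO posts (title, author_id) VALUES (?, ?)",
    [("First", 1), ("Second", 1)],
)

posts = PostRepository(conn).get_all_with_author()
```

The method lives in the post repository because the post is the side being listed; the author repository never needs to call back into it, which avoids the recursion the original question worries about.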

  • @PietroBrunetti
    @PietroBrunetti 7 months ago

    If I don't remember wrong, I saw it in the Cosmic Python book.

  • @nightcrawer
    @nightcrawer 7 months ago

    Hey Arjan! thanks for the post.
    In more complex applications using DDD, is it good practice to separate the domain from the models?
    My repository returns a model and my model knows how to convert into a domain object

  • @Naej7
    @Naej7 7 months ago

    I use it every day, because I need a InMemory version for my tests

  • @brainforest88
    @brainforest88 7 months ago +3

    Tip: Never use SELECT * in a SQL query in code. It bites back.
    Worked 25 years developing db applications in Oracle (pl/sql). Looking at sqlalchemy queries is exhausting the L1 cache in my brain. I‘m used to write my sql straight. Easier to understand and I doubt I can do everything I need with Orm, so why start with it in the first place.

    • @dankprole7884
      @dankprole7884 7 months ago

      Agreed, I use them both so infrequently I want to double my chances of remembering so I just use sql

  • @luscasleo
    @luscasleo 7 months ago +3

    I didn't know that it's possible to declare generic classes using brackets like that, without even needing to declare the TypeVar T. Which Python version is that?

    • @maephisto
      @maephisto 7 months ago +4

      3.12

    • @Naej7
      @Naej7 7 months ago +1

      The newest version 😉

  • @loicquivron3872
    @loicquivron3872 7 months ago +3

    The thumbnail looks so cursed

    • @Naej7
      @Naej7 7 months ago +1

      Right ? It has a « made with AI » vibe

  • @maephisto
    @maephisto 7 months ago +1

    Around 5:40 we see that the new class Repository is a generic class that returns T (via get), list of T (via get_all). But why the add and update methods have no notion of T? Why do we go away from the generics and we introduce the **kwargs: object? I was expecting an add method which takes as an input a T.

    • @aflous
      @aflous 7 months ago +2

      It allows more flexibility in the sense that you would not be tied to only use Post as an argument for these methods

    • @maephisto
      @maephisto 7 months ago

      @@aflous well... So why not returning then with get and get_all an object with random fields. My point is : some methods are specialized for T, others not. And I don't think that's right because I understand flexibility but either everything is flexible or nothing is like that and it's based on T.

    • @aflous
      @aflous 7 months ago

      @@maephisto when you perform a get or get_all you wouldn't really need to specify any other info and you would expect to get an object of the same type (or a list of objects of that type). For other methods like update, you need at least to specify some other info like the data you want to supply for the update.

    • @maephisto
      @maephisto 7 months ago

      Not fully convinced. When you add, you add T. But the example with update make sense. Thanks
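
The compromise this thread arrives at (add takes a `T`, update takes an id plus the changed fields) might look like the sketch below. It is not the video's code: `InMemoryRepository` and `Post` are hypothetical, and `update` assumes that `T` is a dataclass because it relies on `dataclasses.replace`.

```python
from dataclasses import dataclass, replace
from typing import Generic, TypeVar

T = TypeVar("T")


class InMemoryRepository(Generic[T]):
    def __init__(self) -> None:
        self._items: dict[int, T] = {}
        self._next_id = 1

    def add(self, item: T) -> int:
        # Typed on T, so a type checker rejects adding the wrong kind of object.
        item_id = self._next_id
        self._items[item_id] = item
        self._next_id += 1
        return item_id

    def get(self, item_id: int) -> T:
        return self._items[item_id]

    def update(self, item_id: int, **changes: object) -> T:
        # Only the changed fields are passed; assumes T is a dataclass.
        updated = replace(self._items[item_id], **changes)
        self._items[item_id] = updated
        return updated


@dataclass(frozen=True)
class Post:
    title: str
    content: str


repo: InMemoryRepository[Post] = InMemoryRepository()
post_id = repo.add(Post("Hello", "World"))
repo.update(post_id, title="Hi")
```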

  • @jwcnmr
    @jwcnmr 6 months ago

    Design Patterns are usually "discovered." Has this pattern been described elsewhere?

  • @sarveshsawant7232
    @sarveshsawant7232 7 months ago

    Great

  • @plato4ek
    @plato4ek 7 months ago

    9:05 So, in essence, you create a mock and put this mock under test. This is not a proper way to do testing.

  • @MrLotrus
    @MrLotrus 7 months ago

    It hurts when adding transactions

  • @xtunasil0
    @xtunasil0 7 months ago

    It's the standard way to work with the java spring framework

  • @thepaulcraft957
    @thepaulcraft957 7 months ago +2

    saw it during an internship every day and now I am sick of it

    • @Naej7
      @Naej7 7 months ago

      So since you’ve seen code every day, you’re now sick of code as well ?

    • @xiggywiggs
      @xiggywiggs 7 months ago

      @Naej7 some days... yeah lol

    • @thepaulcraft957
      @thepaulcraft957 7 months ago +2

      @@Naej7 no but it was completely overused and made simple things much too complicated

    • @Naej7
      @Naej7 7 months ago

      @@thepaulcraft957 Probably not, it’s used for a reason (often tests)

    • @thepaulcraft957
      @thepaulcraft957 7 months ago

      @@Naej7 but for many things you could use simple dtos instead of full repositories. Testing is a good point though.