Using the Repository Pattern for better data access encapsulation (in Python)

  • Published: 27 Jul 2024
  • The code for this tutorial is available in 𝐆𝐢𝐭𝐇𝐮𝐛: github.com/abunuwas/repositor...
    This tutorial explains
    👉 What the Repository Pattern is
    👉 When to use the Repository Pattern
    👉 How to implement the Repository Pattern
    You'll learn to use the Repository Pattern by refactoring an application in which the data access layer is tightly coupled with the API layer. To keep things simple, we use a SQLite database for demonstration, but the same strategies can be applied with any other database engine.
    The starter code is an API built with FastAPI and it uses SQLAlchemy to manage interactions with the database. The video doesn't explain how FastAPI and SQLAlchemy work, so if you need a 𝐫𝐞𝐟𝐫𝐞𝐬𝐡𝐞𝐫, check out the following videos:
    👉 𝐅𝐚𝐬𝐭𝐀𝐏𝐈: • FastAPI Tutorial
    👉 𝐒𝐐𝐋𝐀𝐥𝐜𝐡𝐞𝐦𝐲 + 𝐀𝐥𝐞𝐦𝐛𝐢𝐜: • Setting up Alembic wit...
    🚨𝐖𝐀𝐑𝐍𝐈𝐍𝐆🚨 The content in this tutorial is fairly 𝐚𝐝𝐯𝐚𝐧𝐜𝐞𝐝. If you're a 𝐛𝐞𝐠𝐢𝐧𝐧𝐞𝐫 developer, you can still benefit from watching this video, but don't get frustrated if not everything makes sense. I highly recommend you 𝐩𝐫𝐚𝐜𝐭𝐢𝐜𝐞 along with the code and try out all the commands I run in the terminal. If you encounter any issues with the code or something doesn't make sense, feel free to ping me or raise an issue in the GitHub repo for this video. My contact details are at the bottom of this description.
    💡𝐃𝐢𝐬𝐜𝐥𝐨𝐬𝐮𝐫𝐞💡 The implementation of the Repository Pattern shown in this video is fairly standard. The idea of the repositories registry is my own opinionated take. You don't need to use both patterns together. The strategy used to make dependencies injectable is also opinionated, and you may be able to use alternative dependency injection strategies. These are the strategies and patterns that I generally use with my clients.
    Chapters:
    0:00 Introduction
    0:29 Startup code and clone and checkout
    1:22 Presenting the startup code, API and db models
    3:00 How the API is tightly coupled to the data layer and why tight coupling is bad
    4:32 How repository helps avoid tight coupling with the data layer and what are the benefits
    6:50 How do you implement the repository pattern?
    8:45 Implementing the bookings repository
    10:22 Moving database logic from the API layer to the repository
    11:41 Using the repository in the API layer
    12:50 Setting up the environment, installing the dependencies
    13:30 Running the database migrations
    13:54 Running the server and testing the changes using the Swagger UI
    15:07 Introducing the test suite and running it
    16:12 Making SQLAlchemy's session maker an injectable dependency
    18:15 Testing the changes
    18:41 Updating the test suite to inject SQLAlchemy's session maker
    19:32 Mocking SQLAlchemy's session maker
    20:06 Adding a bookings business object
    25:07 Returning business objects from the repository
    27:12 Testing the changes
    27:50 Injecting the repositories into the FastAPI app using a registry
    30:58 Injecting a dummy repository in the test and running the test suite
    33:19 Wrapping up
    𝐖𝐡𝐚𝐭 𝐢𝐬 𝐭𝐡𝐞 𝐑𝐞𝐩𝐨𝐬𝐢𝐭𝐨𝐫𝐲 𝐏𝐚𝐭𝐭𝐞𝐫𝐧?
    Repository is a 𝐝𝐚𝐭𝐚 𝐚𝐜𝐜𝐞𝐬𝐬 𝐚𝐛𝐬𝐭𝐫𝐚𝐜𝐭𝐢𝐨𝐧 pattern. It helps you build a bridge between your business layer and your data access layer. Typically, a repository offers an in-memory list or 𝐜𝐨𝐥𝐥𝐞𝐜𝐭𝐢𝐨𝐧-like interface to your data. It 𝐞𝐧𝐜𝐚𝐩𝐬𝐮𝐥𝐚𝐭𝐞𝐬 the implementation details of your data management system, hence 𝐝𝐞𝐜𝐨𝐮𝐩𝐥𝐢𝐧𝐠 your business layer from your databases. With a repository, saving or deleting data from your database looks like adding or removing elements from a list. Using the Repository pattern makes it easier to maintain, test, and debug all the layers of your application in isolation from the database layer.
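To make the collection-like interface concrete, here is a minimal sketch. It is not the code from the video: it uses the standard library's sqlite3 module instead of SQLAlchemy so that it runs on its own, and the `bookings` table, the `guest` field, and the method names are assumptions chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Booking:
    """Plain business object; the fields are hypothetical."""
    id: Optional[int]
    guest: str


class BookingsRepository:
    """Collection-like facade over a `bookings` table (assumed schema)."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def add(self, booking: Booking) -> Booking:
        # Looks like appending to a list; the SQL stays hidden in here.
        cursor = self.connection.execute(
            "INSERT INTO bookings (guest) VALUES (?)", (booking.guest,)
        )
        return Booking(id=cursor.lastrowid, guest=booking.guest)

    def list(self) -> List[Booking]:
        rows = self.connection.execute("SELECT id, guest FROM bookings ORDER BY id")
        return [Booking(id=row[0], guest=row[1]) for row in rows]

    def delete(self, booking_id: int) -> None:
        self.connection.execute("DELETE FROM bookings WHERE id = ?", (booking_id,))
```

The caller only sees `add()`, `list()`, and `delete()`. Committing is left to whoever owns the connection, so the repository itself never decides when a transaction ends.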
    I cover the Repository Pattern in-depth in 𝐜𝐡𝐚𝐩𝐭𝐞𝐫 𝟕 of my book 𝐌𝐢𝐜𝐫𝐨𝐬𝐞𝐫𝐯𝐢𝐜𝐞 𝐀𝐏𝐈𝐬. You can download chapter 7 for 𝐟𝐫𝐞𝐞 using the following link: www.microapis.io/resources/mi...
    You can also use the following code to obtain a 𝟒𝟎% 𝐝𝐢𝐬𝐜𝐨𝐮𝐧𝐭 if you want to buy the book: 𝐬𝐥𝐩𝐞𝐫𝐚𝐥𝐭𝐚. Use the following link: mng.bz/0wmx.
    I regularly organise live 𝐰𝐨𝐫𝐤𝐬𝐡𝐨𝐩𝐬 on software development. If you want to attend any of those workshops, check out the following page: microapis.io/workshops
    If you have any 𝐪𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬, or if you would just like to 𝐜𝐨𝐧𝐧𝐞𝐜𝐭 with me, feel free to reach out on any of the following platforms:
    👉 𝐓𝐰𝐢𝐭𝐭𝐞𝐫: / joseharoperalta
    👉 𝐋𝐢𝐧𝐤𝐞𝐝𝐈𝐧: / jose-haro-peralta
  • Science

Comments • 33

  • @stevecanny1583 2 years ago +4

    This was super-helpful for me José, thank you so much! This is by far the best video on YouTube for repository pattern in Python. I've known the abstract pattern for some time but struggled to interpret it into a Python context. This totally did the trick for me, thanks again so much!

    • @microapis 2 years ago

      Thank you for your kind words Steve! I'm so glad to hear you found it useful!

  • @RamiAwar 6 months ago +2

    Ok, this looks good for CRUD. What about beyond CRUD? How do we do joins with this repo pattern? Or a simple subquery? How can we group queries into one instead of hitting the DB several times?
    I mean at least you're still tightly coupled to SQLAlchemy and you're returning the sqlalchemy instances as is. I've seen some repo patterns where people serialized those into intermediate objects, creating even bigger problems down the line cause of the need to re-fetch that sqlalchemy object in different functions.
    I feel like everyone goes for repository patterns for simple projects which ends up creating way more problems than it solves.
    Also, I love the repository registry idea, but am not loving the fact that it's injected into the request object. I think it would be way cleaner as a dependency!😁

    • @microapis 1 month ago

      Thanks for your comment @RamiAwar! This is a great discussion. First of all 💯 repository isn't suitable in all situations - if it doesn't help it has no place in our codebase.
      Re joins and subqueries - I think this is where the repository shines. You are simply encapsulating the complexity of those queries away from your business and other layers. The example in the video may give the wrong impression that repository is a 1-to-1 mapping between classes and tables, but that's not how it works in practice. Domain models usually pull data from multiple tables and the repository should serve those needs.
      Ideally, repository doesn't return SQLAlchemy objects, but DTOs or something similar like you say. In the tutorial, the repo's add() method returns an instance of a plain Booking object (github.com/abunuwas/repository-pattern-tutorial/blob/master/data_access/repository.py#L24) and the list() method returns a list of dictionaries (github.com/abunuwas/repository-pattern-tutorial/blob/master/data_access/repository.py#L14). A plain and simple DTO would be a better choice.
      Personally, I only use the repository pattern when I want to enforce a clear separation between data access and other layers for testing and other purposes, and when queries are growing complex and I want to abstract them away from the business layer.
      Hope these comments help and thanks again for sharing your thoughts!
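As a rough illustration of the two points in this reply (a join hidden behind the repository, and a DTO instead of an ORM object as the return value), here is a sketch using sqlite3 from the standard library. The three-table schema, `BookingSummary`, and `list_summaries()` are hypothetical and do not come from the tutorial repo.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BookingSummary:
    """DTO handed to the business layer; never a cursor or ORM instance."""
    booking_id: int
    guest_name: str
    room_number: str


class BookingsRepository:
    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def list_summaries(self) -> List[BookingSummary]:
        # One round trip to the database: the join lives here,
        # not in the business layer.
        rows = self.connection.execute(
            """
            SELECT b.id, g.name, r.number
            FROM bookings AS b
            JOIN guests AS g ON g.id = b.guest_id
            JOIN rooms AS r ON r.id = b.room_id
            ORDER BY b.id
            """
        )
        return [BookingSummary(*row) for row in rows]
```

Because the DTO is frozen and carries no session, callers cannot accidentally trigger lazy loads or need to re-fetch an ORM instance later.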

  • @kevon217 1 year ago +2

    great and comprehensive video, thanks!

    • @microapis 1 year ago

      Thank you for your kind words Kevin 🙏!

  • @benmolina3700 1 year ago +2

    Great video, very well explained, thanks!! 😀

    • @microapis 1 year ago

      Thank you for your nice feedback ❤❤!

  • @_das_19 6 months ago +2

    Thank you :)

  • @user-vc5yd5wr8c 2 years ago +3

    Great! Thanks!

    • @microapis 2 years ago

      Thank you for your kind feedback!

  • @alexng1126 8 months ago +2

    great video. tks

    • @microapis 6 months ago

      Thanks for checking and for your kind comment 🙏!

  • @lucasalvarezlacasa2098 1 year ago +2

    Really good video. I'm starting to learn about FastAPI. One question I have about your design is related to the dependency injection.
    Why are you injecting the dependency as part of the "app" object (create_server method) instead of adding them to the dependencies list offered when creating a new "FastAPI" and then using them somehow in the routes of the controller?
    Shouldn't the Repository be added only at the Router level for Bookings instead of at the level of the entire application?
    Let me know.

    • @microapis 1 year ago +1

      Hi Lucas thank you for your kind feedback and for your questions 🎉! This is a very good question about dependency injection in FastAPI, so let me analyze each option separately:
      👉 Using 𝐠𝐥𝐨𝐛𝐚𝐥 𝐝𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐜𝐢𝐞𝐬 (fastapi.tiangolo.com/tutorial/dependencies/global-dependencies/): Unfortunately, global dependencies can't return a value, so we can't register the repositories as global dependencies and access them in the routes. There's currently a feature request open in FastAPI to make this possible (github.com/tiangolo/fastapi/issues/4246). It would be great if this gets done, because this would be the proper way of registering dependencies.
      👉 Using 𝐫𝐨𝐮𝐭𝐞𝐫 𝐝𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐜𝐢𝐞𝐬: the APIRouter class has the same problem, so we can't register repositories as dependencies for a whole group of routes.
      👉 Using 𝐫𝐨𝐮𝐭𝐞-𝐛𝐚𝐬𝐞𝐝 𝐝𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐜𝐢𝐞𝐬: finally, we could inject the repo directly on the routes, but then we would be importing a specific implementation of the repo in the controller, which means we tightly couple the repo with the controller and we wouldn't be able to easily inject test repositories.
      To fully take advantage of dependency injection, we want to be able to inject our dependencies from an entry point which is fully under our control, which in this case is the create_server() function. Through that function, we control how FastAPI is initialized and which dependencies must be injected into it.
      Hope this makes sense! Let me know if you have more questions!
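The entry-point idea can be sketched without FastAPI at all. In the sketch below, `SimpleNamespace` stands in for the `FastAPI()` instance, and `RepositoriesRegistry`, the `repositories_registry` attribute, and `list_bookings()` are hypothetical names used for illustration; only `create_server()` and `session_maker` are names mentioned in this thread.

```python
from types import SimpleNamespace
from typing import Callable, Dict, List


class RepositoriesRegistry:
    """Maps a name to a repository factory (hypothetical helper, not a FastAPI API)."""

    def __init__(self, **factories: Callable):
        self._factories: Dict[str, Callable] = dict(factories)

    def get(self, name: str, session):
        # Repositories are only instantiated when a route asks for one.
        return self._factories[name](session)


def create_server(registry: RepositoriesRegistry, session_maker: Callable):
    # Stand-in for `FastAPI()`; a real factory would also include the routers.
    app = SimpleNamespace()
    app.repositories_registry = registry
    app.session_maker = session_maker
    return app


class DummyBookingsRepository:
    """Test double that never touches a database."""

    def __init__(self, session):
        self.session = session

    def list(self) -> List[dict]:
        return [{"id": 1}]


def list_bookings(app) -> List[dict]:
    # Route handler analogue: in FastAPI this would read from `request.app`.
    session = app.session_maker()
    repo = app.repositories_registry.get("bookings", session)
    return repo.list()
```

A test can then build the app with `create_server(RepositoriesRegistry(bookings=DummyBookingsRepository), session_maker=lambda: None)` and call the handler without importing or mocking any database module.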

    • @lucasalvarezlacasa2098 1 year ago +1

      Hi @@microapis , thank you so much for your response and sorry about the delay of mine.
      As far as I've seen, there is a parameter that you can specify at the Router or App level called "dependencies", which is a list.
      The problem with doing this is that in order to access a certain dependency from the route itself (the endpoint of the controller), you would need to know the order in which such dependency was injected, which, to be honest, sucks. I don't understand why FastAPI does not have anything better for this situation.
      To me, registering everything at the application level and then having to know inside each controller endpoint how to access those dependencies doesn't make much sense either; it breaks the idea of having each controller receive only what it needs. I'd rather inject the repository at each route, despite the duplication of code, than register them at the application level. Why do you mention that if you inject the Repository at the controller's route level then you wouldn't be able to test it? Can't you just mock that same repository in the test and that's it?
      Let me know.

    • @microapis 1 year ago +1

      @@lucasalvarezlacasa2098 Hi Lucas thank you for your answer! In this case, it's all about tradeoffs. My solution isn't perfect, but it's a common use of dependency injection. The main benefit of my approach is it gives you control over which dependencies must be injected at load time, which is when you have most control over your app configuration. Notice that, although repositories and sessions are available to all routes, you still need to instantiate them, so you're not unnecessarily opening database sessions and so on. I like this approach because it allows you to set the application in its desired state from the moment you create it, and it avoids duplication.
      Injecting directly on the routes is doable, but it means you'll be importing the session maker from SQLAlchemy and the repositories directly in your router module. It means any test on any route within that module needs to mock those import paths if you want to keep the test isolated. The downside of this for me is that if I'm doing a bunch of tests that have nothing to do with the database, I still need to mock those imports. It kind of defeats the purpose of dependency injection.
      The trickiest part really is SQLAlchemy. You need to create the database connection and get the session factory somewhere. Ideally, that happens outside of your routes. So one compromise could be to set up SQLAlchemy in the server factory function ("create_server()"), and inject the repositories in the routes.
      As you say, and as many devs have requested, ideally we would be able to inject the dependencies at the app or router level and access them as parameters in the routes. Hopefully we'll see this feature available soon.
      As I say in the video description, my solution here is opinionated, and I'm sure it won't necessarily work well in all cases. Like all things in software, there's hardly a universally right way of doing things and it's all about use cases and the needs of your project.

  • @yslx740 1 year ago +2

    What if I wanted to save a booking to the DB and also to the filesystem? It seems that because you bind the repo inside of the API route, there's no way to change it or have multiple.

    • @microapis 1 year ago +3

      Thanks for your question @yslx and apologies for the late reply! This is actually a great question. In this case, you'd create another repository for the file system, and list this repo in the registry together with the others.
      The only complication is when we initialise a repository, we typically pass the session. In a file system repository, a database session doesn't make sense - we'd probably pass a file name, or perhaps a buffer. So the business layer needs to distinguish between different types of repositories and know which type of argument to pass, or preferably we encapsulate this knowledge within the registry itself.
      What is important is ensuring that we either commit or roll back all the operations together. If we're saving data to the db and to the filesystem at the same time, we'd want to make sure both writes fail or succeed together - otherwise we'd corrupt our records. So the filesystem repository would have normal commit() and rollback() methods.
      I'd personally implement the file handling logic using the pathlib library and use either write_text() or write_bytes() (depending on the requirements) to save data to the file. I'd commit() using write_text(). If rolling back, then nothing needs to happen - just close the file if you opened it earlier and/or flush the buffer if you were using one.
      Hope this helps!
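A minimal sketch of such a filesystem repository follows, assuming bookings are plain dictionaries stored as a JSON list in a single file. The class name, the JSON format, and the buffering strategy are assumptions; only the use of pathlib's `write_text()` on commit and the no-op style rollback come from the reply above.

```python
import json
from pathlib import Path
from typing import List


class FileBookingsRepository:
    """Buffers writes in memory and only touches the file on commit()."""

    def __init__(self, path: Path):
        self.path = path
        self._pending: List[dict] = []

    def add(self, booking: dict) -> dict:
        self._pending.append(booking)
        return booking

    def commit(self) -> None:
        # Merge the buffered bookings with whatever is already on disk.
        existing = json.loads(self.path.read_text()) if self.path.exists() else []
        self.path.write_text(json.dumps(existing + self._pending))
        self._pending.clear()

    def rollback(self) -> None:
        # Nothing was written yet, so discarding the buffer is enough.
        self._pending.clear()
```

Note that this does not make the database write and the file write atomic as a pair. If the database commit succeeds and `write_text()` then fails, the caller still has to compensate, for example by committing the file first and deleting it if the database commit raises.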

    • @yslx740 1 year ago +1

      @@microapis thanks a lot for the detailed reply! Some food for thought

  • @szymonf7623 10 months ago +2

    Is this approach for putting a session object into an app going to work for an async session? I was struggling to make this work but without luck.

    • @microapis 8 months ago

      Hi thank you for your question and sorry for my late reply! Bear in mind what we're injecting is the session factory, not the session itself. Do you have an example of the code you're struggling with?

    • @szymonf7623 7 months ago

      @@microapis Good point, I'll be back to it then. Thanks.

    • @szymonf7623 7 months ago

      When I try to use the context manager protocol, I get a "TypeError: 'AsyncSession' object does not support the context manager protocol" error. On the other hand, session = request.app.session_maker() works as expected and I get a result, but Postgres is complaining about pools: "Please ensure that SQLAlchemy pooled connections are returned to the pool explicitly, either by calling ``close()`` or by using appropriate context managers to manage their lifecycle." And await session.close() does not seem to solve the problem.
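One likely cause of that TypeError, offered here as an assumption since the failing code was not posted, is entering the session with a plain `with` block. SQLAlchemy's `AsyncSession` is an asynchronous context manager, so it has to be entered with `async with`. The sketch below reproduces the shape of the problem with a stand-in class, `FakeAsyncSession`, so that it runs without SQLAlchemy or a database; it is not the real session class.

```python
import asyncio


class FakeAsyncSession:
    """Stand-in exposing only the async context manager protocol."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # A real async session would release its connection to the pool here.
        self.closed = True


async def handler(session_maker):
    # `async with`, not `with`: the object defines __aenter__/__aexit__ only,
    # so a plain `with` block fails before the body ever runs.
    async with session_maker() as session:
        return session


session = asyncio.run(handler(FakeAsyncSession))
```

With a real async session factory, the same `async with request.app.session_maker() as session:` shape inside an `async def` route should also close the session on exit, which is what the pool warning asks for.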

  • @whu.9163 11 months ago +1

    What is the difference between the repository pattern and a DAO? When should you use each?

    • @microapis 8 months ago

      Thank you for your question! Technically, Data Access Objects (DAOs) are objects that handle the implementation details of saving or retrieving data from a persistence storage. For example, if you use a SQL database, a DAO would take care of translating code to SQL statements.
      It's kind of related to Data Mapper, tho the idea in Data Mapper is to map data from the persistence storage into domain objects. So a Data Mapper may use a DAO to handle the low-level interactions with the db. Repository represents a collection of domain objects. It encapsulates the logic required to map db records to domain objects.
      In most situations, you don't have to handle DAOs directly. You'll normally use a SQLAlchemy model to write db queries using Python code. The SQLAlchemy model is your data mapper, and it'll use internal logic akin to a data access object to translate the code into SQL statements. I always recommend not introducing any business logic within SQLAlchemy models, and to keep them strictly as a data representation layer. So the way I use SQLAlchemy models is closer to a DAO.
      If you need to write very complex queries not worth doing with SQLAlchemy, then you'd write your own DAO. It could be a function that puts together the right SQL statement to retrieve some data. When you map that data to a domain object, you're writing a data mapper. All that logic would be encapsulated behind a repository.
      Note that I say a DAO could be a function despite its name saying data access OBJECT. Historically, the concept comes from the Java community, where everything used to be an object, so that was the only possible implementation. In Python, it can be anything you want that suits your needs.
      Hope this helps 🚀🚀!
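The three roles described in this reply can be sketched side by side. The example uses sqlite3 from the standard library rather than SQLAlchemy, and the `bookings` schema plus the names `fetch_booking_row()` and `to_booking()` are hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Booking:
    """Domain object (fields are assumed for illustration)."""
    id: int
    guest: str


def fetch_booking_row(connection: sqlite3.Connection, booking_id: int) -> Optional[Tuple]:
    """DAO as a plain function: knows the SQL, returns a raw row or None."""
    return connection.execute(
        "SELECT id, guest FROM bookings WHERE id = ?", (booking_id,)
    ).fetchone()


def to_booking(row: Tuple) -> Booking:
    """Data mapper: turns a raw row into a domain object."""
    return Booking(id=row[0], guest=row[1])


class BookingsRepository:
    """Repository: hides both the DAO and the mapper from the business layer."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def get(self, booking_id: int) -> Optional[Booking]:
        row = fetch_booking_row(self.connection, booking_id)
        return to_booking(row) if row is not None else None
```

The business layer only ever calls `repo.get(...)` and receives a `Booking` or `None`; swapping the hand-written SQL for an ORM query would not change that call site.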

  • @angely9783 1 year ago +2

    Your repository looks like a simple DAO to me. There is like a one-to-one mapping between your (domain) model and the database. A repository can achieve more complex things and more custom mappings between the domain objects and the database. In particular, if the Aggregate Root has multiple Value Objects, the repository is responsible for creating or removing these VOs (and for hiding this complexity). Thanks for the video.

    • @microapis 1 year ago

      Thanks for your comment @Angély! You're absolutely right, I totally forgot to bring up this point! I guess I was too focused on encapsulation 😅. Maybe I should cover repository more in-depth in another video!

    • @angely9783 1 year ago +1

      @@microapis Well, I used the Repository pattern in a recent Python project (also based on FastAPI), hence my comment 😊 I have a case where the database differs from the domain objects, so I saw a benefit in applying the Repository pattern. But when using ORM and CRUD apps (one-to-one mapping between entities and the database), it seems to me it just adds another layer for nothing really. I just published the project on GitHub. If interested, I'd gladly share it here. It is rather simple and it doesn't include an ORM (SQL queries not being complex).

    • @microapis 1 year ago

      @@angely9783 would love to take a look at the repo!

    • @angely9783 1 year ago +1

      @@microapis Wow, YouTube has a nice feature called "we silently delete comments with a link in them." 😄 I cannot post the full link, so for those interested: *angely-dev/freeradius-api* after the base URL of GitHub. The conceptual approach is at the end. The repository implementations are in the *src/pyfreeradius* file. It surely is not perfect (e.g., committing inside the repositories) but I think it could be a good example of "objects reconstitution" as per DDD. Thanks again for the video, glad I'm not the only one trying this pattern in Python.

    • @microapis 1 year ago

      @@angely9783 Thanks for the code! That's a great example of how to encapsulate complex database operations behind a repository, which is when the pattern becomes truly useful. Great job!