💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide.
An approach I use is going a level up to the function that calls the one I'm about to write. In that calling function, the name and arguments should read in plain English, almost like a story. Ultimately, if you've got your face in a function you can see what it does, but if you are in the outer scope, a good signature means you don't need to dig deeper on that line to reason about what's going on.
Great video, thank you!
Great tip, thanks for sharing!
This is a great tip I’ve not heard before.
You can thank my old C++ days for that one! :)
This is the major idea behind Test-Driven Development, too. But if you can write your entire application top-down and insert empty functions as placeholders as you go, you gain most of the same benefit. I find I can commonly do that in heavy data-processing apps, but it's hard to do when things get complex.
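A minimal sketch of that top-down style, with hypothetical names and the details stubbed out to fill in later:

    def load_records(path: str) -> list[dict]:
        raise NotImplementedError  # placeholder, implemented later

    def clean_records(records: list[dict]) -> list[dict]:
        raise NotImplementedError  # placeholder

    def summarize(records: list[dict]) -> dict:
        raise NotImplementedError  # placeholder

    def run(path: str) -> dict:
        # the high-level story is written first; the stubs keep it importable right away
        return summarize(clean_records(load_records(path)))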
"Make interfaces easy to use correctly and hard to use incorrectly."
- Scott Meyers, The Most Important Design Guideline
Is this from a book?
@@iliasaarab7922 you can find his talk about this on YT by searching that title
Combination "_and_" functions are useful at collapsing common boilerplate code down into a single line used in high level applications. For example, all applications may need to setup logging, read the local config, open a database, and connect to the attached device. Having a method for that reduces code duplication, allows to improve that common code in one place, makes the app easier to read, and makes creating new apps significantly easier.
I also read that the function name specificity should be inversely proportional to its scope, e.g. a function collect_and_summarize_invoices might be used once or twice within the same class/module, but if a function is used all over the place, it should have a very short name, e.g. python's "open" function.
thank you for another great video.
I am not sure I understood this one. Why can't I use a descriptive name if it is used all over the place?
Great video. At first I was wondering what you could possibly talk about for 30 minutes on just the function signature, but I actually learned a lot. Thanks for putting this together.
Also crucial: function arguments should always be annotated with the most general protocol possible, but the return type should always be as specific as possible.
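A small sketch of that guideline (hypothetical function):

    from collections.abc import Iterable

    def unique_sorted(values: Iterable[int]) -> list[int]:
        # accept any iterable of ints, but promise a concrete, indexable list back
        return sorted(set(values))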
When handling default arguments, I often use the following line to set them at runtime:
`timestamp = timestamp or time.time()`
This works if the default value is `None`, because `None` is falsy and a valid timestamp is truthy.
It relies on the fact that `or` returns the actual value of its first operand if that operand is truthy, without evaluating the second operand at all. If the first operand is falsy, it yields the second operand regardless of whether that is truthy or falsy.
You beat me to this. 😊 It also works on all falsy values like empty data structures. It's also nice to add __bool__ methods to custom classes to indicate when they're uninitialized or when a connection is open/closed.
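A sketch of that __bool__ idea on a hypothetical class:

    class Connection:
        def __init__(self) -> None:
            self._socket = None  # not opened yet

        def __bool__(self) -> bool:
            # truthiness reflects whether the connection is open
            return self._socket is not None

    conn = Connection()
    if not conn:
        print("connection is closed")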
There's another naming style sometimes used in the Python stdlib and ecosystem: adjectives describing the quality of the returned object, like reversed, sorted, and itertools.batched. They do not change the argument, so a name like "sort" would be confusing here; that name is used for the corresponding method, which actually transforms the object in place.
I don't know how this style is named though.
Thanks for posting. I'd rethink naming functions after their implementation details. `calculate_total_minus_discount` is perhaps overly close to the implementation detail. I'd suggest something like `calculated_discounted_total_price`. That may read better in the caller's code too, as in `if calculated_discounted_total_price(…):`. I'd also argue about the benefit of verbs in function names… after decades of using them! :) Writing in an FP language, everything is just data to me now :)
Agreed that verbs are often unnecessary. Just describe what the function gives you; discounted_total_price. The word "calculate" is just noise. Unless you need to distinguish between calculating and just returning it.
class DiscountedTotalPrice:
    """The total price that is discounted."""

    def __init__(self):
        """Initiate the object to be used as a callable."""

    def __call__(self, *args):
        """Return the calculated total price."""
        return self.calculate(*args)

    def calculate(self, *args):
        """Calculate the discounted total price and return the result as a float."""

discounted_total_price = DiscountedTotalPrice()
x = discounted_total_price(y, z, m)
# because you should only use verbs as function names ... 😂 (and I prolly made an error in there somewhere.)
A tip I want to share which is slightly related is the “extract function” feature that a lot of IDEs have, which allows you to highlight a code block and press a hotkey to turn it into a function automatically. At least PyCharm has this, and I guess you can find extensions for it for most of the popular editors. You can also do the inverse operation, turning a function back into inline code.
VS Code has it too. Doesn't type-annotate, though.
You could use a NamedTuple for Options; in that case it could be destructured almost like in TS.
Calling it a "function header" is weird. The usual name you see for it is "function signature". That's what it's called in Python itself (see inspect.Signature).
I guess that’s due to my upbringing in C! But yes, signature is correct in a Python setting.
@@ArjanCodes the first step to becoming a Pystro is forgetting all other languages.
Ah, there’s hope for me yet. I’m really good at forgetting things. 😁
Not feeling too comfortable with the “minus” in the function name. I prefer something like “calculate_total_after_discount” and then if the discount math changes the function name is still valid
@@DrDeuteronI thought the official term was "pythonista"
Great video Arjan, yes, I would love to hear your thoughts on function body design 👍
Actually, the two hardest things in computer science are naming things, cache invalidation, and off-by-1 errors. 😉
LOL
It's not even a joke 👀
@@aflous which makes it extra funny
Haha, only serious
Don't forget scope creep!
Loooved this video, so clear and helpful! Keep them coming!
Happy you liked it. Will do 😊
Regarding variable naming, I use the plural form for a collection, as in "cars", unless the variable name already hints at a collection, as in "list_car", in which case I use the singular form.
If I’m not mistaken, starting with Python 3.7, the order of dictionaries is guaranteed.
Yes, and I didn't see Arjan's point there. For a hypothetical one-line syntax to unpack several chosen entries from a dictionary, the idea would reasonably be to use the variable names written on the left as keys into the dictionary -- not to rely on the order in which the dictionary was built (that would already work with tuple unpacking).
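For what it's worth, operator.itemgetter already gets close to that by picking entries by key rather than by insertion order (dict contents hypothetical):

    from operator import itemgetter

    options = {"age_limit": 18, "region": "EU", "active": True}
    age_limit, region = itemgetter("age_limit", "region")(options)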
Something worth noting is that a better type annotation for generic numeric types is a union of numbers.Integral, numbers.Real, and decimal.Decimal (or numbers.Number if complex numbers are allowed as well).
Can you point to more info on the syntax used where you have Numeric? The square brackets right after the function's name. Thanks!
Just quickly, the `def foo[T, U](...): ...` syntax is part of Python's newer generics syntax (PEP 695, Python 3.12). It replaces the awkward use of type variables declared at the module level, where even if you wanted them to be associated with just a single function or class, they really weren't.
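A sketch of that syntax with a constrained type parameter, roughly as shown in the video (Python 3.12+):

    from decimal import Decimal

    def add_number_to_each_element[Numeric: (int, float, Decimal)](
        elements: list[Numeric], number: Numeric
    ) -> list[Numeric]:
        # Numeric is scoped to this function and must resolve to one of the three constraints
        return [element + number for element in elements]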
What I sometimes do regarding the dataclass vs. dict question is use the `validate_call` decorator from pydantic, which validates and parses the input as a pydantic field. This lets the caller pass either a dictionary or the BaseModel.
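A rough sketch of that pattern, assuming pydantic v2 (model and fields hypothetical):

    from pydantic import BaseModel, validate_call

    class Options(BaseModel):
        age_limit: int = 18

    @validate_call
    def process(name: str, options: Options) -> int:
        return options.age_limit

    process("demo", {"age_limit": 21})       # a plain dict is validated and parsed into Options
    process("demo", Options(age_limit=21))   # or pass the model directly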
Great video as always, Arjan. Thanks for covering this topic. A real bugbear of mine is splitting function headers across multiple lines like at 8:14. This is obviously auto-formatted, and I know it's a PEP 8 guideline, but I find it makes headers much more difficult to read (unless they are really long with lots of arguments, which they shouldn't be). I started using "autopep8.args": ["--ignore=E501"] in my settings to ignore long lines.
Regarding the options object: if you use any specific value from it or want to set a default, it should be a normal argument. The only reason to use **kwargs is if you do not access its data and only pass it on as a dict to other functions.
You can also define your own types as variables, and the IDE will recognize them as type aliases (at least PyCharm does). Just define, for example:
Real = int | float
And use it as:
def whatever_function(arg: Real) -> Real:
...
It works for me in the latest version of PyCharm.
Agree, type aliases are very handy for cases where the data type might change e.g. you want to change a string to a UUID
The underscore notation is known as snake case; just for the curious, camelCase has a name as well. Super good videos 👍🏻👍🏻
Valuable insights. Thank you for posting.
Man, what a beautiful video, I've learned a lot, thank you!
Happy to hear you enjoyed it!
IMHO, I am not sure the "total" case is a bad one and the first a good one. The behavior of the "total" function is quite straightforward given the function and parameter names. At least for me it clearly states "total from items, considering a discount", and this is a routine we are used to in real life: subtract the discount :) Why the longer name is not the best one here: we are exposing the internals of how we apply the discount and turning that into the function name. The function name will need to change when we change the discount algorithm, which is not great: we can forget to do that. Also, what if the calculation becomes more sophisticated? The function name would have to be super long. With the "total" option we do not care how the discount is applied, and if we do, then I would say our function does two things and the design should be reconsidered :)
Hi Arjan, thanks a lot for the video. What do you think about returning a bool from functions that otherwise could return None? With a bool as the return value, one can control the main process by knowing whether the function actually did its work. Or would you argue that program flow determines this anyway? Cheers
Great one
I didn't know about TypedDict; I was struggling for a while with a dataclass that had multiple optional fields.
Glad you enjoyed it!
@ArjanCodes As a seasoned software developer, do you make a difference between arguments and parameters? To me, parameters are found in function definitions, arguments are what you call the function with.
That’s the same distinction I know. But I must admit, I’m not consistent in using the terms correctly in the videos.
The video was awesome. Can you make another video on how to assign values to variables, in depth?
(a) I usually teach my guys to use "Iterator" when a (single) yield statement is used in a function. The editor may detect Generator as the actual return type, but it's not a good idea to be that specific.
(b) I also teach my guys to use "None" as the default as often as possible; the actual non-nullable value can then be set in a single line after the header: "value = value if value is not None else default". That's even more readable than a full if-block. For string values you would usually want "value = value if value else default" anyway, as an empty string is usually not an acceptable value.
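A short sketch combining both tips, with hypothetical names:

    import time
    from collections.abc import Iterator

    def spaced_timestamps(count: int, start: float | None = None) -> Iterator[float]:
        start = start if start is not None else time.time()  # resolve the default right after the header
        for i in range(count):
            yield start + i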
Gosh you’re awesome arjan!
13:24 subtitles: "args and quarks". it seems like we went from python to physics just like that :D
I love Arjan's vids. I'd love to know what he does in his job? It does not seem that he's the type of guy who only makes CRUD apps, lol.
Is there a convention for argument order? For example, when implementing `publish_info_to` (5:45), would you put `library` or `info` as the first argument, and why?
I have a question regarding the return types mentioned towards the end of the video:
I understand that being more specific allows us to use more "features" (e.g. list vs Iterable -> being able to use indices to access list elements).
But on the other hand, being more generic allows me to refactor the function more easily, since I am not "bound" to a more specific type (e.g. if I specified Iterable as the return type, I can later change from a list to a set, (ideally) without needing to modify the code that called the function).
Is my line of thinking flawed, or is this a legit concern? And if so, what would be your arguments for / against more specific or more generic return types?
Arjan, great video as always, but I have something to add.
A name like 'calculate_total_minus_discount' is, in my view, not so good. First, it describes too exactly what it does, and second, it looks like it does many things: 'calc_total' and 'minus_discount'. I'd rather name the function 'calc_cart_total' or just 'calc_total' (maybe 'calc_total_applying_discount'). There is no mention of how the discount is applied, 'minus' or 'plus', but for the user the name stays clear.
Comparing dataclasses with TypedDict, I'd prefer dataclasses. The code with them at least visually is more clear: options.age_limit vs options['age_limit'].
...one more thing, about add_number_to_each_element: it is not clear from the name whether it adds in place or creates a new list. I'd prefer something like get_elements_increased_by.
I mostly agree with you. The name should be concise, and if the behaviour is non-trivial, it should be documented in the docstring. I also think dataclasses are much better. TypedDicts are meant to be used to interface with older code that uses dictionaries for stuff like that, not for new code that can use dataclasses, Pydantic, or even NamedTuples.
Regarding your comment about `add_number_to_each_element`, while I have problems with the name for being overly verbose, I think `get_elements_increased_by` isn't that good either. Whether a function adds in place or creates a new one should be easy to describe: take the elements in as a Sequence or an Iterable, so you can't assign to the elements, and return a list. That shows the intent better. To be honest, my favourite name for this would be "increase_elements_by". The verb "get" is overused in function names.
@@maleldil1 `increment_by(elements, increment)`?
well at this point, make a Cart class and put it in methods with a default _discounted=False class attribute.
Enjoyed ❤
Thank you!
What is your Keyboard Arjan?
Looks like NuPhy Air75.
Hey Arjan, can you tell me how do you get this 'sparkles' indicator for the line that you're currently on?
It appears if you have the copilot extension installed. It allows you to access copilot if you click on it.
would you do `weight_kg: int` or `kg = int; weight: kg`?
thnx for the video, I've learned a lot from u!
You’re welcome! ☺️
What about NamedTuple for your Options instead of Dataclass? You can unpack a tuple.
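A quick sketch of that idea (field names hypothetical):

    from typing import NamedTuple

    class Options(NamedTuple):
        age_limit: int = 18
        region: str = "EU"

    def process(options: Options) -> None:
        age_limit, region = options  # a NamedTuple unpacks like a plain tuple
        print(age_limit, region)

    process(Options(age_limit=21))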
I prefer `optional = optional or default` in lieu of `if optional is None: optional = default`.
Unless the optional can be falsy.
I've been using "any" for some type annotations for a while now and it works even without importing it. Any relevant differences to doing it the other way?
Thanks!
Thank you so much!
Don’t add generic words that can be applied almost anywhere to function names like “calculate”. A good test is to try removing the word and see if the meaning actually changes.
1:18 and off-by-one errors
That was a fire hose, but appreciated anyway!
Great insight on important task !!
Thank you, glad you enjoyed it!
If you limit a function to only a few arguments, does that trade off against dependency injection?
20:37 Actually dictionaries are ordered since Python 3.6/3.7
20:30 - Python dicts have been ordered by insertion order for a while now
Verbs aren’t always necessary. I would argue that functions with side effects should have verbs, but functions that derive/transform data can and should be nouns. That eliminates these useless “get”, “calculate”, etc. prefixes spamming all over the code. You already know it's a function, therefore it will always calculate something. Just call it “total_something()” if it's summing something.
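A tiny sketch of the noun style (hypothetical function):

    def total_price(prices: list[float], discount: float = 0.0) -> float:
        # the noun already says what you get back; no "calculate_" prefix needed
        return sum(prices) - discount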
Where can I find more info on this notation: def add_number_to_each_element_v3[Numeric: (int, float, Decimal)]? Numeric is new to me.
In general, I'd say you're better off writing straightforward signatures at first in the spirit of YAGNI. It's easy to spend too much time writing a perfectly generic function when you'll only ever use a single type with it. It's much better to start concrete and get more generic as you need to refactor. That being said, using Sequence/Iterable/Mapping doesn't hurt, as that's barely any effort, and you should return concrete types as much as possible. Finally, naming functions and parameters is an art. It's something I'm continuously thinking about. At the end of the day, you're better off documenting the behaviour in the docstring rather than trying to write the perfect name.
Later equals never
The only reason you should split functions is when you need to use half of it in one place and the other half in another. If you have 10 functions that only ever call each other linearly, the only thing you're achieving is making your code slower and harder to read.
What the function!
"Function names should be actions" -- that convention works well, but it's not the only one. It's very common for functions to look more like nouns that describe their return value. In both cases you get similar information from the name. For example I would argue that a function named "p99" or "average" is better than the same function called "calculate_p99" or "calculate_average".
What is a function header?? You seem to be talking about what I was taught is the signature? Help explain.
Actually, when writing my own logger and scheduler, I found it way better to just pass in a timestamp-creating callable.
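A minimal sketch of that approach (names hypothetical):

    import time
    from collections.abc import Callable

    def log_line(message: str, clock: Callable[[], float] = time.time) -> str:
        # the caller (or a test) can inject its own clock
        return f"{clock():.0f} {message}"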
When it comes to default values for options using a TypedDict, you could define a private default options object and use dictionary merging, e.g.

_default_options: Options = {
    'foo': 0,
    'bar': ['beep'],
}

def func(data: Data, options: Options) -> None:
    merged_options = _default_options | options
I can't write a function without type hints now, it's just automatic. They are worth using for the IDE hints alone IMO. In Neovim, if I have set a function to take an int and accidentally returned a string elsewhere, I know before even running any code. Saves a lot of time and frustration in our dynamic-typing world.
5:56 hard stop, no 😄
Minor nitpick: I think your analysis at 24:30 is not completely right. The reason for using a generic is to enforce that the type of the values in the returned list is the same as whichever type the user chooses to supply in the input Iterable.
In a function definition these are not arguments but parameters ;)
Great video. What about type hints of arguments which are types from other classes like a numpy array of Cosmology class from astropy for example. What would the best practice be for that? Just np.ndarray? Seems ugly.
Why? Type hints are type hints: they tell you what to expect, and in an IDE they enable access to good auto-completions. Using np.ndarray as a type hint is super helpful when writing subsequent code in the function body because of type inference and IntelliSense autocomplete... but maybe I don't understand the term "ugly" in this context :)
Numpy has a typing submodule to help a bit, though it's still in-progress (from numpy.typing import NDArray)
As mentioned, there's numpy.typing to help with that. Unfortunately, there are many libraries that don't provide type hints, so sometimes you'll have to do manual casting (typing.cast) yourself. In some extreme cases, you'd have to provide typed wrappers around untyped libraries.
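A small sketch of the typing.cast approach, using json.loads (which returns Any) as the untyped source:

    import json
    from typing import cast

    raw = '{"retries": 3, "timeout": 30}'
    settings = cast(dict[str, int], json.loads(raw))  # cast documents (but does not check) the expected type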
@@lazerbro omg. what? my version doesn't have it tho. We got security lags.
@@mytelevisionisdead yeah I agree. It just looks ugly to me. I still use it.
The funny part of that saying is that "cache invalidation" is a bad name
At 12:44, instead of the if statement, I personally like timestamp = timestamp or time.time() better. It looks cleaner.
But as always great video!
cyclomatic_complexity -= 1
ftw.
nitpick: this will be wrong if the timestamp is zero :D
If you send me an instance, I can access its attribute names and values in its dunder dict attribute. But that is some inappropriate intimacy.
You like the 'typing' module, but it seems that the typing module is getting deprecated in Python:
ruclips.net/video/cv1F_c66utw/видео.html
I disagree about the options object. It’s an approach that is very common in Java and C#, because those languages only know positional arguments, but in Python the configurable fields of the options object are more commonly passed as keyword arguments.
Tx. "Hardest thing"? Processing everyone's version of null, nul, Null, NULL, "null", \0, , None, Empty, "", 0, "0", "", [ ], { } and so on... esp found in modern, "low-code" data packets.
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
I would not use `Iterable` as shown at the end of the video; I'd rather use `Collection`, as an Iterable can be infinite, and that would make the code get stuck.
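A sketch of the difference (hypothetical function):

    from itertools import count
    from collections.abc import Collection

    def squares(numbers: Collection[int]) -> list[int]:
        # a Collection is sized, so it cannot be an endless stream
        return [n * n for n in numbers]

    # squares(count())  # count() is an infinite Iterable, not a Collection; a type checker rejects this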
Maybe this is related to my programming history (I was taught pure C, K&R 2nd ed., back in the late 1980s), but how about abbreviating names? E.g. "calculate_total_price_including_discount" becomes something like "calc_ttl_prc_incl_dscnt", with arguments abbreviated similarly. Is this an absolute no-go? I hope not ... ;-)
Functions without docstrings look icky to me.😅
=D
I exclusively use slotted dataclasses because of the performance benefits. Even if performance does not matter, either at all or in that area, I feel being consistent has more value than anything a dictionary can offer.
Quick note on generics, the type parameter list in your examples was only added in 3.12 (if I remember correctly) and without those additions declaring type variables and manually handling variance is usually more mess and work than the value they provide.
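For contrast, a sketch of the pre-3.12 style the comment refers to:

    from typing import TypeVar

    T = TypeVar("T")  # declared at module level, even when only one function needs it

    def first(items: list[T]) -> T:
        return items[0]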
I find list[int] typing unpythonic. It's nice to know what a function expects, but if you want many ints, use an array of ints, where the typing is both obvious and enforced. The point of a list is two-fold: it's mutable, and its elements are "any". The point of array.array('i', ...) is that it's an ordered container of ints. I know it's not practical, and no one uses the array module from the standard library, so: j/s.
god python syntax and naming conventions are terrible
the IDE's color differentiation is doing all of the heavy lifting with making this ish readable
wrong title
That's not a function header. There are no function headers in Python. You should think about what words mean when designing sentences in English.
Just don't use default arguments, make a different function.
Python is made to be fast.
Fast to write.
@@alexp6013 Write-once code can be fast to write, sure.
But if you apply a smidgen of sane patterns it is almost as fast to write, and maintainable too.
@@dtkedtyjrtyj Unpacked TypedDicts are very maintainable for kwargs IMO.
However, I do agree that for complex functions, the code handling the defaults should be separated from the implementation.
On simple functions, where good defaults exist, they don't cause any issues.
@@alexp6013 In my experience, it is always easier to just pass in any "simple" defaults. It gets easier to add parameters and read the code. And if you really want to provide a default, use another function that does it.
Default values usually mean your function does too much.
@@dtkedtyjrtyj I would agree for Go, not Python
**kwargs is already a dict; no need to create yet another dict.
Even `calculate_total_minus_discount` is ambiguous. Is the discount subtracted per item? Is the discount a percentage of the total? If the discount is a percentage, should the user pass the percentage as an actual percentage (ie 25%) or a fractional proportion (ie 0.25)? Definitely the best function name would be `calculate_total_of_all_items_and_then_subtract_discount(item_prices: Iterable[int], total_discount_as_an_amount_of_money: int)`. If only there was a way to somehow leave a comment for a function that would document such particulars!
I prefer the idea that higher level functions have shorter names signaling that they have abstracted out the details that the caller should not have to care about. If my service’s job is to resolve the final total to be paid, the top level function should be called simply “total()”. Inside that function you would see things like return total_before_discount() - total_discount()
Probably the most attractive aspect of Python used to be how simple it was to write it and to read the resulting code. One key part of that was duck typing - no need to specify what type of variable you were using, which also made it more flexible as, e.g., the language would handle adding an int and a float.
For some reason, people who like fully specifying types, and should probably have just stuck to those sorts of languages, have come along and fouled this up; now we are encouraged to write unreadable code using zillions of type hints.
I was particularly amused, Arjan, with your section on 'making your function more generic' - achieved by adding even more type hinting ... you could just drop all the type hints and achieve that!
I am sure you have considered all the arguments pro and con type hinting already, so I am not going to change your mind. Let me just say that a significant portion of bugs in my project come from third-party modules not providing type hints or providing generic "Any" types. It takes way more time for a user of your module to crawl through documentation while debugging, or to accept all kinds of return types, than to reference a typed interface.
With more power comes more responsibility. When you were learning python or experimenting and what you made had little consequence if it broke, doing everything loose and fast is fine. When you then have to work with others who depend on you (and you depend on them) these checks end up helping everyone including you much more than they hurt. Just remember, you’re benefiting from everyone else following the rules too. Obviously, you never make bugs, but these rules prevent a lot of the bugs your colleagues will make that you’ll end up having to deal with. 😉
This programming nitpicking is getting ridiculous
Don't know why so much work. Make it (/, **kw)
and let the user decide what parameters he wants. I'm too old for that.
I don't have half an hour to watch a video to see what is worth knowing or what is not. I am a speed reader and would like access to text versions of video.
I'll be honest. I absolutely hate type hinting. It makes an unnecessary mess and makes it harder to read while not bringing any tangible benefit.
100% 👍
I hate when someone requires everything to have type hints. And I agree that they can make the code less readable. But it's hard to live without them.
24:32 and what should we do if we want to combine different types in an iterable?
Very good tips, thanks for sharing.