Regular tips about Python and programming in general Owner — @pushtaev © CC BY-SA 4.0 — mention if repost
Hey folks! I know there hasn't been much activity in this channel lately, but I have some new content to share. I really hope it meets the quality standards you expect and that I always strive to provide. :)
This Thursday, May 22 at 19:00 MSK, there'll be a live mock interview session for Middle Python developers. I'll conduct a realistic Python interview with a volunteer, comment on their responses, and explain exactly what interviewers typically expect. You'll also have the chance to ask questions afterward.
The session will be held in Russian, so apologies in advance to all non-Russian speakers here.
You can get the link via the Telegram bot → @shortcut_py_bot.
See you there! And sorry if it's not exactly what you usually expect from this channel.
In addition to typing.ParamSpec, PEP 612 introduced typing.Concatenate, which allows describing decorators that accept fewer or more arguments than the wrapped function:
from typing import Callable, Concatenate, ParamSpec, TypeVar
P = ParamSpec('P')
R = TypeVar('R')
class User: ...
class Request: ...
class Response: ...
def with_user(
    f: Callable[Concatenate[User, P], R],
) -> Callable[P, R]:
    def inner(*args: P.args, **kwargs: P.kwargs) -> R:
        user = User()
        return f(user, *args, **kwargs)
    return inner

@with_user
def handle_request(
    user: User,
    request: Request,
) -> Response:
    ...
request = Request()
response = handle_request(request)
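Concatenate also works in the other direction: a decorator can add a required argument to the wrapped function. Here is a minimal sketch reusing the definitions above (expect_user is a made-up name):

def expect_user(
    f: Callable[P, R],
) -> Callable[Concatenate[User, P], R]:
    # the returned function requires a User as the first argument
    def inner(user: User, *args: P.args, **kwargs: P.kwargs) -> R:
        print(f'called by {user}')
        return f(*args, **kwargs)
    return inner

@expect_user
def handle_ping() -> Response:
    ...

response = handle_ping(User())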
Let's say you have the following mock:

from unittest.mock import Mock

user = Mock()
user.name = 'Guido'

You fully specified all attributes and methods it should have, and you pass it into the tested code, but then that code uses an attribute that you don't expect it to use:

user.age
# <Mock name='mock.age' id='...'>

Instead of failing with an AttributeError, the mock will create a new mock when an unspecified attribute is accessed. To fix it, you can (and should) use the unittest.mock.seal function (introduced in Python 3.7):

from unittest.mock import seal
seal(user)
user.name
# 'Guido'
user.occupation
# AttributeError: mock.occupation
PEP-615 (landed in Python 3.9) introduced the zoneinfo module, which provides access to information about time zones. It tries to use the time zone data provided by the OS; if not available, it falls back to the official Python tzdata package, which you need to install separately.
from zoneinfo import ZoneInfo
from datetime import datetime
ams = ZoneInfo('Europe/Amsterdam')
dt = datetime(2015, 10, 21, 13, 40, tzinfo=ams)
dt
# datetime(2015, 10, 21, 13, 40, tzinfo=ZoneInfo(key='Europe/Amsterdam'))
la = ZoneInfo('America/Los_Angeles')
dt.astimezone(la)
# datetime(2015, 10, 21, 4, 40, tzinfo=ZoneInfo(key='America/Los_Angeles'))

You should not use pytz anymore.
When talking about asyncio functions, sometimes I used the word "coroutine" and sometimes "task". It's time to tell you the difference:
+ A coroutine is what an async function returns. It can be scheduled, switched, closed, and so on. It's quite similar to generators. In fact, the await keyword is nothing more than an alias for yield from, and async is a decorator turning the function from a generator into a coroutine.
+ asyncio.Future is like a "promise" in JS. It is an object that will eventually hold a coroutine result when it is available. It has a done method to check if the result is available, a result method to get the result, and so on.
+ asyncio.Task is like if a coroutine and a future had a baby. This is what asyncio mostly works with. It can be scheduled, switched, canceled, and it holds its result when ready.
There is a cool function asyncio.create_task that can turn a coroutine into a proper task. What's cool about it is that this task immediately gets scheduled. So, if your code later encounters await, there is a chance your task will be executed at that point.
import asyncio

async def child():
    print('started child')
    await asyncio.sleep(1)
    print('finished child')

async def main():
    asyncio.create_task(child())
    print('before sleep')
    await asyncio.sleep(0)
    print('after sleep')

asyncio.run(main())
before sleep
started child
after sleep
This is what happened:

1. When create_task is called, the task is scheduled but not yet executed.
2. When main hits await, the scheduler switches to child.
3. When child hits await, the scheduler switches to another task, which is main.
4. main finished, and asyncio.run returned without waiting for child to finish. It's dead in space now.

To wait for a background task, you can use asyncio.gather. And later we'll see some ways to wait for it with timeouts or when you don't care about the result.

task = create_task(...)
...
await asyncio.gather(task)
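And to make the Task methods from the list above concrete, here is a minimal sketch (the compute coroutine is made up):

import asyncio

async def compute():
    await asyncio.sleep(0.1)
    return 42

async def main():
    task = asyncio.create_task(compute())
    print(task.done())    # False: scheduled, but not finished yet
    result = await task   # a task can be awaited like a coroutine
    print(task.done())    # True
    print(task.result())  # 42, the same as result

asyncio.run(main())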
Your best companion in learning asyncio is asyncio.sleep. It works like time.sleep, making the calling code wait the given number of seconds. It's the simplest example of an IO-bound task because, while sleeping, your code literally does nothing but wait. And unlike time.sleep, asyncio.sleep is async. That means that while the calling task waits for it to finish, another task can be executed.
import asyncio
import time

async def main():
    start = time.time()
    await asyncio.sleep(2)
    return int(time.time() - start)

asyncio.run(main())
# 2
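To see the difference from time.sleep, run two sleeps concurrently: the total time is still about 2 seconds, not 4. A minimal sketch (it uses asyncio.gather, covered in a later post):

import asyncio
import time

async def main():
    start = time.time()
    # both sleeps wait at the same time
    await asyncio.gather(asyncio.sleep(2), asyncio.sleep(2))
    return int(time.time() - start)

print(asyncio.run(main()))
# 2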
It's time for us to talk about async/await in Python. That's a big and difficult topic but a very important one if you're working with the network.
Everything your program does belongs to one of the two classes:
+ CPU-bound tasks. This is when you do a lot of computations, and the fan of your PC makes helicopter noises. You can speed up computations with multiprocessing, which is a pain in the ass to do correctly.
+ IO-bound tasks. This is when your code does nothing except wait for a response from the outside world. It includes making all kinds of network requests (sending logs, querying a database, crawling a website), serving network responses (like when you have a web app), and working with files. You can speed it up using the async/await syntax.
The basics are quite simple:
1. If you define a function using async def instead of just def, it will return a "coroutine" when it is called instead of immediately running and calculating the result.
2. If, inside an async function, you call another async function and prefix the call with await, Python will request execution of this coroutine, switch to something else, and return the result when it is available.
3. The module asyncio contains some functions to work with async code and the scheduler that decides when to run which task.
This is a very basic overview. You can read the official asyncio documentation to learn more. In follow-up posts, we will cover most of asyncio functions, one by one.
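A minimal sketch putting these three points together (fetch_data is a made-up name):

import asyncio

async def fetch_data():        # 1: async def returns a coroutine
    await asyncio.sleep(1)     # pretend this is a network request
    return {'status': 'ok'}

async def main():
    data = await fetch_data()  # 2: await runs it and gets the result
    print(data)

asyncio.run(main())            # 3: asyncio schedules and runs the root coroutine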
To recap: PEP-518 introduced pyproject.toml, and many Python tools started to use it to store their configs. The issue, however, is that there is no module in stdlib to parse TOML. So, different tools started to use different third-party packages for the task:
+ tomli (used by mypy) is a pure Python library that can only read TOML.
+ toml (used by most of the tools) can both read and write TOML.
+ tomlkit (used by poetry) can read, write, and modify TOML (preserving the original formatting and comments).
PEP 680 (landed in Python 3.11) brought tomli into the stdlib. But why tomli and not another library? It's pure Python and minimalistic. It cannot write TOML files, but reading is enough for most tools to work with pyproject.toml. And to avoid unpleasant conflicts when tomli is installed in the same environment, the module was renamed to tomllib.
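Reading a config with tomllib is straightforward; note that the file must be opened in binary mode. A small sketch (assuming the file has a [tool.mypy] section):

import tomllib

# tomllib.load expects a binary file object
with open('pyproject.toml', 'rb') as f:
    config = tomllib.load(f)

print(config['tool']['mypy'])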
PEP-518 introduced changes not in Python itself but rather in its ecosystem. The idea is pretty simple: let's store configs for all tools in the pyproject.toml file, in the tool.TOOL_NAME section. For example, for mypy:

[tool.mypy]
files = ["my_project"]
python_version = "3.8"

At this moment, almost all popular tools support pyproject.toml as the configuration file, in one way or another: mypy, pytest, coverage, isort, bandit, tox, etc. The only exception from the tooling I know is flake8.

Before pyproject.toml, many tools used setup.cfg for the same purpose, but this format (INI) has a few disadvantages compared to TOML: it's not well-standardized, and the only supported type of values is string.
The isinstance function checks whether an object is an instance of a class or of a subclass thereof:
class A: pass
class B(A): pass
b = B()
isinstance(b, B) # True
isinstance(b, A) # True
isinstance(b, object) # True
isinstance(b, str) # False
isinstance(str, type) # True
Type checkers understand isinstance checks and use them to refine the type:

a: object
reveal_type(a)
# ^ Revealed type is "builtins.object"
if isinstance(a, str):
    reveal_type(a)
    # ^ Revealed type is "builtins.str"
Another cool thing about isinstance is that you can pass it a tuple of types to check if the object is an instance of any of them:

isinstance(1, (str, int))  # True
The reveal_type function doesn't exist. However, if you call it and then run a type-checker (like mypy or pyright) on the file, it will show the type of the passed object:
a = 1
reveal_type(a)
reveal_type(len)

Now, let's run mypy:

$ mypy tmp.py
tmp.py:2: note: Revealed type is "builtins.int"
tmp.py:3: note: Revealed type is "def (typing.Sized) -> builtins.int"

It's quite helpful to see what type mypy inferred for the variable in some tricky cases.
The reveal_type function was also added to the typing module in Python 3.11:

from typing import reveal_type
a = 1
reveal_type(a)
# prints: Runtime type is 'int'
reveal_type(len)
# prints: Runtime type is 'builtin_function_or_method'
And for the curious, here is the definition:

def reveal_type(__obj: T) -> T:
    print(
        f"Runtime type is {type(__obj).__name__!r}",
        file=sys.stderr,
    )
    return __obj
PEP 681 (landed in Python 3.11) introduced the typing.dataclass_transform decorator. It can be used to mark a class that behaves like a dataclass: the type checker will assume that it has an __init__ that accepts the annotated attributes as arguments, as well as __eq__, __ne__, and __str__. For example, it can be used to annotate SQLAlchemy or Django models, attrs classes, pydantic validators, and so on. It's useful not only for libraries that don't provide a mypy plugin but also if you use a non-mypy type checker. For instance, pyright, which is used by the VS Code Python extension to show types, highlight syntax, provide autocomplete, and so on.
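Here is a minimal sketch of how a library could apply it. The runtime part is intentionally naive (a real library generates proper methods), and kw_only_default=True tells the type checker that all generated arguments are keyword-only:

from typing import dataclass_transform

@dataclass_transform(kw_only_default=True)
def model(cls):
    # naive runtime stand-in; a real library would generate
    # proper __init__, __eq__, and friends
    def __init__(self, **kwargs):
        for name in cls.__annotations__:
            setattr(self, name, kwargs.get(name, getattr(cls, name, None)))
    cls.__init__ = __init__
    return cls

@model
class User:
    name: str
    age: int = 0

# the type checker now knows the signature:
# User(*, name: str, age: int = 0)
user = User(name='Guido', age=67)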
I often find myself writing a context manager to temporarily change the current working directory:
import os
from contextlib import contextmanager

@contextmanager
def enter_dir(path):
    old_path = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_path)
Python 3.11 introduced contextlib.chdir, which does exactly that:

import os
from contextlib import chdir
print('before:', os.getcwd())
# before: /home/gram
with chdir('/'):
    print('inside:', os.getcwd())
    # inside: /
print('after:', os.getcwd())
# after: /home/gram
PEP 654 introduced not only ExceptionGroup itself but also a new syntax to handle it. Let's start right with an example:
try:
    raise ExceptionGroup('', [
        ValueError(),
        KeyError('hello'),
        KeyError('world'),
        OSError(),
    ])
except* KeyError as e:
    print('caught1:', repr(e))
except* ValueError as e:
    print('caught2:', repr(e))
except* KeyError as e:
    1/0

The output:
caught1: ExceptionGroup('', [KeyError('hello'), KeyError('world')])
caught2: ExceptionGroup('', [ValueError()])
+ Exception Group Traceback (most recent call last):
| File "<stdin>", line 2, in <module>
| ExceptionGroup: (1 sub-exception)
+-+---------------- 1 ----------------
| OSError
+------------------------------------
This is what happened:

1. When the ExceptionGroup is raised, it's checked against each except* block.
2. The first except* KeyError block catches the ExceptionGroup that contains a KeyError.
3. Each except* block receives not the whole ExceptionGroup but its copy containing only the matched sub-exceptions. In the case of except* KeyError, it includes both KeyError('hello') and KeyError('world').
4. For each sub-exception, only the first match is executed (1/0 in the example wasn't reached).
5. Sub-exceptions that aren't matched by any except* block are re-raised: a new ExceptionGroup with them is raised. So, ExceptionGroup('', [OSError()]) was raised (and beautifully formatted).
PEP 678 (landed in Python 3.11) introduced a new method add_note for BaseException class. You can call it on any exception to provide additional context which will be shown at the end of the traceback for the exception:
try:
    1/0
except Exception as e:
    e.add_note('oh no!')
    raise
# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero
# oh no!

The PEP gives a good example of how it can be useful: the hypothesis library includes in the traceback the arguments that caused the tested code to fail.
Great news everyone! We extracted all our recent posts as Markdown, organized them, and made them more accessible. Now we have:
* 🌐 Website: pythonetc.orsinium.dev
* 📢 RSS: pythonetc.orsinium.dev/index.xml
* 🧑💻️ GitHub: github.com/life4/pythonetc
If you want to write a guest post, just send us a PR on GitHub. The README tells what you can write about and how. Thank you all for staying with us all these years ❤️
Let's say, you have a typical decorator that returns a new function. Something like this:
def debug(f):
    name = f.__name__
    def inner(*args, **kwargs):
        print(f'called {name} with {args=} and {kwargs=}')
        return f(*args, **kwargs)
    return inner

@debug
def concat(a: str, b: str) -> str:
    return a + b

concat('hello ', 'world')
# called concat with args=('hello ', 'world') and kwargs={}
If you check the type of concat using reveal_type, you'll see that its type is unknown because of the decorator:
reveal_type(concat)
# Revealed type is "Any"
You can annotate the decorator with a bare Callable, but then the type checker loses all information about the signature (errors like x: int = concat(1, 2) won't be detected):

from typing import Callable

def debug(f: Callable) -> Callable: ...
You can preserve at least the return type with a TypeVar:

from typing import TypeVar

T = TypeVar('T')

def debug(
    f: Callable[..., T],
) -> Callable[..., T]: ...
Or you can describe the arguments explicitly, but that works only for a fixed number of positional arguments:

A = TypeVar('A')
B = TypeVar('B')
R = TypeVar('R')

def debug(
    f: Callable[[A, B], R],
) -> Callable[[A, B], R]: ...
Another option is a TypeVar bound to Callable. It mostly works, but it's a small lie: inner is not guaranteed to have the same type as the passed callable (for example, someone might pass a class that is callable but we return a function):

F = TypeVar('F', bound=Callable)

def debug(f: F) -> F: ...
The proper solution is typing.ParamSpec from PEP 612 (landed in Python 3.10), which preserves the whole signature:

from typing import Callable, TypeVar, ParamSpec

P = ParamSpec('P')
R = TypeVar('R')

def debug(
    f: Callable[P, R],
) -> Callable[P, R]:
    def inner(
        *args: P.args,
        **kwargs: P.kwargs,
    ) -> R:
        ...
        return f(*args, **kwargs)
    return inner
@debug
def concat(a: str, b: str) -> str:
    ...
reveal_type(concat)
# Revealed type is "def (a: str, b: str) -> str"
Daylight saving time (DST) is the practice of advancing clocks (typically by one hour) during warmer months so that darkness falls at a later clock time, and then turning them back for colder months. That means that once a year the clock shows the same time twice. It can also happen when the UTC shift of the current timezone is decreased.

To distinguish such situations, PEP-495 (landed in Python 3.6) introduced the fold attribute for datetime, which is 0 or 1 depending on whether this is the first or the second pass through the given time in the given timezone.
For example, in Amsterdam the time is shifted from CEST (Central European Summer Time) to CET (Central European Time) on the last Sunday of October:
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo
ams = ZoneInfo('Europe/Amsterdam')
d0 = datetime(2023, 10, 29, 0, 0, tzinfo=timezone.utc)
for h in range(3):
    du = d0 + timedelta(hours=h)
    dl = du.astimezone(ams)
    m = f'{du.time()} UTC is {dl.time()} {dl.tzname()} (fold={dl.fold})'
    print(m)
00:00:00 UTC is 02:00:00 CEST (fold=0)
01:00:00 UTC is 02:00:00 CET (fold=1)
02:00:00 UTC is 03:00:00 CET (fold=0)
Note that fold is not considered in comparison operations:

d1 = datetime(2023, 10, 29, 2, 0, tzinfo=ams)
d2 = datetime(2023, 10, 29, 2, 0, fold=1, tzinfo=ams)
d1 == d2 # True
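The fold does matter as soon as you convert to another timezone, though. A quick check, reusing d1 and d2 from above:

from datetime import timezone

print(d1.astimezone(timezone.utc))
# 2023-10-29 00:00:00+00:00
print(d2.astimezone(timezone.utc))
# 2023-10-29 01:00:00+00:00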
In the previous post, we had the following code:

import asyncio

async def child():
    ...

async def main():
    asyncio.create_task(child())
    ...

Can you spot a bug?
Since we don't store a reference to the background task we create, the garbage collector may destroy the task before it finishes. To avoid that, we need to store a reference to the task until it finishes. The official documentation recommends the following pattern:
bg_tasks = set()

async def main():
    t = asyncio.create_task(child())
    # hold the reference to the task
    # in a global set
    bg_tasks.add(t)
    # automatically remove the task
    # from the set when it's done
    t.add_done_callback(bg_tasks.discard)
    ...
asyncio.gather is the function you will use the most. You pass multiple coroutines into it; it schedules them, waits for all of them to finish, and returns the list of results in the same order.
import asyncio

URLS = ['google.com', 'github.com', 't.me']

async def check_alive(url):
    print(f'started {url}')
    i = URLS.index(url)
    await asyncio.sleep(3 - i)
    print(f'finished {url}')
    return i

async def main():
    coros = [check_alive(url) for url in URLS]
    statuses = await asyncio.gather(*coros)
    for url, alive in zip(URLS, statuses):
        print(url, alive)

asyncio.run(main())
started google.com
started github.com
started t.me
finished t.me
finished github.com
finished google.com
google.com 0
github.com 1
t.me 2
+ asyncio.gather schedules all tasks in the order they are passed.
+ asyncio.gather waits for all tasks to finish.
+ asyncio.gather returns the list of results in the order the coroutines were passed in. So, it's safe to zip results with input values.
Async is like mold in your fridge or GPL license in your dependencies. It propagates through your code, taking over every corner of it. You can call sync functions from async functions but async functions can be called only from other async functions, using the await keyword.
This one returns a coroutine instead of a result:

async def welcome():
    return 'hello world'

def main():
    return welcome()

main()
# <coroutine object welcome at 0x...>

This is how main should look instead:

async def main():
    result = await welcome()
    return result

Alright, but how to call the root function? It also returns a coroutine! The answer is asyncio.run, which will take a coroutine, schedule it, and return its result:

coro = main()
result = asyncio.run(coro)
print(result)

Keep in mind that asyncio.run should be called only once. You can't use it to call an async function from any sync function. Again, if you have an async function to call, all functions calling it (and all functions calling them, and so on) should also be async. Like a mold.
The float type is infamous for being not as precise as you might expect. When you add 2 numbers, the result might contain a small error in precision. And the more numbers you add together, the higher the error:
sum([.9] * 1_000)
# 899.9999999999849
sum([.9] * 1_000_000)
# 900000.0000153045
To avoid the accumulated error, use math.fsum:

import math
math.fsum([.9] * 1_000_000)
# 900000.0
PEP-517 and PEP-518 introduced the build-system section in pyproject.toml that tells package management tools (like pip) how to build wheel distributions for the project. For example, this is the section if you use flit:
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"
It tells pip to install flit_core of the given version and then call the callbacks inside flit_core.buildapi, which should build the distribution for the project. Before the PEP, every project needed a setup.py file for pip to be able to install the project from the source (or a non-wheel tarball distribution).
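To run this machinery yourself, you can use the build package (a PyPA tool, installed separately), which reads the build-system section, installs the requirements into an isolated environment, and calls the backend:

$ pip install build
$ python -m build
# the result lands in dist/: a source tarball and a wheel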
PEP 427 introduced (and PEP 491 improved) a new format for Python distributions called "wheel".
Before the PEP, Python distributions were just tar.gz archives containing the source code of the distributed library, some additional files (README.rst, LICENSE, sometimes tests), and a setup.py file. To install the library from the distribution, pip had to download the archive, extract it into a temporary directory, and execute python setup.py install.

Did it work? Well, kind of. It worked well enough for pure Python packages, but if a package had C code, it had to be built on the target machine every time the package was installed, because the built binary highly depends on the target OS, architecture, and Python version.
The new wheel format allows to significantly speed up the process. It changed 2 significant things:
1. The file name of a wheel package is standardized. It contains the name and version of the package, the minimal required Python version (2.7, 3.8), the Python interpreter type (CPython, PyPy), the OS name, architecture, and ABI version. For example, flask-1.0.2-py2.py3-none-any.whl says "it is the flask package version 1.0.2 for both Python 2 and 3, any ABI, and any OS". That means Flask is a pure Python package, so it can be installed anywhere. And psycopg2-2.8.6-cp310-cp310-linux_x86_64.whl says "it is psycopg2 version 2.8.6 for CPython 3.10 on Linux 64-bit". That means psycopg2 ships prebuilt C libraries for a very specific environment. A package can have multiple wheel distributions per version, and pip will pick and download the one that is made for you.
2. Instead of setup.py, the archive (which is now zip instead of tar.gz) contains already-parsed metadata. So, to install the package, it's enough to just extract it into the site-packages directory; no need to execute anything.
Currently, the wheel distribution format is well-adopted and available for almost all modern packages.
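Since a wheel is just a zip archive, you can peek inside with the stdlib zipfile module (using the flask file name from above as an example):

$ python -m zipfile -l flask-1.0.2-py2.py3-none-any.whl

Besides the flask/ package itself, the archive contains a flask-1.0.2.dist-info/ directory with the parsed metadata: METADATA, WHEEL, and RECORD.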
When you create a new virtual environment, make sure you have the latest version of setuptools for tarballs, and the latest version of the wheel package for wheels. No, really, do it. The wheel package is not installed by default in the new venvs, and without it, installation of some packages will be slow and painful.
python3 -m venv .venv
.venv/bin/pip install -U pip setuptools wheel
PEP 675 (landed in Python 3.11) introduced a new type typing.LiteralString. It matches any Literal type, which is the type for explicit literals and constants in the code. The PEP shows a very good example of how it can be used to implement a SQL driver with protection on the type-checker level against SQL injections:
from typing import LiteralString, Final
def run_query(sql: LiteralString): ...
run_query('SELECT * FROM students') # ok
ALL_STUDENTS: Final = 'SELECT * FROM students'
run_query(ALL_STUDENTS) # ok
arbitrary_query = input()
run_query(arbitrary_query) # type error, don't do that
As we covered 3 years back (gosh, the channel is old), if a method of a base class returns an instance of the current class, a TypeVar should be used as the annotation:
from typing import TypeVar

U = TypeVar('U', bound='BaseUser')

class BaseUser:
    @classmethod
    def new(cls: type[U]) -> U:
        ...

    def copy(self: U) -> U:
        ...

That's quite verbose, but it's how it should be done for the return type to be correct for inherited classes.
Python 3.11 introduced typing.Self (PEP 673) that can be used as a shortcut for exactly such cases:

from typing import Self

class BaseUser:
    @classmethod
    def new(cls) -> Self:
        ...

    def copy(self) -> Self:
        ...
The typing.assert_type function (added in Python 3.11) does nothing at runtime, like most of the stuff from the typing module. However, if the type of the first argument doesn't match the type provided as the second argument, the type checker will report an error. It can be useful to write simple "tests" for your library to ensure it is well annotated.
For example, you have a library that defines a lot of decorators, like this:
from typing import Callable, TypeVar

C = TypeVar('C', bound=Callable)

def good_dec(f: C) -> C:
    return f

def bad_dec(f) -> Callable:
    return f
And here is how to test that they preserve the signature:

from typing import Callable, assert_type
@good_dec
def f1(a: int) -> str: ...
@bad_dec
def f2(a: int) -> str: ...
assert_type(f1, Callable[[int], str]) # ok
assert_type(f2, Callable[[int], str]) # not ok
There is one more thing you should know about except*. It can match not only sub-exceptions from ExceptionGroup but regular exceptions too. And for simplicity of handling, regular exceptions will be wrapped into ExceptionGroup:
try:
    raise KeyError
except* KeyError as e:
    print('caught:', repr(e))
# caught: ExceptionGroup('', (KeyError(),))
PEP 654 (landed in Python 3.11) introduced ExceptionGroup. It's an exception that nicely wraps and shows multiple exceptions:
try:
    1/0
except Exception as e:
    raise ExceptionGroup('wow!', [e, ValueError('oh no')])
# Traceback (most recent call last):
# File "<stdin>", line 2, in <module>
# ZeroDivisionError: division by zero
# During handling of the above exception, another exception occurred:
# + Exception Group Traceback (most recent call last):
# | File "<stdin>", line 4, in <module>
# | ExceptionGroup: wow! (2 sub-exceptions)
# +-+---------------- 1 ----------------
# | Traceback (most recent call last):
# | File "<stdin>", line 2, in <module>
# | ZeroDivisionError: division by zero
# +---------------- 2 ----------------
# | ValueError: oh no
# +------------------------------------

It's very helpful in many cases when multiple unrelated exceptions have occurred and you want to show all of them: when retrying an operation or when calling multiple callbacks.
PEP 657 (landed in Python 3.11) enhanced tracebacks so that they now include quite a precise location of where the error occurred:
Traceback (most recent call last):
File "query.py", line 24, in add_counts
return 25 + query_user(user1) + query_user(user2)
^^^^^^^^^^^^^^^^^
File "query.py", line 32, in query_user
return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

It shows not only where the error occurred for each frame, but also which code was executed. Beautiful!