blog


  1. Python behind the scenes #13: the GIL and its effects on Python multithreading

    As you probably know, GIL stands for Global Interpreter Lock, and its job is to make the CPython interpreter thread-safe. The GIL allows only one OS thread to execute Python bytecode at any given time, and as a consequence, it's not possible to speed up CPU-intensive Python code by distributing the work among multiple threads. This is, however, not the only negative effect of the GIL. The GIL introduces overhead that makes multi-threaded programs slower, and what is more surprising, it can even have an impact on I/O-bound threads.

    In this post I'd like to tell you more about non-obvious effects of the GIL. Along the way, we'll discuss what the GIL really is, why it exists, how it works, and how it's going to affect Python concurrency in the future.
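
    To make the claim about CPU-bound code concrete, here is a minimal sketch (not taken from the post itself) that runs the same pure-Python loop once in a single thread and once split across two threads. The function names and the workload size are assumptions chosen for illustration. On a regular CPython build with the GIL, the threaded run is usually no faster; builds without the GIL may behave differently.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it never waits on I/O."""
    while n > 0:
        n -= 1


def run_sequential(n):
    """Do all the work in the calling thread and return elapsed seconds."""
    start = time.perf_counter()
    count_down(n)
    return time.perf_counter() - start


def run_threaded(n, num_threads=2):
    """Split the same work across threads and return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n // num_threads,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # arbitrary workload size chosen for this sketch
    print(f"sequential: {run_sequential(N):.3f}s")
    print(f"2 threads:  {run_threaded(N):.3f}s")
```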

    read more
  2. Python behind the scenes #12: how async/await works in Python

    Mark functions as async. Call them with await. All of a sudden, your program becomes asynchronous – it can do useful things while it waits for other things, such as I/O operations, to complete.

    Code written in the async/await style looks like regular synchronous code but works very differently. To understand how it works, one should be familiar with many non-trivial concepts including concurrency, parallelism, event loops, I/O multiplexing, asynchrony, cooperative multitasking and coroutines. Python's implementation of async/await adds even more concepts to this list: generators, generator-based coroutines, native coroutines, yield and yield from. Because of this complexity, many Python programmers who use async/await do not realize how it actually works. I believe this should not be the case. The async/await pattern can be explained in a simple manner if you start from the ground up. And that's what we're going to do today.
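
    As a small illustration of "mark functions as async, call them with await", the sketch below runs two coroutines concurrently on the asyncio event loop. The names fetch and completion_order and the delays are assumptions made up for this example; asyncio.sleep stands in for a real I/O operation.

```python
import asyncio

completion_order = []  # filled in as each coroutine finishes


async def fetch(name, delay):
    """Stand-in for an I/O operation; asyncio.sleep yields to the event loop."""
    await asyncio.sleep(delay)
    completion_order.append(name)
    return name


async def main():
    # Both coroutines wait concurrently, so the total time is close to the
    # longest delay rather than the sum of the delays.
    return await asyncio.gather(fetch("slow", 0.2), fetch("fast", 0.05))


results = asyncio.run(main())
print(results)           # gather returns results in argument order
print(completion_order)  # the shorter wait finishes first
```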

    read more
  3. Python behind the scenes #11: how the Python import system works

    If you ask me to name the most misunderstood aspect of Python, I will answer without a second thought: the Python import system. Just remember how many times you used relative imports and got something like ImportError: attempted relative import with no known parent package; or tried to figure out how to structure a project so that all the imports work correctly; or hacked sys.path when you couldn't find a better solution. Every Python programmer has experienced something like this, and popular StackOverflow questions, such as Importing files from different folder (1822 votes), Relative imports in Python 3 (1064 votes) and Relative imports for the billionth time (993 votes), are a good indicator of that.

    The Python import system doesn't just seem complicated – it is complicated. So even though the documentation is really good, it doesn't give you the full picture of what's going on. The only way to get such a picture is to study what happens behind the scenes when Python executes an import statement. And that's what we're going to do today.
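
    The ImportError quoted above can be reproduced without creating any files. A statement like `from .sibling import thing` ends up as a call to the built-in `__import__` with level=1, and the import system uses the caller's globals to work out the parent package. The sketch below is an assumption-laden illustration: the module name sibling and the helper try_relative_import are hypothetical, and the globals dictionary imitates a module that has no parent package.

```python
def try_relative_import(module_globals):
    """Attempt the equivalent of `from .sibling import thing`.

    `sibling` and `thing` are hypothetical names; the import is expected
    to fail before the import system ever looks for them.
    """
    try:
        __import__("sibling", module_globals, None, ["thing"], 1)
    except ImportError as exc:
        return str(exc)
    return "imported"


# Globals of a module with an empty __package__, i.e. no parent package.
message = try_relative_import({"__name__": "__main__", "__package__": ""})
print(message)
```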

    read more
  4. Python behind the scenes #10: how Python dictionaries work

    Python dictionaries are an extremely important part of Python. Of course they are important because programmers use them a lot, but that's not the only reason. Another reason is that the interpreter uses them internally to run Python code. CPython does a dictionary lookup every time you access an object attribute or a class variable, and accessing a global or built-in variable also involves a dictionary lookup if the result is not cached. What makes a dictionary appealing is that lookups and other dictionary operations are fast and that they remain fast even as we add more and more elements to the dictionary. You probably know why this is the case: Python dictionaries are hash tables. A hash table is a fundamental data structure. The idea behind it is very simple and widely known. Yet, implementing a practical hash table is not a trivial task. There are different hash table designs that vary in complexity and performance. And new, better designs are constantly being developed.

    The goal of this post is to learn how CPython implements hash tables. But understanding all the aspects of hash table design can be hard, and CPython's implementation is especially sophisticated, so we'll approach this topic gradually. In the first part of this post, we'll design a simple fully-functional hash table, discuss its capabilities and limitations and outline a general approach to designing a hash table that works well in practice. In the second part, we'll focus on the specifics of CPython's implementation and finally see how Python dictionaries work behind the scenes.
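
    To give a feel for what "a simple fully-functional hash table" can look like, here is a minimal sketch that uses separate chaining. It is not the design from the post and not CPython's design (CPython dictionaries use open addressing); the class name and the resize threshold are assumptions picked for brevity.

```python
class SimpleHashTable:
    """A toy hash table with separate chaining (illustrative only)."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]
        self._size = 0

    def _bucket(self, key):
        # Map the key's hash to one of the buckets.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        # Grow when chains get long so lookups stay fast on average.
        if self._size > 2 * len(self._buckets):
            self._resize()

    def __getitem__(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

    def __len__(self):
        return self._size

    def _resize(self):
        # Double the number of buckets and reinsert every item.
        old_items = [item for bucket in self._buckets for item in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for k, v in old_items:
            self._bucket(k).append((k, v))


table = SimpleHashTable()
table["spam"] = 1
table["eggs"] = 2
print(table["spam"], len(table))
```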

    read more
  5. Python behind the scenes #9: how Python strings work

    In 1991 Guido van Rossum released the first version of the Python programming language. About that time the world began to witness a major change in how computer systems represent written language. The internationalization of the Internet increased the demand to support different writing systems, and the Unicode Standard was developed to meet this demand. Unicode defined a universal character set able to represent any written language, various non-alphanumeric symbols and, eventually, emoji 😀. Python wasn't designed with Unicode in mind, but it evolved towards Unicode support over the years. The major change happened when Python got built-in support for Unicode strings – the unicode type that later became the str type in Python 3. Python strings have proven to be a convenient way to work with text in the Unicode age. Today we'll see how they work behind the scenes.
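
    A quick way to see what "Unicode strings" means in Python 3 is to compare a str, which is a sequence of code points, with its encoded bytes. The sample string below is an arbitrary choice for this sketch, written with escapes so that the exact code points are unambiguous.

```python
# "naïve 😀" spelled with escapes: U+00EF and U+1F600.
s = "na\u00efve \U0001F600"

code_points = [hex(ord(ch)) for ch in s]  # one entry per character
utf8 = s.encode("utf-8")                  # bytes, variable width per character

print(len(s))       # number of code points
print(len(utf8))    # number of bytes in the UTF-8 encoding
print(code_points)
```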

    read more
  6. Python behind the scenes #8: how Python integers work

    In the previous parts of this series we studied the core of the CPython interpreter and saw how the most fundamental aspects of Python are implemented. We made an overview of the CPython VM, took a look at the CPython compiler, stepped through the CPython source code, studied how the VM executes the bytecode and learned how variables work. In the two most recent posts we focused on the Python object system. We learned what Python objects and Python types are, how they are defined and what determines their behavior. This discussion gave us a good understanding of how Python objects work in general. What we haven't discussed is how particular objects, such as strings, integers and lists, are implemented. In this and several upcoming posts we'll cover the implementations of the most important and most interesting built-in types. The subject of today's post is int.

    read more
  7. Python behind the scenes #7: how Python attributes work

    What happens when we get or set an attribute of a Python object? This question is not as simple as it may seem at first. It is true that any experienced Python programmer has a good intuitive understanding of how attributes work, and the documentation helps a lot to strengthen the understanding. Yet, when a really non-trivial question regarding attributes comes up, the intuition fails and the documentation can no longer help. To gain a deep understanding and be able to answer such questions, one has to study how attributes are implemented. That's what we're going to do today.
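
    One example of a question where intuition tends to fail is lookup order. In the sketch below (class and attribute names are made up for illustration), a data descriptor defined on the class wins over an entry in the instance's `__dict__`, while a plain class attribute does not.

```python
class ReadOnly:
    """A data descriptor: it defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return "from descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only attribute")


class C:
    x = ReadOnly()         # data descriptor on the class
    y = "class attribute"  # ordinary class attribute


c = C()
# Write to the instance dictionary directly, bypassing __set__.
c.__dict__["x"] = "from instance dict"
c.__dict__["y"] = "from instance dict"

print(c.x)  # the data descriptor takes precedence over the instance dict
print(c.y)  # the instance dict takes precedence over the class attribute
```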

    read more
  8. Python behind the scenes #6: how Python object system works

    As we know from the previous parts of this series, the execution of a Python program consists of two major steps:

    1. The CPython compiler translates Python code to bytecode.
    2. The CPython VM executes the bytecode.

    We've been focusing on the second step for quite a while. In part 4 we looked at the evaluation loop, a place where Python bytecode gets executed. And in part 5 we studied how the VM executes the instructions that are used to implement variables. What we haven't covered yet is how the VM actually computes something. We postponed this question because to answer it, we first need to understand how the most fundamental part of the language works. Today, we'll study the Python object system.
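
    The two steps listed above can be observed from Python itself with the built-ins compile() and exec(). This is only a sketch of the idea; the source string and the "<example>" filename are arbitrary.

```python
source = "result = 6 * 7"

# Step 1: the compiler turns source code into a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Step 2: the VM executes the code object in a given namespace.
namespace = {}
exec(code, namespace)

print(type(code).__name__)  # code
print(namespace["result"])  # 42
```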

    read more
  9. Python behind the scenes #5: how variables are implemented in CPython

    Consider a simple assignment statement in Python:

    a = b
    

    The meaning of this statement may seem trivial. What we do here is take the value of the name b and assign it to the name a, but do we really? This is an ambiguous explanation that gives rise to a lot of questions:

    • What does it mean for a name to be associated with a value? What is a value?
    • What does CPython do to assign a value to a name? To get the value?
    • Are all variables implemented in the same way?

    Today we'll answer these questions and understand how variables, such a crucial aspect of a programming language, are implemented in CPython.
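
    A hint that not all variables are implemented the same way comes from the dis module. The sketch below compiles `a = b` once as module-level code and once inside a function and collects the instruction names. Exact opcodes vary between CPython versions, so treat the printed lists as version-dependent; the helper opnames is a name made up for this example.

```python
import dis


def opnames(code):
    """Return the instruction names of a code object or function."""
    return [ins.opname for ins in dis.get_instructions(code)]


# The statement at module level.
module_code = compile("a = b", "<example>", "exec")


def f():
    a = b  # here a is a local variable and b is looked up as a global


module_ops = opnames(module_code)
function_ops = opnames(f)

print(module_ops)    # includes LOAD_NAME and STORE_NAME
print(function_ops)  # includes LOAD_GLOBAL; no STORE_NAME
```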

    read more
  10. Python behind the scenes #4: how Python bytecode is executed

    We started this series with an overview of the CPython VM. We learned that to run a Python program, CPython first compiles it to bytecode, and we studied how the compiler works in part two. Last time we stepped through the CPython source code starting with the main() function until we reached the evaluation loop, a place where Python bytecode gets executed. The main reason why we spent time studying these things was to prepare for the discussion that we start today. The goal of this discussion is to understand how CPython does what we tell it to do, that is, how it executes the bytecode to which the code we write compiles.

    read more
  11. Python behind the scenes #2: how the CPython compiler works

    In the first post of the series we looked at the CPython VM. We learned that it works by executing a series of instructions called bytecode. We also saw that Python bytecode is not sufficient to fully describe what a piece of code does. That's why there exists a notion of a code object. To execute a code block such as a module or a function means to execute a corresponding code object. A code object contains the block's bytecode, the constants and the names of variables used within the block and the block's various properties.

    Typically, a Python programmer doesn't write bytecode and doesn't create code objects but writes regular Python code. So CPython must be able to create a code object from source code. This job is done by the CPython compiler. In this part we'll explore how it works.
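
    The pieces of a code object mentioned above are exposed as attributes, so they can be inspected directly. A minimal sketch, with an arbitrary two-line source string:

```python
source = 'greeting = "hello"\nprint(greeting)\n'
code = compile(source, "<example>", "exec")

print(code.co_code)    # the raw bytecode, as a bytes object
print(code.co_consts)  # constants used by the block
print(code.co_names)   # names used by the block
print(code.co_filename, code.co_firstlineno)  # some of the block's properties
```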

    read more
