Lifecycle of a Python multiprocessing.Process
This post is a collection of notes on the creation and termination of a Python multiprocessing Process, investigating in particular what happens when the interpreter terminates.
Background
In the following code snippet, the main thread spawns a worker process that requires a certain amount of time to terminate.
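A minimal sketch of such a script, assuming Python 3 on Linux and the fork start method. The names worker and main and the delay parameter are illustrative choices, not taken from the original listing:

```python
import multiprocessing
import time


def worker(delay=10):
    """Simulate a slow child: sleep, then print a message."""
    time.sleep(delay)
    print("Worker Process")


def main(delay=10):
    # Assumption: the "fork" start method, which matches the clone-based
    # process creation discussed in this post (Unix only).
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(target=worker, args=(delay,))
    p.start()
    # The parent deliberately does not call p.join() here.
    print("All done")
    return p
```

Calling main() as the last statement of a script, without ever joining the returned process, gives the situation examined in the rest of these notes.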
When this code is executed, “All done” is printed immediately; after a 10-second delay, the message “Worker Process” appears on the tty. What exactly happens when the process is created, and why does the interpreter not return until the worker has finished executing?
A deeper look with strace
strace might seem like overkill in this case, since pdb is normally enough to understand what happens at the Python multiprocessing library level. However, I find the level of understanding that comes from analyzing strace output invaluable and definitely worth the effort of filtering out the irrelevant parts.
In order to trace both the parent and the child, the -f flag is required. The worker process is initially created via the clone syscall:
The absence of CLONE_VM (which would make parent and child share the same address space) and of CLONE_FS (which would make them share the fs_struct referenced by task_struct, i.e. the root directory, current working directory and umask; sharing the open files table is controlled by CLONE_FILES instead) when invoking clone clearly indicates that a new process is being created rather than a thread. SIGCHLD is also set, which causes the parent to be signaled upon the termination of the child. The child then waits 10 seconds via the select syscall and finally prints its message. What is the parent doing meanwhile?
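Before turning to the parent, the effect of the missing CLONE_VM flag can be checked from Python. The sketch below assumes a Unix system where os.fork creates the child without sharing the address space (on Linux this typically goes through clone); the function name is illustrative. The child modifies a list and reports its length through a pipe, while the parent's copy stays untouched:

```python
import os


def child_write_is_private():
    data = ["parent"]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: modify its own copy of the list and report the new length.
        try:
            os.close(read_fd)
            data.append("child")
            os.write(write_fd, str(len(data)).encode())
        finally:
            os._exit(0)  # leave immediately, skipping interpreter cleanup
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    child_len = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_len, len(data)


if __name__ == "__main__":
    print(child_write_is_private())  # child sees 2 items, parent still 1
```

With a thread, both sides would observe the same list. Here each process has its own copy.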
After printing its message on standard output, it starts a series of non-blocking waits on the child by setting the WNOHANG flag, which causes the syscall to return immediately if the child hasn’t terminated yet. Without that flag, wait4 behaves like a blocking waitpid, i.e. it returns only once the process being waited for has terminated. A SIGCHLD is then delivered when the child terminates, but there is no side effect on the parent, as the default action for this signal is to ignore it: the system call is neither interrupted with EINTR nor restarted via SA_RESTART.
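The same non-blocking pattern can be reproduced from Python with os.waitpid and os.WNOHANG. This is only a sketch of the technique, not the code multiprocessing runs; the function name, delay and polling interval are arbitrary choices:

```python
import os
import time


def poll_child(delay=0.5, interval=0.1):
    """Fork a child that sleeps, then poll it without blocking."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(0)
    unsuccessful_polls = 0
    while True:
        # With WNOHANG, waitpid returns (0, 0) while the child is still running.
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            return unsuccessful_polls, os.WEXITSTATUS(status)
        unsuccessful_polls += 1
        time.sleep(interval)


if __name__ == "__main__":
    polls, exit_code = poll_child()
    print("polled", polls, "times before the child exited with", exit_code)
```

Running this under strace -f should show wait4 calls carrying WNOHANG, followed by one that finally returns the child's pid.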
wait4 returns the pid of the process that was being waited for. It is interesting now to see how this behavior is triggered in the multiprocessing library, since it is not immediately obvious: shouldn’t the interpreter just return after the final statement?
In this case pdb does not really prove useful and, to go as deep as possible, gdb is the right tool for the job.
Tracing with gdb
First, Python debug symbols must be installed. On Debian, python-dbg contains the interpreter compiled with the -g option. On Fedora, debug symbols can be downloaded separately with the following yum command:
Note however that the package with debug symbols must match the version of the “plain” package: a mismatch will prevent gdb from loading the symbols. This mechanism is a bit different between Debian and Fedora. In fact, under Fedora the package python-debug contains an executable that does not have debug symbols, but that has been compiled with internal debug features aimed at supporting development, for example of extensions for the interpreter. Debug symbols must be installed separately.
Since I am interested in seeing what happens just before the interpreter exits, I need to obtain a backtrace at the right moment. Breaking at the exit_group invocation would not help, as by that time the stack has already been unwound and only the outermost libc frames are still present on the stack, i.e. _start, __libc_start_main and a few more. The best choice is probably to trap the wait4 syscall to understand what control path led to its invocation. It is very easy to break on a specific syscall with gdb and to check which control path led there with the bt command, which by default shows only the trace of the current process.
Backtracing
Since I am tracing the Python interpreter, I expect to see invocations of CPython internal functions. Luckily gdb is extremely smart and helps a lot in mapping what is happening in the Python interpreter to the high-level source code. Having trapped wait4 invocations, the first item I expect to see is a libc control path that leads to that syscall:
Indeed this is the case. Now, what led to that invocation?
PyEval_EvalFrameEx is the huge infinite for loop that constitutes the core of the Python interpreter. Basically it goes through the bytecode and interprets/executes all the Python machine-level instructions. This function is called with a PyFrameObject that represents the execution frame in which an instruction is being run. The PyFrameObject is created upon invoking a function, and all the instructions in that function are executed in that context. A PyFrameObject contains all the information needed to link Python machine-level instructions to the high-level source code. gdb extracts this information automatically, pointing us to the source file and the line number:
Just for the sake of curiosity, let’s try to extract this information manually.
The PyFrameObject contains the following interesting items:
- f_lineno, a member which represents the initial line of the source code associated with the PyFrameObject.
- f_code, which is a pointer to a PyCodeObject representing the bytecode being executed.
- f_code->co_filename, which is a pointer to a PyObject that represents the name of the source file from which the code object was loaded.
- f_code->co_name, which is a pointer to a PyObject representing the name of the function to which the bytecode belongs.
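These fields have Python-level counterparts that can be explored without gdb. The sketch below uses sys._getframe, a CPython-specific helper, and an illustrative function name. Note that at the Python level frame.f_lineno reports the line currently being executed, while the first line of the function is available as f_code.co_firstlineno:

```python
import sys


def describe_current_frame():
    """Return the source-mapping information held by this function's frame."""
    frame = sys._getframe()      # frame object of this very function
    code = frame.f_code          # Python view of the PyCodeObject
    current_line = frame.f_lineno
    return {
        "filename": code.co_filename,
        "name": code.co_name,
        "first_line": code.co_firstlineno,
        "current_line": current_line,
    }


if __name__ == "__main__":
    for key, value in describe_current_frame().items():
        print(key, "=", value)
```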
The inspection of these values leads to the following results, which match the actual Python source code from forking.py.
The actual source code line that corresponds to the bytecode instruction being executed is a bit trickier to obtain, but the interpreter abstracts all the complexity by providing PyFrame_GetLineNumber.
Exactly what gdb already told us. After this little digression, let’s go back to the stack trace. The poll function is a method of the Popen class in the multiprocessing library. The next invocation of PyEval_EvalFrameEx on the stack points to process.py.
poll is invoked in the function _cleanup to identify child processes that have terminated.
Further down the stack there is another pointer to process.py.
The _cleanup function is now called in active_children, which returns a list of the child processes that are still alive.
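The reaping side effect of active_children can be observed directly. The following is a sketch assuming Python 3 and the fork start method; the function name and timings are arbitrary. The child is never joined explicitly, yet its exit code becomes available once active_children stops listing it:

```python
import multiprocessing
import time


def reap_without_join(child_sleep=0.3, timeout=5.0):
    ctx = multiprocessing.get_context("fork")  # assumption: Unix with fork
    p = ctx.Process(target=time.sleep, args=(child_sleep,))
    p.start()
    # Right after start, the child is reported as alive.
    was_listed = p in multiprocessing.active_children()
    # Keep calling active_children(); each call polls the children and
    # drops the ones that have terminated.
    deadline = time.monotonic() + timeout
    while multiprocessing.active_children() and time.monotonic() < deadline:
        time.sleep(0.05)
    return was_listed, p.exitcode


if __name__ == "__main__":
    print(reap_without_join())
```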
The next frame points to util.py.
Here active_children is called in _exit_function.
The next and last frame of interest points to atexit.py.
atexit is a mechanism for registering cleanup functions that are executed upon normal interpreter termination (the Python counterpart of the libc atexit). In util.py at line 330, the module registers _exit_function as an atexit callback:
With this mechanism, the multiprocessing library ensures that the interpreter does not terminate before having waited for all its children, and therefore does not leave orphaned processes running on the system.
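The pattern can be imitated in a few lines. This is a simplified illustration and not the library's actual _exit_function, which also takes care of daemonic processes and finalizers; wait_for_children is a hypothetical name:

```python
import atexit
import multiprocessing


def wait_for_children():
    """Block until every child process that is still alive has terminated."""
    for p in multiprocessing.active_children():
        p.join()


# Run the function automatically on normal interpreter termination.
atexit.register(wait_for_children)
```

Since multiprocessing already installs its own callback, registering this one is redundant in practice; it only makes the mechanism explicit.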