This is a poorly titled post. It is meant to document some nuances I recently came across whilst dealing with Python threads. Yes, yes, I know; Python threads are a bad idea, but we had “reasons”. Anyway, let’s look at some code:
import threading
import time

class Parent():
    def __init__(self):
        self.child = ChildThread(self)
        self.child.start()

class ChildThread(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent

    def run(self):
        while True:
            print "Child thread doing work!"
            time.sleep(0.5)

def main():
    p = Parent()

if __name__ == "__main__":
    main()
    print "EXITING"
Output:
Child thread doing work!
EXITING
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
The first thing we see is that the child thread keeps running even after we have seemingly reached the end of our program: the interpreter does not exit while a non-daemon thread is still alive. The issue here is more subtle than it appears, but for the time being we will fix this quite simply by changing the loop condition in the ChildThread:
import threading
import time

class Parent():
    def __init__(self):
        self.stop_requested = False
        self.child = ChildThread(self)
        self.child.start()

    def stop(self):
        self.stop_requested = True

class ChildThread(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent

    def run(self):
        while not self.parent.stop_requested:
            print "Child thread doing work!"
            time.sleep(0.5)

def main():
    p = Parent()
    time.sleep(2)
    p.stop()

if __name__ == "__main__":
    main()
    print "EXITING"
Output:
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
EXITING
OK, now the output makes more sense. All good, but there is an enhancement we should make. I also code in other languages, and in C++, for example, plain bool variables are not thread-safe (yes, even if you declare them volatile).
So, we will replace our boolean with an Event:
import threading
import time

class Parent():
    def __init__(self):
        self.stop_requested = threading.Event()
        self.child = ChildThread(self)
        self.child.start()

    def stop(self):
        self.stop_requested.set()

class ChildThread(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent

    def run(self):
        while not self.parent.stop_requested.is_set():
            print "Child thread doing work!"
            time.sleep(0.5)

def main():
    p = Parent()
    time.sleep(2)
    p.stop()

if __name__ == "__main__":
    main()
    print "EXITING"
Output:
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
EXITING
As you’d expect, the output remains the same. But this is better: an Event is a documented, thread-safe synchronization primitive, and a thread can block on it with wait() instead of polling, which lets the waiting thread sleep until the event is set. Admittedly, we’ve only used the Event object here as a glorified boolean flag, but it is capable of a little more than that.
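To give an idea of the "little more", here is a minimal sketch in which the worker waits on the Event itself instead of calling time.sleep(). It is not the code from this post: the Worker class, the iterations counter and demo() are names made up for illustration, and print statements are left out so that it runs on both Python 2.6+ and Python 3.

```python
import threading
import time

class Worker(threading.Thread):
    """Hypothetical worker that sleeps on the Event rather than on time.sleep()."""
    def __init__(self, stop_requested):
        threading.Thread.__init__(self)
        self.stop_requested = stop_requested
        self.iterations = 0

    def run(self):
        while not self.stop_requested.is_set():
            self.iterations += 1            # stand-in for the real work
            # Sleeps for up to 0.5 s, but wakes up as soon as set() is called.
            self.stop_requested.wait(0.5)

def demo():
    stop_requested = threading.Event()
    worker = Worker(stop_requested)
    worker.start()
    time.sleep(0.05)                        # let the worker enter its wait
    started = time.time()
    stop_requested.set()                    # wakes the worker immediately
    worker.join(2)
    return worker, time.time() - started

if __name__ == "__main__":
    worker, elapsed = demo()
    print("worker stopped %.3f seconds after set()" % elapsed)
```

With time.sleep(0.5) the worker could take up to half a second to notice the stop request; waiting on the Event lets it react straight away.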
Thread death upon scope exit
Here’s a thought: why is the stop() call necessary in the first place? Ideally, I would like the child thread to die automatically when the Parent goes out of scope. There are two ways to do this: the easy-but-wrong way and the right-but-slightly-harder way.
Let’s see what the easy-but-wrong way is:
self.child.daemon = True  # in Parent.__init__, before self.child.start()
Output:
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
EXITING
All we do is set the child thread as a daemon thread. The docs state: “The entire Python program exits when no alive non-daemon threads are left.” In other words, by marking our ChildThread as daemon we tell the interpreter not to wait for it at exit; the thread is simply abandoned, mid-work, once the main thread finishes.
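One detail that is easy to trip over: the flag has to be set before the thread is started. The following is a small sketch using a throwaway thread rather than the ChildThread above; work() and flag_change_rejected are hypothetical names used only for this illustration.

```python
import threading
import time

def work():
    # Short-lived stand-in for the real worker loop.
    time.sleep(0.2)

t = threading.Thread(target=work)
t.daemon = True          # fine: the thread has not been started yet
t.start()

flag_change_rejected = False
try:
    t.daemon = False     # too late: the thread has already been started
except RuntimeError:
    flag_change_rejected = True

t.join()
print("daemon=%s, late change rejected=%s" % (t.daemon, flag_change_rejected))
```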
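The reason why this isn’t the Right Solution(TM) is simply that our ChildThread is NOT supposed to be a daemon (at least for this use-case): a daemon thread is stopped abruptly at shutdown, possibly while the interpreter is already tearing down the modules it is using. In fact, while monitoring the output of ps -aux, I often see the following: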
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
EXITING
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
  File "./example4.py", line 29, in run
: 'NoneType' object has no attribute 'write'
OK, so what’s the right way? Well, of course, we should implement Parent’s __del__ method, right?
import threading
import time

class Parent():
    def __init__(self):
        self.stop_requested = threading.Event()
        self.child = ChildThread(self)
        self.child.start()

    def stop(self):
        self.stop_requested.set()

    def __del__(self):
        print "Inside __del__"
        self.stop()

class ChildThread(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent

    def run(self):
        while not self.parent.stop_requested.is_set():
            print "Child thread doing work!"
            time.sleep(0.5)

def main():
    p = Parent()
    time.sleep(2)
    #p.stop()

if __name__ == "__main__":
    main()
    print "EXITING"
Output:
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
EXITING
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
…
What’s going on here? Why wasn’t the __del__ method called? First of all, the __del__ method is NOT a destructor in the C++ sense: it runs when the object is about to be garbage collected, not when a name referring to it goes out of scope.
Now look carefully at the Parent and ChildThread classes. You should notice a classic case of circular referencing: Parent and ChildThread hold references to each other. What we really want is to tell the child thread to die when the Parent is no longer alive, but the Parent will never die (specifically, never get garbage collected) because the running ChildThread always holds a reference to it.
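The effect can be reproduced without any threads at all. In the sketch below, Child is a plain object standing in for ChildThread, and the running list is a made-up stand-in for the interpreter's own bookkeeping, which keeps every running thread object reachable. A weakref.ref() is used purely as a probe, to check whether the Parent still exists without keeping it alive.

```python
import gc
import weakref

class Child(object):
    def __init__(self, parent):
        self.parent = parent      # strong back-reference, as in ChildThread

class Parent(object):
    def __init__(self):
        self.child = Child(self)

# Hypothetical stand-in for the list of running threads kept by the interpreter.
running = []

p = Parent()
running.append(p.child)
probe = weakref.ref(p)            # observes the Parent without keeping it alive

del p                             # "the Parent goes out of scope"
gc.collect()
alive_while_child_runs = probe() is not None    # still reachable via the child

del running[:]                    # the "thread" finishes
gc.collect()                      # the Parent/Child cycle is now collectable
alive_after_child_ends = probe() is not None

print("alive while child runs: %s" % alive_while_child_runs)
print("alive after child ends: %s" % alive_after_child_ends)
```

As long as something keeps the child reachable, the Parent is reachable too, so no amount of garbage collection will ever trigger its __del__.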
The solution? The weakref module.
import threading
import time
import weakref

class Parent():
    def __init__(self):
        self.stop_requested = threading.Event()
        self.child = ChildThread(weakref.proxy(self))
        self.child.start()

    def stop(self):
        self.stop_requested.set()

    def __del__(self):
        print "Inside __del__"
        self.stop()

class ChildThread(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent

    def run(self):
        try:
            while self.parent and not self.parent.stop_requested.is_set():
                print "Child thread doing work!"
                time.sleep(0.5)
        except (ReferenceError):
            print "Parent is dead. ChildThread will now also die"

def main():
    p = Parent()
    time.sleep(2)
    #p.stop()

if __name__ == "__main__":
    main()
    print "EXITING"
Output:
Child thread doing work!
Child thread doing work!
Child thread doing work!
Child thread doing work!
Inside __del__
EXITING
Parent is dead. ChildThread will now also die
A weak-ref is best explained by the docs themselves:
“A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else.”
We only pass a weak reference to our ChildThread – this allows the Parent to die. Subsequently, the ChildThread now has the option to also die if its parent is dead.
NOTE: In this example, I am using a weakref.proxy() instead of weakref.ref() – the former just makes the code slightly easier to read.
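For comparison, this is roughly what the weakref.ref() flavour might look like. It is a sketch rather than the code from this post: parent_ref, exit_reason and demo() are names made up for illustration, __del__ is left out so that only the weak reference is at work, and the polling interval is shortened to keep the demo quick.

```python
import gc
import threading
import weakref

class Parent(object):
    def __init__(self):
        self.stop_requested = threading.Event()
        self.child = ChildThread(weakref.ref(self))   # ref() instead of proxy()
        self.child.start()

    def stop(self):
        self.stop_requested.set()

class ChildThread(threading.Thread):
    def __init__(self, parent_ref):
        threading.Thread.__init__(self)
        self.parent_ref = parent_ref    # hypothetical name
        self.exit_reason = None         # hypothetical, only here for inspection

    def run(self):
        while True:
            parent = self.parent_ref()  # strong reference, or None if collected
            if parent is None:
                self.exit_reason = "parent collected"
                return
            stop_requested = parent.stop_requested
            del parent                  # do not keep the Parent alive while idle
            if stop_requested.is_set():
                self.exit_reason = "stop requested"
                return
            stop_requested.wait(0.05)

def demo():
    p = Parent()
    child = p.child
    del p           # drop the last strong reference to the Parent
    gc.collect()    # not needed on CPython, harmless elsewhere
    child.join(2)
    return child

if __name__ == "__main__":
    print(demo().exit_reason)
```

With ref() you have to call the reference and check for None yourself, and you have to remember to drop the temporary strong reference before going idle; proxy() hides the first part behind a ReferenceError.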
Also notice that this time the __del__ method was called. With this setup, the ChildThread will die when the client calls stop() on the Parent, or even if the Parent goes out of scope. Cool.
BTW, in spite of the usefulness of this module, it is worth mentioning that needing it here is a result of working around a minor nuance of the language implementation. Ideally, one shouldn’t need to modify code based on the underlying algorithm used for garbage collection. Nonetheless, that is wishful thinking… besides, I am sure there are compelling reasons for things to be the way they are.
A Final note – Semantic Portability:
I like Python, and I like my code to be as Pythonic as possible. Nonetheless, I also write a lot of code in other languages, so I prefer to invest in concepts rather than fully devote myself to a dogma. Thus, the term “semantic portability”.
Semantic portability, or semantically portable code, refers to code that embraces well-understood concepts instead of unique language/library features, making it easier for the reader to understand it and to implement it in another language.
Where am I going with this? Well, Event() is a great abstraction; however, it doesn’t have consistent support across various threading libraries. For example, the popular C++ Boost.Threads doesn’t support them.
Additionally, the Event abstraction is made redundant by condition variables, so we should rewrite that part of our code using them. You will lose some of the cleanliness, but that’s the price to pay for correctness and semantic portability.
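One way such a rewrite could look is sketched below. Treat it as an assumption about the shape of the code, not as a definitive version: StopFlag, ConditionWorker, their method names and demo() are all made up for illustration, and the worker is a standalone thread rather than the Parent/ChildThread pair used above.

```python
import threading
import time

class StopFlag(object):
    """Hypothetical helper: a plain boolean guarded by a condition variable."""
    def __init__(self):
        self._cond = threading.Condition()
        self._stopped = False

    def request_stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify_all()     # wake up every waiting thread

    def wait_for_stop(self, timeout):
        """Block for up to `timeout` seconds; return True if stop was requested."""
        with self._cond:
            if not self._stopped:
                self._cond.wait(timeout)
            # Always re-check the flag; a wakeup alone proves nothing.
            return self._stopped

class ConditionWorker(threading.Thread):
    def __init__(self, flag):
        threading.Thread.__init__(self)
        self.flag = flag
        self.iterations = 0

    def run(self):
        while not self.flag.wait_for_stop(0.05):
            self.iterations += 1        # stand-in for the real work

def demo():
    flag = StopFlag()
    worker = ConditionWorker(flag)
    worker.start()
    time.sleep(0.3)
    flag.request_stop()
    worker.join(2)
    return worker

if __name__ == "__main__":
    print("iterations before stop: %d" % demo().iterations)
```

The flag, the lock and the wake-up are now three visible pieces instead of one Event, which is the loss of cleanliness mentioned above; in exchange, the same structure maps directly onto a mutex plus condition variable in most other threading libraries.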