Fixing Memory Leaks In Popular Python Libraries
I was recently able to make a minimal example that reproduced a Celery memory leak. The memory leak would happen on the main Celery worker process that’s forked to make child processes, which makes the leak especially bad. Issue #4843 has been around for 3+ years and has 140+ comments, so this one has been causing a lot of problems for Celery users for a while.
The memory leak has been causing a lot of issues at my work too, and I was able to get some help resolving the issue during a work hackathon. My coworker Michael Lazar was able to find the root cause of the issue and make a pull request to fix it in py-amqp (a Celery dependency when using RabbitMQ as a broker). The code with the issue was 10 years old!
Here’s what the bug looks like:
    try:
        sock.shutdown(socket.SHUT_RDWR)
        sock.close()
    except OSError:
        pass
The problem occurs when socket.shutdown raises an OSError: execution jumps straight to the except block, so socket.close is never called to clean up the socket and allow garbage collection to release the memory used for it. The shutdown failure can occur when the remote side of the connection closes the connection first.
The fixed example (with separate try/except blocks):

    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass
    try:
        sock.close()
    except OSError:
        pass
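To see the difference between the two versions without a real broker, here is a minimal sketch. It is not code from py-amqp: PeerClosedSocket, close_buggy, close_fixed and is_open are hypothetical names made up for this illustration, and the subclass simply forces shutdown to raise an OSError to simulate the remote side closing first.

```python
import errno
import socket


class PeerClosedSocket(socket.socket):
    """Hypothetical stand-in: shutdown() always fails, as if the peer closed first."""

    def shutdown(self, how):
        raise OSError(errno.ENOTCONN, "simulated: peer already closed the connection")


def close_buggy(sock):
    # Single try block: an OSError from shutdown() skips close() entirely.
    try:
        sock.shutdown(socket.SHUT_RDWR)
        sock.close()
    except OSError:
        pass


def close_fixed(sock):
    # Separate try blocks: close() runs even when shutdown() fails.
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass
    try:
        sock.close()
    except OSError:
        pass


def is_open(sock):
    # A closed socket object reports a file descriptor of -1.
    return sock.fileno() != -1


leaked = PeerClosedSocket(socket.AF_INET, socket.SOCK_STREAM)
close_buggy(leaked)
buggy_still_open = is_open(leaked)

cleaned = PeerClosedSocket(socket.AF_INET, socket.SOCK_STREAM)
close_fixed(cleaned)
fixed_still_open = is_open(cleaned)

print("buggy version left the socket open:", buggy_still_open)
print("fixed version left the socket open:", fixed_still_open)

# Release the socket the buggy version left behind.
leaked.close()
```

With the single try block the socket is still open after the cleanup call, while the version with separate try blocks always closes it.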
I was able to make the same fix to a few other popular Python libraries too:
Update 12/23: I got a pull request merged into Kombu with the same memory usage reduction fix I made to py-amqp and librabbitmq. I also opened another pull request to Kombu that should fix a memory leak issue when using Celery with Redis.