In New Relic transaction logs, I was seeing some PHP requests last as long as an hour. It turns out that time spent waiting on sockets doesn't count toward max_execution_time, so loops that wait on sockets can run for a very long time without ever timing out.
PHP docs have this to say about max_execution_time: "The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real."
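The fix was to set request_terminate_timeout in the php-fpm pool configuration. Unlike max_execution_time, it kills the worker once a single request has run past the limit, no matter what the script is waiting on.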
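As a rough sketch, the setting goes in the pool configuration file. The 60s value is only an illustrative assumption, so pick a limit that fits your slowest legitimate request. The default is 0, which turns the timeout off.

```ini
; php-fpm pool configuration
; Kill a worker once it has spent more than 60 seconds serving one request.
; 60s is an assumed example value; units s, m, h and d are accepted.
request_terminate_timeout = 60s
```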
If your application is threaded and you're getting a "KeyError" while using the non-decorator version of cachetools' LRUCache, then you need to put whatever is manipulating the cache object inside of a lock. LRUCache also updates its internal ordering whenever a value is read from it, so reads need to hold the same lock as writes. If you can use the decorator version instead, that's preferred, since the decorators can take care of the locking for you: cachetools.func.lru_cache locks internally, and cachetools.cached accepts a lock argument.
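One way to keep the locking in a single place is a small wrapper around the cache object. This is a minimal sketch, not part of cachetools: LockedCache and its method names are hypothetical, and it assumes the wrapped cache only needs dict-style access, which LRUCache provides.

```python
import threading


class LockedCache:
    """Serialize every access to a wrapped cache behind one lock.

    Hypothetical helper, not part of cachetools. The wrapped object only
    needs dict-style item access.
    """

    def __init__(self, cache):
        self._cache = cache
        self._lock = threading.RLock()

    def get(self, key, default=None):
        # Reads take the lock too: an LRU cache updates its ordering on
        # every hit, so a lookup also mutates the cache.
        with self._lock:
            try:
                return self._cache[key]
            except KeyError:
                return default

    def set(self, key, value):
        with self._lock:
            self._cache[key] = value

    def get_or_compute(self, key, compute):
        # Holding the lock across check-and-fill keeps two threads from
        # racing on the same key. The cost is that compute() runs while
        # the lock is held.
        with self._lock:
            try:
                return self._cache[key]
            except KeyError:
                value = compute()
                self._cache[key] = value
                return value
```

With cachetools installed, you would build it as LockedCache(LRUCache(maxsize=128)) and then touch the cache only through the wrapper. Any code that reaches the underlying LRUCache directly bypasses the lock.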