Monday, December 11, 2017

MySQL - Duplicate Errors & Trailing Whitespace

I had a unique constraint on a VARCHAR column and I inserted two rows with the following values:

  1. "name" (without trailing whitespace)
  2. "name " (with trailing whitespace)

To my surprise, I got a duplicate error on that 2nd insert. It turns out that MySQL ignores that trailing whitespace when it makes comparisons.

The MySQL docs say this: "All MySQL collations are of type PAD SPACE. This means that all CHAR, VARCHAR, and TEXT values are compared without regard to any trailing spaces. “Comparison” in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant." (https://dev.mysql.com/doc/refman/5.7/en/char.html)

The solution? You should probably be trimming trailing whitespace in your API endpoints and on your front-end.

Thursday, August 10, 2017

Gevent + Requests Performance With verify=True/False

If you use gevent with requests.get on a HTTPS URL with the default verify=True enabled, you'll see almost 2x longer execution times than with verify=False.


Here are the results:
verify=True took: 40.3454630375 secs
verify=False took: 39.3803040981 secs
gevent verify=True took: 2.23735189438 secs
gevent verify=False took: 1.58263015747 secs
I suspect that gevent is having trouble using pyopenssl concurrently because it's a C library.

Tuesday, June 27, 2017

Backing up or dumping a memcached server

I was needing to move from an old cache server to a larger one, but I wanted to do it without flushing cache.

The first thing I came across was this "memcached-tool" which has a dump command: https://github.com/memcached/memcached/blob/master/scripts/memcached-tool

There's another article that mentions using memdump and memcat: http://www.dctrwatson.com/2010/12/how-to-dump-memcache-keyvalue-pairs-fast/

Unfortunately, those methods only dumped a few mb of data. This post explains why: https://stackoverflow.com/a/13941700

You can only dump one page per slab class (1MB of data)
So, I ended up writing a script that loops through the expected cache keys, gets the data in cache, then sets the data in the new cache server.

Monday, June 12, 2017

Gunicorn - "Resource temporarily unavailable"

Are you seeing this error in your logs while your server is under high load?:
[error] 10#0: *14843 connect() to unix:/tmp/gunicorn.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 96.44.145.186, server: , request: "GET / HTTP/1.0", upstream: "http://unix:/tmp/gunicorn.sock:/", host: "45.55.46.84"
I ended up making an example dockerfile with nginx + gunicorn + flask to reproduce this problem: https://github.com/pawl/somaxconn_test

Bumping the "net.core.somaxconn" setting ended up fixing it.

Friday, March 17, 2017

SQLAlchemy - empty lists + in_() causing crazy queries (before 1.2.0)

Before SQLAlchemy 1.2.0, if you use an empty list with in_(), it will emit some crazy SQL that will query your entire table. The best solution is probably to upgrade to SQLAlchemy 1.2.0.


Monday, March 13, 2017

The Robustness Principle

I learned about this at a talk called "Implementing Evolvable APIs" at SXSW: https://en.wikipedia.org/wiki/Robustness_principle

For example, making an API that throws errors when an unexpected parameter is provided is a bad idea. What if you need to make changes to the client to add the new parameter? You will need to make sure you deploy the code on the server side first, otherwise it will cause errors.