Making celery and pymongo play nicely

Published Dec. 13, 2010 by Vitaly Babiy

UPDATE: If you are still seeing this problem, try the git master branch (pymongo-bug). It is very easy to install with pip:

  pip install -e git+https://github.com/mongodb/mongo-python-driver.git#egg=pymongo

Recently I was moving some of the http://www.howsthe.com data over to MongoDB, out of MySQL. Once I got the data into MongoDB I thought the world would be full of rainbows, but I came across a weird exception that didn't make sense to me:

Traceback (most recent call last):
…
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 332, in __getitem__
    for doc in clone:
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 601, in next
    if len(self.__data) or self._refresh():
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 564, in _refresh
    self.__query_spec(), self.__fields))
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 521, in __send_message
    **kwargs)
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 700, in _send_message_with_response
    return self.__send_and_receive(message, sock)
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 681, in __send_and_receive
    return self.__receive_message_on_socket(1, request_id, sock)
  File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 671, in __receive_message_on_socket
    struct.unpack("<i", header[8:12])[0])
AssertionError: ids don't match -2015873347 1667330676

After some searching I had a better understanding of the error and its cause. I was able to verify that the error was generated by multiple processes trying to write to the same connection. A simple way to avoid this is to run celery with a single process, but that kind of defeats the purpose of celery.
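To make the failure mode concrete, here is a toy model of what I believe is going on. It does not use pymongo at all: ToyConnection and check_reply are made-up names for illustration, and the only assumption is that every reply carries the id of the request it answers, which is what the assertion in the traceback compares.

```python
from collections import deque


class ToyConnection:
    """Hypothetical stand-in for one shared socket (not pymongo code).

    Replies come back in the order the requests were sent, and each
    reply carries the id of the request it answers.
    """

    def __init__(self):
        self._replies = deque()

    def send(self, request_id):
        self._replies.append(request_id)

    def receive(self):
        return self._replies.popleft()


def check_reply(request_id, reply_id):
    # Mirrors the spirit of the driver's check, not its actual code.
    if request_id != reply_id:
        raise AssertionError("ids don't match %r %r" % (request_id, reply_id))


conn = ToyConnection()

# Two workers share the connection and both send before either reads.
conn.send(1)  # worker A
conn.send(2)  # worker B

# Worker B reads first and picks up the reply meant for worker A.
reply_for_b = conn.receive()

try:
    check_reply(2, reply_for_b)
    error = None
except AssertionError as exc:
    error = str(exc)

print(error)
```

With one connection per worker the replies can never cross like this, which is why giving the workers enough connections to go around makes the error disappear.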

Luckily I found a very quick and easy fix, thanks to the pymongo driver's support for connection pooling. The Connection class takes pool_size as an argument and takes care of the rest. I found that when I set this to about the same as, or a few more than, the process count in celery, everything works like a charm, and now I am in my world of rainbows. This is what my connection looks like:

import pymongo

# pool_size: about the same as, or a few more than, the celery process count
mongodb = pymongo.connection.Connection(pool_size=10).my_database
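If you would rather not hard-code the number, the rule of thumb above can be written down as a small helper. This is only a sketch: suggested_pool_size and its headroom parameter are names I made up, and falling back to the CPU count assumes celery is running with its default concurrency, which as far as I know is one process per CPU.

```python
import multiprocessing


def suggested_pool_size(worker_processes=None, headroom=2):
    """Return the worker process count plus a little headroom.

    Hypothetical helper, not part of pymongo or celery.
    """
    if worker_processes is None:
        # Assumption: celery was started without an explicit concurrency
        # setting, so it runs one worker process per CPU.
        worker_processes = multiprocessing.cpu_count()
    return worker_processes + headroom


# Example: 8 celery processes plus 2 spare connections.
pool_size = suggested_pool_size(8)
print(pool_size)

# Usage with the connection from this post (needs pymongo installed):
# mongodb = pymongo.connection.Connection(pool_size=pool_size).my_database
```

The headroom of 2 is just my reading of "a few more"; tune it for your own setup.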

Enjoy using pymongo with celery.

Tags
  • celery
  • pymongo
  • Python

Written By Vitaly Babiy

Vitaly Babiy is the creator of Howsthe.com (yes, you can contact him about the service). He is a software engineer at heart and loves working with great technologies like Django and jQuery. Vitaly spends most of his days in Python and loves it. Another passion of Vitaly's is learning the business side of things, which is one of the reasons he started the Howsthe.com monitoring service. You can follow him on Twitter.
