UPDATE: If you are still seeing this problem, try the git master branch (pymongo-bug). It's very easy to install with pip:
pip install -e git+https://github.com/mongodb/mongo-python-driver.git#egg=pymongo
Recently I was moving some of the http://www.howsthe.com data out of MySQL and over to MongoDB. Once I got the data into MongoDB I thought the world would be full of rainbows, but I came across a weird exception that didn't make sense to me:
Traceback (most recent call last):
…
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 332, in __getitem__
for doc in clone:
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 601, in next
if len(self.__data) or self._refresh():
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 564, in _refresh
self.__query_spec(), self.__fields))
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/cursor.py", line 521, in __send_message
**kwargs)
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 700, in _send_message_with_response
return self.__send_and_receive(message, sock)
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 681, in __send_and_receive
return self.__receive_message_on_socket(1, request_id, sock)
File "/home/vbabiy/.virtualenvs/howsthedotcom/lib/python2.6/site-packages/pymongo/connection.py", line 671, in __receive_message_on_socket
struct.unpack("<i", header[8:12])[0])
AssertionError: ids don't match -2015873347 1667330676
After some searching I got a better understanding of the error and its cause. I was able to verify that the error was generated by multiple processes trying to write to the same connection. A simple way to avoid this is to run celery with a single process, but that kind of defeats the purpose of celery.
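To see what the assertion at the bottom of the traceback is checking, here is a minimal sketch using only the standard library. It assumes the standard MongoDB wire protocol header layout (four little-endian int32 fields: messageLength, requestID, responseTo, opCode). The helper names build_header and check_reply are made up for illustration and are not part of pymongo.

```python
import struct


def build_header(message_length, request_id, response_to, op_code):
    """Pack a 16-byte wire protocol header (four little-endian int32s)."""
    return struct.pack("<iiii", message_length, request_id, response_to, op_code)


def check_reply(header, request_id):
    """Mimic the driver's check: the reply must answer our own request."""
    # Bytes 8:12 of the header hold the responseTo field.
    response_to = struct.unpack("<i", header[8:12])[0]
    assert request_id == response_to, \
        "ids don't match %r %r" % (request_id, response_to)
    return response_to


# Hypothetical scenario: process A sent request 101 and process B sent
# request 202 over the same socket. The server's reply to B arrives first.
reply_for_b = build_header(16, 9001, 202, 1)  # opCode 1 is OP_REPLY

check_reply(reply_for_b, 202)  # fine: B reads its own reply

try:
    check_reply(reply_for_b, 101)  # A reads B's reply off the shared socket
except AssertionError as exc:
    error = str(exc)
    print(error)
```

When two processes read from one socket, neither can be sure the next bytes belong to its own request, so this check fails sooner or later.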
Luckily I found a very quick and easy fix, thanks to the pymongo driver's support for connection pooling. The Connection class takes pool_size as an argument and takes care of the rest. I found that when I set this to about the same as, or a few more than, the celery process count, everything works like a charm, and now I am in my world of rainbows. This is what my connection looks like:
import pymongo
# pool_size is set to roughly the number of celery worker processes
mongodb = pymongo.connection.Connection(pool_size=10).my_database
Enjoy using pymongo with celery.

