On Wed Feb 19 09:14:02 2014, damien.chaumette@gmail.com wrote:
> They are definitely happening at the GET stage, not at the STORE.
> On the data_long C::M::F run it'll go on for ever, I let it go for a few
> minutes and it didn't stop.
> To be fair if it is trying to communicate with the client it's much too
> late at this point as we have exited a long time ago.
The C::M::F client sends a get request, reads the reply, and sends nothing else after that. If I understood you correctly, the memcached server is trying to read something from the client, and I don't understand why.
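For reference, here is a minimal sketch in Python of what a complete reply to a single ascii-protocol get looks like from the client side. It is not C::M::F's code; the function name parse_get_reply is made up for illustration, and it assumes the whole reply is already in one buffer.

```python
def parse_get_reply(buf: bytes):
    """Parse a complete ascii-protocol reply to a single `get <key>`.

    Returns (key, flags, data) on a hit and None on a miss.
    Raises ValueError if the reply is truncated or malformed.
    Hypothetical helper for illustration, not part of any client library.
    """
    # A miss is just the terminator line.
    if buf == b"END\r\n":
        return None

    # A hit starts with "VALUE <key> <flags> <bytes>\r\n".
    header, sep, rest = buf.partition(b"\r\n")
    parts = header.split()
    if not sep or len(parts) != 4 or parts[0] != b"VALUE":
        raise ValueError("bad header: %r" % header[:40])

    key = parts[1].decode()
    flags = int(parts[2])
    size = int(parts[3])

    # Exactly <bytes> bytes of data follow, then "\r\n" and the END line.
    data, tail = rest[:size], rest[size:]
    if len(data) != size or tail != b"\r\nEND\r\n":
        raise ValueError("truncated or malformed reply")
    return key, flags, data
```

Once the END line has been read, the client has nothing more to send, which is why a server waiting on a read from the client looks odd to me.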
> I've done as requested (first two gets are C::M::F), behaviour is the same
> as per below:
What I see from the trace:
> <608 new auto-negotiating client connection
> 608: Client using the ascii protocol
> <608 set data_short 0 0 1024
> >608 STORED
> <608 set data_long 0 0 10240000
> >608 STORED
C::M::F stored two keys in memcached.
> <616 new auto-negotiating client connection
> 616: Client using the ascii protocol
> <616 get data_long
> >616 sending key data_long
> >616 END
> <616 get data_short
> >616 sending key data_short
> >616 END
The other client fetched both keys.
> <608 get data_long
> >608 sending key data_long
> >608 END
C::M::F requested data_long, got some error at this point, closed the connection, and requested data_short over another connection:
> <620 new auto-negotiating client connection
> 620: Client using the ascii protocol
> <620 get data_short
> >620 sending key data_short
> >620 END
> <616 connection closed.
> <620 connection closed.
Given that you disabled io_timeout, I can only guess what the error was, but C::M::F definitely got one; otherwise it wouldn't have opened another connection. So the question is why C::M::F thinks it got an error, and whether that perception is valid.
> There doesn't seem to be official Windows builds for Memcached any more,
> and it's difficult to justify spending too much time in trying to build
> from source when there doesn't seem to be a known issue with this revision
> and other clients aren't having the same problem.
There's a tiny possibility that C::M::F triggers some race in the memcached server that other clients don't, and race fixes land in almost every memcached release (without a detailed description of what they might affect). But I saw your last reply: the problem is reproducible with the latest memcached.
> I understand if you can't devote more time to find this Windows specific
> problem, but could you perhaps point me to the places in the C code I
> should look first should I want to troubleshoot myself?
First of all, an easier route would be to use another module such as Cache::Memcached::libmemcached (provided it builds and works on Windows); it should be comparably fast. Debugging the C part of a Perl module is a pain even on Linux, and simply reading the code won't reveal much, I think, because the problem is somehow related to your setup (I built memcached 1.4.5 here but couldn't reproduce it). Value reading happens in src/client.c:read_value(). But if you really want to devote time to it, the first things I would try are capturing the network traffic to see the actual packet contents (data_long is big, but protocol commands will always be at the start of a packet), and tracing system calls (on Linux the strace utility shows system calls and their results; I don't know what the Windows equivalent is).
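Another cheap check, short of a packet capture, is to fetch the key over a raw socket and see whether the full reply arrives at all. Below is a rough sketch in Python, assuming an ascii-protocol server; the function name raw_get, the chunk size, and the host and port in the usage note are my own choices for illustration, not anything taken from C::M::F.

```python
import socket


def raw_get(sock, key, chunk=65536):
    """Send `get <key>` and read until the whole reply has arrived.

    Returns (total_bytes, recv_sizes), where recv_sizes lists how many
    bytes each recv() call returned. Hypothetical helper for illustration.
    """
    sock.sendall(b"get " + key.encode() + b"\r\n")

    buf = bytearray()
    recv_sizes = []
    expected = None  # total reply length, known once the header is parsed

    while expected is None or len(buf) < expected:
        data = sock.recv(chunk)
        if not data:
            raise ConnectionError(
                "connection closed after %d bytes" % len(buf))
        recv_sizes.append(len(data))
        buf += data

        if expected is None and b"\r\n" in buf:
            header = bytes(buf.split(b"\r\n", 1)[0])
            if header == b"END":
                expected = 5  # miss: just "END\r\n"
            else:
                parts = header.split()
                if len(parts) != 4 or parts[0] != b"VALUE":
                    raise ValueError("unexpected reply: %r" % header[:40])
                # header line + data + "\r\n" + "END\r\n"
                expected = len(header) + 2 + int(parts[3]) + 2 + 5

    return len(buf), recv_sizes
```

Usage would be something like creating the socket with socket.create_connection(("127.0.0.1", 11211)), setting a timeout with settimeout(), and calling raw_get(sock, "data_long"). If that stalls or the connection closes partway through, the byte count at that point shows how far the transfer got, which narrows down whether the problem is in the client module or below it.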