
single get() failover doesn't work! #154


Closed
stangl opened this issue Oct 3, 2014 · 5 comments

@stangl

stangl commented Oct 3, 2014

php-memcached: 2.2.0
libmemcached: 1.0.16

$a = new Memcached();
$a->setOption(Memcached::OPT_BINARY_PROTOCOL,true);
$a->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, true);
$a->setOption(Memcached::OPT_DISTRIBUTION,Memcached::DISTRIBUTION_CONSISTENT);
$a->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true);
$a->addServer('working_host', 11211);
$a->addServer('non_working_host', 11211);

$a->set('key_on_non_working_host', 1); // = TRUE -> WORKS! value stored on working_host

// Setup new connection
$b = new Memcached();
$b->setOption(Memcached::OPT_BINARY_PROTOCOL,true);
$b->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, true);
$b->setOption(Memcached::OPT_DISTRIBUTION,Memcached::DISTRIBUTION_CONSISTENT);
$b->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true);
$b->addServer('working_host', 11211);
$b->addServer('non_working_host', 11211);

$b->get('key_on_non_working_host'); // = FALSE -> DOESN'T WORK!

// we have to do this AFTER the first failed get():

$b->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true);
$b->get('key_on_non_working_host'); // = TRUE -> SECOND CALL NOW WORKS!

This failure doesn't happen if the initial set() and the get() are called on the same connection. In that case the client seems to "remember" from the set() that non_working_host isn't available.

@mkoppanen
Member

I just tested with libmemcached 1.0.18 and I cannot reproduce this issue

@stangl
Author

stangl commented Oct 8, 2014

Nope. I just updated to 1.0.18 and the problem still exists.

Please be sure you try a key that maps to the non-working host.

This is a full working example. If you change the hosts, you have to check that your key still maps to non_working_host:

$a = new Memcached();
$a->setOption(Memcached::OPT_BINARY_PROTOCOL,true);
$a->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, true);
$a->setOption(Memcached::OPT_DISTRIBUTION,Memcached::DISTRIBUTION_CONSISTENT);
$a->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true);

$a->addServer('localhost', 11211);
$a->addServer('non_working_host', 11222);

print "be sure you use a key for non_working_host: ".$a->getServerByKey ('key_on_non_working_host')['port']." : should be 11222 \n\n";

$t1 = $a->set('key_on_non_working_host', 1);

var_dump($t1); // = TRUE -> WORKS! value stored on working_host (=localhost)
print $a->getResultCode()."\n\n"; // 0

// Setup new connection
$b = new Memcached();
$b->setOption(Memcached::OPT_BINARY_PROTOCOL,true);
$b->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, true);
$b->setOption(Memcached::OPT_DISTRIBUTION,Memcached::DISTRIBUTION_CONSISTENT);
$b->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true);

$b->addServer('localhost', 11211);
$b->addServer('non_working_host', 11222);

$t1 = $b->get('key_on_non_working_host');
var_dump($t1); // = FALSE -> AUTO FAILOVER DOESN'T WORK!
print $b->getResultCode()."\n"; // 2

// we have to do this AFTER the first failed get():

$b->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE,true); // THIS is needed! Without it, the following get() returns FALSE with error 35!
$t1 = $b->get('key_on_non_working_host');
var_dump($t1); // = TRUE -> SECOND CALL NOW WORKS!
print $b->getResultCode()."\n\n\n\n"; // 0

Can you confirm that we should get a result from the first get() call, without the extra setOption(OPT_LIBKETAMA_COMPATIBLE) + get()?

@shmuel-krakower

Hi @mkoppanen @stangl, I think I'm having the same issue.
The thing is that even with a second get() after re-setting OPT_LIBKETAMA_COMPATIBLE via setOption(), I'm still getting error 35 (MEMCACHED_SERVER_MARKED_DEAD).

How did you work around that?
For me it seems that no matter what, when using 2 memcached nodes, the client still tries to use the dead node and doesn't recalculate the distribution across only the healthy nodes.

@sodabrew
Contributor

Closing. This is the intentional behavior, although the documentation could be better.

Here's a good blog post that goes into detail, and I'll excerpt a relevant part:
http://hoborglabs.com/en/blog/2013/memcached-php

Everybody knows what consistent distribution is all about. But I (and maybe you too) made some false assumptions about it. I've assumed that memcached driver will take care of dead servers and will start reassigning keys from dead server across existing ones - oh boy, I was wrong :) What consistent distribution means is that given keys will be stored on given servers no matter what. The distribution will change when you ask driver to recalculate it, by setting distribution method again, which updates hashing function and dead server(s) are no longer used.

[emphasis added]
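To make the quoted point concrete, here is a minimal sketch of a toy hash ring in plain PHP. It is not libmemcached's ketama implementation: the function names `buildRing` and `serverForKey`, the use of `crc32`, and the 100 points per server are all assumptions made for illustration. It only shows that a key's owner is a function of the servers on the ring, so an unreachable server keeps its keys until the ring is rebuilt without it.

```php
<?php
// Toy consistent-hash ring for illustration only.
// NOT libmemcached's ketama algorithm; names and hashing are assumptions.

/** Build a ring mapping hash points to servers, sorted by point. */
function buildRing(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[crc32("$server-$i")] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

/** Return the owner of $key: the first ring point >= hash(key), wrapping around. */
function serverForKey(array $ring, string $key): string
{
    if ($ring === []) {
        throw new InvalidArgumentException('ring is empty');
    }
    $hash = crc32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap to the first one
}

$servers = ['localhost:11211', 'non_working_host:11222'];
$key     = 'key_on_non_working_host';

$ring  = buildRing($servers);
$owner = serverForKey($ring, $key);

// A server going down does not change the ring, so the owner stays the same.
$sameOwner = serverForKey($ring, $key);

// Only rebuilding the ring without the dead server remaps its keys.
$survivors = array_values(array_diff($servers, ['non_working_host:11222']));
$newOwner  = serverForKey(buildRing($survivors), $key);
```

In this toy model, "asking the driver to recalculate the distribution" corresponds to calling `buildRing` again with only the surviving servers.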

The auto-failover in libmemcached does work, but there are internal timeouts. The distribution of keys is not immediately recalculated after a single failed get to a backend memcached. To do so would trigger horrible flapping, because failures can and do occur randomly on a heavily loaded server network.

Also with some more details:
https://bugs.launchpad.net/libmemcached/+bug/1158676/comments/7
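For anyone who still wants the behavior the original report asked for, the workaround described earlier in this thread can be wrapped in a small helper. This is a rough sketch, not a recommended or official pattern: `getWithRehash` is a hypothetical name, it only uses calls that already appear in this thread (`get()`, `getResultCode()`, `setOption()`), and one commenter above reports that the workaround did not help in their setup. Given the flapping concern, it is probably only reasonable where a single retry per failed read is acceptable.

```php
<?php
/**
 * Hypothetical helper (not part of php-memcached).
 *
 * Tries get() once. If the call fails with anything other than a plain
 * cache miss, re-applies OPT_LIBKETAMA_COMPATIBLE, which the reporter
 * observed to trigger a recalculation of the distribution, and retries once.
 */
function getWithRehash(Memcached $client, string $key)
{
    $value = $client->get($key);
    $code  = $client->getResultCode();

    // Success or an ordinary miss: nothing to recover from.
    if ($code === Memcached::RES_SUCCESS || $code === Memcached::RES_NOTFOUND) {
        return $value;
    }

    // Assumption taken from this thread: setting the option again makes the
    // client rebuild its server distribution without the failed host.
    $client->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

    // Single retry; callers should still check getResultCode() afterwards.
    return $client->get($key);
}

// Usage, assuming $b is configured as in the example above:
// $value = getWithRehash($b, 'key_on_non_working_host');
```

The helper retries at most once on purpose, so a genuinely unavailable cluster still fails fast instead of looping.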

@shmuel-krakower

Hi
Just to wrap it up from my side: we eventually used mcrouter to take care of load balancing and failover, as the behavior here was not clear and not flexible enough.
