-
Notifications
You must be signed in to change notification settings - Fork 3.9k
core, grpclb,xds: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken #8048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core, grpclb,xds: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken #8048
Conversation
…is broken, let LB policies handle it.
…c in round_robin.
…x/move_subchannel_refreshing_ns_to_lb
…c in ring hash LB policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any ideas how we're supposed to notice bugs with LB behavior? It would seem trivial for a LB policy to not be triggering refreshes and no test would fail.
…l's refresh for cases LoadBalancer intentionally does not want a refresh.
…when handling subchannel state changes. Log a warning and do a refresh if it is not.
… a name resolution refresh for the resolver in Channel, so use ignoreRefreshNameResolutionCheck() to avoid false-positive warnings.
9256ec1
to
2ff23c6
Compare
Fixed by adding a flag. The Channel will still do the automatic refresh if the LoadBalancer did not do so. PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to help check that we've updated our code appropriately, replace the warning with an exception/panic and run the tests to make sure they pass. After a few releases with the warning, I wouldn't be surprised if we want to do something similar for a grpc-java release.
…o no-op, update its doc and not mark it as deprecated for now.
… for OOB channel state changes.
Currently each subchannel implicitly refreshes the name resolution when its connection is broken. That is, this feature is built into subchannel's internal implementation. Although it eliminates the burden of having LB implementations refreshing the resolver when connections to backends are broken, this is gives LB policies no chance to disable or override this refresh (e.g., in some complex load balancing hierarchy like xDS, LB policies may embed a resolver inside for resolving backends so the refreshing resolution operation should be hooked to the resolver embedded in the LB policy instead of the one in Channel).
This is likely to be a breaking change for users implementing their own LB policies (performing load balancing directly on backends). In order to make this transition smoothly, we add a check to SubchannelImpl that checks if the LoadBalancer has explicitly called
Helper.refreshNameResolution
for broken subchannels created by it. If not, it logs a warning and do the refresh.A temporary LoadBalancer.Helper API
ignoreRefreshNameResolution()
is added to avoid false-positive warnings for xDS that intentionally does not want a refresh. Once the migration is done, this should be deleted.See details in #8088.