1: The relationship between the two
ConsumerCoordinator issues its various requests through ConsumerNetworkClient, which sends them to the broker node; the GroupCoordinator on the broker then processes each request according to its type. In other words, ConsumerCoordinator calls into GroupCoordinator, and it is the GroupCoordinator that actually coordinates the group.
The Kafka server handles 21 kinds of requests in total, 8 of which are handled by the GroupCoordinator. The request types it handles:
ApiKeys.OFFSET_COMMIT;
ApiKeys.OFFSET_FETCH;
ApiKeys.JOIN_GROUP;
ApiKeys.LEAVE_GROUP;
ApiKeys.SYNC_GROUP;
ApiKeys.DESCRIBE_GROUPS;
ApiKeys.LIST_GROUPS;
ApiKeys.HEARTBEAT;
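The actual broker-side dispatch lives in Scala inside KafkaApis; the following is a minimal, self-contained Java sketch (with a made-up enum, not the real KafkaApis code) just to make the split of these eight request types concrete:

import java.util.EnumSet;

// Illustrative sketch: which ApiKeys the GroupCoordinator owns, and the
// membership check a dispatcher could perform. Not actual broker code.
public class CoordinatorRouting {
    enum ApiKeys { OFFSET_COMMIT, OFFSET_FETCH, JOIN_GROUP, LEAVE_GROUP,
                   SYNC_GROUP, DESCRIBE_GROUPS, LIST_GROUPS, HEARTBEAT,
                   PRODUCE, FETCH /* ... the other request types ... */ }

    // The 8 coordinator-owned request types listed above.
    static final EnumSet<ApiKeys> COORDINATOR_APIS = EnumSet.of(
            ApiKeys.OFFSET_COMMIT, ApiKeys.OFFSET_FETCH, ApiKeys.JOIN_GROUP,
            ApiKeys.LEAVE_GROUP, ApiKeys.SYNC_GROUP, ApiKeys.DESCRIBE_GROUPS,
            ApiKeys.LIST_GROUPS, ApiKeys.HEARTBEAT);

    public static void main(String[] args) {
        for (ApiKeys key : ApiKeys.values())
            System.out.println(key + " -> " + (COORDINATOR_APIS.contains(key)
                    ? "GroupCoordinator" : "other handler"));
    }
}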
2: The ConsumerCoordinator class
- 1: It is the concrete implementation of AbstractCoordinator, and is responsible for communicating with the GroupCoordinator:
- 2: It uses ConsumerNetworkClient to communicate with the Kafka broker nodes, sending requests and defining how the asynchronous responses are handled, e.g. the requests behind leader election, join-group requests, and offset commit requests.
The class diagram is as follows:
2.1: Fields
The main fields are as follows:
private final List<PartitionAssignor> assignors; // consumer partition assignment strategies
private final Metadata metadata; // cluster metadata
private final ConsumerCoordinatorMetrics sensors;
private final SubscriptionState subscriptions; // subscription state: subscribed topics, consumption positions, etc.
private final OffsetCommitCallback defaultOffsetCommitCallback;
private final boolean autoCommitEnabled; // whether automatic offset commit is enabled
private final int autoCommitIntervalMs; // auto-commit interval
// thread-safe queue of completed offset commits, handed over from the response-handling side
private final ConcurrentLinkedQueue<OffsetCommitCompletion> completedOffsetCommits;
private boolean isLeader = false;
private Set<String> joinedSubscription; // topics subscribed at the time the group was joined
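The completedOffsetCommits queue is what lets commit responses, which arrive on whatever thread drives the network client, be handed back to the user's thread: the response handler only enqueues a completion, and invokeCompletedOffsetCommitCallbacks() (called at the top of poll(), shown later) drains the queue and runs the callbacks there. A minimal self-contained sketch of that hand-off pattern (class and method shapes here are illustrative, not Kafka's):

import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch of the hand-off pattern behind completedOffsetCommits.
public class CompletionQueueSketch {
    interface OffsetCommitCallback { void onComplete(long offset, Exception e); }

    private final ConcurrentLinkedQueue<Runnable> completed = new ConcurrentLinkedQueue<>();

    // Called from the network/response side: never runs user code directly,
    // it only records that the callback is ready to fire.
    void onCommitResponse(OffsetCommitCallback cb, long offset, Exception error) {
        completed.add(() -> cb.onComplete(offset, error));
    }

    // Called at the start of every poll() on the user's thread:
    // drain the queue and run the callbacks there.
    void invokeCompletedOffsetCommitCallbacks() {
        Runnable r;
        while ((r = completed.poll()) != null)
            r.run();
    }

    public static void main(String[] args) {
        CompletionQueueSketch s = new CompletionQueueSketch();
        s.onCommitResponse((off, e) -> System.out.println("committed up to " + off), 42L, null);
        s.invokeCompletedOffsetCommitCallbacks(); // callback runs on the caller's thread
    }
}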
2.2: Methods
- poll(): interacts with the GroupCoordinator and settles the consumer-to-partition consumption plan
- onJoinComplete(): applies the consumer-to-partition assignment according to the partition assignment strategy
- metadata(): for each partition assignment strategy, collects the set of topics it applies to
- needRejoin(): whether a JoinGroupRequest needs to be re-sent
The class also defines a number of xxxResponseHandler classes that specify what happens once a response is received; for example, JoinGroupResponseHandler describes the behavior after a join-group request has been sent and its response arrives, as sketched below.
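A minimal sketch of that request/handler pattern, with illustrative names (the real handlers extend CoordinatorResponseHandler and are fired asynchronously through futures completed by ConsumerNetworkClient):

// Illustrative sketch: a request is sent, and a handler object registered up
// front decides what happens when the response eventually comes back.
public class ResponseHandlerSketch {
    interface ResponseHandler<R> { void handle(R response); }

    static <R> void send(String requestName, R simulatedResponse, ResponseHandler<R> handler) {
        System.out.println("sending " + requestName);
        handler.handle(simulatedResponse); // in Kafka this fires asynchronously via a future
    }

    public static void main(String[] args) {
        // Analogous to JoinGroupResponseHandler: on success, remember our member id.
        send("JoinGroupRequest", "member-1", memberId ->
                System.out.println("joined group as " + memberId));
    }
}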
Key method: metadata()
@Override // for each partition assignment strategy, collect the topics it applies to
public List<ProtocolMetadata> metadata() {
    // take the topic list from the current subscription
    this.joinedSubscription = subscriptions.subscription();
    List<ProtocolMetadata> metadataList = new ArrayList<>();
    // iterate over the configured assignment strategies
    for (PartitionAssignor assignor : assignors) {
        Subscription subscription = assignor.subscription(joinedSubscription);
        ByteBuffer metadata = ConsumerProtocol.serializeSubscription(subscription);
        metadataList.add(new ProtocolMetadata(assignor.name(), metadata));
    }
    return metadataList;
}
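The returned list is embedded in the JoinGroupRequest, one entry per supported strategy, so the coordinator can pick a protocol every member supports. A self-contained sketch of the shape of that list for a consumer configured with the range and roundrobin assignors, using plain Java types as stand-ins for the Kafka classes:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.*;

// Sketch of the (strategy name, serialized subscription) pairs metadata() builds.
public class ProtocolMetadataSketch {
    public static void main(String[] args) {
        Set<String> joinedSubscription = new HashSet<>(Arrays.asList("orders", "payments"));
        List<String> assignors = Arrays.asList("range", "roundrobin");

        List<Map.Entry<String, ByteBuffer>> metadataList = new ArrayList<>();
        for (String assignor : assignors) {
            // Stand-in for ConsumerProtocol.serializeSubscription(subscription);
            // the real format also carries a version and optional user data.
            ByteBuffer metadata = ByteBuffer.wrap(
                    String.join(",", joinedSubscription).getBytes(StandardCharsets.UTF_8));
            metadataList.add(new AbstractMap.SimpleEntry<>(assignor, metadata));
        }
        metadataList.forEach(e -> System.out.println(
                e.getKey() + " -> " + e.getValue().remaining() + " bytes of subscription metadata"));
    }
}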
Key method: poll()
It is executed on every message fetch in KafkaConsumer.pollOnce(), and makes sure the coordinator is known and the consumer is an active member of the group.
public void poll(long now, long remainingMs) {
    // run any pending OffsetCommitCallback#onComplete callbacks
    invokeCompletedOffsetCommitCallbacks();

    // Applies when partitions are assigned automatically, i.e. subscription by
    // topic list or by topic pattern.
    // Every broker hosts a coordinator, and each group id is mapped to one of
    // them that acts as the group's coordinator, so the steps are:
    // 1: find the coordinator node
    // 2: every consumer sends a join request to that node to enter the group,
    //    and one consumer is picked as the leader
    // 3: the leader computes the partition assignment for the consumers
    if (subscriptions.partitionsAutoAssigned()) {
        if (coordinatorUnknown()) { // 1: locate the coordinator node
            // The coordinator is the leader of the __consumer_offsets partition the
            // group id hashes to; ensureCoordinatorReady contacts a broker to
            // discover that node and establishes a connection to it.
            ensureCoordinatorReady();
            now = time.milliseconds();
        }
        if (needRejoin()) { // 2: join the group once the coordinator is known
            // due to a race condition between the initial metadata fetch and the initial rebalance,
            // we need to ensure that the metadata is fresh before joining initially. This ensures
            // that we have matched the pattern against the cluster's topics at least once before joining.
            if (subscriptions.hasPatternSubscription())
                client.ensureFreshMetadata();
            ensureActiveGroup(); // 3: drives the join and the partition assignment
            now = time.milliseconds();
        }
        pollHeartbeat(now);
    } else {
        // For manually assigned partitions, if there are no ready nodes, await metadata.
        // If connections to all nodes fail, wakeups triggered while attempting to send fetch
        // requests result in polls returning immediately, causing a tight loop of polls. Without
        // the wakeup, poll() with no channels would block for the timeout, delaying re-connection.
        // awaitMetadataUpdate() initiates new connections with configured backoff and avoids the busy loop.
        // When group management is used, metadata wait is already performed for this scenario as
        // coordinator is unknown, hence this check is not required.
        if (metadata.updateRequested() && !client.hasReadyNodes()) {
            boolean metadataUpdated = client.awaitMetadataUpdate(remainingMs);
            if (!metadataUpdated && !client.hasReadyNodes())
                return;
            now = time.milliseconds();
        }
    }
    maybeAutoCommitOffsetsAsync(now);
}
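The final maybeAutoCommitOffsetsAsync(now) call is what implements enable.auto.commit: if auto-commit is on and the interval has elapsed, it schedules an asynchronous commit of the current consumed positions. A minimal sketch of that deadline logic, under the assumption that it is a simple timer (field and method names here are illustrative):

// Illustrative sketch of interval-based auto-commit scheduling.
public class AutoCommitSketch {
    private final boolean autoCommitEnabled = true;
    private final long autoCommitIntervalMs = 5_000; // auto.commit.interval.ms default
    private long nextAutoCommitDeadline = 0;

    void maybeAutoCommitOffsetsAsync(long now) {
        if (autoCommitEnabled && now >= nextAutoCommitDeadline) {
            nextAutoCommitDeadline = now + autoCommitIntervalMs;
            commitOffsetsAsync(); // send an OFFSET_COMMIT request without blocking poll()
        }
    }

    void commitOffsetsAsync() {
        System.out.println("async offset commit triggered");
    }

    public static void main(String[] args) {
        AutoCommitSketch s = new AutoCommitSketch();
        s.maybeAutoCommitOffsetsAsync(System.currentTimeMillis()); // fires: deadline passed
        s.maybeAutoCommitOffsetsAsync(System.currentTimeMillis()); // silent: within interval
    }
}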
3: GroupCoordinator
Each Kafka broker instantiates one coordinator, called the GroupCoordinator:
- the instance manages the membership of the consumer groups assigned to it (deciding whether a rebalance needs to happen)
- it manages offset commits
- the GroupCoordinator that owns a group monitors the heartbeats of that group's consumers (a sketch of this session-timeout check follows the list)
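Heartbeat monitoring boils down to a session timeout per member: if no heartbeat arrives within session.timeout.ms, the member is considered dead and a rebalance is triggered. A self-contained sketch of that check (the real broker schedules delayed operations in a purgatory rather than scanning like this, but the condition is the same):

import java.util.HashMap;
import java.util.Map;

// Illustrative session-timeout check over group members.
public class HeartbeatSketch {
    private final long sessionTimeoutMs = 10_000; // example session.timeout.ms
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    void onHeartbeat(String memberId, long now) { lastHeartbeat.put(memberId, now); }

    void checkLiveness(long now) {
        lastHeartbeat.forEach((member, last) -> {
            if (now - last > sessionTimeoutMs)
                System.out.println(member + " expired -> trigger rebalance");
        });
    }

    public static void main(String[] args) {
        HeartbeatSketch s = new HeartbeatSketch();
        s.onHeartbeat("consumer-1", 0);
        s.checkLiveness(15_000); // consumer-1 expired -> trigger rebalance
    }
}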
How the GroupCoordinator for a consumer group is determined
The formula: Math.abs(groupId.hashCode() % groupMetadataTopicPartitionCount)
groupMetadataTopicPartitionCount is configured by offsets.topic.num.partitions (50 partitions by default); the broker hosting the leader replica of the resulting __consumer_offsets partition is the group's GroupCoordinator.
The coordinator chosen this way then orchestrates the group's join/rebalance flow (the consumption plan itself is computed by the consumer that becomes group leader, as described below).
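As a worked example, the following computes which __consumer_offsets partition, and hence which coordinator, a given group id maps to (assuming the default of 50 partitions; the group id is made up):

// Worked example: map a group id to its __consumer_offsets partition.
public class CoordinatorPartition {
    public static void main(String[] args) {
        String groupId = "my-consumer-group";      // example group id
        int groupMetadataTopicPartitionCount = 50; // offsets.topic.num.partitions default
        int partition = Math.abs(groupId.hashCode() % groupMetadataTopicPartitionCount);
        // The leader replica of __consumer_offsets-<partition> is this group's GroupCoordinator.
        System.out.println(groupId + " -> __consumer_offsets-" + partition);
    }
}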
3.1: Fields
3.2: Methods
- startup: starts the GroupCoordinator instance when the broker starts
- handleJoinGroup: handles the JOIN_GROUP requests sent by the consumers; every member of the group sends one, the first consumer to join becomes the group leader, and the leader is responsible for computing the consumption plan
1: startup starts the GroupCoordinator instance
def startup(enableMetadataExpiration: Boolean = true) {
  info("Starting up.")
  // start the GroupMetadataManager (group/offset metadata cache maintenance)
  groupManager.startup(enableMetadataExpiration)
  // mark the coordinator as ready to serve requests
  isActive.set(true)
  info("Startup complete.")
}
2: Handling the ApiKeys.JOIN_GROUP request
The GroupCoordinator located by the hash above receives the join requests and drives the membership flow that leads to the consumption plan.
def handleJoinGroup(groupId: String,
                    memberId: String,
                    clientId: String,
                    clientHost: String,
                    rebalanceTimeoutMs: Int,
                    sessionTimeoutMs: Int,
                    protocolType: String,
                    protocols: List[(String, Array[Byte])],
                    responseCallback: JoinCallback) {
  // only try to create the group if the group is not unknown AND
  // the member id is UNKNOWN; if a member is specified but the group does
  // not exist we should reject the request
  groupManager.getGroup(groupId) match {
    // the group does not exist yet, so there is no consumption plan either
    case None =>
      if (memberId != JoinGroupRequest.UNKNOWN_MEMBER_ID) {
        responseCallback(joinError(memberId, Errors.UNKNOWN_MEMBER_ID))
      } else {
        // neither the group nor the member id exists: this is the first consumer
        // joining the group, so create the group and initialize its state
        val group = groupManager.addGroup(new GroupMetadata(groupId, initialState = Empty))
        doJoinGroup(group, memberId, clientId, clientHost, rebalanceTimeoutMs, sessionTimeoutMs, protocolType, protocols, responseCallback)
      }
    // Note: the consumption plan itself is not computed here on the broker; the
    // group leader computes it client-side in ConsumerCoordinator#performAssignment.
    case Some(group) =>
      doJoinGroup(group, memberId, clientId, clientHost, rebalanceTimeoutMs, sessionTimeoutMs, protocolType, protocols, responseCallback)
  }
}
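After the JoinGroup round completes, the coordinator tells each member whether it is the leader; only the leader runs the assignment (ConsumerCoordinator#performAssignment on the client) and ships the result back in a SYNC_GROUP request. A minimal sketch of that client-side branching, with illustrative names and a stand-in round-robin assignor:

import java.util.*;

// Illustrative sketch of what happens after a JoinGroup response arrives.
public class JoinFlowSketch {
    static Map<String, List<String>> performAssignment(List<String> members, List<String> partitions) {
        // stand-in for the configured PartitionAssignor: round-robin over members
        Map<String, List<String>> assignment = new HashMap<>();
        members.forEach(m -> assignment.put(m, new ArrayList<>()));
        for (int i = 0; i < partitions.size(); i++)
            assignment.get(members.get(i % members.size())).add(partitions.get(i));
        return assignment;
    }

    public static void main(String[] args) {
        String myId = "consumer-1", leaderId = "consumer-1"; // from the JoinGroup response
        List<String> members = Arrays.asList("consumer-1", "consumer-2");
        if (myId.equals(leaderId)) {
            // only the leader computes the plan, then sends it via SYNC_GROUP
            Map<String, List<String>> plan = performAssignment(members,
                    Arrays.asList("orders-0", "orders-1", "orders-2"));
            System.out.println("leader computed: " + plan);
        } else {
            System.out.println("follower: send empty SYNC_GROUP and wait for the plan");
        }
    }
}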