RocketMQ Source Code Analysis 5: The Broker Message Dispatch Flow

May 17, 2022

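In RocketMQ, messages are stored on the server in the structure below. Every message has corresponding index entries, and the Consumer reads the message body through the ConsumeQueue, which acts as a secondary index. The flow is as follows: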

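After a message is written to the commitlog, RocketMQ uses an asynchronous thread to dispatch the message's physical offset (used to locate the message) to the ConsumeQueue and the IndexFile in near real time.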

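ConsumeQueue: the message consume queue, introduced mainly to improve consumption performance. RocketMQ uses a topic-based subscription model, so consumption is per topic, and scanning the commitlog files to find one topic's messages would be very inefficient. Instead, the Consumer finds the messages to consume through the ConsumeQueue. The ConsumeQueue (logical consume queue) serves as the consumption index: for each queue under a given Topic it stores each message's starting physical offset in the CommitLog, the message size, and the hash code of the message Tag.
IndexFile: the IndexFile (index file) provides a way to query messages by key or by time range.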
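To make the ConsumeQueue entry concrete, here is a minimal sketch of reading one index entry from a buffer. It assumes the 20-byte layout described above (8-byte commitlog offset, 4-byte size, 8-byte tag hash code); the class and method names are hypothetical and are not part of RocketMQ.

```java
import java.nio.ByteBuffer;

final class ConsumeQueueEntryDemo {
    // One index unit: 8-byte commitlog offset + 4-byte size + 8-byte tag hash code.
    static final int CQ_STORE_UNIT_SIZE = 20;

    /**
     * Returns {commitLogOffset, size, tagsCode} for the entry at queueOffset.
     * queueOffset is an entry number, not a byte position.
     */
    static long[] readEntry(ByteBuffer consumeQueue, long queueOffset) {
        int pos = (int) (queueOffset * CQ_STORE_UNIT_SIZE);
        long commitLogOffset = consumeQueue.getLong(pos);
        int size = consumeQueue.getInt(pos + 8);
        long tagsCode = consumeQueue.getLong(pos + 12);
        return new long[] {commitLogOffset, size, tagsCode};
    }

    public static void main(String[] args) {
        // Two hypothetical entries: a 245-byte message at offset 0, then a 300-byte one.
        ByteBuffer cq = ByteBuffer.allocate(2 * CQ_STORE_UNIT_SIZE);
        cq.putLong(0L).putInt(245).putLong("TagA".hashCode());
        cq.putLong(245L).putInt(300).putLong("TagB".hashCode());

        long[] second = readEntry(cq, 1);
        System.out.println("commitlog offset=" + second[0]
            + ", size=" + second[1] + ", tagsCode=" + second[2]);
    }
}
```

Because every entry has the same size, a consumer can jump straight to entry n at byte n * 20 and then read the full message from the commitlog at the stored offset.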

1. Message dispatch thread

Let's take a brief look at the part of the Broker startup flow that relates to message dispatch:

BrokerController

public void start() throws Exception {
    if (this.messageStore != null) {
        // DefaultMessageStore
        this.messageStore.start();
    }
    // other code ...
}

Next, let's look at the implementation inside DefaultMessageStore:

public void start() throws Exception {
    // other code ...

    // Mark where dispatch starts, i.e. the commitlog position to dispatch from
    this.reputMessageService.setReputFromOffset(maxPhysicalPosInLogicQueue);
    // Start the message dispatch service
    this.reputMessageService.start();

    this.recoverTopicQueueTable();

    if (!messageStoreConfig.isEnableDLegerCommitLog()) {
        this.haService.start();
        this.handleScheduleMessageService(messageStoreConfig.getBrokerRole());
    }

    // Start flushing the consumequeue to disk
    this.flushConsumeQueueService.start();
    this.commitLog.start();
    this.storeStatsService.start();

    this.createTempFile();
    this.addScheduleTask();
    this.shutdown = false;
}

ReputMessageService is an asynchronous thread: it extends ServiceThread, which implements Runnable.

public abstract class ServiceThread implements Runnable {

    public abstract String getServiceName();

    public void start() {
        log.info("Try to start service thread:{} started:{} lastThread:{}", getServiceName(), started.get(), thread);
        if (!started.compareAndSet(false, true)) {
            return;
        }
        stopped = false;
        this.thread = new Thread(this, getServiceName());
        this.thread.setDaemon(isDaemon);
        this.thread.start();
    }
}

Now let's look at the run() method of ReputMessageService:

It is an inner class of DefaultMessageStore.

class ReputMessageService extends ServiceThread {

    @Override
    public void run() {
        DefaultMessageStore.log.info(this.getServiceName() + " service started");

        while (!this.isStopped()) {
            try {
                Thread.sleep(1);
                // Core method: message dispatch
                this.doReput();
            } catch (Exception e) {
                DefaultMessageStore.log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }

        DefaultMessageStore.log.info(this.getServiceName() + " service end");
    }
}
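The start-once, loop-until-stopped pattern shown above can be reproduced in a few lines. The following is only a sketch under simplifying assumptions: MiniServiceThread and its helper methods are hypothetical names, and a counter stands in for the real doReput() call.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

final class MiniServiceThread implements Runnable {
    private final AtomicBoolean started = new AtomicBoolean(false);
    private final AtomicInteger threadsCreated = new AtomicInteger();
    private final AtomicLong rounds = new AtomicLong();
    private volatile boolean stopped = false;
    private volatile Thread thread;

    /** Starts the worker at most once, no matter how often start() is called. */
    void start() {
        if (!started.compareAndSet(false, true)) {
            return;
        }
        threadsCreated.incrementAndGet();
        thread = new Thread(this, "MiniReputService");
        thread.setDaemon(true);
        thread.start();
    }

    @Override
    public void run() {
        while (!stopped) {
            try {
                Thread.sleep(1);
                rounds.incrementAndGet(); // stands in for doReput()
            } catch (Exception e) {
                // Log and keep looping, as the real service does.
                System.err.println("service has exception: " + e);
            }
        }
    }

    /** Asks the loop to stop and waits for the worker thread to exit. */
    void shutdown() {
        stopped = true;
        Thread t = thread;
        if (t == null) {
            return;
        }
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Waits until at least n rounds have run, or the timeout expires. */
    boolean awaitRounds(long n, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (rounds.get() < n && System.currentTimeMillis() < deadline) {
            Thread.yield();
        }
        return rounds.get() >= n;
    }

    long rounds() { return rounds.get(); }

    int threadsCreated() { return threadsCreated.get(); }

    public static void main(String[] args) {
        MiniServiceThread service = new MiniServiceThread();
        service.start();
        service.start(); // ignored: already started
        service.awaitRounds(5, 5000);
        service.shutdown();
        System.out.println("rounds=" + service.rounds() + ", threads=" + service.threadsCreated());
    }
}
```

The 1 ms sleep between rounds is what makes dispatch "near real time": the loop polls for new commitlog data roughly a thousand times per second instead of being notified on every write.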

2. Dispatch logic

As the code above shows, doReput() is the core of the dispatch logic, so let's see how it dispatches.

DefaultMessageStore.ReputMessageService#doReput()

private void doReput() {
    // On broker startup, reputFromOffset is initialized (from maxPhysicalPosInLogicQueue) to mark
    // the commitlog position where dispatch starts; data already dispatched is not dispatched again
    if (this.reputFromOffset < DefaultMessageStore.this.commitLog.getMinOffset()) {
        log.warn("The reputFromOffset={} is smaller than minPyOffset={}, this usually indicate that the dispatch behind too much and the commitlog has expired.",
            this.reputFromOffset, DefaultMessageStore.this.commitLog.getMinOffset());
        this.reputFromOffset = DefaultMessageStore.this.commitLog.getMinOffset();
    }
    // Keep going up to the maximum valid data position of the last CommitLog file
    for (boolean doNext = true; this.isCommitLogAvailable() && doNext; ) {

        if (DefaultMessageStore.this.getMessageStoreConfig().isDuplicationEnable()
            && this.reputFromOffset >= DefaultMessageStore.this.getConfirmOffset()) {
            break;
        }

        // Read messages from the commitlog.
        // Suppose the first message is 245 bytes in total: after it is written,
        // MappedFile.wrotePosition records that position,
        // so this call returns the data from byte 0 to byte 245
        SelectMappedBufferResult result = DefaultMessageStore.this.commitLog.getData(reputFromOffset);
        if (result != null) {
            try {
                // Mark the start position of this read.
                // The first time it defaults to 0, so one 245-byte message is read starting at 0;
                // on the second read this value is 245
                this.reputFromOffset = result.getStartOffset();

                for (int readSize = 0; readSize < result.getSize() && doNext; ) {
                    // Read one message
                    DispatchRequest dispatchRequest =
                        DefaultMessageStore.this.commitLog.checkMessageAndReturnSize(result.getByteBuffer(), false, false);
                    // Message size
                    int size = dispatchRequest.getBufferSize() == -1 ? dispatchRequest.getMsgSize() : dispatchRequest.getBufferSize();

                    // The read succeeded
                    if (dispatchRequest.isSuccess()) {
                        // There is a message
                        if (size > 0) {
                            // Build the consumequeue and indexFile entries
                            DefaultMessageStore.this.doDispatch(dispatchRequest);

                            // Can be ignored for now: roughly, this notifies the listener
                            // that waits for message arrival (long polling)
                            if (BrokerRole.SLAVE != DefaultMessageStore.this.getMessageStoreConfig().getBrokerRole()
                                    && DefaultMessageStore.this.brokerConfig.isLongPollingEnable()
                                    && DefaultMessageStore.this.messageArrivingListener != null) {
                                DefaultMessageStore.this.messageArrivingListener.arriving(dispatchRequest.getTopic(),
                                    dispatchRequest.getQueueId(), dispatchRequest.getConsumeQueueOffset() + 1,
                                    dispatchRequest.getTagsCode(), dispatchRequest.getStoreTimestamp(),
                                    dispatchRequest.getBitMap(), dispatchRequest.getPropertiesMap());
                            }

                            // size is the size of one message, so this advances to the next message's offset
                            this.reputFromOffset += size;
                            // Bytes read so far
                            readSize += size;
                            if (DefaultMessageStore.this.getMessageStoreConfig().getBrokerRole() == BrokerRole.SLAVE) {
                                DefaultMessageStore.this.storeStatsService
                                    .getSinglePutMessageTopicTimesTotal(dispatchRequest.getTopic()).incrementAndGet();
                                DefaultMessageStore.this.storeStatsService
                                    .getSinglePutMessageTopicSizeTotal(dispatchRequest.getTopic())
                                    .addAndGet(dispatchRequest.getMsgSize());
                            }
                        } else if (size == 0) {
                            // No more messages in this file: switch to the next file
                            this.reputFromOffset = DefaultMessageStore.this.commitLog.rollNextFile(this.reputFromOffset);
                            readSize = result.getSize();
                        }
                    } else if (!dispatchRequest.isSuccess()) {

                        if (size > 0) {
                            log.error("[BUG]read total count not equals msg total size. reputFromOffset={}", reputFromOffset);
                            this.reputFromOffset += size;
                        } else {
                            doNext = false;
                            // If user open the dledger pattern or the broker is master node,
                            // it will not ignore the exception and fix the reputFromOffset variable
                            if (DefaultMessageStore.this.getMessageStoreConfig().isEnableDLegerCommitLog() ||
                                DefaultMessageStore.this.brokerConfig.getBrokerId() == MixAll.MASTER_ID) {
                                log.error("[BUG]dispatch message to consume queue error, COMMITLOG OFFSET: {}",
                                    this.reputFromOffset);
                                this.reputFromOffset += result.getSize() - readSize;
                            }
                        }
                    }
                }
            } finally {
                result.release();
            }
        } else {
            doNext = false;
        }
    }
}

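This method is fairly long, so let's summarize it briefly:

Read messages from the commitlog.

How much is read? As we saw earlier when looking at how messages are written to the commitlog, once a message is successfully written to the buffer, the wrotePosition attribute records the highest written position in the commitlog. So here the read covers the messages up to the position recorded by wrotePosition.

Iterate over the data that was read and store each message's content in a DispatchRequest object.
If a message was read, dispatch it:

// Build the consumequeue and indexFile entries. This is the core of dispatch
DefaultMessageStore.this.doDispatch(dispatchRequest);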
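The read-and-advance loop summarized above can be sketched with a toy commitlog. This is a deliberately simplified model, not RocketMQ's record format: each record here is assumed to be a 4-byte total size (counting the size field itself) followed by the body, and ReputLoopDemo, append and reput are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReputLoopDemo {
    /** Appends one record: 4-byte total size (including the size field) followed by the body. */
    static void append(ByteBuffer log, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        log.putInt(4 + bytes.length).put(bytes);
    }

    /**
     * Walks the records between reputFromOffset and wrotePosition, records the offset of
     * every message that would be dispatched, and returns the new reputFromOffset.
     */
    static long reput(ByteBuffer log, long reputFromOffset, int wrotePosition, List<Long> dispatched) {
        while (reputFromOffset < wrotePosition) {
            int size = log.getInt((int) reputFromOffset);
            if (size <= 0) {
                break; // nothing valid here: stop this round
            }
            dispatched.add(reputFromOffset); // stands in for doDispatch(dispatchRequest)
            reputFromOffset += size;         // move on to the next message
        }
        return reputFromOffset;
    }

    public static void main(String[] args) {
        ByteBuffer log = ByteBuffer.allocate(64);
        append(log, "hello");
        append(log, "rocketmq");

        List<Long> dispatched = new ArrayList<>();
        long next = reput(log, 0, log.position(), dispatched);
        System.out.println("dispatched offsets=" + dispatched + ", next reputFromOffset=" + next);

        // A later round picks up only what was written since the previous one.
        append(log, "x");
        next = reput(log, next, log.position(), dispatched);
        System.out.println("dispatched offsets=" + dispatched + ", next reputFromOffset=" + next);
    }
}
```

The key point is that reputFromOffset only moves forward by the size of each dispatched message, so every round resumes exactly where the previous one stopped.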

Let's step into doDispatch to see how it is implemented:

public class DefaultMessageStore implements MessageStore {

    private final LinkedList<CommitLogDispatcher> dispatcherList;

    /**
     * DefaultMessageStore constructor
     */
    public DefaultMessageStore(....) throws IOException {
        // Container that holds the message dispatchers
        this.dispatcherList = new LinkedList<>();
        // Writes the ConsumeQueue file
        this.dispatcherList.addLast(new CommitLogDispatcherBuildConsumeQueue());
        // Writes the IndexFile
        this.dispatcherList.addLast(new CommitLogDispatcherBuildIndex());
        ...
    }

    /**
     * Dispatch
     */
    public void doDispatch(DispatchRequest req) {
        // Perform the dispatch. dispatcherList contains two objects:
        // 1. CommitLogDispatcherBuildConsumeQueue: writes the ConsumeQueue file
        // 2. CommitLogDispatcherBuildIndex: writes the Index file
        for (CommitLogDispatcher dispatcher : this.dispatcherList) {
            dispatcher.dispatch(req);
        }
    }
}

When the DefaultMessageStore object is created, it creates two CommitLogDispatcher instances internally:

CommitLogDispatcherBuildConsumeQueue: responsible for the ConsumeQueue
CommitLogDispatcherBuildIndex: responsible for the IndexFile
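The dispatcher list is a small chain: every registered dispatcher sees every request, in registration order. A minimal sketch of that idea follows, with hypothetical names (DispatcherChainDemo, Dispatcher) and a plain commitlog offset standing in for the full DispatchRequest.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class DispatcherChainDemo {
    /** Simplified stand-in for CommitLogDispatcher: receives only the commitlog offset. */
    interface Dispatcher {
        void dispatch(long commitLogOffset);
    }

    // Records what each dispatcher would have written, in call order.
    final List<String> written = new ArrayList<>();
    private final LinkedList<Dispatcher> dispatcherList = new LinkedList<>();

    DispatcherChainDemo() {
        // Registration order decides the order in which the indexes are built.
        dispatcherList.addLast(offset -> written.add("consumequeue:" + offset));
        dispatcherList.addLast(offset -> written.add("index:" + offset));
    }

    void doDispatch(long commitLogOffset) {
        for (Dispatcher dispatcher : dispatcherList) {
            dispatcher.dispatch(commitLogOffset);
        }
    }

    public static void main(String[] args) {
        DispatcherChainDemo store = new DispatcherChainDemo();
        store.doDispatch(245L);
        System.out.println(store.written);
    }
}
```

Keeping the dispatchers in a list means another index type could be added by registering one more dispatcher, without touching the reput loop.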

2.1 Dispatching to the ConsumeQueue

class CommitLogDispatcherBuildConsumeQueue implements CommitLogDispatcher {

    @Override
    public void dispatch(DispatchRequest request) {
        final int tranType = MessageSysFlag.getTransactionValue(request.getSysFlag());
        switch (tranType) {
            case MessageSysFlag.TRANSACTION_NOT_TYPE:
            case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
                // Save the ConsumeQueue information
                DefaultMessageStore.this.putMessagePositionInfo(request);
                break;
            case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
            case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
                break;
        }
    }
}
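The switch above means that only non-transactional messages and committed transactional messages become visible to consumers; prepared (half) and rolled-back messages never get a ConsumeQueue entry. The sketch below isolates that decision. The constants are redefined locally for illustration, and their bit values are an assumption about how MessageSysFlag encodes the transaction type in bits 2 and 3 of sysFlag.

```java
final class TransactionFilterDemo {
    // Local copies for illustration only; the values are assumed to mirror MessageSysFlag.
    static final int TRANSACTION_NOT_TYPE = 0;
    static final int TRANSACTION_PREPARED_TYPE = 0x1 << 2;
    static final int TRANSACTION_COMMIT_TYPE = 0x2 << 2;
    static final int TRANSACTION_ROLLBACK_TYPE = 0x3 << 2;

    /** Extracts the transaction bits from sysFlag and ignores every other flag bit. */
    static int getTransactionValue(int sysFlag) {
        return sysFlag & TRANSACTION_ROLLBACK_TYPE;
    }

    /** True when the message should get a ConsumeQueue entry. */
    static boolean shouldBuildConsumeQueue(int sysFlag) {
        switch (getTransactionValue(sysFlag)) {
            case TRANSACTION_NOT_TYPE:
            case TRANSACTION_COMMIT_TYPE:
                return true;
            default:
                // Prepared and rolled-back messages stay invisible to consumers.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("normal:   " + shouldBuildConsumeQueue(TRANSACTION_NOT_TYPE));
        System.out.println("prepared: " + shouldBuildConsumeQueue(TRANSACTION_PREPARED_TYPE));
        System.out.println("commit:   " + shouldBuildConsumeQueue(TRANSACTION_COMMIT_TYPE));
        System.out.println("rollback: " + shouldBuildConsumeQueue(TRANSACTION_ROLLBACK_TYPE));
    }
}
```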

Let's continue into putMessagePositionInfo:

public void putMessagePositionInfo(DispatchRequest dispatchRequest) {
    // Get the ConsumeQueue for this topic and queueId
    ConsumeQueue cq = this.findConsumeQueue(dispatchRequest.getTopic(), dispatchRequest.getQueueId());
    // Save the message's index unit
    cq.putMessagePositionInfoWrapper(dispatchRequest);
}

It first looks up the ConsumeQueue object by topic and queueId, then calls the ConsumeQueue's putMessagePositionInfoWrapper(..) method:

public void putMessagePositionInfoWrapper(DispatchRequest request) {
    // ... some code omitted ...

    // Key part only. The arguments are: commitlog offset, msgSize, tag hash code, queue offset
    boolean result = this.putMessagePositionInfo(request.getCommitLogOffset(),
            request.getMsgSize(), tagsCode, request.getConsumeQueueOffset());

    // ... some code omitted ...
}

Next, let's look at the private putMessagePositionInfo method it calls:

private boolean putMessagePositionInfo(final long offset, final int size, final long tagsCode,
    final long cqOffset) {

    if (offset + size <= this.maxPhysicOffset) {
        log.warn("Maybe try to build consume queue repeatedly maxPhysicOffset={} phyOffset={}", maxPhysicOffset, offset);
        return true;
    }

    this.byteBufferIndex.flip();
    this.byteBufferIndex.limit(CQ_STORE_UNIT_SIZE);
    // Put the data into the buffer. This is the content of one index unit, 20 bytes in total
    this.byteBufferIndex.putLong(offset);
    this.byteBufferIndex.putInt(size);
    this.byteBufferIndex.putLong(tagsCode);

    // CQ_STORE_UNIT_SIZE = 20 bytes
    final long expectLogicOffset = cqOffset * CQ_STORE_UNIT_SIZE;

    // Get the ConsumeQueue file for the index unit's logical offset
    MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile(expectLogicOffset);
    if (mappedFile != null) {

        if (mappedFile.isFirstCreateInQueue() && cqOffset != 0 && mappedFile.getWrotePosition() == 0) {
            // Minimum offset in the queue
            this.minLogicOffset = expectLogicOffset;
            this.mappedFileQueue.setFlushedWhere(expectLogicOffset);
            this.mappedFileQueue.setCommittedWhere(expectLogicOffset);
            this.fillPreBlank(mappedFile, expectLogicOffset);
            log.info("fill pre blank space " + mappedFile.getFileName() + " " + expectLogicOffset + " "
                + mappedFile.getWrotePosition());
        }

        // ...

        // The ConsumeQueue records the maximum commitlog physical offset it has indexed
        this.maxPhysicOffset = offset + size;
        // Write the index unit into the FileChannel
        return mappedFile.appendMessage(this.byteBufferIndex.array());
    }
    return false;
}

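To summarize the write process:

Write the message's physical offset (commitlog offset), the message size (msgSize), and the hash code of the message tag into the index unit, in that order.

Compute the logical offset as (consumequeue offset * 20), and use it to obtain the latest MappedFile of the ConsumeQueue. (Each consumequeue index unit is a fixed 20 bytes.)

The consumequeue offset was already computed when the message was written to the commitlog. At dispatch time it is simply read, and the index content is appended sequentially to the consumequeue file.

Record the maximum commitlog physical offset in the ConsumeQueue.
Write the index unit into the FileChannel, where it waits to be flushed to disk.

When the DefaultMessageStore object starts, its internal FlushConsumeQueueService also starts. This is the flush service for the ConsumeQueue, and it runs once every 1 s.
Default storage path: $home/store/consumequeue/{topic}/{queueid}/{fileName}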
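The write steps above come down to simple arithmetic on fixed-size units. The sketch below encodes one index unit and maps a consumequeue offset to a file and a position inside that file. The names are hypothetical, and the file size of 300,000 entries per ConsumeQueue file is an assumption about the default configuration rather than something shown in the code above.

```java
import java.nio.ByteBuffer;

final class ConsumeQueueWriteDemo {
    static final int CQ_STORE_UNIT_SIZE = 20;
    // Assumption: 300,000 index units per ConsumeQueue file (about 5.72 MB).
    static final int FILE_SIZE = 300_000 * CQ_STORE_UNIT_SIZE;

    /** Builds one 20-byte index unit in the order offset, size, tag hash code. */
    static byte[] encode(long commitLogOffset, int msgSize, long tagsCode) {
        ByteBuffer unit = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE);
        unit.putLong(commitLogOffset);
        unit.putInt(msgSize);
        unit.putLong(tagsCode);
        return unit.array();
    }

    /** Logical byte offset of entry number cqOffset within the whole queue. */
    static long expectLogicOffset(long cqOffset) {
        return cqOffset * CQ_STORE_UNIT_SIZE;
    }

    /** Which file of the queue (0-based) holds the entry. */
    static long fileIndex(long cqOffset) {
        return expectLogicOffset(cqOffset) / FILE_SIZE;
    }

    /** Byte position of the entry inside its file. */
    static long positionInFile(long cqOffset) {
        return expectLogicOffset(cqOffset) % FILE_SIZE;
    }

    public static void main(String[] args) {
        byte[] unit = encode(245L, 300, "TagA".hashCode());
        System.out.println("unit length=" + unit.length);
        long cqOffset = 300_001L;
        System.out.println("logical offset=" + expectLogicOffset(cqOffset)
            + ", file index=" + fileIndex(cqOffset)
            + ", position in file=" + positionInFile(cqOffset));
    }
}
```

Because the consumequeue offset is assigned when the message is written to the commitlog, the dispatcher never has to search for a free slot: the target position follows directly from the offset.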

2.2 Dispatching to the IndexFile

class CommitLogDispatcherBuildIndex implements CommitLogDispatcher {

    @Override
    public void dispatch(DispatchRequest request) {
        // This feature can be turned off; it is enabled by default
        if (DefaultMessageStore.this.messageStoreConfig.isMessageIndexEnable()) {
            DefaultMessageStore.this.indexService.buildIndex(request);
        }
    }
}

Let's continue into buildIndex:

public void buildIndex(DispatchRequest req) {
    // Try to get or create the IndexFile
    IndexFile indexFile = retryGetAndCreateIndexFile();
    if (indexFile != null) {
        long endPhyOffset = indexFile.getEndPhyOffset();
        DispatchRequest msg = req;
        String topic = msg.getTopic();
        String keys = msg.getKeys();

        // ... some code omitted ...

        // uniqKey is the message's unique ID (the UNIQ_KEY property), which the producer
        // client generates when sending, so it is normally present
        if (req.getUniqKey() != null) {
            // Core logic
            indexFile = putKey(indexFile, msg, buildKey(topic, req.getUniqKey()));
            if (indexFile == null) {
                log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                return;
            }
        }

        // keys are specified by the producer; if none are specified, no index entry is created for them
        if (keys != null && keys.length() > 0) {
            String[] keyset = keys.split(MessageConst.KEY_SEPARATOR);
            for (int i = 0; i < keyset.length; i++) {
                String key = keyset[i];
                if (key.length() > 0) {
                    // Core logic
                    indexFile = putKey(indexFile, msg, buildKey(topic, key));
                    if (indexFile == null) {
                        log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                        return;
                    }
                }
            }
        }
    } else {
        log.error("build index error, stop building index");
    }
}

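Here is a brief summary:

Get or create the IndexFile object.

Note: this method contains flush logic internally; an IndexFile is flushed to disk only once it is full.

Decide whether index entries need to be created.

Every message carries a unique ID (the uniqKey, generated by the producer client when the message is sent), and an index entry is built for it.
The producer can specify keys, and may specify several separated by spaces. If none are specified, no index entry is created for them. In practice keys are usually set, because consumers can use them to deduplicate messages and avoid duplicate consumption.

Call the putKey(...) logic of the IndexFile object.

The buildKey(topic, key) method returns (topic + "#" + key).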
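As a small illustration of the key handling summarized above, the sketch below lists the index keys that would be built for one message. IndexKeyDemo and indexKeys are hypothetical names; the separator is defined locally as a single space, matching the statement above that multiple keys are separated by spaces.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexKeyDemo {
    // Defined locally for illustration: multiple keys are separated by a space.
    static final String KEY_SEPARATOR = " ";

    static String buildKey(String topic, String key) {
        return topic + "#" + key;
    }

    /** Returns every index key for one message: the unique ID first, then each user key. */
    static List<String> indexKeys(String topic, String uniqKey, String keys) {
        List<String> result = new ArrayList<>();
        if (uniqKey != null) {
            result.add(buildKey(topic, uniqKey));
        }
        if (keys != null && keys.length() > 0) {
            for (String key : keys.split(KEY_SEPARATOR)) {
                if (key.length() > 0) { // skip empty pieces caused by repeated spaces
                    result.add(buildKey(topic, key));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical topic, unique ID, and business keys.
        System.out.println(indexKeys("OrderTopic", "UNIQ1", "order-1001 pay-77"));
        System.out.println(indexKeys("OrderTopic", "UNIQ2", null));
    }
}
```

Prefixing the key with the topic keeps identical business keys from different topics apart in the index.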

Next, let's look at the core logic in putKey:

public boolean putKey(final String key, final long phyOffset, final long storeTimestamp) {
    if (this.indexHeader.getIndexCount() < this.indexNum) {
        // Hash the key
        int keyHash = indexKeyHashMethod(key);
        // Work out which hash slot this key falls into (5,000,000 slots)
        int slotPos = keyHash % this.hashSlotNum;
        // Absolute slot position: fixed 40-byte IndexHeader + (slotPos * 4 bytes per slot)
        int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;

        FileLock fileLock = null;

        try {

            // fileLock = this.fileChannel.lock(absSlotPos, hashSlotSize,
            // false);
            // Read the value currently stored in the slot (slotValue)
            int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
            if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()) {
                slotValue = invalidIndex;
            }

            // storeTimestamp: the message's store time (when it was written to the commitlog)
            // this.indexHeader.getBeginTimestamp(): the time of the first message written to this IndexFile
            long timeDiff = storeTimestamp - this.indexHeader.getBeginTimestamp();

            // timeDiff / 1000 converts to seconds, so 4 bytes are enough
            timeDiff = timeDiff / 1000;

            if (this.indexHeader.getBeginTimestamp() <= 0) {
                timeDiff = 0;
            } else if (timeDiff > Integer.MAX_VALUE) {
                timeDiff = Integer.MAX_VALUE;
            } else if (timeDiff < 0) {
                timeDiff = 0;
            }

            // Compute the position of the index unit:
            // IndexHeader.INDEX_HEADER_SIZE is a fixed 40 bytes
            // this.hashSlotNum * hashSlotSize is 5,000,000 slots * 4 bytes per slot
            // indexCount * 20 bytes per index unit (indexCount is recorded in the IndexHeader)
            int absIndexPos =
                IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                    + this.indexHeader.getIndexCount() * indexSize;

            // Write the index unit fields in order:
            // the 4-byte key hash, the 8-byte physical offset of the message,
            // the 4-byte time difference between this message's store time and the first message in this IndexFile,
            // and the 4-byte previous value of this slot (the structure resembles a HashMap,
            // so this value is what resolves hash collisions)
            this.mappedByteBuffer.putInt(absIndexPos, keyHash);
            this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
            this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
            this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);

            // The slot now stores the current indexCount, i.e. the number of the newest index unit for this slot
            this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount());

            if (this.indexHeader.getIndexCount() <= 1) {
                // Record the physical offset of the first message written to this IndexFile in the IndexHeader
                this.indexHeader.setBeginPhyOffset(phyOffset);
                // Record the store time of the first message written to this IndexFile in the IndexHeader
                this.indexHeader.setBeginTimestamp(storeTimestamp);
            }

            if (invalidIndex == slotValue) {
                this.indexHeader.incHashSlotCount();
            }
            // indexCount++, tracked in the IndexHeader
            this.indexHeader.incIndexCount();
            // Record the physical offset of the last message written to this IndexFile in the IndexHeader
            this.indexHeader.setEndPhyOffset(phyOffset);
            // Record the store time of the last message written to this IndexFile in the IndexHeader
            this.indexHeader.setEndTimestamp(storeTimestamp);

            return true;
        } catch (Exception e) {
            log.error("putKey exception, Key: " + key + " KeyHashCode: " + key.hashCode(), e);
        } finally {
            if (fileLock != null) {
                try {
                    fileLock.release();
                } catch (IOException e) {
                    log.error("Failed to release the lock", e);
                }
            }
        }
    } else {
        log.warn("Over index file capacity: index count = " + this.indexHeader.getIndexCount()
            + "; index max num = " + this.indexNum);
    }

    return false;
}

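The logic here is considerably more complex than for the consumequeue, so let's first describe how an IndexFile is laid out. As the comments in putKey show, it consists of a fixed 40-byte IndexHeader, followed by the hash slots (4 bytes each), followed by the index units (20 bytes each).

I use a diagram to illustrate how the IndexFile builds its index: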
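The slot-and-chain structure is easier to follow with a tiny working model. The sketch below is an in-memory simplification, not the real IndexFile: MiniIndexFile is a hypothetical class, the header is left empty, the slot count is tiny so that collisions are easy to provoke, and the hash helper is an assumed stand-in for indexKeyHashMethod. It keeps the same layout arithmetic as putKey (40-byte header, 4-byte slots, 20-byte index units) and adds a lookup that walks the chain.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class MiniIndexFile {
    static final int HEADER_SIZE = 40;
    static final int SLOT_SIZE = 4;
    static final int INDEX_SIZE = 20;
    static final int INVALID_INDEX = 0;

    private final int slotNum;
    private final int indexNum;
    private final ByteBuffer buf;
    // Entry numbers start at 1 so that 0 can mean "empty slot" / "end of chain".
    private int indexCount = 1;

    MiniIndexFile(int slotNum, int indexNum) {
        this.slotNum = slotNum;
        this.indexNum = indexNum;
        this.buf = ByteBuffer.allocate(HEADER_SIZE + slotNum * SLOT_SIZE + indexNum * INDEX_SIZE);
    }

    /** Non-negative hash of the key (assumed stand-in for indexKeyHashMethod). */
    static int hash(String key) {
        int h = Math.abs(key.hashCode());
        return h < 0 ? 0 : h; // Math.abs(Integer.MIN_VALUE) is still negative
    }

    boolean putKey(String key, long phyOffset, int timeDiffSeconds) {
        if (indexCount >= indexNum) {
            return false; // file is full
        }
        int keyHash = hash(key);
        int absSlotPos = HEADER_SIZE + (keyHash % slotNum) * SLOT_SIZE;
        int slotValue = buf.getInt(absSlotPos); // previous head of this slot's chain
        int absIndexPos = HEADER_SIZE + slotNum * SLOT_SIZE + indexCount * INDEX_SIZE;

        buf.putInt(absIndexPos, keyHash);
        buf.putLong(absIndexPos + 4, phyOffset);
        buf.putInt(absIndexPos + 12, timeDiffSeconds);
        buf.putInt(absIndexPos + 16, slotValue); // link to the previous entry
        buf.putInt(absSlotPos, indexCount);      // the slot now points at the newest entry
        indexCount++;
        return true;
    }

    /** Follows the chain from the slot and returns matching commitlog offsets, newest first. */
    List<Long> lookup(String key) {
        List<Long> offsets = new ArrayList<>();
        int keyHash = hash(key);
        int next = buf.getInt(HEADER_SIZE + (keyHash % slotNum) * SLOT_SIZE);
        while (next != INVALID_INDEX) {
            int absIndexPos = HEADER_SIZE + slotNum * SLOT_SIZE + next * INDEX_SIZE;
            if (buf.getInt(absIndexPos) == keyHash) {
                offsets.add(buf.getLong(absIndexPos + 4));
            }
            next = buf.getInt(absIndexPos + 16);
        }
        return offsets;
    }

    public static void main(String[] args) {
        MiniIndexFile file = new MiniIndexFile(1, 8); // one slot: every key collides
        file.putKey("OrderTopic#a", 100L, 0);
        file.putKey("OrderTopic#b", 200L, 1);
        file.putKey("OrderTopic#a", 300L, 2);
        System.out.println(file.lookup("OrderTopic#a"));
        System.out.println(file.lookup("OrderTopic#b"));
    }
}
```

Only the hash is stored, not the key itself, so two different keys with the same hash would both match in a lookup like this one; a caller has to compare the key of the message it finally reads from the commitlog.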

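Personally, I think that although the IndexFile index is more complex than the ConsumeQueue, a general understanding of it is enough: it plays a far smaller role than the ConsumeQueue, so the ConsumeQueue index is the one to focus on.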

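3. Summary

This article walked through the Broker's message dispatch process. Dispatch means writing message information into the ConsumeQueue and the IndexFile to build the message indexes, so that consumers can consume messages quickly.
Message dispatch runs on a dedicated asynchronous thread, ReputMessageService. This thread continuously reads messages from the commitlog and writes each message's physical offset (which locates the physical message) into the ConsumeQueue and the IndexFile.

That's all for message dispatch.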

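Author: 秋官
Link: https://juejin.cn/post/7098545135306145823
Source: 稀土掘金 (Juejin)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.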