YOLOv8 + DeepSORT Implementation

Original article, last modified 2024-08-12 09:50:26

License: CC 4.0 BY-SA

Tags:

#YOLO #ComputerVision #ArtificialIntelligence #VisualInspection

First published 2024-08-12 09:44:49

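1, Introduction to YOLOv8

YOLOv8 is a recent-generation object detection algorithm developed by Ultralytics and a major update to the YOLO series. It supports several vision tasks, including image classification, object detection, and instance segmentation. Building on the strengths of earlier YOLO models, it further optimizes speed and accuracy, giving faster inference and higher detection accuracy.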

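Core features of YOLOv8 include:

Network architecture: a lightweight design whose optimized structure, built around C2f modules, reduces redundant computation.
Loss function: a multi-task loss that combines classification loss and localization loss. The localization term includes an IOU-based loss, which helps with overlapping targets.
Data augmentation: training applies several augmentation techniques, such as random cropping, rotation, and scaling, which improves generalization and robustness.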

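YOLOv8 has broad practical use, with prospects in security surveillance, autonomous driving, smart homes, and similar fields. Its open-source library is also positioned as an algorithm framework with good extensibility: it can be used not only for YOLO-series models but also for non-YOLO models, and for tasks such as classification, segmentation, and pose estimation.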

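YOLOv8's novelty lies in combining several current SOTA techniques, including a new backbone network, an Anchor-Free detection head, and a new loss function, and it can run on a variety of hardware platforms. The backbone replaces the C3 module with the C2f module, which enriches gradient flow and improves model performance and convergence speed. The head uses a decoupled structure that separates the classification branch from the box regression branch, and it moves from Anchor-Based to Anchor-Free.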

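The training strategy is also improved: the total number of training epochs rises from 300 to 500, which helps further improve model performance. YOLOv8 additionally introduces the TaskAlignedAssigner positive-sample assignment strategy and Distribution Focal Loss to optimize the loss computation.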

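In terms of performance, results on the COCO dataset show that YOLOv8 improves noticeably on YOLOv5 in accuracy, at the cost of more parameters and FLOPs. Even so, it retains high inference speed and is suitable for real-time object detection.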

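2, Introduction to DeepSORT

DeepSORT is a computer-vision object tracking algorithm that assigns a unique ID to each object and tracks it. It extends and improves SORT (Simple Online and Realtime Tracking), a lightweight tracker designed for real-time video streams. DeepSORT adds deep learning to strengthen SORT, with particular attention to keeping target identities consistent across frames.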

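1. SORT object tracking

SORT tracks objects with basic methods such as the Kalman filter and the Hungarian algorithm, and is reported to outperform many online trackers. It consists of the following four key components: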

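Detection: in the first step of the tracking pipeline, an object detector finds the targets to track in the current frame. Common detectors include Faster R-CNN and YOLO.
Estimation: detections are propagated from the current frame to the next, using a constant-velocity model to estimate each target's position in the next frame. When a detection is associated with a known target, the detected bounding box is used to update the target's state, including its velocity components, within the Kalman filter framework.
Data association: each target's predicted bounding box is compared with the detections to form a cost matrix, which holds the intersection-over-union (IOU) distance between every detection and every predicted bounding box of the known targets. The Hungarian algorithm then solves the assignment optimally so that detections are associated with the right targets. This helps handle occlusion and keeps target identities unique.
Creation and deletion of target IDs: the tracking module creates and destroys unique target identities (IDs). If the IOU between a detection and a target is below a predefined threshold (usually called IOUmin), the detection is not associated with that target, which signals an object that is not yet tracked. If a target is not detected for TLost consecutive frames, its track is terminated, where TLost is a configurable parameter. If the target reappears, tracking resumes under a new identity.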
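
The data association and IOUmin steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `iou` and `associate` and the default `iou_min=0.3` are hypothetical, and an exhaustive search over permutations stands in for the Hungarian algorithm. It finds the same minimum-cost assignment but is only practical for a handful of boxes.

```python
from itertools import permutations


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(predicted, detections, iou_min=0.3):
    """Return (matches, unmatched_tracks, unmatched_detections).

    The cost of a pair is 1 - IOU. The matrix is padded to a square with
    cost 1.0 (no overlap) so that tracks or detections can stay unassigned.
    """
    n_t, n_d = len(predicted), len(detections)
    size = max(n_t, n_d)
    cost = [[1.0] * size for _ in range(size)]
    for t in range(n_t):
        for d in range(n_d):
            cost[t][d] = 1.0 - iou(predicted[t], detections[d])

    # Exhaustive search for the minimum total cost (stand-in for Hungarian)
    best = min(permutations(range(size)),
               key=lambda perm: sum(cost[t][perm[t]] for t in range(size)))

    # Keep only real pairs whose IOU reaches the iou_min threshold
    matches = [(t, best[t]) for t in range(n_t)
               if best[t] < n_d and cost[t][best[t]] <= 1.0 - iou_min]
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_tracks = [t for t in range(n_t) if t not in matched_t]
    unmatched_detections = [d for d in range(n_d) if d not in matched_d]
    return matches, unmatched_tracks, unmatched_detections
```

Unmatched detections would start new IDs, and unmatched tracks would count toward TLost.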

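3, Implementation Pipeline

1. Detection

In each frame, the object detector identifies and extracts bounding boxes (bbox) for the objects detected in that frame.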

def detect(self, cv_src):
    # Run the detector; it returns boxes, scores, and class indices
    boxes, scores, class_ids = self.detector(cv_src)

    pred_boxes = []
    for i in range(len(boxes)):
        # Box corners as integer pixel coordinates
        x1, y1 = int(boxes[i][0]), int(boxes[i][1])
        x2, y2 = int(boxes[i][2]), int(boxes[i][3])
        # class_names maps a class index to its label and is defined elsewhere
        lbl = class_names[class_ids[i]]
        # print(class_ids[i])

        # if lbl in ['person','sack','elec','bag','box','caron']:
        #     continue

        pred_boxes.append((x1, y1, x2, y2, lbl, class_ids[i]))

    return cv_src, pred_boxes

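2. Generate detections

From these bounding boxes, the tracker builds detection results called "detections". Each detection typically carries information about the target, such as its bounding-box coordinates and confidence score.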

#  deep_sort.py
def update(self, bbox_xywh, confidences, ori_img):
    self.height, self.width = ori_img.shape[:2]
    # Extract an appearance feature for each bbox
    features = self._get_features(bbox_xywh, ori_img)
    # [cx,cy,w,h] -> [x1,y1,w,h]
    bbox_tlwh = self._xywh_to_tlwh(bbox_xywh)
    # Keep only bboxes whose confidence exceeds self.min_confidence and build the detections
    detections = [Detection(bbox_tlwh[i], conf, features[i])
                  for i, conf in enumerate(confidences) if conf > self.min_confidence]
    # NMS (self.nms_max_overlap is 1 here, so all detections are kept)
    boxes = np.array([d.tlwh for d in detections])
    scores = np.array([d.confidence for d in detections])
    indices = non_max_suppression(boxes, self.nms_max_overlap, scores)
    detections = [detections[i] for i in indices]
    ...
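
The excerpt calls `_xywh_to_tlwh` without showing it. A minimal stand-alone sketch of that conversion and of the confidence filter might look like the following. The function names, the use of plain tuples instead of NumPy arrays, and the default `min_confidence=0.3` are assumptions made for illustration, not the repository's code.

```python
def xywh_to_tlwh(bbox_xywh):
    """Convert (cx, cy, w, h) boxes to (x1, y1, w, h), top-left based."""
    return [(cx - w / 2.0, cy - h / 2.0, w, h) for cx, cy, w, h in bbox_xywh]


def filter_by_confidence(boxes_tlwh, confidences, min_confidence=0.3):
    """Keep (box, confidence) pairs whose confidence exceeds the threshold."""
    return [(box, conf) for box, conf in zip(boxes_tlwh, confidences)
            if conf > min_confidence]


if __name__ == "__main__":
    boxes = xywh_to_tlwh([(50, 40, 20, 10), (10, 10, 4, 4)])
    # The second box is dropped because 0.2 is not above the threshold
    print(filter_by_confidence(boxes, [0.9, 0.2]))
```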

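3. Kalman filter prediction

For each known tracked object ("track"), a Kalman filter prediction step estimates its new position and velocity in the next frame.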

#  track.py
def predict(self, kf):
    """Propagate the state distribution to the current time step using a 
       Kalman filter prediction step.
    Parameters
    ----------
    kf: The Kalman filter.
    """
    self.mean, self.covariance = kf.predict(self.mean, self.covariance)  # prediction step
    self.age += 1  # total number of frames since this track first appeared
    self.time_since_update += 1  # number of frames since this track was last updated
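
To make the prediction step concrete, here is a simplified constant-velocity sketch for a single coordinate with state [position, velocity]. It is an assumption-laden illustration, not the filter used by the project: the real filter tracks a higher-dimensional box state, and the helper names plus the process noise `q` added to the diagonal are hypothetical.

```python
def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kf_predict(mean, cov, dt=1.0, q=0.01):
    """One prediction step: mean' = F mean, cov' = F cov F^T + Q.

    mean is [position, velocity], cov is a 2x2 covariance matrix,
    and Q = q * I is an assumed process noise.
    """
    F = [[1.0, dt], [0.0, 1.0]]  # constant-velocity transition matrix
    new_mean = [mean[0] + dt * mean[1], mean[1]]
    propagated = matmul(matmul(F, cov), transpose(F))
    new_cov = [[propagated[i][j] + (q if i == j else 0.0) for j in range(2)]
               for i in range(2)]
    return new_mean, new_cov


if __name__ == "__main__":
    mean, cov = kf_predict([10.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
    print(mean)  # the position advances by one step of velocity
    print(cov)   # the uncertainty grows until the next measurement
```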

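4. Match predicted tracks to current-frame detections with the Hungarian algorithm

This is the core step of DeepSORT, which uses the Hungarian algorithm to match predicted tracks against the detections of the current frame. The matching relies on two measures. The first is the Mahalanobis distance between a predicted object and a detected object: a pair is treated as a plausible match only if this distance is below a specified threshold. The second is the cosine distance between appearance features: a re-identification model produces a feature vector for each object, and a cosine-distance cost function measures how similar a predicted object and a detection look. Both costs are small when the bounding boxes are close and the features are similar, in which case the pair is matched as the same target.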

#  tracker.py
def _match(self, detections):
    def gated_metric(tracks, dets, track_indices, detection_indices):
        """
        Build the cost matrix between the Kalman-predicted tracks and the detections
        of the current frame, using appearance information and Mahalanobis distance.
        """
        features = np.array([dets[i].feature for i in detection_indices])
        targets = np.array([tracks[i].track_id for i in track_indices])
        # Cosine-distance cost matrix between tracks and detections (appearance)
        cost_matrix = self.metric.distance(features, targets)
        # Gate with the Mahalanobis distance: unsuitable entries are set to a large value
        cost_matrix = linear_assignment.gate_cost_matrix(
            self.kf, cost_matrix, tracks, dets, track_indices, detection_indices)
        return cost_matrix

    # Split the tracks into confirmed and unconfirmed
    confirmed_tracks = [i for i, t in enumerate(self.tracks) if t.is_confirmed()]
    unconfirmed_tracks = [i for i, t in enumerate(self.tracks) if not t.is_confirmed()]

    # Cascade matching on the confirmed tracks
    matches_a, unmatched_tracks_a, unmatched_detections = \
        linear_assignment.matching_cascade(
            gated_metric, self.metric.matching_threshold, self.max_age,
            self.tracks, detections, confirmed_tracks)

    # IOU matching on the unconfirmed tracks plus the tracks left unmatched by the
    # cascade whose time_since_update is 1
    iou_track_candidates = unconfirmed_tracks + [k for k in unmatched_tracks_a if
                                                 self.tracks[k].time_since_update == 1]
    unmatched_tracks_a = [k for k in unmatched_tracks_a if
                          self.tracks[k].time_since_update != 1]
    matches_b, unmatched_tracks_b, unmatched_detections = \
        linear_assignment.min_cost_matching(
            iou_matching.iou_cost, self.max_iou_distance, self.tracks,
            detections, iou_track_candidates, unmatched_detections)

    # Merge all matched pairs and all unmatched tracks
    matches = matches_a + matches_b
    unmatched_tracks = list(set(unmatched_tracks_a + unmatched_tracks_b))

    return matches, unmatched_tracks, unmatched_detections


# Cascade matching source code  linear_assignment.py
def matching_cascade(distance_metric, max_distance, cascade_depth, tracks, detections,
                     track_indices=None, detection_indices=None):
    ...
    unmatched_detections = detection_indices
    matches = []
    # Match the tracks level by level, from the smallest time_since_update upward
    for level in range(cascade_depth):
        # Stop once no detections remain
        if len(unmatched_detections) == 0:
            break
        # Indices of all tracks at the current level
        track_indices_l = [k for k in track_indices if
                           tracks[k].time_since_update == 1 + level]
        # No tracks at this level, move on to the next one
        if len(track_indices_l) == 0:
            continue

        # Hungarian matching
        matches_l, _, unmatched_detections = min_cost_matching(
            distance_metric, max_distance, tracks, detections,
            track_indices_l, unmatched_detections)
        matches += matches_l
    unmatched_tracks = list(set(track_indices) - set(k for k, _ in matches))
    return matches, unmatched_tracks, unmatched_detections
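
The gated cost described in step 4 can be sketched without the tracker classes. The version below is a simplified illustration, assuming a diagonal covariance (so the squared Mahalanobis distance is a sum of per-dimension terms) and hypothetical function names. The gate value 9.4877 is the 0.95 quantile of the chi-square distribution with 4 degrees of freedom, and `INFTY_COST` is an arbitrary large constant marking a pair as inadmissible.

```python
import math

INFTY_COST = 1e5        # large cost assigned to gated-out pairs (assumed value)
CHI2_95_4DOF = 9.4877   # chi-square 0.95 quantile, 4 degrees of freedom


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def squared_mahalanobis_diag(measurement, mean, variances):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((m - mu) ** 2 / var
               for m, mu, var in zip(measurement, mean, variances))


def gated_cost(track_features, track_means, track_vars,
               det_features, det_boxes, gate=CHI2_95_4DOF):
    """Appearance cost matrix with motion gating (rows: tracks, cols: detections)."""
    cost = []
    for feat_t, mean, var in zip(track_features, track_means, track_vars):
        row = []
        for feat_d, box in zip(det_features, det_boxes):
            c = cosine_distance(feat_t, feat_d)
            # A detection too far from the predicted state cannot match this track
            if squared_mahalanobis_diag(box, mean, var) > gate:
                c = INFTY_COST
            row.append(c)
        cost.append(row)
    return cost
```

The resulting matrix is what an assignment solver would then minimize. Entries equal to `INFTY_COST` are discarded after the assignment.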

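5. Kalman filter update

After matching, DeepSORT uses the matched detections to update the state of each known track, such as its position and velocity. This helps keep tracking accurate and continuous.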
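
As a rough illustration of what such a measurement update does, the sketch below corrects a [position, velocity] state with a single position measurement. It is a simplified one-coordinate example under assumptions: the function name, the measurement noise `r`, and the position-only measurement model are hypothetical and not taken from the project.

```python
def kf_update(mean, cov, z, r=1.0):
    """Kalman measurement update for state [position, velocity].

    Only the position is measured (H = [1, 0]); r is the measurement variance.
    """
    innovation = z - mean[0]            # measured minus predicted position
    s = cov[0][0] + r                   # innovation variance
    k = [cov[0][0] / s, cov[1][0] / s]  # Kalman gain

    new_mean = [mean[0] + k[0] * innovation, mean[1] + k[1] * innovation]
    # cov' = (I - K H) cov
    new_cov = [
        [cov[0][0] - k[0] * cov[0][0], cov[0][1] - k[0] * cov[0][1]],
        [cov[1][0] - k[1] * cov[0][0], cov[1][1] - k[1] * cov[0][1]],
    ]
    return new_mean, new_cov


if __name__ == "__main__":
    # The detection at 13.0 pulls the predicted position 12.0 toward it
    # and shrinks the uncertainty
    print(kf_update([12.0, 2.0], [[2.0, 1.0], [1.0, 1.0]], z=13.0))
```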

def update(self, detections):
    """Perform measurement update and track management.
    Parameters
    ----------
    detections: List[deep_sort.detection.Detection]
                A list of detections at the current time step.
    """
    # Get the matched pairs, the unmatched tracks, and the unmatched detections
    matches, unmatched_tracks, unmatched_detections = self._match(detections)

    # Update each matched track with its corresponding detection
    for track_idx, detection_idx in matches:
        self.tracks[track_idx].update(self.kf, detections[detection_idx])

    # Mark each unmatched track as missed
    for track_idx in unmatched_tracks:
        self.tracks[track_idx].mark_missed()

    # Initialize a new track for each unmatched detection
    for detection_idx in unmatched_detections:
        self._initiate_track(detections[detection_idx])

    ...
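
The calls to `update`, `mark_missed`, and `is_confirmed` imply a small per-track state machine. The sketch below shows one plausible version of it, assuming three states and two thresholds: `n_init` consecutive hits to confirm a track and `max_age` missed frames before deleting it. The class name and the default values are illustrative assumptions.

```python
from enum import Enum


class TrackState(Enum):
    TENTATIVE = 1
    CONFIRMED = 2
    DELETED = 3


class TrackLifecycle:
    """Bookkeeping only; the Kalman state is left out of this sketch."""

    def __init__(self, n_init=3, max_age=30):
        self.state = TrackState.TENTATIVE
        self.hits = 1                 # the detection that created the track
        self.time_since_update = 0
        self.n_init = n_init
        self.max_age = max_age

    def predict(self):
        # Called once per frame before matching
        self.time_since_update += 1

    def update(self):
        # Called when a detection was matched to this track
        self.hits += 1
        self.time_since_update = 0
        if self.state is TrackState.TENTATIVE and self.hits >= self.n_init:
            self.state = TrackState.CONFIRMED

    def mark_missed(self):
        # Called when no detection was matched in the current frame
        if self.state is TrackState.TENTATIVE:
            self.state = TrackState.DELETED
        elif self.time_since_update > self.max_age:
            self.state = TrackState.DELETED

    def is_confirmed(self):
        return self.state is TrackState.CONFIRMED
```

With this scheme a tentative track disappears on its first miss, while a confirmed track survives short gaps of up to `max_age` frames.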

4, Code Implementation

First, download or clone the project from GitHub.

Repository: MuhammadMoinFaisal/YOLOv8-DeepSORT-Object-Tracking: YOLOv8 Object Tracking using PyTorch, OpenCV and DeepSORT (github.com)

Then set up the environment by following the README:

pip install -e '.[dev]'

Change into the detect directory:

cd ultralytics/yolo/v8/detect

Next, download the DeepSORT files from the link below:

https://drive.google.com/drive/folders/1kna8eWGrSfzaR6DtNJ8_GchGgPMv3VC8?usp=sharing

Then run:

python predict.py model=yolov8l.pt source="test3.mp4" show=True