[Paper Walkthrough] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

micaudience

Last modified 2025-06-30 15:07:31


Category: Computer Vision. Tags: deep learning, artificial intelligence

First published 2025-06-30 15:06:56

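Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.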

Article link: https://blog.csdn.net/micaudience/article/details/146514375


PointNet

Paper: PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
https://arxiv.org/pdf/1612.00593
Stanford University

Defining a 3D Point Cloud

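Point cloud: a set of 3D points $\{P_i \mid i=1,...,n\}$, where each point $P_i$ consists of its coordinates $(x, y, z)$ plus optional extra feature channels. Unless otherwise noted, the PointNet paper uses only the coordinates.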

Properties of 3D Point Clouds

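Unordered: the points have no inherent order, so the model's output should not change when the input order changes.
Interaction among points: the points are not isolated; they are related through spatial distance, so the model needs to capture local structure among neighboring points.
Invariance under transformations: transformations such as rotating or translating all points together should change neither the category of the cloud nor the segmentation of its points.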

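The way the paper writes this passage is well worth learning from: it concisely describes the properties of point cloud data and, in the same breath, states what those properties require of a point cloud model.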

PointNet Architecture

The key modules are:

the max pooling layer as a symmetric function to aggregate information from all the points
a local and global information combination structure
two joint alignment networks that align both input points and point features

Symmetry Function for Unordered Input

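To make the model invariant to input permutation (the result does not change with the order in which the points are fed in), there are generally three options:

Sort the input into a canonical order
Treat the input as a sequence and train on data augmented with all kinds of permutations
Use a symmetric function to aggregate the information from every point, for example addition or multiplication
The paper takes the third approach and approximates a symmetric function with a neural network. Specifically:
$f(\{x_1,...,x_n\})\approx g(h(x_1), ..., h(x_n))$
where $h$ is an MLP and $g$ is the composition of a single-variable function and a max pooling function.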

This module addresses the unordered property described earlier.
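
To make the formula $f(\{x_1,...,x_n\})\approx g(h(x_1), ..., h(x_n))$ concrete, here is a minimal NumPy sketch, not the paper's implementation. The layer sizes (3 to 16 to 32), the random untrained weights, and the names `h` and `global_feature` are assumptions made purely for illustration, and `g` is reduced to the max pooling step alone. It checks that shuffling the points leaves the pooled feature unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared per-point MLP h: 3 -> 16 -> 32 with ReLU.
# Weights are random and untrained; only the structure matters here.
W1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 32)), np.zeros(32)

def h(points):
    """Apply the same MLP to every point independently: (n, 3) -> (n, 32)."""
    hidden = np.maximum(points @ W1 + b1, 0.0)
    return np.maximum(hidden @ W2 + b2, 0.0)

def global_feature(points):
    """g reduced to max pooling: channel-wise max over the point axis."""
    return h(points).max(axis=0)

points = rng.normal(size=(128, 3))                 # toy cloud with n = 128
shuffled = points[rng.permutation(len(points))]    # same points, new order

feat_a = global_feature(points)
feat_b = global_feature(shuffled)
same = bool(np.allclose(feat_a, feat_b))
print(feat_a.shape, same)
```

Because `h` sees each point in isolation and the maximum over a set does not depend on the order of its elements, the result is permutation invariant by construction rather than by training.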

Local and Global Information Aggregation

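This mainly refers to the Segmentation Network part of the architecture: once the global feature has been computed, it is combined with the per-point local features obtained in an earlier step. Concretely, the global feature (1 x 1024) is tiled to (n x 1024) and concatenated with the earlier per-point features (n x 64) to give (n x 1088), from which an MLP then extracts new per-point features.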

This module mainly addresses the interaction among points property.
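
The shape bookkeeping of the tile-and-concatenate step can be sketched in NumPy as follows. The arrays are random stand-ins for illustration only (in PointNet the global feature comes from max pooling, and `n = 5` is an arbitrary choice); only the shapes follow the description above.

```python
import numpy as np

n = 5                                           # number of points (arbitrary)
rng = np.random.default_rng(0)

local_feats = rng.normal(size=(n, 64))          # stand-in per-point features
global_feat = rng.normal(size=(1, 1024))        # stand-in global feature

# Repeat the single global feature vector once per point: (1, 1024) -> (n, 1024).
tiled = np.tile(global_feat, (n, 1))

# Concatenate along the channel axis: (n, 64) + (n, 1024) -> (n, 1088).
combined = np.concatenate([local_feats, tiled], axis=1)
print(combined.shape)
```

Every row of `combined` now carries both that point's own 64 channels and an identical copy of the global feature, which is what lets the subsequent per-point MLP make predictions that depend on the whole shape.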

Joint Alignment Network

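Both alignment networks use the T-Net architecture: a mini-network predicts a transformation matrix that is applied to the input points (3 x 3) or to the point features (64 x 64). Because the feature transformation matrix has a much higher dimension than the spatial one and is therefore harder to optimize, a regularization loss is added that constrains it to be close to an orthogonal matrix. An orthogonal transformation does not lose information in its input, which is why it is the desired form.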
Regularization loss: $L_{reg}=\|I-AA^T\|^2_F$, where $A$ is the feature alignment matrix predicted by the mini-network.

I am still unsure about parts of this section; corrections from readers are welcome.
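
As a small aid to intuition, the regularization term $\|I-AA^T\|^2_F$ can be evaluated directly in NumPy. This is only a sketch: the function name `orthogonality_loss` is hypothetical, and the matrices are built by hand (a QR factorization gives an orthogonal `Q`) instead of being predicted by a T-Net.

```python
import numpy as np

def orthogonality_loss(A):
    """Squared Frobenius norm of I - A A^T; zero exactly when A is orthogonal."""
    k = A.shape[0]
    diff = np.eye(k) - A @ A.T
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(0)

# QR factorization of a random square matrix yields an orthogonal Q.
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))

loss_orth = orthogonality_loss(Q)           # numerically close to 0
loss_scaled = orthogonality_loss(2.0 * Q)   # I - 4I = -3I, so about 9 * 64 = 576

# An orthogonal transform preserves the length of every feature vector.
feats = rng.normal(size=(10, 64))
norms_kept = bool(np.allclose(np.linalg.norm(feats @ Q, axis=1),
                              np.linalg.norm(feats, axis=1)))
print(loss_orth, loss_scaled, norms_kept)
```

The loss is near zero for the orthogonal matrix and grows once the matrix starts rescaling its input, while the norm check illustrates one sense in which an orthogonal feature transform keeps the information in the features intact.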

PointNet was followed by a series of well-known works, such as PointNet++ and Point Transformer V1 to V3, and it has had a far-reaching influence on the field.
