
Paper Reading Notes
Average article quality score: 89
寻丶幽风
Articles in this column
Paper Reading Notes: Autoregressive Image Generation without Vector Quantization
Reading notes on Autoregressive Image Generation without Vector Quantization. Original · 2025-07-03 22:14:12 · 833 views · 0 comments
Paper Reading Notes: Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Reading notes on Harmonizing Visual Representations for Unified Multimodal Understanding and Generation. Original · 2025-07-03 17:17:19 · 218 views · 0 comments
Paper Reading Notes: VGGT: Visual Geometry Grounded Transformer
Reading notes on VGGT: Visual Geometry Grounded Transformer. Original · 2025-07-02 14:41:12 · 836 views · 0 comments
Paper Reading Notes: NoPoSplat
Reading notes on NO POSE, NO PROBLEM: SURPRISINGLY SIMPLE 3D GAUSSIAN SPLATS FROM SPARSE UNPOSED IMAGES. Original · 2025-07-01 16:04:54 · 880 views · 0 comments
Paper Reading Notes: ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Reading notes on ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback. Original · 2025-06-09 10:58:24 · 695 views · 0 comments
Paper Reading Notes: FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
Reading notes on FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space. Original · 2025-06-02 15:33:06 · 1477 views · 0 comments
Paper Reading Notes: MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
Reading notes on MetaMorph: Multimodal Understanding and Generation via Instruction Tuning. Original · 2025-05-30 19:23:22 · 1017 views · 0 comments
Paper Reading Notes: In-Context Edit
Reading notes on In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer. Original · 2025-05-28 15:44:52 · 1391 views · 0 comments
Paper Reading Notes: Step1X-Edit: A Practical Framework for General Image Editing
Reading notes on Step1X-Edit: A Practical Framework for General Image Editing. Original · 2025-05-27 23:47:46 · 1366 views · 0 comments
Paper Reading Notes: Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing
Reading notes on Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing. Original · 2025-05-27 22:42:36 · 1002 views · 0 comments
Paper Reading Notes: ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Reading notes on ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision. Original · 2025-05-26 20:51:25 · 256 views · 0 comments
Paper Reading Notes: Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Reading notes on Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model. Original · 2025-05-26 16:26:01 · 944 views · 0 comments
Paper Reading Notes: Janus, Janus Pro
Reading notes on Janus and Janus Pro. Original · 2025-05-25 18:40:34 · 1400 views · 0 comments
Paper Reading Notes: Emerging Properties in Unified Multimodal Pretraining
Reading notes on Emerging Properties in Unified Multimodal Pretraining. Original · 2025-05-24 19:08:26 · 1190 views · 0 comments
Paper Reading Notes: PixArt-α, PixArt-δ
Reading notes on PixArt-α and PixArt-δ. Original · 2025-05-22 20:15:07 · 1036 views · 0 comments
Paper Reading Notes: ROBOGROUND: Robotic Manipulation with Grounded Vision-Language Priors
Reading notes on ROBOGROUND: Robotic Manipulation with Grounded Vision-Language Priors. Original · 2025-05-06 23:24:24 · 1347 views · 0 comments
Paper Reading Notes: STDArm
Reading notes on STDArm: Transferring Visuomotor Policies From Static Data Training to Dynamic Robot Manipulation. Original · 2025-05-04 11:26:26 · 1581 views · 0 comments
Paper Reading Notes: TesserAct: Learning 4D Embodied World Models
Reading notes on TesserAct: Learning 4D Embodied World Models. Original · 2025-05-02 13:08:02 · 1549 views · 0 comments
Paper Reading Notes: Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction
Reading notes on Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction. Original · 2025-04-30 10:32:22 · 874 views · 1 comment
Paper Reading Notes: ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping
Reading notes on ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping. Original · 2025-04-25 16:59:39 · 1139 views · 0 comments
Paper Reading Notes: π0.5: a Vision-Language-Action Model with Open-World Generalization
Reading notes on π0.5: a Vision-Language-Action Model with Open-World Generalization. Original · 2025-04-24 10:04:09 · 1782 views · 0 comments
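Paper Reading Notes: A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Reading notes on A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation. Its core innovation is decomposing a task into **high-level spatial affordance reasoning** and **low-level action execution**. A cross-platform **Embodiment-Agnostic Affordance Representation** predicts object-centric contact points and trajectories, which lets the approach generalize across multiple robot systems. Original · 2025-04-21 12:00:00 · 1332 views · 1 comment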
Paper Reading Notes: OPAL: Encoding Causal Understanding of Physical Systems for Robot Learning
Reading notes on OPAL: Encoding Causal Understanding of Physical Systems for Robot Learning. Original · 2025-04-19 14:00:55 · 1063 views · 1 comment
Paper Reading Notes: RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete
Reading notes on RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Original · 2025-04-17 12:14:54 · 614 views · 0 comments
Paper Reading Notes: Generating Long Sequences with Sparse Transformers
Reading notes on Generating Long Sequences with Sparse Transformers. Original · 2025-04-14 19:00:00 · 1318 views · 0 comments
Paper Reading Notes: Reactive Diffusion Policy
Reading notes on Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation. Original · 2025-04-13 14:57:04 · 1118 views · 0 comments
Paper Reading Notes: Multi-Token Attention
Reading notes on Multi-Token Attention. Original · 2025-04-12 21:00:00 · 1412 views · 0 comments
Paper Reading Notes: GPT-1, GPT-2, GPT-3, InstructGPT
Reading notes on GPT-1, GPT-2, GPT-3, and InstructGPT. Original · 2025-04-09 12:00:00 · 1098 views · 0 comments
Paper Reading Notes: Deformable Radial Kernel Splatting
Reading notes on Deformable Radial Kernel Splatting. Original · 2025-04-06 11:45:13 · 1182 views · 1 comment
Paper Reading Notes: RDT-1B: A DIFFUSION FOUNDATION MODEL FOR BIMANUAL MANIPULATION
Reading notes on RDT-1B: A DIFFUSION FOUNDATION MODEL FOR BIMANUAL MANIPULATION. Original · 2025-04-05 16:54:44 · 1338 views · 0 comments
Paper Reading Notes: SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model
Reading notes on SpatialVLA. Original · 2025-04-01 12:36:27 · 1184 views · 0 comments
Paper Reading Notes: PointVLA: Injecting the 3D World into Vision-Language-Action Models
Reading notes on PointVLA. Original · 2025-03-30 23:19:05 · 546 views · 0 comments
Paper Reading Notes: ReconDreamer
Reading notes on ReconDreamer. Original · 2025-03-29 17:41:16 · 1394 views · 0 comments
Paper Reading Notes: ST-4DGS, WideRange4D
Reading notes on ST-4DGS and WideRange4D. Original · 2025-03-27 20:42:20 · 1413 views · 0 comments
Paper Reading Notes: Diffuser, Diffusion Policy
Reading notes on Diffuser and Diffusion Policy. Original · 2025-03-26 12:00:00 · 1201 views · 0 comments
Paper Reading Notes: MTGS: Multi-Traversal Gaussian Splatting
Reading notes on MTGS: Multi-Traversal Gaussian Splatting. Original · 2025-03-23 19:00:00 · 1024 views · 0 comments
Paper Reading Notes: MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction
Reading notes on MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction. Original · 2025-03-22 21:00:00 · 899 views · 0 comments
Paper Reading Notes: MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes
Reading notes on MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes. Original · 2025-03-20 12:00:00 · 928 views · 0 comments
Paper Reading Notes: MAGICDRIVE: STREET VIEW GENERATION WITH DIVERSE 3D GEOMETRY CONTROL
Reading notes on MAGICDRIVE: STREET VIEW GENERATION WITH DIVERSE 3D GEOMETRY CONTROL. Original · 2025-03-19 12:00:00 · 1303 views · 0 comments
Paper Reading Notes: OpenVLA: An Open-Source Vision-Language-Action Model
Reading notes on OpenVLA: An Open-Source Vision-Language-Action Model. Original · 2025-03-09 13:25:19 · 1367 views · 0 comments