Converting dist_train.sh into a launch.json to start the project

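This post shows how to use a launch.json file in VSCode, instead of a bash script, to start Python distributed training. It focuses on converting the environment variables set in the original bash script into JSON configuration so that the PYTHONPATH problem is resolved.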


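To debug in VSCode, you need a launch.json file to start the program, which means the variable settings in the original bash file have to be converted into JSON configuration.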

The original bash script is as follows:

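#!/usr/bin/env bash
# First positional argument: path to the training config file.
CONFIG=$1

# Prepend the parent of this script's directory to PYTHONPATH so the
# project's own packages can be imported, then launch train.py.
# "${@:3}" forwards every argument from the third one onward.
PYTHONPATH="$(dirname "$0")/..":$PYTHONPATH \
python -m torch.distributed.launch \
    "$(dirname "$0")/train.py" \
    "$CONFIG" \
    --seed 0 \
    --launcher pytorch "${@:3}"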
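To see what the PYTHONPATH line in the script evaluates to, the following Python sketch mimics the shell expression "$(dirname $0)/..":$PYTHONPATH. The function name script_pythonpath and the example script location /root/OccFormer/tools/dist_train.sh are assumptions made for illustration; the script itself only shows that the added directory is the parent of the directory holding the script.

```python
import posixpath


def script_pythonpath(script_path, existing=""):
    """Mimic PYTHONPATH="$(dirname $0)/..":$PYTHONPATH for a given script path."""
    # $(dirname $0) is the directory holding the script; "/.." moves to its parent.
    parent = posixpath.normpath(
        posixpath.join(posixpath.dirname(script_path), "..")
    )
    # The shell always inserts ":" between the new entry and the old value;
    # this sketch drops the separator when the old value is empty.
    return parent if not existing else parent + ":" + existing


if __name__ == "__main__":
    # Hypothetical script location and existing value, for illustration only.
    print(script_pythonpath("/root/OccFormer/tools/dist_train.sh", "/opt/extra"))
```

Under these assumptions the added entry is the project root, which is the directory a launch.json has to put on PYTHONPATH as well.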

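The launch.json is shown below. The program entry points to launch.py, which corresponds to python -m torch.distributed.launch. The env entry must set PYTHONPATH; otherwise Python will fail to find the project's modules.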

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "/root/miniconda3/envs/occformer/lib/python3.7/site-packages/torch/distributed/launch.py",
            "console": "integratedTerminal",
            "args": [
                "tools/train.py",
                "--config",
                "/root/OccFormer/projects/configs/occformer_nusc/occformer_nusc_panoptic_r50_256x704.py"
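            ],
            // launch.json does not run shell substitutions such as $(dirname $0).
            // ${workspaceFolder} is assumed here to be the project root.
            "env": {"PYTHONPATH": "${workspaceFolder}:${env:PYTHONPATH}"}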
        }
    ]
}
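As an alternative to hard-coding the path of launch.py, Python launch configurations in VSCode also accept a module entry, which plays the role of python -m. The sketch below builds such a configuration as a dictionary and prints it as JSON. The function name build_launch_config, the configuration name, and the use of ${workspaceFolder} as the project root are assumptions. The arguments mirror the bash script (config passed positionally, then --seed 0 and --launcher pytorch) rather than the --config form used in the launch.json above, so adjust them to whatever your train.py actually accepts.

```python
import json


def build_launch_config(config_path, train_script="tools/train.py"):
    """Build one VS Code launch configuration mirroring the bash script."""
    return {
        "name": "Python: torch.distributed.launch",  # hypothetical label
        "type": "python",
        "request": "launch",
        # "module" corresponds to `python -m torch.distributed.launch`.
        "module": "torch.distributed.launch",
        "console": "integratedTerminal",
        "args": [train_script, config_path, "--seed", "0", "--launcher", "pytorch"],
        # Assumes the workspace folder opened in VS Code is the project root.
        "env": {"PYTHONPATH": "${workspaceFolder}:${env:PYTHONPATH}"},
    }


if __name__ == "__main__":
    cfg = build_launch_config(
        "projects/configs/occformer_nusc/occformer_nusc_panoptic_r50_256x704.py"
    )
    # Wrap the single configuration in the top-level launch.json structure.
    print(json.dumps({"version": "0.2.0", "configurations": [cfg]}, indent=4))
```

The printed JSON can be pasted into .vscode/launch.json. Because no interpreter-specific path appears in it, the same file keeps working if the conda environment is moved or recreated.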
