
(fast-reid) Fixing "RuntimeError: Address already in use" in multi-GPU training

程序员文章站 2022-06-11 15:46:46

https://github.com/JDAI-CV/fast-reid

0. Environment

ubuntu16.04
cuda9.0
python3.6
torch==1.1.0
torchvision==0.3.0
Cython
yacs
tensorboard
future
termcolor
sklearn
tqdm
opencv-python==4.1.0.25
matplotlib
scikit-image
faiss-gpu==1.6.3
tabulate
gdown

1. Error when starting multi-GPU training more than once

RuntimeError: Address already in use
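This message usually means that the TCP port the distributed job wants to bind is already held by another process, for example an earlier training job that is still running. A minimal sketch for checking a port before launching, assuming a local address; `port_in_use` is a hypothetical helper and not part of fast-reid:

```python
import errno
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # a different problem, e.g. missing permissions
        return False


if __name__ == "__main__":
    # 50001 is only an example port; check whichever port you plan to use.
    print(port_in_use(50001))
```

If the check returns True, either stop the process holding the port or pick a different one.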


2. Solution

usage: train_net.py [-h] [--config-file FILE] [--resume] [--eval-only]
                    [--num-gpus NUM_GPUS] [--num-machines NUM_MACHINES]
                    [--machine-rank MACHINE_RANK] [--dist-url DIST_URL]


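Pass the --dist-url argument and give every job its own address (a different port for each run), so that a new job does not try to bind a port that another job still holds.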

CUDA_VISIBLE_DEVICES='0,1' python ./tools/train_net.py \
 --config-file ./projects/DistillReID/configs-bagtricks-ibn-market1501/bagtricks_R50-ibn.yml \
 --num-gpus 2 --dist-url tcp://127.0.0.1:50001
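Rather than choosing a port by hand each time, the operating system can be asked for a free one. The following is a sketch only: `find_free_port` and `make_dist_url` are hypothetical helper names, and it assumes a single-machine run on 127.0.0.1. Another process could still take the port between this check and the start of training, so treat the result as a good guess rather than a guarantee.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is currently unused on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def make_dist_url(host="127.0.0.1"):
    """Build a value in the tcp://host:port form used by --dist-url."""
    return "tcp://{}:{}".format(host, find_free_port(host))


if __name__ == "__main__":
    # Print the URL so it can be passed to train_net.py via --dist-url.
    print(make_dist_url())
```

Saved as a script, its printed output can be substituted for the fixed tcp://127.0.0.1:50001 address in the command above.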

Reference

1. AdelaiDet/issues/149
