(fast-reid) Fixing "RuntimeError: Address already in use" in multi-GPU training
程序员文章站
2022-06-11 15:46:46
https://github.com/JDAI-CV/fast-reid
0. Environment
ubuntu16.04
cuda9.0
python3.6
torch==1.1.0
torchvision==0.3.0
Cython
yacs
tensorboard
future
termcolor
sklearn
tqdm
opencv-python==4.1.0.25
matplotlib
scikit-image
faiss-gpu==1.6.3
tabulate
gdown
1. Error when launching multi-GPU training more than once
RuntimeError: Address already in use
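This message generally means that the TCP port used for the distributed rendezvous is still bound by another process, for example a training job that is still running or has not fully exited. A minimal sketch for checking this with the Python standard library follows; the helper name `port_is_free` is hypothetical and is not part of fast-reid.

```python
import errno
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if `port` on `host` can be bound right now, False if it is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the errno behind "Address already in use".
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error (e.g. permission denied) is not handled here
        return True
```

Calling `port_is_free(50001)` before starting a run tells you whether that port is available at that moment. The answer can change between the check and the launch, so treat it as a diagnostic only.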
2. Solution
usage: train_net.py [-h] [--config-file FILE] [--resume] [--eval-only]
[--num-gpus NUM_GPUS] [--num-machines NUM_MACHINES]
[--machine-rank MACHINE_RANK] [--dist-url DIST_URL]
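Use the --dist-url argument to set the address explicitly. As long as each run uses a different address (in practice, a different port), the error goes away.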
CUDA_VISIBLE_DEVICES='0,1' python ./tools/train_net.py \
--config-file ./projects/DistillReID/configs-bagtricks-ibn-market1501/bagtricks_R50-ibn.yml \
--num-gpus 2 --dist-url tcp://127.0.0.1:50001
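Instead of picking a new port by hand for every run, you can ask the operating system for an unused one. The following is a minimal sketch using only the Python standard library; `find_free_port` and `make_dist_url` are hypothetical helper names, not part of fast-reid, and the resulting string is meant to be passed to --dist-url.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port on `host` that is unused at this moment."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def make_dist_url(host="127.0.0.1"):
    """Build a value for --dist-url, e.g. tcp://127.0.0.1:<port>."""
    return "tcp://{}:{}".format(host, find_free_port(host))


if __name__ == "__main__":
    # Print the URL so that a shell script can capture it.
    print(make_dist_url())
```

If this is saved as, say, free_port.py (an assumed file name), the command above could end with `--dist-url $(python free_port.py)`. The port is released before training starts, so another process could in principle take it in between; for occasional manual launches this is rarely a problem.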