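When training an mmseg project on multiple machines with multiple GPUs each, I used the following commands.
On the first machine:
NNODES=2 NODE_RANK=0 PORT=8888 MASTER_ADDR=192.168.XX.XX sh tools/dist_train.sh ./configs/temp.py 4
On the second machine:
NNODES=2 NODE_RANK=1 PORT=8888 MASTER_ADDR=192.168.XX.XX sh tools/dist_train.sh ./configs/temp.py 4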
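The run failed with the following error:
RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [::]:8888 (errno: 98 - Address already in use). The server socket has failed to bind to ?UNKNOWN? (errno: 98 - Address already in use).
The error message shows that port 8888 is already in use, so the fix is simply to change PORT to a different port number, for example 29050 or 29051. Both machines must be given the same new PORT value.
With that, the problem is solved!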
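Before picking a new PORT, it can help to check which ports are actually free on the machine that MASTER_ADDR points to. The following is only a minimal sketch using Python's standard socket module. The helper names port_is_free and first_free_port are hypothetical and not part of mmseg or PyTorch, and the check covers IPv4 TCP only, so treat a positive result as a hint rather than a guarantee.

```python
import socket


def port_is_free(port: int, host: str = "") -> bool:
    """Return True if an IPv4 TCP socket can bind to (host, port) right now.

    Hypothetical helper for illustration; an empty host means all interfaces.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically errno 98 (Address already in use) on Linux.
            return False
        return True


def first_free_port(start: int = 29050, count: int = 50) -> int:
    """Return the first bindable port in [start, start + count)."""
    for port in range(start, start + count):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + count - 1}")


if __name__ == "__main__":
    # 8888 is the port from the failing command above.
    print("8888 free:", port_is_free(8888))
    print("candidate PORT:", first_free_port())
```

Run this on the first machine (NODE_RANK=0), then pass the reported port as PORT in the command on both machines. Another process could still grab the port between the check and the start of training, so simply retrying with a different number remains a reasonable fallback.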
Source: https://blog.csdn.net/qq_38308388/article/details/128724358
Article title: [debug] mmseg multi-node multi-GPU training error: The server socket has failed to listen on any local network address.