栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 软件开发 > 后端开发 > Python

解决YOLOV5出现全为nan和0的问题

Python 更新时间: 发布时间: IT归档 最新发布 模块sitemap 名妆网 法律咨询 聚返吧 英语巴士网 伯小乐 网商动力

解决YOLOV5出现全为nan和0的问题

 yolov5训练时,出现系数为nan和0的问题。

cpu跑没有问题,gpu出现nan和0的问题。一般问题cuda问题和显卡的原因。

显卡为GTX 16XX系列的在cuda使用较新版本时会出现该问题。

例如我自己的问题:飞行堡垒7锐龙版 显卡:GTX 1650  cuda11.3(cuda11.5调试过)都会出现该问题 pytorch为1.11.0 。

AutoAnchor: 6.13 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset 
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to runstrainexp7
Starting training for 100 epochs...

     Epoch   gpu_mem       box       obj       cls    labels  img_size
      0/99     1.88G       nan       nan       nan        10       640: 100%|██████████| 14/14 [00:35<00:00,  2.52s/it]
D:19837anaconda3envspytorchlibsite-packagestorchoptimlr_scheduler.py:136: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 7/7 [00:07<00:00,  1.09s/it]
                 all        106          0          0          0          0          0

     Epoch   gpu_mem       box       obj       cls    labels  img_size
      1/99     1.96G       nan       nan       nan       104       640:   7%|▋         | 1/14 [00:02<00:38,  2.92s/it]
Process finished with exit code -1

解决方案为将cuda换为10.2的版本,链接如下,直接进行下载

CUDA Toolkit Archive | NVIDIA DeveloperPrevious releases of the CUDA Toolkit, GPU Computing SDK, documentation and developer drivers can be found using the links below. Please select the release you want from the list below, and be sure to check www.nvidia.com/drivers for more recent production drivers appropriate for your hardware configuration.https://developer.nvidia.com/cuda-toolkit-archive

cudnn下载:

cuDNN Archive | NVIDIA DeveloperNVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.https://developer.nvidia.com/rdp/cudnn-archive#a-collapse51b选择对应的版本

安装cuda过后将cudnn里面的放入C:Program FilesNVIDIA GPU Computing ToolkitCUDAv10.2这个路径下。根据自己的路径进行修改

然后继续安装pytorch cu102版本

pip install torch==1.10.1+cu102 torchvision==0.11.2+cu102 torchaudio==0.10.1 -f https://download.pytorch.org/whl/torch_stable.html

接下来回到运行程序阶段

AutoAnchor: 6.13 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset 
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to runstrainexp10
Starting training for 100 epochs...

     Epoch   gpu_mem       box       obj       cls    labels  img_size
      0/99     1.85G    0.1244    0.0515   0.06827        10       640: 100%|██████████| 14/14 [01:59<00:00,  8.53s/it]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 7/7 [00:11<00:00,  1.62s/it]
                 all        106        433    0.00107    0.00842   0.000487   0.000122

     Epoch   gpu_mem       box       obj       cls    labels  img_size
      1/99     1.96G    0.1171   0.06178   0.06603        63       640:  50%|█████     | 7/14 [00:30<00:30,  4.37s/it]

 至此就完成了

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/834673.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号