Deep Learning Getting-Started Tutorial: Installing cuDNN 7 and NCCL 2 on Ubuntu 18.04
Overview:
- This tutorial shows how to install cuDNN 7 and NCCL 2 on Ubuntu 18.04.
Environment:
- OS: Ubuntu 18.04
- GPU: GTX 1080Ti
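Add the NVIDIA machine-learning repository (PPA):
- Find the repository package that matches your system at https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/
- At the time of writing, the package used is nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
- Download and install it, then refresh the package index: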
wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo dpkg -i nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt update
- Install cuDNN and NCCL:
sudo apt install -y libcudnn7 libcudnn7-dev libnccl2 libc-ares-dev
- Link the libraries into the standard CUDA locations:
sudo mkdir -p /usr/local/cuda/nccl/lib
sudo ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/nccl/lib/
sudo ln -s /usr/lib/x86_64-linux-gnu/libcudnn.so.7 /usr/local/cuda/lib64/
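The `ln -s` commands above fail with "File exists" if they are run a second time. As a minimal sketch, the linking step could be wrapped in a small helper that is safe to rerun. The function name `link_lib` is hypothetical and is not part of any NVIDIA tooling; it only assumes a standard `ln` and `mkdir`.

```shell
# link_lib SRC DEST_DIR
# Hypothetical helper: create (or refresh) a symlink to SRC inside DEST_DIR.
# Fails if SRC does not exist, so a missing package is noticed early.
link_lib() {
    link_src="$1"
    link_dest_dir="$2"
    if [ ! -e "$link_src" ]; then
        echo "missing library: $link_src" >&2
        return 1
    fi
    mkdir -p "$link_dest_dir" || return 1
    # -f replaces an existing link; -n treats an existing link as a plain
    # file instead of following it when it points to a directory.
    ln -sfn "$link_src" "$link_dest_dir/$(basename "$link_src")"
}
```

When run as root (for example from a script invoked with sudo), `link_lib /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/nccl/lib` would reproduce the NCCL link above, and calling it again leaves the link unchanged.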
Verification:
- Run:
cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
- The output looks like this:
ubuntu@AiDLHost:~$ cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
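The three macros in the output above combine into the version number through the `CUDNN_VERSION` formula: 7 * 1000 + 5 * 100 + 1 = 7501, that is, cuDNN 7.5.1. The following sketch extracts the same values with awk. The function names `cudnn_version` and `cudnn_version_number` are hypothetical, and the default header path is the one used in this tutorial.

```shell
# cudnn_version [HEADER]
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a cudnn.h-style header.
# Defaults to /usr/include/cudnn.h, the path used in this tutorial.
cudnn_version() {
    cudnn_header="${1:-/usr/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            # Fail if any of the three macros was not found.
            if (major == "" || minor == "" || patch == "") exit 1
            print major "." minor "." patch
        }
    ' "$cudnn_header"
}

# cudnn_version_number [HEADER]
# Apply the CUDNN_VERSION formula: MAJOR * 1000 + MINOR * 100 + PATCHLEVEL.
cudnn_version_number() {
    cudnn_version "$1" | awk -F. '{ print $1 * 1000 + $2 * 100 + $3 }'
}
```

For the header shown above, `cudnn_version` prints 7.5.1 and `cudnn_version_number` prints 7501.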
- Download and install the cuDNN sample code:
wget http://file.ncnynl.com/ros/2019/libcudnn7-doc_7.5.0.56-1+cuda10.1_amd64.deb
sudo apt install ./libcudnn7-doc_7.5.0.56-1+cuda10.1_amd64.deb
mkdir -p ~/tools
cp -r /usr/src/cudnn_samples_v7/ ~/tools/
- Build the mnistCUDNN sample:
cd ~/tools/cudnn_samples_v7/mnistCUDNN
make clean && make
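If the build fails right away, a likely cause is a missing build tool. As a rough sketch, the check below lists which commands are not on `PATH`. The function name `missing_cmds` is hypothetical, and the set of tools to check (`make`, `g++`, `nvcc`) is an assumption about what the sample build needs. `nvcc` may also be installed under /usr/local/cuda/bin without being on `PATH`, so a reported miss is a hint, not proof.

```shell
# missing_cmds CMD...
# Hypothetical helper: print every command that is not found on PATH.
# Returns 0 only if all commands were found.
missing_cmds() {
    missing_status=0
    for missing_cmd in "$@"; do
        if ! command -v "$missing_cmd" >/dev/null 2>&1; then
            echo "$missing_cmd"
            missing_status=1
        fi
    done
    return "$missing_status"
}
```

A possible use before building: `missing_cmds make g++ nvcc || echo "install the tools listed above first"`.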
- Run the test. The output looks like this:
ubuntu@AiDLHost:~/tools/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7501 , CUDNN_VERSION from cudnn.h : 7501 (7.5.1)
Host compiler version : GCC 7.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 28 Capabilities 6.1, SmClock 1632.5 Mhz, MemSize (Mb) 11175, MemClock 5505.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.014336 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.026176 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.032768 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.059424 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.062240 time requiring 207360 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.013312 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.025600 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.032800 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.043008 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.061280 time requiring 207360 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
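In the log above, "Test passed!" appears twice, once for single precision and once for half precision. A minimal sketch for checking this automatically is shown below. The function name `count_passed` is hypothetical, and the default of two passes is taken from this log only.

```shell
# count_passed [EXPECTED]
# Hypothetical helper: read mnistCUDNN output on stdin and succeed only if
# "Test passed!" appears EXPECTED times (default 2, matching the log above).
count_passed() {
    passed_expected="${1:-2}"
    # grep -c exits non-zero when there are no matches but still prints 0.
    passed_actual=$(grep -c 'Test passed!' || true)
    [ "$passed_actual" -eq "$passed_expected" ]
}
```

It could be used as `./mnistCUDNN | count_passed && echo "cuDNN sample OK"` from the mnistCUDNN directory.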