首页 > 代码库 > ubuntu14.04 配置cuda

ubuntu14.04 配置cuda

首先,我装的系统是Ubuntu14.04.1

1. 预检查

按照参考链接1中所示,检查系统。

执行命令:

:~$ lspci | grep -i nvidia
03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro K4000] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)

发现有K20K4000两块GPU,还有一块Audio的应该是声卡。

然后,执行命令检查系统版本:

~$ uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.1 LTS"
NAME="Ubuntu"
VERSION="14.04.1 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.1 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

可以看到,机器是ubuntu14.04的版本。

然后,使用gcc --version检查gcc版本是否符合链接1中的要求:

~$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

检查完毕,就去nvidia的官网(参考链接3)上下载驱动,为下载的是ubuntu14.04deb包。

2. 安装

Deb包安装较为简单,但是安装过程中提示不稳定,不过用着也没啥出错的地方。

先按照参考链接2安装必要的库。

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

还是按照官网上的流程来。

$ sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
$ sudo apt-get update
$ sudo apt-get install cuda

可能需要下载较长时间,但是没关系,放在那等着就是。

没啥问题就算安装好了。

安装过程中提示:

*** Please reboot your computer and verify that the nvidia graphics driver is loaded. ***
*** If the driver fails to load, please use the NVIDIA graphics driver .run installer ***
*** to get into a stable state.

我没管,提示使用.run安装比较稳定,但我现在用着没问题。

3. 配置环境

我的系统是64位的,因此配置环境时在.bashrc中加入

$ export PATH=/usr/local/cuda-6.5/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH

配置完环境后,执行命令

~$ source .bashrc 

使其立刻生效。

4. 安装sample

配置好环境后,可以执行如下命令:

$ cuda-install-samples-6.5.sh <dir>

这样,就将cudasample拷贝到dir文件夹下了。该命令只是一个拷贝操作。

然后进入该文件夹,执行make命令进行编译,编译时间较长,需要等待。

5. 验证安装是否成功

5.1. 驱动验证

首先,验证nvidia的驱动是否安装成功。

~$ cat /proc/driver/nvidia/version 
NVRM version: NVIDIA UNIX x86_64 Kernel Module  340.29  Thu Jul 31 20:23:19 PDT 2014
GCC version:  gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) 

5.2. Toolkit验证

验证cuda toolkit是否成功。

~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_21:41:27_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12

5.3. 设备识别

使用cuda sample已经编译好的deviceQuery来验证。deviceQuery<cuda_sample_install_path>/bin/x_86_64/linux/release目录下。我的结果如下,检测出了两块GPU来。

~/install/NVIDIA_CUDA-6.5_Samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 2 CUDA Capable device(s)
 
Device 0: "Tesla K20c"
  CUDA Driver Version / Runtime Version          6.5 / 6.5
  CUDA Capability Major/Minor version number:    3.5
  Total amount of global memory:                 4800 MBytes (5032706048 bytes)
  (13) Multiprocessors, (192) CUDA Cores/MP:     2496 CUDA Cores
  GPU Clock rate:                                706 MHz (0.71 GHz)
  Memory Clock rate:                             2600 Mhz
  Memory Bus Width:                              320-bit
  L2 Cache Size:                                 1310720 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
 
Device 1: "Quadro K4000"
  CUDA Driver Version / Runtime Version          6.5 / 6.5
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 3071 MBytes (3220504576 bytes)
  ( 4) Multiprocessors, (192) CUDA Cores/MP:     768 CUDA Cores
  GPU Clock rate:                                811 MHz (0.81 GHz)
  Memory Clock rate:                             2808 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 393216 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           4 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from Tesla K20c (GPU0) -> Quadro K4000 (GPU1) : No
> Peer access from Quadro K4000 (GPU1) -> Tesla K20c (GPU0) : No
 
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 2, Device0 = Tesla K20c, Device1 = Quadro K4000
Result = PASS

这样,cuda就安装成功了。



参考链接

1.http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#axzz3EiJkLjAq

2.http://blog.csdn.net/abcjennifer/article/details/23016583

3.https://developer.nvidia.com/cuda-downloads

 

 

 

 

ubuntu14.04 配置cuda