TX1 Getting Started Tutorial, Software: Installing TensorFlow (1.0.1)
Overview:
- This tutorial shows how to build and install TensorFlow 1.0.1 on the TX1. Releases from 1.0 onward support more features.
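Prerequisites:
- Use JetPack to install the following:
- L4T 24.2.1, an Ubuntu 16.04 64-bit variant (aarch64)
- CUDA 8.0
- cuDNN 5.1.5
- Building TensorFlow requires both CUDA and cuDNN.
- TensorFlow takes up a lot of space, and the TX1's built-in storage is usually too small. It is best to boot from a USB drive of 64 GB or more used as the root partition, and to increase swap to 8 GB or more.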
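Before starting the build, it can help to confirm how much disk space, RAM, and swap the board actually has. The following is a minimal Python 3 sketch and not part of the original procedure; the helper names `meminfo_gib`, `free_disk_gib`, and `report` are hypothetical, and it assumes the usual Linux `/proc/meminfo` layout in which values are reported in kB.

```python
import shutil


def meminfo_gib(meminfo_text, key):
    """Return one /proc/meminfo field (reported in kB) converted to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith(key + ":"):
            # Line looks like "SwapTotal:   8388608 kB"
            return int(line.split()[1]) / (1024.0 * 1024.0)
    return 0.0  # field not present


def free_disk_gib(path="/"):
    """Return the free space, in GiB, of the filesystem containing path."""
    return shutil.disk_usage(path).free / float(1024 ** 3)


def report(meminfo_path="/proc/meminfo", root="/"):
    """Print free disk space, total RAM and total swap."""
    with open(meminfo_path) as f:
        text = f.read()
    print("free disk on %s: %.1f GiB" % (root, free_disk_gib(root)))
    print("RAM total:  %.1f GiB" % meminfo_gib(text, "MemTotal"))
    print("swap total: %.1f GiB" % meminfo_gib(text, "SwapTotal"))
```

Calling `report()` on the board prints the three figures, which can then be compared with the 64 GB and 8 GB guidance above.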
Installation:
- Install Java:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
- Install the dependencies (this tutorial uses Python 2.7):
sudo apt-get install zip unzip autoconf automake libtool curl zlib1g-dev maven -y
sudo apt-get install python-numpy swig python-dev python-pip python-wheel -y
- Install Bazel (version 0.4.5):
wget https://github.com/bazelbuild/bazel/releases/download/0.4.5/bazel-0.4.5-dist.zip
- Unzip it and enter the directory:
unzip bazel-0.4.5-dist.zip -d bazel-0.4.5-dist
cd bazel-0.4.5-dist
- Edit:
vim src/main/java/com/google/devtools/build/lib/util/CPU.java
- Line 28 reads: ARM("arm", ImmutableSet.of("arm", "armv7l"))
- Change it to: ARM("arm", ImmutableSet.of("aarch64", "arm", "armv7l"))
- Build:
./compile.sh
- Copy the binary to the system bin directory:
sudo cp output/bazel /usr/local/bin
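The CPU.java edit can also be applied without opening an editor. The following is a minimal sketch, assuming the ARM entry in the file looks like the line quoted in this tutorial (spacing may differ); the names `patch_cpu_java` and `patch_file` are hypothetical helpers, not part of Bazel.

```python
import re

# The ARM entry as quoted in the tutorial, tolerating spacing differences.
ARM_ENTRY = re.compile(
    r'ARM\("arm",\s*ImmutableSet\.of\(\s*"arm"\s*,\s*"armv7l"\s*\)\)'
)
PATCHED = 'ARM("arm", ImmutableSet.of("aarch64", "arm", "armv7l"))'


def patch_cpu_java(source):
    """Return (patched_source, number_of_replacements)."""
    return ARM_ENTRY.subn(PATCHED, source)


def patch_file(path):
    """Patch CPU.java in place; refuse to write if the entry is not found once."""
    with open(path) as f:
        original = f.read()
    patched, count = patch_cpu_java(original)
    if count != 1:
        raise RuntimeError("expected exactly one ARM entry, found %d" % count)
    with open(path, "w") as f:
        f.write(patched)
```

It would be run from inside the unpacked source directory, before `./compile.sh`, as `patch_file("src/main/java/com/google/devtools/build/lib/util/CPU.java")`. Running it a second time raises an error rather than changing the file again, because the already patched line no longer matches the pattern.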
Create a swap file
- Create an 8 GB swap file:
fallocate -l 8G swapfile
- Restrict its permissions:
chmod 600 swapfile
- Format it as swap:
mkswap swapfile
- Activate it (requires root):
sudo swapon swapfile
- Verify:
swapon -s
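The `swapon -s` listing can be checked programmatically as well. The following is a minimal sketch, assuming the usual column layout (Filename, Type, Size, Used, Priority, with Size in KiB); `parse_swaps` and `total_swap_gib` are hypothetical helper names, and the sample line in the usage note is made up for illustration.

```python
def parse_swaps(text):
    """Parse `swapon -s` (or /proc/swaps) output into (name, size_gib) pairs."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            name, size_kib = fields[0], int(fields[2])
            entries.append((name, size_kib / (1024.0 * 1024.0)))
    return entries


def total_swap_gib(text):
    """Sum the sizes of all active swap areas, in GiB."""
    return sum(size for _, size in parse_swaps(text))
```

For example, `total_swap_gib(open("/proc/swaps").read())` gives the combined size of all active swap areas. A freshly created 8 GB swap file typically shows up as slightly under 8 GiB, because the swap header takes a small amount of space.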
Install TensorFlow
- Clone the repository:
git clone https://github.com/tensorflow/tensorflow.git
- Check out the v1.0.1 release:
cd tensorflow
git checkout v1.0.1
- Edit:
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
- Find the function:
static int TryToReadNumaNode(const string &pci_bus_id, int device_ordinal)
- At the top of its body, replace the line:
#if defined(__APPLE__)
- with:
#ifdef __aarch64__
LOG(INFO) << "ARM64 does not support NUMA - returning NUMA node zero";
return 0;
#elif defined(__APPLE__)
- Copy the cuDNN header to the location the build looks in:
sudo cp /usr/include/cudnn.h /usr/lib/aarch64-linux-gnu/include/cudnn.h
- Configure the build:
./configure
- Answer the prompts as follows:
ubuntu@tegra-ubuntu:~/tensorflow$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Please specify optimization flags to use during compilation [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? (Linux only) [Y/n] y
jemalloc enabled on Linux
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
Extracting Bazel installation...
.......................
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
.......................
INFO: All external dependencies fetched successfully.
Configuration finished
- Build with Bazel:
bazel build -c opt --local_resources 3072,4.0,1.0 --verbose_failures --config=cuda //tensorflow/tools/pip_package:build_pip_package
- Generate the whl file:
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
- Save the whl file:
mv /tmp/tensorflow_pkg/tensorflow-1.0.1-cp27-cp27mu-linux_aarch64.whl $HOME/
- Install it:
sudo pip install $HOME/tensorflow-1.0.1-cp27-cp27mu-linux_aarch64.whl
- Reboot:
sudo reboot
- Test:
ubuntu@tegra-ubuntu:~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
>>> x = tf.constant(1.0)
>>> y = tf.constant(2.0)
>>> z = x + y
>>> with tf.Session() as sess:
... print z.eval()
...
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:874] ARM has no NUMA node, hardcoding to return zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 6.79GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GP10B, pci bus id: 0000:00:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 6.45G (6929413888 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.81G (6236472320 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.23G (5612825088 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.70G (5051542528 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.23G (4546387968 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.81G (4091749120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187] StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 4 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187] StreamExecutor device (0): GP10B, Compute Capability 6.2
3.0
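The session above still prints 3.0 despite the CUDA_ERROR_OUT_OF_MEMORY lines. The shrinking sizes in those lines are consistent with TensorFlow trying to reserve a large share of GPU memory up front and retrying with smaller requests; that reading is an assumption about this log, not something the original tutorial states. Under that assumption, one possible way to reduce the messages is to pass GPU options to the session. The sketch below uses the TensorFlow 1.x `tf.GPUOptions` and `tf.ConfigProto` classes; `make_session_config` and `run_sum` are hypothetical helper names.

```python
def make_session_config(allow_growth=True, memory_fraction=None):
    """Build a tf.ConfigProto that limits up-front GPU memory reservation."""
    import tensorflow as tf  # imported lazily; needs the wheel built above

    gpu_options = tf.GPUOptions(allow_growth=allow_growth)
    if memory_fraction is not None:
        # Optional cap on the share of GPU memory this process may use.
        gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return tf.ConfigProto(gpu_options=gpu_options)


def run_sum():
    """Repeat the 1.0 + 2.0 test from the session above with the config applied."""
    import tensorflow as tf

    z = tf.constant(1.0) + tf.constant(2.0)
    with tf.Session(config=make_session_config()) as sess:
        return sess.run(z)
```

With `allow_growth=True` the allocator starts small and grows on demand; whether this removes the messages entirely on a given board is not verified here.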
Troubleshooting:
- Error:
tensorflow/stream_executor/BUILD:39:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
- Solution (see these issues):
https://github.com/tensorflow/tensorflow/issues/2559
https://github.com/tensorflow/tensorflow/issues/2556
- Edit tensorflow/stream_executor/cuda/cuda_blas.cc. After this block:
#if CUDA_VERSION >= 7050
#define EIGEN_HAS_CUDA_FP16
#endif
- add the definition:
#if CUDA_VERSION >= 8000
#define CUBLAS_DATA_HALF CUDA_R_16F
#endif