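From Zero to Practice: "Zhihu Kanshan Cup First Place, the init Team's Solution (PyTorch)"
I should say up front that I am a Java developer who knows little Python. My work recently required analyzing and organizing a large amount of text, so I went looking for material online. Through Zhihu I learned about the Kanshan Cup competition that Zhihu hosted, found the solution from the winning team, init, and started trying it out. My approach may well be wrong.
What I learned: this kind of machine learning needs a Linux system with a GPU and plenty of memory; the system I installed in a virtual machine could not handle the computation.
First, the Linux system has to be 64-bit. I installed Linux in a virtual machine.
Virtual machine version: VMware-workstation-full-14.1.1-7528167.exe (14 Pro)
Linux version: CentOS-7-x86_64-DVD-1708.iso, about 4 GB, downloaded from the Alibaba Cloud mirror site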
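The two requirements mentioned so far (a 64-bit system and enough memory) can be checked from Python before investing time in the setup. The following is a minimal sketch and not part of the init solution: the function names are my own, and the memory text is assumed to be in the format Linux exposes in /proc/meminfo.

```python
import os
import platform


def is_64bit_x86(machine=None):
    """Return True if the machine type looks like 64-bit x86."""
    machine = machine if machine is not None else platform.machine()
    return machine.lower() in ("x86_64", "amd64")


def mem_total_gib(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024.0 * 1024.0)
    raise ValueError("MemTotal line not found")


# Only report on systems that actually expose /proc/meminfo (Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        print("64-bit x86: %s" % is_64bit_x86())
        print("MemTotal (GiB): %.1f" % mem_total_gib(handle.read()))
```

On a virtual machine like the one described here, the reported total should come out slightly under the configured memory, because the kernel reserves some of it.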
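The purpose of this blog post is to record the steps I took and how I solved the problems I ran into. Because I was writing while working, the post is not well laid out yet; I will tidy it up once everything is done. I am also something of a perfectionist.
1. After installing the system I set the memory to 2 GB and the disk to 40 GB. My computer is low-end and does not meet the minimum configuration for the init solution, but I wanted to try anyway, and at this point I cannot tell whether I will be able to finish. The virtual machine uses a NAT network adapter by default. After booting, I ran ifconfig in the virtual machine (a minimal install does not include this command) and found no IP address, and I could not ping baidu.com either. The virtual machine first needs to be able to reach the host network, so I did the following: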
[[email protected] ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
Edit the file so that ONBOOT is yes (it was no initially):
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=763f71d6-ec83-4bee-8322-4c903f6b78ed
DEVICE=ens33
ONBOOT=yes
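After editing, save and exit, then restart the network service or reboot the system. Setting a static IP is more involved, so I skipped that step.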
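To double-check the edit without rebooting, the file can be read back and the two settings that matter here (ONBOOT and BOOTPROTO) inspected. This is only a sketch: the helper names are hypothetical, and the parser assumes plain KEY=value lines like the ones shown above.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines from an ifcfg-style file into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def comes_up_with_dhcp(settings):
    """True if the interface starts at boot and asks DHCP for an address."""
    onboot = settings.get("ONBOOT", "no").lower() == "yes"
    dhcp = settings.get("BOOTPROTO", "").lower() == "dhcp"
    return onboot and dhcp


sample = """TYPE=Ethernet
BOOTPROTO=dhcp
NAME=ens33
DEVICE=ens33
ONBOOT=yes
"""
print(comes_up_with_dhcp(parse_ifcfg(sample)))
```

On the real machine the text would come from /etc/sysconfig/network-scripts/ifcfg-ens33 instead of the inline sample.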
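After the reboot I connected with Xshell. Its display is clear and copying is easy, whereas the virtual machine console does not allow copying and pasting commands, which is inconvenient. Once connected, I ran the command below; the replies show that the network is up.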
[[email protected] ~]# ping baidu.com
PING baidu.com (123.125.115.110) 56(84) bytes of data.
64 bytes from 123.125.115.110 (123.125.115.110): icmp_seq=1 ttl=128 time=130 ms
Next I will probably need to install software, so I checked whether yum works. The output below shows that it does:
[[email protected] ~]# yum install unzip
Loaded plugins: fastestmirror
base
The system ships with Python, and since the solution uses Python 2.7 there is no need to install it again.
[[email protected] ~]# python
Python 2.7.5 (default, Aug 4 2017, 00:39:18)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Install pip, wheel, and setuptools
[[email protected] /]# mkdir weblogic
[[email protected] /]# ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var weblogic
[[email protected] /]# cd weblogic/
[[email protected] weblogic]# curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1603k 100 1603k 0 0 465k 0 0:00:03 0:00:03 --:--:-- 465k
[[email protected] weblogic]# python get-pip.py
Collecting pip
Downloading https://files.pythonhosted.org/packages/0f/74/ecd13431bcc456ed390b44c8a6e917c1820365cbebcb6a8974d1cd045ab4/pip-10.0.1-py2.py3-none-any.whl (1.3MB)
100% |████████████████████████████████| 1.3MB 294kB/s
Collecting setuptools
Downloading https://files.pythonhosted.org/packages/7f/e1/820d941153923aac1d49d7fc37e17b6e73bfbd2904959fffbad77900cf92/setuptools-39.2.0-py2.py3-none-any.whl (567kB)
100% |████████████████████████████████| 573kB 406kB/s
Collecting wheel
Downloading https://files.pythonhosted.org/packages/81/30/e935244ca6165187ae8be876b6316ae201b71485538ffac1d718843025a9/wheel-0.31.1-py2.py3-none-any.whl (41kB)
100% |████████████████████████████████| 51kB 729kB/s
Installing collected packages: pip, setuptools, wheel
Successfully installed pip-10.0.1 setuptools-39.2.0 wheel-0.31.1
[[email protected] weblogic]#
[[email protected] weblogic]# ls
get-pip.py ipdb-0.11.tar.gz pip-10.0.1.tar.gz setuptools-39.2.0 setuptools-39.2.0.zip torch-0.1.12.post2-cp27-none-linux_x86_64.whl wheel-0.31.1 wheel-0.31.1.tar.gz
[[email protected] weblogic]# cd wheel-0.31.1
[[email protected] wheel-0.31.1]# python setup.py install
Install PyTorch
[[email protected] weblogic]# pip install torch-0.1.12.post2-cp27-none-linux_x86_64.whl
Processing ./torch-0.1.12.post2-cp27-none-linux_x86_64.whl
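The wheel's file name already says which interpreter and platform it targets: under the standard wheel naming scheme, cp27 means CPython 2.7 and linux_x86_64 means 64-bit Linux, which matches the system Python used here. A small sketch that splits the name into its tags (the function name is my own; pip performs this check itself when installing):

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("torch-0.1.12.post2-cp27-none-linux_x86_64.whl")
print(info["python"], info["platform"])
```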
Install Git
[[email protected] PyTorchText-master]# yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel gcc perl-ExtUtils-MakeMaker
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Download the Git source package
wget https://www.kernel.org/pub/software/scm/git/git-2.8.3.tar.gz
Unpack the Git source package
tar -zxvf git-2.8.3.tar.gz
cd git-2.8.3
[[email protected] git-2.8.3]# pwd
/weblogic/git-2.8.3
[[email protected] git-2.8.3]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
Copyright © 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[[email protected] git-2.8.3]# ./configure prefix=/usr/local/git/
configure: Setting lib to 'lib' (the default)
./check_bindir "z$bindir" "z$execdir" "$bindir/git-add"
[[email protected] git-2.8.3]# git --version
git version 1.8.3.1
[[email protected] git-2.8.3]#
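Note that git --version still reports 1.8.3.1 here. The transcript shows only ./configure, which prepares the build but does not compile or install anything, so the shell is most likely still finding the git that came with the system. A sketch for comparing the reported version with the one that was downloaded (the helper names are hypothetical):

```python
import re
import subprocess


def parse_git_version(output):
    """Turn 'git version 1.8.3.1' into a comparable tuple such as (1, 8, 3, 1)."""
    match = re.search(r"git version (\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("unrecognized output: %r" % output)
    return tuple(int(part) for part in match.group(1).split("."))


def installed_git_version():
    """Run the git found on PATH; return its version tuple, or None if git is missing."""
    try:
        output = subprocess.check_output(["git", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_git_version(output.decode("utf-8", "replace"))


expected = (2, 8, 3)  # the source package downloaded above
reported = parse_git_version("git version 1.8.3.1")
print(reported < expected)
```

If the comparison says the reported version is older, the newly configured source tree has not replaced the system git yet.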
[[email protected] PyTorchText-master]# pip install Cython
Collecting Cython
Downloading https://files.pythonhosted.org/packages/f6/23/ef5521e077e9e7ef8e4603e27713ae95fee69e9c19c7cd036b4299c7ced5/Cython-0.28.3-cp27-cp27mu-manylinux1_x86_64.whl (3.3MB)
100% |████████████████████████████████| 3.3MB 486kB/s
Installing collected packages: Cython
Successfully installed Cython-0.28.3
[[email protected] PyTorchText-master]#
Installing fasttext with pip fails at first with:
ImportError: No module named Cython.Build
The fix is to install Cython before fasttext:
pip install Cython
pip install fasttext (this step still failed; the output is below)
[[email protected] PyTorchText-master]# pip install fasttext
Collecting fasttext
Downloading https://files.pythonhosted.org/packages/a4/86/ff826211bc9e28d4c371668b30b4b2c38a09127e5e73017b1c0cd52f9dfa/fasttext-0.8.3.tar.gz (73kB)
100% |████████████████████████████████| 81kB 315kB/s
Collecting numpy>=1 (from fasttext)
Downloading https://files.pythonhosted.org/packages/c0/e7/08f059a00367fd613e4f2875a16c70b6237268a1d6d166c6d36acada8301/numpy-1.14.3-cp27-cp27mu-manylinux1_x86_64.whl (12.1MB)
100% |████████████████████████████████| 12.1MB 392kB/s
Collecting future (from fasttext)
Downloading https://files.pythonhosted.org/packages/00/2b/8d082ddfed935f3608cc61140df6dcbf0edea1bc3ab52fb6c29ae3e81e85/future-0.16.0.tar.gz (824kB)
100% |████████████████████████████████| 829kB 441kB/s
Building wheels for collected packages: fasttext, future
Running setup.py bdist_wheel for fasttext ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-DZjW32/fasttext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-KodiTL --python-tag cp27:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/fasttext
copying fasttext/__init__.py -> build/lib.linux-x86_64-2.7/fasttext
copying fasttext/model.py -> build/lib.linux-x86_64-2.7/fasttext
running build_ext
building 'fasttext.fasttext' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/fasttext
creating build/temp.linux-x86_64-2.7/fasttext/cpp
creating build/temp.linux-x86_64-2.7/fasttext/cpp/src
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I./fasttext -I/usr/include/python2.7 -c fasttext/fasttext.cpp -o build/temp.linux-x86_64-2.7/fasttext/fasttext.o -O3 -pthread -funroll-loops -std=c++0x
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for fasttext
Running setup.py clean for fasttext
Running setup.py bdist_wheel for future ... done
Stored in directory: /root/.cache/pip/wheels/bf/c9/a3/c538d90ef17cf7823fa51fc701a7a7a910a80f6a405bf15b1a
Successfully built future
Failed to build fasttext
Installing collected packages: numpy, future, fasttext
Running setup.py install for fasttext ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-DZjW32/fasttext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-VyEfve/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/fasttext
copying fasttext/__init__.py -> build/lib.linux-x86_64-2.7/fasttext
copying fasttext/model.py -> build/lib.linux-x86_64-2.7/fasttext
running build_ext
building 'fasttext.fasttext' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/fasttext
creating build/temp.linux-x86_64-2.7/fasttext/cpp
creating build/temp.linux-x86_64-2.7/fasttext/cpp/src
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I./fasttext -I/usr/include/python2.7 -c fasttext/fasttext.cpp -o build/temp.linux-x86_64-2.7/fasttext/fasttext.o -O3 -pthread -funroll-loops -std=c++0x
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-DZjW32/fasttext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-VyEfve/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-DZjW32/fasttext/
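The decisive line in the long log above is the one about cc1plus: gcc is installed, but its C++ front end is not, so the C++ extension cannot be compiled. On CentOS that front end comes with the gcc-c++ package; that package name is my addition and is not something I ran in this session. Since pip build logs are long, a small scanner for the error signatures seen in this post can save some reading. This is a sketch with a hand-written table, not an exhaustive diagnostic:

```python
# Each entry pairs a marker that appears in the build log with a hint.
# The markers do not depend on the system language of the error message.
KNOWN_BUILD_ERRORS = [
    ("cc1plus", "C++ compiler is missing; on CentOS it is provided by gcc-c++"),
    ("Python.h", "Python headers are missing; on CentOS they are provided by python-devel"),
    ("No module named Cython", "Cython is not installed; install it with pip first"),
]


def diagnose_build_log(log_text):
    """Return a hint for every known error marker found in the log text."""
    return [hint for marker, hint in KNOWN_BUILD_ERRORS if marker in log_text]


sample = "gcc: error trying to exec 'cc1plus': execvp: No such file or directory"
for hint in diagnose_build_log(sample):
    print(hint)
```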
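Install the project dependencies from requirements.txt (my notes call this the TensorFlow step, although the log below shows no TensorFlow package being collected): pip install -r requirements.txt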
[[email protected] PyTorchText-master]# pip install -r requirements.txt
Collecting git+https://github.com/pytorch/tnt.git@master (from -r requirements.txt (line 5))
Cloning https://github.com/pytorch/tnt.git (to revision master) to /tmp/pip-req-build-E_05vl
Collecting ipdb (from -r requirements.txt (line 1))
Collecting fire (from -r requirements.txt (line 2))
Collecting tqdm (from -r requirements.txt (line 3))
Using cached https://files.pythonhosted.org/packages/93/24/6ab1df969db228aed36a648a8959d1027099ce45fad67532b9673d533318/tqdm-4.23.4-py2.py3-none-any.whl
Collecting visdom (from -r requirements.txt (line 4))
Collecting word2vec (from -r requirements.txt (line 6))
Using cached https://files.pythonhosted.org/packages/5b/33/8e1cf93216342f0fe8aa4484ef1a833a12c4f6d6bf8e8b46ecc0feb5e5e8/word2vec-0.9.2.tar.gz
Requirement already satisfied: torch in /usr/lib64/python2.7/site-packages (from torchnet==0.0.2->-r requirements.txt (line 5)) (0.1.12.post2)
Requirement already satisfied: six in /usr/lib/python2.7/site-packages (from torchnet==0.0.2->-r requirements.txt (line 5)) (1.11.0)
Collecting ipython<6.0.0,>=5.0.0; python_version == "2.7" (from ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/52/19/aadde98d6bde1667d0bf431fb2d22451f880aaa373e0a241c7e7cb5815a0/ipython-5.7.0-py2-none-any.whl
Requirement already satisfied: setuptools in /usr/lib/python2.7/site-packages (from ipdb->-r requirements.txt (line 1)) (39.2.0)
Collecting torchfile (from visdom->-r requirements.txt (line 4))
Collecting pyzmq (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/5d/b0/3aea046f5519e2e059a225e8c924f897846b608793f890be987d07858b7c/pyzmq-17.0.0-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: numpy>=1.8 in /usr/lib64/python2.7/site-packages (from visdom->-r requirements.txt (line 4)) (1.14.3)
Collecting tornado (from visdom->-r requirements.txt (line 4))
Collecting websocket-client (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/8a/a1/72ef9aa26cfe1a75cee09fc1957e4723add9de098c15719416a1ee89386b/websocket_client-0.48.0-py2.py3-none-any.whl
Collecting pillow (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/00/49/a0483e7308b4b04b5a898789911dbb876d9fea54e7df0453915e47744cfd/Pillow-5.1.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting scipy (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/2a/f3/de9c1bd16311982711209edaa8c6caa962db30ebb6a8cc6f1dcd2d3ef616/scipy-1.1.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting requests (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/49/df/50aa1999ab9bde74656c2919d9c0c085fd2b3775fd3eca826012bef76d8c/requests-2.18.4-py2.py3-none-any.whl
Requirement already satisfied: cython in /usr/lib64/python2.7/site-packages (from word2vec->-r requirements.txt (line 6)) (0.28.3)
Requirement already satisfied: pyyaml in /usr/lib64/python2.7/site-packages (from torch->torchnet==0.0.2->-r requirements.txt (line 5)) (3.12)
Requirement already satisfied: prompt-toolkit<2.0.0,>=1.0.4 in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.0.15)
Requirement already satisfied: decorator in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (3.4.0)
Requirement already satisfied: pexpect; sys_platform != "win32" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (4.6.0)
Requirement already satisfied: backports.shutil-get-terminal-size; python_version == "2.7" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.0.0)
Requirement already satisfied: pygments in /usr/lib64/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (2.2.0)
Collecting pathlib2; python_version == "2.7" or python_version == "3.3" (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/66/a7/9f8d84f31728d78beade9b1271ccbfb290c41c1e4dc13dbd4997ad594dcd/pathlib2-2.3.2-py2.py3-none-any.whl
Collecting traitlets>=4.2 (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl
Collecting simplegeneric>0.8 (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Collecting pickleshare (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/9f/17/daa142fc9be6b76f26f24eeeb9a138940671490b91cb5587393f297c8317/pickleshare-0.7.4-py2.py3-none-any.whl
Collecting backports-abc>=0.4 (from tornado->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/7d/56/6f3ac1b816d0cd8994e83d0c4e55bc64567532f7dc543378bd87f81cebc7/backports_abc-0.5-py2.py3-none-any.whl
Requirement already satisfied: futures in /usr/lib/python2.7/site-packages (from tornado->visdom->-r requirements.txt (line 4)) (3.2.0)
Collecting singledispatch (from tornado->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/c5/10/369f50bcd4621b263927b0a1519987a04383d4a98fb10438042ad410cf88/singledispatch-3.4.0.3-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.7,>=2.5 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/27/cc/6dd9a3869f15c2edfab863b992838277279ce92663d334df9ecf5106f5c6/idna-2.6-py2.py3-none-any.whl
Collecting urllib3<1.23,>=1.21.1 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl
Requirement already satisfied: wcwidth in /usr/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (0.1.7)
Requirement already satisfied: ptyprocess>=0.5 in /usr/lib/python2.7/site-packages (from pexpect; sys_platform != "win32"->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (0.5.2)
Collecting scandir; python_version < "3.5" (from pathlib2; python_version == "2.7" or python_version == "3.3"->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/13/bb/e541b74230bbf7a20a3949a2ee6631be299378a784f5445aa5d0047c192b/scandir-1.7.tar.gz
Collecting ipython-genutils (from traitlets>=4.2->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl
Requirement already satisfied: enum34; python_version == "2.7" in /usr/lib/python2.7/site-packages (from traitlets>=4.2->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.1.6)
Building wheels for collected packages: word2vec, torchnet, scandir
Running setup.py bdist_wheel for word2vec ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-aaHbvs/word2vec/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-0AZ5iL --python-tag cp27:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/word2vec
copying word2vec/__init__.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/_version.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/io.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/scripts_interface.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/utils.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/wordclusters.py -> build/lib.linux-x86_64-2.7/word2vec
copying word2vec/wordvectors.py -> build/lib.linux-x86_64-2.7/word2vec
creating build/lib.linux-x86_64-2.7/word2vec/tests
copying word2vec/tests/__init__.py -> build/lib.linux-x86_64-2.7/word2vec/tests
copying word2vec/tests/test_word2vec.py -> build/lib.linux-x86_64-2.7/word2vec/tests
UPDATING build/lib.linux-x86_64-2.7/word2vec/_version.py
set build/lib.linux-x86_64-2.7/word2vec/_version.py to '0.9.2'
running build_ext
building 'word2vec.word2vec_noop' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/word2vec
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c word2vec/word2vec_noop.c -o build/temp.linux-x86_64-2.7/word2vec/word2vec_noop.o
word2vec/word2vec_noop.c:16:20: fatal error: Python.h: No such file or directory
#include "Python.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for word2vec
Running setup.py clean for word2vec
Running setup.py bdist_wheel for torchnet ... done
Stored in directory: /tmp/pip-ephem-wheel-cache-nmBFRj/wheels/17/05/ec/d05d051a225871af52bf504f5e8daf57704811b3c1850d0012
Running setup.py bdist_wheel for scandir ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-aaHbvs/scandir/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-YBhBvd --python-tag cp27:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying scandir.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_scandir' extension
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c _scandir.c -o build/temp.linux-x86_64-2.7/_scandir.o
_scandir.c:14:20: fatal error: Python.h: No such file or directory
#include <Python.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for scandir
Running setup.py clean for scandir
Successfully built torchnet
Failed to build word2vec scandir
Installing collected packages: scandir, pathlib2, ipython-genutils, traitlets, simplegeneric, pickleshare, ipython, ipdb, fire, tqdm, torchfile, pyzmq, backports-abc, singledispatch, tornado, websocket-client, pillow, scipy, certifi, chardet, idna, urllib3, requests, visdom, word2vec, torchnet
Running setup.py install for scandir ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-aaHbvs/scandir/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-GKZVrW/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying scandir.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_scandir' extension
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.7 -c _scandir.c -o build/temp.linux-x86_64-2.7/_scandir.o
_scandir.c:14:20: fatal error: Python.h: No such file or directory
#include <Python.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-aaHbvs/scandir/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-GKZVrW/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-aaHbvs/scandir/
[[email protected] PyTorchText-master]#
[[email protected] PyTorchText-master]# yum install python-dev
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.163.com
* extras: mirrors.163.com
* updates: mirrors.cn99.com
No package python-dev available.
Error: Nothing to do
[[email protected] PyTorchText-master]# yum install Python-devel
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.163.com
* extras: mirrors.163.com
* updates: mirrors.cn99.com
No package Python-devel available.
* Maybe you meant: python-devel
Error: Nothing to do
[[email protected] PyTorchText-master]# yum install python-devel
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.163.com
* extras: mirrors.163.com
* updates: mirrors.cn99.com
Resolving Dependencies
--> Running transaction check
---> Package python-devel.x86_64.0.2.7.5-68.el7 will be installed
--> Processing Dependency: python(x86-64) = 2.7.5-68.el7 for package: python-devel-2.7.5-68.el7.x86_64
--> Running transaction check
---> Package python.x86_64.0.2.7.5-58.el7 will be updated
---> Package python.x86_64.0.2.7.5-68.el7 will be an update
--> Processing Dependency: python-libs(x86-64) = 2.7.5-68.el7 for package: python-2.7.5-68.el7.x86_64
--> Running transaction check
---> Package python-libs.x86_64.0.2.7.5-58.el7 will be updated
---> Package python-libs.x86_64.0.2.7.5-68.el7 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
===================================================================================================================================================================================
Package Arch Version Repository Size
===================================================================================================================================================================================
Installing:
python-devel x86_64 2.7.5-68.el7 base 397 k
Updating for dependencies:
python x86_64 2.7.5-68.el7 base 93 k
python-libs x86_64 2.7.5-68.el7 base 5.6 M
Transaction Summary
===================================================================================================================================================================================
Install 1 Package
Upgrade ( 2 Dependent packages)
Total download size: 6.1 M
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
(1/3): python-2.7.5-68.el7.x86_64.rpm | 93 kB 00:00:00
(2/3): python-devel-2.7.5-68.el7.x86_64.rpm | 397 kB 00:00:03
(3/3): python-libs-2.7.5-68.el7.x86_64.rpm | 5.6 MB 00:00:38
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 160 kB/s | 6.1 MB 00:00:39
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Updating : python-libs-2.7.5-68.el7.x86_64 1/5
Updating : python-2.7.5-68.el7.x86_64 2/5
Installing : python-devel-2.7.5-68.el7.x86_64 3/5
Cleanup : python-2.7.5-58.el7.x86_64 4/5
Cleanup : python-libs-2.7.5-58.el7.x86_64 5/5
Verifying : python-libs-2.7.5-68.el7.x86_64 1/5
Verifying : python-devel-2.7.5-68.el7.x86_64 2/5
Verifying : python-2.7.5-68.el7.x86_64 3/5
Verifying : python-libs-2.7.5-58.el7.x86_64 4/5
Verifying : python-2.7.5-58.el7.x86_64 5/5
Installed:
python-devel.x86_64 0:2.7.5-68.el7
Dependency Updated:
python.x86_64 0:2.7.5-68.el7 python-libs.x86_64 0:2.7.5-68.el7
Complete!
[[email protected] PyTorchText-master]#
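The compile errors earlier complained about a missing Python.h, which is exactly the file python-devel provides. Whether the header is now present can be confirmed from the interpreter itself. This is a minimal sketch using the standard sysconfig module; the function names are my own:

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects to find Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """True if Python.h exists, so C extensions have a chance of compiling."""
    return os.path.isfile(python_header_path())


print(python_header_path())
print(has_python_headers())
```

On the system described here the path should point into /usr/include/python2.7, the same directory the gcc commands pass with -I.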
After that, rerunning the install succeeded:
[[email protected] PyTorchText-master]# pip install -r requirements.txt
Collecting git+https://github.com/pytorch/tnt.git@master (from -r requirements.txt (line 5))
Cloning https://github.com/pytorch/tnt.git (to revision master) to /tmp/pip-req-build-kyfk8D
Collecting ipdb (from -r requirements.txt (line 1))
Collecting fire (from -r requirements.txt (line 2))
Collecting tqdm (from -r requirements.txt (line 3))
Using cached https://files.pythonhosted.org/packages/93/24/6ab1df969db228aed36a648a8959d1027099ce45fad67532b9673d533318/tqdm-4.23.4-py2.py3-none-any.whl
Collecting visdom (from -r requirements.txt (line 4))
Collecting word2vec (from -r requirements.txt (line 6))
Using cached https://files.pythonhosted.org/packages/5b/33/8e1cf93216342f0fe8aa4484ef1a833a12c4f6d6bf8e8b46ecc0feb5e5e8/word2vec-0.9.2.tar.gz
Requirement already satisfied: torch in /usr/lib64/python2.7/site-packages (from torchnet==0.0.2->-r requirements.txt (line 5)) (0.1.12.post2)
Requirement already satisfied: six in /usr/lib/python2.7/site-packages (from torchnet==0.0.2->-r requirements.txt (line 5)) (1.11.0)
Collecting ipython<6.0.0,>=5.0.0; python_version == "2.7" (from ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/52/19/aadde98d6bde1667d0bf431fb2d22451f880aaa373e0a241c7e7cb5815a0/ipython-5.7.0-py2-none-any.whl
Requirement already satisfied: setuptools in /usr/lib/python2.7/site-packages (from ipdb->-r requirements.txt (line 1)) (39.2.0)
Collecting torchfile (from visdom->-r requirements.txt (line 4))
Collecting tornado (from visdom->-r requirements.txt (line 4))
Collecting scipy (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/2a/f3/de9c1bd16311982711209edaa8c6caa962db30ebb6a8cc6f1dcd2d3ef616/scipy-1.1.0-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: numpy>=1.8 in /usr/lib64/python2.7/site-packages (from visdom->-r requirements.txt (line 4)) (1.14.3)
Collecting pillow (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/00/49/a0483e7308b4b04b5a898789911dbb876d9fea54e7df0453915e47744cfd/Pillow-5.1.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyzmq (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/5d/b0/3aea046f5519e2e059a225e8c924f897846b608793f890be987d07858b7c/pyzmq-17.0.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting websocket-client (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/8a/a1/72ef9aa26cfe1a75cee09fc1957e4723add9de098c15719416a1ee89386b/websocket_client-0.48.0-py2.py3-none-any.whl
Collecting requests (from visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/49/df/50aa1999ab9bde74656c2919d9c0c085fd2b3775fd3eca826012bef76d8c/requests-2.18.4-py2.py3-none-any.whl
Requirement already satisfied: cython in /usr/lib64/python2.7/site-packages (from word2vec->-r requirements.txt (line 6)) (0.28.3)
Requirement already satisfied: pyyaml in /usr/lib64/python2.7/site-packages (from torch->torchnet==0.0.2->-r requirements.txt (line 5)) (3.12)
Collecting pathlib2; python_version == "2.7" or python_version == "3.3" (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/66/a7/9f8d84f31728d78beade9b1271ccbfb290c41c1e4dc13dbd4997ad594dcd/pathlib2-2.3.2-py2.py3-none-any.whl
Requirement already satisfied: backports.shutil-get-terminal-size; python_version == "2.7" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.0.0)
Collecting simplegeneric>0.8 (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Requirement already satisfied: pygments in /usr/lib64/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (2.2.0)
Requirement already satisfied: pexpect; sys_platform != "win32" in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (4.6.0)
Requirement already satisfied: prompt-toolkit<2.0.0,>=1.0.4 in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.0.15)
Collecting pickleshare (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/9f/17/daa142fc9be6b76f26f24eeeb9a138940671490b91cb5587393f297c8317/pickleshare-0.7.4-py2.py3-none-any.whl
Requirement already satisfied: decorator in /usr/lib/python2.7/site-packages (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (3.4.0)
Collecting traitlets>=4.2 (from ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/93/d6/abcb22de61d78e2fc3959c964628a5771e47e7cc60d53e9342e21ed6cc9a/traitlets-4.3.2-py2.py3-none-any.whl
Requirement already satisfied: futures in /usr/lib/python2.7/site-packages (from tornado->visdom->-r requirements.txt (line 4)) (3.2.0)
Collecting singledispatch (from tornado->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/c5/10/369f50bcd4621b263927b0a1519987a04383d4a98fb10438042ad410cf88/singledispatch-3.4.0.3-py2.py3-none-any.whl
Collecting backports-abc>=0.4 (from tornado->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/7d/56/6f3ac1b816d0cd8994e83d0c4e55bc64567532f7dc543378bd87f81cebc7/backports_abc-0.5-py2.py3-none-any.whl
Collecting urllib3<1.23,>=1.21.1 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl
Collecting idna<2.7,>=2.5 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/27/cc/6dd9a3869f15c2edfab863b992838277279ce92663d334df9ecf5106f5c6/idna-2.6-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->visdom->-r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl
Collecting scandir; python_version < "3.5" (from pathlib2; python_version == "2.7" or python_version == "3.3"->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/13/bb/e541b74230bbf7a20a3949a2ee6631be299378a784f5445aa5d0047c192b/scandir-1.7.tar.gz
Requirement already satisfied: ptyprocess>=0.5 in /usr/lib/python2.7/site-packages (from pexpect; sys_platform != "win32"->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (0.5.2)
Requirement already satisfied: wcwidth in /usr/lib/python2.7/site-packages (from prompt-toolkit<2.0.0,>=1.0.4->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (0.1.7)
Requirement already satisfied: enum34; python_version == "2.7" in /usr/lib/python2.7/site-packages (from traitlets>=4.2->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1)) (1.1.6)
Collecting ipython-genutils (from traitlets>=4.2->ipython<6.0.0,>=5.0.0; python_version == "2.7"->ipdb->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/fa/bc/9bd3b5c2b4774d5f33b2d544f1460be9df7df2fe42f352135381c347c69a/ipython_genutils-0.2.0-py2.py3-none-any.whl
Building wheels for collected packages: word2vec, torchnet, scandir
Running setup.py bdist_wheel for word2vec ... done
Stored in directory: /root/.cache/pip/wheels/89/a1/cb/417bcc7143a3e2befcc82da185ce8ad4a340eb82c0bf48969c
Running setup.py bdist_wheel for torchnet ... done
Stored in directory: /tmp/pip-ephem-wheel-cache-oQlzp4/wheels/17/05/ec/d05d051a225871af52bf504f5e8daf57704811b3c1850d0012
Running setup.py bdist_wheel for scandir ... done
Stored in directory: /root/.cache/pip/wheels/4a/ca/d7/26c3620234732f2d5b3ca86d7ccb0f59a21bd7712bffbbedc2
Successfully built word2vec torchnet scandir
Installing collected packages: scandir, pathlib2, simplegeneric, pickleshare, ipython-genutils, traitlets, ipython, ipdb, fire, tqdm, torchfile, singledispatch, backports-abc, tornado, scipy, pillow, pyzmq, websocket-client, urllib3, idna, chardet, certifi, requests, visdom, word2vec, torchnet
Successfully installed backports-abc-0.5 certifi-2018.4.16 chardet-3.0.4 fire-0.1.3 idna-2.6 ipdb-0.11 ipython-5.7.0 ipython-genutils-0.2.0 pathlib2-2.3.2 pickleshare-0.7.4 pillow-5.1.0 pyzmq-17.0.0 requests-2.18.4 scandir-1.7 scipy-1.1.0 simplegeneric-0.8.1 singledispatch-3.4.0.3 torchfile-0.1.0 torchnet-0.0.2 tornado-5.0.2 tqdm-4.23.4 traitlets-4.3.2 urllib3-1.22 visdom-0.1.8.3 websocket-client-0.48.0 word2vec-0.9.2
[[email protected] PyTorchText-master]#
With the dependencies above installed, start the visdom visualization server:
```sh
python -m visdom.server
```
Reference: PyTorch study notes (8): the PyTorch visualization tool visdom
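Before moving on, it can help to confirm that the server actually came up. The sketch below only checks whether a TCP port accepts connections; it assumes visdom is listening on its default port 8097 on the same machine, and the helper name `is_port_open` is made up for this post.

```python
import socket


def is_port_open(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    8097 is visdom's default port; pass another value if the server
    was started on a different one.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timeout.
        return False
    finally:
        sock.close()
```

Calling `is_port_open()` in a second terminal while `python -m visdom.server` is running should return `True`; a `False` result means nothing is listening on that port.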
The environment is now ready. Next, prepare the init team's source code and the data files.
[[email protected] PyTorchText-master]# ll *.txt
-rw-r--r--. 1 root root 29200241 Jun 5 16:55 char_embedding.txt
-rw-r--r--. 1 root root 239862273 Jun 5 16:53 question_eval_set.txt
-rw-r--r--. 1 root root 204459814 Jun 5 16:52 question_topic_train_set.txt
-rw-r--r--. 1 root root 3317236306 Jun 5 16:57 question_train_set.txt
-rw-r--r--. 1 root root 77 Jun 5 11:45 requirements.txt
-rw-r--r--. 1 root root 1072551 Jun 5 16:53 topic_info.txt
-rw-r--r--. 1 root root 1005008916 Jun 5 16:55 word_embedding.txt
[[email protected] PyTorchText-master]#
## 2. Data preprocessing
### 2.1 Convert the embeddings to numpy arrays
[[email protected] PyTorchText-master]# python scripts/data_process/embedding2matrix.py main char_embedding.txt char_embedding.npz
[[email protected] PyTorchText-master]# ls
char_embedding.npz data main-all.1.py models question_topic_train_set.txt rep.py test.3.py
char_embedding.txt del main-all.py notebooks question_train_set.txt requirements.txt topic_info.txt
checkpoints ??ɽ??init?????.pdf main.py ??ɽ??-??ʿ????????.pptx readme.md scripts utils
config.py LICENSE ˵??.md question_eval_set.txt readme-zh.md test.1.py word_embedding.txt
[[email protected] PyTorchText-master]# python scripts/data_process/embedding2matrix.py main word_embedding.txt word_embedding.npz
[[email protected] PyTorchText-master]# ls
char_embedding.npz data main-all.1.py models question_topic_train_set.txt rep.py test.3.py word_embedding.txt
char_embedding.txt del main-all.py notebooks question_train_set.txt requirements.txt topic_info.txt
checkpoints ??ɽ??init?????.pdf main.py ??ɽ??-??ʿ????????.pptx readme.md scripts utils
config.py LICENSE ˵??.md question_eval_set.txt readme-zh.md test.1.py word_embedding.npz
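To make clearer what this conversion step does, here is a minimal sketch of parsing an embedding text file. It is not the repository's `embedding2matrix.py`. It assumes the files use the word2vec text layout (a header line with count and dimension, then one `token v1 v2 ...` line per entry), and it returns plain Python lists instead of writing an `.npz` file. The `word2id` name matches the key that `question2array.py` later reads from the `.npz`; the function name is my own.

```python
import io


def load_text_embeddings(path):
    """Parse a word2vec-style text file into (word2id, vectors).

    Assumed layout: first line "<count> <dim>", then one line per
    token: "<token> <v1> <v2> ... <v_dim>".
    """
    word2id = {}
    vectors = []
    with io.open(path, encoding="utf-8") as f:
        header = f.readline().split()
        dim = int(header[1])
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip malformed or empty lines
            word2id[parts[0]] = len(vectors)  # row index of this token
            vectors.append([float(x) for x in parts[1:]])
    return word2id, vectors
```

With numpy available, the two results could then be saved together, which is roughly the shape of data the later scripts load.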
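### 2.2 Convert the questions to numpy arrays
This step uses a lot of memory; make sure you have more than 32 GB of RAM. I only ran it on the small file (question_eval_set.txt).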
[[email protected] PyTorchText-master]# python scripts/data_process/question2array.py main question_eval_set.txt test.npz
Traceback (most recent call last):
File "scripts/data_process/question2array.py", line 85, in <module>
fire.Fire()
File "/usr/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "scripts/data_process/question2array.py", line 19, in main
char2id = np.load('/mnt/7/zhihu/ieee_zhihu_cup/data/char_embedding.npz')['word2id'].item()
File "/usr/lib64/python2.7/site-packages/numpy/lib/npyio.py", line 372, in load
fid = open(file, "rb")
IOError: [Errno 2] No such file or directory: '/mnt/7/zhihu/ieee_zhihu_cup/data/char_embedding.npz'
[[email protected] PyTorchText-master]#
The script fails because the data path is hard-coded. After changing the path inside the file, I ran it again:
[[email protected] PyTorchText-master]# python scripts/data_process/question2array.py main question_eval_set.txt test.npz
217360it [00:34, 6317.30it/s]
a
b
c
d
[[email protected] PyTorchText-master]#
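The `IOError` above comes from an absolute path baked into the script. Rather than editing every occurrence by hand, one option is to resolve data files through a small helper. This is only a sketch: the `data_path` function and the `ZHIHU_DATA_DIR` environment variable are hypothetical names of mine, not part of the repository.

```python
import os


def data_path(filename, data_dir=None):
    """Build the full path of a data file.

    Lookup order: explicit data_dir argument, then the (hypothetical)
    ZHIHU_DATA_DIR environment variable, then the current directory.
    """
    base = data_dir or os.environ.get("ZHIHU_DATA_DIR") or os.getcwd()
    return os.path.join(base, filename)


# Hypothetical replacement for the hard-coded line in question2array.py:
# char2id = np.load(data_path('char_embedding.npz'))['word2id'].item()
```

Run from the project directory, `data_path('char_embedding.npz')` then points at the file generated in section 2.1.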
### 2.3 Process the labels and convert them to JSON
[[email protected] PyTorchText-master]# python scripts/data_process/label2id.py main question_topic_train_set.txt labels.json
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(17)main()
16 import ipdb;ipdb.set_trace()
---> 17 all_labels = { _ for ii,jj in results for _ in jj }
18 sorted_labels = sorted(all_labels)
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(18)main()
17 all_labels = { _ for ii,jj in results for _ in jj }
---> 18 sorted_labels = sorted(all_labels)
19 label2id = {l_:ii for ii,l_ in enumerate(sorted_labels)}#-3239204820424->1
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(19)main()
18 sorted_labels = sorted(all_labels)
---> 19 label2id = {l_:ii for ii,l_ in enumerate(sorted_labels)}#-3239204820424->1
20 id2label = {ii:l_ for ii,l_ in enumerate(sorted_labels)}
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(20)main()
19 label2id = {l_:ii for ii,l_ in enumerate(sorted_labels)}#-3239204820424->1
---> 20 id2label = {ii:l_ for ii,l_ in enumerate(sorted_labels)}
21
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(22)main()
21
---> 22 d = {ii:[label2id[jj] for jj in labels ] for ii,labels in results}
23
ipdb> n
n> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(24)main()
23
---> 24 data = dict(d=d,label2id=label2id,id2label=id2label)
25 import json
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(25)main()
24 data = dict(d=d,label2id=label2id,id2label=id2label)
---> 25 import json
26 with open(outfile,'w') as f:
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(26)main()
25 import json
---> 26 with open(outfile,'w') as f:
27 json.dump(data,f)
ipdb> n
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(27)main()
26 with open(outfile,'w') as f:
---> 27 json.dump(data,f)
28
ipdb> n
--Return--
None
> /weblogic/PyTorchText-master/scripts/data_process/label2id.py(27)main()
26 with open(outfile,'w') as f:
---> 27 json.dump(data,f)
28
ipdb>
> /usr/lib/python2.7/site-packages/fire/core.py(543)_CallCallable()
542 result = fn(*varargs, **kwargs)
--> 543 return result, consumed_args, remaining_args, capacity
544
ipdb> c
[1]+ Killed python scripts/data_process/label2id.py main question_topic_train_set.txt labels.json
[[email protected] PyTorchText-master]#
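The run above drops into the debugger because line 16 of `label2id.py` contains `import ipdb;ipdb.set_trace()`; removing that line should let the script run without stepping through it with `n`. The mapping logic visible in the debugger listing can be restated as the sketch below. The function names are mine, and the input format (question id, a tab, then comma-separated topic ids) is an assumption about `question_topic_train_set.txt`.

```python
import json


def parse_line(line):
    """Split "<question_id>\\t<label>,<label>,..." into (id, [labels])."""
    qid, labels = line.rstrip("\n").split("\t")
    return qid, labels.split(",")


def build_label_maps(results):
    """results: a list of (question_id, [label, ...]) pairs."""
    all_labels = {label for _, labels in results for label in labels}
    sorted_labels = sorted(all_labels)
    # Each distinct label gets a dense integer id, in sorted order.
    label2id = {label: i for i, label in enumerate(sorted_labels)}
    id2label = {i: label for i, label in enumerate(sorted_labels)}
    # Each question maps to the list of its label ids.
    d = {qid: [label2id[label] for label in labels] for qid, labels in results}
    return dict(d=d, label2id=label2id, id2label=id2label)
```

The returned dictionary can be written out with `json.dump(data, f)`, as the original script does. Note that JSON turns the integer keys of `id2label` into strings.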
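The step that the documentation describes as memory-hungry did finish on the small file, even with my 2 GB of RAM. However, train.npz was never produced: the run on the full training set most likely failed for lack of memory.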
[[email protected] PyTorchText-master]# python scripts/data_process/question2array.py main question_train_set.txt train.npz
Killed
[[email protected] PyTorchText-master]#
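A process that is killed like this on a 2 GB machine has most likely been stopped by the kernel for using too much memory. A quick check before launching a heavy step can save the wait. The sketch below reads `/proc/meminfo`, which exists only on Linux; the function names and the 32 GB figure in the usage note are illustrative, the latter taken from the requirement quoted in section 2.2.

```python
def parse_mem_available_kb(meminfo_text):
    """Return available memory in kB from the text of /proc/meminfo.

    Falls back to MemFree when the kernel does not report MemAvailable.
    """
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values.get("MemAvailable", values.get("MemFree"))


def has_enough_memory(required_gb, meminfo_path="/proc/meminfo"):
    """True if available memory is at least required_gb gigabytes."""
    with open(meminfo_path) as f:
        available_kb = parse_mem_available_kb(f.read())
    return available_kb is not None and available_kb >= required_gb * 1024 * 1024
```

For example, `has_enough_memory(32)` returns `False` on my virtual machine, which matches what happened above.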
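Next, sample part of the training set to build a validation set. This code was saved from an ipython session, so __remember to change the data paths in the code__.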
[[email protected] PyTorchText-master]# python scripts/data_process/get_val.py
[[email protected] PyTorchText-master]#
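I did not look inside `get_val.py` in detail, so the following is not its implementation. It is just a minimal sketch of the general idea of holding out part of the training set as a validation set, using a seeded shuffle of sample indices. The function name and parameters are my own.

```python
import random


def split_train_val(n_samples, val_size, seed=0):
    """Randomly hold out val_size of n_samples indices for validation.

    Returns (train_indices, val_indices), both sorted; a fixed seed
    makes the split reproducible.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    val = sorted(indices[:val_size])
    train = sorted(indices[val_size:])
    return train, val
```

The two index lists can then be used to slice the arrays produced in section 2.2 into a training part and a validation part.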
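## 3. Train the model
This is where I hit a fatal error: main.py moves the model to the GPU with `.cuda()`, and the virtual machine has no NVIDIA driver.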
[[email protected] PyTorchText-master]# python main.py main --max_epoch=5 --plot_every=100 --env='MultiCNNText' --weight=1 --model='MultiCNNTextBNDeep' --batch-size=64 --lr=0.001 --lr2=0.000 --lr_decay=0.8 --decay_every=10000 --title-dim=250 --content-dim=250 --weight-decay=0 --type_='word' --debug-file='/tmp/debug' --linear-hidden-size=2000 --zhuge=True --augument=False
Traceback (most recent call last):
File "main.py", line 158, in <module>
fire.Fire()
File "/usr/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "main.py", line 74, in main
model = getattr(models,opt.model)(opt).cuda()
File "/usr/lib64/python2.7/site-packages/torch/nn/modules/module.py", line 147, in cuda
return self._apply(lambda t: t.cuda(device_id))
File "/usr/lib64/python2.7/site-packages/torch/nn/modules/module.py", line 118, in _apply
module._apply(fn)
File "/usr/lib64/python2.7/site-packages/torch/nn/modules/module.py", line 124, in _apply
param.data = fn(param.data)
File "/usr/lib64/python2.7/site-packages/torch/nn/modules/module.py", line 147, in <lambda>
return self._apply(lambda t: t.cuda(device_id))
File "/usr/lib64/python2.7/site-packages/torch/_utils.py", line 65, in _cuda
return new_type(self.size()).copy_(self, async)
File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 272, in __new__
_lazy_init()
File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 84, in _lazy_init
_check_driver()
File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 58, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
[[email protected] PyTorchText-master]#
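The traceback shows the failure at `.cuda()` on line 74 of `main.py`. A common way to make that call conditional is to ask PyTorch whether CUDA is usable first. The sketch below uses `torch.cuda.is_available()`, which is a real PyTorch function; the two helper names are mine. Other parts of the project may also call `.cuda()` on tensors, so this guard alone is probably not enough to train on a CPU-only machine, and training at this scale on CPU would be impractical anyway.

```python
def cuda_available():
    """Return True only if PyTorch is installed and can use a GPU."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def to_device(model):
    """Move the model to the GPU when possible, else leave it on the CPU."""
    return model.cuda() if cuda_available() else model


# Hypothetical replacement for line 74 of main.py:
# model = to_device(getattr(models, opt.model)(opt))
```

On this virtual machine `cuda_available()` returns `False`, which confirms the conclusion stated at the top of this post: the solution needs a Linux machine with an NVIDIA GPU.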