Multithreaded Scraping of ZCOOL (zcool.com.cn) Images with Python

程序员文章站 2022-03-20 16:56:16

Download, at high speed, all the photos, illustrations, and other images that a designer or user has uploaded to ZCOOL (https://www.zcool.com.cn/).

Project repository: https://github.com/lonsty/scraper

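Features:

Fast downloads: images are downloaded concurrently on multiple threads, and the number of threads is configurable
Retry on failure: failed downloads are retried; with enough retries, there is no image that won't come down eventually (^o^)/
Incremental downloads: when a designer/user uploads new work, just run the program again o(∩_∩)o yep!
Proxy support: downloads can be routed through a configured proxy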
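
The first two features, concurrent downloads and retries, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's actual code: the names download_all and fetch_with_retry are hypothetical, and fetch stands for any callable that takes a URL and returns the image bytes (or raises on failure). The defaults of 20 workers and 3 retries mirror the defaults shown in the help output.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_with_retry(fetch, url, retries=3):
    """Call fetch(url); on failure, try again up to `retries` more times."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real crawler would catch narrower errors
            last_error = exc
    raise last_error


def download_all(urls, fetch, max_workers=20, retries=3):
    """Fetch every URL on a thread pool; return (results, failures) dicts."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so outcomes can be reported per image.
        futures = {
            pool.submit(fetch_with_retry, fetch, url, retries): url
            for url in urls
        }
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Retries exhausted: record the failure instead of aborting.
                failures[url] = exc
    return results, failures
```

Recording failures instead of raising lets the remaining downloads finish, and the failed URLs could then be written to a file and retried in a later run.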

Environment:

Python 3.6 or later

1. Quick start

1) Clone the project locally

git clone https://github.com/lonsty/scraper

2) Install the dependencies

cd scraper
pip install -r requirements.txt

3) Basic usage

Download all images of the user <username> into the directory <path>:

python crawler.py -u <username> -d <path>
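
Running the same command again later picks up only what is new (the incremental download feature). One way this could work, sketched here as an assumption rather than taken from the project, is to skip every image whose target file already exists unless overriding is requested. The function name pending_downloads and the rule of naming each file after the last segment of its URL are hypothetical.

```python
from pathlib import Path


def pending_downloads(image_urls, directory, override=False):
    """Return (url, target_path) pairs that still need to be downloaded."""
    directory = Path(directory)
    pending = []
    for url in image_urls:
        # Hypothetical naming rule: drop the query string, keep the last segment.
        filename = url.split("?", 1)[0].rstrip("/").rsplit("/", 1)[-1]
        target = directory / filename
        # Existing files are skipped unless the caller asks to override them.
        if override or not target.exists():
            pending.append((url, target))
    return pending
```

With override left at False, a second run over the same user only has to download the files that are not yet on disk.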

[Figure: screenshot of a run]

[Figure: crawl results]

2. Usage help

Show all options:

python crawler.py --help

Usage: crawler.py [OPTIONS]

  Use multi-threaded to download images from https://www.zcool.com.cn in
  bulk by username or ID.

Options:
  -i, --id TEXT              User id.
  -u, --username TEXT        User name.
  -d, --directory TEXT       Directory to save images.
  -p, --max-pages INTEGER    Maximum pages to parse.
  -t, --max-topics INTEGER   Maximum topics per page to parse.
  -w, --max-workers INTEGER  Maximum thread workers.  [default: 20]
  -R, --retries INTEGER      Repeat download for failed images.  [default: 3]
  -r, --redownload TEXT      Redownload images from failed records.
  -o, --override             Override existing files.  [default: False]
  --proxies TEXT             Use proxies to access websites.
                             Example:
                             '{"http": "user:passwd@www.example.com:port",
                             "https": "user:passwd@www.example.com:port"}'
  --help                     Show this message and exit.
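
The --proxies option takes a JSON string, as in the example above. The sketch below shows one plausible way to turn that string into a dictionary keyed by scheme, which is the shape that HTTP libraries such as requests accept for their proxies argument. The function name parse_proxies and the validation rules are assumptions for illustration, not the project's implementation.

```python
import json


def parse_proxies(text):
    """Turn the --proxies JSON string into a scheme -> proxy address dict."""
    if not text:
        return None  # no proxy configured
    proxies = json.loads(text)
    if not isinstance(proxies, dict):
        raise ValueError("--proxies must be a JSON object")
    for scheme, address in proxies.items():
        # Only the two schemes from the help text are accepted in this sketch.
        if scheme not in ("http", "https"):
            raise ValueError(f"unsupported scheme: {scheme!r}")
        if not isinstance(address, str) or not address:
            raise ValueError(f"invalid proxy address for {scheme!r}")
    return proxies
```

Validating the string once at startup gives a clear error message for a malformed value before any download thread starts.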

3. Changelog

version 0.1.0 (2019.09.09)

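Main features:
- Fast downloads: images are downloaded concurrently on multiple threads, and the number of threads is configurable
- Retry on failure: failed downloads are retried; with enough retries, there is no image that won't come down eventually (^o^)/
- Incremental downloads: when a designer/user uploads new work, just run the program again o(∩_∩)o yep!
- Proxy support: downloads can be routed through a configured proxy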
