Scrapy at a Glance: A Preview

程序员文章站 2022-03-16 22:58:24
1. Install Scrapy.

2. Create the crawler project: scrapy startproject test_scrapy

3. Create a file named quotes_spider.py.
4. Copy the following code into quotes_spider.py:
import scrapy  # import the Scrapy package


# Define the QuotesSpider class
class QuotesSpider(scrapy.Spider):
    name = "quotes"
    # URLs the crawl starts from
    start_urls = [
        'http://quotes.toscrape.com/tag/humor/',
    ]

    def parse(self, response):  # parsing callback
        for quote in response.css('div.quote'):  # every div with class="quote"
            # yield the scraped fields as a dict
            yield {
                'text': quote.css('span.text::text').extract_first(),
                'author': quote.xpath('span/small/text()').extract_first(),
            }

        # link to the next page, if there is one
        next_page = response.css('li.next a::attr("href")').extract_first()
        if next_page is not None:
            yield response.follow(next_page, self.parse)
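
The last lines of parse pass the extracted href straight to response.follow, which accepts relative links and resolves them against the URL of the page being parsed. The sketch below imitates that resolution step with the standard library's urljoin rather than Scrapy itself; the href values and the resolve_next helper are hypothetical and are only there to illustrate the idea.

```python
from urllib.parse import urljoin

# URL of the page currently being parsed (the entry in start_urls).
current_url = "http://quotes.toscrape.com/tag/humor/"

# Hypothetical href values, as the 'li.next a' selector might return them.
# None stands for a page that has no "next" link.
candidate_hrefs = ["/tag/humor/page/2/", "page/2/", None]


def resolve_next(base_url, href):
    """Return the absolute URL for href, or None when there is no next page."""
    if href is None:
        return None
    return urljoin(base_url, href)


resolved = [resolve_next(current_url, href) for href in candidate_hrefs]
for href, url in zip(candidate_hrefs, resolved):
    print(repr(href), "->", url)
```

When the href is None the spider yields no further request, so the crawl ends once every scheduled page has been parsed.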
5. Run cd test_scrapy to change into the directory that contains quotes_spider.py.
6. Run the command scrapy runspider quotes_spider.py -o quotes.json
A quotes.json file now appears in the directory.
Opening quotes.json shows:
[
{"text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d", "author": "Jane Austen"},
{"text": "\u201cA day without sunshine is like, you know, night.\u201d", "author": "Steve Martin"},
{"text": "\u201cAnyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.\u201d", "author": "Garrison Keillor"},
{"text": "\u201cBeauty is in the eye of the beholder and it may be necessary from time to time to give a stupid or misinformed beholder a black eye.\u201d", "author": "Jim Henson"},
{"text": "\u201cAll you need is love. But a little chocolate now and then doesn't hurt.\u201d", "author": "Charles M. Schulz"},
{"text": "\u201cRemember, we're madly in love, so it's all right to kiss me anytime you feel like it.\u201d", "author": "Suzanne Collins"},
{"text": "\u201cSome people never go crazy. What truly horrible lives they must lead.\u201d", "author": "Charles Bukowski"},
{"text": "\u201cThe trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it.\u201d", "author": "Terry Pratchett"},
{"text": "\u201cThink left and think right and think low and think high. Oh, the thinks you can think up if only you try!\u201d", "author": "Dr. Seuss"},
{"text": "\u201cThe reason I talk to myself is because I\u2019m the only one whose answers I accept.\u201d", "author": "George Carlin"},
{"text": "\u201cI am free of all prejudice. I hate everyone equally. \u201d", "author": "W.C. Fields"},
{"text": "\u201cA lady's imagination is very rapid; it jumps from admiration to love, from love to matrimony in a moment.\u201d", "author": "Jane Austen"}
]
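
The \u201c and \u201d sequences in the file are ordinary JSON escapes for curly quotation marks, so any JSON parser turns them back into the real characters. As a minimal sketch, the snippet below parses three records in the same shape as the output above and strips the surrounding quotation marks; the raw string stands in for the file contents, and the variable names are illustrative only.

```python
import json

# Stand-in for the contents of quotes.json. With the real file one would
# use json.load on an open file handle instead of json.loads on a string.
raw = r'''[
{"text": "\u201cA day without sunshine is like, you know, night.\u201d", "author": "Steve Martin"},
{"text": "\u201cSome people never go crazy. What truly horrible lives they must lead.\u201d", "author": "Charles Bukowski"},
{"text": "\u201cI am free of all prejudice. I hate everyone equally. \u201d", "author": "W.C. Fields"}
]'''

quotes = json.loads(raw)  # the \uXXXX escapes are decoded at this point

# Remove the curly quotation marks and any stray spaces around each quote.
cleaned = [
    (item["author"], item["text"].strip("\u201c\u201d "))
    for item in quotes
]

for author, text in cleaned:
    print(f"{author}: {text}")
```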