Python Web Scraping for Beginners, Part 1

程序员文章站 2022-04-25 23:10:02

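Catching up on what I have learned over the past few days.
The code below is a snippet I found online; I have forgotten where it came from.
At first it failed every time I ran it. A Baidu search told me the requests module was missing,
and a second search told me how to install it.
The answer: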

pip install requests
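
After installing, it can help to confirm that the package is importable from the same interpreter that runs the script. Below is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration and is not part of requests or pip.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module named `module_name` can be imported."""
    # find_spec returns None when the current interpreter cannot locate the module.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("requests"):
        print("requests is available")
    else:
        print("requests is missing; run: pip install requests")
```

If this reports the module as missing right after a successful install, pip most likely installed it for a different Python installation than the one running the script.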

[Image]
(The image above is borrowed from elsewhere; I could not find my own.)

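import requests

url = "https://item.jd.com/3112072.html"
try:
    # A timeout keeps the script from hanging if the server never answers.
    r = requests.get(url, timeout=10)
    # Raises requests.HTTPError for 4xx/5xx responses; a 200 passes through silently.
    r.raise_for_status()
    # r.encoding is taken from the HTTP headers; apparent_encoding is guessed from
    # the response body (GBK for this page), so use it when decoding r.text.
    r.encoding = r.apparent_encoding
    # Print only the first 1000 characters of the page.
    print(r.text[:1000])
except requests.RequestException:
    # Covers connection errors, timeouts, and the HTTPError raised above.
    print("Fetch failed")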

Output:

>>> 
========= RESTART: C:\Users\Administrator.USER-20190824JP\Desktop\1.py =========
<!DOCTYPE HTML>
<html lang="zh-CN">
<head>
    <!-- shouji -->
    <meta http-equiv="Content-Type" content="text/html; charset=gbk" />
    <title>【全棉时代棉柔巾】全棉时代 棉柔巾一次性洗脸巾纯棉柔巾擦脸巾洁面巾20CM*20CM 100抽*6包【行情 报价 价格 评测】-京东</title>
    <meta name="keywords" content="PurCotton棉柔巾,全棉时代棉柔巾,全棉时代棉柔巾报价,PurCotton棉柔巾报价"/>
    <meta name="description" content="【全棉时代棉柔巾】京东JD.COM提供全棉时代棉柔巾正品行货，并包括PurCotton棉柔巾网购指南，以及全棉时代棉柔巾图片、棉柔巾参数、棉柔巾评论、棉柔巾心得、棉柔巾技巧等信息，网购全棉时代棉柔巾上京东,放心又轻松" />
    <meta name="format-detection" content="telephone=no">
    <meta http-equiv="mobile-agent" content="format=xhtml; url=//item.m.jd.com/product/3112072.html">
    <meta http-equiv="mobile-agent" content="format=html5; url=//item.m.jd.com/product/3112072.html">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <link rel="canonical" href="//item.jd.com/3112072.html"/>
        <link rel="dns-prefetch" href="//misc.360buyimg.com"/>
    <link rel="dns-prefetch" href="//static.360buyimg.com"/>
    <link rel="dns-prefetch" href="//i
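
The `charset=gbk` meta tag in the output shows why the encoding step matters: the page's bytes are GBK, so decoding them with any other codec yields unreadable text. requests guesses `apparent_encoding` with a character-detection library; the sketch below is a simpler stand-in, not what requests does internally. It reads the charset the page declares for itself and decodes with that. The names `sniff_charset` and `decode_html` and the sample bytes are hypothetical, and nothing here touches the network.

```python
import re

# Matches charset=... in the raw bytes of an HTML <meta> tag.
_CHARSET_RE = re.compile(rb"charset=[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE)


def sniff_charset(raw, default="utf-8"):
    """Return the charset declared in the first 1024 bytes of `raw`, or `default`."""
    match = _CHARSET_RE.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else default


def decode_html(raw):
    """Decode raw HTML bytes using the charset the page declares."""
    return raw.decode(sniff_charset(raw), errors="replace")


if __name__ == "__main__":
    # Sample bytes built locally to mimic a GBK-encoded page.
    sample = (
        '<meta http-equiv="Content-Type" content="text/html; charset=gbk" />'
        "<title>京东</title>"
    ).encode("gbk")
    print(sniff_charset(sample))  # gbk
    print(decode_html(sample))
    # Decoding the same bytes with the wrong codec gives garbled characters.
    print(sample.decode("iso-8859-1"))
```

With requests, the equivalent input would be `r.content` (the raw bytes) rather than `r.text`.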
