Bug log from writing a Java web crawler
2022-07-06 15:18:08
Java crawler, day 1: bug log
Note: build the crawler with the Maven that is bundled with IDEA.
Bug 1:
Cannot resolve symbol 'response'
Cause: response is declared inside the try block, so it is out of scope in the finally block:
try {
    CloseableHttpResponse response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    try {
        response.close(); // Cannot resolve symbol 'response'
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        httpClient.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Fix: declare response before the try block so that it is visible in finally, and check it for null before closing it:
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // response is still null if execute() threw, so guard the close
    if (response != null) {
        try {
            response.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    try {
        httpClient.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
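Since CloseableHttpClient and CloseableHttpResponse both implement java.io.Closeable, try-with-resources is another way to avoid the scoping problem altogether. The sketch below is only an illustration: FakeResource is a hypothetical stand-in, not an HttpClient class, so that the example runs without any dependency. It shows that resources declared in the try header are closed automatically, in reverse order of declaration.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

public class TryWithResourcesSketch {

    // Hypothetical stand-in for a Closeable such as CloseableHttpResponse.
    static class FakeResource implements Closeable {
        private final String name;
        private final List<String> log;

        FakeResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Opens two resources, uses them, and returns the recorded event order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (FakeResource client = new FakeResource("client", log);
             FakeResource response = new FakeResource("response", log)) {
            log.add("use response");
        } // both resources are closed here, no finally block needed
        return log;
    }

    public static void main(String[] args) {
        // prints [open client, open response, use response, close response, close client]
        System.out.println(run());
    }
}
```

Applied to the crawler, the same shape would declare the client and the result of httpClient.execute(httpGet) in the try header, keeping only the catch (IOException e) clause.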
Bug 2:
org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at cn.itcast.crawler.test.HttpGetTest.main(HttpGetTest.java:21)
Caused by: org.apache.http.ProtocolException: Target host is not specified
at org.apache.http.impl.conn.DefaultRoutePlanner.determineRoute(DefaultRoutePlanner.java:71)
at org.apache.http.impl.client.InternalHttpClient.determineRoute(InternalHttpClient.java:125)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 3 more
Exception in thread "main" java.lang.NullPointerException
at cn.itcast.crawler.test.HttpGetTest.main(HttpGetTest.java:31)
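Cause: the URL passed to HttpGet is an empty string, so HttpClient cannot determine the target host. Because execute() throws, response is never assigned; the NullPointerException at the end of the trace most likely comes from calling response.close() in the finally block while response is still null.
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("");
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
}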
Fix: pass a complete URL, including the scheme:
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://www.itcast.cn");
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
}
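To catch this kind of mistake before the request is sent, the URL can be checked with java.net.URI. The following is a minimal sketch using only the standard library; the helper name hasTargetHost is made up for illustration and is not part of HttpClient. It treats a URL as usable only when it has a scheme and a host.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlCheckSketch {

    // Hypothetical helper: true only if the URL has both a scheme and a host.
    static boolean hasTargetHost(String url) {
        try {
            URI uri = new URI(url);
            return uri.isAbsolute() && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }

    public static void main(String[] args) {
        System.out.println(hasTargetHost(""));                     // false
        System.out.println(hasTargetHost("www.itcast.cn"));        // false, no scheme
        System.out.println(hasTargetHost("http://www.itcast.cn")); // true
    }
}
```

Note that "www.itcast.cn" without a scheme is parsed as a relative path, so it has no host either.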
Bug 3: no log output appears, even with the test scope commented out.
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAddCookies).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
</dependency>
<!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12 -->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.25</version>
    <!-- <scope>test</scope>-->
</dependency>
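Fix:
With log4j 1.2, output has to be configured in a log4j.properties file on the classpath (in a Maven project, typically under src/main/resources):
# Global logging configuration. DEBUG is meant for debugging; use INFO or higher in production.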
log4j.rootLogger=DEBUG, stdout
# Console output...
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern= %-d{yyyy-MM-dd HH:mm:ss} [ %t:%r ] - [ %p ] %m%n
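The configuration above names one appender (stdout) on the log4j.rootLogger line and then defines it. A quick way to sanity-check that relationship is sketched below with java.util.Properties. This is a simplified check written for illustration, not log4j's own parser, and the method name undefinedAppenders is hypothetical. It assumes the usual rootLogger form of a level followed by comma-separated appender names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class Log4jConfigCheckSketch {

    // Returns appender names listed on log4j.rootLogger that have no
    // matching log4j.appender.<name> entry in the same configuration.
    static List<String> undefinedAppenders(String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // First element is the level (e.g. DEBUG); the rest are appender names.
        String[] parts = props.getProperty("log4j.rootLogger", "").split(",");
        List<String> missing = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            String name = parts[i].trim();
            if (!name.isEmpty() && !props.containsKey("log4j.appender." + name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String complete =
            "log4j.rootLogger=DEBUG, stdout\n"
            + "log4j.appender.stdout=org.apache.log4j.ConsoleAppender\n";
        String incomplete = "log4j.rootLogger=DEBUG, stdout\n";

        System.out.println(undefinedAppenders(complete));   // []
        System.out.println(undefinedAppenders(incomplete)); // [stdout]
    }
}
```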
Full source code:
package cn.itcast.crawler.test;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

public class HttpGetTest {
    public static void main(String[] args) {
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpGet httpGet = new HttpGet("http://www.itcast.cn");
        CloseableHttpResponse response = null;
        try {
            response = httpClient.execute(httpGet);
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "utf-8");
                System.out.println(content.length());
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // response is still null if execute() threw, so guard the close
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            try {
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Original article: https://blog.csdn.net/m0_48333563/article/details/109564620