Using HtmlUnit to Get a Page's Cookies
程序员文章站
2022-05-05 14:51:48
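Getting cookies with HtmlUnit
HtmlUnit is an open-source Java tool for analyzing web pages: once it has loaded a page, you can use it to inspect the page's content. It simulates a running browser and is often described as an open-source Java implementation of a browser. It has no user interface (it is headless), runs quickly, and is commonly used alongside JUnit in tests. It has many uses; this article only briefly shows how to read the cookies a page returns.
HtmlUnit's official site: http://htmlunit.sourceforge.net/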
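// Imports this method needs (HtmlUnit 2.x package names).
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.ImmediateRefreshHandler;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.util.Cookie;

// Place this method inside a class that declares a `logger` field
// (assumed here to be an SLF4J- or Log4j-style logger).
public static synchronized String getCookie(String url) {
    StringBuilder cks = new StringBuilder();
    long time = System.currentTimeMillis();
    // try-with-resources closes the client even if loading the page throws.
    try (WebClient webclient = new WebClient(BrowserVersion.FIREFOX_52)) {
        webclient.getOptions().setJavaScriptEnabled(true);
        // With this set to true, a script error on the page aborts the fetch.
        webclient.getOptions().setThrowExceptionOnScriptError(true);
        webclient.getOptions().setCssEnabled(false);
        webclient.getOptions().setRedirectEnabled(true);
        webclient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        webclient.getOptions().setUseInsecureSSL(true);
        webclient.getOptions().setTimeout(600 * 1000);
        webclient.setJavaScriptTimeout(600 * 1000);
        webclient.setRefreshHandler(new ImmediateRefreshHandler());
        webclient.setAjaxController(new NicelyResynchronizingAjaxController());
        // Start from a clean state.
        webclient.getCookieManager().clearCookies();
        webclient.getCache().clear();

        webclient.getPage(url);
        // Wait for background scripts only after the page has been loaded;
        // calling this before getPage has nothing to wait for.
        webclient.waitForBackgroundJavaScript(60 * 1000);

        // The returned cookies are here; they may be useful for the next request.
        for (Cookie c : webclient.getCookieManager().getCookies()) {
            cks.append(c.getName()).append('=').append(c.getValue()).append(';');
        }

        long elapsed = System.currentTimeMillis() - time;
        if (cks.length() > 0) {
            logger.info("Fetched cookies in " + elapsed + " ms");
        } else {
            logger.info("******* No cookies returned, elapsed " + elapsed + " ms ******");
        }
    } catch (Exception e) {
        logger.error("Failed to get cookies via HtmlUnit", e);
    }
    return cks.toString();
}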
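The code comment above notes that the returned cookies may come in handy for the next request. As a rough sketch of what that could look like using only the JDK, the helper below turns the `name=value;name=value;` string returned by `getCookie` into the value of a `Cookie` request header. The class name `CookieHeaderSketch`, its two methods, and the sample cookie values are hypothetical and only for illustration; they are not part of HtmlUnit.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of HtmlUnit: converts the string built by
// getCookie ("name=value;name=value;") into a Cookie request header value.
final class CookieHeaderSketch {

    // Splits the string on ';' and keeps each name=value pair in order.
    // Only the first '=' separates name from value, so values may contain '='.
    public static Map<String, String> parse(String cks) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : cks.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed fragments
            }
            cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return cookies;
    }

    // Joins the pairs as "name=value; name=value", the usual Cookie header layout.
    public static String toHeaderValue(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String cks = "JSESSIONID=abc123;token=x=y;"; // made-up sample value
        System.out.println(toHeaderValue(parse(cks)));
        // prints: JSESSIONID=abc123; token=x=y
    }
}
```

The resulting string could then be passed to something like `HttpURLConnection.setRequestProperty("Cookie", value)` on the follow-up request. Whether the server accepts cookies replayed this way depends on the site.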