A Java Example: Reading Web Page Content and Downloading Images
程序员文章站
2024-02-28 12:55:10
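Many people do not know where to start when they first encounter data scraping, and beginners in particular can feel lost. I am sharing what I have learned here in the hope that we can exchange techniques; if anything is lacking, corrections are welcome. I wrote this article so that we can all improve together. I believe there is no hierarchy among techniques, only complementary strengths, and only by sharing do we help each other grow.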
Sample code:
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GetContentPicture {

    // Download the image at httpUrl and save it under ./pic/
    public void getHtmlPicture(String httpUrl) {
        System.out.println("Fetching image: " + httpUrl);
        // The file name is everything after the last "/"
        String fileName = httpUrl.substring(httpUrl.lastIndexOf("/") + 1);
        File dir = new File("./pic/");
        dir.mkdirs(); // make sure the target directory exists
        // try-with-resources closes both streams even if an exception is thrown
        try (InputStream in = new BufferedInputStream(new URL(httpUrl).openStream());
             OutputStream out = new FileOutputStream(new File(dir, fileName))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            System.out.println("Image saved: " + fileName);
        } catch (IOException e) {
            // MalformedURLException and FileNotFoundException are both IOExceptions
            e.printStackTrace();
        }
    }

    // Read the page at httpUrl and return its HTML as a single string
    public String getHtmlCode(String httpUrl) throws IOException {
        StringBuilder content = new StringBuilder();
        URL url = new URL(httpUrl); // create the URL object
        // openStream() returns an input stream, which is wrapped in a BufferedReader
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()))) {
            String line;
            // read line by line until the end of the stream
            while ((line = reader.readLine()) != null) {
                content.append(line);
            }
        }
        return content.toString();
    }

    // Find image references in the page at url and download each one
    public void get(String url) throws IOException {
        // Relative paths such as src="img/logo.png"; group 3 is the path without a leading "/"
        String searchImgReg = "(?i)(src|background)=('|\")/?(([\\w-]+/)*([\\w-]+\\.(jpg|png|gif)))('|\")";
        // Absolute addresses such as src="http://host/img/logo.png"; group 3 is the full URL
        String searchImgReg2 = "(?i)(src|background)=('|\")(http://([\\w-]+\\.)+[\\w-]+(:[0-9]+)*(/[\\w-]+)*(/[\\w-]+\\.(jpg|png|gif)))('|\")";
        String content = this.getHtmlCode(url);
        System.out.println(content);

        Pattern pattern = Pattern.compile(searchImgReg);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            System.out.println(matcher.group(3));
            // url must end with "/" for this concatenation to give a valid address
            this.getHtmlPicture(url + matcher.group(3));
        }

        pattern = Pattern.compile(searchImgReg2);
        matcher = pattern.matcher(content);
        while (matcher.find()) {
            System.out.println(matcher.group(3));
            this.getHtmlPicture(matcher.group(3));
        }
    }

    public static void main(String[] args) throws IOException {
        String url = "http://www.baidu.com/";
        GetContentPicture gcp = new GetContentPicture();
        gcp.get(url);
    }
}
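The sample builds each image address by appending the matched path to the page URL, which only gives a valid address when the page URL ends with a slash. As a possible alternative, the sketch below extracts image references from an HTML string held in memory and resolves them against the page address with java.net.URI.resolve, so relative and absolute references go through the same path. The class name ImageUrlExtractor, the method extract, the simplified regular expression, and the example.com addresses are all assumptions made for illustration; they are not part of the original program.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original sample.
class ImageUrlExtractor {

    // Matches src="..." or background="..." whose value ends in .jpg, .png or .gif.
    // (?i) makes the match case-insensitive; group 1 is the attribute value.
    private static final Pattern IMG_ATTR = Pattern.compile(
            "(?i)(?:src|background)=['\"]([^'\"\\s]+\\.(?:jpg|png|gif))['\"]");

    // Return the absolute address of every image reference found in html,
    // resolved against pageUrl (the address the HTML was loaded from).
    static List<String> extract(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<String> result = new ArrayList<>();
        Matcher m = IMG_ATTR.matcher(html);
        while (m.find()) {
            try {
                // resolve() leaves absolute references unchanged and
                // combines relative ones with the base address
                result.add(base.resolve(m.group(1)).toString());
            } catch (IllegalArgumentException e) {
                // skip attribute values that are not valid URI syntax
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up HTML fragment; no network access is needed
        String html = "<img src=\"/img/logo.png\"><td background='bg/tile.gif'>"
                + "<img SRC=\"http://cdn.example.com/a/photo.JPG\">";
        for (String u : extract("http://example.com/news/index.html", html)) {
            System.out.println(u);
        }
    }
}
```

Each string returned by extract could then be passed to a download method such as getHtmlPicture. Regular expressions remain a rough way to read HTML, so pages with unusual markup may need a real HTML parser instead.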
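If you have any questions, leave a comment or join the discussion in this site's community. Thank you for reading; I hope this helps, and thank you for supporting this site!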