
Parsing an XML File Without a Root Element Using SAX in Java, and Filtering Illegal Characters

程序员文章站 2022-05-28 08:32:37

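Use HttpURLConnection to read an XML file from a remote server. If the file has no root element, insert an opening root tag at the start of the content (after the XML declaration), append the matching closing tag, and strip the characters that are illegal in XML before parsing. The code is as follows: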

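		// Requires java.io.*, java.net.URL, java.net.HttpURLConnection and the JDOM classes
		// SAXBuilder, Document and Element (org.jdom2.* in JDOM 2, org.jdom.* in JDOM 1).
		URL url = new URL(path);
		//URL url = new URL("http://192.168.11.111:8080/xxxx/"+path);
		HttpURLConnection conn = (HttpURLConnection) url.openConnection();
		conn.connect();
		StringBuilder document = new StringBuilder();
		try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
			String line;
			while ((line = reader.readLine()) != null) {
				document.append(line).append('\n'); // readLine() strips the line break, so put it back
			}
		}
		// Insert the opening root tag after the XML declaration if there is one, otherwise at the very start
		int insertAt = document.indexOf("<?xml") == 0 ? document.indexOf("?>") + 2 : 0;
		document.insert(insertAt, "<root>");
		document.append("</root>");
		// Strip control characters that are illegal in XML 1.0 (tab, LF and CR are kept)
		String xml = document.toString().replaceAll("[\\x00-\\x08\\x0b-\\x0c\\x0e-\\x1f]", "");
		SAXBuilder sax = new SAXBuilder();
		Document doc = sax.build(new StringReader(xml));
		Element root = doc.getRootElement();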
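
The wrapping and filtering steps can also be tried without a remote server or JDOM. The following is only a minimal sketch under stated assumptions: it uses the SAX parser bundled with the JDK (javax.xml.parsers) instead of JDOM's SAXBuilder, and the names RootlessXml, wrapAndSanitize and topLevelElements are hypothetical helpers introduced for illustration, not part of the original code. It assumes that an XML declaration, if present, sits at the very start of the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class RootlessXml {

    /** Strips illegal control characters and wraps the content in <root>...</root>. */
    public static String wrapAndSanitize(String raw) {
        // Remove control characters that XML 1.0 forbids (tab, LF and CR are kept).
        String cleaned = raw.replaceAll("[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f]", "");
        // The root tag must go after the XML declaration, if there is one.
        int insertAt = 0;
        if (cleaned.startsWith("<?xml")) {
            int end = cleaned.indexOf("?>");
            if (end >= 0) {
                insertAt = end + 2;
            }
        }
        return cleaned.substring(0, insertAt) + "<root>"
                + cleaned.substring(insertAt) + "</root>";
    }

    /** Parses the wrapped text with SAX and collects the names of the direct children of <root>. */
    public static List<String> topLevelElements(String raw) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (depth == 1) {
                    names.add(qName); // depth 0 is the synthetic <root> element
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(wrapAndSanitize(raw))), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Three sibling elements with no common root, plus one illegal control character (U+0001).
        String raw = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<item id=\"1\">a\u0001b</item>\n<item id=\"2\"/>\n<note/>";
        System.out.println(topLevelElements(raw)); // prints [item, item, note]
    }
}
```

Filtering before wrapping keeps the two steps independent of each other. Checking for the declaration explicitly, rather than searching for the first '>', also lets the same helper handle input that has no XML declaration at all.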