Parsing an XML File Without a Root Element in Java with SAX and Filtering Out Illegal Characters
2022-05-28 08:32:37
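Use HttpURLConnection to read the XML file from a remote server. If the file has no root element, insert an opening root tag after the XML declaration at the start, append the closing tag at the end, and filter out the characters that are illegal in XML before parsing. The code is as follows: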
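// Requires java.io.*, java.net.URL, java.net.HttpURLConnection, and JDOM
// (SAXBuilder, Document, Element: org.jdom2.* in JDOM 2, org.jdom.* in JDOM 1).
URL url = new URL(path);
//URL url = new URL("http://192.168.11.111:8080/xxxx/"+path);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.connect();
StringBuilder document = new StringBuilder();
// try-with-resources closes the reader and the underlying stream
try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // readLine() strips the line terminator; put it back so tokens on adjacent lines do not run together
        document.append(line).append('\n');
    }
} finally {
    conn.disconnect();
}
// Insert the opening <root> tag after the first '>', which is assumed to end the XML declaration (<?xml ... ?>)
document.insert(document.indexOf(">") + 1, "<root>");
document.append("</root>");
// Filter out control characters that are illegal in XML 1.0 (everything below 0x20 except tab, LF and CR)
String xml = document.toString().replaceAll("[\\x00-\\x08\\x0b-\\x0c\\x0e-\\x1f]", "");
SAXBuilder sax = new SAXBuilder();
Document doc = sax.build(new StringReader(xml));
Element root = doc.getRootElement();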
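
The wrap-and-filter step can also be tried on its own, without a network connection. The following is a minimal sketch, not the code above: it swaps JDOM's SAXBuilder for the SAX parser bundled with the JDK so that it runs without extra dependencies, and the class and method names (RootlessXml, stripIllegalChars, wrapInRoot, elementNames) are hypothetical names chosen for illustration. It also assumes that an XML declaration, if present, sits at the very start of the input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class RootlessXml {

    /** Removes control characters that XML 1.0 forbids (all below 0x20 except tab, LF, CR). */
    static String stripIllegalChars(String s) {
        return s.replaceAll("[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f]", "");
    }

    /** Wraps the content in <root>...</root>, keeping a leading XML declaration in front. */
    static String wrapInRoot(String xml) {
        int insertAt = 0;
        int declEnd = xml.indexOf("?>");
        if (xml.startsWith("<?xml") && declEnd >= 0) {
            insertAt = declEnd + 2; // position right after the declaration
        }
        return xml.substring(0, insertAt) + "<root>" + xml.substring(insertAt) + "</root>";
    }

    /** Parses the string with the JDK SAX parser and returns element names in document order. */
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Two sibling top-level elements (no root) and one illegal control character (0x01)
        String raw = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<item>a\u0001b</item>\n<item>c</item>";
        String fixed = wrapInRoot(stripIllegalChars(raw));
        System.out.println(elementNames(fixed)); // prints [root, item, item]
    }
}
```

Checking for the "<?xml" prefix instead of searching for the first '>' means an input without a declaration is still wrapped correctly, because the opening tag then goes at position 0 and not inside the first element.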