Parsing XML with dom4j's XPath support
2022-03-03 16:13:48
books.xml:
<?xml version="1.0" encoding="UTF-8"?>
<books>
<!--This is a test for dom4j, jakoes, 2007.7.19-->
<book show="yes" url="lucene.net">
<title id="456">Lucene Studing</title>
</book>
<book show="yes" url="dom4j.com">
<title id="123">Dom4j Tutorials</title>
</book>
<book show="no" url="spring.org">
<title id="789">Spring in Action</title>
</book>
<owner>O'Reilly</owner>
</books>
Now let's parse it with dom4j's XPath support:
Excerpt from ParseXML.java:
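import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

public void parseBooks() {
    SAXReader reader = new SAXReader();
    try {
        // Load the document, then take the root <books> element as the context node.
        Document doc = reader.read("books.xml");
        Node root = doc.selectSingleNode("/books");
        // Relative XPath: <book> children of <books> whose url attribute is dom4j.com.
        List list = root.selectNodes("book[@url='dom4j.com']");
        for (Object o : list) {
            Element e = (Element) o;
            String show = e.attributeValue("show");
            System.out.println("show = " + show);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}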
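Document doc = reader.read("books.xml"); loads the XML document. At this point you can call doc.asXML() to inspect it: it returns the whole XML document as a string, which you can then print.
Node root = doc.selectSingleNode("/books"); selects the books node together with everything under it. In this example that is the document's root element, so it covers the whole document.
Of course, we can also select one particular node under /books, such as a book node:
Node root = doc.selectSingleNode("/books/book");
or: Node root = doc.selectSingleNode("/books/*");
Note: if there are several book nodes, only the first one is returned.
root.asXML() then returns: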
<book show="yes" url="lucene.net">
<title id="456">Lucene Studing</title>
</book>
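The "first match only" behaviour can be cross-checked without dom4j on the classpath. The sketch below is an assumption-laden stand-in: it uses the JDK's built-in javax.xml.xpath API rather than dom4j's selectSingleNode, and the class and method names (FirstMatchDemo, firstMatchUrl) are made up for illustration. The XPath expressions are the ones used above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class FirstMatchDemo {
    // books.xml from above, inlined so the sketch needs no file on disk.
    static final String BOOKS_XML =
          "<books>"
        + "<!--This is a test for dom4j, jakoes, 2007.7.19-->"
        + "<book show=\"yes\" url=\"lucene.net\"><title id=\"456\">Lucene Studing</title></book>"
        + "<book show=\"yes\" url=\"dom4j.com\"><title id=\"123\">Dom4j Tutorials</title></book>"
        + "<book show=\"no\" url=\"spring.org\"><title id=\"789\">Spring in Action</title></book>"
        + "<owner>O'Reilly</owner>"
        + "</books>";

    /** Returns the url attribute of the first element matched, or null if nothing matches. */
    static String firstMatchUrl(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(BOOKS_XML)));
            // XPathConstants.NODE asks for a single node: the first match in document order.
            Element first = (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            return first == null ? null : first.getAttribute("url");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatchUrl("/books/book")); // first <book> only
        System.out.println(firstMatchUrl("/books/*"));    // * matches elements, so the comment is skipped
    }
}
```

Both expressions land on the lucene.net book, even though three book elements match "/books/book".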
With that much loaded, how do I get exactly the node I want? Here is how:
List list = root.selectNodes("book[@url='dom4j.com']");
This selects the book nodes under books whose url attribute is dom4j.com.
Why receive the result in a List? If two book nodes both had url set to dom4j.com, both would be collected into the list.
To read all book nodes under books:
List list = root.selectNodes("book");
To read a title node under a book node under books:
List list2 = root.selectNodes("book[@url='dom4j.com']/title[@id='123']");
Note the format of the selectNodes() argument:
nodeName[@attributeName='attributeValue'], for example: book[@url='dom4j.com']
For nested nodes, separate the steps with "/", for example: book[@url='dom4j.com']/title[@id='123']
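To see how the predicate and the "/" steps combine, here is a small sketch that evaluates relative expressions against the books node and collects the text of every match. It again uses the JDK's javax.xml.xpath API as a dependency-free stand-in for dom4j's selectNodes. PredicatePathDemo and titles are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicatePathDemo {
    // Same data as books.xml, inlined.
    static final String BOOKS_XML =
          "<books>"
        + "<book show=\"yes\" url=\"lucene.net\"><title id=\"456\">Lucene Studing</title></book>"
        + "<book show=\"yes\" url=\"dom4j.com\"><title id=\"123\">Dom4j Tutorials</title></book>"
        + "<book show=\"no\" url=\"spring.org\"><title id=\"789\">Spring in Action</title></book>"
        + "<owner>O'Reilly</owner>"
        + "</books>";

    /** Evaluates an expression relative to <books> and returns the text of each matched node. */
    static List<String> titles(String relativeExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(BOOKS_XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The context node plays the role of "root" in the article's snippets.
            Node books = (Node) xpath.evaluate("/books", doc, XPathConstants.NODE);
            NodeList hits = (NodeList) xpath.evaluate(relativeExpression, books, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("book[@url='dom4j.com']/title[@id='123']"));
        System.out.println(titles("book/title"));
    }
}
```

An expression that matches nothing yields an empty list rather than null, which is one reason a list is a convenient return type for multi-node queries.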
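Finally, read the contents collected in the List. Each item can be used as a Node or cast to an Element.
attributeValue("attributeName") returns the value of that attribute on the element.
getText() returns the node's text content.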
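As a rough illustration of reading attributes and text from each matched element, the sketch below walks every book and builds one summary line per book. It uses the JDK's W3C DOM calls (getAttribute, getTextContent) in place of dom4j's attributeValue and getText. ReadBookDemo and describeBooks are hypothetical names. One difference worth keeping in mind: W3C getAttribute returns an empty string for a missing attribute, whereas dom4j's attributeValue returns null.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReadBookDemo {
    // Same data as books.xml, inlined.
    static final String BOOKS_XML =
          "<books>"
        + "<book show=\"yes\" url=\"lucene.net\"><title id=\"456\">Lucene Studing</title></book>"
        + "<book show=\"yes\" url=\"dom4j.com\"><title id=\"123\">Dom4j Tutorials</title></book>"
        + "<book show=\"no\" url=\"spring.org\"><title id=\"789\">Spring in Action</title></book>"
        + "</books>";

    /** Returns one "url show=... title=..." line per <book>, in document order. */
    static List<String> describeBooks() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(BOOKS_XML)));
            NodeList books = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/books/book", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                // Attribute values come from the element itself.
                String url = book.getAttribute("url");
                String show = book.getAttribute("show");
                // Text content comes from the child <title> element.
                String title = book.getElementsByTagName("title").item(0).getTextContent();
                lines.add(url + " show=" + show + " title=" + title);
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        describeBooks().forEach(System.out::println);
    }
}
```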
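See also:
http://newbutton.blog.163.com/blog/static/440539462007919115928634/
Next is a slightly more complex example: a request is sent, the server returns a string of XML, and the parse method parses it into a TermInfo instance.
The string returned by the server looks like this: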
<?xml version=\"1.0\" encoding=\"utf-8\"?><message><head><messageId>20100707163000062</messageId><result>0000</result><encryptionType>0</encryptionType><md>a4820454be3b0bcc42cb62884a8ef44e</md></head><body><termInfo winTermNo=\"10077\" preTermNo=\"\"><lotteryResult>0607091724|0212</lotteryResult><missCount >10,29,3,4,5,0,0,12,0,2,4,1,12,7,11,1,0,8,1,3,11,11,2,0,5,6,2,7,8,4,14,3,17,9,1|4,0,5,28,2,9,2,8,8,5,4,0</missCount ><limitNumber ></limitNumber ><salesVolume >51385724</salesVolume ><jackpot >68192388.87</jackpot ><winResult><win id=\"1\"><winCount>0</winCount><winMoney>0</winMoney><winAddCount>0</winAddCount><winAddMoney>0</winAddMoney></win><win id=\"2\"><winCount>3</winCount><winMoney>1018831</winMoney><winAddCount>1</winAddCount><winAddMoney>611298</winAddMoney></win><win id=\"3\"><winCount>38</winCount><winMoney>22364</winMoney><winAddCount>5</winAddCount><winAddMoney>13418</winAddMoney></win><win id=\"4\"><winCount>40</winCount><winMoney>3000</winMoney><winAddCount>9</winAddCount><winAddMoney>1500</winAddMoney></win><win id=\"5\"><winCount>1473</winCount><winMoney>600</winMoney><winAddCount>464</winAddCount><winAddMoney>300</winAddMoney></win><win id=\"6\"><winCount>6540</winCount><winMoney>100</winMoney><winAddCount>1573</winAddCount><winAddMoney>50</winAddMoney></win><win id=\"7\"><winCount>69490</winCount><winMoney>10</winMoney><winAddCount>18273</winAddCount><winAddMoney>5</winAddMoney></win><win id=\"8\"><winCount>832924</winCount><winMoney>5</winMoney></win></winResult><term termNo=\"10078\"><termStatus>1</termStatus><winStatus>1</winStatus><saleStatus>1</saleStatus><startTime></startTime><deadLine>20100707193000</deadLine><deadLine2>20100707190000</deadLine2><winLine>20100707220000</winLine><startTime2></startTime2><deadLine3>20100707200000</deadLine3><winLine2>20100707220000</winLine2><changeLine>20100905220000</changeLine><reserve></reserve></term></termInfo></body></message>
The parsing code:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import com.tlt.app.util.Constants;
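/**
* Information about the lottery term currently on sale.
* @author Administrator
*
*/
public class TermInfo extends BaseModel{
private Head head;
/* Optional: number of the most recently drawn term */
private String winTermNo;
/* Optional: number of the most recent pre-sale term */
private String preTermNo;
/* Most recent draw result */
private String lotteryResult;
/* Miss-count information */
private String missCount;
/* Restricted-number information */
private String limitNumber;
/* Sales volume of the previous term */
private String salesVolume;
/* Jackpot rollover */
private String jackpot;
/* Prize-level results of the most recent draw */
private WinResult winResult;
/* Information about the term currently on sale */
private Term term;
/* Reserved */
private String reserve;
/* Error message */
private String errorMsg;
/* Teams (game information) */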
private List<GameInfo> gameInfo;
public List<GameInfo> getGameInfo() {
return gameInfo;
}
public void setGameInfo(List<GameInfo> gameInfo) {
this.gameInfo = gameInfo;
}
public String getWinTermNo() {
return winTermNo;
}
public void setWinTermNo(String winTermNo) {
this.winTermNo = winTermNo;
}
public String getPreTermNo() {
return preTermNo;
}
public void setPreTermNo(String preTermNo) {
this.preTermNo = preTermNo;
}
public String getErrorMsg() {
return errorMsg;
}
public void setErrorMsg(String errorMsg) {
this.errorMsg = errorMsg;
}
public Head getHead() {
return head;
}
public void setHead(Head head) {
this.head = head;
}
public String getLotteryResult() {
return lotteryResult;
}
public void setLotteryResult(String lotteryResult) {
this.lotteryResult = lotteryResult;
}
public String getMissCount() {
return missCount;
}
public void setMissCount(String missCount) {
this.missCount = missCount;
}
public String getLimitNumber() {
return limitNumber;
}
public void setLimitNumber(String limitNumber) {
this.limitNumber = limitNumber;
}
public String getSalesVolume() {
return salesVolume;
}
public void setSalesVolume(String salesVolume) {
this.salesVolume = salesVolume;
}
public String getJackpot() {
return jackpot;
}
public void setJackpot(String jackpot) {
this.jackpot = jackpot;
}
public WinResult getWinResult() {
return winResult;
}
public void setWinResult(WinResult winResult) {
this.winResult = winResult;
}
public Term getTerm() {
return term;
}
public void setTerm(Term term) {
this.term = term;
}
public String getReserve() {
return reserve;
}
public void setReserve(String reserve) {
this.reserve = reserve;
}
@Override
public boolean parse(String xmlString) {
// TODO Auto-generated method stub
try {
Document document = DocumentHelper.parseText(xmlString);
Head head=new Head();
head.setMessageId(document.selectSingleNode("//message/head/messageId").getText());
head.setResult(document.selectSingleNode("//message/head/result").getText());
head.setEncryptionType(document.selectSingleNode("//message/head/encryptionType").getText());
head.setMd(document.selectSingleNode("//message/head/md").getText());
setHead(head);
// If the server reports a failure, there is no need to parse any further
if(!head.getResult().equals(Constants.SUCCESS)){
setErrorMsg(document.selectSingleNode("//message/head/result").getText()+":"+document.selectSingleNode("//message/body/errorMsg").getText());
return false;
}
setWinTermNo(document.selectSingleNode("//message/body/termInfo/@winTermNo").getText());
setPreTermNo(document.selectSingleNode("//message/body/termInfo/@preTermNo").getText());
setLotteryResult(document.selectSingleNode("//message/body/termInfo/lotteryResult").getText());
setMissCount(document.selectSingleNode("//message/body/termInfo/missCount").getText());
setLimitNumber(document.selectSingleNode("//message/body/termInfo/limitNumber").getText());
setSalesVolume(document.selectSingleNode("//message/body/termInfo/salesVolume").getText());
setJackpot(document.selectSingleNode("//message/body/termInfo/jackpot").getText());
winResult=new WinResult();
Map<String,Win> winMap=new TreeMap<String,Win>();// TreeMap keeps the entries sorted by id (string order)
List list = document.selectNodes("//message/body/termInfo/winResult/win");
for (Iterator iter = list.iterator(); iter.hasNext();) {
Win win=new Win();
Element winEle=(Element)iter.next();
win.setWinCount(winEle.element("winCount").getText());
win.setWinMoney(winEle.element("winMoney").getText());
if(winEle.element("winAddCount")!=null){// the 8th-prize <win> has no winAdd* elements
win.setWinAddCount(winEle.element("winAddCount").getText());
win.setWinAddMoney(winEle.element("winAddMoney").getText());
}else{
win.setWinAddCount("");
win.setWinAddMoney("");
}
String id=winEle.attribute("id").getValue();
win.setId(id);
winMap.put(id,win);
}
winResult.setWin(winMap);
setWinResult(winResult);
Term term=new Term();
term.setTermNo(document.selectSingleNode("//message/body/termInfo/term/@termNo").getText());
term.setTermStatus(document.selectSingleNode("//message/body/termInfo/term/termStatus").getText());
term.setWinStatus(document.selectSingleNode("//message/body/termInfo/term/winStatus").getText());
term.setSaleStatus(document.selectSingleNode("//message/body/termInfo/term/saleStatus").getText());
term.setStartTime(document.selectSingleNode("//message/body/termInfo/term/startTime").getText());
term.setDeadLine(document.selectSingleNode("//message/body/termInfo/term/deadLine").getText());
term.setDeadLine2(document.selectSingleNode("//message/body/termInfo/term/deadLine2").getText());
term.setWinLine(document.selectSingleNode("//message/body/termInfo/term/winLine").getText());
term.setStartTime2(document.selectSingleNode("//message/body/termInfo/term/startTime2").getText());
term.setDeadLine3(document.selectSingleNode("//message/body/termInfo/term/deadLine3").getText());
term.setWinLine2(document.selectSingleNode("//message/body/termInfo/term/winLine2").getText());
term.setChangeLine(document.selectSingleNode("//message/body/termInfo/term/changeLine").getText());
term.setReserve(document.selectSingleNode("//message/body/termInfo/term/reserve").getText());
setTerm(term);
List list_gameInfo = document.selectNodes("//message/body/termInfo/term/gameInfo/game");
gameInfo=new ArrayList<GameInfo>();
for (Iterator iter = list_gameInfo.iterator(); iter.hasNext();) {
GameInfo info=new GameInfo();
Element gameInfoEle=(Element)iter.next();
info.setId(gameInfoEle.attributeValue("id"));
info.setHomeTeam(gameInfoEle.element("homeTeam").getText());
info.setAwayTeam(gameInfoEle.element("awayTeam").getText());
info.setGameDate(gameInfoEle.element("gameDate").getText());
info.setLeagueMatch(gameInfoEle.element("leagueMatch").getText());// e.g. Brazilian Serie A
info.setReserve(gameInfoEle.element("reserve").getText());
gameInfo.add(info);
}
setGameInfo(gameInfo);
} catch (DocumentException e) {
// TODO Auto-generated catch block
e.printStackTrace();
return false;
}
return true;
}
}
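One thing to watch in parse(): dom4j's selectSingleNode returns null when nothing matches, so calling getText() directly on the result throws a NullPointerException if an element is absent, and the catch block for DocumentException does not cover that. A null-safe helper avoids this. The sketch below shows the idea with the JDK's javax.xml.xpath API as a dependency-free stand-in. The helper name textOrEmpty, the class name, and the failure message (including the result code 9999) are all hypothetical, not taken from the real server protocol.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class MissingNodeDemo {
    // Hypothetical failure response: it has a head but no errorMsg and no termInfo.
    static final String FAILURE_XML =
        "<message><head><messageId>1</messageId><result>9999</result></head><body/></message>";

    /** Returns the text of the first node matched, or "" when the expression matches nothing. */
    static String textOrEmpty(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            // Guard against the "no match" case before touching the node.
            return node == null ? "" : node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("result   = [" + textOrEmpty(FAILURE_XML, "/message/head/result") + "]");
        System.out.println("errorMsg = [" + textOrEmpty(FAILURE_XML, "/message/body/errorMsg") + "]");
        System.out.println("winTerm  = [" + textOrEmpty(FAILURE_XML, "/message/body/termInfo/@winTermNo") + "]");
    }
}
```

The same guard works for attribute steps such as termInfo/@winTermNo, since a missing attribute is also reported as "no node".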
Content that contains special characters can be wrapped in <![CDATA[your content with special characters]]>, for example:
<message><![CDATA[salary<1000]]></message>
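A quick way to convince yourself that the CDATA section is needed: parse the message with and without it. The sketch below uses only the JDK's DOM parser (not dom4j), and CdataDemo and messageText are hypothetical names. A parser may log the well-formedness error to stderr for the last input.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class CdataDemo {
    /** Returns the text of the root element, or null if the input is not well-formed XML. */
    static String messageText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getTextContent includes the characters inside CDATA sections.
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException notWellFormed) {
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // CDATA: the "<" is taken literally.
        System.out.println(messageText("<message><![CDATA[salary<1000]]></message>"));
        // Entity escaping is the alternative to CDATA.
        System.out.println(messageText("<message>salary&lt;1000</message>"));
        // A raw "<" in text is not well-formed, so this prints null.
        System.out.println(messageText("<message>salary<1000</message>"));
    }
}
```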