Parsing XML Files with the Pull Method on Android

程序员文章站 2023-11-20 17:49:10
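The pull parsing approach gives the application full control over how a document is parsed. Android ships an API that supports pull parsing, centered on these two types:

org.xmlpull.v1.XmlPullParser
org.xmlpull.v1.XmlPullParserFactory

XmlPullParser is the interface you work with most of the time; XmlPullParserFactory is a factory class used to create XmlPullParser objects.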
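The application produces events by calling XmlPullParser.next() and related methods, and then handles each event. This is the key difference from push parsing: with push, the parser generates events on its own and delivers them to the application through callbacks, whereas with pull, an event is produced only when the application calls the parser.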
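For contrast, the push style can be sketched with the SAX API (javax.xml.parsers and org.xml.sax), which this article does not otherwise cover. The class name PushStyleDemo and the recorded event strings are made up for illustration. Note that the application never asks for the next event; the parser runs the loop and calls the handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PushStyleDemo {
    /** Parses xml with SAX and records the callbacks in the order they arrive. */
    static List<String> events(String xml) throws Exception {
        final List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                out.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one run of text in several chunks.
                out.add("text " + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.add("end " + qName);
            }
        };
        // parse() drives the whole document; control returns only when it is done.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(events("<author country=\"united states\">james elliott</author>"));
    }
}
```

With pull parsing the same loop lives in application code, which is why it is easy to stop early or to hand the parser to a helper method halfway through a document.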
Suppose the XML contains <author country="united states">james elliott</author>. Here author is the tag, country is an attribute, and "james elliott" is the text.
To parse a document, first create an XmlPullParser object:

final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
final XmlPullParser parser = factory.newPullParser();

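Pull parsing is a forward-only traversal of the document. Each call to next(), nextTag(), nextToken(), or nextText() advances through the document and leaves the parser positioned on an event; the parser cannot move backward.
Next, hand the document to the parser:

parser.setInput(new StringReader("<author country=\"united states\">james elliott</author>"));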

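At this point the parser has just been initialized, so it is positioned at the start of the document and the current event is START_DOCUMENT; you can read it with XmlPullParser.getEventType(). Calling next() then produces START_TAG, which tells the application that an element has started; getName() returns "author". The next call to next() produces a TEXT event, and getText() returns "james elliott". Another next() produces END_TAG, which tells you the element is finished, and one more next() produces END_DOCUMENT, which tells you the whole document has been processed.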
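The event sequence just described can be checked with a small loop. This is a minimal sketch, assuming an XmlPull implementation is available (the Android runtime, or a library such as kxml2 on a desktop JVM); the class name EventWalk and the event labels it records are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class EventWalk {
    /** Returns one entry per event, from START_DOCUMENT through END_DOCUMENT. */
    static List<String> events(String xml) throws XmlPullParserException, IOException {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser parser = factory.newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> out = new ArrayList<>();
        int eventType = parser.getEventType(); // START_DOCUMENT right after setInput()
        while (true) {
            switch (eventType) {
            case XmlPullParser.START_DOCUMENT:
                out.add("START_DOCUMENT");
                break;
            case XmlPullParser.START_TAG:
                out.add("START_TAG " + parser.getName());
                break;
            case XmlPullParser.TEXT:
                out.add("TEXT " + parser.getText());
                break;
            case XmlPullParser.END_TAG:
                out.add("END_TAG " + parser.getName());
                break;
            case XmlPullParser.END_DOCUMENT:
                out.add("END_DOCUMENT");
                break;
            default:
                break;
            }
            if (eventType == XmlPullParser.END_DOCUMENT) {
                return out;
            }
            eventType = parser.next(); // the application pulls the next event
        }
    }
}
```

For the author element above, the list contains START_DOCUMENT, START_TAG author, TEXT james elliott, END_TAG author, and END_DOCUMENT, in that order.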
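Besides next(), you can also use nextToken(). It reports finer-grained events such as COMMENT, CDSECT, DOCDECL, and ENTITY_REF. If the program needs this lower-level information, drive the parse with nextToken() and handle those detailed events. One thing to watch for: a TEXT event may contain nothing but whitespace, such as line breaks or spaces.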
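As an illustration of the extra detail nextToken() exposes, the sketch below collects the comments in a document, which next() would skip silently. It assumes an XmlPull implementation is on the classpath; the class name TokenWalk is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class TokenWalk {
    /** Returns the content of every comment in xml, in document order. */
    static List<String> comments(String xml) throws XmlPullParserException, IOException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> out = new ArrayList<>();
        int token = parser.getEventType();
        while (token != XmlPullParser.END_DOCUMENT) {
            if (token == XmlPullParser.COMMENT) {
                // For COMMENT, getText() is the text between <!-- and -->.
                out.add(parser.getText());
            }
            token = parser.nextToken(); // reports COMMENT, CDSECT, etc. as separate events
        }
        return out;
    }
}
```

Replacing nextToken() with next() in the loop would leave the list empty, because next() never reports COMMENT.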
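There are also two very practical methods, nextTag() and nextText().
nextTag(): it first skips whitespace. If you know that the next event is START_TAG or END_TAG, you can call nextTag() to jump straight to it; if the next event turns out to be anything else, it throws an XmlPullParserException. It has two common uses. On a START_TAG, if you know the element contains child elements, nextTag() produces the START_TAG of the first child. On an END_TAG, if you know the document has not ended, nextTag() produces the START_TAG of the next element. In both cases, if the elements are separated by line breaks or indentation, next() would instead return a TEXT event that contains only that whitespace.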
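The difference between next() and nextTag() on an indented document can be sketched as follows. This assumes an XmlPull implementation is on the classpath; the class NextTagDemo and its helper methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class NextTagDemo {
    private static XmlPullParser open(String xml) throws XmlPullParserException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));
        return parser;
    }

    /** True if the event after the root START_TAG, as seen by next(), is whitespace-only TEXT. */
    static boolean nextStopsOnWhitespace(String xml) throws XmlPullParserException, IOException {
        XmlPullParser parser = open(xml);
        parser.next();                 // START_TAG of the root element
        int eventType = parser.next(); // whatever follows the root's start tag
        return eventType == XmlPullParser.TEXT && parser.isWhitespace();
    }

    /** Name of the first child element, reached with nextTag(). */
    static String firstChildName(String xml) throws XmlPullParserException, IOException {
        XmlPullParser parser = open(xml);
        parser.nextTag(); // START_TAG of the root element
        parser.nextTag(); // skips the whitespace, stops on the child's START_TAG
        return parser.getName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book>\n  <title>what is hibernate</title>\n</book>";
        System.out.println(nextStopsOnWhitespace(xml));
        System.out.println(firstChildName(xml));
    }
}
```

Because nextTag() throws when it meets non-whitespace text, it is best reserved for places where the document structure is known in advance.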
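nextText(): it may only be called when the parser is on a START_TAG. If the next event is TEXT, the text content is returned; if the next event is END_TAG, meaning the element is empty, an empty string is returned. When the method returns, the parser is positioned on the END_TAG. For example:

<author>james elliott</author>
<author></author>
<author/>

Calling nextText() on each START_TAG returns, in order:
"james elliott"
"" (empty)
"" (empty)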
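This method is handy for elements that have no child elements (it throws an XmlPullParserException if it runs into one). For example:

<title>what is hibernate</title>
<author>james elliott</author>
<category>web</category>

These elements can be processed with the following code: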

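        // eventType holds the current event; the parser is on the first START_TAG (<title>)
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG:
                tag = parser.getName();
                // nextText() returns the element text and leaves the parser on its END_TAG
                final String content = parser.nextText();
                Log.e(TAG, tag + ": [" + content + "]");
                // move to the next sibling's START_TAG, or to the parent's END_TAG
                eventType = parser.nextTag();
                break;
            default:
                break;
            }
        }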

This is far more convenient than driving the parse with next(), and it is much more readable.
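The behavior of nextText() on the three author forms shown earlier can be verified with a short sketch that combines it with nextTag(). It assumes an XmlPull implementation is on the classpath; the class name NextTextDemo and the wrapping authors element are made up so that the three elements form one well-formed document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class NextTextDemo {
    /** Returns the text of every direct child of the root element. */
    static List<String> childTexts(String xml) throws XmlPullParserException, IOException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));

        List<String> out = new ArrayList<>();
        parser.nextTag(); // START_TAG of the root element
        // Each nextTag() lands on a child's START_TAG, or on the root's END_TAG at the end.
        while (parser.nextTag() == XmlPullParser.START_TAG) {
            out.add(parser.nextText()); // parser is now on the child's END_TAG
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<authors>"
                + "<author>james elliott</author>"
                + "<author></author>"
                + "<author/>"
                + "</authors>";
        System.out.println(childTexts(xml));
    }
}
```

The returned list holds "james elliott" followed by two empty strings, matching the three cases listed for nextText().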
Finally, here is a complete example Android program that parses XML (an RSS feed):

import java.io.IOException;
import java.io.InputStream;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;
import android.util.Log;

// RssParser, RssFeed, FeedSettings and ReaderBaseException belong to the
// author's project and are not shown in this article.
public class RssPullParser extends RssParser {
    private final String TAG = FeedSettings.GLOBAL_TAG;

    private InputStream mInputStream;

    public RssPullParser(InputStream is) {
        mInputStream = is;
    }

    public void parse() throws ReaderBaseException, XmlPullParserException, IOException {
        if (mInputStream == null) {
            throw new ReaderBaseException("no input source, did you initialize this class correctly?");
        }
        final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final XmlPullParser parser = factory.newPullParser();

        // setInput(InputStream, String) needs an encoding; null lets the parser detect it
        parser.setInput(mInputStream, null);
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("not starting with 'START_DOCUMENT'");
        }
        eventType = parseRss(parser);
        if (eventType != XmlPullParser.END_DOCUMENT) {
            throw new ReaderBaseException("not ending with 'END_DOCUMENT', did you finish parsing?");
        }
        if (mInputStream != null) {
            mInputStream.close();
        } else {
            Log.e(TAG, "InputStream is null, XmlPullParser closed it??");
        }
    }

    /**
     * Parses the XML document. The current event must be START_DOCUMENT.
     * After this call, the parser is positioned at END_DOCUMENT.
     * @param parser
     * @return event END_DOCUMENT
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseRss(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("not starting with 'START_DOCUMENT', is this a new document?");
        }
        Log.e(TAG, "starting document, are you aware of that!");
        eventType = parser.next();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            switch (eventType) {
            case XmlPullParser.START_TAG: {
                Log.e(TAG, "start tag: '" + parser.getName() + "'");
                final String tagName = parser.getName();
                if (tagName.equals(RssFeed.TAG_RSS)) {
                    Log.e(TAG, "starting an rss feed <<");
                    final int attrSize = parser.getAttributeCount();
                    for (int i = 0; i < attrSize; i++) {
                        Log.e(TAG, "attr '" + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i) + "'");
                    }
                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                    Log.e(TAG, "\tstarting a channel <<");
                    parseChannel(parser);
                }
                break;
            }
            case XmlPullParser.END_TAG: {
                Log.e(TAG, "end tag: '" + parser.getName() + "'");
                final String tagName = parser.getName();
                if (tagName.equals(RssFeed.TAG_RSS)) {
                    Log.e(TAG, ">> ending an rss feed");
                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                    Log.e(TAG, "\t>> ending a channel");
                }
                break;
            }
            default:
                break;
            }
            eventType = parser.next();
        }
        Log.e(TAG, "end of document, it is over");
        return parser.getEventType();
    }

    /**
     * Parses a channel. The parser must be on the START_TAG of a channel,
     * otherwise an exception is thrown.
     * After this call, the parser is positioned at the END_TAG of the channel.
     * @param parser
     * @return event END_TAG of the channel
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseChannel(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        String tagName = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_CHANNEL.equals(tagName)) {
            throw new ReaderBaseException("not starting with 'START_TAG', is this the start of a channel?");
        }
        Log.e(TAG, "\tstarting " + tagName);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG: {
                final String tag = parser.getName();
                if (tag.equals(RssFeed.TAG_IMAGE)) {
                    parseImage(parser);
                } else if (tag.equals(RssFeed.TAG_ITEM)) {
                    parseItem(parser);
                } else {
                    final String content = parser.nextText();
                    Log.e(TAG, tag + ": [" + content + "]");
                }
                // now it should be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'END_TAG', did you finish parsing the sub item?");
                }
                eventType = parser.nextTag();
                break;
            }
            default:
                break;
            }
        }
        Log.e(TAG, "\tending " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parses an image in a channel.
     * Precondition: the parser must be on a START_TAG and the tag must be 'image'.
     * Postcondition: the parser is on the END_TAG of 'image'.
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseImage(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_IMAGE.equals(tag)) {
            throw new ReaderBaseException("not starting with 'START_TAG', is this the start of an image?");
        }
        Log.e(TAG, "\t\tstarting image " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG:
                tag = parser.getName();
                Log.e(TAG, tag + ": [" + parser.nextText() + "]");
                // now it should be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'END_TAG', did you finish parsing the sub item?");
                }
                eventType = parser.nextTag();
                break;
            default:
                break;
            }
        }
        Log.e(TAG, "\t\tending image " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parses an item in a channel.
     * Precondition: the parser must be on a START_TAG and the tag must be 'item'.
     * Postcondition: the parser is on the END_TAG of 'item'.
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseItem(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_ITEM.equals(tag)) {
            throw new ReaderBaseException("not starting with 'START_TAG', is this the start of an item?");
        }
        Log.e(TAG, "\t\tstarting " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG:
                tag = parser.getName();
                final String content = parser.nextText();
                Log.e(TAG, tag + ": [" + content + "]");
                // now it should be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'END_TAG', did you finish parsing the sub item?");
                }
                eventType = parser.nextTag();
                break;
            default:
                break;
            }
        }
        Log.e(TAG, "\t\tending " + parser.getName());
        return parser.getEventType();
    }
}