libmxml Data Structures (Source Code Analysis)
程序员文章站
2022-06-29 11:14:30
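libmxml is a small, open-source XML library written in C. This post takes a brief look at the data structure it uses to hold a parsed XML document.
The key structure in mxml is mxml_node_t, which is defined as follows: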
struct mxml_node_s                  /**** An XML node. @private@ ****/
{
    mxml_type_t        type;        /* Node type */
    struct mxml_node_s *next;       /* Next node under same parent */
    struct mxml_node_s *prev;       /* Previous node under same parent */
    struct mxml_node_s *parent;     /* Parent node */
    struct mxml_node_s *child;      /* First child node */
    struct mxml_node_s *last_child; /* Last child node */
    mxml_value_t       value;       /* Node value */
    int                ref_count;   /* Use count */
    void               *user_data;  /* User data */
};

typedef struct mxml_node_s mxml_node_t; /**** An XML node. ****/
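It describes an XML document with a left-child, right-sibling tree: lower-level nodes are reached through the child pointer, and sibling nodes are chained through next. If a node has n children, child points to the first one, that child's next points to the second child of the same parent, and so on. The prev, parent and last_child pointers make it possible to walk the same lists backwards and upwards.
One peculiarity is that mxml treats an element's value as a child node too. In <group>value</group>, for example, value is a separate node (type MXML_OPAQUE, as in the sample run later in this post) attached under the group node (type MXML_ELEMENT).
In addition, whitespace (spaces, line breaks, tabs) and comments carry no real meaning for the XML payload, yet mxml still stores each of them as a node.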
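The linking rules described above can be sketched with a simplified stand-in for mxml_node_t. Everything prefixed with sk_ is a hypothetical name invented for this illustration and is not libmxml's code; the sketch only shows how child, last_child, next and prev might be maintained when a node is appended, and that an element's value is simply one more child.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for mxml_node_t (not the library type). */
typedef enum { SK_ELEMENT, SK_OPAQUE } sk_type_t;

typedef struct sk_node_s {
    sk_type_t type;
    const char *text;              /* element name or opaque string (not owned) */
    struct sk_node_s *next;        /* next node under same parent */
    struct sk_node_s *prev;        /* previous node under same parent */
    struct sk_node_s *parent;      /* parent node */
    struct sk_node_s *child;       /* first child node */
    struct sk_node_s *last_child;  /* last child node */
} sk_node_t;

/* Allocate a node and append it as the last child of parent (may be NULL). */
static sk_node_t *sk_new(sk_node_t *parent, sk_type_t type, const char *text)
{
    sk_node_t *node = calloc(1, sizeof(*node));

    if (node == NULL)
        return NULL;

    node->type = type;
    node->text = text;
    node->parent = parent;

    if (parent != NULL) {
        node->prev = parent->last_child;
        if (parent->last_child != NULL)
            parent->last_child->next = node;  /* extend the sibling list */
        else
            parent->child = node;             /* first child of this parent */
        parent->last_child = node;
    }
    return node;
}

/* Count direct children: start at child, then follow next. */
static int sk_count_children(const sk_node_t *parent)
{
    int n = 0;

    assert(parent != NULL);
    for (const sk_node_t *c = parent->child; c != NULL; c = c->next)
        n++;
    return n;
}

/* Free a node together with everything below it. */
static void sk_free(sk_node_t *node)
{
    if (node == NULL)
        return;

    sk_node_t *c = node->child;
    while (c != NULL) {
        sk_node_t *next = c->next;  /* save before the child is freed */
        sk_free(c);
        c = next;
    }
    free(node);
}
```

With this sketch, <group>value</group> would be built with two calls: sk_new(NULL, SK_ELEMENT, "group") followed by sk_new(group, SK_OPAQUE, "value").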
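Because mxml stores XML elements in plain linked lists, looking up an element is slow when there are many element nodes: a search has to walk the lists node by node. For that reason libmxml also provides index-based lookup functions. An index is built by first traversing the element tree and producing a sorted array, which then speeds up searching.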
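In the library this facility is exposed through functions such as mxmlIndexNew and mxmlIndexFind. The sketch below does not reproduce their implementation; it is a minimal illustration of the idea, assuming a stripped-down node type and using hypothetical ix_ names: collect the element nodes in one traversal, sort the array by name, then answer lookups with a binary search.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, stripped-down node: only what the index needs. */
typedef struct ix_node_s {
    const char *name;          /* element name; NULL for value/whitespace nodes */
    struct ix_node_s *next;    /* next sibling */
    struct ix_node_s *child;   /* first child */
} ix_node_t;

typedef struct {
    ix_node_t **nodes;         /* element nodes sorted by name */
    size_t count;
} ix_index_t;

/* Count element nodes in the sibling list starting at node, and below it. */
static size_t ix_count(const ix_node_t *node)
{
    size_t n = 0;

    for (; node != NULL; node = node->next) {
        if (node->name != NULL)
            n++;
        n += ix_count(node->child);
    }
    return n;
}

/* Store element nodes into out[] starting at pos; returns the new position. */
static size_t ix_fill(ix_node_t *node, ix_node_t **out, size_t pos)
{
    for (; node != NULL; node = node->next) {
        if (node->name != NULL)
            out[pos++] = node;
        pos = ix_fill(node->child, out, pos);
    }
    return pos;
}

/* qsort comparator: both arguments point at array slots (ix_node_t *). */
static int ix_cmp(const void *a, const void *b)
{
    const ix_node_t *na = *(ix_node_t *const *)a;
    const ix_node_t *nb = *(ix_node_t *const *)b;

    return strcmp(na->name, nb->name);
}

/* bsearch comparator: key is the name string, elem points at an array slot. */
static int ix_key_cmp(const void *key, const void *elem)
{
    const ix_node_t *n = *(ix_node_t *const *)elem;

    return strcmp((const char *)key, n->name);
}

/* Build the sorted array once: O(n log n). Returns 0 on success, -1 on error. */
static int ix_build(ix_index_t *ix, ix_node_t *root)
{
    ix->count = ix_count(root);
    ix->nodes = ix->count ? malloc(ix->count * sizeof(*ix->nodes)) : NULL;
    if (ix->count != 0 && ix->nodes == NULL)
        return -1;

    ix_fill(root, ix->nodes, 0);
    if (ix->count > 1)
        qsort(ix->nodes, ix->count, sizeof(*ix->nodes), ix_cmp);
    return 0;
}

/* Look up one element by name: O(log n). Returns NULL if there is no match. */
static ix_node_t *ix_find(const ix_index_t *ix, const char *name)
{
    assert(ix != NULL && name != NULL);
    if (ix->count == 0)
        return NULL;

    ix_node_t **hit = bsearch(name, ix->nodes, ix->count,
                              sizeof(*ix->nodes), ix_key_cmp);
    return hit != NULL ? *hit : NULL;
}
```

The trade-off is the usual one for a sorted index: building it costs a full traversal plus a sort, so it pays off only when the same tree is searched many times, and the array must be rebuilt if the tree changes.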
To make the structure easier to follow, I wrote a function that prints the node tree.
#include <stdio.h>
#include <mxml.h>

/*
 * Print one node, then its children and its following siblings.
 * Direct member access assumes an mxml release whose header exposes
 * the members of struct mxml_node_s.
 */
void printnode(mxml_node_t *node, int nnodesn, int level)
{
    /* Running serial number; static, so it keeps counting across calls. */
    static int currnodesn = 0;

    if (node == NULL) {
        return;
    }

    ++currnodesn; /* each new node takes the next serial number as its own */
    printf("[%- 3d -> %- 3d] ", currnodesn, nnodesn);

    switch (node->type) {
    case MXML_ELEMENT: {
        int i;

        printf("level %d mxml_element [%s]", level, node->value.element.name);
        for (i = 0; i < node->value.element.num_attrs; ++i) {
            printf(" %s=%s", node->value.element.attrs[i].name,
                   node->value.element.attrs[i].value);
        }
        printf("\n");
        break;
    }
    case MXML_INTEGER:
        printf("level %d mxml_integer %d\n", level, node->value.integer);
        break;
    case MXML_OPAQUE:
        printf("level %d mxml_opaque [%s]\n", level, node->value.opaque);
        break;
    case MXML_REAL:
        printf("level %d mxml_real %lf\n", level, node->value.real);
        break;
    case MXML_TEXT:
        printf("level %d mxml_text [%s]\n", level, node->value.text.string);
        break;
    case MXML_CUSTOM:
        printf("level %d mxml_custom\n", level);
        break;
    default:
        printf("unknown node type %d\n", node->type);
        break;
    }

    /* Depth-first traversal */
    if (node->child) {
        /* Children get this node's serial number as parent; level goes up by 1. */
        printnode(node->child, currnodesn, level + 1);
    }
    if (node->next) {
        /* Siblings share the same parent serial number and the same level. */
        printnode(node->next, nnodesn, level);
    }
}
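The function above recurses on next as well as on child, so its stack depth grows with the length of a sibling list. Because every node also stores a parent pointer, the same depth-first order can be produced without recursion. The following is a minimal sketch of that idea on a simplified, hypothetical node type (the wk_ names are made up for illustration); the library's own mxmlWalkNext serves a similar purpose, but this is not its implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node with only the links the walk needs. */
typedef struct wk_node_s {
    int id;
    struct wk_node_s *next;    /* next sibling */
    struct wk_node_s *parent;  /* parent node */
    struct wk_node_s *child;   /* first child */
} wk_node_t;

/*
 * Return the node that follows `node` in depth-first (document) order
 * without leaving the subtree rooted at `top`; NULL when the walk is done.
 */
static wk_node_t *wk_next(wk_node_t *node, wk_node_t *top)
{
    assert(node != NULL);

    if (node->child != NULL)
        return node->child;        /* go down first */

    while (node != NULL && node != top) {
        if (node->next != NULL)
            return node->next;     /* then sideways */
        node = node->parent;       /* otherwise climb and try again */
    }
    return NULL;
}
```

A full traversal is then a plain loop: for (p = top; p != NULL; p = wk_next(p, top)) { ... }.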
A sample run follows.
The XML source:
<?xml version="1.0" encoding="gbk" ?>
<group>
    <option>122334 我们 <string>我们</string>45677
        <keyword type="opaque">inputslot</keyword>
        <default type="opaque">auto</default>
        <text>media source</text>
        <order type="real">10.000000</order>
        <choice>
            <keyword type="opaque">auto</keyword>
            <text>auto tray selection</text>
            <code type="opaque" />
        </choice>
        <choice>
            <keyword type="opaque">upper</keyword>
            <text>tray 1</text>
            <code type="opaque">&lt;&lt;/mediaposition 0&gt;&gt;setpagedevice</code>
        </choice>
        <choice>
            <keyword type="opaque">lower</keyword>
            <text>tray 2</text>
            <code type="opaque">&lt;&lt;/mediaposition 1&gt;&gt;setpagedevice</code>
        </choice>
    </option>
    我 12334545 050504550
    <integer>123</integer>
    <string>now is the time for all good men to come to the aid of their country.</string>
    <!-- this is a comment -->
    <![CDATA[this is cdata 0123456789abcdef]]>
</group>
Running printnode on this document gives the following result:
Legend: [ 1 -> 0 ] means that this node's serial number is 1 and its parent's serial number is 0; level 0 means that the node is at the top level.

[ 1 -> 0 ] level 0 mxml_element [?xml version="1.0" encoding="gbk" ?]
[ 2 -> 1 ] level 1 mxml_opaque [ ]
[ 3 -> 1 ] level 1 mxml_element [group]
[ 4 -> 3 ] level 2 mxml_opaque [ ]
[ 5 -> 3 ] level 2 mxml_element [option]
[ 6 -> 5 ] level 3 mxml_opaque [122334 我们 ]
[ 7 -> 5 ] level 3 mxml_element [string]
[ 8 -> 7 ] level 4 mxml_opaque [我们]
[ 9 -> 5 ] level 3 mxml_opaque [45677 ]
[ 10 -> 5 ] level 3 mxml_element [keyword] type=opaque
[ 11 -> 10] level 4 mxml_opaque [inputslot]
[ 12 -> 5 ] level 3 mxml_opaque [ ]
[ 13 -> 5 ] level 3 mxml_element [default] type=opaque
[ 14 -> 13] level 4 mxml_opaque [auto]
[ 15 -> 5 ] level 3 mxml_opaque [ ]
[ 16 -> 5 ] level 3 mxml_element [text]
[ 17 -> 16] level 4 mxml_opaque [media source]
[ 18 -> 5 ] level 3 mxml_opaque [ ]
[ 19 -> 5 ] level 3 mxml_element [order] type=real
[ 20 -> 19] level 4 mxml_opaque [10.000000]
[ 21 -> 5 ] level 3 mxml_opaque [ ]
[ 22 -> 5 ] level 3 mxml_element [choice]
[ 23 -> 22] level 4 mxml_opaque [ ]
[ 24 -> 22] level 4 mxml_element [keyword] type=opaque
[ 25 -> 24] level 5 mxml_opaque [auto]
[ 26 -> 22] level 4 mxml_opaque [ ]
[ 27 -> 22] level 4 mxml_element [text]
[ 28 -> 27] level 5 mxml_opaque [auto tray selection]
[ 29 -> 22] level 4 mxml_opaque [ ]
[ 30 -> 22] level 4 mxml_element [code] type=opaque
[ 31 -> 22] level 4 mxml_opaque [ ]
[ 32 -> 5 ] level 3 mxml_opaque [ ]
[ 33 -> 5 ] level 3 mxml_element [choice]
[ 34 -> 33] level 4 mxml_opaque [ ]
[ 35 -> 33] level 4 mxml_element [keyword] type=opaque
[ 36 -> 35] level 5 mxml_opaque [upper]
[ 37 -> 33] level 4 mxml_opaque [ ]
[ 38 -> 33] level 4 mxml_element [text]
[ 39 -> 38] level 5 mxml_opaque [tray 1]
[ 40 -> 33] level 4 mxml_opaque [ ]
[ 41 -> 33] level 4 mxml_element [code] type=opaque
[ 42 -> 41] level 5 mxml_opaque [<</mediaposition 0>>setpagedevice]
[ 43 -> 33] level 4 mxml_opaque [ ]
[ 44 -> 5 ] level 3 mxml_opaque [ ]
[ 45 -> 5 ] level 3 mxml_element [choice]
[ 46 -> 45] level 4 mxml_opaque [ ]
[ 47 -> 45] level 4 mxml_element [keyword] type=opaque
[ 48 -> 47] level 5 mxml_opaque [lower]
[ 49 -> 45] level 4 mxml_opaque [ ]
[ 50 -> 45] level 4 mxml_element [text]
[ 51 -> 50] level 5 mxml_opaque [tray 2]
[ 52 -> 45] level 4 mxml_opaque [ ]
[ 53 -> 45] level 4 mxml_element [code] type=opaque
[ 54 -> 53] level 5 mxml_opaque [<</mediaposition 1>>setpagedevice]
[ 55 -> 45] level 4 mxml_opaque [ ]
[ 56 -> 5 ] level 3 mxml_opaque [ ]
[ 57 -> 3 ] level 2 mxml_opaque [ 我12334545 050504550 ]
[ 58 -> 3 ] level 2 mxml_element [integer]
[ 59 -> 58] level 3 mxml_opaque [123]
[ 60 -> 3 ] level 2 mxml_opaque [ ]
[ 61 -> 3 ] level 2 mxml_element [string]
[ 62 -> 61] level 3 mxml_opaque [now is the time for all good men to come to the aid of their country.]
[ 63 -> 3 ] level 2 mxml_opaque [ ]
[ 64 -> 3 ] level 2 mxml_element [!-- this is a comment --]
[ 65 -> 3 ] level 2 mxml_opaque [ ]
[ 66 -> 3 ] level 2 mxml_element [![CDATA[this is cdata 0123456789abcdef]]]
[ 67 -> 3 ] level 2 mxml_opaque [ ]