欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

Trees in MongoDB

程序员文章站 2022-04-18 13:31:37
...

Trees in MongoDBPosted on 引用地址:%20in%20MongoDB.html 树结构存储最好的方式常常依赖于要执行的操作;下面讨论一下不同的存储方案。在实践中,许多开发人员找到了一些使用起来很方便的模式:单文档存储整根树(Full Tree in single Document),父连接(P

Trees in MongoDB Posted on

Trees in MongoDB

引用地址:%20in%20MongoDB.html

树结构存储最好的方式常常依赖于要执行的操作;下面讨论一下不同的存储方案。在实践中,许多开发人员找到了一些使用起来很方便的模式:“单文档存储整根树(Full Tree in single Document)”,“父连接(Parent Links)”和“祖先数组(Array of Ancestors)”。

1 模式 1.1 单文档存储整根树(Full Tree in Signle Document)

{

comments: [

{by: "mathias", text: "...", replies: []}

{by: "eliot", text: "...", replies: [

{by: "mike", text: "...", replies: []}

]}

]

}

优点:

缺点:

1.2 (父连接)Parent Links

用单个集合来存储所有节点,服务器空间,每个节点包含他父节点的ID,是一种简单的解决方案。这种方法最大的问题是获取完整子树时需要查找多次数据库(或使用db.eval函数)。

> t = db.tree1; > t.find() { "_id" : 1 } { "_id" : 2, "parent" : 1 } { "_id" : 3, "parent" : 1 } { "_id" : 4, "parent" : 2 } { "_id" : 5, "parent" : 4 } { "_id" : 6, "parent" : 4 } > // find children of node 4 > t.ensureIndex({parent:1}) > t.find( {parent : 4 } ) { "_id" : 5, "parent" : 4 } { "_id" : 6, "parent" : 4 } 1.3 (子链接)Child Links

另一种选择是在每个节点文档中存储所有子节点的ID。这个方法是有限制的,如果不操作完整子树是没有问题。他可能也是用于存储一个节点有多个父节点情况的最有效方法。

> t = db.tree2 > t.find() { "_id" : 1, "children" : [ 2, 3 ] } { "_id" : 2 } { "_id" : 3, "children" : [ 4 ] } { "_id" : 4 } > // find immediate children of node 3 > t.findOne({_id:3}).children [ 4 ] > // find immediate parent of node 3 > t.ensureIndex({children:1}) > t.find({children:3}) { "_id" : 1, "children" : [ 2, 3 ] } 1.4 (祖先数组)Array of Ancestors

在这种方法中将一个节点的所有祖先节点存储到一个数组中。这使得类似于“获取X节点的所有子节点”的操作快且容易。

> t = db.mytree; > t.find() { "_id" : "a" } { "_id" : "b", "ancestors" : [ "a" ], "parent" : "a" } { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" } { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" } { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" } { "_id" : "f", "ancestors" : [ "a", "e" ], "parent" : "e" } { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" } > t.ensureIndex( { ancestors : 1 } ) > // find all descendents of b: > t.find( { ancestors : 'b' }) { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" } { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" } { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" } > // get all ancestors of f: > anc = db.mytree.findOne({_id:'f'}).ancestors [ "a", "e" ] > db.mytree.find( { _id : { $in : anc } } ) { "_id" : "a" } { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }

ensureIndex和MongoDB的multikey特性可以使上面的查询更高效。

除了祖先数组,我们也存储了节点的直接父节点,使得查找节点的直接父节点更容易。

1.5 物化路径(Materialized Path[Full Path in Each Node))

物化路径使得对树的特定查询容易。我们在每个节点中存储文档在树中位置的全路径。通常情况下上面提到的“祖先数组”方法都工作很好;当不得不处理字符串建造、正则表达式,字符逃逸,物化路径更容易。(理论上,物化路径将会更快。)

MongoDB实现物化路径最好的方式是将路径存储成字符串,然后采用正则表达式查询。以“^”开头的正则表达可以被高效执行。把数据看作一个字符串,你需要选择一个分隔符,我们采用“,”。举例:

> t = db.tree test.tree > // get entire tree -- we use sort() to make the order nice > t.find().sort({path:1}) { "_id" : "a", "path" : "a," } { "_id" : "b", "path" : "a,b," } { "_id" : "c", "path" : "a,b,c," } { "_id" : "d", "path" : "a,b,d," } { "_id" : "g", "path" : "a,b,g," } { "_id" : "e", "path" : "a,e," } { "_id" : "f", "path" : "a,e,f," } { "_id" : "g", "path" : "a,b,g," } > t.ensureIndex( {path:1} ) > // find the node 'b' and all its descendents: > t.find( { path : /^a,b,/ } ) { "_id" : "b", "path" : "a,b," } { "_id" : "c", "path" : "a,b,c," } { "_id" : "d", "path" : "a,b,d," } { "_id" : "g", "path" : "a,b,g," } > // find the node 'b' and its descendents, where path to 'b' is not already known: > nodeb = t.findOne( { _id : "b" } ) { "_id" : "b", "path" : "a,b," } > t.find( { path : new RegExp("^" + nodeb.path) } ) { "_id" : "b", "path" : "a,b," } { "_id" : "c", "path" : "a,b,c," } { "_id" : "d", "path" : "a,b,d," } { "_id" : "g", "path" : "a,b,g," }

Ruby实例:

嵌套数据集:

相关标签: Trees MongoDB