# Help:Tree Structures

**Comparisons:** Hierarchical Data Storage Technique Comparisons

http://www.evoio.org/wiki/Phylotastic/Datastore

http://www.dbmonster.com/Uwe/Forum.aspx/db-theory/986/Nested-Sets-vs-Nested-Intervals

Options for Storing Hierarchical Data in a Relational Database: http://www.cnblogs.com/ttltry-air/archive/2012/08/10/2633164.html

Trees in the Database: Advanced Data Structures: http://www.alberton.info/talks/show/id/14

Modeling trees with Nested Sets and Nested Intervals: http://www.postgresql.org/message-id/200604070028.15476.db@kavod.com

Using Arrays as Materialized Paths in Postgres: http://justcramer.com/2012/04/08/using-arrays-as-materialized-paths-in-postgres/

Model Tree Structures in MongoDB: http://docs.mongodb.org/manual/tutorial/model-tree-structures/

http://en.wikipedia.org/wiki/Trie

http://www.evoio.org/wiki/Phylotastic/Datastore

PostgreSQL Tree Structure and Recursion: http://stackoverflow.com/questions/13184334/tree-structure-and-recursion

Create large persistent data structures that also persists on disk automatically.: https://github.com/ningke/PersistDS

ETE is a python programming toolkit that assists in the automated manipulation, analysis and visualization of any type of hierarchical trees.: https://github.com/jhcepas/ete

(python) A simple, demonstrative implementation of trees of nodes.: https://github.com/ejones/nodetree

https://github.com/topher200/data_structures/blob/master/tree.py

Python program that finds the path associated to a rational number in the Calkin-Wilf tree.: https://github.com/bharathdandala/Calkin-Wilf-Tree

ruby: https://github.com/the-teacher/the_sortable_tree

Tree Implementation in python: simple to use for you.: https://github.com/caesar0301/pyTree

A Python 3 implementation of Tree data structure: https://github.com/yoyzhou/pyTree

A list-derived tree data structure library for Python.: https://github.com/errantlinguist/PyTree

A tree structure for Mongoid documents using the materialized path pattern: https://github.com/benedikt/mongoid-tree

http://search.cpan.org/~mlawren/SQL-Tree-0.04/bin/sqltree

http://www.depesz.com/2008/04/11/my-take-on-trees-in-sql/

http://search.cpan.org/~mlawren/SQL-Tree-0.04/lib/SQL/Tree.pm

Trees in SQL : an approach based on materialized paths and normalization for MySQL: http://www.cloudconnected.fr/2009/05/26/trees-in-sql-an-approach-based-on-materialized-paths-and-normalization-for-mysql/

http://www.postgresql.org/docs/9.2/static/ltree.html

**Nested Sets design AKA "Modified Pre-order Tree Traversal" (left and right columns for numbering):**

Nested Sets AKA Modified Pre-order Tree Traversal

- Columns: Left, Right

- First described by Joe Celko - covered in depth in his book Trees and Hierarchies in SQL for Smarties

- - Finding descendants is easy and relatively efficient (i.e. proportional to the size of the subtree)

- - Finding ancestors is easy but inefficient

- - Finding node's children as all the descendants restricted to the next level is inefficient (e.g. consider root node)

- - Finding node's parent as ancestor on the previous level is inefficient due to inefficiency of ancestors search

- - Aggregate queries are relatively slow (except counting, which is super fast)!

- - Tree reorg is hard

- Cheap level, ancestry, descendants

- Compared to Adjacency List, moves, inserts, deletes more expensive.

- Requires a specific sort order (e.g. created). So sorting all descendants in a different order requires additional work.

- easy to retrieve data, hard to add data

- Moving nodes is difficult because it requires renumbering all other nodes to the right (can use -number trick to speed up a bit)

- Efficient to retrieve a fully ordered hierarchical tree, Typically, a read/traverse/retrieve operation requires a single simple SQL statement

- Model very well adapted to implement tree structures with databases (whatever that means... indexing.. views?)

- Allows for compact and elegant implementations with nice code symmetries

- Inefficient model to modify tree structure. Typically, an insert/reorder operation requires an SQL statement that modifies n/2 nodes at the average, or n nodes in the worst case (n = # nodes in a tree)

- Non-intuitive database values. A simple db query of a table doesn't intuitively reveal the tree structure. As a consequence, nested set trees need to be handled and interpreted programmatically

**Nested Intervals AKA Nested Sets using Indexed Rational Numbers AKA Matrix Encoding AKA Farey Fraction Encoding:**

Nested Intervals AKA Matrix Encoding

- Columns: Left/Right as floating point numbers

- Combination of Nested Sets and Materialized Path where left/right columns are floating point decimals instead of integers and encode the path information. In the later development of this idea nested intervals gave rise to matrix encoding.

- - Same as MP: Finding descendants is easy and and relatively efficient (i.e. proportional to the size of the subtree)

- - Same as MP: Finding ancestors is tricky but efficient

- - Same as AR: Finding node's children is trivial

- - Same as AR: Finding node's parent is trivial

- - All aggregate queries are relatively slow

- - Tree reorg is easy (but not as simple as in AR)

**Adjacency Relation AKA Adjacency List Design AKA Recursion (columns like parent_id and level):**

Adjacency List Design AKA Recursion

- Columns: ID, ParentID

- - Have to use proprietory SQL extensions for finding ancestors and descendants; although the queries are efficient

- - Finding descendants is relatively efficient (i.e. proportional to the size of the subtree)

- - Finding node's children is trivial

- - Finding node's parent is trivial

- - Aggregate queries are relatively slow

- - Tree reorg is very simple

- Easy to implement.

- Cheap node moves, inserts, and deletes.

- Expensive to find level (can store as a computed column), ancestry & descendants (Bridge Hierarchy combined with level column can solve), path (Lineage Column can solve).

- Use Common Table Expressions in those databases that support them to traverse.

- hard to retrieve data, easy to add data

- you have both an ID column and a Parent ID column in a table. This is easy to understand and maintain but can be difficult or inefficient when you want to retrieve hierarchies of records.

- hard to retrieve a fully ordered hierarchical tree

- more-intuitive database values (parent id, depth)

**Materialized Path AKA Legal Numbering AKA Lineage Column:**

- Column: lineage (e.g. /parent/child/grandchild/etc...)

- - Finding descendants is easy and relatively efficient (i.e. proportional to the size of the subtree)

- - Finding ancestors is tricky but efficient

- - Finding node's children as descendants on next level is inefficient

- - Finding node's parent as ancestor on the previous level is efficient

- - All aggregate queries are relatively slow

- - Tree reorg is easy

- Limit to how deep the hierarchy can be.

- Descendants cheap (e.g. LEFT(lineage, #) = '/enumerated/path')

- Ancestry tricky (database specific queries)

**Closure Table (aka "Bridge Table" aka "Adjacency Relation"):**

- Columns: ancestor, descendant

- Stands apart from table it describes.

- Can include some nodes in more than one hierarchy.

- Cheap ancestry and descendants (albeit not in what order)

- For complete knowledge of a hierarchy needs to be combined with another option.

**Flat Table Design**

- Columns: ID, ParentID, Level/Rank

- A modification of the Adjacency List that adds a Level and Rank (e.g. ordering) column to each record.

- Expensive move and delete

- Cheap ancestry and descendants

- Good Use: threaded discussion - forums / blog comments

**XML DB's:**

http://en.wikipedia.org/wiki/Sedna_%28database%29

http://xml.apache.org/xindice/

**NoSQL:**

http://redis.io/topics/introduction

http://www.mongodb.org/display/DOCS/Trees+in+MongoDB

Top DBA Shell Scripts for Monitoring the Database: https://communities.bmc.com/communities/docs/DOC-9942

IMS: https://communities.bmc.com/communities/docs/DOC-9908

Efficient lineage tracking for scientific workflows: http://portal.acm.org/citation.cfm?doid=1376616.1376716