文本文件  |  230行  |  9.25 KB

Binary Tree Package
===================

Abstract
========

This package provides Binary- RedBlack- and AVL-Trees written in Python and Cython.

This Classes are much slower than the built-in dict class, but all
iterators/generators yielding data in sorted key order.

Source of Algorithms
--------------------

AVL- and RBTree algorithms taken from Julienne Walker: http://eternallyconfuzzled.com/jsw_home.aspx

Trees written in Python (only standard library)
-----------------------------------------------

    - *BinaryTree* -- unbalanced binary tree
    - *AVLTree* -- balanced AVL-Tree
    - *RBTree* -- balanced Red-Black-Tree

Trees written with C-Functions and Cython as wrapper
----------------------------------------------------

    - *FastBinaryTree* -- unbalanced binary tree
    - *FastAVLTree* -- balanced AVL-Tree
    - *FastRBTree* -- balanced Red-Black-Tree

All trees provides the same API, the pickle protocol is supported.

FastXTrees has C-structs as tree-node structure and C-implementation for low level
operations: insert, remove, get_value, max_item, min_item.

Constructor
~~~~~~~~~~~

    * Tree() -> new empty tree;
    * Tree(mapping) -> new tree initialized from a mapping (requires only an items() method)
    * Tree(seq) -> new tree initialized from seq [(k1, v1), (k2, v2), ... (kn, vn)]

Methods
~~~~~~~

    * __contains__(k) -> True if T has a key k, else False, O(log(n))
    * __delitem__(y) <==> del T[y], del[s:e], O(log(n))
    * __getitem__(y) <==> T[y], T[s:e], O(log(n))
    * __iter__() <==> iter(T)
    * __len__() <==> len(T), O(1)
    * __max__() <==> max(T), get max item (k,v) of T, O(log(n))
    * __min__() <==> min(T), get min item (k,v) of T, O(log(n))
    * __and__(other) <==> T & other, intersection
    * __or__(other) <==> T | other, union
    * __sub__(other) <==> T - other, difference
    * __xor__(other) <==> T ^ other, symmetric_difference
    * __repr__() <==> repr(T)
    * __setitem__(k, v) <==> T[k] = v, O(log(n))
    * clear() -> None, remove all items from T, O(n)
    * copy() -> a shallow copy of T, O(n*log(n))
    * discard(k) -> None, remove k from T, if k is present, O(log(n))
    * get(k[,d]) -> T[k] if k in T, else d, O(log(n))
    * is_empty() -> True if len(T) == 0, O(1)
    * items([reverse]) -> generator for (k, v) items of T, O(n)
    * keys([reverse]) -> generator for keys of T, O(n)
    * values([reverse]) -> generator for values of  T, O(n)
    * pop(k[,d]) -> v, remove specified key and return the corresponding value, O(log(n))
    * popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple, O(log(n))
    * setdefault(k[,d]) -> T.get(k, d), also set T[k]=d if k not in T, O(log(n))
    * update(E) -> None.  Update T from dict/iterable E, O(E*log(n))
    * foreach(f, [order]) -> visit all nodes of tree (0 = 'inorder', -1 = 'preorder' or +1 = 'postorder') and call f(k, v) for each node, O(n)

slicing by keys
~~~~~~~~~~~~~~~

    * itemslice(s, e) -> generator for (k, v) items of T for s <= key < e, O(n)
    * keyslice(s, e) -> generator for keys of T for s <= key < e, O(n)
    * valueslice(s, e) -> generator for values of T for s <= key < e, O(n)
    * T[s:e] -> TreeSlice object, with keys in range s <= key < e, O(n)
    * del T[s:e] -> remove items by key slicing, for s <= key < e, O(n)

    start/end parameter:

    * if 's' is None or T[:e] TreeSlice/iterator starts with value of min_key();
    * if 'e' is None or T[s:] TreeSlice/iterator ends with value of max_key();
    * T[:] is a TreeSlice which represents the whole tree;

    TreeSlice is a tree wrapper with range check, and contains no references
    to objects, deleting objects in the associated tree also deletes the object
    in the TreeSlice.

    * TreeSlice[k] -> get value for key k, raises KeyError if k not exists in range s:e
    * TreeSlice[s1:e1] -> TreeSlice object, with keys in range s1 <= key < e1
        - new lower bound is max(s, s1)
        - new upper bound is min(e, e1)

    TreeSlice methods:

    * items() -> generator for (k, v) items of T, O(n)
    * keys() -> generator for keys of T, O(n)
    * values() -> generator for values of  T, O(n)
    * __iter__ <==> keys()
    * __repr__ <==> repr(T)
    * __contains__(key)-> True if TreeSlice has a key k, else False, O(log(n))

prev/succ operations
~~~~~~~~~~~~~~~~~~~~

    * prev_item(key) -> get (k, v) pair, where k is predecessor to key, O(log(n))
    * prev_key(key) -> k, get the predecessor of key, O(log(n))
    * succ_item(key) -> get (k,v) pair as a 2-tuple, where k is successor to key, O(log(n))
    * succ_key(key) -> k, get the successor of key, O(log(n))
    * floor_item(key) -> get (k, v) pair, where k is the greatest key less than or equal to key, O(log(n))
    * floor_key(key) -> k, get the greatest key less than or equal to key, O(log(n))
    * ceiling_item(key) -> get (k, v) pair, where k is the smallest key greater than or equal to key, O(log(n))
    * ceiling_key(key) -> k, get the smallest key greater than or equal to key, O(log(n))

Heap methods
~~~~~~~~~~~~

    * max_item() -> get largest (key, value) pair of T, O(log(n))
    * max_key() -> get largest key of T, O(log(n))
    * min_item() -> get smallest (key, value) pair of T, O(log(n))
    * min_key() -> get smallest key of T, O(log(n))
    * pop_min() -> (k, v), remove item with minimum key, O(log(n))
    * pop_max() -> (k, v), remove item with maximum key, O(log(n))
    * nlargest(i[,pop]) -> get list of i largest items (k, v), O(i*log(n))
    * nsmallest(i[,pop]) -> get list of i smallest items (k, v), O(i*log(n))

Set methods (using frozenset)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    * intersection(t1, t2, ...) -> Tree with keys *common* to all trees
    * union(t1, t2, ...) -> Tree with keys from *either* trees
    * difference(t1, t2, ...) -> Tree with keys in T but not any of t1, t2, ...
    * symmetric_difference(t1) -> Tree with keys in either T and t1  but not both
    * issubset(S) -> True if every element in T is in S
    * issuperset(S) -> True if every element in S is in T
    * isdisjoint(S) ->  True if T has a null intersection with S

Classmethods
~~~~~~~~~~~~

    * fromkeys(S[,v]) -> New tree with keys from S and values equal to v.

Performance
===========

Profiling with timeit(): 5000 unique random int keys, time in seconds

========================  =============  ==============  ==============  ==============
unbalanced Trees          CPython 2.7.2  FastBinaryTree  ipy 2.7.0       pypy 1.7.0
========================  =============  ==============  ==============  ==============
build time 100x            7,55           0,60            2,51            0,29
build & delete time 100x  13,34           1,48            4,45            0,47
search 100x all keys       2,86           0,96            0,27            0,06
========================  =============  ==============  ==============  ==============

========================  =============  ==============  ==============  ==============
AVLTrees                  CPython 2.7.2  FastAVLTree     ipy 2.7.0       pypy 1.7.0
========================  =============  ==============  ==============  ==============
build time 100x	          22,66           0,65           10,45            1,29
build & delete time 100x  36,71           1,47           20,89            3,02
search 100x all keys       2,34           0,85            0,89            0,14
========================  =============  ==============  ==============  ==============

========================  =============  ==============  ==============  ==============
RBTrees                   CPython 2.7.2  FastRBTree      ipy 2.7.0       pypy 1.7.0
========================  =============  ==============  ==============  ==============
build time 100x	          14,78           0,65            4,43            0,49
build & delete time 100x  39,34           1,63           12,43            1,32
search 100x all keys       2,32           0,86            0,86            0,13
========================  =============  ==============  ==============  ==============

News
====

Version 1.0.1 February 2013

  * bug fixes
  * refactorings by graingert
  * skip useless tests for pypy
  * new license: MIT License
  * tested with CPython2.7, CPython3.2, CPython3.3, pypy-1.9, pypy-2.0-beta1
  * unified line endings to LF
  * PEP8 refactorings
  * added floor_item/key, ceiling_item/key methods, thanks to Dai Mikurube

Version 1.0.0

  * bug fixes
  * status: 5 - Production/Stable
  * removed useless TreeIterator() class and T.treeiter() method.
  * patch from Max Motovilov to use Visual Studio 2008 for building C-extensions

Version 0.4.0

  * API change!!!
  * full Python 3 support, also for Cython implementations
  * removed user defined compare() function - keys have to be comparable!
  * removed T.has_key(), use 'key in T'
  * keys(), items(), values() generating 'views'
  * removed iterkeys(), itervalues(), iteritems() methods
  * replaced index slicing by key slicing
  * removed index() and item_at()
  * repr() produces a correct representation
  * installs on systems without cython (tested with pypy)
  * new license: GNU Library or Lesser General Public License (LGPL)

Installation
============

from source::

    python setup.py install

Download
========

http://bitbucket.org/mozman/bintrees/downloads

Documentation
=============

this README.txt

bintrees can be found on bitbucket.org at:

http://bitbucket.org/mozman/bintrees