| - Project name |
| |
| marisa-trie |
| http://code.google.com/p/marisa-trie/ |
| |
| - Project summary |
| |
| Matching Algorithms with Recursively Implemented StorAge |
| |
| - Version |
| |
| 0.1.4 |
| |
| - Description |
| |
| This project *marisa-trie* provides a C++ library *libmarisa* and command line tools *`marisa-*`* for building and operating nesting patricia tries. The brand-new dictionary structure is designed to be static and space efficient. Also, *marisa-trie* enables not only simple lookups but also prefix searches and predictive searches. |
| |
| * Prefix searches are to find keys from prefixes of a given query. |
| * Predictive searches are to find keys starting with a given query. |
| |
| The biggest advantage of *marisa-trie* is that it can build a considerably compact dictionary. See below for the size of dictionaries built with various trie implementations. |
| |
| * Input |
| * Source: enwiki-20110115-all-titles-in-ns0.gz |
| * Contents: all page titles of English Wikipedia (Jan. 2011) |
| * Number of keys: 8,279,325 |
| * Total size: 167,223,035 bytes (plain) / 46,344,646 bytes (gzipped) |
| |
| || *Implementation* || *Size (bytes)* || *Remarks* || |
| || darts-clone || 316,065,792 || Compacted double-array trie || |
| || tx-trie || 107,119,864 || LOUDS-based trie || |
| || *marisa-trie* || *42,688,271* || Nesting patricia trie || |
| |
| * Documentation (in Japanese) |
| * HowTo |
| * ListOfTools |
| * LibraryInterface |
| * BenchmarkResults |
| |
| - Version control system |
| |
| Subversion |
| |
| - Source code license |
| |
| New BSD License |
| |
| - Project labels |
| |
| Nesting |
| Patricia |
| Trie |
| Static |
| Dictionary |
| CPlusPlus |