(Approximate String Matching)

Release 1.0 (April 17, 2007)

Department of Computer Science, UC Irvine

This version is outdated. Our most recent release is here.



This release, written in C++, includes source code for several approximate string matching algorithms: algorithms from our recently published papers, an algorithm from our ongoing work, and an algorithm invented by Microsoft researchers and published in VLDB 2006.

The motivation of this research is to efficiently answer the following two types of approximate string queries:

There are various string similarity functions, such as edit distance, Jaccard similarity, and cosine similarity. The following is the list of algorithms, corresponding to the source directory structure:

In addition, we provide some commonly used functions in the "util" directory.
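To make the similarity functions mentioned above concrete, here is a minimal C++ sketch of two of them: edit distance, and Jaccard similarity over sets of q-grams. It is an illustration only and is not taken from this release. The function names `editDistance`, `qgrams`, and `jaccard` are hypothetical, and the choice of unpadded, distinct q-grams is an assumption; the release's own code may define these measures differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn a into b.
// Uses two rolling rows of the dynamic-programming table.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({subst, prev[j] + 1, curr[j - 1] + 1});
        }
        prev.swap(curr);
    }
    return prev[b.size()];
}

// Set of distinct substrings of length q (assumption: no padding
// characters are added at the string boundaries).
std::set<std::string> qgrams(const std::string& s, std::size_t q) {
    assert(q > 0);
    std::set<std::string> grams;
    for (std::size_t i = 0; i + q <= s.size(); ++i) {
        grams.insert(s.substr(i, q));
    }
    return grams;
}

// Jaccard similarity of the two q-gram sets: |A ∩ B| / |A ∪ B|.
// Two strings with no q-grams at all are treated as identical (1.0).
double jaccard(const std::string& a, const std::string& b, std::size_t q) {
    const std::set<std::string> ga = qgrams(a, q);
    const std::set<std::string> gb = qgrams(b, q);
    if (ga.empty() && gb.empty()) return 1.0;
    std::size_t common = 0;
    for (const std::string& g : ga) common += gb.count(g);
    return static_cast<double>(common) /
           static_cast<double>(ga.size() + gb.size() - common);
}
```

For example, `editDistance("kitten", "sitting")` evaluates to 3, and `jaccard("abcd", "abce", 2)` evaluates to 0.5 because the two strings share two of four distinct 2-grams. Cosine similarity is omitted here to keep the sketch short.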



Acknowledgements: This release is partially supported by NSF CAREER Award No. IIS-0238586, the NSF-funded RESCUE project, a Google Research Award, and a fund from Calit2.

License Agreement: Permission to use, copy, modify, and distribute the implementations of MAT-Tree, SEPIA, StringMap, and FilterTree is granted under the terms of the GNU General Public License (GPL). Use of the implementation of the PartEnum algorithm, invented by Microsoft researchers, is limited to non-commercial use, which would be covered under the royalty-free covenant that Microsoft made public.

For any questions regarding this release, please send an email to flamingo AT ics.uci.edu