FLAMINGO Package
(Approximate String Matching)

Release 4.1 (February 22, 2012)

Department of Computer Science, UC Irvine

Getting Started

Please refer to the Flamingo Getting Started Guide.

Introduction

This release (in C++) includes the source code of several algorithms for approximate string matching developed at UC Irvine. It includes algorithms for approximate selection queries, location-based approximate keyword search, selectivity estimation for approximate selection queries, approximate queries on mixed types, and others. Although an implementation for approximate joins is included, the focus of this release is on approximate selection queries.

Here is a brief explanation of the terms used above:

Strings can be compared with various similarity functions, such as Levenshtein distance (also known as edit distance), Jaccard similarity, cosine similarity, and Dice similarity.

The following is a description of the modules corresponding to the source directory structure:

In addition, we have provided some commonly used functions in the util directory.
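As a rough illustration of two of the similarity functions named above, and of what an approximate selection query computes, here is a minimal C++ sketch. It is not taken from the Flamingo source: the names `editDistance`, `qgrams`, `jaccard`, and `selectWithin` are hypothetical, the q-grams are unpadded, and the selection is a plain linear scan rather than any of the index-based algorithms in this release.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Levenshtein (edit) distance, computed with two rows of the
// standard dynamic-programming table.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({prev[j] + 1, curr[j - 1] + 1, subst});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Set of distinct substrings of length q (assumption: no padding).
std::set<std::string> qgrams(const std::string& s, std::size_t q) {
    assert(q > 0);
    std::set<std::string> grams;
    for (std::size_t i = 0; i + q <= s.size(); ++i) {
        grams.insert(s.substr(i, q));
    }
    return grams;
}

// Jaccard similarity of the two q-gram sets: |A ∩ B| / |A ∪ B|.
double jaccard(const std::string& a, const std::string& b, std::size_t q) {
    const std::set<std::string> ga = qgrams(a, q);
    const std::set<std::string> gb = qgrams(b, q);
    if (ga.empty() && gb.empty()) return 1.0;  // convention chosen for this sketch
    std::size_t common = 0;
    for (const std::string& g : ga) common += gb.count(g);
    return static_cast<double>(common) /
           static_cast<double>(ga.size() + gb.size() - common);
}

// Naive approximate selection query: return every string in `data`
// whose edit distance to `query` is at most k (linear scan, no index).
std::vector<std::string> selectWithin(const std::vector<std::string>& data,
                                      const std::string& query,
                                      std::size_t k) {
    std::vector<std::string> answers;
    for (const std::string& s : data) {
        if (editDistance(s, query) <= k) answers.push_back(s);
    }
    return answers;
}
```

For example, `selectWithin({"flamingo", "flaming", "mango"}, "flamingo", 1)` keeps the first two strings and drops the third. A scan like this costs one distance computation per data string, which is the cost that index structures for approximate selection aim to avoid.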

Changes in Version 4.1 (compared to Version 4.0)

Bibtex

@misc{misc/flamingo4.1-2010,
      author = {Alexander Behm and Rares Vernica and Sattam Alsubaiee and Shengyue Ji and Jiaheng Lu and Liang Jin and Yiming Lu and Chen Li},
      year = {2010},
      title = {{UCI} {Flamingo} {Package} 4.1},
      url = {http://flamingo.ics.uci.edu/releases/4.1/},
      institution = {University of California, Irvine, School of Information and Computer Sciences}
} 

Release files:

docs/
src/
flamingo-4.1.tgz (2.8M)
README.txt

Acknowledgements: This release is partially supported by the NSF CAREER Award No. IIS-0238586, the NSF award No. IIS-0742960, the NSF-funded RESCUE project, a Google Research Award, a gift fund from Microsoft, a fund from CalIt2, the NSF CluE Project and the ASTERIX Project funded by the NSF.
Many thanks to Minh Doan and Kensuke Ohta for their valuable testing and feedback on the code and documentation.

License Agreement: Permission to use, copy, modify, and distribute the implementations of MAT-Tree, SEPIA, StringMap, FilterTree, and LBAK-Tree is granted under the terms of the BSD license. Permission to use, copy, modify, and distribute the implementations of the compression techniques DiscardLists and CombineLists is granted under the terms of the following Academic BSD License. Use of the implementation of the PartEnum algorithm, invented by Microsoft researchers, is limited to non-commercial use, which would be covered under the royalty-free covenant that Microsoft made public.

Academic BSD License:
The compression techniques DiscardLists and CombineLists are the proprietary property of The Regents of the University of California (“The Regents”).
Copyright © 2009 The Regents of the University of California, Irvine. All Rights Reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted by nonprofit, research institutions for research use only, provided that the following conditions are met:

The end-user understands that the program was developed for research purposes and is advised not to rely exclusively on the program for any reason.

THE SOFTWARE PROVIDED IS ON AN "AS IS" BASIS, AND THE REGENTS AND CONTRIBUTORS HAVE NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. THE REGENTS AND CONTRIBUTORS SPECIFICALLY DISCLAIM ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES, INCLUDING BUT NOT LIMITED TO PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, LOSS OF USE, DATA OR PROFITS, OR BUSINESS INTERRUPTION, HOWEVER CAUSED AND UNDER ANY THEORY OF LIABILITY WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

If you do not agree to these terms, do not download or use the software. This license may be modified only in a writing signed by an authorized signatory of both parties.


For any questions regarding this release, please send email to flamingo AT ics.uci.edu