The FLAMINGO Project on Data Cleaning

Department of Computer Science, UC Irvine

Objective

The Flamingo Project focuses on data cleaning, i.e., how to deal with errors and inconsistencies in information systems. In many applications, such as data integration, organizations collect data from various sources to conduct analysis and make decisions, and the data from these sources often contain inconsistencies. For instance, a person may be identified by first name, last name, SSN, and birthday, yet the same name, e.g., "Schwarzenegger", may be misspelled as "Swarzzengaer" or in other forms. Such errors make it more challenging to link records from different sources and to answer queries approximately. We are developing algorithms that make query answering and information retrieval efficient in the presence of such inconsistencies and errors.
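To make the misspelling example concrete, the following is a minimal sketch of approximate string matching using Levenshtein (edit) distance. It is an illustration only, not the project's own algorithm or API: the function names `edit_distance` and `approx_match`, the sample name list, and the threshold of 5 are all assumptions chosen for this example, and the linear scan over candidates ignores the efficiency concerns the project is about.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approx_match(query: str, candidates: list[str], max_dist: int) -> list[str]:
    """Hypothetical helper: return the candidates within max_dist edits
    of the query, ignoring case. Scans every candidate (no index)."""
    q = query.lower()
    return [c for c in candidates if edit_distance(q, c.lower()) <= max_dist]


# Illustrative data only; the third name is an unrelated distractor.
names = ["Schwarzenegger", "Swarzzengaer", "Schwarzkopf"]
matches = approx_match("Schwarzenegger", names, max_dist=5)
print(matches)
```

With this threshold the misspelled form is returned alongside the exact name, while the unrelated "Schwarzkopf" is filtered out. A practical system would avoid comparing the query against every record, which is where indexing and filtering techniques come in.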

Acknowledgements: This release is partially supported by NSF CAREER Award No. IIS-0238586, NSF Award No. IIS-0742960, the NSF-funded RESCUE project, a Google Research Award, and funding from Calit2.

For any questions regarding this project, please send email to flamingo AT ics.uci.edu.