


MAT-tree: A tree-based structure for indexing string and numeric attributes. Using MAT-tree, we can perform range queries on both string and numeric attributes. [1]


The program can be compiled with Visual C++ or GNU C++ (g++).

To build with Visual C++, compile the project and run the resulting executable. Alternatively, write a makefile and compile the sources with g++.


Main files:

Useful parameters:

const int MAXLEN = 100; //maximum length of a string attribute
const int PGSIZE = 256; //page size
const int TRIELEN = 1000; //maximum size of a Trie (in string representation)
const int K = 400;  //# of centers in MAT-tree
const int STRDELTA = 3; //threshold for string attribute
const int NUMDELTA = 4; //threshold for numeric attribute
const int SIZES = 80000; //size of the dataset
const int ALPH_SIZE = 29; //size of the alphabet
#define DATAFILE "data.txt" //input file for dataset
#define QUERYFILE "query.txt" //query file
const int NUMQUERY = 10; //# of queries to run
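
To make the roles of STRDELTA and NUMDELTA concrete, here is a minimal brute-force sketch of the matching condition they suggest. It assumes that the string threshold is an edit-distance bound and that the numeric threshold is a bound on the absolute difference; neither assumption is confirmed by this page. The names editDistance and matches are hypothetical and are not part of the MAT-tree code. This illustrates only the condition, not how MAT-tree evaluates it.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Values copied from the parameter list above.
const int STRDELTA = 3;  // threshold for string attribute
const int NUMDELTA = 4;  // threshold for numeric attribute

// Classic Levenshtein distance, computed with two rolling rows.
// Assumption: the string threshold is measured in edit distance.
int editDistance(const std::string &a, const std::string &b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({sub, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Hypothetical predicate: a record matches a query when both attributes
// fall within their thresholds.
bool matches(const std::string &queryStr, double queryNum,
             const std::string &recStr, double recNum) {
    return editDistance(queryStr, recStr) <= STRDELTA &&
           std::fabs(queryNum - recNum) <= NUMDELTA;
}
```

A linear scan that applies matches to every record is a simple baseline for checking answers on a small dataset.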

Prepare DATAFILE and QUERYFILE. Each record occupies one line and consists of a string followed by a numeric value. If a string contains white space, replace the white space with special characters first.
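
The record format described above can be prepared and checked with a small helper such as the following sketch. The names escapeSpaces, Record, and parseRecord are hypothetical and are not part of the MAT-tree code. The '_' placeholder and the double type for the numeric value are assumptions; choose a placeholder that belongs to the alphabet the program accepts.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: replace every white-space character in a string
// attribute with a placeholder, so that each record stays a two-token line.
// The default placeholder '_' is an assumption.
std::string escapeSpaces(std::string s, char placeholder = '_') {
    for (char &c : s) {
        if (std::isspace(static_cast<unsigned char>(c))) c = placeholder;
    }
    return s;
}

// Hypothetical record type mirroring one line of DATAFILE or QUERYFILE.
struct Record {
    std::string str;  // string attribute (no white space)
    double num;       // numeric attribute (type assumed)
};

// Parse "string numeric" from one line; returns false if either field
// is missing or the numeric field cannot be read.
bool parseRecord(const std::string &line, Record &out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.str >> out.num);
}
```

For example, escapeSpaces("new york") yields "new_york", and the line "new_york 42" then parses into a record with str "new_york" and num 42.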


The performance results are available in [1].


[1] Liang Jin, Nick Koudas, Chen Li, Anthony K. H. Tung: Indexing Mixed Types for Approximate Retrieval. VLDB 2005: 793-804