Adrian Perrig
Dept. of Computer Science
Carnegie Mellon University
adrian@cs.cmu.edu
-
Qifa Ke
Dept. of Computer Science
Carnegie Mellon University
ke+@cs.cmu.edu
-
Dawn Xiaodong Song
Dept. of Computer Science
Carnegie Mellon University
skyxd@cs.cmu.edu
Many databases, such as medical databases, contain sensitive information, so access policies must be enforced when such databases are released. In this report we propose a method that automatically mines generalized association rules from a database and then automatically generalizes the database so that the released version enforces the required access policies. To mine the generalized association rules, we developed an improved algorithm based on IBM's Cumulate algorithm (frequency counting). We also propose a fast greedy algorithm that generalizes the database while minimizing information loss. We tested our algorithms on datasets produced by our own data generation engine, with test datasets of up to 1,000,000 tuples. The results show that our method outperforms the existing method: in many cases, our algorithm is twice as fast as the Cumulate algorithm.
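To make the notion of a generalized association rule concrete, the following is a minimal sketch of frequency counting over a taxonomy: each transaction is extended with the ancestors of its items before itemsets are counted, and itemsets that pair an item with its own ancestor are skipped as redundant. It illustrates the general idea only and is not the improved algorithm of this report. The taxonomy, the item names, and the helper functions `ancestors`, `extend`, and `frequent_itemsets` are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent); not taken from the report.
TAXONOMY = {
    "flu": "respiratory illness",
    "pneumonia": "respiratory illness",
    "respiratory illness": "illness",
}


def ancestors(item, taxonomy):
    """Return all ancestors of an item by following parent links upward."""
    found = []
    while item in taxonomy:
        item = taxonomy[item]
        found.append(item)
    return found


def extend(transaction, taxonomy):
    """Return the transaction plus every ancestor of every item in it."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def has_ancestor_pair(itemset, taxonomy):
    """True if some item in the itemset is an ancestor of another item in it."""
    return any(a in ancestors(b, taxonomy) for a in itemset for b in itemset)


def frequent_itemsets(transactions, taxonomy, size, min_support):
    """Map each itemset of the given size to its support, if >= min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(extend(transaction, taxonomy))
        for itemset in combinations(items, size):
            # An item always co-occurs with its ancestors, so skip such sets.
            if not has_ancestor_pair(itemset, taxonomy):
                counts[itemset] += 1
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


# Illustrative data (assumed, not from the report's datasets).
transactions = [
    {"flu", "aspirin"},
    {"pneumonia", "aspirin"},
    {"flu"},
    {"aspirin"},
]
result = frequent_itemsets(transactions, TAXONOMY, size=2, min_support=0.5)
print(result)
```

In this toy data neither ("aspirin", "flu") nor ("aspirin", "pneumonia") reaches 50% support, but the generalized itemsets ("aspirin", "respiratory illness") and ("aspirin", "illness") do, which is the kind of pattern that only becomes visible once the taxonomy is taken into account.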