A Rough-Set Feature Selection Model for Classification and Knowledge Discovery

Feature selection aims to remove features that are unnecessary to the target concept. Rough-set theory (RST) eliminates unimportant or irrelevant features, generating a set of attributes that is smaller than the original but has the same, or nearly the same, classificatory power. This paper analyses the effects of rough sets on classification using 10 datasets, each including a decision attribute. Classification accuracy is mapped to the type and number of attributes in both the original and the reduced datasets. This generates a framework for applying rough sets for classification purposes. Rough sets are then used for knowledge discovery in classification, and the conclusions indicate a very significant result: removal of individual numeric attributes has far more effect on classification accuracy than removal of categorical attributes.
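
As an illustration of how a rough-set reduction can keep classificatory power while dropping attributes, the following minimal Python sketch computes the rough-set dependency degree (the fraction of objects whose decision value is fully determined by their values on a chosen attribute subset) and then grows a reduced attribute set greedily. The greedy forward search, the function names `dependency` and `quick_reduct`, and the four-row toy table are all assumptions made for illustration; they are not taken from the paper and need not match the reduction procedure or the datasets it uses.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Dependency degree of `decision` on `attrs`.

    Rows are grouped into equivalence classes by their values on `attrs`;
    a class is consistent when all of its rows share one decision value.
    The result is the share of rows that fall in consistent classes.
    """
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        decisions[key].add(row[decision])
        sizes[key] += 1
    consistent = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return consistent / len(rows)


def quick_reduct(rows, attrs, decision):
    """Greedy forward search (an assumed strategy, not the paper's).

    Repeatedly adds the attribute that raises the dependency degree most,
    stopping once the subset matches the dependency of the full set.
    """
    target = dependency(rows, attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        candidates = [a for a in attrs if a not in reduct]
        best = max(candidates,
                   key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct


# Hypothetical toy table: "play" is the decision attribute.
rows = [
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no",  "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

reduct = quick_reduct(rows, ["outlook", "windy"], "play")
print(reduct)
```

In this toy table the decision depends only on `outlook`, so the search stops after one attribute and `windy` is discarded without any loss of dependency. A greedy search of this kind is not guaranteed to find a minimal reduct in general.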