Decision-theoretic rough sets

In the mathematical theory of decisions, decision-theoretic rough sets (DTRS) is a probabilistic extension of rough set classification. First proposed in 1990 by Dr. Yiyu Yao,[1] the extension makes use of loss functions to derive the \textstyle \alpha and \textstyle \beta region parameters. As in rough set theory, the lower and upper approximations of a set are used.

Definitions

The following contains the basic principles of decision-theoretic rough sets.

Conditional risk

Using the Bayesian decision procedure, the decision-theoretic rough set (DTRS) approach allows for minimum-risk decision making based on observed evidence. Let \textstyle A=\{a_1,\ldots,a_m\} be a finite set of \textstyle m possible actions and let \textstyle \Omega=\{w_1,\ldots, w_s\} be a finite set of \textstyle s states. Let \textstyle P(w_j\mid[x]) be the conditional probability of an object \textstyle x being in state \textstyle w_j given the object description \textstyle [x], and let \textstyle \lambda(a_i\mid w_j) denote the loss, or cost, of performing action \textstyle a_i when the state is \textstyle w_j. The expected loss (conditional risk) associated with taking action \textstyle a_i is given by:


R(a_i\mid [x]) = \sum_{j=1}^s \lambda(a_i\mid w_j)P(w_j\mid[x]).
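The sum above can be sketched in a few lines of Python. This is only an illustration: the function name `conditional_risk`, the loss matrix, and the probabilities are hypothetical values chosen for the example, not taken from the source.

```python
def conditional_risk(losses, probs):
    """Return the expected loss R(a_i | [x]) for every action a_i.

    losses[i][j] plays the role of lambda(a_i | w_j);
    probs[j] plays the role of P(w_j | [x]).
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return [sum(l * p for l, p in zip(row, probs)) for row in losses]


# Hypothetical example with m = 2 actions and s = 3 states.
losses = [[0.0, 1.0, 4.0],
          [2.0, 0.0, 1.0]]
probs = [0.5, 0.3, 0.2]

risks = conditional_risk(losses, probs)
# Index of the minimum-risk action (the first one wins on ties).
best_action = min(range(len(risks)), key=risks.__getitem__)
```

With these assumed numbers the first action has the smaller expected loss, so the Bayesian decision procedure would select it.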

Object classification with the approximation operators can be fitted into the Bayesian decision framework. The set of actions is given by \textstyle A=\{a_P,a_N,a_B\}, where \textstyle a_P, \textstyle a_N, and \textstyle a_B represent the three actions in classifying an object into POS(\textstyle A), NEG(\textstyle A), and BND(\textstyle A) respectively. To indicate whether an element is in \textstyle A or not in \textstyle A, the set of states is given by \textstyle \Omega=\{A,A^c\}. Let \textstyle \lambda(a_\diamond\mid A) denote the loss incurred by taking action \textstyle a_\diamond when an object belongs to \textstyle A, and let \textstyle \lambda(a_\diamond\mid A^c) denote the loss incurred by taking the same action when the object belongs to \textstyle A^c.

Loss functions

Let \textstyle \lambda_{PP} denote the loss function for classifying an object in \textstyle A into the POS region, \textstyle \lambda_{BP} denote the loss function for classifying an object in \textstyle A into the BND region, and let \textstyle \lambda_{NP} denote the loss function for classifying an object in \textstyle A into the NEG region. A loss function \textstyle \lambda_{\diamond N} denotes the loss of classifying an object that does not belong to \textstyle A into the regions specified by \textstyle \diamond.

The expected loss \textstyle R(a_\diamond\mid[x]) associated with taking each individual action can be expressed as:

\textstyle R(a_P\mid[x]) = \lambda_{PP}P(A\mid[x]) + \lambda_{PN}P(A^c\mid[x]),
\textstyle R(a_N\mid[x]) = \lambda_{NP}P(A\mid[x]) + \lambda_{NN}P(A^c\mid[x]),
\textstyle R(a_B\mid[x]) = \lambda_{BP}P(A\mid[x]) + \lambda_{BN}P(A^c\mid[x]),

where \textstyle \lambda_{\diamond P}=\lambda(a_\diamond\mid A), \textstyle \lambda_{\diamond N}=\lambda(a_\diamond\mid A^c), and \textstyle \diamond=P, \textstyle N, or \textstyle B.
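As a rough illustration, the three expected losses can be evaluated directly and the action with the smallest one selected. The helper names and the six loss values below are assumptions made for this sketch; they are not part of the source.

```python
def three_way_risks(p, loss):
    """Expected losses of the actions a_P, a_N, a_B for p = P(A | [x])."""
    q = 1.0 - p  # P(A^c | [x])
    return {
        "P": loss["PP"] * p + loss["PN"] * q,
        "N": loss["NP"] * p + loss["NN"] * q,
        "B": loss["BP"] * p + loss["BN"] * q,
    }


def min_risk_action(p, loss):
    """Return 'P', 'N' or 'B', whichever action has the smallest expected loss.

    On exact ties the first key in the order P, N, B is returned.
    """
    risks = three_way_risks(p, loss)
    return min(risks, key=risks.get)


# Hypothetical loss values; loss["XY"] stands for lambda_{XY}.
LOSS = {"PP": 0.0, "BP": 2.0, "NP": 6.0,
        "NN": 0.0, "BN": 1.0, "PN": 4.0}

for p in (0.7, 0.5, 0.1):
    print(p, three_way_risks(p, LOSS), min_risk_action(p, LOSS))
```

Under these assumed losses, a high probability of membership favours the POS action, a low one favours NEG, and intermediate values favour deferring to BND.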

Minimum-risk decision rules

If the loss functions satisfy \textstyle \lambda_{PP} \leq \lambda_{BP} < \lambda_{NP} and \textstyle \lambda_{NN} \leq \lambda_{BN} < \lambda_{PN}, the following decision rules (P, N, B) can be formulated:

  • P: If \textstyle P(A\mid[x]) \geq \gamma and \textstyle P(A\mid[x]) \geq \alpha, decide POS(\textstyle A);
  • N: If \textstyle P(A\mid[x]) \leq \beta and \textstyle P(A\mid[x]) \leq \gamma, decide NEG(\textstyle A);
  • B: If \textstyle \beta \leq P(A\mid[x]) \leq \alpha, decide BND(\textstyle A);

where

\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{BP} - \lambda_{BN}) - (\lambda_{PP}-\lambda_{PN})},
\gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{PP}-\lambda_{PN})},
\beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{BP}-\lambda_{BN})}.
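Each threshold comes from comparing two conditional risks. As a sketch of the step for \textstyle \alpha, write \textstyle p = P(A\mid[x]) (a shorthand introduced here for brevity), so that \textstyle P(A^c\mid[x]) = 1-p. Action \textstyle a_P is then at least as good as \textstyle a_B exactly when:

```latex
\begin{align*}
R(a_P\mid[x]) \leq R(a_B\mid[x])
&\iff \lambda_{PP}\,p + \lambda_{PN}(1-p) \leq \lambda_{BP}\,p + \lambda_{BN}(1-p) \\
&\iff (\lambda_{PN}-\lambda_{BN})(1-p) \leq (\lambda_{BP}-\lambda_{PP})\,p \\
&\iff p \geq \frac{\lambda_{PN}-\lambda_{BN}}
                  {(\lambda_{PN}-\lambda_{BN}) + (\lambda_{BP}-\lambda_{PP})}
      = \frac{\lambda_{PN}-\lambda_{BN}}
             {(\lambda_{BP}-\lambda_{BN}) - (\lambda_{PP}-\lambda_{PN})}
      = \alpha .
\end{align*}
```

The division is valid because \textstyle \lambda_{PN} > \lambda_{BN} and \textstyle \lambda_{BP} \geq \lambda_{PP} make the denominator positive. The thresholds \textstyle \gamma and \textstyle \beta follow in the same way from comparing \textstyle a_P with \textstyle a_N and \textstyle a_N with \textstyle a_B, respectively.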

The \textstyle \alpha, \textstyle \beta, and \textstyle \gamma values define the three different regions, giving us an associated risk for classifying an object. When \textstyle \alpha > \beta, we get \textstyle \alpha > \gamma > \beta and can simplify (P, N, B) into (P1, N1, B1):

  • P1: If \textstyle P(A\mid [x]) \geq \alpha, decide POS(\textstyle A);
  • N1: If \textstyle P(A\mid[x]) \leq \beta, decide NEG(\textstyle A);
  • B1: If \textstyle \beta < P(A\mid[x]) < \alpha, decide BND(\textstyle A).
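The rules (P1, N1, B1) can be sketched as follows. The function names and the loss values are hypothetical and chosen only so that the required orderings of the losses hold; the threshold formulas are the ones stated for \textstyle \alpha, \textstyle \beta, and \textstyle \gamma.

```python
def thresholds(loss):
    """Compute (alpha, beta, gamma) from losses; loss["XY"] is lambda_{XY}."""
    alpha = (loss["PN"] - loss["BN"]) / (
        (loss["BP"] - loss["BN"]) - (loss["PP"] - loss["PN"]))
    gamma = (loss["PN"] - loss["NN"]) / (
        (loss["NP"] - loss["NN"]) - (loss["PP"] - loss["PN"]))
    beta = (loss["BN"] - loss["NN"]) / (
        (loss["NP"] - loss["NN"]) - (loss["BP"] - loss["BN"]))
    return alpha, beta, gamma


def classify(p, alpha, beta):
    """Apply rules P1, N1, B1 to p = P(A | [x]); requires alpha > beta."""
    if not alpha > beta:
        raise ValueError("rules P1, N1, B1 assume alpha > beta")
    if p >= alpha:
        return "POS"
    if p <= beta:
        return "NEG"
    return "BND"


# Hypothetical losses with PP <= BP < NP and NN <= BN < PN.
LOSS = {"PP": 0.0, "BP": 2.0, "NP": 6.0,
        "NN": 0.0, "BN": 1.0, "PN": 4.0}

alpha, beta, gamma = thresholds(LOSS)
labels = [classify(p, alpha, beta) for p in (0.7, 0.5, 0.1)]
```

For these assumed losses the thresholds come out ordered as \textstyle \alpha > \gamma > \beta, so only \textstyle \alpha and \textstyle \beta are needed to assign a region.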

When \textstyle \alpha = \beta = \gamma, we can simplify the rules (P, N, B) into (P2, N2, B2), which divide the regions based solely on \textstyle \alpha:

  • P2: If \textstyle P(A\mid[x]) > \alpha, decide POS(\textstyle A);
  • N2: If \textstyle P(A\mid[x]) < \alpha, decide NEG(\textstyle A);
  • B2: If \textstyle P(A\mid[x]) = \alpha, decide BND(\textstyle A).

The DTRS approach has been applied successfully in data mining, feature selection, information retrieval, and classification, among other areas.

References

  1. Yao, Y.Y.; Wong, S.K.M.; Lingras, P. (1990). "A decision-theoretic rough set model". Methodologies for Intelligent Systems, 5, Proceedings of the 5th International Symposium on Methodologies for Intelligent Systems. Knoxville, Tennessee, USA: North-Holland: 17–25.
