Decision Trees

This approach is also known as the divide-and-conquer method. It constructs a rule by dividing an overly general rule into a set of more specialized rules, each corresponding to a subset of the examples. It then continues recursively with those rules whose subsets still contain both positive and negative examples. The final rule set consists of the specialized rules whose subsets contain only positive examples. Some examples of these systems are:
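As an illustrative sketch (not taken from any particular system), the recursive partitioning step can be written as follows; the attribute and example names are hypothetical:

```python
from collections import Counter

def build_tree(examples, attributes):
    """Recursively partition (features, label) examples until each
    subset is class-pure, mirroring the divide-and-conquer strategy."""
    labels = {label for _, label in examples}
    if len(labels) == 1 or not attributes:
        # Pure subset, or nothing left to split on: return the majority class.
        return Counter(label for _, label in examples).most_common(1)[0][0]
    attr = attributes[0]  # naive choice; real learners pick the most predictive attribute
    branches = {}
    for features, label in examples:
        branches.setdefault(features[attr], []).append((features, label))
    return {attr: {value: build_tree(subset, attributes[1:])
                   for value, subset in branches.items()}}
```

Each branch of the returned nested dictionary corresponds to one specialized rule; recursion stops exactly when a subset no longer mixes classes.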

J48 (C4.5)
J48 is the Weka implementation of the C4.5 top-down decision tree learner proposed by Quinlan. The algorithm uses a greedy technique and is a variant of ID3: at each step it determines the most predictive attribute and splits the node on that attribute. Each node represents a decision point over the value of some attribute. J48 attempts to account for noise and missing data, and it handles numeric attributes by determining where thresholds for decision splits should be placed. The main parameters that can be set for this algorithm are the confidence threshold, the minimum number of instances per leaf, and the number of folds for reduced-error pruning.
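A minimal sketch of how "the most predictive attribute" can be scored, using information gain (C4.5 itself uses the closely related gain ratio); the data layout here is an assumption for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr):
    """Entropy reduction from splitting (features, label) examples on attr;
    the most predictive attribute is the one maximizing this score."""
    labels = [label for _, label in examples]
    before = entropy(labels)
    n = len(examples)
    groups = {}
    for features, label in examples:
        groups.setdefault(features[attr], []).append(label)
    after = sum(len(g) / n * entropy(g) for g in groups.values())
    return before - after
```

An attribute that perfectly separates the classes scores the full entropy of the label distribution; an uninformative one scores zero.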

Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.

ADTree
Alternating decision trees (ADTree) is a generalization of decision trees, voted decision trees, and voted decision stumps. The algorithm applies boosting to decision tree learning to produce accurate classifiers. The resulting classifier takes the form of a majority vote over a number of decision trees, but with smaller and easier-to-understand classification rules.
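ADTree prediction actually sums real-valued contributions along the tree's paths; as a simplified sketch, the "majority vote over a number of decision trees" idea looks like this (the weights and predictor functions are hypothetical stand-ins for boosted trees):

```python
def weighted_vote(classifiers, x):
    """Combine weak classifiers by weighted majority vote.
    classifiers: list of (weight, predict) pairs, e.g. produced by boosting."""
    scores = {}
    for weight, predict in classifiers:
        label = predict(x)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)
```

With uniform weights this reduces to a plain majority vote; boosting instead assigns each tree a weight reflecting its accuracy.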

DecisionStump
The DecisionStump algorithm builds simple binary decision 'stumps' (one-level decision trees) for both numeric and nominal classification problems. It copes with missing values either by extending a third branch from the stump or by treating 'missing' as a separate attribute value. DecisionStump is usually used in conjunction with a boosting algorithm such as LogitBoost. It performs regression (based on mean-squared error) or classification (based on entropy).

Witten, I.H., Frank, E., Trigg, L., Hall, M., Holmes, G., Cunningham, S.J. (1999). Weka: Practical machine learning tools and techniques with Java implementations. In Proc. ICONIP/ANZIIS/ANNES'99 Int. Workshop: Emerging Knowledge Engineering and Connectionist-Based Info. Systems, 192-196.

RandomTree
RandomTree is an algorithm for constructing a tree that considers K randomly chosen attributes at each node. It performs no pruning.
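The per-node choice can be sketched as follows, where `score` stands for any split-quality measure such as information gain (the function names are illustrative, not Weka's):

```python
import random

def pick_split_attribute(attributes, k, score, rng=random):
    """RandomTree-style selection: evaluate only K randomly drawn
    attributes at this node and return the best-scoring one."""
    candidates = rng.sample(attributes, min(k, len(attributes)))
    return max(candidates, key=score)
```

Restricting each node to a random attribute subset makes the trees fast to grow and mutually diverse, which is why such trees are natural building blocks for random forests.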

REPTree
The REPTree algorithm is a fast decision tree learner. It builds a decision/regression tree using information gain/variance and prunes it using reduced-error pruning (with back-fitting). The algorithm sorts values for numeric attributes only once. Missing values are dealt with by splitting the corresponding instances into pieces (as in C4.5).
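A compact sketch of reduced-error pruning over a nested-dict tree representation (without REPTree's back-fitting step); the representation and helper names are assumptions for illustration:

```python
from collections import Counter

def predict(tree, x):
    """Follow attribute branches until a leaf label is reached."""
    while isinstance(tree, dict):
        attr, branches = next(iter(tree.items()))
        tree = branches.get(x[attr])
    return tree

def errors(tree, pruning_set):
    return sum(predict(tree, x) != y for x, y in pruning_set)

def rep_prune(tree, pruning_set):
    """Reduced-error pruning: bottom-up, replace a subtree by the majority
    class of the pruning examples reaching it whenever that does not
    increase error on the held-out pruning set."""
    if not isinstance(tree, dict) or not pruning_set:
        return tree
    attr, branches = next(iter(tree.items()))
    pruned = {attr: {v: rep_prune(sub, [(x, y) for x, y in pruning_set
                                        if x[attr] == v])
                     for v, sub in branches.items()}}
    leaf = Counter(y for _, y in pruning_set).most_common(1)[0][0]
    return leaf if errors(leaf, pruning_set) <= errors(pruned, pruning_set) else pruned
```

A subtree that does not beat its own majority class on the pruning set is collapsed to a leaf, which is what keeps the final tree small.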
