2370 - Identify the most popular decision tree algorithms

Concepts

  • [IP3-4-5] Decision trees
    Decision trees are a data mining technique used in many disciplines, including remote sensing. Their major advantages include the ability to capture interactions between the modeling variables, the understandability of the produced models (trees), and their efficiency. Input data for decision trees may comprise a large number of examples or a large number of variables. This is important in the context of pixel-based classification in geographical information systems, where very large numbers of spatial units/points need to be classified.
    A decision tree consists of nodes, branches and leaves. Each node contains a test on an attribute; branches grow out of the node, each carrying the subset of data that matches one outcome of the test. The resulting subsets should have class values that are as homogeneous as possible. Splitting proceeds hierarchically, dividing the training dataset until a stopping rule set at the start is reached, such as a minimum number of training examples per leaf or a set level of confidence. For discrete attributes, a branch of the tree is typically created for each possible value of the attribute. For continuous attributes, a threshold is selected and two branches are created based on that threshold. The type of target determines the name of the tree: a classification tree predicts a discrete target, while a regression tree predicts a continuous one.
    Decision trees are derived from data only. As such, they represent the data-driven or empirical approach, which is more appropriate when we have plenty of high-quality (reliable and relevant) measured data and little knowledge about the studied system, for instance the spectral response of each land cover class needed for classification. An important mechanism used to improve decision tree performance is tree pruning.
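    Before turning to pruning, the threshold selection for a continuous attribute described above can be sketched in plain Python. This is a hypothetical illustration (the reflectance values, class names, and function names are invented, and Gini impurity is just one common homogeneity measure): candidate thresholds are tried and the one whose two branches are most homogeneous is kept.

    ```python
    def gini(labels):
        """Gini impurity of a list of class labels (0.0 = perfectly homogeneous)."""
        n = len(labels)
        if n == 0:
            return 0.0
        counts = {}
        for c in labels:
            counts[c] = counts.get(c, 0) + 1
        return 1.0 - sum((k / n) ** 2 for k in counts.values())

    def best_threshold(values, labels):
        """Return (threshold, weighted_impurity) for the best binary split."""
        pairs = sorted(zip(values, labels))
        n = len(pairs)
        best = (None, float("inf"))
        for i in range(1, n):
            t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
            left = [c for v, c in pairs if v <= t]
            right = [c for v, c in pairs if v > t]
            # Impurity of the two branches, weighted by branch size.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[1]:
                best = (t, score)
        return best

    # Hypothetical reflectance values for two land-cover classes:
    values = [0.10, 0.15, 0.20, 0.60, 0.70, 0.80]
    labels = ["water", "water", "water", "veg", "veg", "veg"]
    t, score = best_threshold(values, labels)
    print(round(t, 2), score)  # → 0.4 0.0 (the split separates the classes exactly)
    ```

    The midpoint between 0.20 and 0.60 yields two perfectly homogeneous branches, so the weighted impurity is zero; real spectral data would rarely split this cleanly.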
    Pruning reduces the size of a decision tree by removing sections of the tree (subtrees) that are unreliable and do not contribute to its predictive performance. Pruning reduces the complexity of the tree and helps to achieve better predictive accuracy by reducing over-fitting and removing sections of the tree that may be based on noisy or erroneous data. Depending on when it is done during tree construction, it is called pre-pruning (stopping growth early) or post-pruning (removing subtrees from a fully grown tree). The CART (Classification And Regression Trees) system is the first widely known and used system for learning decision trees. Notable successors are the C4.5 system for learning classification trees (implemented as J48 in the WEKA software), which was in turn succeeded by C5.0.
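    As a minimal sketch of pre-pruning (an invented toy example, not the CART or C4.5 implementation), tree growth can simply stop when a node is pure or holds too few training examples to split further. Here the stopping rule is a hypothetical `min_leaf` parameter, and the split point is taken at the median value purely for brevity rather than by an impurity criterion.

    ```python
    def majority(labels):
        """Most frequent class label in a node."""
        return max(set(labels), key=labels.count)

    def grow(rows, min_leaf=2):
        """Grow a tree from rows of (value, label) pairs as nested dicts.

        Pre-pruning: recursion stops when the node is pure or when a
        split could leave a child with fewer than min_leaf examples.
        """
        labels = [c for _, c in rows]
        if len(set(labels)) == 1 or len(rows) < 2 * min_leaf:
            return {"leaf": majority(labels), "size": len(rows)}
        rows = sorted(rows)
        mid = len(rows) // 2  # illustrative median split, not the best-impurity split
        t = (rows[mid - 1][0] + rows[mid][0]) / 2
        return {"test": t,
                "le": grow(rows[:mid], min_leaf),
                "gt": grow(rows[mid:], min_leaf)}

    # Hypothetical training data: (reflectance, land-cover class).
    data = [(0.1, "water"), (0.2, "water"), (0.6, "veg"), (0.7, "veg"), (0.8, "veg")]
    tree = grow(data, min_leaf=2)
    print(round(tree["test"], 2))  # → 0.4
    ```

    With `min_leaf=2` the root splits once and both children are pure leaves; a larger `min_leaf` would stop growth at the root, trading detail for robustness against noise, which is exactly the over-fitting control pruning aims at.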