Machine Learning Notes (Washington University) - Classification Specialization - Week 3
1. Quality metric
The quality metric for a decision tree is the classification error:
error = (number of incorrect predictions) / (number of examples)
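The formula above is just the fraction of wrong predictions. A minimal sketch with made-up labels:

```python
# Toy labels and predictions (invented for illustration).
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

# Classification error = number of incorrect predictions / number of examples
num_incorrect = sum(t != p for t, p in zip(y_true, y_pred))
error = num_incorrect / len(y_true)
print(error)  # 2 wrong out of 6 examples
```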
2. Greedy algorithm
Procedure
Step 1: Start with an empty tree
Step 2: Select a feature to split data
Explanation of Step 2:
Split the data on each feature
Calculate the classification error of the resulting decision stump
Choose the feature with the lowest error
For each split of the tree:
Step 3: If all data points in a node have the same y value, or if we have already used up all the features, stop.
Step 4: Otherwise, go to Step 2 and continue on this split.
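Step 2 above (score every candidate feature by the error of its decision stump, then pick the best) can be sketched as follows. The toy loan-style data and feature names are invented for illustration:

```python
from collections import Counter

def stump_error(data, feature):
    """Classification error of a one-level split on `feature`.

    Each child node predicts the majority class of the points it receives,
    so its contribution to the error is its count of minority-class points."""
    groups = {}
    for x, y in data:
        groups.setdefault(x[feature], []).append(y)
    mistakes = sum(len(ys) - Counter(ys).most_common(1)[0][1]
                   for ys in groups.values())
    return mistakes / len(data)

def best_split(data, features):
    """Greedy choice: the feature whose stump has the lowest error."""
    return min(features, key=lambda f: stump_error(data, f))

# (feature dict, label) pairs -- invented toy data
data = [
    ({"credit": "good", "term": "3yr"}, +1),
    ({"credit": "good", "term": "5yr"}, +1),
    ({"credit": "bad",  "term": "3yr"}, -1),
    ({"credit": "bad",  "term": "5yr"}, -1),
]
print(best_split(data, ["credit", "term"]))  # prints credit
```

Here splitting on "credit" separates the classes perfectly (error 0), while splitting on "term" leaves one mistake in each child, so the greedy algorithm picks "credit".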
Algorithm
predict(tree_node, input)
    if current tree_node is a leaf:
        return majority class of data points in leaf
    else:
        next_node = child node of tree_node whose feature value agrees with input
        return predict(next_node, input)
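The recursive pseudocode above translates directly into Python. The TreeNode layout (children keyed by feature value) is an assumption made for this sketch:

```python
class TreeNode:
    def __init__(self, feature=None, children=None, prediction=None):
        self.feature = feature          # feature this node splits on (None at a leaf)
        self.children = children or {}  # feature value -> child TreeNode
        self.prediction = prediction    # majority class stored at a leaf

def predict(tree_node, x):
    if not tree_node.children:          # leaf: return its majority class
        return tree_node.prediction
    # descend into the child whose feature value agrees with the input
    next_node = tree_node.children[x[tree_node.feature]]
    return predict(next_node, x)

# Tiny hand-built tree: one split on "credit", then two leaves
tree = TreeNode("credit", {
    "good": TreeNode(prediction=+1),
    "bad":  TreeNode(prediction=-1),
})
print(predict(tree, {"credit": "good"}))  # prints 1
```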
3. Threshold split
Threshold split is used for continuous inputs:
we pick a threshold value for the continuous feature and split the data on it.
Procedure:
Step 1: Sort the values of a feature hj(x): {v1, v2, ..., vN}
Step 2: For i = 1 ... N-1 (all adjacent pairs of sorted values)
consider split ti = (vi + vi+1) / 2
compute the classification error of the split
choose ti with the lowest classification error
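The procedure above can be sketched directly: sort the values, try the midpoint between each adjacent pair as a candidate threshold, and keep the one with the lowest classification error. The income data below is invented for illustration:

```python
def best_threshold(values, labels):
    """Return (threshold, error) of the best midpoint split."""
    pairs = sorted(zip(values, labels))          # Step 1: sort by feature value
    best_t, best_err = None, float("inf")
    for i in range(len(pairs) - 1):              # Step 2: all adjacent pairs
        t = (pairs[i][0] + pairs[i + 1][0]) / 2  # candidate t_i = (v_i + v_{i+1}) / 2
        left = [y for v, y in pairs if v < t]
        right = [y for v, y in pairs if v >= t]
        # each side predicts its majority class; count the mistakes
        mistakes = 0
        for side in (left, right):
            if side:
                majority = max(set(side), key=side.count)
                mistakes += sum(y != majority for y in side)
        err = mistakes / len(pairs)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

# Invented data: low incomes labeled -1, high incomes labeled +1
incomes = [20, 30, 40, 80, 90, 100]
labels = [-1, -1, -1, +1, +1, +1]
print(best_threshold(incomes, labels))  # prints (60.0, 0.0)
```

With N points there are only N-1 candidate midpoints to check, which is why sorting first makes the search cheap.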