
Decision Tree

Posted: 2018-01-18 01:05:29


The code itself is easy enough to follow, but in the later part, where a better feature is chosen for splitting the data set, I don't quite see why it is chosen that way.

I still need to work through the derivation carefully.

from math import log

# Compute the Shannon entropy of a data set (class label in the last column)
def calcShannonEnt(dataSet):
    numEntries = len(dataSet)
    labelCount = {}
    for featVector in dataSet:
        currentLabel = featVector[-1]
        labelCount[currentLabel] = labelCount.get(currentLabel, 0) + 1
    # Sum -p*log2(p) over the label counts; this belongs outside the counting loop
    shannonEnt = 0.0
    for key in labelCount:
        prob = float(labelCount[key]) / numEntries
        shannonEnt -= prob * log(prob, 2)
    return shannonEnt

# Training samples
def createDataSet():
    dataSet = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'],
               [0, 1, 'no'], [0, 1, 'no']]
    labels = ['no surfacing', 'flippers']
    return dataSet, labels

# Split the data set on the given feature index (axis) and value
def splitDataSet(dataSet, axis, value):
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:  # was featVec[0]: must test the chosen feature, not always the first
            reducedFeatVec = featVec[:axis]  # features before axis, i.e. drop the feature we split on
            reducedFeatVec.extend(featVec[axis+1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet

def main():
    dataSet, labels = createDataSet()
    # shannonEnt = calcShannonEnt(dataSet)  # Shannon entropy
    # print(shannonEnt)
    print(splitDataSet(dataSet, 0, 1))
    print(splitDataSet(dataSet, 0, 0))

main()
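The part I found puzzling, how the "best" split is chosen, comes down to information gain: for each feature, compute the weighted entropy of the subsets after splitting on it, and pick the feature whose split reduces entropy the most. A minimal self-contained sketch (helper names are my own, not from the book):

```python
from math import log

def shannon_entropy(dataset):
    # H = -sum(p * log2(p)) over the class labels in the last column
    counts = {}
    for vec in dataset:
        counts[vec[-1]] = counts.get(vec[-1], 0) + 1
    n = len(dataset)
    return -sum((c / n) * log(c / n, 2) for c in counts.values())

def split(dataset, axis, value):
    # rows where feature `axis` equals `value`, with that feature column removed
    return [vec[:axis] + vec[axis+1:] for vec in dataset if vec[axis] == value]

def best_feature(dataset):
    base = shannon_entropy(dataset)
    best_gain, best_axis = 0.0, -1
    for axis in range(len(dataset[0]) - 1):
        values = {vec[axis] for vec in dataset}
        # weighted average entropy of the subsets produced by this feature
        new_ent = sum(
            len(sub) / len(dataset) * shannon_entropy(sub)
            for sub in (split(dataset, axis, v) for v in values)
        )
        gain = base - new_ent  # information gain of splitting on this feature
        if gain > best_gain:
            best_gain, best_axis = gain, axis
    return best_axis

dataset = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'],
           [0, 1, 'no'], [0, 1, 'no']]
print(best_feature(dataset))  # 0
```

On this data set, splitting on feature 0 ("no surfacing") gives gain of about 0.42 bits versus about 0.17 for feature 1, so feature 0 wins: its value-0 subset is pure (all 'no'), which drives the weighted entropy down.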

The difference between append and extend:

a = [1, 2, 3]
c = [1, 2, 3]
b = [4, 5, 6]
a.append(b)
c.extend(b)
print(a)  # [1, 2, 3, [4, 5, 6]]
print(c)  # [1, 2, 3, 4, 5, 6]
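This distinction is exactly why splitDataSet builds reducedFeatVec with extend rather than append: extend keeps the row flat after the feature column is dropped. A small sketch with a made-up row:

```python
featVec = [1, 0, 'no']  # hypothetical row: two feature values plus a class label
axis = 0                # index of the feature being split on

reduced = featVec[:axis]           # features before axis -> []
reduced.extend(featVec[axis+1:])   # splice in the rest, flat
print(reduced)                     # [0, 'no']

# With append, the remainder would be nested as a sub-list instead:
nested = featVec[:axis]
nested.append(featVec[axis+1:])
print(nested)                      # [[0, 'no']]
```

Only the flat form is a valid row for the next round of splitting; the nested form would break later indexing into feature columns.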


Original post: https://www.cnblogs.com/littlepear/p/8306820.html
