machine learning in coding (python): building a prediction model with xgboost
Tags: scikit-learn, machine learning, xgboost, machine learning in coding
Continued from the previous post: http://blog.csdn.net/mmc2015/article/details/47304591
import numpy as np
import xgboost as xgb

def xgboost_pred(train, labels, test):
    # Booster parameters: a small learning rate (eta) with fairly deep trees,
    # plus row and column subsampling to control overfitting.
    params = {}
    params["objective"] = "reg:linear"
    params["eta"] = 0.005
    params["min_child_weight"] = 6
    params["subsample"] = 0.7
    params["colsample_bytree"] = 0.7
    params["scale_pos_weight"] = 1
    params["silent"] = 1
    params["max_depth"] = 9

    plst = list(params.items())

    # Hold out the first 4000 rows for early stopping.
    offset = 4000
    num_rounds = 10000
    xgtest = xgb.DMatrix(test)

    # Create the train and validation DMatrices.
    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    # Train with early stopping and predict with the best round found.
    # (Newer xgboost versions replace ntree_limit with iteration_range.)
    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist,
                      early_stopping_rounds=120)
    preds1 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # Reverse train and labels so a different slice of rows is held out for
    # early stopping; the second model fits log-transformed labels.
    # This adds very little to the score, but it is an option if you are
    # concerned about using all the data.
    train = train[::-1, :]
    labels = np.log(labels[::-1])

    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist,
                      early_stopping_rounds=120)
    preds2 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # Combine the two predictions with a weighted sum; since the metric only
    # cares about relative rank, we don't need a true average.
    preds = preds1 * 1.4 + preds2 * 8.6
    return preds
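For reference, here is a minimal usage sketch. The file names, the pandas loading step, and the 'target' column are assumptions for illustration only; they are not part of the original post. Note that preds1 is in the original label space while preds2 is in log space, so the weighted sum is only sensible for rank-based metrics, as the comment in the code points out.

# Hypothetical usage example; paths and column names are invented.
import pandas as pd

train_df = pd.read_csv('train.csv')   # hypothetical training file
test_df = pd.read_csv('test.csv')     # hypothetical test file

# Labels are assumed positive, since the second model takes np.log(labels).
labels = train_df['target'].values              # hypothetical label column
train = train_df.drop('target', axis=1).values
test = test_df.values

preds = xgboost_pred(train, labels, test)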
A detailed walkthrough of the code will follow when I have time; comments and criticism are welcome.
Copyright notice: this is an original post by the blogger; do not reproduce without permission.
Original post: http://blog.csdn.net/mmc2015/article/details/47304779