R | Hands-on H2O deep learning in R: the h2o package

library(h2o)
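# Pass nthreads = 1 to force a single thread; the default thread count depends on the h2o version
h2o.init()
# Starts a local H2O instance, or connects to one that is already running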


train_file <- "https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/train.csv.gz"
test_file <- "https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/test.csv.gz"

train <- h2o.importFile(train_file)
test  <- h2o.importFile(test_file)

# To see a brief summary of the data, run the following command
summary(train)
summary(test)

y <- "C785"
x <- setdiff(names(train), y)

# Encode the response column as categorical for multinomial classification
train[,y] <- as.factor(train[,y])
test[,y]  <- as.factor(test[,y])

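# Train a deep learning model and time the run.
# No validation frame or nfolds is passed here, so the reported metrics are training metrics.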
system.time(
  model_cv <- h2o.deeplearning(x = x,
                               y = y,
                               training_frame = train,
                               distribution = "multinomial",
                               activation = "Rectifier",
                               hidden = c(32),
                               l1 = 1e-5,
                               epochs = 200)
)
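
The call above is given only a training frame, so the imported test frame is never used and the model reports training metrics only. Below is a minimal sketch of one way to use the held-out data, assuming the same MNIST files and column layout as above. The name model_val and the shorter epochs = 10 are choices made here to keep the run short, not settings from the original.

```r
library(h2o)
h2o.init()

# Same public MNIST files as in the code above
train <- h2o.importFile("https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/train.csv.gz")
test  <- h2o.importFile("https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/test.csv.gz")

y <- "C785"
x <- setdiff(names(train), y)
train[, y] <- as.factor(train[, y])
test[, y]  <- as.factor(test[, y])

# model_val is a name chosen for this sketch; epochs = 10 is an assumption
model_val <- h2o.deeplearning(x = x,
                              y = y,
                              training_frame = train,
                              validation_frame = test,   # score on held-out rows during training
                              distribution = "multinomial",
                              activation = "Rectifier",
                              hidden = c(32),
                              l1 = 1e-5,
                              epochs = 10)

# Metrics computed on the validation frame rather than the training frame
perf_val <- h2o.performance(model_val, valid = TRUE)
print(h2o.confusionMatrix(perf_val))
print(h2o.logloss(perf_val))
```

Alternatively, the nfolds argument of h2o.deeplearning requests k-fold cross-validation on the training frame, which avoids touching the test set during model selection.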

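3. The simplest example: deep learning on the iris dataset

This example comes mainly from the h2o.deeplearning function example in the official h2o manual, and it is easy to follow. To inspect the predictions, use as.data.frame to convert them into an ordinary R data frame.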

## Source: the h2o.deeplearning function example in the official h2o manual
library(h2o)
h2o.init()
iris.hex <- as.h2o(iris)

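# iris has only 5 columns, so y = 6 would fail; Species (column 5) is the response
iris.dl <- h2o.deeplearning(x = 1:4, y = 5, training_frame = iris.hex)  # fit the model
# now make a prediction
predictions <- h2o.predict(iris.dl, iris.hex)          # predict
as.data.frame(predictions)                             # convert the predictions to an R data frame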

performance = h2o.performance(model = iris.dl)
print(performance)
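
As noted above, as.data.frame turns the predictions into an ordinary R data frame, after which base R tools apply. Here is a small sketch, assuming Species (column 5 of iris) is the response. The names pred_df and accuracy, and the seed value, are introduced here for illustration.

```r
library(h2o)
h2o.init()

iris.hex <- as.h2o(iris)
# seed is set only to make repeated runs more comparable
iris.dl <- h2o.deeplearning(x = 1:4, y = 5, training_frame = iris.hex, seed = 1)

# Pull the predictions out of the H2O cluster into a plain R data frame
pred_df <- as.data.frame(h2o.predict(iris.dl, iris.hex))
print(head(pred_df))   # a "predict" column plus one probability column per class

# Compare predicted and actual labels with base R
predicted <- as.character(pred_df$predict)
actual    <- as.character(iris$Species)
print(table(actual = actual, predicted = predicted))
accuracy <- mean(predicted == actual)
print(accuracy)
```

Because the model is scored on the same rows it was trained on, this is training accuracy, not an estimate of out-of-sample performance.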

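The output looks like the listing below.

It has three main parts: model evaluation metrics, a confusion matrix, and a "Maximum Metrics" table that gives, for each metric, the classification threshold at which that metric reaches its maximum, together with the maximum value.

The confusion matrix tells most of the story. Note that this is a binomial report (H2OBinomialMetrics) on a two-class response with 100 and 50 rows per class; with the three-class Species column as the response, h2o reports multinomial metrics instead.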

> print(performance)
H2OBinomialMetrics: deeplearning
** Reported on training data. **
Description: Metrics reported on full training frame

MSE:  0.01030833
R^2:  0.9536125
LogLoss:  0.05097025
AUC:  1
Gini:  1

Confusion Matrix for F1-optimal threshold:
         0  1    Error    Rate
0      100  0 0.000000  =0/100
1        0 50 0.000000   =0/50
Totals 100 50 0.000000  =0/150

Maximum Metrics: Maximum metrics at their respective thresholds
                      metric threshold    value idx
1                     max f1  0.983179 1.000000  49
2                     max f2  0.983179 1.000000  49
3               max f0point5  0.983179 1.000000  49
4               max accuracy  0.983179 1.000000  49
5              max precision  0.999915 1.000000   0
6                 max recall  0.983179 1.000000  49
7            max specificity  0.999915 1.000000   0
8           max absolute_MCC  0.983179 1.000000  49
9 max min_per_class_accuracy  0.983179 1.000000  49

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
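
The "Maximum Metrics" table in the output above can be read as follows: a binary classifier outputs a probability, each cutoff on that probability gives a different confusion matrix, and the table lists the cutoff at which each metric peaks. Here is a base R sketch of the idea for F1 only. The labels and probabilities are made up for illustration, and this is not h2o's own implementation.

```r
# Toy data, invented for illustration: true labels and predicted P(class = 1)
actual <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1)
prob   <- c(0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95)

# F1 score obtained when every row with prob >= threshold is labelled 1
f1_at <- function(threshold) {
  predicted <- as.integer(prob >= threshold)
  tp <- sum(predicted == 1 & actual == 1)   # true positives
  fp <- sum(predicted == 1 & actual == 0)   # false positives
  fn <- sum(predicted == 0 & actual == 1)   # false negatives
  if (tp == 0) return(0)
  2 * tp / (2 * tp + fp + fn)
}

# Try each observed probability as a candidate threshold
thresholds <- sort(unique(prob))
f1_values  <- vapply(thresholds, f1_at, numeric(1))

# Keep the threshold with the highest F1, as one row of a "max f1" style table
best <- which.max(f1_values)
print(data.frame(metric = "max f1",
                 threshold = thresholds[best],
                 value = f1_values[best]))
```

On this toy data the best cutoff is 0.40, where F1 is 10/11. In the listing above nearly every metric peaks at the same threshold with value 1 because the two classes are perfectly separated on the training data.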

Original article: http://blog.csdn.net/sinat_26917383/article/details/51219025
