
Machine Learning



Over the break I read a few books on machine learning; this post collects some of the more important core formulas.

Model Description

Under the feature-space assumption, we look for a linear coefficient vector $\theta$ so that a linear function approximates the target vector: $\hat{y} = \theta^\top x$.

How well the approximation fits is measured by a cost function; the MSE below is one example:

$$\mathrm{MSE}(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\theta^\top x^{(i)} - y^{(i)}\right)^2$$

Linear Regression

Gradient Descent

$$\theta^{(\text{next step})} = \theta - \eta\, \nabla_\theta\, \mathrm{MSE}(\theta)$$

where

$$\nabla_\theta\, \mathrm{MSE}(\theta) = \frac{2}{m} X^\top (X\theta - y)$$
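
A minimal NumPy sketch of this update rule on made-up toy data (the learning rate, iteration count, and data are illustrative, not from the original post):

import numpy as np

# toy data: y = 4 + 3x + noise
rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X[:, 0] + rng.standard_normal(100)

X_b = np.c_[np.ones((100, 1)), X]  # prepend a bias column of 1s
eta, n_iterations = 0.1, 1000      # learning rate and step count
theta = rng.standard_normal(2)     # random initialization

for _ in range(n_iterations):
    gradients = 2 / len(X_b) * X_b.T @ (X_b @ theta - y)  # the MSE gradient above
    theta -= eta * gradients       # theta should land near (4, 3)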

With regularization terms:

  • Ridge Regression: $J(\theta) = \mathrm{MSE}(\theta) + \alpha \frac{1}{2}\sum_{i=1}^{n}\theta_i^2$
  • LASSO: $J(\theta) = \mathrm{MSE}(\theta) + \alpha \sum_{i=1}^{n}\lvert\theta_i\rvert$
  • Elastic Net: $J(\theta) = \mathrm{MSE}(\theta) + r\alpha \sum_{i=1}^{n}\lvert\theta_i\rvert + \frac{1-r}{2}\alpha \sum_{i=1}^{n}\theta_i^2$
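
All three regularized models are available in scikit-learn; a minimal sketch (the alpha and l1_ratio values are illustrative):

from sklearn.linear_model import Ridge, Lasso, ElasticNet

ridge = Ridge(alpha=1.0)                       # alpha scales the L2 penalty
lasso = Lasso(alpha=0.1)                       # alpha scales the L1 penalty
elastic = ElasticNet(alpha=0.1, l1_ratio=0.5)  # l1_ratio plays the role of r above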
sklearn - Linear Regression

from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X, y)              # X: feature matrix, y: target vector
lr.intercept_, lr.coef_   # fitted bias term and weight vector

# MSE as a metric
from sklearn.metrics import mean_squared_error

# the same model fitted by stochastic gradient descent
from sklearn.linear_model import SGDRegressor
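
A hedged usage sketch for SGDRegressor on the same kind of toy data (hyperparameters are illustrative):

import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X[:, 0] + rng.standard_normal(100)

# one gradient step per training instance; penalty=None disables regularization
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, penalty=None, eta0=0.1)
sgd_reg.fit(X, y)
sgd_reg.intercept_, sgd_reg.coef_  # should land near (4, 3)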

Logistic Regression

$\sigma(t)$ is the sigmoid function, $\sigma(t) = \dfrac{1}{1 + e^{-t}}$, and the model's probability estimate is $\hat{p} = \sigma(\theta^\top x)$.

Logistic Regression cost function (log loss):

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\hat{p}^{(i)} + \left(1 - y^{(i)}\right)\log\left(1 - \hat{p}^{(i)}\right)\right]$$

Logistic cost function partial derivatives:

$$\frac{\partial}{\partial\theta_j} J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\sigma\left(\theta^\top x^{(i)}\right) - y^{(i)}\right) x_j^{(i)}$$

sklearn - Logistic Regression

from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X, y)             # y must be binary labels here
log_reg.predict_proba(X[:1])  # [P(class 0), P(class 1)]

Softmax Regression
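
Softmax regression generalizes logistic regression to $K$ classes: each class $k$ gets a score $s_k(x) = \theta_k^\top x$, the scores are normalized into probabilities by the softmax function $\hat{p}_k = \exp(s_k(x)) \big/ \sum_{j=1}^{K}\exp(s_j(x))$, and training minimizes the cross-entropy loss. A minimal scikit-learn sketch on iris (hyperparameters illustrative; multi_class="multinomial" requests softmax training explicitly in older scikit-learn versions, while newer ones default to it for multiclass targets):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target  # three classes

# one theta vector per class, trained jointly with cross-entropy
softmax_reg = LogisticRegression(multi_class="multinomial", solver="lbfgs", C=10)
softmax_reg.fit(X, y)
softmax_reg.predict_proba(X[:1])  # per-class probabilities summing to 1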

Support Vector Machine

  • Decision Functions and Predictions: $\hat{y} = 0$ if $w^\top x + b < 0$, else $\hat{y} = 1$
  • Hard Margin Classification: $\min_{w,b}\ \frac{1}{2} w^\top w$

subject to $t^{(i)}\left(w^\top x^{(i)} + b\right) \ge 1$ for $i = 1, \ldots, m$

  • Soft Margin Classification: $\min_{w,b,\zeta}\ \frac{1}{2} w^\top w + C \sum_{i=1}^{m} \zeta^{(i)}$

subject to $t^{(i)}\left(w^\top x^{(i)} + b\right) \ge 1 - \zeta^{(i)}$ and $\zeta^{(i)} \ge 0$ for $i = 1, \ldots, m$

  • Dual Problem: $\min_{\alpha}\ \frac{1}{2} \sum_{i=1}^{m}\sum_{j=1}^{m} \alpha^{(i)} \alpha^{(j)} t^{(i)} t^{(j)}\, x^{(i)\top} x^{(j)} - \sum_{i=1}^{m} \alpha^{(i)}$

subject to $\alpha^{(i)} \ge 0$ for $i = 1, \ldots, m$, where $t^{(i)} \in \{-1, 1\}$ are the class targets

LinearSVC

import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]                   # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # Iris-Virginica

svm_clf = Pipeline([
    ("scaler", StandardScaler()),             # SVMs are sensitive to feature scales
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])

svm_clf.fit(X, y)

Common kernels

  • Linear: $K(a, b) = a^\top b$
  • Polynomial: $K(a, b) = \left(\gamma\, a^\top b + r\right)^d$
  • Gaussian RBF: $K(a, b) = \exp\left(-\gamma \lVert a - b \rVert^2\right)$
  • Sigmoid: $K(a, b) = \tanh\left(\gamma\, a^\top b + r\right)$
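
The kernel trick lets an SVM use these similarity functions without ever materializing the corresponding feature map. A minimal sketch with the Gaussian RBF kernel (gamma and C are illustrative; assumes the iris X, y prepared in the LinearSVC block above):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rbf_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001)),  # larger gamma -> narrower influence per point
])
rbf_kernel_svm_clf.fit(X, y)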

From trees to forests.

Decision Trees

  • Gini impurity: $G_i = 1 - \sum_{k=1}^{n} p_{i,k}^2$
  • Entropy: $H_i = -\sum_{k=1,\; p_{i,k} \neq 0}^{n} p_{i,k}\, \log_2 p_{i,k}$
  • CART cost function for regression: $J(k, t_k) = \dfrac{m_{\mathrm{left}}}{m}\mathrm{MSE}_{\mathrm{left}} + \dfrac{m_{\mathrm{right}}}{m}\mathrm{MSE}_{\mathrm{right}}$

where $\mathrm{MSE}_{\mathrm{node}} = \sum_{i \in \mathrm{node}}\left(\hat{y}_{\mathrm{node}} - y^{(i)}\right)^2$ and $\hat{y}_{\mathrm{node}} = \dfrac{1}{m_{\mathrm{node}}}\sum_{i \in \mathrm{node}} y^{(i)}$

DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:] # petal length and width
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
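
Because the CART cost above just swaps impurity for per-node MSE, regression needs only a different estimator class. A minimal sketch on made-up noisy quadratic data (max_depth is illustrative):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(size=(200, 1))
y = (X[:, 0] - 0.5) ** 2 + 0.025 * rng.standard_normal(200)

tree_reg = DecisionTreeRegressor(max_depth=2)  # each split chosen to minimize J(k, t_k)
tree_reg.fit(X, y)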

Random Forests

Random Forests are, in my view, the classic example of Ensemble Learning.

Take classifiers as an example: given the same data, different classifiers may reach different decisions, e.g. a Logistic Regression classifier, a Random Forest classifier, and a K-Nearest Neighbors classifier.

Naturally, a voting strategy can then be introduced to make the final decision.

Voting classifier

from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

log_clf = LogisticRegression()
rnd_clf = RandomForestClassifier()
svm_clf = SVC()

# hard voting: the majority class across the three classifiers wins
voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard')
voting_clf.fit(X_train, y_train)  # X_train, y_train from a prior train/test split
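
For the forest itself, scikit-learn bundles bagging and per-split feature randomness into one class. A minimal sketch on iris (hyperparameters illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# 500 trees, each grown on a bootstrap sample with a random feature subset per split
rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
rnd_clf.fit(X, y)
rnd_clf.feature_importances_  # average impurity reduction contributed by each feature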

Boosting

Adaboost
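
AdaBoost trains predictors sequentially, each round up-weighting the training instances its predecessors misclassified. A minimal scikit-learn sketch on iris (the stump depth, estimator count, and learning rate are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 200 depth-1 trees ("stumps"), combined by weighted majority vote
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=200, learning_rate=0.5)
ada_clf.fit(X, y)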

Gradient Boosting
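
Gradient boosting also adds predictors sequentially, but each new tree is fitted to the residual errors left by the ensemble so far. A minimal sketch on made-up data (hyperparameters illustrative):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-0.5, 0.5, size=(100, 1))
y = 3 * X[:, 0] ** 2 + 0.05 * rng.standard_normal(100)  # noisy quadratic

# each shallow tree fits the residuals of the previous trees
gbrt = GradientBoostingRegressor(max_depth=2, n_estimators=100, learning_rate=0.1)
gbrt.fit(X, y)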

Performance Metrics

Metrics tell us which direction the model should converge toward; there are several for both continuous (regression) and discrete (classification) models.

Classification

Precision $= \dfrac{TP}{TP + FP}$ and recall $= \dfrac{TP}{TP + FN}$; $F_1$ is the harmonic mean of the two:

$$F_1 = \frac{2}{\frac{1}{\mathrm{precision}} + \frac{1}{\mathrm{recall}}} = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$

precision_score and recall_score

from sklearn.metrics import precision_score, recall_score, f1_score
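
Hypothetical usage with made-up label arrays, to show how the three scores relate:

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
precision_score(y_true, y_pred)  # TP/(TP+FP) = 2/2 = 1.0
recall_score(y_true, y_pred)     # TP/(TP+FN) = 2/3
f1_score(y_true, y_pred)         # harmonic mean = 0.8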

Regression

  • MSE: $\mathrm{MSE}(X, h) = \dfrac{1}{m}\sum_{i=1}^{m}\left(h\left(x^{(i)}\right) - y^{(i)}\right)^2$
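
scikit-learn exposes this as mean_squared_error; a sketch with made-up numbers:

from sklearn.metrics import mean_squared_error

mean_squared_error([3.0, 5.0], [2.5, 5.0])  # ((3.0-2.5)**2 + 0**2) / 2 = 0.125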

Original post: https://www.cnblogs.com/lijianming180/p/12037887.html