Notes on Kaggle's Intro to Deep Learning

Notes taken while working through Kaggle's Intro to Deep Learning course.

1. A Single Neuron

Creating a single neuron

from tensorflow import keras
from tensorflow.keras import layers

# Create a network with 1 linear unit
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[3])
])

Here layers.Dense() is a fully connected (dense) layer: units is the number of outputs the layer produces, and input_shape describes the shape of a single input example. For tabular data this is just the number of features, as in input_shape=[3] above; for image data it would be [height, width, channels].

Use model.weights to inspect the model's parameters.
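
A single linear unit computes $y = w \cdot x + b$. A minimal sketch of inspecting its (initially random) parameters:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(units=1, input_shape=[3])
])

# one (3, 1) kernel and one (1,) bias, both tf.Variable objects
for var in model.weights:
    print(var.name, var.shape)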

2. Deep Neural Networks

Creating a network

model = keras.Sequential([
    # the hidden ReLU layers
    layers.Dense(units=4, activation='relu', input_shape=[2]),
    layers.Dense(units=3, activation='relu'),
    # the linear output layer
    layers.Dense(units=1),
])

The activation can also be attached as a separate layer with layers.Activation('relu').
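
For example, the network above could equivalently be written with explicit activation layers; a sketch:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # Dense layers with no activation, each followed by an Activation layer
    layers.Dense(units=4, input_shape=[2]),
    layers.Activation('relu'),
    layers.Dense(units=3),
    layers.Activation('relu'),
    layers.Dense(units=1),
])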

Other activation functions

ReLU:

$$ReLU(x) = \max(0, x)$$

ELU:

$$
ELU(x, \alpha) =
\begin{cases}
x, &x \geq 0 \\
\alpha(e^x - 1), &x < 0
\end{cases}
$$

SeLU:

$$
SeLU(x) =
\begin{cases}
\lambda_{selu}x, &x \geq 0 \\
\lambda_{selu}\alpha_{selu}(e^x - 1), &x < 0
\end{cases}
$$

where $\alpha_{selu} \approx 1.6733$ and $\lambda_{selu} \approx 1.0507$.
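
All three are available by name in Keras; for example:

from tensorflow.keras import layers

# 'relu', 'elu', and 'selu' are built-in activation names
relu_layer = layers.Dense(units=8, activation='relu')
elu_layer = layers.Dense(units=8, activation='elu')
selu_layer = layers.Dense(units=8, activation='selu')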

3. Stochastic Gradient Descent

Two common loss functions for regression (defined below):

  • RMSE (Root Mean Square Error)
  • MAE (Mean Absolute Error)
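
For predictions $\hat{y}_i$ and targets $y_i$ over $n$ examples, these are defined as:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|, \qquad RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$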
import pandas as pd

# specify the optimizer and loss function for the model
model.compile(
    optimizer="adam",
    loss="mae",
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=256,
    epochs=10,
)

# convert the training history to a dataframe
history_df = pd.DataFrame(history.history)
# use Pandas' native plot method
history_df['loss'].plot();

4. Overfitting and Underfitting

Underfitting: the model is too simple to capture the signal in the data, so the loss remains high on both the training and validation sets.

Overfitting: the model learns noise specific to the training set, so the training loss keeps falling while the validation loss stops improving or starts rising.

Early stopping

In Keras, early stopping is implemented as a callback, which runs at the end of each training epoch.

from tensorflow.keras.callbacks import EarlyStopping

# stop training when the validation loss has failed to improve by at
# least 0.001 over 20 consecutive epochs, then restore the best weights
early_stopping = EarlyStopping(
    min_delta=0.001,
    patience=20,
    restore_best_weights=True,
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=256,
    epochs=500,
    callbacks=[early_stopping],
    verbose=0,  # suppress the training log
)

history_df = pd.DataFrame(history.history)
history_df.loc[5:, ['loss', 'val_loss']].plot();
print("Minimum validation loss: {}".format(history_df['val_loss'].min()))

5. Dropout and Batch Normalization

Dropout

Dropout randomly zeroes a fraction (rate) of the previous layer's outputs at each training step, so the network cannot rely too heavily on any single activation; this helps reduce overfitting.

layers.Dropout(rate=0.3)

Batch Normalization (BN)

A BN layer has two trainable parameters, $\gamma$ and $\beta$. It first normalizes its inputs to zero mean and unit variance using the batch statistics $\mu$ and $\sigma$, i.e. $x_i \leftarrow \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}$, and then rescales and shifts the result: $x_i \leftarrow \gamma x_i + \beta$. This normalizes the data while letting the learned $\gamma$ and $\beta$ restore whatever scale and shift are useful, preserving some of the original distribution.
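
A minimal NumPy sketch of this transform ($\gamma$ and $\beta$ are fixed here for illustration; in a real BN layer they are learned):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
eps = 1e-3

# normalize using the batch mean and variance
mu, var = x.mean(), x.var()
x_hat = (x - mu) / np.sqrt(var + eps)

# rescale and shift with the trainable parameters
gamma, beta = 1.0, 0.0  # illustrative values, learned in practice
y = gamma * x_hat + beta
print(y)  # approximately zero mean and unit variance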

A BN layer generally helps mitigate exploding or vanishing gradients, and it can also speed up training.

layers.BatchNormalization()

Example:

model = keras.Sequential([
    layers.Dense(1024, activation='relu', input_shape=[11]),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(1024, activation='relu'),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(1024, activation='relu'),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(1),
])

6. Binary Classification

Cross-Entropy

The loss function used for classification: the negative log-probability assigned to the true class, $-\ln p_x$. For binary classification with true label $y \in \{0, 1\}$ and predicted probability $p$, this becomes $-[y \ln p + (1 - y)\ln(1 - p)]$.
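
A small sketch of the binary case computed by hand, with made-up values:

import numpy as np

y_true = np.array([1, 0, 1])        # true labels
p_pred = np.array([0.9, 0.2, 0.6])  # predicted probabilities of class 1

# binary cross-entropy averaged over the examples
bce = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
print(bce)  # ~0.28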

Example:

model = keras.Sequential([
    layers.Dense(4, activation='relu', input_shape=[33]),
    layers.Dense(4, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)
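
Training then follows the same pattern as in earlier sections; a sketch assuming X_train, y_train, X_valid, and y_valid are already prepared:

from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    patience=10,
    min_delta=0.001,
    restore_best_weights=True,
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=512,
    epochs=1000,
    callbacks=[early_stopping],
    verbose=0,
)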