CS231n: Model Parameter Initialization (Weight Initialization)

星月夜语

License: CC 4.0 BY-SA

Categories: Deep Learning Fundamentals, python, Algorithms. Tags: model initialization, pytorch

Article link: https://blog.csdn.net/ljh618625/article/details/105457644


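This article discusses why initialization matters for deep learning models and surveys the common approaches. All-zero initialization leaves every neuron identical, whereas small random numbers break the symmetry so that neurons compute distinct updates. Weights are typically drawn from a zero-mean, unit-standard-deviation Gaussian and then scaled down to small values, so that each neuron's weight vector points in a random direction in the input space. With ReLU activations, the biases are sometimes set to a small constant so that all units fire at the start, but initializing them to zero is also common. The recommended practice is to follow the initialization strategy proposed by He et al.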
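
The symmetry argument can be checked numerically. The following is a minimal NumPy sketch, not code from the original notes: the bias-free two-layer ReLU network, the softmax loss, the layer sizes, the 0.01 scale, and the helper names `hidden_weight_grad` and `columns_identical` are all assumptions chosen for illustration. It compares the gradient of the first-layer weights under all-zero initialization and under small random initialization.

```python
import numpy as np

def hidden_weight_grad(W1, W2, X, y):
    """Gradient of the mean softmax loss w.r.t. W1 (bias-free two-layer ReLU net)."""
    h_pre = X @ W1                                   # (N, H) pre-activations
    h = np.maximum(0, h_pre)                         # ReLU
    scores = h @ W2                                  # (N, C) class scores
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    dscores = probs.copy()
    dscores[np.arange(len(y)), y] -= 1               # softmax cross-entropy gradient
    dscores /= len(y)
    dh = dscores @ W2.T                              # backprop into hidden layer
    dh[h_pre <= 0] = 0                               # backprop through ReLU
    return X.T @ dh                                  # (D, H): one column per hidden unit

def columns_identical(G):
    """True if every hidden unit (column) receives the same gradient."""
    return bool(np.allclose(G, G[:, :1]))

rng = np.random.default_rng(0)
D, H, C, N = 4, 5, 3, 8                              # illustrative sizes
X = rng.standard_normal((N, D))
y = rng.integers(0, C, size=N)

# All-zero initialization: every hidden unit is interchangeable.
zero_grad = hidden_weight_grad(np.zeros((D, H)), np.zeros((H, C)), X, y)

# Small random initialization: hidden units start out different.
W1_rand = 0.01 * rng.standard_normal((D, H))
W2_rand = 0.01 * rng.standard_normal((H, C))
rand_grad = hidden_weight_grad(W1_rand, W2_rand, X, y)

print("zero init, identical per-unit gradients:  ", columns_identical(zero_grad))
print("random init, identical per-unit gradients:", columns_identical(rand_grad))
```

In this sketch the zero-initialized network gives every hidden unit the same gradient (here it is exactly zero, because the gradient flowing back through `W2` vanishes), so nothing distinguishes one unit from another. With small random weights the per-unit gradients differ from the first step.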

Before we can begin to train the network we have to initialize its parameters.

1. Pitfall: all zero initialization. Let's start with what we should not do. Note that we do not know what the final value of every weight should be in the trained network, but with proper data normalization it is reasonable to assume that approximately half of the weights will be positive and half of them will be negative. A reasonable-sounding idea then might be to set all the initial weights to zero, which we expect to be the “best guess” in expectation. This turns out to be a mistake, because if every neuron in the network computes the same output, then they will also all compute the same gradie