Differences between nn.Conv1d and nn.Linear
1. The two layers compute the same result. The only difference is the expected memory layout: nn.Conv1d is channel-first, so the input must be arranged as (B, C, L), while nn.Linear expects the feature dimension last, i.e. (B, L, C).

Verification code:
```python
import torch

def count_parameters(model):
    """Count the number of parameters in a model."""
    return sum(p.numel() for p in model.parameters())

conv = torch.nn.Conv1d(8, 32, 1)
print(count_parameters(conv))
# 288

linear = torch.nn.Linear(8, 32)
print(count_parameters(linear))
# 288

print(conv.weight.shape)
# torch.Size([32, 8, 1])
print(linear.weight.shape)
# torch.Size([32, 8])

# Use the same initialization for both layers
linear.weight = torch.nn.Parameter(conv.weight.squeeze(2))
linear.bias = torch.nn.Parameter(conv.bias)

tensor = torch.randn(128, 256, 8)                       # (B, L, C) for nn.Linear
permuted_tensor = tensor.permute(0, 2, 1).contiguous()  # (B, C, L): reordered to channel-first for nn.Conv1d

out_linear = linear(tensor)
print(out_linear.mean())
# tensor(0.0067, grad_fn=<MeanBackward0>)

out_conv = conv(permuted_tensor)
print(out_conv.mean())
# tensor(0.0067, grad_fn=<MeanBackward0>)
```
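Comparing only the means is a weak check. A stricter one (a minimal sketch, not part of the original answer; the tolerance is a chosen assumption) permutes the conv output back to feature-last and compares the full tensors with torch.allclose:

```python
# out_conv is (B, C, L); permute back to (B, L, C) to line up with out_linear.
print(torch.allclose(out_linear, out_conv.permute(0, 2, 1), atol=1e-6))
# True: the outputs match up to floating-point tolerance
```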
2. Because nn.Conv1d and nn.Linear are implemented with different computation paths, their execution speed also differs:
```python
# Speed test (run each %%timeit in its own IPython cell):
%%timeit
_ = linear(tensor)
# 151 µs ± 297 ns per loop

%%timeit
_ = conv(permuted_tensor)
# 1.43 ms ± 6.33 µs per loop
```
As the timings show, nn.Linear is considerably faster on this workload.
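The same comparison can be reproduced outside IPython; here is a minimal sketch using torch.utils.benchmark, reusing the linear, conv, tensor, and permuted_tensor objects defined in the verification code above:

```python
import torch.utils.benchmark as benchmark

# Time both layers on the tensors defined earlier.
t_linear = benchmark.Timer(stmt="linear(tensor)",
                           globals={"linear": linear, "tensor": tensor})
t_conv = benchmark.Timer(stmt="conv(permuted_tensor)",
                         globals={"conv": conv, "permuted_tensor": permuted_tensor})
print(t_linear.timeit(1000))
print(t_conv.timeit(1000))
```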
3. The forward and backward passes of nn.Conv1d and nn.Linear accumulate floating-point values in different orders, so their outputs are not bit-identical; the deeper the network, the larger the accumulated numerical difference between the two.
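To illustrate, here is a minimal sketch (not from the original answer) that stacks equivalently initialized Conv1d/Linear pairs and prints the worst-case discrepancy after each layer; depending on the backend, the drift may be tiny or even zero:

```python
import torch

depth = 20
pairs = []
for _ in range(depth):
    c = torch.nn.Conv1d(8, 8, 1)
    l = torch.nn.Linear(8, 8)
    # Copy the conv weights so each pair computes the "same" function.
    l.weight = torch.nn.Parameter(c.weight.squeeze(2).clone())
    l.bias = torch.nn.Parameter(c.bias.clone())
    pairs.append((c, l))

x = torch.randn(4, 16, 8)             # (B, L, C) for nn.Linear
y = x.permute(0, 2, 1).contiguous()   # (B, C, L) for nn.Conv1d
with torch.no_grad():
    for c, l in pairs:
        x, y = l(x), c(y)
        # Worst-case absolute discrepancy after each layer.
        print((x - y.permute(0, 2, 1)).abs().max().item())
```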
All of the above comes from [Stack Overflow](https://stackoverflow.com/questions/55576314/conv1d-with-kernel-size-1-vs-linear-layer?answertab=oldest#tab-top).