Deep Learning - 43. SENET and Bilinear Interaction for Feature Crossing with Keras

BIT_666

Last modified 2023-05-09 17:22:14

Category: Deep Learning. Tags: deep learning, neural networks, SENET, BiLinear, FiBiNet

First published 2023-04-25 08:00:00

Article link: https://blog.csdn.net/BIT_666/article/details/130343130

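This article explains in detail how to implement a SENETLayer and a BilinearInteractionLayer with Keras. These two layers are the key components of the FiBiNet network and are used for feature crossing and feature enhancement. SENET dynamically updates feature importance through squeeze and excitation operations, while the BilinearInteractionLayer uses parameter matrices to model interactions between features. The code examples cover the full process from initialization and build to call.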

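I. Introduction

The previous article gave a full overview of the FiBiNet network, which introduces SENET and Bilinear Interaction to implement feature crossing. Experiments show that FiBiNet outperforms FM and FFM as a shallow model, and outperforms DeepFM and xDeepFM as a deep model. This article uses Keras to implement a basic SENET Layer and Bilinear Interaction Layer.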

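II. SENET Layer

1. Overview

SENet stands for Squeeze-and-Excitation Networks.

Workflow:

AvgPool (average pooling) => FC + σ (fully connected layer with activation) => FC + σ (fully connected layer with activation) => Multiply (reweighting)

The first activation σ is ReLU. For the second activation, some implementations use Sigmoid and others use ReLU.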
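The four steps above can be traced by hand on a tiny input. The following is a minimal plain-Python sketch, not the Keras layer itself: the function name `senet_reweight`, the weight matrices `w1` and `w2`, and the toy values are all illustrative assumptions, biases are omitted, and Sigmoid is assumed for the second activation.

```python
import math


def senet_reweight(embeddings, w1, w2):
    """Reweight F field embeddings (each of length K) with squeeze-and-excitation.

    w1 is an F x R matrix and w2 is an R x F matrix (biases omitted for brevity).
    """
    # Squeeze: average each field's embedding down to one scalar -> length F
    z = [sum(vec) / len(vec) for vec in embeddings]
    # Excitation, step 1: FC + ReLU compresses F values to R values
    hidden = [max(0.0, sum(z[f] * w1[f][r] for f in range(len(z))))
              for r in range(len(w1[0]))]
    # Excitation, step 2: FC + Sigmoid expands back to one weight per field
    weights = [1.0 / (1.0 + math.exp(-sum(hidden[r] * w2[r][f] for r in range(len(hidden)))))
               for f in range(len(w2[0]))]
    # Multiply: scale every element of field f by that field's weight
    reweighted = [[weights[f] * x for x in vec] for f, vec in enumerate(embeddings)]
    return reweighted, weights


# Toy input: F = 3 fields, K = 2, reduced size R = 1, hand-picked weights
embeddings = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
w1 = [[0.0], [0.0], [0.0]]  # zero weights make the hidden layer output 0
w2 = [[0.5, 0.5, 0.5]]
reweighted, weights = senet_reweight(embeddings, w1, w2)
print(weights)     # every field weight is sigmoid(0) = 0.5
print(reweighted)  # each embedding is scaled by 0.5
```

With trained weights, the per-field values in `weights` would differ, which is how the layer expresses that some fields matter more than others.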

2. Keras Implementation

2.1 Init Function

    def __init__(self, reduction_ratio=3, **kwargs):
        self.field_size = None
        self.embedding_size = None
        self.dense1 = None
        self.dense2 = None
        self.reduction_ratio = reduction_ratio

        super(SETNetLayer, self).__init__(**kwargs)

The init function defines the variables SENET needs: the number of fields, the embedding dimension, the two fully connected Dense layers used for Squeeze and Excitation, and the squeeze parameter reduction_ratio.

2.2 Build Function

    def build(self, input_shape):
        self.field_size, self.embedding_size = input_shape
        # Compress F fields down to F // reduction_ratio units, but never below 1
        reduction_size = max(1, self.field_size // self.reduction_ratio)

        # The string 'glorot_normal' selects the Glorot normal initializer without a private import
        self.dense1 = Dense(reduction_size, activation='relu', kernel_initializer='glorot_normal')
        self.dense2 = Dense(self.field_size, activation='sigmoid', kernel_initializer='glorot_normal')

        super(SETNetLayer, self).build(input_shape)

Here we do not call add_weight to create the parameter matrices. Instead, the two Dense layers create and hold their own weights.
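The size of the first Dense layer follows directly from the expression in build. As a small worked example (the helper name `reduction_size` is only illustrative), the clamp to 1 matters when there are fewer fields than the reduction ratio:

```python
def reduction_size(field_size, reduction_ratio=3):
    # Same expression as in build(): integer division, clamped to at least 1
    return max(1, field_size // reduction_ratio)


for f in (2, 6, 10):
    print(f, reduction_size(f))
```

For F = 2 the integer division gives 0, so the clamp keeps one hidden unit. For F = 6 and F = 10 the hidden sizes are 2 and 3.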

2.3 Call Function

    def call(self, inputs, training=None, **kwargs):
        # inputs = F x K
        # Squeeze: average over the embedding axis, then add a leading batch axis
        mean_pooling = tf.expand_dims(tf.reduce_mean(inputs, axis=-1), axis=0)  # 1 x F
        compression = self.dense1(mean_pooling)  # 1 x reduction
        reconstruction = self.dense2(compression)  # 1 x F
        # Broadcast each field weight over its embedding, then drop only the batch axis
        add_weight = tf.squeeze(tf.multiply(inputs, tf.expand_dims(reconstruction, axis=2)), axis=0)  # F x K

        return add_weight

The input has shape F x K, where F is field_size and K is embedding_dim. After reweighting, the output still has shape F x K.

2.4 Test Main Function

if __name__ == '__main__':
    # Prepare data
    F = 6  # Number of fields
    K = 8  # Embedding dimension
    samples = np.ones(shape=(F, K))
    seNetLayer = SETNetLayer()
    output = seNetLayer(samples)
    print(output)

In practice, adding SENET lets a model dynamically update the importance of each field.

2.5 Complete Code

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Layer


class SETNetLayer(Layer):

    def __init__(self, reduction_ratio=3, **kwargs):
        self.field_size = None
        self.embedding_size = None
        self.dense1 = None
        self.dense2 = None
        self.reduction_ratio = reduction_ratio

        super(SETNetLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.field_size, self.embedding_size = input_shape
        # Compress F fields down to F // reduction_ratio units, but never below 1
        reduction_size = max(1, self.field_size // self.reduction_ratio)

        # The string 'glorot_normal' selects the Glorot normal initializer without a private import
        self.dense1 = Dense(reduction_size, activation='relu', kernel_initializer='glorot_normal')
        self.dense2 = Dense(self.field_size, activation='sigmoid', kernel_initializer='glorot_normal')

        super(SETNetLayer, self).build(input_shape)

    def call(self, inputs, training=None, **kwargs):
        # inputs = F x K
        # Squeeze: average over the embedding axis, then add a leading batch axis
        mean_pooling = tf.expand_dims(tf.reduce_mean(inputs, axis=-1), axis=0)  # 1 x F
        compression = self.dense1(mean_pooling)  # 1 x reduction
        reconstruction = self.dense2(compression)  # 1 x F
        # Broadcast each field weight over its embedding, then drop only the batch axis
        add_weight = tf.squeeze(tf.multiply(inputs, tf.expand_dims(reconstruction, axis=2)), axis=0)  # F x K

        return add_weight

    def compute_output_shape(self, input_shape):
        return input_shape


if __name__ == '__main__':
    # Prepare data
    F = 6  # Number of fields
    K = 8  # Embedding dimension
    samples = np.ones(shape=(F, K))
    seNetLayer = SETNetLayer()
    output = seNetLayer(samples)
    print(output)

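III. BiLinear Interaction Layer

1. Overview

The BiLinear Interaction Layer introduces a parameter matrix to model the interaction between features i and j, replacing the plain inner product or Hadamard product. Three modes are defined:

- Field-All Type

All feature crosses share a single k x k parameter matrix W.

- Field-Each Type

Each field i has its own parameter matrix W_i ∈ R^(k x k).

- Field-Interaction Type

Each feature pair (i, j) has its own parameter matrix W_(i,j) ∈ R^(k x k).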
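To make the difference between the three modes concrete, here is a minimal plain-Python sketch. It assumes the interaction is computed as (v_i · W) followed by an element-wise product with v_j, which matches the call function in section 2.3. The helper names `vec_mat` and `bilinear_interaction`, the `pick_w` callback, and the toy matrices are hypothetical and chosen only for illustration.

```python
import itertools


def vec_mat(v, W):
    """Multiply a length-K row vector by a K x K matrix."""
    return [sum(v[a] * W[a][b] for a in range(len(v))) for b in range(len(W[0]))]


def bilinear_interaction(fields, pick_w):
    """Return (v_i . W) * v_j (element-wise) for every pair i < j.

    pick_w(i, j) chooses the matrix, which is the only thing the modes change.
    """
    result = []
    for i, j in itertools.combinations(range(len(fields)), 2):
        projected = vec_mat(fields[i], pick_w(i, j))
        result.append([p * x for p, x in zip(projected, fields[j])])
    return result


identity = [[1.0, 0.0], [0.0, 1.0]]
double = [[2.0, 0.0], [0.0, 2.0]]
fields = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # F = 3, K = 2

# Field-All: one shared matrix for every pair
p_all = bilinear_interaction(fields, lambda i, j: identity)
# Field-Each: one matrix per field, indexed by the left field i
w_each = [identity, double, identity]
p_each = bilinear_interaction(fields, lambda i, j: w_each[i])
# Field-Interaction: one matrix per pair (i, j)
w_pair = {(0, 1): identity, (0, 2): double, (1, 2): identity}
p_inter = bilinear_interaction(fields, lambda i, j: w_pair[(i, j)])

print(p_all)
print(p_each)
print(p_inter)
```

With the identity matrix the result reduces to the plain Hadamard product of the two embeddings, so a learned W can be read as a generalization of that product.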

2. Keras Implementation

2.1 Init Function

    def __init__(self, biLinear_type='all', seed=1024, **kwargs):
        self.biLinear_type = biLinear_type
        self.seed = seed
        self.field_size = None
        self.embedding_size = None
        self.W = None
        self.W_list = None

        super(BiLinearInteraction, self).__init__(**kwargs)

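biLinear_type selects the interaction mode, field_size is the number of fields, and embedding_size is the embedding dimension. The Field-All type uses a single parameter matrix W, while Field-Each and Field-Interaction use a list of matrices W_list: the former holds F matrices, the latter (F-1)·F / 2.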
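The matrix counts translate directly into parameter counts. A small sketch, where the helper name `num_bilinear_params` is an illustrative assumption and each matrix is taken to be K x K as described above:

```python
from math import comb


def num_bilinear_params(field_size, embedding_size, bilinear_type):
    """Count trainable parameters for one bilinear interaction layer."""
    per_matrix = embedding_size * embedding_size  # each W is K x K
    if bilinear_type == "all":
        num_matrices = 1
    elif bilinear_type == "each":
        num_matrices = field_size
    elif bilinear_type == "interaction":
        num_matrices = comb(field_size, 2)  # (F-1) * F / 2 pairs
    else:
        raise ValueError("unknown bilinear_type: " + str(bilinear_type))
    return num_matrices * per_matrix


for mode in ("all", "each", "interaction"):
    print(mode, num_bilinear_params(4, 8, mode))
```

For F = 4 and K = 8 this gives 64, 256 and 384 parameters. The Field-Interaction count grows quadratically with the number of fields.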

2.2 Build Function

    def build(self, input_shape):
        self.field_size, self.embedding_size = input_shape

        if self.biLinear_type == "all":
            # One K x K matrix shared by every pair
            self.W = self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                     initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                     name="biLinearWeight")
        elif self.biLinear_type == "each":
            # One K x K matrix per field
            self.W_list = [self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                           initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                           name="biLinearWeight" + str(i)) for i in range(self.field_size)]
        elif self.biLinear_type == "interaction":
            # One K x K matrix per field pair (i, j) with i < j
            self.W_list = [self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                           initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                           name="biLinearWeight" + str(i) + '_' + str(j)) for i, j in
                           itertools.combinations(range(self.field_size), 2)]
        else:
            raise NotImplementedError

        super(BiLinearInteraction, self).build(input_shape)

build reads field_size and embedding_size from input_shape, then initializes either W or W_list depending on biLinear_type. itertools.combinations generates every pairwise combination of fields.
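The order in which itertools.combinations yields pairs determines which matrix in W_list belongs to which field pair. A short standard-library illustration for F = 4, using the same naming scheme as the build function:

```python
import itertools

field_size = 4
# Pairs come out in lexicographic order with i < j
pairs = list(itertools.combinations(range(field_size), 2))
print(pairs)

# The weight names produced by the naming scheme in build()
names = ["biLinearWeight" + str(i) + "_" + str(j) for i, j in pairs]
print(names)
```

Because the call function iterates over combinations in the same order, the k-th matrix in W_list lines up with the k-th pair in this list.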

2.3 Call Function

    def call(self, inputs, **kwargs):

        n = len(inputs)
        if self.biLinear_type == "all":
            # All feature crosses share one parameter matrix W
            v_dots = [tf.tensordot(inputs[i], self.W, axes=(-1, 0)) for i in range(n)]  # F x K
            p = [tf.multiply(v_dots[i], inputs[j]) for i, j in itertools.combinations(range(n), 2)]  # (F-1)·F/2 x K
        elif self.biLinear_type == "each":
            # Each field has its own parameter matrix Wi
            v_dots = [tf.tensordot(inputs[i], self.W_list[i], axes=(-1, 0)) for i in range(n)]  # F x K
            p = [tf.multiply(v_dots[i], inputs[j]) for i, j in itertools.combinations(range(n), 2)]  # (F-1)·F/2 x K
        elif self.biLinear_type == "interaction":
            # Each feature pair Vi-Vj has its own matrix Wij
            p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1])
                 for v, w in zip(itertools.combinations(inputs, 2), self.W_list)]  # (F-1)·F/2 x K
        else:
            raise NotImplementedError

        # (F-1)·F/2 x K
        _output = tf.reshape(p, shape=(-1, int(self.embedding_size)))
        return _output

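Each mode performs an inner product with W followed by a Hadamard product. The modes differ only in which parameter matrix W is used. Unlike SENET, whose input and output shapes are identical, the BiLinear Interaction Layer takes an F x K input and produces a (F-1)·F / 2 x K output, because SENET reweights each field while the bilinear layer crosses every pair of fields.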

2.4 Test Main Function

if __name__ == '__main__':
    # Prepare data
    F = 4  # Number of fields
    K = 8  # Embedding dimension
    samples = np.ones(shape=(F, K))

    BiLinearLayer = BiLinearInteraction("interaction")
    output = BiLinearLayer(samples)
    print(output)

With F = 4 and K = 8 there are 4·3 / 2 = 6 field pairs, so the output is 6 x 8.

2.5 Complete Code

import itertools

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Layer


class BiLinearInteraction(Layer):

    def __init__(self, biLinear_type='interaction', seed=1024, **kwargs):
        self.biLinear_type = biLinear_type
        self.seed = seed
        self.field_size = None
        self.embedding_size = None
        self.W = None
        self.W_list = None

        super(BiLinearInteraction, self).__init__(**kwargs)

    def build(self, input_shape):
        self.field_size, self.embedding_size = input_shape

        if self.biLinear_type == "all":
            # One K x K matrix shared by every pair
            self.W = self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                     initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                     name="biLinearWeight")
        elif self.biLinear_type == "each":
            # One K x K matrix per field
            self.W_list = [self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                           initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                           name="biLinearWeight" + str(i)) for i in range(self.field_size)]
        elif self.biLinear_type == "interaction":
            # One K x K matrix per field pair (i, j) with i < j
            self.W_list = [self.add_weight(shape=(self.embedding_size, self.embedding_size),
                                           initializer=tf.keras.initializers.GlorotNormal(seed=self.seed),
                                           name="biLinearWeight" + str(i) + '_' + str(j)) for i, j in
                           itertools.combinations(range(self.field_size), 2)]
        else:
            raise NotImplementedError

        super(BiLinearInteraction, self).build(input_shape)

    def call(self, inputs, **kwargs):

        n = len(inputs)
        if self.biLinear_type == "all":
            # All feature crosses share one parameter matrix W
            v_dots = [tf.tensordot(inputs[i], self.W, axes=(-1, 0)) for i in range(n)]  # F x K
            p = [tf.multiply(v_dots[i], inputs[j]) for i, j in itertools.combinations(range(n), 2)]  # (F-1)·F/2 x K
        elif self.biLinear_type == "each":
            # Each field has its own parameter matrix Wi
            v_dots = [tf.tensordot(inputs[i], self.W_list[i], axes=(-1, 0)) for i in range(n)]  # F x K
            p = [tf.multiply(v_dots[i], inputs[j]) for i, j in itertools.combinations(range(n), 2)]  # (F-1)·F/2 x K
        elif self.biLinear_type == "interaction":
            # Each feature pair Vi-Vj has its own matrix Wij
            p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1])
                 for v, w in zip(itertools.combinations(inputs, 2), self.W_list)]  # (F-1)·F/2 x K
        else:
            raise NotImplementedError

        # (F-1)·F/2 x K
        _output = tf.reshape(p, shape=(-1, int(self.embedding_size)))
        return _output


if __name__ == '__main__':
    # Prepare data
    F = 4  # Number of fields
    K = 8  # Embedding dimension
    samples = np.ones(shape=(F, K))

    BiLinearLayer = BiLinearInteraction("interaction")
    output = BiLinearLayer(samples)
    print(output)

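IV. Summary

If we remove the SENET layer and the bilinear interaction layer, the shallow and deep FiBiNET degrade to FM and FNN respectively. To further improve performance, the shallow model is combined with a DNN, and the resulting deep FiBiNet outperforms deep models such as DeepFM and xDeepFM. In the FiBiNet architecture figure, the green box is the SENET Layer and the red box is the Bilinear-Interaction Layer. The remaining Combination Layer and DNN are fairly basic to build, so interested readers can implement the full FiBiNet themselves.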
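The Combination Layer is not implemented in this article. As a rough plain-Python sketch, assuming it simply flattens and concatenates two bilinear outputs (one computed from the original embeddings, one from the SENET-reweighted embeddings), it could look like the following. The function name `combination_layer` and the toy values are illustrative only.

```python
def combination_layer(p, q):
    """Flatten two lists of interaction vectors and concatenate them."""
    flat_p = [x for vec in p for x in vec]
    flat_q = [x for vec in q for x in vec]
    return flat_p + flat_q


# Toy interaction outputs: 2 pairs each, K = 2
p = [[1.0, 2.0], [3.0, 4.0]]   # from the original embeddings
q = [[0.5, 1.0], [1.5, 2.0]]   # from the SENET-reweighted embeddings
c = combination_layer(p, q)
print(len(c))
print(c)
```

The resulting flat vector would then be fed to the DNN, or summed and passed through a sigmoid in the shallow variant.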
