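Pipelined Verilog Implementation of the SM4 Algorithm (with Tests)

I. Principles of the pipelined SM4 implementation in Verilog

SM4 is a block cipher standard published by China's State Cryptography Administration. It has a 128-bit block, a 128-bit key and a 32-round nonlinear iterative structure. A pipelined Verilog implementation splits the algorithm into processing stages, each executed in parallel by dedicated hardware, which raises throughput significantly.

The key to the pipeline is unrolling the 32 rounds into 32 consecutive hardware stages. On every clock cycle the data moves from one stage to the next. The design can therefore work on many blocks at once, each at a different round, and in steady state it accepts one plaintext block and produces one ciphertext block per clock cycle.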

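This implementation has two main parts: a key expansion module and an encryption module. The key expansion module computes the 32 round keys once per master key and holds them in registers. The encryption module then applies those round keys in its 32 round stages, one key per stage. Because the key schedule is separate from the datapath, a key is expanded only once and reused for every block. Encryption has to wait until the expansion has finished, since the round-key registers are still shifting while it runs.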

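Pipeline control is carried by status signals: busy, din_valid and dout_valid coordinate the modules. A low busy means the system is idle, which is the condition to wait for after loading a key. din_valid marks a valid plaintext on the input and dout_valid marks a valid ciphertext on the output. Once the key is loaded, new blocks can be fed on consecutive cycles even though busy is high. This simple handshake keeps data moving correctly through the pipeline.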
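
The valid/busy scheme described above can be tried out on a much smaller design. The following is a minimal sketch, not part of the SM4 code: toy_pipe and toy_pipe_tb are hypothetical names, and each stage simply adds 1 to an 8-bit value. It shows a valid bit travelling alongside the data through a shift register, and busy being derived from din_valid and the valid bits.

```verilog
// Toy pipeline (hypothetical, for illustration only): each stage adds 1.
module toy_pipe #(
    parameter STAGES = 3                  // must be >= 2 in this sketch
) (
    input        clk,
    input        rst_n,
    input        din_valid,
    input  [7:0] din,
    output [7:0] dout,
    output       dout_valid,
    output       busy
);
    reg [7:0]        data [0:STAGES-1];   // one data register per stage
    reg [STAGES-1:0] valid;               // valid bit travelling with the data
    integer k;

    assign dout       = data[STAGES-1];
    assign dout_valid = valid[STAGES-1];
    assign busy       = din_valid | (|valid);

    always @(posedge clk or negedge rst_n) begin
        if (~rst_n) begin
            valid <= {STAGES{1'b0}};
            for (k = 0; k < STAGES; k = k + 1) data[k] <= 8'h00;
        end else begin
            valid   <= {valid[STAGES-2:0], din_valid};
            data[0] <= din + 8'd1;
            for (k = 1; k < STAGES; k = k + 1) data[k] <= data[k-1] + 8'd1;
        end
    end
endmodule

module toy_pipe_tb;
    reg clk = 0, rst_n = 0, din_valid = 0;
    reg  [7:0] din = 0;
    wire [7:0] dout;
    wire dout_valid, busy;

    toy_pipe dut (.clk(clk), .rst_n(rst_n), .din_valid(din_valid), .din(din),
                  .dout(dout), .dout_valid(dout_valid), .busy(busy));

    always #5 clk = ~clk;

    initial begin
        #12 rst_n = 1;
        @(negedge clk); din = 8'd10; din_valid = 1;   // first item
        @(negedge clk); din = 8'd20;                  // second item, back to back
        @(negedge clk); din_valid = 0;
        wait (busy == 1'b0);                          // pipeline drained
        #20 $finish;
    end

    always @(posedge clk)
        if (dout_valid) $display("dout = %0d", dout); // prints 13, then 23
endmodule
```

The two items enter on consecutive cycles and leave on consecutive cycles, each incremented three times. The SM4 encrypt module uses the same pattern with 33 register stages and a round function in place of the increment.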

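II. Walkthrough of the Verilog code

1. The sm4_top module

sm4_top is the top-level module. It instantiates the key expansion and encryption modules and connects them. Its interface consists of the clock, reset, master key load, plaintext input and ciphertext output. The hierarchy is deliberately simple: key handling and data encryption are kept apart.

The module reports the system state through busy, which is the OR of the busy signals of the key expansion module and the encryption module. An external controller can use it to tell whether the system is idle and to time its inputs.

Inputs and outputs are synchronous and sampled on the rising clock edge. load_mkey loads the master key, din_valid starts an encryption, and dout_valid marks valid output data.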

module sm4_top (
    input clk,
    input rst_n,
    
    input [127:0] mkey,
    input load_mkey,

    input [127:0] plaintext,
    input din_valid,

    output [127:0] ciphertext,
    output dout_valid,
    
    output busy
);
    wire [1023:0] rk_flatten;
    wire encrypt_busy;
    wire key_expand_busy;

    assign busy = encrypt_busy | key_expand_busy;

    encrypt u_en (
        .clk(clk),
        .rst_n(rst_n),
        .din_valid(din_valid),
        .plaintext(plaintext),
        .ciphertext(ciphertext),
        .dout_valid(dout_valid),
        .busy(encrypt_busy),
        .rk_flatten(rk_flatten)
    );

    key_expand u_ke (
        .clk(clk),
        .rst_n(rst_n),
        .load_mkey(load_mkey),
        .mkey(mkey),
        .rk_flatten(rk_flatten),
        .busy(key_expand_busy)
    );
endmodule

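2. The key_expand module

key_expand generates the 32 round keys, 32 bits each, from the 128-bit master key. It first XORs the master key with the system parameter FK and then iterates 32 times, producing one round key per iteration. Each iteration uses a different fixed parameter cki.

Internally, the intermediate key state K0..K3 is kept in a shift-register structure. round_counter counts the iterations, and busy stays high while the keys are being generated. Each new round key is shifted into round_keys[31], so after 32 shifts round_keys[0] holds the first round key. The round keys are passed to the encryption module on the rk_flatten bus.

The core of the key schedule is the key_expand_round submodule, which implements the nonlinear transformation of the SM4 key expansion: an S-box substitution followed by a linear transformation. It resembles the encryption round function but uses a different linear transformation (left rotations by 13 and 23 bits instead of 2, 10, 18 and 24).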

module key_expand (
    input clk,
    input rst_n,
    input load_mkey,
    input [127:0] mkey,
    output [1023:0] rk_flatten,
    output reg busy
);
    localparam FK = 128'ha3b1bac656aa3350677d9197b27022dc;

    reg [4:0] round_counter;
    reg [31:0] K0, K1, K2, K3;
    reg [31:0] round_keys [0:31];
    wire [31:0] cki;
    wire [31:0] round_dout;
    wire start_expand = load_mkey & ~busy;

    integer i;

    assign rk_flatten = {round_keys[0], round_keys[1], round_keys[2], round_keys[3],
                         round_keys[4], round_keys[5], round_keys[6], round_keys[7],
                         round_keys[8], round_keys[9], round_keys[10], round_keys[11],
                         round_keys[12], round_keys[13], round_keys[14], round_keys[15],
                         round_keys[16], round_keys[17], round_keys[18], round_keys[19],
                         round_keys[20], round_keys[21], round_keys[22], round_keys[23],
                         round_keys[24], round_keys[25], round_keys[26], round_keys[27],
                         round_keys[28], round_keys[29], round_keys[30], round_keys[31]};

    always @(posedge clk, negedge rst_n) begin
        if (~rst_n) {K0, K1, K2, K3} <= 128'h0;
        else begin
            if (start_expand) {K0, K1, K2, K3} <= mkey ^ FK;
            else if (busy) {K0, K1, K2, K3} <= {K1, K2, K3, round_dout};
        end
    end

    always @(posedge clk) begin
        if (busy) begin
            round_keys[31] <= round_dout;
            for (i = 30; i >= 0; i = i - 1) begin
                round_keys[i] <= round_keys[i+1];
            end
        end
    end

    always @(posedge clk, negedge rst_n) begin
        if (~rst_n) round_counter <= 5'd0;
        else begin
            if (busy) round_counter <= round_counter + 5'd1;
            else if (start_expand) round_counter <= 5'd0;
        end
    end

    always @(posedge clk, negedge rst_n) begin
        if (~rst_n) busy <= 1'b0;
        else begin
            if (start_expand) busy <= 1'b1;
            else if (round_counter == 5'd31) busy <= 1'b0;
        end
    end

    key_expand_cki u_ck (.round(round_counter), .cki(cki));
    key_expand_round u_round (.din({K0, K1, K2, K3}), .cki(cki), .dout(round_dout));
endmodule

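3. The encrypt module

encrypt implements the 32-round encryption pipeline. It takes plaintext and the rk_flatten round keys as inputs and outputs ciphertext. It contains 32 encrypt_round instances that together form the pipeline.

The module tracks the progress of data through the pipeline with the round_ctrl shift register. The X0_3 array holds the intermediate state of every round: X0_3[0] registers the plaintext and X0_3[1] to X0_3[32] register the round outputs. After round 32 the four words are reversed to form the ciphertext. dout_valid is taken from round_ctrl[32], so it is asserted once the valid bit has passed through all 33 register stages.

The control logic captures a new plaintext whenever din_valid is high. busy is the OR of din_valid and all bits of round_ctrl, so it is high whenever a block is entering or inside the pipeline. Since every stage has its own registers, plaintexts can be fed back to back, one per clock cycle, which maximizes throughput.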

module encrypt (
    input clk,
    input rst_n,
    input din_valid,
    input [127:0] plaintext,
    output [127:0] ciphertext,
    output dout_valid,
    output busy,

    input [1023:0] rk_flatten
);

    reg [32:0] round_ctrl;
    reg [127:0] X0_3 [0:32];
    wire [31:0] round_keys [0:31];
    wire [127:0] round_out [0:31];

    integer i;
    genvar j;

    assign dout_valid = round_ctrl[32];
    assign busy = din_valid | (|round_ctrl);
    assign ciphertext = {X0_3[32][31:0], X0_3[32][63:32], X0_3[32][95:64], X0_3[32][127:96]};

    assign {
        round_keys[0], round_keys[1], round_keys[2], round_keys[3],
        round_keys[4], round_keys[5], round_keys[6], round_keys[7],
        round_keys[8], round_keys[9], round_keys[10], round_keys[11],
        round_keys[12], round_keys[13], round_keys[14], round_keys[15],
        round_keys[16], round_keys[17], round_keys[18], round_keys[19],
        round_keys[20], round_keys[21], round_keys[22], round_keys[23],
        round_keys[24], round_keys[25], round_keys[26], round_keys[27],
        round_keys[28], round_keys[29], round_keys[30], round_keys[31]
    } = rk_flatten;


    always @(posedge clk, negedge rst_n) begin
        if (~rst_n) begin
            for (i = 0; i < 33; i = i + 1) begin
                X0_3[i] <= 128'h0;
            end
        end else begin
            if (din_valid) begin
                X0_3[0] <= plaintext;
            end
            for (i = 1; i < 33; i = i + 1) begin
                X0_3[i] <= round_out[i-1];
            end
        end
    end

    
    always @(posedge clk, negedge rst_n) begin
        if (~rst_n) begin
            round_ctrl <= 33'd0;
        end else begin
            // Shift the valid bit along; the 33-bit result matches the register width exactly.
            round_ctrl <= {round_ctrl[31:0], din_valid};
        end
    end

    generate
        for (j = 0; j < 32; j = j + 1) begin : enr_instances
            encrypt_round u_er (
                .din (X0_3[j]),
                .rki (round_keys[j]),
                .dout(round_out[j])
            );
        end
    endgenerate

endmodule
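
Both key_expand and encrypt spell out the 32-word concatenation for rk_flatten by hand. As a sketch of an equivalent, more compact form, the same unpacking can be written with a generate loop and an indexed part-select. The module name rk_unpack_demo and the test pattern are hypothetical. The bit ordering (round_keys[0] in the top 32 bits) follows the concatenations in the listings.

```verilog
// Hypothetical demo: unpack a 1024-bit bus into 32 words, word 0 in the top bits.
module rk_unpack_demo;
    reg  [1023:0] rk_flatten;
    wire [31:0]   round_keys [0:31];

    genvar g;
    generate
        for (g = 0; g < 32; g = g + 1) begin : unpack
            // bits [1023-32g : 992-32g] hold word g
            assign round_keys[g] = rk_flatten[1023 - 32*g -: 32];
        end
    endgenerate

    integer k;
    initial begin
        // Fill word k with the value k so that the ordering is easy to see.
        for (k = 0; k < 32; k = k + 1)
            rk_flatten[1023 - 32*k -: 32] = k;
        #1;                                   // let the continuous assignments settle
        $display("round_keys[0]  = %h", round_keys[0]);    // 00000000
        $display("round_keys[31] = %h", round_keys[31]);   // 0000001f
        $finish;
    end
endmodule
```

The explicit concatenation in the original modules is just as valid. The loop form mainly avoids typing errors in the index list.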

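4. The encrypt_round module

encrypt_round implements the SM4 round function. It takes 128 bits of input data and a 32-bit round key and outputs the 128-bit result. The core operations are 32-bit XOR, S-box substitution and the L transformation. The last three input words are XORed with the round key, and the four bytes of the result pass through four parallel S-boxes. The substituted word goes through the L transformation (XOR of the word with its left rotations by 2, 10, 18 and 24 bits) and is then XORed with the first input word to form the new word. The output is the other three words followed by that new word.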

module encrypt_round (
    input  [127:0] din,
    input  [31:0] rki,
    output [127:0] dout
);
    wire [31:0] word_0, word_1, word_2, word_3;
    wire [31:0] transform_din;
    wire [31:0] transform_dout;
    wire [7:0] sbox_bin0, sbox_bin1, sbox_bin2, sbox_bin3;
    wire [7:0] sbox_bout0, sbox_bout1, sbox_bout2, sbox_bout3;
    wire [31:0] sbox_wout = {sbox_bout0, sbox_bout1, sbox_bout2, sbox_bout3};

    assign {word_0, word_1, word_2, word_3} = din;
    assign transform_din = word_1 ^ word_2 ^ word_3 ^ rki;
    assign {sbox_bin0, sbox_bin1, sbox_bin2, sbox_bin3} = transform_din;
    assign transform_dout = ((sbox_wout ^ {sbox_wout[29:0], sbox_wout[31:30]}) ^ ({sbox_wout[21:0], sbox_wout[31:22]}
                            ^ {sbox_wout[13:0], sbox_wout[31:14]})) ^ {sbox_wout[7:0], sbox_wout[31:8]};
    assign dout = {word_1, word_2, word_3, transform_dout ^ word_0};

    sm4_sbox sm4_sbox0 (.s_in (sbox_bin0), .s_out(sbox_bout0));
    sm4_sbox sm4_sbox1 (.s_in (sbox_bin1), .s_out(sbox_bout1));
    sm4_sbox sm4_sbox2 (.s_in (sbox_bin2), .s_out(sbox_bout2));
    sm4_sbox sm4_sbox3 (.s_in (sbox_bin3), .s_out(sbox_bout3));

endmodule
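
The concatenations in the L transformation are easy to get wrong by one bit. The following self-contained sketch compares them against a generic rotate-left function on random inputs, for both the encryption transform L and the key-expansion transform L'. The names l_transform_tb and rotl32 are hypothetical helpers introduced for this check.

```verilog
// Hypothetical self-check: concatenation-style rotations vs. a generic rotate.
module l_transform_tb;
    // Rotate a 32-bit word left by n bits (0 < n < 32).
    function automatic [31:0] rotl32;
        input [31:0] x;
        input integer n;
        begin
            rotl32 = (x << n) | (x >> (32 - n));
        end
    endfunction

    reg [31:0] b, l_concat, l_rot, lk_concat, lk_rot;
    integer t, errors;

    initial begin
        errors = 0;
        for (t = 0; t < 1000; t = t + 1) begin
            b = $random;
            // Encryption L transform, written as in encrypt_round
            l_concat  = b ^ {b[29:0], b[31:30]} ^ {b[21:0], b[31:22]}
                          ^ {b[13:0], b[31:14]} ^ {b[7:0],  b[31:8]};
            l_rot     = b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24);
            // Key-expansion L' transform, written as in key_expand_round
            lk_concat = b ^ {b[18:0], b[31:19]} ^ {b[8:0], b[31:9]};
            lk_rot    = b ^ rotl32(b, 13) ^ rotl32(b, 23);
            if (l_concat !== l_rot || lk_concat !== lk_rot)
                errors = errors + 1;
        end
        if (errors == 0) $display("PASS: concatenations match the rotations");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```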

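5. Other modules

key_expand_cki supplies the 32 round constants cki needed by the key expansion, implemented as a lookup table. key_expand_round implements the round function of the key expansion. Its structure matches encrypt_round, but the linear transformation differs and only the new 32-bit word is output. sm4_sbox implements the SM4 S-box as a 256-byte lookup table.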

module key_expand_cki(
   input [4:0] round,
	output reg [31:0] cki
);

// Combinational lookup: blocking assignments, since this block is not clocked.
always @(*)
    case (round)
    5'h00: cki = 32'h00070e15;
    5'h01: cki = 32'h1c232a31;
    5'h02: cki = 32'h383f464d;
    5'h03: cki = 32'h545b6269;
    5'h04: cki = 32'h70777e85;
    5'h05: cki = 32'h8c939aa1;
    5'h06: cki = 32'ha8afb6bd;
    5'h07: cki = 32'hc4cbd2d9;
    5'h08: cki = 32'he0e7eef5;
    5'h09: cki = 32'hfc030a11;
    5'h0a: cki = 32'h181f262d;
    5'h0b: cki = 32'h343b4249;
    5'h0c: cki = 32'h50575e65;
    5'h0d: cki = 32'h6c737a81;
    5'h0e: cki = 32'h888f969d;
    5'h0f: cki = 32'ha4abb2b9;
    5'h10: cki = 32'hc0c7ced5;
    5'h11: cki = 32'hdce3eaf1;
    5'h12: cki = 32'hf8ff060d;
    5'h13: cki = 32'h141b2229;
    5'h14: cki = 32'h30373e45;
    5'h15: cki = 32'h4c535a61;
    5'h16: cki = 32'h686f767d;
    5'h17: cki = 32'h848b9299;
    5'h18: cki = 32'ha0a7aeb5;
    5'h19: cki = 32'hbcc3cad1;
    5'h1a: cki = 32'hd8dfe6ed;
    5'h1b: cki = 32'hf4fb0209;
    5'h1c: cki = 32'h10171e25;
    5'h1d: cki = 32'h2c333a41;
    5'h1e: cki = 32'h484f565d;
    5'h1f: cki = 32'h646b7279;
    default: cki = 32'h0;
    endcase

endmodule
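
The constants in the table above follow a simple rule: byte j of CK_i (with j = 0 as the most significant byte) equals (4*i + j) * 7 mod 256. As a self-contained sketch, the function below regenerates them so that the table can be cross-checked by eye or by script. The names cki_formula_tb and ck_formula are hypothetical.

```verilog
// Hypothetical cross-check of the CK table against its generating formula.
module cki_formula_tb;
    function automatic [31:0] ck_formula;
        input [4:0] idx;
        integer j;
        reg [7:0]  b;
        reg [31:0] acc;
        begin
            acc = 32'h0;
            for (j = 0; j < 4; j = j + 1) begin
                b   = ((4 * idx + j) * 7) % 256;   // byte j of CK_idx
                acc = {acc[23:0], b};              // append, most significant byte first
            end
            ck_formula = acc;
        end
    endfunction

    integer r;
    initial begin
        for (r = 0; r < 32; r = r + 1)
            $display("CK[%0d] = %h", r, ck_formula(r));
        // Spot checks against the first and last table entries
        if (ck_formula(5'd0) === 32'h00070e15 && ck_formula(5'd31) === 32'h646b7279)
            $display("PASS");
        else
            $display("FAIL");
        $finish;
    end
endmodule
```
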
module key_expand_round (
    input  [127:0] din,
    input  [ 31:0] cki,
    output [ 31:0] dout
);

    wire [31:0] word_0, word_1, word_2, word_3;
    wire [31:0] transform_din;
    wire [31:0] transform_dout;
    wire [7:0] sbox_bin0, sbox_bin1, sbox_bin2, sbox_bin3;
    wire [7:0] sbox_bout0, sbox_bout1, sbox_bout2, sbox_bout3;
    wire [31:0] sbox_wout = {sbox_bout0, sbox_bout1, sbox_bout2, sbox_bout3};

    assign {word_0, word_1, word_2, word_3} = din;
    assign transform_din = word_1 ^ word_2 ^ word_3 ^ cki;
    assign {sbox_bin0, sbox_bin1, sbox_bin2, sbox_bin3} = transform_din;
    assign transform_dout = (sbox_wout^{sbox_wout[18:0],sbox_wout[31:19]})^{sbox_wout[8:0],sbox_wout[31:9]};
    assign dout = transform_dout ^ word_0;

    sm4_sbox sm4_sbox0 (.s_in (sbox_bin0), .s_out(sbox_bout0));
    sm4_sbox sm4_sbox1 (.s_in (sbox_bin1), .s_out(sbox_bout1));
    sm4_sbox sm4_sbox2 (.s_in (sbox_bin2), .s_out(sbox_bout2));
    sm4_sbox sm4_sbox3 (.s_in (sbox_bin3), .s_out(sbox_bout3));

endmodule

module sm4_sbox(
    input [7:0] s_in,
    output [7:0] s_out
);
reg [7:0] sbox[0:255];
initial
begin
	sbox[000]=8'hd6; sbox[001]=8'h90; sbox[002]=8'he9; sbox[003]=8'hfe; sbox[004]=8'hcc; sbox[005]=8'he1; sbox[006]=8'h3d; sbox[007]=8'hb7;
	sbox[008]=8'h16; sbox[009]=8'hb6; sbox[010]=8'h14; sbox[011]=8'hc2; sbox[012]=8'h28; sbox[013]=8'hfb; sbox[014]=8'h2c; sbox[015]=8'h05;
	sbox[016]=8'h2b; sbox[017]=8'h67; sbox[018]=8'h9a; sbox[019]=8'h76; sbox[020]=8'h2a; sbox[021]=8'hbe; sbox[022]=8'h04; sbox[023]=8'hc3;
	sbox[024]=8'haa; sbox[025]=8'h44; sbox[026]=8'h13; sbox[027]=8'h26; sbox[028]=8'h49; sbox[029]=8'h86; sbox[030]=8'h06; sbox[031]=8'h99;
	sbox[032]=8'h9c; sbox[033]=8'h42; sbox[034]=8'h50; sbox[035]=8'hf4; sbox[036]=8'h91; sbox[037]=8'hef; sbox[038]=8'h98; sbox[039]=8'h7a;
	sbox[040]=8'h33; sbox[041]=8'h54; sbox[042]=8'h0b; sbox[043]=8'h43; sbox[044]=8'hed; sbox[045]=8'hcf; sbox[046]=8'hac; sbox[047]=8'h62;
	sbox[048]=8'he4; sbox[049]=8'hb3; sbox[050]=8'h1c; sbox[051]=8'ha9; sbox[052]=8'hc9; sbox[053]=8'h08; sbox[054]=8'he8; sbox[055]=8'h95;
	sbox[056]=8'h80; sbox[057]=8'hdf; sbox[058]=8'h94; sbox[059]=8'hfa; sbox[060]=8'h75; sbox[061]=8'h8f; sbox[062]=8'h3f; sbox[063]=8'ha6;
	sbox[064]=8'h47; sbox[065]=8'h07; sbox[066]=8'ha7; sbox[067]=8'hfc; sbox[068]=8'hf3; sbox[069]=8'h73; sbox[070]=8'h17; sbox[071]=8'hba;
	sbox[072]=8'h83; sbox[073]=8'h59; sbox[074]=8'h3c; sbox[075]=8'h19; sbox[076]=8'he6; sbox[077]=8'h85; sbox[078]=8'h4f; sbox[079]=8'ha8;
	sbox[080]=8'h68; sbox[081]=8'h6b; sbox[082]=8'h81; sbox[083]=8'hb2; sbox[084]=8'h71; sbox[085]=8'h64; sbox[086]=8'hda; sbox[087]=8'h8b;
	sbox[088]=8'hf8; sbox[089]=8'heb; sbox[090]=8'h0f; sbox[091]=8'h4b; sbox[092]=8'h70; sbox[093]=8'h56; sbox[094]=8'h9d; sbox[095]=8'h35;
	sbox[096]=8'h1e; sbox[097]=8'h24; sbox[098]=8'h0e; sbox[099]=8'h5e; sbox[100]=8'h63; sbox[101]=8'h58; sbox[102]=8'hd1; sbox[103]=8'ha2;
	sbox[104]=8'h25; sbox[105]=8'h22; sbox[106]=8'h7c; sbox[107]=8'h3b; sbox[108]=8'h01; sbox[109]=8'h21; sbox[110]=8'h78; sbox[111]=8'h87;
	sbox[112]=8'hd4; sbox[113]=8'h00; sbox[114]=8'h46; sbox[115]=8'h57; sbox[116]=8'h9f; sbox[117]=8'hd3; sbox[118]=8'h27; sbox[119]=8'h52;
	sbox[120]=8'h4c; sbox[121]=8'h36; sbox[122]=8'h02; sbox[123]=8'he7; sbox[124]=8'ha0; sbox[125]=8'hc4; sbox[126]=8'hc8; sbox[127]=8'h9e;
	sbox[128]=8'hea; sbox[129]=8'hbf; sbox[130]=8'h8a; sbox[131]=8'hd2; sbox[132]=8'h40; sbox[133]=8'hc7; sbox[134]=8'h38; sbox[135]=8'hb5;
	sbox[136]=8'ha3; sbox[137]=8'hf7; sbox[138]=8'hf2; sbox[139]=8'hce; sbox[140]=8'hf9; sbox[141]=8'h61; sbox[142]=8'h15; sbox[143]=8'ha1;
	sbox[144]=8'he0; sbox[145]=8'hae; sbox[146]=8'h5d; sbox[147]=8'ha4; sbox[148]=8'h9b; sbox[149]=8'h34; sbox[150]=8'h1a; sbox[151]=8'h55;
	sbox[152]=8'had; sbox[153]=8'h93; sbox[154]=8'h32; sbox[155]=8'h30; sbox[156]=8'hf5; sbox[157]=8'h8c; sbox[158]=8'hb1; sbox[159]=8'he3;
	sbox[160]=8'h1d; sbox[161]=8'hf6; sbox[162]=8'he2; sbox[163]=8'h2e; sbox[164]=8'h82; sbox[165]=8'h66; sbox[166]=8'hca; sbox[167]=8'h60;
	sbox[168]=8'hc0; sbox[169]=8'h29; sbox[170]=8'h23; sbox[171]=8'hab; sbox[172]=8'h0d; sbox[173]=8'h53; sbox[174]=8'h4e; sbox[175]=8'h6f;
	sbox[176]=8'hd5; sbox[177]=8'hdb; sbox[178]=8'h37; sbox[179]=8'h45; sbox[180]=8'hde; sbox[181]=8'hfd; sbox[182]=8'h8e; sbox[183]=8'h2f;
	sbox[184]=8'h03; sbox[185]=8'hff; sbox[186]=8'h6a; sbox[187]=8'h72; sbox[188]=8'h6d; sbox[189]=8'h6c; sbox[190]=8'h5b; sbox[191]=8'h51;
	sbox[192]=8'h8d; sbox[193]=8'h1b; sbox[194]=8'haf; sbox[195]=8'h92; sbox[196]=8'hbb; sbox[197]=8'hdd; sbox[198]=8'hbc; sbox[199]=8'h7f;
	sbox[200]=8'h11; sbox[201]=8'hd9; sbox[202]=8'h5c; sbox[203]=8'h41; sbox[204]=8'h1f; sbox[205]=8'h10; sbox[206]=8'h5a; sbox[207]=8'hd8;
	sbox[208]=8'h0a; sbox[209]=8'hc1; sbox[210]=8'h31; sbox[211]=8'h88; sbox[212]=8'ha5; sbox[213]=8'hcd; sbox[214]=8'h7b; sbox[215]=8'hbd;
	sbox[216]=8'h2d; sbox[217]=8'h74; sbox[218]=8'hd0; sbox[219]=8'h12; sbox[220]=8'hb8; sbox[221]=8'he5; sbox[222]=8'hb4; sbox[223]=8'hb0;
	sbox[224]=8'h89; sbox[225]=8'h69; sbox[226]=8'h97; sbox[227]=8'h4a; sbox[228]=8'h0c; sbox[229]=8'h96; sbox[230]=8'h77; sbox[231]=8'h7e;
	sbox[232]=8'h65; sbox[233]=8'hb9; sbox[234]=8'hf1; sbox[235]=8'h09; sbox[236]=8'hc5; sbox[237]=8'h6e; sbox[238]=8'hc6; sbox[239]=8'h84;
	sbox[240]=8'h18; sbox[241]=8'hf0; sbox[242]=8'h7d; sbox[243]=8'hec; sbox[244]=8'h3a; sbox[245]=8'hdc; sbox[246]=8'h4d; sbox[247]=8'h20;
	sbox[248]=8'h79; sbox[249]=8'hee; sbox[250]=8'h5f; sbox[251]=8'h3e; sbox[252]=8'hd7; sbox[253]=8'hcb; sbox[254]=8'h39; sbox[255]=8'h48;
end
assign s_out=sbox[s_in];
endmodule

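III. Experimental results

iverilog was used for quick functional verification with 10 plaintext/ciphertext pairs. The Makefile, the testbench and the test results follow. All test cases pass: the actual outputs match the expected ciphertexts exactly. The testbench compares the results automatically and prints a pass/fail message for each block. A VCD waveform file is also generated for later analysis.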

IVERILOG := iverilog
VVP := vvp
GTKWAVE := gtkwave

SRC := $(wildcard ./*.v)
VCD := sm4_top.vcd
TARGET := sim

IVERILOG_FLAGS := -g2012 -Wall -Wno-timescale

all: compile run wave

compile: $(SRC)
	@echo "[IVERILOG] Compiling sources..."
	@$(IVERILOG) $(IVERILOG_FLAGS) -o $(TARGET) $(SRC) || (echo "Compilation failed"; exit 1)

run: compile
	@echo "[VVP] Running simulation..."
	@$(VVP) $(TARGET)
	@echo "Simulation completed."

wave:
	@echo "[GTKWAVE] Opening waveforms..."
	@$(GTKWAVE) $(VCD) --autosavename &

.PHONY: all compile run wave

`timescale 1ns/1ps

module sm4_top_tb;
    reg clk=0;
    reg rst_n=0;
    
    reg [127:0] mkey=0;
    reg load_mkey=0;

    reg [127:0] plaintext=0;
    reg din_valid=0;

    wire [127:0] ciphertext;
    wire dout_valid;
    wire busy;

    sm4_top uut (
        .clk(clk),
        .rst_n(rst_n),
        .mkey(mkey),
        .load_mkey(load_mkey),
        .plaintext(plaintext),
        .din_valid(din_valid),
        .ciphertext(ciphertext),
        .dout_valid(dout_valid),
        .busy(busy)
    );

    always #5 clk = ~clk;

    reg [127:0] test_plaintexts [0:9];
    reg [127:0] expected_ciphertexts [0:9];

    initial begin
        test_plaintexts[0]=128'h0123456789abcdeffedcba9876543210;
        test_plaintexts[1]=128'h19dfd145a155ba9582618728cec3129b;
        test_plaintexts[2]=128'h5ea6ab0e8c952e165b5cb8770cc68454;
        test_plaintexts[3]=128'h217da38edffa0a313bae2de200c2f0a4;
        test_plaintexts[4]=128'h9b90f75138905a2455536f8e8c7c48bb;
        test_plaintexts[5]=128'h2b393de18384c3908814a72bd9082802;
        test_plaintexts[6]=128'h20b68d21653ae1e63e1f4186a8b38971;
        test_plaintexts[7]=128'h50bbcc6daca27a2beaeed62752fefcab;
        test_plaintexts[8]=128'hba030f96f7d880675c0888e2c286aa07;
        test_plaintexts[9]=128'h744312ac78eab65501985ef67532d86b;

        expected_ciphertexts[0]=128'h681edf34d206965e86b3e94f536e4246;
        expected_ciphertexts[1]=128'h4f4bb97495eda50ee3d4773f8a70961b;
        expected_ciphertexts[2]=128'h0c18de048cf8ad1a136b32426539fbd8;
        expected_ciphertexts[3]=128'hba6b80da7ab003b8ec1a65b6e44e50aa;
        expected_ciphertexts[4]=128'hf606e5dacd97c4bb6cdb5c51a210a4e2;
        expected_ciphertexts[5]=128'h86637413cc9695b38d7fddd2c8b3682b;
        expected_ciphertexts[6]=128'hbfbf1d47b7956bc2564d79b59d08cdbc;
        expected_ciphertexts[7]=128'hb86d60aa7ad3047aee3a75348e011e49;
        expected_ciphertexts[8]=128'h7f75667ecf8f1079337d70643c0e74a5;
        expected_ciphertexts[9]=128'h0e3a9246a1b0ad477b5d0c33ff72ca40;
    end

    integer i = 0;

    initial begin
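        #15 rst_n = 1;
        // Drive load_mkey on falling edges: setting it at t=15 would coincide
        // with a rising clock edge and race with the DUT's sampling.
        @(negedge clk);
        mkey = 128'h0123456789abcdeffedcba9876543210;
        load_mkey = 1;
        @(negedge clk);
        load_mkey = 0;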

        wait(busy == 0);
        @(negedge clk);
        #20 plaintext=test_plaintexts[0]; din_valid = 1;
        #10 din_valid = 0;

        #20 plaintext = test_plaintexts[1]; din_valid = 1;
        #10 plaintext = test_plaintexts[2];
        #10 plaintext = test_plaintexts[3];
        #10 plaintext = test_plaintexts[4];
        #10 din_valid = 0;

        #30 plaintext = test_plaintexts[5]; din_valid = 1;
        #10 plaintext = test_plaintexts[6];
        #10 plaintext = test_plaintexts[7];
        #10 plaintext = test_plaintexts[8];
        #10 plaintext = test_plaintexts[9];
        #10 din_valid = 0;

        wait(busy == 0);
        #100 $finish;
    end

    always @(posedge clk) begin
        if (dout_valid) begin
            if (ciphertext === expected_ciphertexts[i]) begin
                $display("Test %0d: Passed, Expected %h, Actual %h", i, expected_ciphertexts[i], ciphertext);
            end else begin
                $display("Test %0d: Failed, Expected %h, Actual %h", i, expected_ciphertexts[i], ciphertext);
            end
            i = i + 1;
        end
    end

    initial begin
        $dumpfile("sm4_top.vcd");
        $dumpvars(0, sm4_top_tb);
    end
endmodule

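The simulation waveforms were inspected in gtkwave and ModelSim, where the operation of the pipeline is clearly visible. When din_valid is high, the plaintext enters the pipeline. It passes through 33 register stages (the input register plus 32 rounds), so dout_valid goes high and the ciphertext can be sampled 33 clock edges after the edge that captured the plaintext. busy reflects the system state accurately, and the control signals behave as expected as long as key expansion and encryption do not overlap.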
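
The pipeline latency can also be measured in simulation instead of being read off the waveform. The sketch below assumes it is compiled together with the design files of this article but without sm4_top_tb. The name sm4_latency_tb and the edge counter are hypothetical additions. It feeds a single block and counts the rising clock edges between sampling din_valid and sampling dout_valid.

```verilog
`timescale 1ns/1ps

// Hypothetical latency measurement for sm4_top (not part of the original post).
module sm4_latency_tb;
    reg clk = 0, rst_n = 0, load_mkey = 0, din_valid = 0;
    reg  [127:0] mkey = 0, plaintext = 0;
    wire [127:0] ciphertext;
    wire dout_valid, busy;

    integer edges    = 0;   // rising edges since din_valid was sampled
    reg     counting = 0;

    sm4_top uut (.clk(clk), .rst_n(rst_n), .mkey(mkey), .load_mkey(load_mkey),
                 .plaintext(plaintext), .din_valid(din_valid),
                 .ciphertext(ciphertext), .dout_valid(dout_valid), .busy(busy));

    always #5 clk = ~clk;

    initial begin
        #12 rst_n = 1;
        @(negedge clk); mkey = 128'h0123456789abcdeffedcba9876543210; load_mkey = 1;
        @(negedge clk); load_mkey = 0;
        wait (busy == 1'b0);                  // key expansion finished
        @(negedge clk); plaintext = 128'h0123456789abcdeffedcba9876543210; din_valid = 1;
        @(negedge clk); din_valid = 0;
        wait (busy == 1'b0);                  // pipeline drained
        #20 $finish;
    end

    always @(posedge clk) begin
        if (counting && dout_valid) begin
            $display("ciphertext %h sampled %0d clock edges after din_valid",
                     ciphertext, edges);
            counting = 0;
        end
        if (counting)  edges = edges + 1;
        if (din_valid) begin counting = 1; edges = 1; end
    end
endmodule
```

With the encrypt module as listed, the reported distance should be 33 edges, which matches the 33 registers X0_3[0] to X0_3[32] on the datapath.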

 

The design was synthesized with Vivado for the XC7A35T-1CSG324C, with the following results:

 

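IV. Summary

This article presented a pipelined Verilog implementation of the SM4 block cipher. SM4 is a Chinese national standard cipher with a 32-round nonlinear iterative structure, and this design reaches high performance by fully unrolling the rounds into a pipeline. The system consists of a key expansion module, which computes the 32 round keys in advance, and an encryption module, which processes data through a 32-stage pipeline. The Verilog code is organized hierarchically into top-level control, key expansion, the encryption round function, the S-box and other submodules, with status signals coordinating the pipeline. Simulation shows that the design is functionally correct: all 10 test blocks, including the standard test vector, produce the expected ciphertext.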
