Part 1: Gate-Level Combinational Circuit

Objective

This tutorial covers the basics of digital circuit design with Verilog and FPGAs, and walks through an example of a basic circuit built from logic gates.

Source Code

This repository contains all of the code required in order to follow this tutorial.

1. Verilog

Verilog is a hardware description language (HDL) used to model electronic systems. It is most commonly used in the design and verification of digital circuits.

1.1. Basic Lexical Elements

An identifier gives a unique name to an object. It is composed of letters, digits, the underscore character (_), and the dollar sign ($). $ is usually used with a system task or function. The first character of an identifier must be a letter or underscore. Verilog is a case-sensitive language.

A comment is just for documentation purposes and will be ignored by software. Verilog has two forms of comments. A one-line comment starts with //, as in

// This is a comment

A multiple-line comment is encapsulated between /* and */, as in

/* This is comment line 1
   This is comment line 2
   This is comment line 3 */

1.2. Four-Value System

There are four basic values used in most data types:

  • 0: for "logic 0", or a false condition

  • 1: for "logic 1", or a true condition

  • z: for the high-impedance state

  • x: for an unknown value

The z value corresponds to the output of a tri-state buffer that is not driving its line. The x value is used mainly in modeling and simulation. It represents a value that cannot be resolved to 0, 1, or z, such as an uninitialized signal or two outputs driving the same line with conflicting values.
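The z and x values can be observed in simulation. The following is a minimal sketch, not part of the tutorial's repository: the module names tri_buf and four_value_tb are hypothetical. Two tri-state buffers share one line, so the line shows z when neither drives it and x when both drive opposite values.

```verilog
`timescale 1ns / 1ps

// Tri-state buffer: drives y with a when en is 1, otherwise releases the line (z)
module tri_buf
    (
        input wire  a,
        input wire  en,
        output wire y
    );

    assign y = en ? a : 1'bz;

endmodule

module four_value_tb();
    reg a0, en0, a1, en1;
    wire bus; // Shared line driven by two tri-state buffers
    reg r;    // Never assigned, so it stays at x

    tri_buf tri_buf_0 (.a(a0), .en(en0), .y(bus));
    tri_buf tri_buf_1 (.a(a1), .en(en1), .y(bus));

    initial
    begin
        a0 = 0; a1 = 1; en0 = 0; en1 = 0; #10;
        $display("no driver:     bus = %b", bus); // z
        en0 = 1; #10;
        $display("one driver:    bus = %b", bus); // 0
        en1 = 1; #10;
        $display("two drivers:   bus = %b", bus); // x, because 0 and 1 conflict
        $display("uninitialized: r   = %b", r);   // x
        $finish;
    end

endmodule
```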

1.3. Data Type Groups

Verilog has two main groups of data types: net and variable.

In the net group, the data types represent the physical connections between hardware components. The most commonly used data type in this group is wire. As the name indicates, it represents a connecting wire.

Here are several examples of wire declarations.

wire d0, d1;             // Two 1-bit signals
wire [7:0] data0, data1; // Two 8-bit signals
wire [31:0] addr;        // 32-bit address

In the variable group, the data types represent abstract storage in behavioral modeling. There are five data types in this group: reg, integer, real, time, and realtime. The most commonly used data type in this group is reg.

The simple difference between wire and reg is that a wire is driven by a continuous assignment or a module output and is used only in a combinational circuit, while a reg is assigned inside a procedural block and can be used in either a combinational or a sequential circuit. A wire cannot store a value, while a reg can.
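The difference can be illustrated with a small sketch. The module name wire_reg_demo and its port names are hypothetical and chosen only for this example. The same AND function is described three times: with a wire, with a reg in a combinational block, and with a reg in a clocked block.

```verilog
module wire_reg_demo
    (
        input wire  clk,
        input wire  a,
        input wire  b,
        output wire y_wire,
        output reg  y_comb,
        output reg  y_seq
    );

    // wire: driven by a continuous assignment
    assign y_wire = a & b;

    // reg in a combinational always block: no storage element is inferred
    always @*
        y_comb = a & b;

    // reg in a clocked always block: a flip-flop is inferred
    always @(posedge clk)
        y_seq <= a & b;

endmodule
```

Here y_wire and y_comb behave identically, while y_seq only changes on a rising clock edge and holds its value in between.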

1.4. Number Representation

An integer constant in Verilog can be represented in various formats. Its general form is

[sign][size]'[base][value]

The [size] term specifies the number of bits in a number. The [base] term specifies the base of the number, which can be the following:

  • b or B: binary

  • o or O: octal

  • d or D: decimal

  • h or H: hexadecimal

Number       Stored Value   Comment
5'b11010     11010
5'b1_1010    11010          _ ignored
5'o32        11010
5'd26        11010
5'h1a        11010
5'b0         00000          0 extended
5'b1         00001          0 extended
5'bz         zzzzz          z extended
5'bx         xxxxx          x extended
5'bx01       xxx01          x extended
-5'b00001    11111          2's complement of 00001
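One way to check these representations is to assign each constant to a 5-bit variable and print it in binary. This is a minimal simulation-only sketch; the module name number_tb is hypothetical. Each line should print the stored value listed for that number above.

```verilog
module number_tb();
    reg [4:0] v; // 5-bit variable that receives each constant

    initial
    begin
        v = 5'b1_1010; $display("5'b1_1010 -> %b", v);
        v = 5'o32;     $display("5'o32     -> %b", v);
        v = 5'd26;     $display("5'd26     -> %b", v);
        v = 5'h1a;     $display("5'h1a     -> %b", v);
        v = 5'b1;      $display("5'b1      -> %b", v);
        v = 5'bz;      $display("5'bz      -> %b", v);
        v = 5'bx01;    $display("5'bx01    -> %b", v);
        v = -5'b00001; $display("-5'b00001 -> %b", v);
        $finish;
    end

endmodule
```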

1.5. Program Skeleton

When we develop or examine Verilog code, it is much easier to comprehend if we think in terms of "drawing a circuit" rather than writing a "sequential algorithm."

The skeleton consists of three portions: I/O port declaration, signal declaration, and module body.

1.5.1. Port Declaration

Let's consider this example code of a half-adder circuit.

half_adder.v
module half_adder // Port declaration
    (
        input wire  a,
        input wire  b,
        output wire sum,
        output wire carry    
    );
    
    // Module body
    assign sum = a ^ b; // Continuous assignment
    assign carry = a & b;
    
endmodule

The port declaration in the previous code uses the following syntax:

module [module_name]
    (
        [mode] [data_type] [port_name],
        [mode] [data_type] [port_name],
        ...
        [mode] [data_type] [port_name]
    );

The [mode] term can be input, output, or inout, which represent the input, output, or bidirectional port, respectively. Note that there is no comma in the last declaration. The [data_type] term can be omitted if it is wire.

1.5.2. Module Body

Unlike a program in the C language, in which the statements are executed sequentially, the module body of a synthesizable Verilog module can be thought of as a collection of circuit parts. These parts operate in parallel. There are several ways to describe a part:

  • Continuous assignment

  • Procedural assignment or "always block"

  • Module instantiation (structural)

In the previous code, we use continuous assignment to describe the module body, with the following syntax:

assign [signal_name] = [expression];

Each continuous assignment can be thought of as a circuit part.
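For comparison, the same half adder can be described with an always block instead of continuous assignments. This is a sketch rather than code from the tutorial's repository, and the module name half_adder_always is hypothetical. Note that the outputs are declared as reg because they are assigned inside a procedural block.

```verilog
module half_adder_always
    (
        input wire a,
        input wire b,
        output reg sum,
        output reg carry
    );

    // Combinational always block: re-evaluated whenever a or b changes
    always @*
    begin
        sum   = a ^ b;
        carry = a & b;
    end

endmodule
```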

1.5.3. Signal Declaration

Let's consider this example code of a full-adder circuit.

full_adder.v
module full_adder
    (
        input wire  a,
        input wire  b,
        input wire  c,
        output wire sum,
        output wire carry 
    );
    
    // Signal declaration
    wire sum_0, carry_0, carry_1;
    
    // Module body
    half_adder half_adder_0 // Module instantiation
    (
        .a(a),
        .b(b),
        .sum(sum_0),
        .carry(carry_0)
    );
    half_adder half_adder_1
    (
        .a(sum_0),
        .b(c),
        .sum(sum),
        .carry(carry_1)
    );
    assign carry = carry_0 | carry_1; // Continuous assignment

endmodule

The full-adder code shows an example of a signal declaration, which uses the following syntax:

[data_type] [signal_name];

A digital system is frequently composed of several smaller subsystems. This allows us to build a large system from simpler or predesigned components. Verilog provides a mechanism, known as module instantiation, to perform this task. This type of code is called structural description.
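Structural description is not limited to user-defined modules. As a sketch, the half adder itself can be built by instantiating Verilog's built-in gate primitives xor and and. The module name half_adder_gates and the instance names are hypothetical.

```verilog
module half_adder_gates
    (
        input wire  a,
        input wire  b,
        output wire sum,
        output wire carry
    );

    // Built-in gate primitives: the first terminal is the output
    xor xor_0 (sum, a, b);
    and and_0 (carry, a, b);

endmodule
```

Unlike module instances, gate primitives connect their terminals by position, with the output listed first.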

1.6. Testbench

After code is developed, it can be simulated on a host computer to verify the correctness of the circuit operation and can be synthesized to a physical device. Simulation is usually performed within the same HDL framework. We create a special program, known as a testbench, to mimic a physical lab bench. The following testbench applies all eight input combinations to the full adder and holds each one for 10 ns.

full_adder_tb.v
`timescale 1ns / 1ps

module full_adder_tb();
    localparam T = 10;
    
    reg a;
    reg b;
    reg c;
    wire sum;
    wire carry;
    
    full_adder full_adder_0
    (
        .a(a),
        .b(b),
        .c(c),
        .sum(sum),
        .carry(carry)
    );
    
    initial
    begin
        a = 0; b = 0; c = 0; #T;
        a = 0; b = 0; c = 1; #T;
        a = 0; b = 1; c = 0; #T;
        a = 0; b = 1; c = 1; #T;
        a = 1; b = 0; c = 0; #T;
        a = 1; b = 0; c = 1; #T;
        a = 1; b = 1; c = 0; #T;
        a = 1; b = 1; c = 1; #T;                
    end
    
endmodule
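The testbench above only applies stimulus, so the outputs still have to be inspected in the waveform. A self-checking variant is sketched below under the assumption that half_adder.v and full_adder.v from Section 1.5 are compiled together with it. The module name full_adder_check_tb and the signals i, errors, and expected are hypothetical.

```verilog
`timescale 1ns / 1ps

module full_adder_check_tb();
    localparam T = 10;

    reg a, b, c;
    wire sum, carry;
    reg [1:0] expected; // Reference result, wide enough to hold the carry
    integer i;
    integer errors;

    full_adder full_adder_0
    (
        .a(a),
        .b(b),
        .c(c),
        .sum(sum),
        .carry(carry)
    );

    initial
    begin
        errors = 0;
        // Walk through all 8 input combinations
        for (i = 0; i < 8; i = i + 1)
        begin
            {a, b, c} = i[2:0];
            #T;
            expected = a + b + c;
            if ({carry, sum} !== expected)
            begin
                errors = errors + 1;
                $display("FAIL: a=%b b=%b c=%b carry=%b sum=%b", a, b, c, carry, sum);
            end
        end
        if (errors == 0)
            $display("All 8 cases passed");
        else
            $display("%0d case(s) failed", errors);
        $finish;
    end

endmodule
```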

2. FPGA

2.1. Architecture

Field Programmable Gate Arrays (FPGAs) are integrated circuits that can be programmed and reconfigured by the customer or designer after manufacturing. This flexibility allows engineers to customize the hardware to their specific application or design needs. The two biggest FPGA companies are Xilinx (acquired by AMD in 2022) and Altera (acquired by Intel in 2015).

FPGAs are constructed from configurable logic blocks (CLBs). The simplified block diagram of a CLB is shown in the following figure. It consists of look-up tables (LUTs) and flip-flops (FFs). The detailed structure of a CLB differs between FPGA vendors and device families.

The CLBs are connected through a switch-matrix interconnect, as illustrated in the following figure. There are also I/O blocks for interfacing with external components, and additional blocks on the FPGA such as embedded memory, DSP blocks, or microprocessors.

The following table contains a comparison of Xilinx FPGA resources on several different boards.

Resource              Zybo Z7-10   PYNQ Z1/Z2    Ultra 96v2    Kria K26      ZCU104         RFSoC 4x2
System Logic Cells    28000        85000         154350        256200        504000         930300
CLB Flip-Flops        35200        106400        141120        234240        460800         850560
CLB LUTs              17600        53200         70560         117120        230400         425280
Distributed RAM (Mb)  -            -             1.8           3.5           6.2            13.0
Block RAM Blocks      60 (2.1 Mb)  140 (4.9 Mb)  216 (7.6 Mb)  144 (5.1 Mb)  312 (11.0 Mb)  1080 (38.0 Mb)
UltraRAM Blocks       -            -             0             64            96 (27.0 Mb)   80 (22.5 Mb)
DSP Slices            80           220           360           1248          1728           4272

2.2. Design Flow

The FPGA design flow comprises several different steps:

  • RTL design: module design using HDL code.

  • RTL simulation: functional simulation using a testbench to make sure the design works properly.

  • Synthesis: convert the HDL code into resources that are actually available on your FPGA device.

  • Place and route: select which of the actual resources on the device will be used, and choose which routing resources will be used to interconnect them.

  • Bit file generation: convert the place and route output to the format actually used to program the device.

3. Design Considerations

3.1. Maximum Frequency

The maximum frequency of a module is determined by its critical path. The critical path is the path between two registers with the longest delay, and it limits the clock speed. This delay comes from the combinational circuit (a circuit without registers) that lies between the two registers.

A combinational circuit is implemented using basic FPGA resources, and every resource has a propagation delay. Propagation delay is the time difference between an input change and the corresponding output change.

This critical path will determine the maximum operating frequency of a module as illustrated in the following figure.
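The relation between the critical path and the clock can be written as a timing inequality. The symbols here are assumptions introduced for illustration: $t_{cq}$ is the clock-to-output delay of the source register, $t_{logic}$ is the propagation delay of the critical combinational path including routing, and $t_{setup}$ is the setup time of the destination register. Clock skew is ignored in this sketch.

```latex
T_{clk} \ge t_{cq} + t_{logic} + t_{setup}
\qquad\Longrightarrow\qquad
f_{max} = \frac{1}{t_{cq} + t_{logic} + t_{setup}}
```

For example, if the three delays add up to 5 ns, the clock cannot run faster than 1 / 5 ns = 200 MHz.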

Pipelining, as shown in the following figure, is a technique whereby we divide a combinational path into multiple parts and include a register at the end of each partial path. In this way, we divide the critical path into multiple small paths, and this allows us to increase the clock speed. However, as a trade-off, the latency of the path will increase.
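As a sketch of this technique, the two modules below add three 8-bit numbers. The module names add3_flat and add3_pipe and the signal names are hypothetical, and the inputs are assumed to come from registers elsewhere in the design. In the first module both additions sit in one combinational path. In the second, a register after the first addition splits that path into two shorter ones.

```verilog
// Non-pipelined: both additions are in one combinational path
module add3_flat
    (
        input wire       clk,
        input wire [7:0] a, b, c,
        output reg [9:0] y
    );

    always @(posedge clk)
        y <= a + b + c;

endmodule

// Pipelined: a register after the first addition splits the path in two
module add3_pipe
    (
        input wire       clk,
        input wire [7:0] a, b, c,
        output reg [9:0] y
    );

    reg [8:0] ab_reg; // Stage 1 result
    reg [7:0] c_reg;  // c delayed by one cycle so it stays aligned with ab_reg

    always @(posedge clk)
    begin
        ab_reg <= a + b;          // Stage 1
        c_reg  <= c;
        y      <= ab_reg + c_reg; // Stage 2
    end

endmodule
```

The result of add3_flat is available one clock cycle after the inputs, while add3_pipe needs two cycles. In exchange, each stage of add3_pipe contains only one adder, so the clock period can be shorter.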

3.2. Latency

Latency is the time needed for an input change to produce an output change. Latency can be expressed as a length of time or, in synchronous circuits, as a certain number of clock cycles. Latency can be caused by a complex computation that consists of several stages. Latency can also be caused by pipelining to reduce critical paths.

3.3. Throughput

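Throughput refers to the rate at which data can be processed and is usually measured in bit/s. For a circuit that processes one input at a time, without pipelining, the number of results per second can be calculated from latency with this equation. Multiplying by the number of bits per result gives the throughput in bit/s.

$$\text{Throughput} = \frac{1}{\text{Latency}}$$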
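As a worked example with made-up numbers (the values and symbols below are assumptions for illustration, not measurements of this design), consider a path with a latency of 10 ns that handles one input at a time. If the same path is pipelined into two stages of about 5 ns each, a new input can be accepted every clock cycle, so the rate is set by the clock period $T_{clk}$ rather than by the total latency.

```latex
\text{Non-pipelined:}\quad
\text{Throughput} = \frac{1}{10\ \text{ns}} = 100 \times 10^{6}\ \text{results/s}

\text{Two-stage pipeline:}\quad
\text{Latency} = 2 \times T_{clk} = 2 \times 5\ \text{ns} = 10\ \text{ns},
\qquad
\text{Throughput} = \frac{1}{T_{clk}} = \frac{1}{5\ \text{ns}} = 200 \times 10^{6}\ \text{results/s}
```

In practice the registers add some delay of their own, so the pipelined latency is slightly longer than in this idealized calculation.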

4. Design Example

4.1. Adder Module

In this example, we are going to design a basic combinational circuit: a 2-bit adder. The 2-bit adder is built from two full-adder circuits, and each full adder is built from two half-adder circuits. The Verilog code for the half adder and the full adder is explained in Section 1.5. This code is very basic and only uses logic gates.

From this full-adder circuit, we can build a 2-bit adder as shown in the following figure.

This is the implementation of 2-bit adder in Verilog:

adder_2bit.v
module adder_2bit
    (
        input wire [1:0]  a,
        input wire [1:0]  b,
        output wire [2:0] y
    );
    
    wire sum_0, sum_1;
    wire carry_0, carry_1;
    
    full_adder full_adder_0
    (
        .a(a[0]),
        .b(b[0]),
        .c(1'b0), // No carry-in for the least significant bit
        .sum(sum_0),
        .carry(carry_0) 
    );
    
    full_adder full_adder_1
    (
        .a(a[1]),
        .b(b[1]),
        .c(carry_0),
        .sum(sum_1),
        .carry(carry_1) 
    );
    
    assign y = {carry_1, sum_1, sum_0};
    
endmodule
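For comparison, the same function can be described behaviorally with the + operator and left to the synthesis tool to implement. This is only a sketch to contrast with the gate-level version above. The module name adder_2bit_behav is hypothetical.

```verilog
module adder_2bit_behav
    (
        input wire [1:0]  a,
        input wire [1:0]  b,
        output wire [2:0] y
    );

    // The 3-bit left-hand side keeps the carry out of the 2-bit addition
    assign y = a + b;

endmodule
```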

4.2. Simulation

After we design a module, we have to simulate it to make sure it functions properly. We create a Verilog testbench file to simulate the adder module.

adder_2bit_tb.v
`timescale 1ns / 1ps

module adder_2bit_tb();
    localparam T = 10;
    
    reg [1:0] a;
    reg [1:0] b;
    wire [2:0] y;
    
    adder_2bit adder_2bit_0
    (
        .a(a),
        .b(b),
        .y(y)
    );
    
    initial
    begin
        a = 0; b = 0;
        #T;
        a = 0; b = 1;
        #T;
        a = 0; b = 2;
        #T;
        a = 0; b = 3;
        #T;
        a = 1; b = 0;
        #T;
        a = 1; b = 1;
        #T;
        a = 1; b = 2;
        #T;
        a = 1; b = 3;
        #T;
        a = 2; b = 0;
        #T;
        a = 2; b = 1;
        #T;
        a = 2; b = 2;
        #T;
        a = 2; b = 3;
        #T;  
        a = 3; b = 0;
        #T;
        a = 3; b = 1;
        #T;
        a = 3; b = 2;
        #T;
        a = 3; b = 3;
        #T;            
    end
    
endmodule

The simulation result is shown in the following figure.
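The sixteen input combinations can also be generated with a loop and checked automatically instead of by reading the waveform. The sketch below assumes that adder_2bit.v, full_adder.v, and half_adder.v are compiled together with it. The module name adder_2bit_check_tb and the signals i, errors, and expected are hypothetical.

```verilog
`timescale 1ns / 1ps

module adder_2bit_check_tb();
    localparam T = 10;

    reg [1:0] a;
    reg [1:0] b;
    wire [2:0] y;
    reg [2:0] expected; // Reference result computed with the + operator
    integer i;
    integer errors;

    adder_2bit adder_2bit_0
    (
        .a(a),
        .b(b),
        .y(y)
    );

    initial
    begin
        errors = 0;
        // Walk through all 16 combinations of a and b
        for (i = 0; i < 16; i = i + 1)
        begin
            {a, b} = i[3:0];
            #T;
            expected = a + b;
            if (y !== expected)
            begin
                errors = errors + 1;
                $display("FAIL: a=%0d b=%0d y=%0d expected=%0d", a, b, y, expected);
            end
        end
        if (errors == 0)
            $display("All 16 cases passed");
        else
            $display("%0d case(s) failed", errors);
        $finish;
    end

endmodule
```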

4.3. Constraints

Constraints are used to guide the FPGA design implementation tools, including the synthesizer and the place-and-route tools. Constraints can include placement, timing, and I/O restrictions.

In this project, we use constraints to assign each port to a package pin and to set its I/O standard, as follows. We use four switches and three LEDs.

constrs_1.xdc
#Switches
#IO_L19N_T3_VREF_35
set_property PACKAGE_PIN G15 [get_ports {a[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {a[0]}]

#IO_L24P_T3_34
set_property PACKAGE_PIN P15 [get_ports {a[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {a[1]}]

#IO_L4N_T0_34
set_property PACKAGE_PIN W13 [get_ports {b[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {b[0]}]

#IO_L9P_T1_DQS_34
set_property PACKAGE_PIN T16 [get_ports {b[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {b[1]}]

#LEDs
#IO_L23P_T3_35
set_property PACKAGE_PIN M14 [get_ports {y[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {y[0]}]

#IO_L23N_T3_35
set_property PACKAGE_PIN M15 [get_ports {y[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {y[1]}]

#IO_0_35
set_property PACKAGE_PIN G14 [get_ports {y[2]}]
set_property IOSTANDARD LVCMOS33 [get_ports {y[2]}]

4.4. Synthesis

After we simulate the design, the next step is to synthesize it. This process converts the HDL code into resources that are actually available on your FPGA device. The schematic after synthesis is shown in the following figure.

The FPGA resources used by this design after synthesis are shown in the following figure. The resource utilization reported by synthesis is only an estimate, not a final result. In the next step, place and route, the design is optimized further, so the utilization can change.

4.5. Place and Route

In this process, the Vivado tools select which of the actual resources on the device will be used and choose which routing resources will be used to interconnect them. This process is also called implementation in Vivado. Its output is the final resource utilization, the specific resources that are used, and the routes that connect them.

4.6. Generate Bitstream

This process converts the place and route output to the format actually used to program the device. After that, we can program the FPGA board.

4.7. Result

The following figure shows the result on the FPGA board. The switch values from left to right are b[1], b[0], a[1], and a[0]. The LED values from left to right are 0, y[2], y[1], and y[0]. In the figure, the input a is 2 and the input b is 3, so the output y is 5.

5. Conclusion

In this tutorial, we covered the basics of Verilog, FPGA, design considerations, and an example of a basic circuit using logic gates. We also covered the flow of the FPGA design.
