
Part 6: ARM CPU and FPGA Module


Last updated 10 months ago

Objective

This tutorial introduces the ZYNQ SoC and walks through a simple example of integrating a custom RTL module with the ZYNQ PS.

Source Code

The weenslab/zybo_tutorial repository on GitHub (zybo_tutorial/part_6) contains all of the code required to follow this tutorial.

References

1. Computer Architecture

1.1. System Bus

In computer architecture, the system bus is the interconnect that links the CPU with memory and I/O, as illustrated in the following figure. It consists of control, data, and address lines. With the CPU as the master, data can flow in either direction: from the CPU to memory or I/O, or back.

The following figure is an illustration of the FPGA SoC architecture. There is an FPGA that can be connected to the CPU via the system bus.

There are various system bus standards, such as APB, AHB, AXI, and Avalon. The Zynq SoC uses APB, AHB, and AXI, all of which belong to the ARM Advanced Microcontroller Bus Architecture (AMBA). APB and AHB are used only inside the PS, while AXI can also connect the PS to the PL.

This is a detailed block diagram of the Xilinx Zynq architecture. It consists of the CPU, controller for DRAM and flash memory, input/output, FPGA, and system bus.

1.2. Memory Mapped Access

The method of CPU access to memory and I/O using addresses is called memory mapping. Each DDR memory location and I/O register has its own address.

The number of addresses is determined by the bit width of the address. If the address bit width is 32, then there are $2^{32}$ addresses, or 4 GB. If the address bit width is 40, then there are $2^{40}$ addresses, or 1 TB.
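
As a quick check of this arithmetic, the following C sketch computes the number of distinct addresses for a given address width. The helper name `address_count` is hypothetical, and reading the result as a size in bytes assumes that each address refers to one byte.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses for a given address width in bits.
 * Hypothetical helper for illustration; valid for widths 0..63. */
uint64_t address_count(unsigned width_bits)
{
    assert(width_bits < 64); /* a 64-bit shift would overflow uint64_t */
    return (uint64_t)1 << width_bits;
}
```

Under that assumption, `address_count(32)` is 4294967296 (4 GB of byte addresses) and `address_count(40)` is 1099511627776 (1 TB).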

The following is the memory map on the Zynq-7000:

The Zynq-7000 still uses a 32-bit address width, so the maximum total address space is 4 GB.

  • Location from 0x0000_0000 for DDR memory

  • Location from 0x4000_0000 for AXI slave port 0 in PL

  • Location from 0x8000_0000 for AXI slave port 1 in PL

  • Location from 0xE000_0000 for IO peripherals such as UART, USB, Ethernet, etc.
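
The address list above can be sketched as a small decoder that tells which region a 32-bit address falls into. The base addresses come from the list; the enum, the function name `decode_region`, and the 1 GB window sizes for DDR and the two PL ports are assumptions made for illustration. Everything from 0xE000_0000 upward is lumped together as I/O here, which is a simplification.

```c
#include <stdint.h>

/* Region labels for the four base addresses listed above. */
typedef enum {
    REGION_DDR,      /* from 0x0000_0000 */
    REGION_PL_PORT0, /* from 0x4000_0000 */
    REGION_PL_PORT1, /* from 0x8000_0000 */
    REGION_IO,       /* from 0xE000_0000 */
    REGION_OTHER     /* not covered by the list above */
} zynq_region_t;

/* Hypothetical decoder. The 1 GB window sizes are assumptions;
 * check the memory map figure or the TRM for the exact ranges. */
zynq_region_t decode_region(uint32_t addr)
{
    if (addr < 0x40000000u) return REGION_DDR;
    if (addr < 0x80000000u) return REGION_PL_PORT0;
    if (addr < 0xC0000000u) return REGION_PL_PORT1;
    if (addr >= 0xE0000000u) return REGION_IO;
    return REGION_OTHER;
}
```

For example, `decode_region(0x40000000u)` returns `REGION_PL_PORT0`, so a CPU load or store to that address is routed to logic in the PL rather than to DDR memory.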

In AXI, the two sides of a connection are called the master and the slave. The master initiates every transaction and decides whether it is a read or a write; the slave only responds to it.

The master is usually the CPU, but custom modules that we create in the FPGA can also act as masters. For example, an FPGA module that needs to read from or write to DDR memory on its own must be a master.

2. Design Example

In this design example, we are going to integrate the PE module into the ZYNQ system with memory map access. The following figure shows the block diagram of the PE module.

The following code shows the PE module.

pe.v
module pe
    #( 
        parameter WIDTH = 8,
        parameter FRAC_BIT = 0
    )
    (
        input wire signed [WIDTH-1:0]  a_in,
        input wire signed [WIDTH-1:0]  y_in,
        input wire signed [WIDTH-1:0]  b,
        output wire signed [WIDTH-1:0] a_out,
        output wire signed [WIDTH-1:0] y_out
    );
    
    wire signed [WIDTH*2-1:0] y_out_i;
    
    assign a_out = a_in;
    assign y_out_i = a_in * b;
    assign y_out = y_in + y_out_i[WIDTH+FRAC_BIT-1:FRAC_BIT];

endmodule
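
To check the hardware against software, the arithmetic of pe.v can be mirrored in a few lines of C. This is a minimal sketch for WIDTH = 8 only; the function name `pe_model` is hypothetical, and the final conversion back to a signed value assumes a two's complement target.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of pe.v for WIDTH = 8 (hypothetical helper).
 * Computes y_out = y_in + (a_in * b)[frac_bit+7 : frac_bit],
 * wrapping to 8 bits the way the Verilog assignment does. */
int8_t pe_model(int8_t a_in, int8_t b, int8_t y_in, unsigned frac_bit)
{
    assert(frac_bit <= 8); /* slice must stay inside the 16-bit product */

    int32_t product = (int32_t)a_in * (int32_t)b;
    /* Take 8 bits of the product starting at bit frac_bit. */
    uint32_t slice = ((uint32_t)product >> frac_bit) & 0xFFu;
    /* 8-bit wrap-around add, then reinterpret the bits as signed. */
    uint8_t sum = (uint8_t)((uint8_t)y_in + slice);
    return (int8_t)sum;
}
```

With FRAC_BIT = 0, `pe_model(1, 8, 1, 0)` gives 9. With FRAC_BIT = 4 the operands can be read as fixed-point values with four fractional bits, so `pe_model(16, 24, 0, 4)` multiplies 1.0 by 1.5 and gives 24, which is 1.5 in the same format.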

The PE module's ports do not follow any standard protocol, so it cannot be connected to the ZYNQ system directly. We therefore wrap it in a top module that exposes a standard interface. The wrapper takes a_in, b, and y_in from bits [7:0], [15:8], and [23:16] of s_axis_tdata, and returns y_out in the low 8 bits of m_axis_tdata. The following code shows the AXI-Stream wrapper module for the PE.

axis_pe.v
module axis_pe
    (
        input wire         aclk,
        input wire         aresetn,
        // *** AXIS slave port ***
        output wire        s_axis_tready,
        input wire [31:0]  s_axis_tdata,
        input wire         s_axis_tvalid,
        input wire         s_axis_tlast,
        // *** AXIS master port ***
        input wire         m_axis_tready,
        output wire [31:0] m_axis_tdata,
        output wire        m_axis_tvalid,
        output wire        m_axis_tlast
    );
    
    wire [7:0] y_out;
    
    // AXI-Stream control
    assign s_axis_tready = m_axis_tready;
    assign m_axis_tdata = {24'h000000, y_out};
    assign m_axis_tvalid = s_axis_tvalid;
    assign m_axis_tlast = s_axis_tlast;
    
    // PE
    pe #(8, 0) pe_0
    (
        .a_in(s_axis_tdata[7:0]),
        .y_in(s_axis_tdata[23:16]),
        .b(s_axis_tdata[15:8]),
        .a_out(),
        .y_out(y_out)
    );
    
endmodule
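
The bit slices used by the wrapper can be mirrored on the software side. The sketch below packs and unpacks one 32-bit stream word using the same field positions as axis_pe.v, and states the handshake condition for one beat. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields carried in one 32-bit AXI-Stream word, following the bit
 * slices used in axis_pe.v. Names are hypothetical. */
typedef struct {
    uint8_t a;    /* s_axis_tdata[7:0]   */
    uint8_t b;    /* s_axis_tdata[15:8]  */
    uint8_t y_in; /* s_axis_tdata[23:16] */
} pe_word_t;

/* Build the 32-bit word that the PS sends to the wrapper. */
uint32_t pe_word_pack(pe_word_t w)
{
    return ((uint32_t)w.y_in << 16) | ((uint32_t)w.b << 8) | (uint32_t)w.a;
}

/* Split a received word back into its three 8-bit fields. */
pe_word_t pe_word_unpack(uint32_t tdata)
{
    pe_word_t w;
    w.a    = (uint8_t)(tdata & 0xFFu);
    w.b    = (uint8_t)((tdata >> 8) & 0xFFu);
    w.y_in = (uint8_t)((tdata >> 16) & 0xFFu);
    return w;
}

/* A beat moves through the wrapper only when the upstream side asserts
 * tvalid and the downstream side asserts tready in the same cycle. */
int axis_beat_transfers(int s_axis_tvalid, int m_axis_tready)
{
    return s_axis_tvalid && m_axis_tready;
}
```

Because the wrapper is purely combinational and simply forwards tvalid, tready, and tlast, it adds no latency and needs no state; the trade-off is that the PE result must settle within one clock cycle.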

Now that our PE module speaks the AXI-Stream protocol, the next step is to build the block design, shown in the following figure. We use an IP called AXI-Stream FIFO, which converts memory-mapped accesses from the PS into AXI-Stream transfers and back.

This is the configuration for the AXI-Stream FIFO IP.

After the AXI-Stream FIFO is connected to the PS, it gets the address as shown in the following figure. This address will be used in the C program.

The following C code accesses the AXI-Stream FIFO. In this example, we send one packet consisting of eight 32-bit words.

helloworld.c
#include <stdio.h>
#include <stdint.h>
#include "xparameters.h"
#include "xllfifo.h"
#include "xstatus.h"

#define WORD_SIZE 4 // Size of one word in bytes
#define DATA_LEN  8 // Number of 32-bit words per packet

int Init_XLlFifo(XLlFifo *InstancePtr, u16 DeviceId);
int TxSend(XLlFifo *InstancePtr, u32 *SourceAddr);
int RxReceive(XLlFifo *InstancePtr, u32 *DestinationAddr);

XLlFifo FifoInstance;
u32 SourceBuffer[DATA_LEN];
u32 DestinationBuffer[DATA_LEN];

int main()
{
    // Initialize AXI Stream FIFO IP and stop if it fails
    if (Init_XLlFifo(&FifoInstance, XPAR_AXI_FIFO_0_DEVICE_ID) != XST_SUCCESS)
        return XST_FAILURE;

    printf("Initialization success\n");

    // Pack each test vector as {y[23:16], b[15:8], a[7:0]}
    printf("Input:\n");
    for (int i = 0; i < DATA_LEN; i++)
    {
        uint8_t a = i + 1;
        uint8_t b = 8 - i;
        uint8_t y = i + 1;
        SourceBuffer[i] = ((u32)y << 16) | ((u32)b << 8) | a;
        printf(" a=%d, b=%d, y=%d\n", a, b, y);
    }

    // Send to the PE
    TxSend(&FifoInstance, SourceBuffer);

    // Read from the PE
    RxReceive(&FifoInstance, DestinationBuffer);

    // Print output
    printf("Output:\n");
    for (int i = 0; i < DATA_LEN; i++)
        printf(" %lu\n", (unsigned long)DestinationBuffer[i]);

    return 0;
}

int Init_XLlFifo(XLlFifo *InstancePtr, u16 DeviceId)
{
    XLlFifo_Config *Config;
    int Status;

    Config = XLlFfio_LookupConfig(DeviceId);
    if (!Config)
    {
        printf("No config found for %d\n", DeviceId);
        return XST_FAILURE;
    }

    Status = XLlFifo_CfgInitialize(InstancePtr, Config, Config->BaseAddress);
    if (Status != XST_SUCCESS)
    {
        printf("Initialization failed\n");
        return XST_FAILURE;
    }

    XLlFifo_IntClear(InstancePtr, 0xffffffff);
    Status = XLlFifo_Status(InstancePtr);
    if (Status != 0x0)
    {
        printf("Reset failed\n");
        return XST_FAILURE;
    }

    return XST_SUCCESS;
}

int TxSend(XLlFifo *InstancePtr, u32 *SourceAddr)
{
    // Writing into the FIFO transmit buffer
    for (int i = 0; i < DATA_LEN; i++)
        if (XLlFifo_iTxVacancy(InstancePtr))
            Xil_Out32(InstancePtr->Axi4BaseAddress + XLLF_TDFD_OFFSET, *(SourceAddr+i));

    // Start transmission by writing transmission length into the TLR
    XLlFifo_iTxSetLen(InstancePtr, (DATA_LEN * WORD_SIZE));

    // Check for transmission completion
    while (!(XLlFifo_IsTxDone(InstancePtr)));

    return XST_SUCCESS;
}

int RxReceive(XLlFifo *InstancePtr, u32* DestinationAddr)
{
    u32 ReceiveLength;
    u32 RxWord;
    int Status;

    while (XLlFifo_iRxOccupancy(InstancePtr))
    {
        // Read the received packet length in bytes and convert it to words
        ReceiveLength = XLlFifo_iRxGetLen(InstancePtr) / WORD_SIZE;
        // Read from the FIFO receive buffer
        for (u32 i = 0; i < ReceiveLength; i++)
        {
            RxWord = Xil_In32(InstancePtr->Axi4BaseAddress + XLLF_RDFD_OFFSET);
            // Always drain the FIFO, but never write past the destination buffer
            if (i < DATA_LEN)
                *(DestinationAddr+i) = RxWord;
        }
    }

    // Check for receive completion
    Status = XLlFifo_IsRxDone(InstancePtr);
    if (Status != TRUE)
    {
        printf("Failing in receive complete\n");
        return XST_FAILURE;
    }

    return XST_SUCCESS;
}

The following figure shows the output on the serial terminal.
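
For comparison with the terminal output, the values the PE should return can be computed on the host. The sketch below rebuilds the test vectors from helloworld.c and applies y + a * b truncated to 8 bits, assuming the hardware behaves exactly like pe.v with WIDTH = 8 and FRAC_BIT = 0. The function name `expected_outputs` is hypothetical.

```c
#include <stdint.h>

#define DATA_LEN 8

/* Host-side reference for the test vectors built in helloworld.c:
 * a = i + 1, b = 8 - i, y = i + 1. */
void expected_outputs(uint32_t out[DATA_LEN])
{
    for (int i = 0; i < DATA_LEN; i++) {
        uint8_t a = (uint8_t)(i + 1);
        uint8_t b = (uint8_t)(8 - i);
        uint8_t y = (uint8_t)(i + 1);
        /* Keep only the low 8 bits; the upper 24 bits of
         * m_axis_tdata are zero in axis_pe.v. */
        out[i] = (uint32_t)((y + a * b) & 0xFF);
    }
}
```

Under these assumptions the eight expected words are 9, 16, 21, 24, 25, 24, 21, and 16.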

3. Conclusion

In this tutorial, we covered the ZYNQ SoC and a simple example of integrating a custom RTL module with the ZYNQ PS through memory-mapped access.

Reference: Zynq 7000 Technical Reference Manual, https://docs.amd.com/r/en-US/ug585-zynq-7000-SoC-TRM
Source code: zybo_tutorial/part_6 at main, weenslab/zybo_tutorial (GitHub)
Figure 1. A computer system
Figure 2. An SoC architecture with FPGA
Figure 3. ZYNQ architecture block diagram
Figure 4. Zynq-7000 memory map
Figure 5. PE module
Figure 6. Block design of the PE system example
Figure 7. Configuring the AXI-Stream FIFO
Figure 8. The address of the AXI-Stream FIFO
Figure 9. PS system output printed on the serial terminal