In computer architecture, the system bus is the interconnect that links the CPU to memory and I/O, as the following figure illustrates. The system bus consists of control, data, and address lines. Data can flow in both directions, from the CPU to memory or I/O and back, with the CPU acting as the master that initiates each transfer.
The following figure illustrates the FPGA SoC architecture, in which the FPGA is connected to the CPU through the system bus.
There are various types of system buses: APB, AHB, AXI, Avalon, etc. The Zynq SoC uses APB, AHB, and AXI, all of which belong to the ARM Advanced Microcontroller Bus Architecture (AMBA). APB and AHB are used only inside the processing system (PS), while AXI can also connect the PS to the programmable logic (PL).
This is a detailed block diagram of the Xilinx Zynq architecture. It consists of the CPU, controllers for DRAM and flash memory, input/output peripherals, the FPGA, and the system bus.
1.2. Memory Mapped Access
Memory mapping is the method by which the CPU accesses memory and I/O through addresses. Each DDR memory location and each I/O register has its own address.
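As a minimal sketch of what memory-mapped access looks like from C, the helpers below read and write 32-bit registers through a base pointer plus a byte offset. The names reg_write32 and reg_read32 are hypothetical and chosen for illustration only. On the Zynq, the helpers Xil_Out32 and Xil_In32 that appear in helloworld.c play this role.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write a 32-bit value to the register located
 * `offset` bytes after `base`. `volatile` keeps the compiler from
 * caching or reordering the access. */
void reg_write32(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0);   /* 32-bit registers are word-aligned */
    base[offset / 4] = value;
}

/* Hypothetical helper: read the 32-bit register at `base + offset`. */
uint32_t reg_read32(volatile uint32_t *base, uint32_t offset)
{
    assert(offset % 4 == 0);
    return base[offset / 4];
}
```

On real hardware, base would be the address assigned to a peripheral. On a host PC, the same helpers can be tried against an ordinary array that stands in for the register block.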
The number of addresses is determined by the bit width of the address. If the address bit width is 32, then there are 2^32 addresses, or 4 GB of address space. If the address bit width is 40, then there are 2^40 addresses, or 1 TB.
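The arithmetic can be checked with a short sketch. The function name address_space_bytes is hypothetical, and the sketch assumes byte addressing, that is, one byte per address.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with `addr_bits`
 * address bits, assuming one byte per address. */
uint64_t address_space_bytes(unsigned addr_bits)
{
    assert(addr_bits < 64);    /* 2^64 does not fit in a uint64_t */
    return (uint64_t)1 << addr_bits;
}
```

Shifting the result right by 30 gives the size in GB (4 for a 32-bit address), and shifting it right by 40 gives the size in TB (1 for a 40-bit address).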
The following is the memory map on the Zynq-7000:
The Zynq-7000 still uses a 32-bit address width, so the maximum total address space is 4 GB.
Location from 0x0000_0000 for DDR memory
Location from 0x4000_0000 for AXI slave port 0 in PL
Location from 0x8000_0000 for AXI slave port 1 in PL
Location from 0xE000_0000 for IO peripherals such as UART, USB, Ethernet, etc.
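To make the map concrete, here is a small sketch that reports which of the regions listed above an address falls into. It is a simplification: it assumes each region extends up to the start of the next one, whereas the actual Zynq-7000 map has more entries and reserved gaps. The names zynq_regions and region_for are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct region {
    uint32_t    start;   /* first address of the region */
    const char *name;
};

/* Start addresses from the list above, in ascending order. */
static const struct region zynq_regions[] = {
    { 0x00000000u, "DDR memory" },
    { 0x40000000u, "AXI slave port 0 in PL" },
    { 0x80000000u, "AXI slave port 1 in PL" },
    { 0xE0000000u, "IO peripherals" },
};

/* Return the name of the last region whose start address is <= addr. */
const char *region_for(uint32_t addr)
{
    const char *name = zynq_regions[0].name;
    for (size_t i = 0; i < sizeof zynq_regions / sizeof zynq_regions[0]; i++) {
        if (addr >= zynq_regions[i].start)
            name = zynq_regions[i].name;
    }
    return name;
}
```

For example, any address between 0x4000_0000 and 0x7FFF_FFFF resolves to AXI slave port 0 in the PL, which is where a custom module attached to that port would be reached.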
In AXI, the components on the bus are either masters or slaves. The master initiates each transaction and decides whether it is a read or a write. The slave cannot start a transaction; it only responds to the master's read or write request.
The master is usually the CPU, but custom modules that we create in the FPGA can also act as masters, for example when an FPGA module must read from or write to DDR memory directly.
2. Design Example
In this design example, we are going to integrate the PE module into the Zynq system with memory-mapped access. The following figure shows the block diagram of the PE module.
The following listing shows the code for the PE module.
The PE module is simple, but its I/O does not follow a standard protocol. We therefore need a top module that wraps the PE module in a standard protocol so that it can be integrated with the Zynq system. The following code shows the AXI-Stream wrapper module for the PE.
Now that the PE module can communicate over the AXI-Stream protocol, the next step is to build the block design, shown in the following figure. We use an IP called AXI-Stream FIFO, which converts memory-mapped access to the AXI-Stream interface.
This is the configuration for the AXI-Stream FIFO IP.
After the AXI-Stream FIFO is connected to the PS, it is assigned a base address, as shown in the following figure. The C program uses this address to reach the FIFO.
The following code shows the C code to access the AXI-Stream FIFO. In this example, we send one packet that consists of eight 32-bit words.
helloworld.c
#include <stdio.h>
#include <stdint.h> // uint8_t
#include "xparameters.h"
#include "xllfifo.h"
#include "xstatus.h"
#define WORD_SIZE 4 // Size of words in bytes
#define DATA_LEN 8 // Number of 32-bit words per packet
int Init_XLlFifo(XLlFifo *InstancePtr, u16 DeviceId);
int TxSend(XLlFifo *InstancePtr, u32 *SourceAddr);
int RxReceive(XLlFifo *InstancePtr, u32 *DestinationAddr);
XLlFifo FifoInstance;
u32 SourceBuffer[DATA_LEN];
u32 DestinationBuffer[DATA_LEN];
int main()
{
    // Initialize AXI-Stream FIFO IP and stop if it fails
    if (Init_XLlFifo(&FifoInstance, XPAR_AXI_FIFO_0_DEVICE_ID) != XST_SUCCESS)
        return XST_FAILURE;
    printf("Initialization success\n");

    // Pack one word per element: y in bits 23:16, b in bits 15:8, a in bits 7:0
    printf("Input:\n");
    for (int i = 0; i < DATA_LEN; i++)
    {
        uint8_t a = i + 1;
        uint8_t b = 8 - i;
        uint8_t y = i + 1;
        SourceBuffer[i] = ((u32)y << 16) | ((u32)b << 8) | a;
        printf(" a=%d, b=%d, y=%d\n", a, b, y);
    }

    // Send to PE module
    TxSend(&FifoInstance, SourceBuffer);
    // Read from PE module
    RxReceive(&FifoInstance, DestinationBuffer);

    // Print output
    printf("Output:\n");
    for (int i = 0; i < DATA_LEN; i++)
        printf(" %lu\n", (unsigned long)DestinationBuffer[i]);
    return 0;
}
int Init_XLlFifo(XLlFifo *InstancePtr, u16 DeviceId)
{
    XLlFifo_Config *Config;
    int Status;

    // Look up the hardware configuration for this device ID
    // (XLlFfio_LookupConfig is the spelling used by the xllfifo driver)
    Config = XLlFfio_LookupConfig(DeviceId);
    if (!Config)
    {
        printf("No config found for %d\n", DeviceId);
        return XST_FAILURE;
    }

    Status = XLlFifo_CfgInitialize(InstancePtr, Config, Config->BaseAddress);
    if (Status != XST_SUCCESS)
    {
        printf("Initialization failed\n");
        return XST_FAILURE;
    }

    // Clear all pending interrupts, then check that the status reads zero
    XLlFifo_IntClear(InstancePtr, 0xffffffff);
    Status = XLlFifo_Status(InstancePtr);
    if (Status != 0x0)
    {
        printf("Reset failed\n");
        return XST_FAILURE;
    }
    return XST_SUCCESS;
}
int TxSend(XLlFifo *InstancePtr, u32 *SourceAddr)
{
    // Write the packet into the FIFO transmit buffer, one word at a time
    for (int i = 0; i < DATA_LEN; i++)
    {
        // Stop if the FIFO is full, so the length written to the TLR
        // never exceeds the number of words actually queued
        if (!XLlFifo_iTxVacancy(InstancePtr))
            return XST_FAILURE;
        Xil_Out32(InstancePtr->Axi4BaseAddress + XLLF_TDFD_OFFSET, *(SourceAddr+i));
    }
    // Start transmission by writing the packet length in bytes into the TLR
    XLlFifo_iTxSetLen(InstancePtr, (DATA_LEN * WORD_SIZE));
    // Wait for transmission to complete
    while (!(XLlFifo_IsTxDone(InstancePtr)));
    return XST_SUCCESS;
}
int RxReceive(XLlFifo *InstancePtr, u32* DestinationAddr)
{
    u32 ReceiveLength;
    u32 RxWord;
    int Status;

    while (XLlFifo_iRxOccupancy(InstancePtr))
    {
        // Read the received packet length and convert it from bytes to words
        ReceiveLength = XLlFifo_iRxGetLen(InstancePtr) / WORD_SIZE;
        // Read the packet from the FIFO receive buffer
        for (u32 i = 0; i < ReceiveLength; i++)
        {
            RxWord = Xil_In32(InstancePtr->Axi4BaseAddress + XLLF_RDFD_OFFSET);
            *(DestinationAddr+i) = RxWord;
        }
    }

    // Check for receive completion
    Status = XLlFifo_IsRxDone(InstancePtr);
    if (Status != TRUE)
    {
        printf("Failing in receive complete\n");
        return XST_FAILURE;
    }
    return XST_SUCCESS;
}
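The word layout that main() builds for each element of SourceBuffer can be checked on a host PC, without the board. The sketch below mirrors that packing (a in bits 7:0, b in bits 15:8, y in bits 23:16) and adds the inverse operation. The names pack_word and unpack_word are hypothetical, and the sketch makes no assumption about how the PE interprets the three fields.

```c
#include <stdint.h>

/* Pack three 8-bit fields into one 32-bit word, as main() does. */
uint32_t pack_word(uint8_t a, uint8_t b, uint8_t y)
{
    return ((uint32_t)y << 16) | ((uint32_t)b << 8) | (uint32_t)a;
}

/* Split a packed word back into its three 8-bit fields. */
void unpack_word(uint32_t word, uint8_t *a, uint8_t *b, uint8_t *y)
{
    *a = (uint8_t)(word & 0xFFu);
    *b = (uint8_t)((word >> 8) & 0xFFu);
    *y = (uint8_t)((word >> 16) & 0xFFu);
}
```

For the first loop iteration in main(), where a=1, b=8, and y=1, pack_word returns 0x00010801.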
The following figure shows the output on the serial terminal.
3. Conclusion
In this tutorial, we covered the Zynq SoC and a simple example of how to integrate a custom RTL module with the Zynq PS.