Part 2: Direct Memory Access
This tutorial introduces the AXI DMA and shows how to get started using it.
AXI DMA v7.1 LogiCORE IP Product Guide, https://docs.amd.com/viewer/book-attachment/ePquvyIHSl7mKfi0ecEn4Q/CPPzqKuxCU1Q4a3rXOA1jw
Tutorial: PYNQ DMA (Part 1: Hardware design), https://discuss.pynq.io/t/tutorial-pynq-dma-part-1-hardware-design/3133
Tutorial: PYNQ DMA (Part 2: Using the DMA from PYNQ), https://discuss.pynq.io/t/tutorial-pynq-dma-part-2-using-the-dma-from-pynq/3134
Direct memory access (DMA) enables certain hardware subsystems to access system memory without relying on the CPU.
When doing data transfer without DMA, the CPU will be fully occupied for the entire duration of the read or write operation and is thus unavailable to perform other work.
With DMA, the CPU first initiates the transfer, then it does other operations while the transfer is in progress, and it finally receives an interrupt from the DMA controller when the operation is done.
The Xilinx IP library contains several DMA engines, located either in the PS or in the PL:
The PS DMA controller (DMAC) provides a flexible DMA engine that can provide moderate levels of throughput with little PL logic resource usage. The DMAC resides in the PS and must be programmed via DMA instructions residing in memory, typically prepared by a CPU.
The AXI Direct Memory Access (AXI DMA) IP provides high-bandwidth direct memory access between memory and AXI4-Stream-type target peripherals.
The AXI Central Direct Memory Access (AXI CDMA) provides high-bandwidth Direct Memory Access (DMA) between a memory-mapped source address and a memory-mapped destination address using the AXI4 protocol.
The AXI Video Direct Memory Access (AXI VDMA) core is a soft AMD IP core that provides high-bandwidth direct memory access between memory and AXI4-Stream type video target peripherals.
In this tutorial we will focus only on AXI DMA.
This figure shows the AXI DMA IP. This DMA allows you to stream data from memory, specifically PS DRAM, to an AXI stream interface. This is called the READ channel of the DMA. The DMA can also receive data from an AXI stream and output it to PS DRAM. This is the WRITE channel.
The read and write access to PS DRAM is done via the high-performance AXI ports, AMBA interconnect, DRAM controller, and finally to the DRAM itself outside the Zynq chip.
For the READ channel, AXI DMA reads memory-mapped data from DRAM via the M_AXI_MM2S port. MM2S stands for memory-mapped-to-stream. Then, the data will be streamed out via the M_AXIS_MM2S port.
For the WRITE channel, AXI DMA receives stream data from an AXI-Stream IP via the S_AXIS_S2MM port. S2MM stands for stream-to-memory-mapped. Then, the data will be written to the DRAM via the M_AXI_S2MM port.
To control the AXI DMA operations, we can use the S_AXI_LITE port. Through it we can send instructions to the AXI DMA, such as the source address, destination address, and number of bytes to be transferred.
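To make the control path more concrete, here is a small sketch of the register writes behind a single MM2S transfer in direct register mode (Scatter Gather disabled). The offsets follow the register map in the AXI DMA v7.1 product guide. The helper name `mm2s_write_sequence` is made up for illustration, and the code only builds the list of writes; it does not touch hardware.

```python
# Register offsets (relative to the S_AXI_LITE base address) in direct
# register mode, as listed in the AXI DMA v7.1 product guide.
MM2S_DMACR = 0x00   # MM2S control register
MM2S_DMASR = 0x04   # MM2S status register
MM2S_SA = 0x18      # MM2S source address, lower 32 bits
MM2S_SA_MSB = 0x1C  # MM2S source address, upper 32 bits
MM2S_LENGTH = 0x28  # MM2S transfer length in bytes
S2MM_DMACR = 0x30   # S2MM control register
S2MM_DMASR = 0x34   # S2MM status register
S2MM_DA = 0x48      # S2MM destination address, lower 32 bits
S2MM_DA_MSB = 0x4C  # S2MM destination address, upper 32 bits
S2MM_LENGTH = 0x58  # S2MM buffer length in bytes

DMACR_RUN_STOP = 0x1  # bit 0 of either control register starts the channel


def mm2s_write_sequence(src_addr, nbytes):
    """Return the (offset, value) writes that start one MM2S transfer.

    Hypothetical helper: it only describes the writes, in order.
    """
    return [
        (MM2S_DMACR, DMACR_RUN_STOP),       # set the channel running
        (MM2S_SA, src_addr & 0xFFFFFFFF),   # where to read from in DRAM
        (MM2S_SA_MSB, src_addr >> 32),      # upper bits for >32-bit addresses
        (MM2S_LENGTH, nbytes),              # writing LENGTH starts the transfer
    ]


for offset, value in mm2s_write_sequence(0x1_0000_1000, 64):
    print(f"write 0x{value:08X} to offset 0x{offset:02X}")
```

The PYNQ library used later in this tutorial performs writes of this kind for us, so we never have to issue them by hand.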
The AXI DMA can be configured in the IP configuration dialog.
For this project we need to do the following:
Uncheck Enable Scatter Gather Engine to disable Scatter Gather
Set the Width of Buffer Length Register to 26
Change the Address Width to 40. In this example, I will connect the DMA to the PS memory, which uses 40-bit addresses on Zynq UltraScale+. You can set this to 32 if you are connecting to a Zynq-7000.
Set the Memory Map Data Width to 64 to match the HP port.
Set the Stream Data Width to 64.
The AXI DMA master ports need to be connected to the DRAM via PS. This will be done through the PS HP (AXI Slave) ports. These ports are not enabled by default. Double click the Zynq PS block, and go to the PS-PL Configuration, expand HP Slave AXI Interface and enable S AXI HP0 and S AXI HP2 and set the data width to 64.
Internally there are two connections to the PS memory that the four HP ports are connected to. HP0 and HP1 share a switch to one port, and HP2 and HP3 share a switch to the other. The difference may not be noticeable for this example and for some designs, but when only two HP ports are required, it is more efficient to connect them to HP ports that don’t share a switch, i.e. HP0 and HP2 together, or HP1 and HP3 together.
After these ports are enabled, you can see them on the IP block.
Finally, this is our system block diagram, which consists of PS, AXI DMA, and AXIS FIFO. Here, we use the master AXI_GP_0 to connect to the AXI DMA control port via the AXI Interconnect. The AXI DMA data ports to DRAM are connected via the AXI_HP_0 and AXI_HP_2 ports.
This is the final block design diagram as shown in Vivado. The output of the AXI DMA MM2S is connected to the AXIS FIFO and then back to the AXI DMA S2MM.
In order to control the AXI DMA in the FPGA from our application, we can use the dma library from PYNQ. Without this library, if you want to use the AXI DMA under Linux, you have to understand the Linux kernel module. The PYNQ library provides abstractions for using the AXI DMA from Python.
First, we need to create DMA, DMA send channel, and DMA receive channel objects.
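A minimal sketch of this step is shown below. It assumes the DMA instance in the block design is named `axi_dma_0` and that the bitstream file is called `design_1.bit`; both names are placeholders, so adjust them to your design. The helper `open_dma_channels` is hypothetical and simply picks the objects out of a loaded overlay.

```python
def open_dma_channels(overlay, ip_name="axi_dma_0"):
    """Return (dma, send_channel, recv_channel) for a loaded overlay.

    `ip_name` is an assumed instance name; use the one from your block design.
    """
    dma = getattr(overlay, ip_name)
    # sendchannel drives MM2S (DRAM to stream), recvchannel drives S2MM.
    return dma, dma.sendchannel, dma.recvchannel


# On the board (file name is a placeholder):
#   from pynq import Overlay
#   ol = Overlay("design_1.bit")
#   dma, dma_send, dma_recv = open_dma_channels(ol)
```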
We define the maximum number of 64-bit data words that can be processed in a single DMA transfer. The maximum size of a single DMA transfer is 67,108,863 bytes (2^26 - 1, from the 26-bit buffer length register). We divide this by 8 because our memory-mapped data width is 64 bits.
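The arithmetic can be written out as follows. The variable names are illustrative; the 26-bit length register and the 64-bit data width are the values chosen in the IP configuration above.

```python
BUFFER_LENGTH_BITS = 26      # Width of Buffer Length Register
BYTES_PER_WORD = 64 // 8     # 64-bit memory-mapped data width

# Largest byte count the length register can hold.
max_bytes = (1 << BUFFER_LENGTH_BITS) - 1

# Largest whole number of 64-bit words that fits in one transfer.
data_size = max_bytes // BYTES_PER_WORD

print(max_bytes)   # 67108863
print(data_size)   # 8388607
```

In practice a much smaller transfer is enough for a first test, and the buffers must also fit in the contiguous memory available on the board.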
We will read some data from DRAM, and write to AXIS FIFO.
The first step is to allocate the buffer. We use the allocate() function to allocate the buffer, and a NumPy dtype to specify the element type, which is 64-bit unsigned integer in this case.
The array can be used like any other NumPy array. We can write some test data to the array. Later the data will be transferred by the DMA to the FIFO.
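As a sketch, allocation and test data could look like this. The helper `make_input_buffer` is hypothetical; it takes the allocator as a parameter so the same code can be tried on a PC with a NumPy stand-in. The counting pattern is just one possible choice of test data.

```python
import numpy as np


def make_input_buffer(allocate_fn, n_words):
    """Allocate a uint64 buffer and fill it with the values 0..n_words-1."""
    buf = allocate_fn(shape=(n_words,), dtype=np.uint64)
    buf[:] = np.arange(n_words, dtype=np.uint64)
    return buf


# On the board, pass PYNQ's allocator so the buffer is physically contiguous:
#   from pynq import allocate
#   input_buffer = make_input_buffer(allocate, 1024)
# Off the board, np.zeros accepts the same keyword arguments:
input_buffer = make_input_buffer(np.zeros, 16)
```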
Let’s check the contents of the array.
Now we are ready to carry out AXI DMA transfer from DRAM to AXIS FIFO.
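A minimal sketch of the send step, assuming `send_channel` is the DMA's `sendchannel` object and `buffer` was created with PYNQ's `allocate()`. The wrapper name `send_to_stream` is made up for illustration.

```python
def send_to_stream(send_channel, buffer):
    """Start an MM2S transfer of `buffer` and block until it completes."""
    send_channel.transfer(buffer)  # program the channel and start the transfer
    send_channel.wait()            # return once the DMA reports completion


# On the board:
#   send_to_stream(dma_send, input_buffer)
```

The amount of data sent this way should not exceed what the AXIS FIFO can hold, since nothing drains the FIFO until the receive transfer is started.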
Let’s read the data back from AXIS FIFO, and write to DRAM. We will prepare an empty array before reading data back from FIFO.
Let’s check the contents of the array to make sure it is empty.
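One way to prepare and check the receive buffer is sketched below. The helper `make_output_buffer` is hypothetical, and the explicit zero fill is a precaution so that the check does not depend on what the allocator returns.

```python
import numpy as np


def make_output_buffer(allocate_fn, n_words):
    """Allocate a uint64 buffer and clear it so stale data cannot be mistaken
    for received data."""
    buf = allocate_fn(shape=(n_words,), dtype=np.uint64)
    buf[:] = 0
    return buf


# On the board:
#   from pynq import allocate
#   output_buffer = make_output_buffer(allocate, 1024)
# Off the board, np.empty stands in for the allocator:
output_buffer = make_output_buffer(np.empty, 16)
print(not output_buffer.any())  # True when every element is zero
```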
Now we are ready to carry out AXI DMA transfer from AXIS FIFO to DRAM.
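The receive step mirrors the send step, but uses the `recvchannel` object. As before, `receive_from_stream` is an illustrative wrapper name, and `buffer` is assumed to come from PYNQ's `allocate()`.

```python
def receive_from_stream(recv_channel, buffer):
    """Start an S2MM transfer into `buffer` and block until it completes."""
    recv_channel.transfer(buffer)  # tell the DMA where to write incoming data
    recv_channel.wait()            # return once the stream data has landed


# On the board:
#   receive_from_stream(dma_recv, output_buffer)
```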
Let’s check the contents of the array after DMA transfer, and compare it with the original data.
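A simple comparison can be done with NumPy, since the buffers behave like ordinary arrays. The function name `buffers_match` is made up for this sketch.

```python
import numpy as np


def buffers_match(sent, received):
    """Return True if both arrays have the same shape and contents."""
    return bool(np.array_equal(sent, received))


# On the board:
#   print(buffers_match(input_buffer, output_buffer))
# Stand-in data for trying the function off the board:
a = np.arange(8, dtype=np.uint64)
print(buffers_match(a, a.copy()))
```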
Don’t forget to free the memory buffers to avoid memory leaks!
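A short sketch of the cleanup, assuming the buffers were created with PYNQ's `allocate()`, whose buffers provide a `freebuffer()` method. The helper `free_buffers` is hypothetical.

```python
def free_buffers(*buffers):
    """Release the contiguous memory behind each PYNQ buffer."""
    for buf in buffers:
        buf.freebuffer()


# On the board:
#   free_buffers(input_buffer, output_buffer)
```

Do not use a buffer again after it has been freed.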
This video contains detailed steps for making this project.
In this tutorial, we covered some of the basics of AXI DMA with the PYNQ framework.