# Part 2: Direct Memory Access

## Objective

This tutorial introduces the AXI DMA and shows how to get started using it.

## References

* AXI DMA v7.1 LogiCORE IP Product Guide, <https://docs.amd.com/viewer/book-attachment/ePquvyIHSl7mKfi0ecEn4Q/CPPzqKuxCU1Q4a3rXOA1jw>
* Tutorial: PYNQ DMA (Part 1: Hardware design), <https://discuss.pynq.io/t/tutorial-pynq-dma-part-1-hardware-design/3133>
* Tutorial: PYNQ DMA (Part 2: Using the DMA from PYNQ), <https://discuss.pynq.io/t/tutorial-pynq-dma-part-2-using-the-dma-from-pynq/3134>

## Source Code

This repository contains all the code required to follow this tutorial.

{% embed url="<https://github.com/weenslab/pynq102/tree/main>" %}

***

## 1. Hardware Design

### 1.1. Direct Memory Access

Direct memory access (DMA) enables certain hardware subsystems to access system memory without relying on the CPU.

When doing data transfer without DMA, the CPU will be fully occupied for the entire duration of the read or write operation and is thus unavailable to perform other work.

With DMA, the CPU first initiates the transfer, then it does other operations while the transfer is in progress, and it finally receives an interrupt from the DMA controller when the operation is done.

<figure><img src="/files/WGrm8CeHJc1PA2rWWuXQ" alt="" width="375"><figcaption></figcaption></figure>

### 1.2. Xilinx DMA

The Xilinx library contains several DMA IPs, located either in the PS or in the PL:

* The **PS DMA controller (DMAC)** provides a flexible DMA engine that can provide moderate levels of throughput with little PL logic resource usage. The DMAC resides in the PS and must be programmed via DMA instructions residing in memory, typically prepared by a CPU.
* The **AXI Direct Memory Access (AXI DMA)** IP provides high-bandwidth direct memory access between memory and AXI4-Stream-type target peripherals.
* The **AXI Central Direct Memory Access (AXI CDMA)** provides high-bandwidth Direct Memory Access (DMA) between a memory-mapped source address and a memory-mapped destination address using the AXI4 protocol.
* The **AXI Video Direct Memory Access (AXI VDMA)** core is a soft AMD IP core that provides high-bandwidth direct memory access between memory and AXI4-Stream type video target peripherals.

In this tutorial we will focus only on AXI DMA.

### 1.3. System Design

The figure below shows the AXI DMA IP. This DMA allows you to stream data from memory, specifically PS DRAM, to an AXI4-Stream interface. This is called the READ channel of the DMA. The DMA can also receive data from an AXI4-Stream interface and write it to PS DRAM. This is the WRITE channel.

<figure><img src="/files/CsSIWFiZfKheMMPhfEGs" alt="" width="297"><figcaption></figcaption></figure>

Reads and writes to PS DRAM pass through the high-performance (HP) AXI ports, the AMBA interconnect, and the DRAM controller before reaching the DRAM itself, which sits outside the Zynq chip.

<figure><img src="/files/d16dPDlvbJsiB3KQKtCu" alt=""><figcaption></figcaption></figure>

For the READ channel, AXI DMA reads memory-mapped data from DRAM via the `M_AXI_MM2S` port. MM2S stands for memory-mapped-to-stream. Then, the data will be streamed out via the `M_AXIS_MM2S` port.

For the WRITE channel, AXI DMA receives stream data from an AXI-Stream IP via the `S_AXIS_S2MM` port. S2MM stands for stream-to-memory-mapped. Then, the data will be written to the DRAM via the `M_AXI_S2MM` port.

To control the AXI DMA operations, we use the `S_AXI_LITE` port. Through it we can program the AXI DMA with the source address, the destination address, and the number of bytes to be transferred.
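As a rough illustration of what travels over `S_AXI_LITE`, the sketch below lists the register writes that would start one MM2S transfer in direct register (non-scatter-gather) mode. The offsets are my reading of the register map in the AXI DMA product guide and should be checked against the guide for your IP version. The helper name `mm2s_start_sequence` and the example address are hypothetical, and the function only builds a list of writes; it does not touch hardware.

```python
# Register offsets for direct register mode, as described in the AXI DMA
# product guide. Treat them as assumptions and verify them for your IP version.
MM2S_DMACR = 0x00    # MM2S control register
MM2S_DMASR = 0x04    # MM2S status register
MM2S_SA = 0x18       # MM2S source address, lower 32 bits
MM2S_SA_MSB = 0x1C   # MM2S source address, upper 32 bits
MM2S_LENGTH = 0x28   # MM2S transfer length in bytes

DMACR_RUN_STOP = 0x1  # bit 0 of DMACR: 1 = run


def mm2s_start_sequence(src_addr, length_bytes, length_width=26):
    """Return the (offset, value) writes that start one MM2S transfer.

    Hypothetical helper: it only builds the list of writes.
    """
    max_len = (1 << length_width) - 1
    if not 0 < length_bytes <= max_len:
        raise ValueError("length must be in 1..{}".format(max_len))
    return [
        (MM2S_DMACR, DMACR_RUN_STOP),                   # start the channel
        (MM2S_SA, src_addr & 0xFFFFFFFF),               # lower address bits
        (MM2S_SA_MSB, (src_addr >> 32) & 0xFFFFFFFF),   # upper address bits
        (MM2S_LENGTH, length_bytes),                    # writing LENGTH launches the transfer
    ]


# Example with a made-up 40-bit source address
for offset, value in mm2s_start_sequence(0x1234567000, 4096):
    print("write 0x{:02X} <- 0x{:08X}".format(offset, value))
```

The PYNQ `dma` driver performs the equivalent register accesses for us, so the software section of this tutorial never uses these offsets directly.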

The AXI DMA can be configured in the IP configuration dialog.

<figure><img src="/files/7XKL6L8nCBE1Ct9yxAkY" alt=""><figcaption></figcaption></figure>

For this project we need to do the following:

* **Uncheck Enable Scatter Gather Engine** to disable Scatter Gather.
* Set the **Width of Buffer Length Register** to **26**.
* Change the **Address Width** to **40**. In this example, I will connect the DMA to the PS memory, which is 40-bit for Zynq UltraScale+. You can set this to 32-bit if you are connecting to a Zynq-7000.
* Set the **Memory Map Data Width** to **64** to match the HP port.
* Set the **Stream Data Width** to **64**.

The AXI DMA master ports need to be connected to the DRAM through the PS, using the PS HP (AXI slave) ports. These ports are not enabled by default. Double-click the Zynq PS block, go to PS-PL Configuration, expand HP Slave AXI Interface, enable S AXI HP0 and S AXI HP2, and set their data width to 64.

<figure><img src="/files/mfZSIdyHsrC0iWpH1sdv" alt=""><figcaption></figcaption></figure>

Internally, the four HP ports reach the PS memory through two connections. HP0 and HP1 share a switch to one port, and HP2 and HP3 share a switch to the other. The difference may not be noticeable in this example or in some other designs, but when only two HP ports are required, it is more efficient to use ports that don’t share a switch, i.e. HP0 with HP2, or HP1 with HP3.

After these ports are enabled, you can see them on the IP block.

<figure><img src="/files/ACmB9PxfkJ6NedxvCP2t" alt="" width="520"><figcaption></figcaption></figure>

Finally, this is our system block diagram, which consists of PS, AXI DMA, and AXIS FIFO. Here, we use the master AXI\_GP\_0 to connect to the AXI DMA control port via the AXI Interconnect. The AXI DMA data ports to DRAM are connected via the AXI\_HP\_0 and AXI\_HP\_2 ports.

<figure><img src="/files/611CrBfcxwKn419p9ajP" alt=""><figcaption></figcaption></figure>

This is the final block design diagram as shown in Vivado. The output of the AXI DMA MM2S is connected to the AXIS FIFO and then back to the AXI DMA S2MM.

<figure><img src="/files/fVtk734QK4k7OOEuGWt0" alt=""><figcaption></figcaption></figure>

## 2. Software Design

### 2.1. Hardware-Software Partition

To control the AXI DMA in the FPGA from our application, we can use the `dma` module of the PYNQ library. Without PYNQ, using AXI DMA under Linux would require working with a Linux kernel module. The PYNQ library provides abstractions that let us use AXI DMA from Python.

<figure><img src="/files/51had7iFnkG3sGnL1Q1g" alt=""><figcaption></figcaption></figure>

### 2.2. User Application

First, we need to create DMA, DMA send channel, and DMA receive channel objects.

```python
# Access to AXI DMA (assumes `overlay` is an already-loaded pynq.Overlay)
dma = overlay.axi_dma_0
dma_send = dma.sendchannel
dma_recv = dma.recvchannel
```

We define the maximum number of 64-bit data words that can be processed in a single DMA transfer. With the buffer length register set to 26 bits, a single transfer can move at most 2^26 - 1 = 67,108,863 bytes. We divide this by 8 and round down because each element of our buffers is a 64-bit (8-byte) word.

```python
# Maximum data that AXI DMA can send in one transaction is 67108863 bytes
# floor(67108863 bytes / 8) = 8388607 words (64-bit)
# We divide by 8 because we use the uint64 data type
data_size = 8388607
```
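The arithmetic above can be reproduced for other register widths and word sizes. The following is a minimal sketch; the function names `max_transfer_words` and `chunk_ranges` are hypothetical helpers and not part of PYNQ. The second helper shows one possible way to split a larger buffer into index ranges that each fit in a single transfer.

```python
def max_transfer_words(length_width_bits, word_bytes):
    """Largest whole number of words that fits in one DMA transfer."""
    max_bytes = (1 << length_width_bits) - 1  # largest value the length register can hold
    return max_bytes // word_bytes


def chunk_ranges(total_words, max_words):
    """Yield (start, stop) index pairs, each covering at most max_words words."""
    for start in range(0, total_words, max_words):
        yield start, min(start + max_words, total_words)


print(max_transfer_words(26, 8))   # 8388607
print(list(chunk_ranges(20, 8)))   # [(0, 8), (8, 16), (16, 20)]
```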

#### Read DMA (MM2S)

We will read some data from DRAM and write it to the AXIS FIFO.

The first step is to allocate the buffer. We use the `allocate()` function to allocate the buffer and a NumPy data type to specify its element type, which is an unsigned 64-bit integer in this case.

```python
from pynq import allocate
import numpy as np

# Allocate physical memory for AXI DMA MM2S
input_buffer = allocate(shape=(data_size,), dtype=np.uint64)
```

The array can be used like any other NumPy array. We can write some test data to the array. Later the data will be transferred by the DMA to the FIFO.

```python
# Write data to physical memory
for i in range(data_size):
    input_buffer[i] = i + 0xcafe000000000000
```
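Because the buffer behaves like a NumPy array, the same test pattern can also be written with one vectorized assignment, which is typically much faster than a Python loop over millions of elements. This is a sketch under stated assumptions: it uses a plain `np.zeros` array and a small size `n` as stand-ins for the buffer returned by `allocate()` and for `data_size`.

```python
import numpy as np

n = 16  # small stand-in size; the tutorial uses data_size
buf = np.zeros(n, dtype=np.uint64)  # stand-in for a buffer returned by allocate()

# Fill the whole buffer in place with one vectorized assignment
buf[:] = np.arange(n, dtype=np.uint64) + np.uint64(0xcafe000000000000)

print(hex(int(buf[0])), hex(int(buf[n - 1])))
```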

Let’s check the contents of the array.

```python
# Check the written data
for i in range(10):
    print(hex(input_buffer[i]))
```

```
0xcafe000000000000
0xcafe000000000001
0xcafe000000000002
0xcafe000000000003
0xcafe000000000004
0xcafe000000000005
0xcafe000000000006
0xcafe000000000007
0xcafe000000000008
0xcafe000000000009
```

Now we are ready to carry out the AXI DMA transfer from DRAM to the AXIS FIFO.

```python
# Do AXI DMA MM2S transfer
dma_send.transfer(input_buffer)
```

#### Write DMA (S2MM)

Let’s read the data back from the AXIS FIFO and write it to DRAM. We will prepare an empty array before reading the data back from the FIFO.

```python
# Allocate physical memory for AXI DMA S2MM
output_buffer = allocate(shape=(data_size,), dtype=np.uint64)
```

Let’s check the contents of the array to make sure it is empty.

```python
# Check the memory content
for i in range(10):
    print(hex(output_buffer[i]))
```

```
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
```

Now we are ready to carry out the AXI DMA transfer from the AXIS FIFO to DRAM.

```python
# Do AXI DMA S2MM transfer, then wait for both channels to finish
dma_recv.transfer(output_buffer)
dma_send.wait()
dma_recv.wait()
```

Let’s check the contents of the array after DMA transfer, and compare it with the original data.

```python
# Check the memory content after DMA transfer
for i in range(10):
    print(hex(output_buffer[i]))
```

```
0xcafe000000000000
0xcafe000000000001
0xcafe000000000002
0xcafe000000000003
0xcafe000000000004
0xcafe000000000005
0xcafe000000000006
0xcafe000000000007
0xcafe000000000008
0xcafe000000000009
```

```python
# Compare arrays
print("Arrays are equal: {}".format(np.array_equal(input_buffer, output_buffer)))
```

```
Arrays are equal: True
```
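If the comparison ever reports `False`, it helps to know where the data first diverges. The following is a minimal sketch; `first_mismatch` is a hypothetical helper, and plain NumPy arrays stand in for the DMA buffers.

```python
import numpy as np


def first_mismatch(a, b):
    """Return the index of the first differing element, or None if all match."""
    if a.shape != b.shape:
        raise ValueError("shapes differ")
    diff = np.flatnonzero(a != b)  # indices where the two arrays disagree
    return int(diff[0]) if diff.size else None


sent = np.arange(8, dtype=np.uint64)
received = sent.copy()
received[5] = 0  # simulate one corrupted word

print(first_mismatch(sent, sent.copy()))
print(first_mismatch(sent, received))
```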

Don’t forget to free the memory buffers to avoid memory leaks!

```python
# Delete buffer to prevent memory leak
del input_buffer, output_buffer
```

## 3. Full Step-by-Step Tutorial

This video contains detailed steps for making this project.

{% embed url="<https://youtu.be/StffNGVNFFI>" %}

## 4. Conclusion

In this tutorial, we covered some of the basics of AXI DMA with the PYNQ framework.

