I have seen many videos explaining DMA, and this is by far the best explanation. Continue doing the good work sir.
It is one of the best videos about the DMA controller I have ever seen. Thanks
Very nicely explained..!!! Thanks Vipin.
Very nicely explained!
In the C2H channel, as per page 22 of UG 195, it’s the application that initiates the transfer. Assuming that it’s the card (FPGA) that wants to pass data, how does it happen via the C2H DMA channels? (The application does not know when exactly the card wants to send data.)
Thank you for the tutorial. Question: When doing the DMA transfer as you show with the Zynq through either the GP or HP slave interfaces, how is the DMA access to DDR assured to be coherent with the CPUs?
Hi Sir, thanks for this tutorial. Do you have a video about DMA SG mode?
Hello Vipin, thank you for the explanation.
How do we trigger 2 separate AXI DMAs which are connected to 2 separate DDR chips?
Thanks for the amazing lecture. I have one doubt though! For non-memory-mapped devices, why don't we give a destination address while configuring the DMAC? I know IO devices are on a separate bus from the CPU and DMAC, so the address won't be an address from the system bus address space. But IO devices might have lots and lots of registers, and if we just want to program a few, how do we say that to the DMAC without passing a destination address? Since we have the direction set and the DMAC understands the interconnect between itself and the IO device, it should be able to use that destination address in the IO device address space. Thanks in advance for your answer!
Hi sir, any idea on DMA register re-initialization?
What about the IOMMU? In which configuration is it present?
Wonderful explanation. I had a doubt though: is the DMA controller a centralized one for all peripherals of the system, or are there multiple DMA controllers? How is it usually done in a modern-day system?
Sir, I have heard that exception handling is difficult in DMA data transfer, can you please explain that... Thank you sir, this video gave me a clear understanding of the concept... hope you will reply 🌝🙌🏻
Hi Vipin,
Very well explained. I have a doubt. You said that when the DMA is taking care of the data transfer between the memory and the peripheral/device, the CPU can do other tasks in parallel. But the bus (address/data/control) can only be used by a single device. Then how can the CPU work on a different task in parallel?
Why is there only one system bus? In the Zynq UltraScale+ architecture we can directly connect the PS with peripherals using the M_AXI_GPx ports, and when it comes to the DMA connection to the DRAM controller, we use the S_AXI_HPx ports. Therefore we can use 2 system buses, where the PS and the DMA can communicate with peripherals at the same time. Am I right on this point?
It is a general system architecture. But in Zynq also, if the source/destination for two transactions is DDR, the effect will be like having a single bus, since there is a single DDR interface from the PS to the external DDR and memory arbitration is required. Arbitration happens within the PS.
@@Vipinkmenon Thanks for replying. Can't we transfer data from the heap of the PS and from the DDR through DMA simultaneously using the above ports I mentioned? Can you please clarify this more? Thanks once again.
Usually the heap and stack are logical partitions in the main memory (RAM) itself. But in SDK we can configure other memories (BRAM or other external memory) to be used as the heap. In almost all cases, though, we will be using DRAM for all partitions of an ELF file. Still, DMA will give good performance in most cases: when both the processor core and the DMA want to access DRAM, whatever is required by the processor will usually be in the cache, so there won't be any memory congestion. Similarly, when multiple memory requests come in, the memory controller is intelligent enough to apply techniques like out-of-order command issue, burst transfers, etc., so that maximum performance is extracted from the DRAM.
On boards like the VC709, where there are physically two separate buses to the DRAM DIMMs, we can use memory controllers with 2 channels and can have two DMAs running without any memory arbitration. Or the system should support DRAMs with multiple ranks. I am not sure FPGA boards come with such DIMMs, but on PCs this is common. Each rank will have a separate bus for data transfer.
@@Vipinkmenon Let's say that we have some data (say, array A) loaded from DRAM into the PS cache, and we have another array (array B) in DRAM. So I'm using DMA to send array B to the PL, and at the same time I need to send array A from the PS cache to the PL. Will that be possible?
Very good info, I got clarity. In the same way, can you briefly explain the PCIe connection to the processor?
That I will discuss later with some other board. The Zedboard doesn't have any PCIe interface. If you need sample code for the VC707 or VC709, you can check my paper: warwick.ac.uk/fac/sci/eng/staff/saf/papers/fpl2014-vipin.pdf
Sir, how can a single system bus support all the AXI protocols (Lite, Stream, Full) at the same time during DMA? Or are there individual buses for each?
Good intro.
Hey Vipin, can you please send the presentation on DMA?
www.slideshare.net/vipinkmenon/dma-237938873
The DMA module is part of the PS (Zynq), not the PL (FPGA), right? Is there any code example of doing burst read/write using DMA?
There is a DMA controller in the PS. But for this tutorial we are using an IP, which will be implemented in the PL. Source codes are in the video description of the subsequent videos.
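For reference, a minimal sketch of a polled simple transfer with the standalone XAxiDma driver. This assumes the AXI DMA IP is configured in direct register (simple) mode; the device ID macro and buffer size below are placeholders taken from a typical xparameters.h, not from the tutorial sources.

#include "xaxidma.h"
#include "xparameters.h"
#include "xil_cache.h"

#define DMA_DEV_ID  XPAR_AXIDMA_0_DEVICE_ID  /* placeholder: check your xparameters.h */
#define BUF_LEN     1024                     /* example transfer size in bytes */

static XAxiDma AxiDma;
static u8 TxBuf[BUF_LEN] __attribute__((aligned(64)));

int dma_send_example(void)
{
    /* Look up the AXI DMA configuration and initialize the driver instance */
    XAxiDma_Config *Cfg = XAxiDma_LookupConfig(DMA_DEV_ID);
    if (!Cfg || XAxiDma_CfgInitialize(&AxiDma, Cfg) != XST_SUCCESS)
        return XST_FAILURE;

    /* Flush the buffer from the data cache so the DMA reads valid data from DRAM */
    Xil_DCacheFlushRange((UINTPTR)TxBuf, BUF_LEN);

    /* Start a memory-to-stream (MM2S) transfer */
    if (XAxiDma_SimpleTransfer(&AxiDma, (UINTPTR)TxBuf, BUF_LEN,
                               XAXIDMA_DMA_TO_DEVICE) != XST_SUCCESS)
        return XST_FAILURE;

    /* Poll until the DMA engine finishes the transfer */
    while (XAxiDma_Busy(&AxiDma, XAXIDMA_DMA_TO_DEVICE))
        ;

    return XST_SUCCESS;
}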
Please tell me: what about cache synchronization after a DMA operation?
If the data for the DMA transfer is stored in memory under processor control, we should do a dcache flush before starting the DMA. If the processor is reading data after a device-to-memory DMA, we should do a dcache invalidate before reading.
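In the Xilinx standalone BSP these are the Xil_DCacheFlushRange and Xil_DCacheInvalidateRange calls from xil_cache.h. A minimal sketch (tx_buf, rx_buf and the lengths are hypothetical names for your DMA buffers):

#include "xil_cache.h"

/* Before a memory-to-device DMA: flush dirty cache lines to DRAM
   so the DMA engine reads what the CPU actually wrote */
Xil_DCacheFlushRange((UINTPTR)tx_buf, tx_len);
/* ... start the memory-to-device DMA here ... */

/* After a device-to-memory DMA completes: invalidate the stale cached
   copies so the CPU re-reads the fresh data from DRAM */
Xil_DCacheInvalidateRange((UINTPTR)rx_buf, rx_len);
/* ... now the processor can safely read rx_buf ... */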
@@Vipinkmenon Are you aware of the IOMMU in the DMA case for the devices? How are these connected with the IOMMU to transfer the data to memory?
The Zynq 7000 on the Zedboard has no IOMMU. Only UltraScale devices have an SMMU, which is similar to an IOMMU. It might be possible to build your own address remapping IP. Check this answer in the Xilinx forums: forums.xilinx.com/t5/Embedded-Development-Tools/AXI-address-remapping-in-IP-Integrator-aliasing-AXI-slave-to-two/td-p/445546
Thanks!
Can I have a PDF version of the above video?
www.slideshare.net/vipinkmenon/dma-237938873
How can I cite your work?