WARP Project Forums - Wireless Open-Access Research Platform


  • Index
  •  » WARP Hardware
  •  » fast transfers between a user pcore and the 2GB memory (Virtex 4)

#1 2011-Apr-19 18:49:08

vladb38
Member
From: USC
Registered: 2010-Nov-30
Posts: 10

fast transfers between a user pcore and the 2GB memory (Virtex 4)

We are trying to push WARPLab into the real-time domain: we would like to send data over the air in real time, without initial buffering and without a limit on the maximum size of the data sent. We have identified two bottlenecks in doing this:

1) The network interface currently operates at 100 Mbps. Is there any hardware-specific reason why it could not work at 1000 Mbps? For the ML405, Xilinx has developed a design that works with a similar chip and uses the MPMC's DMA to push data through at rates as high as 990 Mbps, so unless the chip on the WARP board is very different, the same performance could be expected.

2) The data transfers between the 2GB memory and the System Generator-generated pcores are somewhat slow. The problem seems to be that Sysgen instantiates the circuits with 32-bit PLB slave interfaces that do not support burst transfers, so the xps_central_dma DMA controller can only write data values one at a time. The fastest xps_central_dma-managed transfer we could achieve was around 27 MB/s (bytes, not bits), while for a 10 MHz waveform we would need to send data to the component at about 40 MB/s. We would like to ask your opinion on the best way to overcome this hurdle. Here are some possible solutions that we are considering:

a) Constructing a PLB slave interface with burst transfer support in System Generator. While this would certainly solve the problem, the complexity would probably be very high.

b) Copying the BRAM interface found in the OFDM model into our model, so that the pcore can receive data from BRAM, and constructing a control interface using the normal PLB connection. Conceptually this would be more complicated, since multiple transfers would occur (a DMA-controlled transfer from the 2GB chip to BRAM, followed by starting the read process from the user's pcore); however, the implementation complexity might be lower. We do not know whether this would achieve the required performance and would like to ask for your opinion on this aspect as well.

While these are the two options we have considered so far, there might be other, more efficient or simpler ways of achieving this fast transfer. We would like to hear your opinion on this. Have we perhaps missed something when considering this problem?
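For reference, the rates quoted above can be put side by side in a short script. This is only a rough sketch: the 4 bytes per sample (16-bit I plus 16-bit Q at 10 MS/s) is an assumption chosen because it reproduces the 40 MB/s figure, the Ethernet numbers are raw line rates with no protocol overhead, and the function names are made up for illustration.

```python
# Rough throughput budget for the numbers quoted in this post.
# Assumption (not stated in the post): 10 MS/s complex samples,
# 16-bit I + 16-bit Q = 4 bytes per sample.

SAMPLE_RATE_HZ = 10e6
BYTES_PER_SAMPLE = 4  # assumed sample format


def required_mbytes_per_s(sample_rate_hz, bytes_per_sample):
    """Sustained rate needed to stream the waveform, in MB/s."""
    return sample_rate_hz * bytes_per_sample / 1e6


def link_mbytes_per_s(link_mbits_per_s):
    """Convert a raw line rate in Mbps to MB/s (ignores protocol overhead)."""
    return link_mbits_per_s / 8.0


def meets_requirement(available_mbytes_per_s, needed_mbytes_per_s):
    """True if the available rate covers the needed rate."""
    return available_mbytes_per_s >= needed_mbytes_per_s


required = required_mbytes_per_s(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)

candidates = {
    "100 Mbps Ethernet (raw)": link_mbytes_per_s(100),
    "1000 Mbps Ethernet (raw)": link_mbytes_per_s(1000),
    "xps_central_dma (measured)": 27.0,  # figure quoted in the post
}

print(f"required: {required:.1f} MB/s")
for name, rate in candidates.items():
    verdict = "ok" if meets_requirement(rate, required) else "too slow"
    print(f"{name}: {rate:.1f} MB/s -> {verdict}")
```

Under these assumptions, both the 100 Mbps link and the measured DMA rate fall short of the requirement, which matches the two bottlenecks listed above.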


 

#2 2011-Apr-19 21:10:39

murphpo
Administrator
From: Mango Communications
Registered: 2006-Jul-03
Posts: 5159

Re: fast transfers between a user pcore and the 2GB memory (Virtex 4)

Interesting stuff. I don't have any concrete solutions (we haven't used the DRAM much in our work). But a few things come to mind:

-WARPLab uses 100M Ethernet to accommodate the xilnet IP stack (which only supports xps_ethernetlite, not xps_ll_temac). The Ethernet chip on the WARP FPGA Board v2.2 is the same Marvell transceiver as in the Xilinx designs. We've used it at 1G. In our experience, the xps_ll_temac in XPS v10 has some issues which cause occasional dropped received frames; later versions seem more robust.

-I agree that implementing a PLB master or burst-capable PLB slave (or any custom PLB interface, really) in Sysgen is a big undertaking.

-Using a bram_block to move data in/out of Sysgen works well; we've had great results doing this in the OFDM design. The BRAM interface is 64 bits wide with no wait states (it can move one 8-byte word per cycle, but only half-duplex). You'd have to figure out the fastest way to move things between the bram_block and MPMC. The xps_bram_if_cntlr (the PLB-to-BRAM interface) appears to support burst mode on its PLB slave interface (per its datasheet). This, driven by the central DMA, might be enough.

-There's a useful EDK tool called the Create and Import Peripheral Wizard. It generates pcore templates with various features, designed to wrap custom logic. This is how we created the radio_controller, for example. The wizard generated the VHDL/Verilog template for a PLB slave with a bunch of software-accessible registers. Sid used this template to design the full controller core (which wraps an SPI controller, and connects the registers to top-level radio board I/O). The wizard can also create templates for PLB masters. We haven't tried this, but have had good luck with the wizard in general. If you use the wizard to generate a custom pcore with faster interfaces (64-bit PLB master, for example), you could then export your Sysgen core as HDL and wrap it in the new pcore. You'd have to replace Sysgen registers with gateways (i.e. top-level ports) that would then connect to registers instantiated in the new pcore template.

-You may consider looking into the SDMA interface to the MPMC. The SDMA provides a LocalLink interface directly into the MPMC, bypassing the PowerPC's PLB entirely. This is the same interface Xilinx uses to connect the TEMAC to the MPMC (the TEMAC-MPMC link is LocalLink; both cores have PLB slave interfaces for control register access). There's no automated way to build LocalLink interfaces in Sysgen, but the LocalLink spec is fairly simple and Xilinx has lots of examples (like xapp691). The spec is, however, well hidden on Xilinx's site. Google found an easier copy. The Xilinx Aurora user guide (virtex_4fx_aurora_8b10b_ug061.pdf) also has a good discussion of LocalLink (chapter 3).
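To put the bram_block numbers above in context, here is a back-of-the-envelope sketch of the peak rate of a 64-bit, zero-wait-state BRAM port. The 80 MHz clock is purely a placeholder (the thread does not state the clock the port runs at), the 40 MB/s target is the figure from the first post, and the function names are hypothetical.

```python
# Peak bandwidth of a BRAM port that moves one 8-byte word per cycle.
# ASSUMED_CLOCK_HZ is a hypothetical value; substitute the real port clock.

BYTES_PER_CYCLE = 8          # 64-bit interface, no wait states (from the post)
ASSUMED_CLOCK_HZ = 80e6      # placeholder, not taken from the thread
REQUIRED_MBYTES_PER_S = 40.0 # target rate quoted in the first post


def bram_peak_mbytes_per_s(clock_hz, bytes_per_cycle=BYTES_PER_CYCLE):
    """Peak one-direction rate of the port, in MB/s."""
    return clock_hz * bytes_per_cycle / 1e6


def port_utilization(required_mbytes_per_s, clock_hz,
                     bytes_per_cycle=BYTES_PER_CYCLE):
    """Fraction of port cycles needed to sustain the required rate."""
    return required_mbytes_per_s / bram_peak_mbytes_per_s(clock_hz,
                                                          bytes_per_cycle)


def min_clock_hz(required_mbytes_per_s, bytes_per_cycle=BYTES_PER_CYCLE):
    """Lowest clock at which the port alone could sustain the rate."""
    return required_mbytes_per_s * 1e6 / bytes_per_cycle


peak = bram_peak_mbytes_per_s(ASSUMED_CLOCK_HZ)
util = port_utilization(REQUIRED_MBYTES_PER_S, ASSUMED_CLOCK_HZ)
print(f"peak: {peak:.0f} MB/s, utilization at 40 MB/s: {util:.1%}")
print(f"minimum clock for 40 MB/s: {min_clock_hz(REQUIRED_MBYTES_PER_S) / 1e6:.0f} MHz")
```

This only bounds the BRAM side. As noted above, the open question is how fast data can be moved between the bram_block and the MPMC, so the achievable rate would still need to be measured.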


 

#3 2011-Apr-21 11:38:29

riveridea
Member
From: Tennessee Tech Univ.
Registered: 2010-Oct-01
Posts: 100

Re: fast transfers between a user pcore and the 2GB memory (Virtex 4)

From my earlier investigation, if we want to use the 1 Gbps Ethernet interface to transfer TCP/IP messages, the xps_ll_temac needs to work in interrupt mode, which is required by the lwIP TCP/IP stack. The current OFDM reference design uses the xps_ll_temac in polling mode.
I am not sure whether anyone has succeeded in integrating lwIP into WARPLab or the OFDM reference design.


 