Hi,
I'm an AWS F1 user, and I'm trying to run the PCIe Write Combining example on the cl_dram_dma example design:
https://github.com/awslabs/aws-fpga-app-notes/tree/master/Using-PCIe-Write-Combining
I'm experimenting with wc_perf.c to write a single line (16 DWORDs) from the host to DDR A. According to the app note above, running the example with the command below:
$ sudo ./wc_perf
followed by:
$ sudo fpga-describe-local-image -S 0 -C
outputs the following:
DDR0
write-count=16
read-count=0
whereas running with the "-w" option:
$ sudo ./wc_perf -w
followed by:
$ sudo fpga-describe-local-image -S 0 -C
outputs the following:
DDR0
write-count=1
read-count=0
That is, with the -w (write-combining) option the write count should drop from 16 to 1. Quoting from the notes:
"The -w option tells wc_perf to use WC, and the number of write data beats was reduced from 16 to 1. This is the reason why writing a WC region with small operations is faster, because they are accumulated into larger chunks using a 64 byte buffer located in the CPU core bus interface (BIU). This is also the reason why it cannot be used for all accesses."
However, when I run the experiment, my results differ from those described in the notes. Whether I run
"sudo ./wc_perf -w"
or
"sudo ./wc_perf"
followed by
"sudo fpga-describe-local-image -S 0 -C"
I get the same write-count of 1600:
DDR0
write-count=1600
read-count=0
DDR1
I understand why the count is 1600 rather than 16 (num_of_passes is 100 in wc_perf.c), but with the "-w" option I expected a count of 100, i.e. 16 times lower than without "-w". Am I doing something wrong in the experiment?
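
To make the numbers I expected concrete, here is a minimal sketch of the arithmetic I'm assuming. The helper expected_write_beats is hypothetical (it is not part of wc_perf.c). It assumes that each uncombined DWORD write produces one write data beat, and that with WC each full 64-byte buffer is flushed as a single beat, as the app note describes:

```c
#include <stdint.h>

/* Hypothetical helper, not part of wc_perf.c.
 * Returns the write data beats I would expect the DDR counter to report
 * after `passes` passes of `dwords_per_pass` 4-byte writes.
 * Assumptions: one beat per DWORD without WC; with WC, the CPU flushes
 * 64-byte write-combining buffers and each flush counts as one beat. */
uint64_t expected_write_beats(uint64_t passes, uint64_t dwords_per_pass, int use_wc)
{
    const uint64_t dword_bytes = 4;
    const uint64_t wc_buffer_bytes = 64;

    if (!use_wc) {
        /* Every DWORD write reaches the FPGA as its own beat. */
        return passes * dwords_per_pass;
    }

    /* Round up to whole 64-byte buffers per pass. */
    uint64_t bytes_per_pass = dwords_per_pass * dword_bytes;
    uint64_t beats_per_pass = (bytes_per_pass + wc_buffer_bytes - 1) / wc_buffer_bytes;
    return passes * beats_per_pass;
}
```

With 100 passes of 16 DWORDs, this gives 1600 without WC and 100 with WC, which is the reduction I expected to see but did not.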