Hi
Has anyone had any issues using the right shift operator on the MicroBlaze processors? This appears to be a simple problem; perhaps I am missing something?
u32 msb = B8E8561A;
u32 lsb = FCF00000;

xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24) );
xil_printf("%02X:" , (u8)((msb & 0x00FF0000) >> 16) );
xil_printf("%02X:" , (u8)((msb & 0x0000FF00) >> 8 ) );
xil_printf("%02X:" , (u8)((msb & 0x000000FF) >> 0 ) );
xil_printf("%02X:" , (u8)((lsb & 0xFF000000) >> 24) );
xil_printf("%02X"  , (u8)((lsb & 0x00FF0000) >> 16) );
Prints 5C:74:2B:1A:7E:78.
What is going on? You'll notice the least significant byte of the msb u32 (1A) is correct. Where do the other values come from?
Is this something to do with the endianness of the MicroBlaze? According to the datasheet it supports a logical shift right (srl).
We have never had any issues with right shifting. If you look, every time you set up a node via WARPNet, you use right shift to set the IP address and this works.
I'm not sure if you are using the exact code above, but I'm surprised it compiles if you pasted it 'as is'. In C, if you don't put '0x' in front of a hexadecimal number, the compiler interprets the value as an identifier. If it compiles anyway, the linker must be resolving that identifier to something, but I don't know what value that would actually be. Instead try:
u32 msb = 0xB8E8561A;
u32 lsb = 0xFCF00000;
I would guess it's a problem higher in the code, where msb/lsb are set.
I just ran the following in an AXI-based (little-endian) MicroBlaze:
u32 msb = 0x12345678;
u32 lsb = 0x9abc0000;

xil_printf("msb: 0x%02X\n", msb);
xil_printf("lsb: 0x%02X\n\n", lsb);

xil_printf("%02X:", (u8)( (msb & 0xFF000000) >> 24));
xil_printf("%02X:", (u8)( (msb & 0x00FF0000) >> 16));
xil_printf("%02X:", (u8)( (msb & 0x0000FF00) >> 8));
xil_printf("%02X:", (u8)( (msb & 0x000000FF) >> 0));
xil_printf("%02X:", (u8)( (lsb & 0xFF000000) >> 24));
xil_printf("%02X",  (u8)( (lsb & 0x00FF0000) >> 16));
Which printed:
msb: 0x12345678
lsb: 0x9ABC0000

12:34:56:78:9A:BC
Hi, thanks for the replies. Welsh, you are correct, I missed the '0x' when pasting in to this forum.
Ultimately I am getting the msb and lsb values from a register in hardware, they are not compiled in the source. I do:
//Cast is there just to ensure nothing odd is happening
u32 msb = (u32)get_my_register_msb();
u32 lsb = (u32)get_my_register_lsb();

//Make it accessible by array[] type operator
u8 *msb_ = (u8 *)&msb;
u8 *lsb_ = (u8 *)&lsb;

//Print entire 32 bit number as hex
xil_printf("msb: 0x%08X\nlsb: 0x%08X\n", msb, lsb);

//Method 1
xil_printf("%02X:" , msb_[3]);
xil_printf("%02X:" , msb_[2]);
xil_printf("%02X:" , msb_[1]);
xil_printf("%02X:" , msb_[0]);
xil_printf("%02X:" , lsb_[3]);
xil_printf("%02X\t", lsb_[2]);

//Method 2
xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24));
xil_printf("%02X:" , (u8)((msb & 0x00FF0000) >> 16));
xil_printf("%02X:" , (u8)((msb & 0x0000FF00) >> 8));
xil_printf("%02X:" , (u8)((msb & 0x000000FF) >> 0));
xil_printf("%02X:" , (u8)((lsb & 0xFF000000) >> 24));
xil_printf("%02X\t", (u8)((lsb & 0x00FF0000) >> 16));
Which prints:
msb: 0x12345678
lsb: 0x9ABC0000

12:34:56:78:9A:BC	09:1A:2B:78:4D:5E
The hardware is quite simple: a 64-bit constant set to the value 0x123456789ABC0000, connected to two 32-bit MSB/LSB slices, each of these connected to a 32-bit register.
What is the difference between Method 1 and 2 above?!
Very much appreciate your support.
The easiest way to understand what is happening is to go look at the disassembly. In the Xilinx SDK, if you open the .elf file in the GUI (open up the Debug folder and double click on the .elf file), you can see exactly what is being executed by the processor.
My guess is that, due to some of the optimization flags in the compiler settings and the fact that msb / lsb were not declared as "volatile u32", the compiler believed that it could re-order when it did the & and >> calculations for the second set of prints. If you look through the disassembly, you should be able to understand what is going on (you can find the assembly instructions in the MicroBlaze Reference Guide).
Let us know what you find. Due to some behavior with the linker and the fact that the design is large, we had to turn on the size optimization (-Os) in the 802.11 reference design in order to allow users of the reference design to have space to make modifications. Unfortunately, this can cause weird behavior when the compiler doesn't understand that hardware registers can change value (i.e. variables that should be marked with the 'volatile' keyword).
Hello
I've had a good look at the assembler and come up with the following:
Copied from the .elf file.
volatile u32 msb = (u32)get_my_register_msb();
   17154: b0007e21  imm   32289
   17158: e860089c  lwi   r3, r0, 2204
   1715c: f861001c  swi   r3, r1, 28

xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24));
   17160: e8c1001c  lwi   r6, r1, 28
   17164: a2400018  ori   r18, r0, 24
   17168: 90660041  srl   r3, r6
   1716c: 3252ffff  addik r18, r18, -1
   17170: be32fffc  bneid r18, -4       // 1716c
   17174: 90630041  srl   r3, r3
   17178: b000fffe  imm   -2
   1717c: b9f492bc  brlid r15, -27972   // 438 <xil_printf>
   17180: 10c30000  addk  r6, r3, r0
This looks to me like it is actually only doing the logical right shift (srl) once, or perhaps twice. The loop (bneid r18, -4) jumps back to the previous instruction (addik r18, r18, -1), which decrements the counter from 24, completely missing out the srl r3, r6 instruction.
If you take a look at the numbers, this is verified:
As it should be:
  0x1    2    3    4    5    6    7    8
  0b0001 0010 0011 0100 0101 0110 0111 1000

Shifted right by one bit:
  0x0    9    1    A    2    B    3    C
  0b0000 1001 0001 1010 0010 1011 0011 1100
The latter is very similar to my previous post, apart from the final byte, which can be attributed to the fact that no right shift is applied to the least significant byte (>> 0).
Like you say, I guess this is due to the compiler optimizations. I tried making the u32 msb volatile, but it made no difference.
Turning off optimizations causes the code to overflow the available memory, as expected. Does the 802.11 reference design really use up all of the available MicroBlaze memory? I've seen this post about memory usage.
Has this not caused issues elsewhere in your code? I would have thought >> is used fairly frequently.
Thanks again
Last edited by horace (2014-Sep-04 11:04:13)
What are the definitions of get_my_register_msb() and get_my_register_lsb()?
That is strange. I'm not sure how the compiler is interpreting the code that way.
If you look in the reference design C code, we always code that same statement as:
xil_printf("%02X:" , (u8)((msb >> 24) & 0xFF));
(i.e. the bit masking is done after the shift), which is probably interpreted differently by the compiler. I'm surprised by this as well, but we have not seen any issues using >>.
If you look at the executable.map file in the Debug directory, you can get a sense of the memory usage of the design (you can open it with a text editor or inside the SDK by right-clicking and saying "Open with" --> "Text Editor"). One thing that is helpful is that this file is created even if the design will not actually build, so you can see how much memory you are over and how the memory is used. We have looked at a number of different schemes to reduce the memory footprint but have found that -Os gives us the best result and allows users to extend the design without running into memory issues.
Thanks both for your replies.
I have tried to use the same approach that I've seen elsewhere in the code so get_my_register_msb() is a macro defined in wlan_mac_high.h as:
#define MY_REGISTER_MSB       XPAR_MY_NEW_CORE_AXIW_0_MEMMAP_MY_REGISTER_MSB
#define get_my_register_msb() (Xil_In32(MY_REGISTER_MSB))
and in xparameters.h generated by the bsp:
#define XPAR_MY_NEW_CORE_AXIW_0_MEMMAP_MY_REGISTER_MSB 0x7E21089C
Is this the best way?
I've just tried your method of shifting before masking and it produces the same result with identical assembler. Very odd. I also tried splitting it into three distinct statements, but the compiler just optimizes to the same assembler each time.
It obviously has something to do with the origin of the data, since it works when msb is defined as 0x12345678 at compile time; it only breaks when the value comes from a register. Are there any instances in your code where you right shift a value received from a hardware register by more than 8? I couldn't see any quite the same.
It isn't the end of the world as I can use the array [] operator to access the individual bytes, but I'm confused why this doesn't work.
These observations are concerning. We do lots of shifting/masking in the drivers for our custom cores. We haven't (at least, I think we haven't) been bitten by this issue before.
A few more questions to help us reproduce this:
-Which interconnect is your custom core connected to (mb_shared_axi, mb_low_axi_periph or mb_high_axi_periph)?
-What version of the Xilinx tools are you using?
I've tried the following code in CPU High of v0.95 of the 802.11 ref design, but the output is always as expected. Can you try this as well? This will help rule in/out the custom core you're using. This code uses two registers in the axi_sysmon_adc core. These are read-only registers whose values will change on each read (tracking chip temperature and voltage).
#define REG_LSB 0x41D00200 //axi_sysmon temperature ADC
#define REG_MSB 0x41D00204 //axi_sysmon VCC_INT ADC

#define get_my_register_msb() (Xil_In32(REG_MSB))
#define get_my_register_lsb() (Xil_In32(REG_LSB))

int main() {
    u32 msb = (u32)get_my_register_msb();
    u32 lsb = (u32)get_my_register_lsb();

    xil_printf("\n\n");
    xil_printf("msb: 0x%08X\n", msb);
    xil_printf("lsb: 0x%08X\n", lsb);

    xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24));
    xil_printf("%02X:" , (u8)((msb & 0x00FF0000) >> 16));
    xil_printf("%02X:" , (u8)((msb & 0x0000FF00) >> 8));
    xil_printf("%02X:" , (u8)((msb & 0x000000FF) >> 0));
    xil_printf("%02X:" , (u8)((lsb & 0xFF000000) >> 24));
    xil_printf("%02X\n", (u8)((lsb & 0x00FF0000) >> 16));

    return 0;
}

//Renamed actual main() for testing
int actual_main(){
<snip>...
Right, I've done this twice. Just to check!
-Unzipped a fresh copy of the 0.95 ref design
-Open SDK and point to new SDK_Workspace
-Import all existing projects and set Local Repo to SDK_Workspace/../
-Change wlan_mac_high_ap/src/wlan_mac_ap.c as you've highlighted above. Rename main and replace with simple testing code and #defines.
-Build all (works fine, as expected)
-Program FPGA
-Open terminal and cross your fingers...
msb: 0x00005551
lsb: 0x00009B32

00:00:2A:51:00:00
Same problem. 0x2A is 0x55 shifted right by one.
Since >> is used throughout, you would have thought this would cause the whole reference design to fall over when I compile and run it on my hardware, but it works fine. It's almost as if only changes made locally to the code are affected...?
Versions:
-SDK: 14.7 Build SDK_P.20131013
-Compiler: Eclipse C/C++ Development Tools 8.1.2.201302132326
-ISE: 14.7 (nt64) Build EDK_P.20131013+0
-Simulink: R2013a (version 8.1)
Interconnects:
I attach the custom core to mb_high_axi_periph
-SDK: 14.7 Build SDK_P.20131013
Ah- that's a key difference. We're still using 14.4. We decided the changelogs from 14.4 to 14.7 indicated no improvements big enough to compel a switch before releasing the v1.0 version of the design (most of the changes were related to Zynq support). We'll try this in 14.7 to see if we can reproduce it.
By using the 14.7 tools, we were able to re-create what you were seeing.
After doing some research, it looks like this is a known bug in the 14.7 compiler when using the -Os optimization. If we change the optimization level to -O3 in the 14.7 tools, we get:
xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24));
    2088: b0000002  imm   2
    208c: 30a00080  addik r5, r0, 128
    2090: b000ffff  imm   -1
    2094: b9f4e3a4  brlid r15, -7260    // 438 <xil_printf>
    2098: 64d30018  bsrli r6, r19, 24
which is basically the same as what we had with -Os in 14.4:
xil_printf("%02X:" , (u8)((msb & 0xFF000000) >> 24));
   17520: b0000002  imm   2
   17524: 30a0fed4  addik r5, r0, -300
   17528: b000fffe  imm   -2
   1752c: b9f4c640  brlid r15, -14784   // 3b6c <xil_printf>
   17530: 64d30018  bsrli r6, r19, 24
Since we have done all of our development on the 14.4 SDK, we would suggest that you use that tool-chain. The license you have for 14.7 is backward compatible.
Thanks for bringing this bug to our attention.
I'm pretty sure the bug can be traced to the /gcc/config/microblaze/microblaze.md file in the mb-gcc toolchain. The 14.7 release included a new assembly pattern for logical right shift when size optimization is enabled.
https://github.com/Xilinx/gcc/releases is a useful page for looking up stuff like this.
14.4 microblaze.md - search for 'lshrsi3_bshift'
14.7 microblaze.md - search for 'lshrsi3_bshift'
Notice the extra lshrsi3_with_size_opt definition in the 14.7 version. I'm not good enough at assembly to parse the actual implementation, but I strongly suspect it's broken. Good evidence for this is that Xilinx removed this definition in a later release of the mb-gcc toolchain, commenting "lshrsi3_with_size_opt is being removed as it has conflicts with unsigned int variables".
Unfortunately this fix will likely never be incorporated into a Virtex-6 compatible tool. Xilinx considers ISE 14.7 to be a perpetual release - no more fixes, even for glaring, functionality-breaking, bad-code-generating bugs like this.
We'll be sticking with 14.4 for development of the 802.11 design for the foreseeable future.
Like Erik said, thanks for your patience in helping debug this.
Not at all, thank you both for the detailed analysis.
We'll be sticking with 14.4 for development of the 802.11 design for the foreseeable future.
Fair enough. It's surprising that the reference design compiles and runs correctly on 14.7, even though it must use >> for shifts greater than 8; perhaps it is just coincidence. Right shift is definitely less common than left (which doesn't seem to have this problem). I guess you do a lot of writing to hardware registers but less reading.
I will have to revert to 14.4. I wonder if there is a revdown tool! :)
Actually, the Xilinx tools are pretty good about being installed alongside each other. I have 14.4, 14.7 and one of the Vivado releases all installed on my machine and have not had any issues with a version of the tools trying to use resources not part of its own install.
I will have to revert to 14.4. I wonder if there is a revdown tool! :)
There isn't, unfortunately. But only a few cores changed from 14.4 to 14.7. If you used revup on the XPS project you will need to revert the HW_VER values for the cores that changed. I think it's only the microblaze (8.40.b -> 8.50.c) and axi_intc (1.03a -> 1.04a), and these changes didn't affect other parameters or port connections.
Reverting back to 14.4 does indeed solve the right shift optimization issue. Now the code does exactly what is written!
As murphpo says, changing the hardware versions works well, and the downgrade is simple.