| Summary: | opencv3 SIGILL on Cortex-A5 with VFPv4-D16 | ||
|---|---|---|---|
| Product: | buildroot | Reporter: | James Cowgill <jcowgill+busybox> |
| Component: | Other | Assignee: | unassigned |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | buildroot |
| Priority: | P5 | ||
| Version: | 2019.02.3 | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Host: | Target: | ||
| Build: | |||
> I think this code in opencv3.mk is wrong (or maybe the ARM fpu config options
> are wrong):
>
> ifeq ($(BR2_ARCH_IS_64):$(BR2_ARM_CPU_HAS_VFPV3),:y)
> OPENCV3_CONF_OPTS += -DENABLE_VFPV3=ON
> else
> OPENCV3_CONF_OPTS += -DENABLE_VFPV3=OFF
> endif
>
> Apparently I have BR2_ARM_CPU_HAS_VFPV3 set (even though I only have the
> -D16 version), but when you pass -DENABLE_VFPV3=ON to OpenCV, it passes
> -mfpu=vfpv3 to the compiler and this is what's causing the SIGILL.

If the only thing that ENABLE_VFPV3 does is to set the -mfpu=... option, then we should just always set it to OFF, because we already pass that option. Same for NEON, by the way.

So, I did a bit more research on this, but couldn't come to a useful conclusion. The handling of CPU optimizations in opencv3 is complicated. ENABLE_VFPV3 and ENABLE_NEON seem to be obsolete options, but the replacement CPU_BASELINE is not very clear.

With a VFPv3-D16 case, I get:

    -- CPU/HW features:
    --   Baseline:
    --     requested: DETECT
    --     disabled: VFPV3 NEON

which is quite expected. When NEON is enabled as the FPU, I get:

    -- CPU/HW features:
    --   Baseline: NEON
    --     requested: DETECT
    --     disabled: VFPV3 NEON

So NEON seems to be detected, but it's also listed in the "disabled" features... which doesn't make a lot of sense.

Then, if I use VFPv3 as the FPU, I get:

    -- CPU/HW features:
    --   Baseline:
    --     requested: DETECT
    --     disabled: VFPV3 NEON

I.e., it doesn't detect that I have VFPv3.

(Of course the tests above are after removing ENABLE_VFPV3/ENABLE_NEON).
I recently upgraded from 2018.02 to 2019.02 and discovered that opencv3 has broken on my board (it worked with 2018.02). The board has a Cortex-A5 with VFPv4-D16 enabled but without NEON or the full VFPv4 (so it only has 16 double-precision floating-point registers).

These are the target options I have set:

    BR2_arm=y
    BR2_BINFMT_ELF=y
    BR2_cortex_a5=y
    BR2_ARM_ENABLE_NEON=n
    BR2_ARM_ENABLE_VFP=y
    BR2_ARM_EABIHF=y
    BR2_ARM_FPU_VFPV4D16=y
    BR2_ARM_INSTRUCTIONS_THUMB2=y

I can see that my application gets a SIGILL because it tries to load a value into the d20 register, which doesn't exist in VFPv4-D16.

    0xb661d3d2 in cv::interpolateLanczos4 (coeffs=0xbef4602c, x=0.03125) at /home/jcowgill/workspace/bsp/buildroot/output/build/opencv3-3.4.3/modules/imgproc/src/imgwarp.cpp:176
    176	/home/jcowgill/workspace/bsp/buildroot/output/build/opencv3-3.4.3/modules/imgproc/src/imgwarp.cpp: No such file or directory.
    (gdb) disassemble
    ...
       0xb661d3cc <+796>:	ldr	r3, [pc, #240]	; (0xb661d4c0 <cv::initInterTab2D(int, bool)+1040>)
       0xb661d3ce <+798>:	vmov	s14, r8
    => 0xb661d3d2 <+802>:	vldr	d20, [r10]
       0xb661d3d6 <+806>:	vldr	d19, [r9]
       0xb661d3da <+810>:	add	r3, pc
       0xb661d3dc <+812>:	mov	r2, r7

========

I think this code in opencv3.mk is wrong (or maybe the ARM fpu config options are wrong):

    ifeq ($(BR2_ARCH_IS_64):$(BR2_ARM_CPU_HAS_VFPV3),:y)
    OPENCV3_CONF_OPTS += -DENABLE_VFPV3=ON
    else
    OPENCV3_CONF_OPTS += -DENABLE_VFPV3=OFF
    endif

Apparently I have BR2_ARM_CPU_HAS_VFPV3 set (even though I only have the -D16 version), but when you pass -DENABLE_VFPV3=ON to OpenCV, it passes -mfpu=vfpv3 to the compiler and this is what's causing the SIGILL.
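
For illustration, the suggestion made in the comments (always pass OFF, since Buildroot already supplies the -mfpu option itself) could look roughly like the fragment below. This is only a sketch of that idea, not necessarily the change that was committed. It assumes that ENABLE_VFPV3 and ENABLE_NEON do nothing beyond adding -mfpu flags, which the discussion itself leaves unconfirmed.

```make
# Sketch only: replaces the ifeq/else/endif block quoted in this report.
# Assumption: ENABLE_VFPV3 and ENABLE_NEON only add -mfpu=vfpv3 / -mfpu=neon
# to the compiler flags. Buildroot already passes the -mfpu value matching
# the selected BR2_ARM_FPU_* option, so letting OpenCV override it can
# produce code that uses d16-d31 on a -D16 FPU.
OPENCV3_CONF_OPTS += \
	-DENABLE_VFPV3=OFF \
	-DENABLE_NEON=OFF
```

With both options forced OFF, the FPU that OpenCV is compiled for would come only from the Buildroot target configuration (BR2_ARM_FPU_VFPV4D16 on the board described here).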