I read somewhere that before performing an unaligned load or store next to a page boundary (e.g. using the _mm_loadu_si128 / _mm_storeu_si128 intrinsics), code should first check whether the whole vector (in this case 16 bytes) lies within a single page, and switch to non-vector instructions if it does not. I understand that this is needed to prevent a coredump if the next page does not belong to
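A minimal sketch of the check described above, in C. It is an illustration, not a definitive implementation. The 4096-byte page size and the names `PAGE_SIZE_ASSUMED`, `crosses_page` and `load_up_to_16` are assumptions made for this example; real code would typically query the page size at run time (for instance with `sysconf(_SC_PAGESIZE)` on POSIX systems).

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4 KiB pages. Must be a power of two for the mask below. */
#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical helper: returns 1 if the nbytes starting at p
   extend past the end of the page that contains p, else 0. */
static int crosses_page(const void *p, size_t nbytes)
{
    uintptr_t offset = (uintptr_t)p & (PAGE_SIZE_ASSUMED - 1);
    return offset + nbytes > PAGE_SIZE_ASSUMED;
}

/* Hypothetical helper: load a 16-byte vector when only `valid` bytes
   at p are known to be readable. If a full 16-byte load would touch
   the next page, copy the valid bytes into a zeroed local buffer
   instead of issuing the vector load on p. */
static __m128i load_up_to_16(const uint8_t *p, size_t valid)
{
    if (valid >= 16 || !crosses_page(p, 16)) {
        /* The whole vector is valid, or it stays inside one page. */
        return _mm_loadu_si128((const __m128i *)p);
    }
    uint8_t tmp[16] = {0};
    memcpy(tmp, p, valid);  /* scalar fallback path */
    return _mm_loadu_si128((const __m128i *)tmp);
}
```

Note that on the fast path with `valid < 16` the load still reads bytes past the valid data. That cannot fault because it stays within the page, but it is outside what the C standard guarantees and memory checkers may flag it. The lanes beyond `valid` hold whatever happens to be in memory on that path and zeros on the fallback path, so the caller should ignore them in both cases.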
Tag: sse
gcc 4.x not supporting x87 FPU math?
I’ve been trying to compile gcc 4.x from source using --with-fpmath=387, but I’m getting this error: “Invalid --with-fpmath=387”. I looked in the configs and found that it doesn’t support this option (even though the docs still mention it as a possible option): Basically, I started this whole thing because I need to supply an executable for an old target platform