It looks like golang.org/x/crypto/chacha20poly1305/chacha20poly1305_amd64.s uses PSHUFB instructions unconditionally, even when built for GOAMD64=v1. PSHUFB is an SSSE3 instruction, and SSSE3 is only guaranteed at GOAMD64=v2 and above. In my version of similar code for chacha8rand I didn't want the overhead of two copies of the code and a runtime switch, so I just did:
// ROL16 rotates the uint32s in register R left by 16, using temporary T if needed.
#ifdef GOAMD64_v2
#define ROL16(R, T) PSHUFB ·rol16<>(SB), R
#else
#define ROL16(R, T) ROL(16, R, T)
#endif
// ROL8 rotates the uint32s in register R left by 8, using temporary T if needed.
#ifdef GOAMD64_v2
#define ROL8(R, T) PSHUFB ·rol8<>(SB), R
#else
#define ROL8(R, T) ROL(8, R, T)
#endif
That may be fine for this code too, since newer x86 chips are going to use the AVX code path anyway.
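
For anyone checking that the two macro variants are interchangeable, here is a rough Go model of the idea, not code from either package. It assumes the shuffle masks are the usual byte permutations for rotating each little-endian uint32 by 16 and by 8; the names `pshufb`, `rolShift`, `rol16Mask`, and `rol8Mask` are made up for this sketch. It compares a software model of PSHUFB against a plain per-lane rotate, which is what the shift-based ROL fallback computes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// pshufb models the SSSE3 PSHUFB instruction on one 16-byte vector:
// each output byte is the source byte selected by the low 4 bits of the
// mask byte, or zero if the mask byte has its high bit set.
func pshufb(src, mask [16]byte) [16]byte {
	var dst [16]byte
	for i, m := range mask {
		if m&0x80 == 0 {
			dst[i] = src[m&0x0f]
		}
	}
	return dst
}

// Assumed mask contents (hypothetical names): byte permutations that
// rotate each little-endian uint32 lane left by 16 and by 8 bits.
var (
	rol16Mask = [16]byte{2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13}
	rol8Mask  = [16]byte{3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14}
)

// rolShift rotates each little-endian uint32 lane left by n bits,
// standing in for the shift-and-or fallback that needs no SSSE3.
func rolShift(v [16]byte, n int) [16]byte {
	var out [16]byte
	for i := 0; i < 16; i += 4 {
		x := binary.LittleEndian.Uint32(v[i:])
		binary.LittleEndian.PutUint32(out[i:], bits.RotateLeft32(x, n))
	}
	return out
}

func main() {
	var v [16]byte
	for i := range v {
		v[i] = byte(0x10 + i) // arbitrary distinct test bytes
	}
	fmt.Println(pshufb(v, rol16Mask) == rolShift(v, 16))
	fmt.Println(pshufb(v, rol8Mask) == rolShift(v, 8))
}
```

Both comparisons should agree, which is why selecting between the two forms at build time changes only the instruction count, not the result.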