You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[libc][x86] copy one cache line at a time to prevent the use of rep;movsb (#113161)
When using `-mprefer-vector-width=128` with `-march=sandybridge` copying
3 cache lines in one go (192B) gets converted into `rep;movsb` which
translate into a 60% hit in performance.
Consecutive calls to `__builtin_memcpy_inline` (implementation behind
`builtin::Memcpy::block_offset`) are not coalesced by the compiler and
so calling it three times in a row generates the desired assembly. It
only differs in the interleaving of the loads and stores and does not
affect performance.
This is needed to reland
#108939.
0 commit comments