author: Adam Stylinski <kungfujesus06@gmail.com>, 2024-11-30 09:23:28 -0500
committer: Hans Kristian Rosbach <hk-github@circlestorm.org>, 2024-12-10 22:17:14 +0100
commit: 43d74a223b30902b44b01bf4c4888d8deb35e253
tree: ef1813e6dfbeee03b01156404456cb81c23fd713 /insert_string_tpl.h
parent: a4e7c34a4ac171ba878eec86bdd2a58c1d03f8e5
Improve pipelining for AVX512 chunking
For reasons that are not entirely clear, the masked writes here did not pipeline well. Either setting up the mask stalled things, or masked moves have trouble overlapping with regular moves. Putting the masked moves behind a rarely taken branch seemed to do the trick, improving instruction-level parallelism (ILP). While here, put the masked loads behind the same branch as well, in case there is ever a hazard from overreading.
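The control-flow shape described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the names `sketch_chunkcopy`, `sketch_copy_partial`, `sketch_copy_full`, `SKETCH_CHUNK` and `SKETCH_UNLIKELY` are hypothetical, the buffers are assumed not to overlap, and the AVX512 intrinsics are used only when the compiler targets AVX512F and AVX512BW (otherwise a plain `memcpy` stands in for them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#define SKETCH_HAVE_AVX512 1
#endif

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

#define SKETCH_CHUNK 64u /* bytes in one 512-bit chunk */

/* Hypothetical helper: copy rem bytes (1..63) using a masked load and store. */
static void sketch_copy_partial(uint8_t *out, const uint8_t *from, size_t rem) {
#ifdef SKETCH_HAVE_AVX512
    __mmask64 k = (__mmask64)(~0ULL >> (64u - rem)); /* low rem bits set */
    __m512i v = _mm512_maskz_loadu_epi8(k, from);    /* no read past rem */
    _mm512_mask_storeu_epi8(out, k, v);              /* no write past rem */
#else
    memcpy(out, from, rem); /* portable stand-in for the masked moves */
#endif
}

/* Hypothetical helper: copy exactly one full chunk with unmasked moves. */
static void sketch_copy_full(uint8_t *out, const uint8_t *from) {
#ifdef SKETCH_HAVE_AVX512
    _mm512_storeu_si512(out, _mm512_loadu_si512(from));
#else
    memcpy(out, from, SKETCH_CHUNK);
#endif
}

/* Copy len bytes between non-overlapping buffers; returns out + len. */
uint8_t *sketch_chunkcopy(uint8_t *out, const uint8_t *from, size_t len) {
    assert(out != NULL && from != NULL);

    /* Rare path: less than one chunk, so masked load/store are required.
     * Mask setup and masked moves stay out of the common path entirely. */
    if (SKETCH_UNLIKELY(len < SKETCH_CHUNK)) {
        if (len != 0)
            sketch_copy_partial(out, from, len);
        return out + len;
    }

    /* Common path: unmasked moves only. The first full-width move advances
     * by the remainder, so the next move overlaps it instead of needing a
     * mask; every access stays inside [0, len). */
    size_t rem = len % SKETCH_CHUNK;
    size_t step = rem ? rem : SKETCH_CHUNK;
    sketch_copy_full(out, from);
    out += step;
    from += step;
    len -= step;

    while (len > 0) {
        sketch_copy_full(out, from);
        out += SKETCH_CHUNK;
        from += SKETCH_CHUNK;
        len -= SKETCH_CHUNK;
    }
    return out;
}
```

The point of the layout is that the mask is computed only inside the rarely taken branch, so the common path consists of regular full-width loads and stores that can overlap freely. Whether this helps on a given microarchitecture is something to measure rather than assume.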
Diffstat (limited to 'insert_string_tpl.h')
0 files changed, 0 insertions, 0 deletions