author: Adam Stylinski <kungfujesus06@gmail.com> 2022-04-08 13:24:21 -0400
committer: Hans Kristian Rosbach <hk-github@circlestorm.org> 2022-05-23 16:13:39 +0200
commit: d79984b5bcaccab15e6cd13d7d1edea32ac36977
tree: 7b8e0053bfc6d237bb3ff493e0ad580923ef2526 /tools
parent: b8269bb7d4702f8e694441112bb4ba7c59ff2362
Adding avx512_vnni inline + copy elision
An interesting revelation while benchmarking all of this is that our chunkmemset_avx seems to be slower than chunkmemset_sse in a lot of use cases. That will be an interesting function to attempt to optimize. Right now, though, we're basically beating Google in all PNG decode and encode benchmarks. Some combinations of flags have us basically trading blows, but we're as much as 14% faster than Chromium's zlib patches.

While we're here, add a more direct benchmark of the folded copy method versus the explicit copy + checksum.
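For readers unfamiliar with the two approaches being benchmarked, here is a minimal scalar sketch of the difference, assuming an Adler-32 checksum. It is not the AVX512-VNNI code from this commit, and the function names (`adler32_scalar`, `copy_then_adler32`, `adler32_fold_copy_scalar`) are hypothetical, chosen only for illustration. The explicit variant walks the data twice (once to copy, once to checksum); the folded variant does both in one pass.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADLER_MOD 65521u /* largest prime below 2^16 */
#define NMAX 5552u       /* max bytes summed before the 32-bit sums need reducing */

/* Plain Adler-32 over a buffer (hypothetical helper, scalar only). */
static uint32_t adler32_scalar(uint32_t adler, const uint8_t *buf, size_t len) {
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;
    while (len > 0) {
        size_t n = len < NMAX ? len : NMAX;
        len -= n;
        while (n--) {
            a += *buf++;
            b += a;
        }
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}

/* Explicit copy + checksum: two separate passes over the data. */
static uint32_t copy_then_adler32(uint32_t adler, uint8_t *dst,
                                  const uint8_t *src, size_t len) {
    memcpy(dst, src, len);
    return adler32_scalar(adler, dst, len);
}

/* Folded copy: each byte is loaded once, stored to dst, and summed. */
static uint32_t adler32_fold_copy_scalar(uint32_t adler, uint8_t *dst,
                                         const uint8_t *src, size_t len) {
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;
    while (len > 0) {
        size_t n = len < NMAX ? len : NMAX;
        len -= n;
        while (n--) {
            uint8_t byte = *src++;
            *dst++ = byte; /* the copy is folded into the checksum loop */
            a += byte;
            b += a;
        }
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Both variants return the same checksum and leave the same bytes in `dst`; a benchmark comparing them measures only the cost of the extra pass over memory. A vectorized implementation would process many bytes per iteration, but the one-pass structure is the same idea.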
Diffstat (limited to 'tools')
0 files changed, 0 insertions, 0 deletions