Adding avx512_vnni inline + copy elision - Project-Tick

author	Adam Stylinski <kungfujesus06@gmail.com>	2022-04-08 13:24:21 -0400
committer	Hans Kristian Rosbach <hk-github@circlestorm.org>	2022-05-23 16:13:39 +0200
commit	d79984b5bcaccab15e6cd13d7d1edea32ac36977 (patch)
tree	7b8e0053bfc6d237bb3ff493e0ad580923ef2526 /test/fuzz
parent	b8269bb7d4702f8e694441112bb4ba7c59ff2362 (diff)
download	Project-Tick-d79984b5bcaccab15e6cd13d7d1edea32ac36977.tar.gz Project-Tick-d79984b5bcaccab15e6cd13d7d1edea32ac36977.zip

Adding avx512_vnni inline + copy elision

An interesting revelation while benchmarking all of this is that our chunkmemset_avx seems to be slower than chunkmemset_sse in a lot of use cases. That will be an interesting function to attempt to optimize. Right now, though, we're basically beating Google on all PNG decode and encode benchmarks. Some combinations of flags have us trading blows, but we're as much as about 14% faster than Chromium's zlib patches. While we're here, add a more direct benchmark of the folded copy method versus the explicit copy + checksum.
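
For readers unfamiliar with the two approaches being compared, here is a minimal scalar sketch. It assumes the checksum in question is Adler-32 (the message above does not say so explicitly), and the function names are hypothetical, chosen for illustration rather than taken from this tree. The real code paths are vectorized; this only shows the difference in memory passes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADLER_MOD 65521u /* largest prime below 2^16, per the Adler-32 definition */

/* Plain scalar Adler-32 over a buffer (illustrative, not optimized). */
static uint32_t adler32_scalar(uint32_t adler, const uint8_t *buf, size_t len) {
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}

/* Explicit copy + checksum: the data is walked twice,
 * once by memcpy and once by the checksum loop. */
static uint32_t copy_then_checksum(uint32_t adler, uint8_t *dst,
                                   const uint8_t *src, size_t len) {
    memcpy(dst, src, len);
    return adler32_scalar(adler, dst, len);
}

/* Folded copy: each byte is loaded once, stored to dst, and
 * accumulated into the checksum in the same pass. */
static uint32_t checksum_fold_copy(uint32_t adler, uint8_t *dst,
                                   const uint8_t *src, size_t len) {
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;
    for (size_t i = 0; i < len; i++) {
        uint8_t byte = src[i];
        dst[i] = byte;
        a = (a + byte) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Both functions return the same checksum and leave the same bytes in dst; the folded variant just avoids the second trip through the data, which is the effect the direct benchmark is meant to isolate.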

Diffstat (limited to 'test/fuzz')

0 files changed, 0 insertions, 0 deletions

