checksum: fix checksum with odd base address

csum_unfolded() must call csum_avx2() with a 32byte aligned base
address. To be able to do that if the buffer is not correctly aligned,
it splits the buffers in 2 parts, the second part is 32byte aligned and
can be used with csum_avx2(), the first part is the remaining part, that
is not 32byte aligned and we use sum_16b() to compute the checksum.

A problem appears if the length of the first part is odd because the
checksum is using 16bit words to do the checksum.

If the length is odd, when the second part is computed, all words are
shifted by 1 byte, meaning weight of upper and lower byte is swapped.

For instance a 13 bytes buffer:

  bytes:       aa AA bb BB cc CC dd DD ee EE ff FF gg
  16bit words: AAaa BBbb CCcc DDdd EEee FFff 00gg

If we don't split the sequence, the checksum is:

  AAaa + BBbb + CCcc + DDdd + EEee + FFff + 00gg

If we split the sequence with an even length for the first part:

  (AAaa + BBbb) + (CCcc + DDdd + EEee + FFff + 00gg)

But if the first part has an odd length:

  (AAaa + BBbb + 00cc) + (ddCC + eeDD + ffEE + ggFF)

To avoid the problem, do not call csum_avx2() if the first part cannot
have an even length, and compute the checksum of all the buffer using
sum_16b().

This is slower but it can only happen if the buffer base address is odd,
and this can only happen if the binary is built using '-Os', and that
means we have chosen to prioritize size over speed.

Reported-by: Mike Jones <mike@mjones.io>
Link: https://bugs.passt.top/show_bug.cgi?id=108
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Added comment explaining why we check for pad & 1]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
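
The byte-swap effect described in the message can be reproduced with a small scalar sketch. This is not the passt code: sum16_words() and split_sum() are hypothetical stand-ins, assuming a little-endian word layout where a trailing odd byte counts as the low byte of a final word (the "00gg" case).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scalar 16-bit word sum: adds little-endian
 * 16-bit words and treats a trailing odd byte as the low byte of a last
 * word with a zero upper byte.
 */
static uint32_t sum16_words(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);

	if (len & 1)
		sum += p[len - 1];

	return sum;
}

/* Sum the buffer as two parts processed independently, cut at 'split'.
 * With an odd 'split', the second part starts one byte late, so every
 * one of its words has its upper and lower bytes swapped.
 */
static uint32_t split_sum(const uint8_t *p, size_t len, size_t split)
{
	assert(split <= len);

	return sum16_words(p, split) + sum16_words(p + split, len - split);
}
```

For the 13 bytes 01 02 ... 0d, split_sum() returns 0x2a31 with no split (split = 0) and with an even split at 4, but 0x2e2d with an odd split at 5.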
author: Laurent Vivier <lvivier@redhat.com> 2025-01-09 14:06:48 +0100
committer: Stefano Brivio <sbrivio@redhat.com> 2025-01-10 22:20:23 +0100
commit: 2c174f1fe8a5f1923b14cde703941d4daac39850 (patch)
tree: b09121d597d284b4da1ca06d87a5bf39e35a3383
parent: 725acd111ba340122f2bb0601e373534eb4b5ed8 (diff)
download: passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar.gz
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar.bz2
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar.lz
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar.xz
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.tar.zst
passt-2c174f1fe8a5f1923b14cde703941d4daac39850.zip
1 files changed, 2 insertions, 1 deletions
diff --git a/checksum.c b/checksum.c
index 1c4354d..b01e0fe 100644
--- a/checksum.c
+++ b/checksum.c
@@ -452,7 +452,8 @@ uint32_t csum_unfolded(const void *buf, size_t len, uint32_t init)
 	intptr_t align = ROUND_UP((intptr_t)buf, sizeof(__m256i));
 	unsigned int pad = align - (intptr_t)buf;
 
-	if (len < pad)
+	/* Don't mix sum_16b() and csum_avx2() with odd padding lengths */
+	if (pad & 1 || len < pad)
 		pad = len;
 
 	if (pad)
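
The effect of the new condition on the size of the scalar first part can be sketched in isolation. This is an illustration only, not the function from checksum.c: scalar_prefix_len() and VEC_ALIGN are hypothetical names, the address is passed as an integer so the cases are easy to check, and 32 is assumed to equal sizeof(__m256i).

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_ALIGN 32	/* assumption: equals sizeof(__m256i) */

/* Hypothetical helper: number of leading bytes handled by the scalar
 * path before the vector path could take over at an aligned address.
 */
static size_t scalar_prefix_len(uintptr_t addr, size_t len)
{
	uintptr_t aligned = (addr + VEC_ALIGN - 1) &
			    ~(uintptr_t)(VEC_ALIGN - 1);
	size_t pad = aligned - addr;

	/* An odd pad would start the vector part one byte late and swap
	 * the bytes of each 16-bit word; a buffer shorter than pad has
	 * no aligned part. In both cases the scalar path takes all of it.
	 */
	if ((pad & 1) || len < pad)
		pad = len;

	return pad;
}
```

With a 100-byte buffer, an address of 0x1002 gives a 30-byte scalar prefix, while 0x1001 (odd, pad would be 31) sends all 100 bytes to the scalar path.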