aboutgitcodebugslistschat
path: root/checksum.c
Commit message (Collapse)AuthorAgeFilesLines
* Makefile: Hack for optimised-away store in ndp() before checksum calculation2022_09_29.06aa26fStefano Brivio2022-09-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With gcc 11 and 12, passing -flto, or -flto=auto, and -O2, intra-procedural optimisation gets rid of a fundamental bit in ndp(): the store of hop_limit in the IPv6 header, before the checksum is calculated, which on x86_64 looks like this: ip6hr->hop_limit = IPPROTO_ICMPV6; b8c0: c6 44 24 35 3a movb $0x3a,0x35(%rsp) Here, hop_limit is temporarily set to the protocol number, to conveniently get the IPv6 pseudo-header for ICMPv6 checksum calculation in memory. With LTO, the assignment just disappears from the binary. This is rather visible as NDP messages get a wrong checksum, namely the expected checksum plus 58, and they're ignored by the guest or in the namespace, meaning we can't get any IPv6 routes, as reported by Wenli Quan. The issue affects a significant number of distribution builds, including the ones for CentOS Stream 9, EPEL 9, Fedora >= 35, Mageia Cauldron, and openSUSE Tumbleweed. As a quick workaround, declare csum_unaligned() as "noipa" for gcc 11 and 12, with -flto and -O2. This disables inlining and cloning, which causes the assignment to be compiled again. Leave a TODO item: we should figure out if a gcc issue has already been reported, and report one otherwise. There's no apparent justification as to why the store could go away. Reported-by: Wenli Quan <wquan@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2129713 Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Address new clang-tidy warnings from LLVM 13.0.1Stefano Brivio2022-01-301-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | clang-tidy from LLVM 13.0.1 reports some new warnings from these checkers: - altera-unroll-loops, altera-id-dependent-backward-branch: ignore for the moment being, add a TODO item - bugprone-easily-swappable-parameters: ignore, nothing to do about those - readability-function-cognitive-complexity: ignore for the moment being, add a TODO item - altera-struct-pack-align: ignore, alignment is forced in protocol headers - concurrency-mt-unsafe: ignore for the moment being, add a TODO item Fix bugprone-implicit-widening-of-multiplication-result warnings, though, that's doable and they seem to make sense. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Add cppcheck target, test, and address resulting warningsStefano Brivio2021-10-211-4/+2
| | | | | | | ...mostly false positives, but a number of very relevant ones too, in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE(). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Fix build with gcc 7, use std=c99, enable some more Clang checkersStefano Brivio2021-10-211-5/+4
| | | | | | | | | | | | | | Unions and structs, you all have names now. Take the chance to enable bugprone-reserved-identifier, cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy. Provide a ffsl() weak declaration using gcc built-in. Start reordering includes, but that's not enough for the llvm-include-order checker yet. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* LICENSES: Add license text files, add missing notices, fix SPDX tagsStefano Brivio2021-10-201-2/+1
| | | | | | | | | | SPDX tags don't replace license files. Some notices were missing and some tags were not according to the SPDX specification, too. Now reuse --lint from the REUSE tool (https://reuse.software/) passes. Reported-by: Martin Hauke <mardnh@gmx.de> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* checksum: Stream load into four registers at a time with > 128 bytesStefano Brivio2021-10-151-3/+47
| | | | | | | | ...and further interleave register usage. This brings the csum() overhead reported by perf(1) for 30 seconds of 64KiB TCP IPv4 frames, host to guest, from 7.2% to 5.8%. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* checksum: Interleave lo/hi sums while folding into 128-bit sums, drop TODOStefano Brivio2021-10-151-3/+3
| | | | | | | I left a TODO and never checked -- this actually seems to slightly improve CPIs on AMD Naples (two 128-bit FMA units glued together). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* checksum: Introduce AVX2 implementation, unify helpersStefano Brivio2021-07-261-0/+292
Provide an AVX2-based function using compiler intrinsics for TCP/IP-style checksums. The load/unpack/add idea and implementation is largely based on code from BESS (the Berkeley Extensible Software Switch) licensed as 3-Clause BSD, with a number of modifications to further decrease pipeline stalls and to minimise cache pollution. This speeds up considerably data paths from sockets to tap interfaces, decreasing overhead for checksum computation, with 16-64KiB packet buffers, from approximately 11% to 7%. The rest is just syscalls at this point. While at it, provide convenience targets in the Makefile for avx2, avx2_debug, and debug targets -- these simply add target-specific CFLAGS to the build. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>