Internal Development

This section is intended for the development of WTX itself, although some tips may be useful for your own projects.

Size constraints

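A large enum that is used in many places can hurt runtime performance, because every value of the enum is as large as its biggest variant, even when a small variant is stored. This is common enough that the community created several lints to prevent it, such as Clippy's large_enum_variant and result_large_err.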

Some real-world use-cases and associated benchmarks.

That is why WTX enforces an Error enum size of 16 bytes, which is also why WTX has so many bare error variants.
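The size limit can be illustrated with a minimal sketch. InlineError and SlimError are hypothetical types invented for this example and are not WTX's actual Error enum; the compile-time assertion is one possible way to enforce such a limit, not necessarily the mechanism WTX uses.

```rust
use std::mem::size_of;

// Hypothetical error type that stores its payload inline: every value is at
// least as large as the biggest variant, even when it holds `Bare`.
#[allow(dead_code)]
enum InlineError {
    Bare,
    Detailed([u8; 64]),
}

// Hypothetical slim counterpart: most variants carry no data and the only
// payload lives behind a pointer.
#[allow(dead_code)]
enum SlimError {
    Bare,
    AnotherBare,
    Detailed(Box<[u8; 64]>),
}

// Compile-time guard: the build fails if `SlimError` outgrows 16 bytes.
const _: () = assert!(size_of::<SlimError>() <= 16);

fn main() {
    println!("InlineError: {} bytes", size_of::<InlineError>());
    println!("SlimError: {} bytes", size_of::<SlimError>());
}
```

Because the check runs at compile time, adding a variant with a large inline payload is rejected before it can reach any benchmark.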

Performance

Many things that generally improve performance are used in the project, to name a few:

  1. Manual Vectorization: When an algorithm is known for processing large amounts of data, several experiments are performed to analyze the best way to split loops in order to allow the compiler to take advantage of SIMD instructions.
  2. Memory Allocation: Whenever possible, heap-backed structures are allocated once, when the owning instance is created, and are reused afterwards.
  3. Fewer Dependencies: No third-party dependency is pulled in by default. In other words, additional dependencies are up to the user through the selection of Cargo features, which decreases the compilation time of full builds. For example, you can see the mere 7 dependencies required by the PostgreSQL client using cargo tree -e normal --features crypto-ring,postgres.
  4. Vectored and Buffered IO: Instead of writing a single chunk of data and waiting for it to be sent, multiple chunks are gathered and transmitted in a single operation whenever possible.
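Items 1, 2 and 4 above can be sketched as follows. This is an illustration using only the standard library, not code taken from WTX: sum_chunked, ReusableBuffer and write_frame are hypothetical names, and whether the chunked loop is actually vectorized depends on the compiler and the target CPU.

```rust
use std::io::{self, IoSlice, Write};

// Item 1 (hypothetical helper): process the input in fixed-size chunks so the
// inner loop has a constant trip count, which gives the compiler a better
// chance to emit SIMD instructions. Leftover bytes are handled separately.
fn sum_chunked(data: &[u8]) -> u64 {
    const LANES: usize = 16;
    let mut lanes = [0u64; LANES];
    let mut chunks = data.chunks_exact(LANES);
    for chunk in &mut chunks {
        for (lane, byte) in lanes.iter_mut().zip(chunk) {
            *lane += u64::from(*byte);
        }
    }
    let tail: u64 = chunks.remainder().iter().map(|byte| u64::from(*byte)).sum();
    lanes.iter().sum::<u64>() + tail
}

// Item 2 (hypothetical type): the heap buffer is allocated once at
// construction and reused by every call instead of allocating per message.
struct ReusableBuffer {
    buf: Vec<u8>,
}

impl ReusableBuffer {
    fn new(capacity: usize) -> Self {
        Self { buf: Vec::with_capacity(capacity) }
    }

    // Clears the previous contents but keeps the existing allocation.
    fn fill(&mut self, payload: &[u8]) -> &[u8] {
        self.buf.clear();
        self.buf.extend_from_slice(payload);
        &self.buf
    }
}

// Item 4 (hypothetical helper): hand header and payload to the writer in one
// vectored call instead of two separate writes. A real implementation must
// also handle short writes, because write_vectored may write fewer bytes.
fn write_frame<W: Write>(writer: &mut W, header: &[u8], payload: &[u8]) -> io::Result<usize> {
    let bufs = [IoSlice::new(header), IoSlice::new(payload)];
    writer.write_vectored(&bufs)
}

fn main() -> io::Result<()> {
    let data = vec![1u8; 1000];
    println!("sum = {}", sum_chunked(&data));

    let mut buffer = ReusableBuffer::new(64);
    let first = buffer.fill(b"hello").len();
    let second = buffer.fill(b"hi").len();
    println!("filled {first} then {second} bytes using the same buffer");

    let mut sink = Vec::new();
    let written = write_frame(&mut sink, b"HDR:", b"payload")?;
    println!("wrote {written} bytes in one call");
    Ok(())
}
```

The chunk width of 16 is an arbitrary choice for the example; in practice the best split is found by experiment, as item 1 describes.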

Profiling

The examples below use the h2load benchmarking tool (https://nghttp2.org/documentation/h2load-howto.html) and the h2load internal binary (https://github.com/c410-f3r/wtx/blob/main/wtx-internal/src/bin/h2load.rs) for illustration purposes.

Compilation time / Size

cargo-bloat: Finds out what takes most of the space in executables.

cargo bloat --bin h2load --features h2load | head -20

cargo-llvm-lines: Measures the number and size of instantiations of each generic function in a program.

CARGO_PROFILE_RELEASE_LTO=fat cargo llvm-lines --bin h2load --features h2load --package wtx-internal --release | head -20

Performance

Prepare the executables in different terminals.

h2load -c100 --log-file=/tmp/h2load.txt -m10 -n10000 --no-tls-proto=h2c http://localhost:9000
cargo build --bin h2load --features h2load --profile profiling --target x86_64-unknown-linux-gnu

samply: Command line CPU profiler.

samply record ./target/x86_64-unknown-linux-gnu/profiling/h2load

callgrind: Gives global, per-function, and per-source-line instruction counts and simulated cache and branch prediction data.

valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes --simulate-cache=yes ./target/x86_64-unknown-linux-gnu/profiling/h2load

Compiler flags

The following non-standard options influence the final binary. Only use them if you know what you are doing.

Size

  • -C force-frame-pointers=no
  • -C force-unwind-tables=no

More size-related parameters can be found at https://github.com/johnthagen/min-sized-rust.

Runtime

  • -C llvm-args=--inline-threshold=9999
  • -C llvm-args=-enable-dfa-jump-thread
  • -C llvm-args=-vectorize-loops
  • -C llvm-args=-vectorize-slp
  • -C target-cpu=x86-64-v3

Security

  • -C control-flow-guard=yes
  • -C relocation-model=pie
  • -C relro-level=full
  • -Z stack-protector=strong