The Speed Game: Automated Trading in C++
30 Dec 2024
https://www.youtube.com/watch?v=ulOLGX3HNCI&list=WL&index=166&ab_channel=MeetingCpp
It’s a race to
- receive market data
- perform risk checks
- send the right order back
All-software systems are in the 1-10 µs range; all-hardware systems (FPGA, printed cad) are around 100-1000 ns.
Usual characteristics of automated trading systems
- Only a few code paths are important
- Jitter is a killer
- Very little threading/vectorisation
The stack:
- Network: preferring microwaves over fibre optic
- Servers
- Kernel tuning: tune the BIOS, thermal headroom and the OS, remove interrupts, isolate processes
- Cpp
- Algos: typically public textbook pricing
The main thing is that much of the server stack is tuned for throughput instead of latency.
On cpp, the general principles are
- Move everything to compile time
- Bypass the OS: aim for 100% userspace code, including network IO
- Keep the cache warm
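A minimal sketch of the "move everything to compile time" principle, assuming a toy order manager. All names here (SenderA, SenderB, OrderManager, max_order_qty, venue_id) are made up for illustration and are not from the talk. The sender type is a template parameter, so the call is resolved by the compiler rather than through a virtual table, and limits are checked with static_assert instead of at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical order senders for two venues; neither has virtual functions.
struct SenderA {
    static constexpr std::uint32_t venue_id = 1;
    std::uint64_t sent = 0;
    void send(std::uint64_t qty) { sent += qty; }
};

struct SenderB {
    static constexpr std::uint32_t venue_id = 2;
    std::uint64_t sent = 0;
    // Pretend venue B counts in half-lots (an assumption for the example).
    void send(std::uint64_t qty) { sent += 2 * qty; }
};

// The sender is a template parameter, so send() is bound at compile time:
// no vtable lookup, and the compiler is free to inline the call.
template <typename Sender>
class OrderManager {
public:
    void on_signal(std::uint64_t qty) { sender_.send(qty); }
    const Sender& sender() const { return sender_; }

private:
    Sender sender_;
};

// A constexpr function: usable in constant expressions, so the limit
// can be checked by the compiler instead of on the hot path.
constexpr std::uint64_t max_order_qty(std::uint32_t venue) {
    return venue == 1 ? 1000 : 500;
}

static_assert(max_order_qty(SenderA::venue_id) == 1000,
              "venue A limit is known at compile time");
static_assert(!std::is_polymorphic<OrderManager<SenderA>>::value,
              "no virtual dispatch in the order manager");

// Small usage example.
inline void run_dispatch_example() {
    OrderManager<SenderA> a;
    OrderManager<SenderB> b;
    a.on_signal(10);
    b.on_signal(10);
    assert(a.sender().sent == 10);
    assert(b.sender().sent == 20);
}
```

The trade-off is one instantiation of OrderManager per sender type, which costs code size but removes a branch or indirect call from the hot path.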
On cpp techniques
- Move semantics
- Static asserts
- Data member layout, padding alignment
- False sharing
- Cache locality
- Compile time dispatch
- Constexpr
- Variadic templates: you can do a nice compile-time recursion with variadic templates that ends in the evaluation of an overloaded base case, pretty nice https://github.com/maciekgajewski/Fast-Log https://github.com/carlcook/variadicLogging/blob/master/main.cc
- Loop unrolling
- Expression short circuiting: move expensive checks to the top
- Signed, unsigned comparisons
- Float double mixing
- Branch prediction reduction
- Exceptions
- Slow path removal: keep fast path code together and slow code away, don’t inline slow code, otherwise it is brought into the fast path code
- Avoiding allocation: new and delete can end up passing the request on to the OS like a hot potato
- Fast containers
- Lambda functions
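The variadic template recursion from the list above can be sketched as follows. This is an illustrative example, not the code from the linked repositories: write_args and format_line are hypothetical names, and std::ostringstream is used only to keep the sketch short (it allocates, so it would not belong on a real fast path). The point is the shape of the recursion: each call peels off one argument, and the compiler picks the empty overload as the base case.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case: no arguments left, so the recursion stops here.
inline void write_args(std::ostringstream&) {}

// Recursive case: write the first argument, then recurse on the rest.
// Every step is a separate instantiation chosen by overload resolution
// at compile time; there is no run-time loop over an argument list.
template <typename First, typename... Rest>
void write_args(std::ostringstream& out, const First& first, const Rest&... rest) {
    out << first;
    if (sizeof...(rest) > 0) {
        out << ' ';  // separator between arguments, none after the last
    }
    write_args(out, rest...);
}

// Hypothetical entry point: formats any number of arguments into one line.
template <typename... Args>
std::string format_line(const Args&... args) {
    std::ostringstream out;
    write_args(out, args...);
    return out.str();
}

// Small usage example.
inline void run_variadic_example() {
    assert(format_line("order", 42, 'B') == "order 42 B");
    assert(format_line() == "");
}
```

With three arguments the compiler generates three instantiations of the recursive overload and then calls the non-template base case, so the argument types are all known before the program runs.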
https://github.com/Xilinx-CNS/onload
- you can read and write packets without system calls
You have to measure to improve something
- High-resolution packet-in/packet-out timestamping, done with a switch, is the source of truth