Valgrind Setup
valgrind \
    --main-stacksize=180777216 \
    --tool=callgrind \
    --cache-sim=yes \
    --branch-sim=yes \
    --dump-instr=yes \
    --collect-jumps=yes \
    <program> [args]
- Check the cache miss and branch misprediction rates in the output
Guidelines
- Use benchmarks and profiling tools
- Measure, measure and measure!
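
A minimal timing sketch for the "measure" advice, assuming no benchmarking library is available. The names best_time_ns and sum_all are hypothetical helpers introduced here for illustration. A dedicated benchmark framework or a profiler will give more trustworthy numbers.

```cpp
#include <chrono>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: runs `fn` `repeats` times and returns the best
// (minimum) wall-clock time in nanoseconds. Taking the minimum filters
// out some noise from the scheduler and from cold caches.
template <typename Fn>
std::int64_t best_time_ns(Fn&& fn, int repeats = 5) {
    using clock = std::chrono::steady_clock;  // monotonic clock
    std::int64_t best = -1;
    for (int i = 0; i < repeats; ++i) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (best < 0 || ns < best) best = ns;
    }
    return best;
}

// Example workload (an assumption for this sketch): sum a contiguous buffer.
inline std::uint64_t sum_all(const std::vector<std::uint32_t>& v) {
    return std::accumulate(v.begin(), v.end(), std::uint64_t{0});
}
```

When timing a call such as sum_all, store or print its result. Otherwise an optimizing compiler may remove the work being measured.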
Data
- Avoid pointers inside data types (each indirection is a potential cache miss)
- Use as much of the cache line as possible (Data Oriented Design vs Object Oriented Design)
- Make data access predictable
- Watch out for false sharing in multithreaded systems
- Prefer cache-oblivious algorithms
- Use cache-friendly containers (contiguous memory) - std::vector, flat_set, flat_map
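
A rough sketch of two of the data guidelines above: using the whole cache line through a structure-of-arrays layout, and keeping per-thread data on separate cache lines to avoid false sharing. All type and function names are hypothetical. The 64-byte line size is an assumption that holds on many x86-64 CPUs but not on every platform.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures: summing only `x` still pulls `y`, `z`, and `mass`
// into the cache, so most of each cache line is wasted.
struct ParticleAoS {
    float x, y, z;
    float mass;
};

// Structure-of-arrays: each field is contiguous, so a loop over `x`
// uses every byte of the cache lines it touches.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;
};

inline float sum_x(const std::vector<ParticleAoS>& ps) {
    float total = 0.0f;
    for (const auto& p : ps) total += p.x;
    return total;
}

inline float sum_x(const ParticlesSoA& ps) {
    float total = 0.0f;
    for (float v : ps.x) total += v;
    return total;
}

// Assumed cache-line size (not queried from the hardware).
constexpr std::size_t kCacheLine = 64;

// One counter per cache line: threads that each update their own
// PaddedCounter do not invalidate each other's lines.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "counter fills one line");
static_assert(alignof(PaddedCounter) == kCacheLine, "counter is line-aligned");
```

Both sum_x overloads return the same result. Only the memory traffic differs, so the benefit should be confirmed with a profiler on the target machine.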
Code
- Fit working set in cache
- Make “fast paths” branch-free sequences
- Inline cautiously
  - Pro: reduces branching
  - Pro: facilitates code-reducing optimizations
  - Con: code duplication reduces effective cache size
- Take advantage of PGO (Profile Guided Optimization) and WPO (Whole Program Optimization)
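
One possible illustration of a branch-free fast path: counting the elements below a limit with and without a conditional in the loop body. The function names are hypothetical. Compilers often turn the branchy version into branch-free code on their own, so treat this as a sketch to measure rather than a guaranteed speedup.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Branchy version: on unpredictable data the `if` may be mispredicted often.
inline std::size_t count_below_branchy(const std::vector<std::int32_t>& v,
                                       std::int32_t limit) {
    std::size_t n = 0;
    for (std::int32_t x : v) {
        if (x < limit) ++n;
    }
    return n;
}

// Branch-free version: the comparison result (false or true) is converted
// to 0 or 1 and added unconditionally, so the loop body has no data-dependent
// branch at the source level.
inline std::size_t count_below_branchless(const std::vector<std::int32_t>& v,
                                          std::int32_t limit) {
    std::size_t n = 0;
    for (std::int32_t x : v) {
        n += static_cast<std::size_t>(x < limit);
    }
    return n;
}
```

The two functions return identical counts for the same input, which makes them easy to compare under callgrind with --branch-sim=yes.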