Modern computer architectures have many properties that can impact software performance. While compilers are generally good at "in-core" optimizations such as instruction selection and various loop optimizations, most memory acceleration techniques cannot work optimally without some help from the programmer. This talk draws a picture of what every C++ developer needs to know about modern memory systems to perform these kinds of optimizations. It presents a series of small code examples and benchmarks their performance, often with surprising results, to derive key memory characteristics experimentally and to highlight different CPU cache effects.
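As an illustration of the kind of small benchmark the abstract describes, here is a minimal sketch that sums the same matrix twice, once along rows and once along columns. It is not taken from the talk: the names `sum_row_major`, `sum_col_major`, and `time_ms` are hypothetical, and the row-major `std::vector` layout is an assumption made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum an n x n matrix stored in row-major order, visiting elements
// in the order they are laid out in memory (stride of 1 element).
std::uint64_t sum_row_major(const std::vector<std::uint32_t>& m, std::size_t n) {
    assert(m.size() == n * n);
    std::uint64_t sum = 0;
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t c = 0; c < n; ++c)
            sum += m[r * n + c];
    return sum;
}

// Same sum over the same data, but visiting column by column,
// so consecutive accesses are n elements apart.
std::uint64_t sum_col_major(const std::vector<std::uint32_t>& m, std::size_t n) {
    assert(m.size() == n * n);
    std::uint64_t sum = 0;
    for (std::size_t c = 0; c < n; ++c)
        for (std::size_t r = 0; r < n; ++r)
            sum += m[r * n + c];
    return sum;
}

// Hypothetical helper: wall-clock time of one call to f, in milliseconds.
template <class F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Both functions return the same value, so any timing difference comes from the access pattern alone. For a matrix much larger than the caches, the column-wise walk is often slower because each access tends to touch a different cache line, but the size of the gap depends on the hardware, the compiler, and the optimization flags, and an optimizer may interchange the loops. When timing with `time_ms`, store or print the returned sum so the call is not optimized away.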