While playing various wargames, I noticed that I kept getting stuck on format string vulnerabilities, so I decided to step back and relearn them from scratch. In the process I realized that I couldn’t explain to myself why we can read from or write to arbitrary locations by providing a valid address.
printf("\x41\x41\x41\x41_%08x_%08x");
According to my understanding of format strings, this function call should simply print AAAA followed by two stack values and nothing else. Instead, it leaks two addresses, starting at the provided address.
From https://crypto.stanford.edu/cs155old/cs155-spring08/papers/formatstring-1.2.pdf:
The format function now parses the format string ‘A’, by reading a
character a time. If it is not ‘%’, the character is copied to the output. In
case it is, the character behind the ‘%’ specifies the type of parameter that
should be evaluated. The string “%%” has a special meaning, it is used to print
the escape character ‘%’ itself. Every other parameter relates to data, which
is located on the stack
If the above statement is true, and \x41\x41\x41\x41_%08x_%08x is the only argument of printf() allocated on the stack, then how can we explain reading from or writing to arbitrary memory locations?
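One way to see where the extra values could come from is a toy model, not real printf internals. The sketch below assumes a 32-bit layout in which the format buffer itself sits on the stack, one word above the first slot that a %08x would read. The names build_stack, toy_printf and STACK_WORDS, the one-word gap, and the 0xdead0000 filler value are all made up for illustration; on a real system the distance to the buffer depends on the compiler, the ABI and the calling function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: a tiny simulated 32-bit stack (hypothetical layout). */
#define STACK_WORDS 8

/* Fill the simulated stack with `gap` unrelated words, then the raw bytes
 * of the format string, as if the buffer were a local variable of the caller. */
void build_stack(uint32_t stack[STACK_WORDS], const char *fmt, size_t gap)
{
    assert(gap < STACK_WORDS);
    memset(stack, 0, STACK_WORDS * sizeof stack[0]);
    for (size_t i = 0; i < gap; i++)
        stack[i] = 0xdead0000u + (uint32_t)i;   /* made-up leftover values */

    size_t room = (STACK_WORDS - gap) * sizeof stack[0];
    size_t n = strlen(fmt);
    if (n > room)
        n = room;                               /* truncate to fit the model */
    memcpy(&stack[gap], fmt, n);
}

/* Minimal format function: copies ordinary characters and, for every
 * "%08x", consumes the next simulated stack word, whether or not the
 * caller actually passed an argument for it. */
void toy_printf(char *out, size_t outsz, const char *fmt,
                const uint32_t *stack, size_t nwords)
{
    size_t next = 0;   /* index of the next word a specifier will read */
    size_t used = 0;   /* characters written to out so far */

    if (outsz == 0)
        return;
    out[0] = '\0';

    /* Keep room for 8 hex digits plus the terminating NUL. */
    while (*fmt != '\0' && used + 9 < outsz) {
        if (strncmp(fmt, "%08x", 4) == 0 && next < nwords) {
            used += (size_t)snprintf(out + used, outsz - used, "%08x",
                                     (unsigned)stack[next++]);
            fmt += 4;
        } else {
            out[used++] = *fmt++;
            out[used] = '\0';
        }
    }
}
```

With fmt set to "AAAA_%08x_%08x", calling build_stack(stack, fmt, 1) and then toy_printf(out, sizeof out, fmt, stack, STACK_WORDS) leaves "AAAA_dead0000_41414141" in out: the second specifier has walked into the bytes of the format string itself. In this model that is why the leading four bytes matter, since a specifier such as %s at that position would treat them as a pointer.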
EDIT 1:
This answer does indeed specify that we can leak whatever address we want, but it doesn’t go over how to start leaking from an arbitrary memory location.
In this other answer:
So you're asking how printf can find the string because there is a
different parameter count than the % signs say? Two thought problems here: a)
Before printf can count the % at all, it has to find the string. Wrong string
content can't prevent finding this string. b) Without attacks: printf supports
variable parameter counts, and it always can find the string. Last parameter
etc. doesn't matter.
For some reason the OP assumes that the ‘AAAA’ part is an actual address.