To be honest, I’m not really sure of the best title for this question, or of its full scope, but the motivation behind it is as follows.
Motivation
Assume your server was hacked. You open your UTF-8 encoded PHP script and find a block or several lines of characters that mean nothing to you, mostly displayed in the UTF-8 character set as “????????????? hacker.ru”.
I’m trying to get a grasp on what this could be and do:
Thoughts I’m considering
- Perhaps the font selected in the text editor doesn’t support those chars?
- Perhaps those chars were copied and pasted from a non-UTF-8 document into the UTF-8 document?
  - Crude example:
    - In the non-UTF-8 encoding, binary 1111 -> A
    - In UTF-8, binary 1111 -> B
    - Effectively, the copied bits no longer map to the same character
- Is there a way to properly display those chars?
  - This is my priority question about these characters: can I assume that these non-mapping chars do nothing (i.e., they don’t execute and so can’t do damage)?
- Are programming languages multilingual?
  - Can I write PHP in Russian?
  - Can I write PHP in English and Russian in the same file?
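To make the crude example above concrete, here is a minimal sketch in Python (chosen only for illustration, since the effect does not depend on PHP). It assumes the original text was Russian typed in a Windows-1251 editor; the word "Привет" and the choice of code page are my assumptions, not something taken from the hacked file. It shows the same bytes read under two encodings, and how a lossy conversion can produce literal question marks.

```python
# Hypothetical sample: a Russian word as an attacker might have typed it.
original = "Привет"

# The bytes a Windows-1251 editor would write to disk (one byte per letter).
legacy_bytes = original.encode("cp1251")

# The same bytes read as UTF-8: none of them form a valid UTF-8 sequence,
# so every undecodable byte is shown as U+FFFD (the replacement character).
as_utf8 = legacy_bytes.decode("utf-8", errors="replace")

# The same bytes read with the code page they were written in: text recovered.
recovered = legacy_bytes.decode("cp1251")

# A lossy conversion to ASCII turns each unsupported character into "?".
as_ascii = original.encode("ascii", errors="replace").decode("ascii")

print(repr(as_utf8))
print(recovered)
print(as_ascii)
```

If the file really contains literal "?" bytes (as in the last case), the original characters are gone and no encoding setting will bring them back. If it contains bytes from another encoding, reopening the file with that encoding may display them properly.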
Assumption: if I or anyone else opens a UTF-8 encoded file and types into it, in any language or character set, the characters will be mapped and displayed properly.
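One way to test that assumption against an actual file is to check whether its bytes are valid UTF-8 at all. The following is only a sketch: the helper name `find_suspicious_lines` and the sample script contents are hypothetical, and in practice the bytes would come from reading the script in binary mode.

```python
def find_suspicious_lines(data: bytes):
    """Return (line_number, is_valid_utf8, raw_line) for each line with non-ASCII bytes."""
    findings = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        # Pure ASCII lines are identical in ASCII and UTF-8, so skip them.
        if all(byte < 0x80 for byte in line):
            continue
        try:
            line.decode("utf-8")
            valid = True
        except UnicodeDecodeError:
            valid = False
        findings.append((number, valid, line))
    return findings


# Hypothetical script: line 2 is Russian saved as UTF-8,
# line 3 is Russian pasted in as Windows-1251 bytes.
sample = (
    b"<?php\n"
    + "echo 'Привет';".encode("utf-8") + b"\n"
    + "// тест hacker.ru".encode("cp1251") + b"\n"
)

findings = find_suspicious_lines(sample)
for number, valid, raw in findings:
    print(number, "valid UTF-8" if valid else "NOT valid UTF-8", raw)
```

Lines reported as not valid UTF-8 were most likely written in some other encoding, which would explain why the editor cannot display them.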
Can anyone shed some light on this subject?