Trouble with Character Encodings
(This is work in progress.)
It seems the easiest thing to do is to keep everything in UTF-8:
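- All HTML/PHP/… files should be encoded in UTF-8 without a BOM. (In a PHP file, a BOM is sent to the browser as output, which can break later header() calls.) Many Windows editors, however, default to a legacy "ANSI" code page. Use an editor such as Notepad++, which handles UTF-8 and can also convert between encodings.
- Your database, and especially the database tables, should use a UTF-8 character set. In MySQL, note that the character set called utf8 stores at most three bytes per character; utf8mb4 covers all of Unicode.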
- Send your data to the browser with the correct encoding, for example in PHP:
<?php
header("Content-type: text/html; charset=UTF-8");
?>
- Just to be on the safe side, do it in HTML, too:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
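- JavaScript (or ECMAScript, for that matter) files do not have to be ASCII. They can be saved as UTF-8 like everything else, as long as the browser is told so, either by a charset in the Content-Type header of the script or by a UTF-8 page that includes it. If you want to sidestep the question, write non-ASCII characters in strings as \uXXXX escapes, which keeps the file pure ASCII.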
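To check whether a file really is UTF-8 without a BOM, something like the following might help. This is only a minimal sketch: the function names are made up for illustration and are not part of PHP, and the validity check relies on the /u modifier of the PCRE functions, which rejects byte sequences that are not valid UTF-8.

```php
<?php
// Hypothetical helper names; none of these functions are built into PHP.

const UTF8_BOM = "\xEF\xBB\xBF";

// True if the string starts with the three-byte UTF-8 byte order mark.
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// True if the bytes form valid UTF-8. With the /u modifier, PCRE validates
// the subject and preg_match() returns false on invalid input.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Return the string without a leading BOM, if there was one.
function strip_utf8_bom(string $bytes): string
{
    return has_utf8_bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}
```

For a file on disk you could pass in the result of file_get_contents(). Keep in mind that plain ASCII text also counts as valid UTF-8, so a positive result does not prove that the editor is set to UTF-8.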
If you don’t want to use UTF-8 but would rather stick with, say, ISO-8859-1 because you already have a database encoded in latin1, you can certainly do so. The points above apply to these character sets as well. I did have trouble, however, when moving a database to a new server: make sure all files are exported and imported with the same encoding (which could be UTF-8 as well). If you’re still having problems, check which character set is used when your PHP script connects to MySQL, because it is not always the one you want. If that’s the root of your problems, you can set it explicitly:
<?php
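// The old mysql_* functions were removed in PHP 7. With mysqli, where $link is
// assumed to be your open connection:
mysqli_set_charset($link, 'latin1');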
?>
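An export/import mismatch of the kind described above often shows up as doubled characters such as "Ã¤" where an "ä" should be. The sketch below reproduces that effect on a single character and reverses it. It assumes the iconv extension is available (it normally is), and the variable names are purely illustrative. Reversing the conversion only helps when the data was wrongly converted exactly once, so treat it as a diagnostic aid, not a general fix.

```php
<?php
// Assumes the iconv extension is available; variable names are illustrative.

// "ä" as a single ISO-8859-1 (latin1) byte.
$latin1 = "\xE4";

// Correct conversion: the source encoding is declared truthfully.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// Mistake: the data is already UTF-8 but gets treated as ISO-8859-1 again.
$doubleEncoded = iconv('ISO-8859-1', 'UTF-8', $utf8);

// Repair for this specific case: undo the extra conversion step.
$repaired = iconv('UTF-8', 'ISO-8859-1', $doubleEncoded);

echo bin2hex($utf8), "\n";          // c3a4
echo bin2hex($doubleEncoded), "\n"; // c383c2a4, displayed as "Ã¤"
echo bin2hex($repaired), "\n";      // c3a4
```

Looking at the raw bytes with bin2hex() is usually more reliable than looking at the rendered page, because the browser applies its own interpretation on top.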