Movable Type Mystery: UTF-8 headers and still showing question marks?

It’s been years since I’ve done anything with Movable Type, and I thought it was high time to migrate all my defunct blogs to WordPress. So I logged into Movable Type for the first time in a long time, and I discovered that Japanese text in various entries was showing up as question marks in the Movable Type interface.

That was odd.

I have a configuration file from the early 2000s that should have addressed that problem. It sets PublishCharset to “utf8” and SQLSetNames to true.

I tried a few solutions I found online, but none of them worked, for a simple reason: they were aimed at the page encoding, and the Movable Type admin interface was already being rendered with a UTF-8 content header.

To make matters even more confusing, I queried the database directly, both through phpMyAdmin and from the command line. (Hint: start the mysql client with the option --default-character-set=utf8.) The Japanese text showed up fine.
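If the session is already open without that option, something like the following should give a similar check. This is only a sketch: I'm assuming the stock table and column names (mt_entry, entry_id, entry_title), so adjust them to your own schema.

```sql
-- Assumption: this session was started without --default-character-set.
-- SET NAMES tells the server which character set the client sends and expects back.
SET NAMES utf8;

-- Assumption: mt_entry / entry_id / entry_title match your install.
-- HEX() shows the bytes actually stored, independent of the connection character set.
SELECT entry_id, entry_title, HEX(entry_title) AS title_bytes
FROM mt_entry
ORDER BY entry_id DESC
LIMIT 5;
```

If the hex column shows multi-byte sequences (kana, for instance, start with E3), the data itself is intact. A run of 3F bytes would mean literal question marks were stored, which is a different and worse problem.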

What was going on? Why was it broken now when it worked in the past?

It seemed the key lay in those configuration variables, PublishCharset and SQLSetNames. So I searched through the code base to find every place where they were used.

PublishCharset returned a lot of results, but SQLSetNames appeared in only a handful of files, most notably in the drivers for MySQL and Postgres. With some strategically placed print statements, I discovered that the driver was issuing its SET NAMES command with a “latin1” character set.

How could that be?

As it turns out, the MySQL driver in Movable Type queries the database for the value of the system variable character_set_database. I suspect a server move by my host may have reset this value from “utf8” to “latin1”. All I had to do was change the value of this variable, and the question marks disappeared. For example:

-- Replace the placeholder with the name of your Movable Type database.
ALTER DATABASE `{your database name}`
DEFAULT CHARACTER SET utf8
COLLATE utf8_general_ci;
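To see what the database default is before and after making that change, a check along these lines should do. The database name here is a placeholder of mine, not something from my setup.

```sql
-- Assumption: 'your_database_name' stands in for your Movable Type database.
-- character_set_database reflects the current default database, so select it first.
USE your_database_name;
SHOW VARIABLES LIKE 'character_set_database';

-- The same information, read from the schema metadata.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'your_database_name';
```

Note that changing the database default only affects tables created afterwards; existing tables and columns keep their own character sets. In my case that was fine, because the stored data was already correct.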

Key take-aways: Movable Type executes a SET NAMES command in MySQL based on the configured character set of your database. If that character set is wrong, your Movable Type admin interface may display characters as question marks, even though UTF-8 headers are being sent to the browser.
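The effect can be reproduced by hand in a mysql session. This is a rough sketch, assuming a UTF-8 terminal and the stock mt_entry table; it is not the statement sequence Movable Type itself runs.

```sql
-- Mimic a connection that announces latin1.
SET NAMES latin1;

-- character_set_client, character_set_connection and character_set_results
-- should now all report latin1.
SHOW SESSION VARIABLES LIKE 'character_set_%';

-- Assumption: mt_entry / entry_title match your install.
-- Characters with no latin1 equivalent cannot be converted for the result set.
SELECT entry_title FROM mt_entry LIMIT 5;

-- Switch the connection to utf8 and run the same query again for comparison.
SET NAMES utf8;
SELECT entry_title FROM mt_entry LIMIT 5;
```

With latin1 announced, I'd expect the Japanese titles to come back as question marks, because the server substitutes “?” for anything it cannot represent in the result character set. The page header never enters into it.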