
Converting Character Encodings to UTF-8 in PHP

Character encodings are always "fun" to deal with. To the fortunate and naive among you who believe that UTF-8 solved every pain in the multilingual encoding arena, let me tell you: you are lucky to live in your surreal world.

Today I was trying to hook up a PHP application to a web service wrapped around a legacy application. Unfortunately, that web service could only send me ISO-8859-15-encoded output. Since the guy who built the service had already been through enough pain scripting it in Lotus Notes Scripting Language, I did not dare ask him to fix the problem. Nor do I know (or want to know) enough about Lotus Notes to assume that a fix was possible at all.

So I tried fixing the problem on the PHP side. Now, PHP does have a nice function called utf8_encode that converts ISO-8859-1 strings to UTF-8. A no-brainer, you may say? Well, not quite. My input was ISO-8859-15. The bratty "5" at the end stands for a handful of extra characters, mainly used in French and Finnish but, in my case, popping up in Turkish. Since utf8_encode treats every input byte as ISO-8859-1, exactly those characters come out wrong.
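One way to handle this on the PHP side, sketched here under the assumption that the iconv extension is available, is to name the source encoding explicitly instead of relying on utf8_encode. The helper name latin9ToUtf8 is made up for illustration and is not from any library.

```php
<?php
// Hypothetical helper: converts an ISO-8859-15 (Latin-9) byte string to UTF-8.
// Assumes the iconv extension is loaded.
function latin9ToUtf8(string $input): string
{
    // iconv() returns false when the conversion fails.
    $converted = iconv('ISO-8859-15', 'UTF-8', $input);
    if ($converted === false) {
        throw new RuntimeException('Conversion from ISO-8859-15 failed');
    }
    return $converted;
}

// Byte 0xA4 is the euro sign in ISO-8859-15,
// but the generic currency sign in ISO-8859-1.
$raw = "10 \xA4";
echo latin9ToUtf8($raw), PHP_EOL; // prints "10 €"
```

If mbstring is installed, mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-15') should give the same result. Either way, the point is to state the real source encoding rather than let PHP assume ISO-8859-1.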
