HTML entity decode to original displayed characters in PHP

Recently I ran into a problem while fetching data from some websites and storing it in my database: some special characters came back encoded as HTML entities, as shown below.

  1. the character displayed was ‘ but I got &#8216; when I fetched it using simplehtmldom
  2. the character displayed was ’ but I got &#8217; when I fetched it using simplehtmldom

After searching for more than 8 hours, I found a very simple and straightforward solution to convert these characters back to the originally displayed characters:

/* This is the ‘text’ I fetched */
$input = "This is the &#8216;text&#8217; I fetched";
// Decode each numeric entity (e.g. &#8216;) to its UTF-8 character.
$output = preg_replace_callback("/(&#[0-9]+;)/", function ($m) {
    return mb_convert_encoding($m[1], "UTF-8", "HTML-ENTITIES");
}, $input);
echo $output;

The code above converted the encoded special characters back to the characters originally displayed in the HTML.
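
For what it's worth, PHP's built-in html_entity_decode() can probably do the same job in a single call, and it also decodes named entities such as &amp;. Below is a minimal sketch, assuming the fetched text should end up as UTF-8; the sample string is only an illustration. It may be the safer choice on newer PHP versions, since the "HTML-ENTITIES" encoding used by mb_convert_encoding() is deprecated as of PHP 8.2.

```php
<?php
// Illustrative input: numeric entities as they might arrive from a scraped page.
$input = "This is the &#8216;text&#8217; I fetched";

// ENT_QUOTES also decodes single- and double-quote entities;
// ENT_HTML5 selects the HTML5 table of named entities.
// The third argument sets the output encoding (assumed UTF-8 here).
$output = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $output, PHP_EOL;
```

Unlike the regular-expression approach, this decodes every entity in the string, so it is worth checking that nothing in the input is meant to stay encoded.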

Reference: http://jp2.php.net/manual/en/function.html-entity-decode.php#104617

 
