欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  后端开发

php的汉字转换:GBK至Unicode(UTF8)_PHP

程序员文章站 2022-05-06 09:02:47
...
php的汉字转换一直是比较麻烦的事

该类内置了四个函数"htmlHex","htmlDec","escape","u2utf8"
方便用户的使用,同时也可自定义函数进行自己喜欢的操作

qswhGBK.php 从这里下载
http://www.blueidea.com/user/qswh/qswhGBK.zip


class qswhGBK{
var $qswhData;
function qswhGBK($filename="qswhGBK.php"){
$this->qswhData=file($filename);
}
function gb2u($gb,$callback=""){
/******(qiushuiwuhen 2002-8-15)******/
$ret="";
for($i=0;$iif(($p=ord(substr($gb,$i,1)))>127){

$q=ord(substr($gb,++$i,1));
$q=($q-($q>128?65:64))*4;
$q=substr($this->qswhData[$p-128],$q,4);
}
else
$q=dechex($p);
if(empty($callback))
$ret.=$q;
else {
$arr=array("htmlHex","htmlDec","escape","u2utf8");
if(is_integer($callback)){
if($callback>count($arr))die("Invalid Function");
$ret.=$this->$arr[$callback-1]($q);
}else
$ret.=$callback($q);
}
}
return $ret;
}

function htmlHex($str){
return "".$str.";";
}

function htmlDec($str){
return "".hexdec($str).";";
}

function escape($str){
return hexdec($str)}

function u2utf8($str){
/******(qiushuiwuhen 2002-8-15)******/
$sp="!'()*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~";
$dec=hexdec($str);
$bin=decbin($dec);
$len=strlen($bin);
$arr=array("c0","e0","f0");
if($dec>0x7f){
$ret="";
for($i=$len,$j=-1;$i>=0;$i-=6,$j++){
if($i>6)
$ret="%".dechex(0x80+bindec(substr($bin,$i-6,6))).$ret;
else
$ret="%".dechex(hexdec($arr[$j])+bindec(substr($bin,0,6-$i))).$ret;
}
}else{
if(strpos($sp,chr($dec)))
$ret=chr($dec);
else
$ret="%".strtolower($str);
}
return $ret;
}
}


使用范例

$words="中文Abc";
function ex($str){return "[".$str."]";}


$qswh=new qswhGBK("qswhGBK.php");//如果文件名是qswhGBK.php,可省参数

echo("

不带参数:".$qswh-&gt;gb2u($words)); <br>echo("\n调用内置函数htmlHex:".$qswh-&gt;gb2u($words,1)); <br>echo("\n调用内置函数htmlDec:".$qswh-&gt;gb2u($words,2)); <br>echo("\n调用内置函数escape:".$qswh-&gt;gb2u($words,3)); <br>echo("\n调用内置函数u2utf8:".$qswh-&gt;gb2u($words,4)); <br>echo("\n调用自定义函数:".$qswh-&gt;gb2u($words,ex)); <br><p><br>效果如下: <br><br>不带参数:4E2D6587416263 <br>调用内置函数htmlHex:中文Abc <br>调用内置函数htmlDec:中文Abc <br>调用内置函数escape:%u4E2D%u6587Abc <br>调用内置函数u2utf8:%e4%b8%ad%e6%96%87Abc <br>调用自定义函数:[4E2D][6587][41][62][63]</p>