Definition:
function seems_utf8($str) {}
Checks to see if a string is utf8 encoded.
NOTE: This function checks for 5-Byte sequences, UTF8 has Bytes Sequences with a maximum length of 4.
Parameters
- string $str: The string to be checked
Return values
returns:True if $str fits a UTF-8 model, false otherwise.
Source code
function seems_utf8($str) { $length = strlen($str); for ($i=0; $i < $length; $i++) { $c = ord($str[$i]); if ($c < 0x80) $n = 0; # 0bbbbbbb elseif (($c & 0xE0) == 0xC0) $n=1; # 110bbbbb elseif (($c & 0xF0) == 0xE0) $n=2; # 1110bbbb elseif (($c & 0xF8) == 0xF0) $n=3; # 11110bbb elseif (($c & 0xFC) == 0xF8) $n=4; # 111110bb elseif (($c & 0xFE) == 0xFC) $n=5; # 1111110b else return false; # Does not match any model for ($j=0; $j<$n; $j++) { # n bytes matching 10bbbbbb follow ? if ((++$i == $length) || ((ord($str[$i]) & 0xC0) != 0x80)) return false; } } return true; }
2821
No comments yet... Be the first to leave a reply!