seraclimov at yandex dot ru
7 years ago
You can't simply print the individual characters of a text encoded in a multibyte character set this way, because fgetc() reads one byte at a time and so splits each multibyte character into its separate bytes. Consider this example:

<?php
$path = 'foo/cyrillic.txt';
$handle = fopen($path, 'rb');
while (FALSE !== ($ch = fgetc($handle))) {
    $curs = ftell($handle); // offset just past the byte that was read
    print "[$curs]: $ch\n";
}
/* The result will be something like this:
<
[1]: <
[2]: h
[3]: 2
[4]: >
[5]: �
[6]: �
[7]: �
[8]: �
[9]: �
[10]: �
[11]:
[12]: �
[13]: �
[14]: �
[15]: �
[16]: �
*/
?>

I don't think this is the best approach, but it can serve as a workaround:
<?php
$path = 'path/to/your/file.ext';

if (!$handle = fopen($path, 'rb')) {
    echo "Can't open ($path) file";
    exit;
}

$mbch = ''; // keeps the first byte of 2-byte cyrillic letters
while (FALSE !== ($ch = fgetc($handle))) {
    // check for the first byte of a 2-byte cyrillic letter
    if (empty($mbch) && (FALSE !== array_search(ord($ch), Array(208, 209, 129)))) {
        $mbch = $ch; // keep the first byte
        continue;
    }
    $curs = ftell($handle);
    print "[$curs]: " . $mbch . $ch . PHP_EOL;
    // or print "[$curs]: $mbch$ch\n";
    if (!empty($mbch)) $mbch = ''; // erase the byte after using
}
?>
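
The workaround above only recognizes a few hard-coded byte values. If the file is known to be UTF-8, the same idea can be generalized by working out the sequence length from the high bits of the first byte. The following is only a sketch under that assumption: fgetc_utf8() is a hypothetical helper name, not a PHP built-in, and it does not validate the continuation bytes.

```php
<?php
/**
 * Read one UTF-8 character from a stream.
 * Hypothetical helper (not part of PHP); assumes the stream holds UTF-8.
 * Returns false at end of file.
 */
function fgetc_utf8($handle)
{
    $lead = fgetc($handle);
    if ($lead === false) {
        return false;
    }

    // The high bits of the first byte encode how many bytes follow it.
    $byte = ord($lead);
    if ($byte < 0x80) {
        $extra = 0; // 0xxxxxxx: single-byte (ASCII) character
    } elseif (($byte & 0xE0) === 0xC0) {
        $extra = 1; // 110xxxxx: 2-byte sequence (e.g. Cyrillic)
    } elseif (($byte & 0xF0) === 0xE0) {
        $extra = 2; // 1110xxxx: 3-byte sequence
    } elseif (($byte & 0xF8) === 0xF0) {
        $extra = 3; // 11110xxx: 4-byte sequence
    } else {
        $extra = 0; // stray continuation or invalid byte: returned as is
    }

    $char = $lead;
    for ($i = 0; $i < $extra; $i++) {
        $next = fgetc($handle);
        if ($next === false) {
            break; // sequence truncated by end of file
        }
        $char .= $next;
    }
    return $char;
}

// Usage: an in-memory stream stands in for a real file here.
$handle = fopen('php://memory', 'w+b');
fwrite($handle, "<h2>Привет");
rewind($handle);

$chars = [];
while (false !== ($ch = fgetc_utf8($handle))) {
    $chars[] = $ch;
}
fclose($handle);

print implode(' | ', $chars) . PHP_EOL;
```

With the sample string this prints each character whole (< | h | 2 | > | П | р | и | в | е | т) instead of two broken bytes per Cyrillic letter. If the whole file fits in memory and the mbstring extension is available, reading it with file_get_contents() and splitting it with mb_str_split() (PHP 7.4+) is probably simpler.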
