Dennis Robinson from basnetworks dot net
16 years ago
I wanted to use the tokenizer functions to count source lines of code, including comment lines. Doing this with regular expressions does not work well, because a regular expression cannot easily tell a real comment from, for example, a /* that appears inside a string literal. The token_get_all() function makes the task easy by detecting all comments properly. However, it does not return newline characters as tokens of their own; they are folded into T_WHITESPACE and other tokens. I wrote the set of functions below to also emit each newline as its own token, T_NEW_LINE.

<?php

define('T_NEW_LINE', -1);

function token_get_all_nl($source)
{
    $new_tokens = array();

    // Get the tokens
    $tokens = token_get_all($source);

    // Split newlines into their own tokens
    foreach ($tokens as $token)
    {
        $token_name = is_array($token) ? $token[0] : null;
        $token_data = is_array($token) ? $token[1] : $token;

        // Do not split encapsed strings or multiline comments
        if ($token_name == T_CONSTANT_ENCAPSED_STRING || substr($token_data, 0, 2) == '/*')
        {
            $new_tokens[] = array($token_name, $token_data);
            continue;
        }

        // Split the data up by newlines
        $split_data = preg_split('#(\r\n|\n)#', $token_data, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

        foreach ($split_data as $data)
        {
            if ($data == "\r\n" || $data == "\n")
            {
                // This is a new line token
                $new_tokens[] = array(T_NEW_LINE, $data);
            }
            else
            {
                // Add the token under the original token name
                $new_tokens[] = is_array($token) ? array($token_name, $data) : $data;
            }
        }
    }

    return $new_tokens;
}

function token_name_nl($token)
{
    if ($token === T_NEW_LINE)
    {
        return 'T_NEW_LINE';
    }

    return token_name($token);
}

?>
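
To see why the extra step is needed, here is a minimal sketch of what plain token_get_all() returns. The sample source string is made up for illustration. The /* inside the string literal should come back as part of a T_CONSTANT_ENCAPSED_STRING token rather than as a comment, and the line breaks should show up inside T_OPEN_TAG and T_WHITESPACE tokens instead of as tokens of their own.

```php
<?php

// Hypothetical sample: "/*" appears inside a string literal,
// followed by a real block comment on the next line.
$source = "<?php\n\$a = '/* not a comment */';\n/* real comment */\n";

$names = array();

foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // Array tokens are array(id, text, line number)
        $names[] = token_name($token[0]);
        echo token_name($token[0]), ': ', json_encode($token[1]), "\n";
    } else {
        // Single-character tokens such as "=" or ";" are plain strings
        echo json_encode($token), "\n";
    }
}
```

json_encode() is used here only to make the embedded \n characters visible in the output.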

Example usage:

<?php

$tokens = token_get_all_nl(file_get_contents('somecode.php'));

foreach ($tokens as $token)
{
    if (is_array($token))
    {
        // Escape the token text so markup inside it stays visible in HTML output
        echo (token_name_nl($token[0]) . ': "' . htmlspecialchars($token[1]) . '"<br />');
    }
    else
    {
        echo ('"' . htmlspecialchars($token) . '"<br />');
    }
}

?>

I'm sure you can figure out how to count lines of code and lines of comments with these functions. This was a huge improvement on my previous attempt at counting lines of code with regular expressions. I hope this helps someone, as many of the user-contributed examples on this website have helped me in the past.
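
For anyone who wants a starting point, here is one possible sketch of the counting step. It is not the original author's code. The helper name count_source_lines() and its classification rules are assumptions: a line that contains any code counts as code (even with a trailing comment), a line with only a comment counts as a comment line, and the opening tag line counts as code. A condensed copy of token_get_all_nl() is included only so the sketch runs on its own.

```php
<?php

// Condensed copy of the note's helper; skipped if it is already loaded.
if (!defined('T_NEW_LINE')) {
    define('T_NEW_LINE', -1);
}

if (!function_exists('token_get_all_nl')) {
    function token_get_all_nl($source)
    {
        $new_tokens = array();
        foreach (token_get_all($source) as $token) {
            $id   = is_array($token) ? $token[0] : null;
            $text = is_array($token) ? $token[1] : $token;
            if ($id === T_CONSTANT_ENCAPSED_STRING || substr($text, 0, 2) == '/*') {
                $new_tokens[] = array($id, $text);
                continue;
            }
            $parts = preg_split('#(\r\n|\n)#', $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
            foreach ($parts as $part) {
                if ($part == "\r\n" || $part == "\n") {
                    $new_tokens[] = array(T_NEW_LINE, $part);
                } else {
                    $new_tokens[] = is_array($token) ? array($id, $part) : $part;
                }
            }
        }
        return $new_tokens;
    }
}

// Hypothetical helper (not from the note): classify every physical line
// as code, comment-only, or blank.
function count_source_lines($source)
{
    $counts = array('code' => 0, 'comment' => 0, 'blank' => 0);
    $has_code = false;
    $has_comment = false;

    // Close the current line and add it to the matching bucket
    $end_line = function () use (&$counts, &$has_code, &$has_comment) {
        if ($has_code) {
            $counts['code']++;
        } elseif ($has_comment) {
            $counts['comment']++;
        } else {
            $counts['blank']++;
        }
        $has_code = false;
        $has_comment = false;
    };

    foreach (token_get_all_nl($source) as $token) {
        $id   = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_NEW_LINE) {
            $end_line();
            continue;
        }
        if ($id === T_WHITESPACE) {
            continue; // indentation and spacing do not affect the line type
        }

        $is_comment = ($id === T_COMMENT || $id === T_DOC_COMMENT);

        // Unsplit tokens (block comments, quoted strings) may span several lines
        foreach (explode("\n", $text) as $i => $unused) {
            if ($i > 0) {
                $end_line();
            }
            if ($is_comment) {
                $has_comment = true;
            } else {
                $has_code = true;
            }
        }
    }

    // Count a final line that has no trailing newline
    if ($has_code || $has_comment) {
        $end_line();
    }

    return $counts;
}

// Made-up sample input for illustration
$sample = "<?php\n// a comment\n\n\$x = 1; // trailing\n/* block\n   comment */\necho \$x;\n";
$counts = count_source_lines($sample);
echo "code: {$counts['code']}, comments: {$counts['comment']}, blank: {$counts['blank']}\n";
```

Other conventions are just as reasonable, for example counting a line with both code and a trailing comment in both buckets. Only the branch inside the closure would need to change.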
