Description
First of all, special thanks to all the developers for the progress made!
Calling mb_detect_encoding() with null as the second argument may return a very strange result, whereas passing mb_detect_order() instead fixes the issue. According to the official PHP documentation, the two should behave the same.
I spent about four hours on this but could not figure out exactly what the problem is. I could not even create a small reproducible example; maybe the problem is environment- or context-dependent.
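The comparison described above can be sketched as a small probe. This is only a sketch under stated assumptions: the helper name compare_detection and the sample input are hypothetical, and, as noted, a small script like this may well not reproduce the problem outside the original environment.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run mb_detect_encoding() in strict mode twice,
 * once with null and once with mb_detect_order() as the candidate list,
 * and return both results for comparison.
 */
function compare_detection(string $code): array
{
    return [
        'null_list'      => mb_detect_encoding($code, null, true),
        'explicit_order' => mb_detect_encoding($code, mb_detect_order(), true),
    ];
}

// Sample input is an assumption; substitute the text that misbehaves.
$result = compare_detection("<?php echo 1 + 2;\n");

printf("PHP %s\n", PHP_VERSION);
var_dump(mb_detect_order());   // the default candidate list in this environment
var_dump($result);             // both detection results
var_dump($result['null_list'] === $result['explicit_order']);
```

Running it under both PHP 8.0 and PHP 8.1 and comparing the output would show whether the two calls diverge for a given input.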
Steps to Reproduce
- Clone Serenata and navigate to its directory. Check out the commit d3c9dcb3426a9b5ffe442a436e2179063ea6c9d7 (current master).
- Run composer install.
- Run ./vendor/bin/phpunit.
You should see a lot of failures (346). So what is going on? Edit the file src/Analysis/SourceCodeReading/TextToUtf8Converter.php, go to line 15, and change it from
$encoding = mb_detect_encoding($code, null, true);
to
$encoding = mb_detect_encoding($code, mb_detect_order(), true);
Now re-run ./vendor/bin/phpunit. With this simple change, the issue is (almost) fixed and only one failure remains.
I'm pretty sure that this is a new bug introduced in PHP 8.1. I compiled both PHP 8.0.19 and 8.1.8 the same way, with default configurations, and ran PHPUnit under both.
Under PHP 8.0, passing either value as the second argument works perfectly, and ALL tests pass. Under PHP 8.1, I expect $encoding to hold either ASCII or UTF-8 (or maybe even false), but it strangely equals UUENCODE. The one remaining failed test after the change is also related to mb_detect_encoding(): I expect the same values there, but it randomly returns UUENCODE or UTF-7, causing the + signs in the string to disappear after mb_convert_encoding() (line 22 of the file).
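To illustrate why a wrong detection result matters for the later conversion step, here is a minimal sketch that converts the same text to UTF-8 while pretending detection returned different encodings. The helper name convert_as and the sample string are hypothetical, and this is not the code from TextToUtf8Converter.php; the output is deliberately not stated here, since it should be checked on the affected PHP build.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: convert $text to UTF-8 as if detection had
 * returned $detected as the source encoding.
 */
function convert_as(string $text, string $detected): string|false
{
    return mb_convert_encoding($text, 'UTF-8', $detected);
}

$text = 'a + b'; // sample input containing a plus sign (an assumption)

// Compare the conversion for the expected encoding and a mis-detected one.
foreach (['UTF-8', 'UTF-7'] as $detected) {
    printf("%-6s => %s\n", $detected, var_export(convert_as($text, $detected), true));
}
```

In UTF-7 the + character introduces an encoded sequence, so text wrongly treated as UTF-7 can plausibly lose its plus signs during conversion, which would match the remaining failure described above.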
PHP Version
PHP 8.1.7, PHP 8.1.8
Operating System
Fedora Workstation 36