File corruption in _php_stream_copy_to_stream_ex when using copy_file_range #10370

Closed
@johnsudaar

Description

Recently a user notified us of a corrupted file after a composer install on our platform while using PHP 8.2.1.

It seems that, under some conditions, _php_stream_copy_to_stream_ex can corrupt files.

I am not a PHP expert but here is the information I have found.

If I run the following code:

<?php
$archive = new PharData("./01fd644d9004da280847edbe58bb0346773cffaa.tar");
var_dump($archive->extractTo("./out", ["src/Versions/Version.php"]));

I get this output on some of our servers:

[14:36] Scalingo: /tmp/server-a $ php test.php 
bool(true)
[14:37] Scalingo: /tmp/server-a $ sha512sum out/src/Versions/Version.php 
ef8b2f794a94bbf45e4e4a1ed12a0edf5d403a6216cb23dcddb5fee894b7c6ee7ea73a92d5cc0a567343d9861a018f6a5439033f09e2d4ecbeb0345894b9c611  out/src/Versions/Version.php

But I expected this output instead:

[14:35] Scalingo: /tmp/server-b $ php test.php 
bool(true)
[14:37] Scalingo: /tmp/server-b $ sha512sum out/src/Versions/Version.php
1fe10db185b013570e15147991377c36afe74bae2a2b799589aac0add6e51d33c943cfc80bf57da52c2bfdd393b632ba2658b289fe0df97ae72b93a8659a96fa  out/src/Versions/Version.php

The file written on the first server is clearly corrupted, whereas the second server produces the expected output.

After some investigation, I think we have traced the issue back to the copy_file_range call in streams/streams.c.

On systems where corruption occurs, copy_file_range copies only part of the requested length, whereas on systems without corruption it copies the entire requested length.

Here are the parts of the strace output that seem relevant to me.

The strace which results in corrupted file:

openat(AT_FDCWD, "/tmp/server-a/out/src/Versions/Version.php", O_RDWR|O_CREAT|O_TRUNC, 0666) = 6
fstat(6, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
lseek(6, 0, SEEK_CUR)                   = 0
lseek(5, 73728, SEEK_SET)               = 73728
lseek(5, 73728, SEEK_SET)               = 73728
lseek(6, 0, SEEK_SET)                   = 0
copy_file_range(5, NULL, 6, NULL, 5667, 0) = 4096
fstat(5, {st_mode=S_IFREG|0600, st_size=99840, ...}) = 0
mmap(NULL, 5667, PROT_READ, MAP_SHARED, 5, 0x13000) = 0x7f6d150e6000
lseek(5, 83491, SEEK_SET)               = 83491
write(6, "Step): void\n    {\n        $this-"..., 5667) = 5667
munmap(0x7f6d150e6000, 5667)            = 0
close(6)                                = 0

The strace that results in a valid file:

openat(AT_FDCWD, "/tmp/server-b/out/src/Versions/Version.php", O_RDWR|O_CREAT|O_TRUNC, 0666) = 6
fstat(6, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
lseek(6, 0, SEEK_CUR)                   = 0
lseek(5, 73728, SEEK_SET)               = 73728
lseek(5, 73728, SEEK_SET)               = 73728
lseek(6, 0, SEEK_SET)                   = 0
copy_file_range(5, NULL, 6, NULL, 5667, 0) = 5667
close(6)                                = 0
chmod("./out/src/Versions/Version.php", 0644) = 0

In the run that produced a valid file, copy_file_range copied the entire requested length (5667 bytes).

In the corrupted run, it copied only part of the file (4096 bytes), and the code then switched to the mmap method to copy the rest of the file.

I'm neither an expert on PHP internals nor on system calls, so the following might be completely wrong. But my understanding is that the mmap path copies too much data:

mmap(NULL, 5667, PROT_READ, MAP_SHARED, 5, 0x13000) = 0x7f6d150e6000

This seems to request 5667 bytes from FD 5. But 4096 bytes have already been copied from that FD, so shouldn't we read only 5667 - 4096 = 1571 bytes?
This seems to be confirmed by the seek that comes after.
The base offset seems to be 73728 (cf. the seek before copy_file_range), so the next seek should be to 73728 + 5667 = 79395.
But according to the strace, it seeks to 83491 (73728 + 5667 + 4096).
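
As a sanity check on this arithmetic, here is a small Python sketch that recomputes the offsets from the numbers in the corrupted strace. The variable names are mine, and the total of bytes written assumes the 4096 bytes reported by copy_file_range really landed in the output file before the 5667-byte write.

```python
# Values read from the strace of the corrupted run.
BASE_OFFSET = 73728   # lseek(5, 73728, SEEK_SET) before copy_file_range
REQUESTED = 5667      # length passed to copy_file_range, mmap and write
SHORT_COPY = 4096     # value returned by copy_file_range

# Where the source position sits after the short copy; matches the
# mmap offset 0x13000 seen in the strace.
mmap_offset = BASE_OFFSET + SHORT_COPY

# Where the source should end up if exactly REQUESTED bytes are consumed.
expected_end = BASE_OFFSET + REQUESTED

# Where it ends up when the fallback copies the full REQUESTED length again.
observed_end = mmap_offset + REQUESTED

# What the fallback should have copied, and what the output file receives
# in total (assumption: the short copy was written to the destination).
remaining = REQUESTED - SHORT_COPY
bytes_written = SHORT_COPY + REQUESTED

print(hex(mmap_offset), expected_end, observed_end, remaining, bytes_written)
```

If the assumption holds, the output file ends up 4096 bytes longer than the 5667 bytes that were requested.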

One solution would be to subtract the number of bytes already copied by copy_file_range from the length passed to mmap.
Another would be to loop around copy_file_range, calling it until the entire requested length has been copied.
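
To illustrate the loop idea, here is a rough Python model. It is not the PHP C code and not a proposed patch. `capped_copy` is a hypothetical stand-in that mimics a copy_file_range call returning at most 4096 bytes, and `copy_range_fully` is a name I made up for the retry loop. The key point is that each retry asks only for the bytes still missing.

```python
import os


def capped_copy(src_fd, dst_fd, count, cap=4096):
    """Hypothetical stand-in for copy_file_range: copies at most `cap` bytes per call."""
    data = os.read(src_fd, min(count, cap))
    view = memoryview(data)
    while view:
        # os.write may write fewer bytes than given, so drain the buffer.
        written = os.write(dst_fd, view)
        view = view[written:]
    return len(data)


def copy_range_fully(src_fd, dst_fd, length, copy_chunk=capped_copy):
    """Call copy_chunk until `length` bytes are copied or the source reaches EOF."""
    copied = 0
    while copied < length:
        # Request only the remainder, never the full original length again.
        n = copy_chunk(src_fd, dst_fd, length - copied)
        if n == 0:
            # A zero-byte result is treated as end of input.
            break
        copied += n
    return copied
```

On Linux with Python 3.8 or later, `os.copy_file_range` takes the same first three arguments and returns the number of bytes copied, so it could be passed as `copy_chunk` in place of the stand-in. A real fix would also need to handle errors from the system call, which this sketch leaves out.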

What we do not understand is why copy_file_range returns 4096 on some systems. On almost all systems it copies the entire file, even though the environments are very similar (both run Ubuntu 20.04, use the same kernel branch, run the code in a Docker container, etc.). That is why it is hard to give precise instructions on how to reproduce the issue.

Additional information

On both servers PHP version is:

PHP 8.2.1 (cli) (built: Jan 16 2023 18:08:45) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.1, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.1, Copyright (c), by Zend Technologies

PHP Version

PHP 8.2.1

Operating System

Ubuntu 20.04
