array_(merge/replace)_recursive enhancements #13283

Open
@gzhegow1991

Description

  1. When you call array_merge_recursive, it behaves a little unexpectedly:
<?php

$arr1 = [ 1 => [ 2 => [ 1, 2 ]]];
$arr2 = [ 1 => [ 2 => [ 3, 4 ]]];

var_dump(array_merge_recursive($arr1, $arr2));

You would probably expect $arr = [ 1 => [ 2 => [ 1, 2, 3, 4 ] ] ];

But you receive:

array(2) {
  [0]=>
  array(1) {
    [2]=>
    array(2) {
      [0]=>
      int(1)
      [1]=>
      int(2)
    }
  }
  [1]=>
  array(1) {
    [2]=>
    array(2) {
      [0]=>
      int(3)
      [1]=>
      int(4)
    }
  }
}
  2. Sometimes you need an array_add_recursive function that works exactly like $arr = $arr1 + $arr2 (if a key is not found, add it), and no such function exists.

  3. array_replace_recursive works well: it replaces existing keys with new ones, so if you want array_add_recursive you can call array_replace_recursive with the arguments in reverse order.

  4. A very rare case: you need to keep all the values from both arrays at the same paths, and preserve the array path to each value. This comes up when you deduplicate users in a database (the data usually arrives from several sources, and each user has to be merged with the data from all of their duplicates).
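
The built-in behaviour described in the points above can be checked with a short script. This is a minimal sketch that uses only documented built-ins; the variable names and sample data are illustrative and do not come from the original report. It shows that array_merge_recursive merges string keys recursively but renumbers and appends integer keys, and that array_replace_recursive with the "base" array passed first behaves like a recursive version of the + operator.

```php
<?php
// Integer keys: renumbered and appended, so nothing is merged under key 1.
$intKeys = array_merge_recursive(
    [ 1 => [ 2 => [ 1, 2 ] ] ],
    [ 1 => [ 2 => [ 3, 4 ] ] ]
);
// $intKeys === [ 0 => [ 2 => [ 1, 2 ] ], 1 => [ 2 => [ 3, 4 ] ] ]

// String keys: merged recursively, and the inner lists are concatenated.
$strKeys = array_merge_recursive(
    [ 'a' => [ 'b' => [ 1, 2 ] ] ],
    [ 'a' => [ 'b' => [ 3, 4 ] ] ]
);
// $strKeys === [ 'a' => [ 'b' => [ 1, 2, 3, 4 ] ] ]

// "Add" semantics: keep what $config already has, fill in what is missing.
$defaults = [ 'db' => [ 'host' => 'localhost', 'port' => 3306 ], 'debug' => false ];
$config   = [ 'db' => [ 'host' => 'example.test' ] ];

// The + operator only looks at the top level, so 'port' is not added.
$shallow = $config + $defaults;
// $shallow === [ 'db' => [ 'host' => 'example.test' ], 'debug' => false ]

// Reversed argument order gives the recursive equivalent of $config + $defaults.
$withDefaults = array_replace_recursive($defaults, $config);
// $withDefaults === [ 'db' => [ 'host' => 'example.test', 'port' => 3306 ], 'debug' => false ]
```

Note that with the reversed call the key order follows the first argument ($defaults here), whereas + keeps the keys of its left operand first.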

===

So my solution:

<?php
/**
 * @param array     $src
 * @param array     $path
 * @param bool|null $found
 *
 * @return mixed
 *
 * @throws \RuntimeException
 */
function &_array_ref(array &$src, array $path, ?bool &$found = null) // : &mixed
{
    // Throw on a missing path only when the caller did not pass $found.
    $isThrow = func_num_args() !== 3;

    // $_path = _array_path($path); // > gzhegow: string paths could be supported (e.g. via explode()), but that is not recommended
    $_path = $path;

    $ref =& $src;

    $found = true;
    while ( null !== key($_path) ) {
        $p = array_shift($_path);

        if (! array_key_exists($p, $ref)) {
            $found = false;

            unset($ref);
            $ref = null;

            if (! $isThrow) {
                break;

            } else {
                throw new \RuntimeException(
                    "Missing key in array: "
                    . var_export($p, true)
                    . " / " . implode('.', $path)
                );
            }
        }

        $ref =& $ref[ $p ];

        if ((! is_array($ref)) && $_path) {
            $found = false;

            unset($ref);
            $ref = null;

            if (! $isThrow) {
                break;

            } else {
                throw new \RuntimeException(
                    "Trying to traverse scalar value: "
                    . var_export($p, true)
                    . " / " . implode('.', $path)
                );
            }
        }
    }

    return $ref;
}

/**
 * @param array $dst
 * @param array $path
 * @param mixed $value
 *
 * @return mixed
 *
 * @throws \RuntimeException
 */
function &_array_put(array &$dst, array $path, $value) // : &mixed
{
    // $_path = _array_path($path); // > gzhegow: string paths could be supported (e.g. via explode()), but that is not recommended
    $_path = $path;

    $ref =& $dst;

    while ( null !== key($_path) ) {
        $p = array_shift($_path);

        if (! array_key_exists($p, $ref)) {
            $ref[ $p ] = $_path
                ? []
                : null;
        }

        $ref =& $ref[ $p ];

        if ((! is_array($ref)) && $_path) {
            unset($ref);
            $ref = null;

            throw new \RuntimeException(
                "Trying to traverse scalar value: "
                . var_export($p, true)
                . ' / ' . implode('.', $path)
            );
        }
    }

    $ref = $value;

    return $ref;
}

/**
 * @param array $array
 *
 * @return Iterator<array, mixed>|Generator<array, mixed>
 */
function &_array_walk(array &$array) : \Generator
{
    if (! $array) return;

    // src, path
    $stack = [];
    $stack[] = [ &$array, [] ];

    while ( null !== key($stack) ) {
        $cur = array_pop($stack);

        $isArray = is_array($cur[ 0 ]);
        $isEmptyArray = $isArray && empty($cur[ 0 ]);

        if (! $isArray || $isEmptyArray) {
            yield $cur[ 1 ] => $cur[ 0 ];

            $isArray = is_array($cur[ 0 ]);
            $isEmptyArray = $isArray && empty($cur[ 0 ]);

            if ($isArray && $isEmptyArray) {
                continue;
            }
        }

        if ($isArray) {
            if ($isEmptyArray) {
                $stack[] = [ &$cur[ 0 ], $cur[ 1 ] ];

            } else {
                end($cur[ 0 ]);
                while ( null !== ($kk = key($cur[ 0 ])) ) {
                    $fullpath = $cur[ 1 ];
                    $fullpath[] = $kk;

                    $stack[] = [ &$cur[ 0 ][ $kk ], $fullpath ];

                    prev($cur[ 0 ]);
                }
            }
        }
    }
}

/**
 * @param array              $arrayList
 * @param array<string>|null $keyList
 *
 * @return Iterator<array, array>|Generator<array, array>
 */
function _array_collect(array $arrayList, ?array $keyList = null) : \Generator
{
    $keyList = $keyList ?? array_keys($arrayList);

    $generators = [];
    foreach ( $keyList as $idx => $key ) {
        if (! array_key_exists($idx, $arrayList)) {
            unset($keyList[ $idx ]);

            continue;
        }

        if (! is_array($arrayList[ $idx ])) {
            throw new \LogicException(
                'Each of `arrayList` must be array: '
                . var_export($arrayList[ $idx ], true)
            );
        }

        $generators[ $idx ] = _array_walk($arrayList[ $idx ]);
    }

    $result = [];

    $pathes = [];
    while ( $generators ) {
        foreach ( $generators as $generatorKey => $generator ) {
            /** @var \Generator $generator */

            if (! $generator->valid()) {
                unset($generators[ $generatorKey ]);

            } else {
                /** @var array $path */

                $path = $generator->key();
                $pathString = implode("\0", $path);

                if (! isset($pathes[ $pathString ])) {
                    $yield = false;

                    $values = [];
                    foreach ( $keyList as $idx => $key ) {
                        $ref =& _array_ref($arrayList[ $idx ], $path, $found);

                        if ($found) {
                            $values[ $key ] = $ref;

                            $yield = true;
                        }

                        // Only break the reference: assigning null through $ref
                        // would overwrite the value inside $arrayList.
                        unset($ref);
                    }

                    if ($yield) {
                        yield $path => $values;
                    }

                    $pathes[ $pathString ] = true;
                }

                $generator->next();
            }
        }
    }

    return $result;
}


$arrays = [];
$arrays[] = [
    '1' => [
        '1.1' => [
            '1.1.1' => '1.1.1',
            '1.1.2' => '1.1.2',

            '1.1.3' => '1.1.3',
            '1.1.4' => [
                '1.1.4.1' => '1.1.4.1',
            ],
            '1.1.5' => [
                '1.1.5',
                '1.1.5.1' => '1.1.5.1',
            ],
        ],
    ],
];
$arrays[] = [
    '1' => [
        '1.1' => [
            // '1.1.1' => null,
            '1.1.2' => '1.1.2: 2',

            '1.1.3' => '1.1.3: 2',
            '1.1.4' => [
                '1.1.4.1' => '1.1.4.1: 2',
            ],
            '1.1.5' => [
                '1.1.5: 2',
                '1.1.5.1' => '1.1.5.1: 2',
            ],
        ],
    ],
];
$keys = [ 'first', 'second' ];

$result = [];
foreach ( _array_collect($arrays, $keys) as $path => $values ) {
    $isSingle = count($values) === 1;

    if ($isSingle) {
        // > gzhegow: use this check if you do not want to wrap plain values in arrays when there is no collision
        $value = reset($values);

    } else {
        // > add
        // foreach ( $keys as $idx => $key ) {
        //     $values[ $key ] = (array) ($values[ $key ] ?? null);
        // }
        // $value = [];
        // foreach ( $keys as $idx => $key ) {
        //     $value += $values[ $key ];
        // }
        // < add

        // > replace
        // foreach ( $keys as $idx => $key ) {
        //     $values[ $key ] = (array) ($values[ $key ] ?? null);
        // }
        // $value = array_replace(...array_values($values));
        // < replace

        // > merge
        // foreach ( $keys as $idx => $key ) {
        //     $values[ $key ] = (array) ($values[ $key ] ?? null);
        // }
        // $value = array_merge(...array_values($values));
        // < merge

        // > collect
        $value = $values;
        // < collect
    }

    _array_put($result, $path, $value);
}

var_dump($result);
// [
//     1 => [
//         '1.1' => [
//             '1.1.1' => '1.1.1',
//             '1.1.2' => [
//                 'first'  => '1.1.2',
//                 'second' => '1.1.2: 2',
//             ],
//             '1.1.3' => [
//                 'first'  => '1.1.3',
//                 'second' => '1.1.3: 2',
//             ],
//             '1.1.4' => [
//                 '1.1.4.1' => [
//                     'first'  => '1.1.4.1',
//                     'second' => '1.1.4.1: 2',
//                 ],
//             ],
//             '1.1.5' => [
//                 [
//                     'first'  => '1.1.5',
//                     'second' => '1.1.5: 2',
//                 ],
//                 '1.1.5.1' => [
//                     'first'  => '1.1.5.1',
//                     'second' => '1.1.5.1: 2',
//                 ],
//             ],
//         ],
//     ],
// ];

I guess it could be optimized, with all the PHP maintainers' experience, to be faster and more low-level, but it still works for me.
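
For the first case on its own (integer-keyed maps whose leaf lists should be concatenated), a much smaller helper may be enough. The sketch below is not a PHP built-in and is not part of the solution above: `_array_merge_keep_keys` is a hypothetical name, and the rule "concatenate only when both sides are lists" is an assumption about the desired behaviour. It relies on array_is_list(), so it needs PHP 8.1 or newer.

```php
<?php
/**
 * Hypothetical helper (not a PHP built-in): merge $b into $a, keeping the
 * integer keys of nested maps and concatenating values only when both
 * sides are lists. Requires PHP 8.1+ for array_is_list().
 */
function _array_merge_keep_keys(array $a, array $b): array
{
    // Two lists: append the second to the first, as array_merge() would.
    if (array_is_list($a) && array_is_list($b)) {
        return array_merge($a, $b);
    }

    foreach ($b as $key => $value) {
        if (is_array($value) && isset($a[$key]) && is_array($a[$key])) {
            // Both sides hold arrays under this key: merge them recursively.
            $a[$key] = _array_merge_keep_keys($a[$key], $value);
        } else {
            // Otherwise the value from $b wins, and the key is kept as-is.
            $a[$key] = $value;
        }
    }

    return $a;
}

$merged = _array_merge_keep_keys(
    [ 1 => [ 2 => [ 1, 2 ] ] ],
    [ 1 => [ 2 => [ 3, 4 ] ] ]
);
// $merged === [ 1 => [ 2 => [ 1, 2, 3, 4 ] ] ]
```

One limitation of this approach: a map whose keys happen to be 0, 1, 2, ... cannot be told apart from a list, so it would be concatenated rather than merged key by key.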
