
Broadcasting tests #1


Closed
asmeurer opened this issue Aug 28, 2020 · 8 comments

@asmeurer (Member)

Here is the spec for broadcasting: https://github.com/data-apis/array-api/blob/master/spec/API_specification/broadcasting.md.

Here are some questions about it:

  • How do I create input arrays for the tests? The array object document is empty: https://github.com/data-apis/array-api/blob/master/spec/API_specification/array_object.md. Do we at least have enough of an idea of what it will look like that I can create tests?

  • What is the best way to test broadcasting? The simplest would be to use a function like numpy.broadcast_arrays or numpy.broadcast_to, but these aren't listed in the spec. And even NumPy doesn't have a function that directly implements the shape broadcasting algorithm - it can only be applied to actual arrays. The spec says broadcasting should apply to all elementwise operations. What is a good elementwise operation we can use to test only the broadcasting semantics? Or should we make sure to test all of them?

  • The spec doesn't actually specify how the resulting broadcast array should look, only what its shape is. Is this intentional? Should we test it? If not, it means we don't actually test the result of a broadcast operation, only that the shapes/errors are correct.

  • As I understand it, "potentially enable more memory-efficient element-wise operations" means that broadcasting does not necessarily need to be done in a memory-efficient way, i.e., libraries are free to copy axes across broadcast dimensions rather than using something like a stride trick.

@rgommers (Member)

How do I create input arrays for the test? The array object document is empty https://github.com/data-apis/array-api/blob/master/spec/API_specification/array_object.md. Do we have at least enough an idea of what that will look like so I can create tests?

See my answer on gh-2.

What is the best way to test broadcasting? The simplest would be to use a function like numpy.broadcast_arrays or numpy.broadcast_to, but these aren't listed in the spec.

I think broadcasting is a behaviour that needs testing for every function with broadcastable inputs, so it's a big job. For now I'd take the one or handful of functions you'll start with, and see if you can write a broadcast test with shape logic that can be reused.

And even NumPy doesn't have a function that directly implements the shape broadcasting algorithm

Yes, this is a pain; we really should have such a function in NumPy, and I think we need it here. The algorithm isn't actually all that hard - I suspect there's a pure-Python implementation floating around somewhere.

it can only be done to explicit arrays. The spec says broadcasting should apply to all elementwise operations. What is a good elementwise operation that we can use to test only the broadcasting semantics? Or should we make sure to test all of them?

add?
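For instance, a broadcast test built on add might look like this (a hedged sketch, not spec text; NumPy stands in for the library under test, and `assert_add_broadcasts` is a hypothetical helper name):

```python
# Sketch: exercise broadcasting semantics through a single elementwise
# operation (add), using NumPy as a stand-in for the library under test.
# assert_add_broadcasts is a hypothetical helper, not part of any spec.
import numpy as np

def assert_add_broadcasts(shape1, shape2, expected_shape):
    result = np.ones(shape1) + np.ones(shape2)
    assert result.shape == expected_shape

# Shapes taken from the examples in the spec's broadcasting section
assert_add_broadcasts((8, 1, 6, 1), (7, 1, 5), (8, 7, 6, 5))
assert_add_broadcasts((15, 3, 5), (3, 1), (15, 3, 5))
```

The same shape cases could then be reused, with the operation swapped out, for every other elementwise function.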

  • The spec doesn't actually specify how resulting broadcast array should look, only what its shape is. Is this intentional? Should we test this? If not, it means we don't actually test the result of a broadcasted operation, only that the shape/errors are correct.

Ah, I had to read that section again; you're right - it's missing a statement on how values in the array being broadcast are "virtually duplicated" (not sure that's great wording).

  • As I understand it, "potentially enable more memory-efficient element-wise operations" means that broadcasting does not necessarily need to be done in a memory-efficient way, i.e., libraries are free to copy axes across broadcast dimensions rather than using something like a stride trick.

Indeed. Strides are explicitly out of scope - there are array libraries that do not have a strided implementation (e.g., JAX, Dask). That's an implementation detail, and the spec tries to be careful not to specify anything that puts unnecessary constraints on any library - JIT compilers, lazy evaluation, distributed arrays, GPU arrays, etc. should all be possible.

@kgryte commented Aug 29, 2020

Re: what an array looks like. That is covered, perhaps too obliquely, here:

Given an element-wise operation involving two compatible arrays, an array having a singleton dimension (i.e., a dimension whose size is one) is broadcast (i.e., virtually repeated) across an array having a corresponding non-singleton dimension.

The key phrase being "virtually repeated".

It may be worthwhile for me to add an example to the spec doc showing what the broadcasting algorithm means in practice.
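For example, the following NumPy sketch (illustrative only, not proposed spec text) shows what "virtually repeated" amounts to - broadcasting a singleton dimension produces the same values as explicitly tiling it:

```python
# Illustration of "virtually repeated": broadcasting yields the same
# values as explicitly tiling the singleton/missing dimensions.
import numpy as np

a = np.array([[1], [2], [3]])   # shape (3, 1)
b = np.array([10, 20])          # shape (2,)

result = a + b                  # broadcast to shape (3, 2)
expected = np.tile(a, (1, 2)) + np.tile(b, (3, 1))

assert result.shape == (3, 2)
assert (result == expected).all()
assert result.tolist() == [[11, 21], [12, 22], [13, 23]]
```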

@asmeurer (Member, Author) commented Sep 1, 2020

Yes this is a pain, we really should have such a function in NumPy. I think we need it here. The algorithm isn't actually all that hard - I suspect there's an implementation in pure Python floating around somewhere.

I was actually going to open an issue on NumPy about this some time ago, but never got around to it. I've done so here numpy/numpy#17217.
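(Note for later readers: that request was eventually implemented - NumPy 1.20 added numpy.broadcast_shapes, which applies the broadcasting algorithm directly to shape tuples:)

```python
# Requires NumPy >= 1.20, where numpy.broadcast_shapes was added.
import numpy as np

# broadcast_shapes works on shape tuples directly, no arrays needed.
assert np.broadcast_shapes((8, 1, 6, 1), (7, 1, 5)) == (8, 7, 6, 5)
assert np.broadcast_shapes((5, 4), (1,)) == (5, 4)
```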

@asmeurer (Member, Author) commented Sep 1, 2020

I believe there is a bug in the broadcasting pseudocode: implementing it directly, I get IndexErrors on the test cases. I believe the correct code should be:

i. If N1-N+i >= 0, let d1 be the size of dimension N1 - N + i for array A (i.e., the result of shape1[N1 - N + i]); else, let d1 be 1.

ii. If N2-N+i >= 0, let d2 be the size of dimension N2 - N + i for array B (i.e., the result of shape2[N2 - N + i]); else, let d2 be 1.

(Also note that the spec says n on these lines where I believe it means i.)

Here is the function as defined in the spec:

class BroadcastError(Exception):
    pass

def broadcast_shapes(shape1, shape2):
    """
    Broadcast shapes `shape1` and `shape2`.

    The code in this function should follow the pseudocode in the spec as
    closely as possible.
    """
    N1 = len(shape1)
    N2 = len(shape2)
    N = max(N1, N2)
    shape = [None]*N
    i = N - 1
    while i >= 0:
        if N1 - N + i >= 0:
            d1 = shape1[i]  # per the spec's pseudocode; raises IndexError when N1 < N
        else:
            d1 = 1
        if N2 - N + i >= 0:
            d2 = shape2[i]  # per the spec's pseudocode; raises IndexError when N2 < N
        else:
            d2 = 1

        if d1 == 1:
            shape[i] = d2
        elif d2 == 1:
            shape[i] = d1
        elif d1 == d2:
            shape[i] = d1
        else:
            raise BroadcastError

        i = i - 1

    return tuple(shape)

And here is what I believe is the correct version:

class BroadcastError(Exception):
    pass

def broadcast_shapes(shape1, shape2):
    """
    Broadcast shapes `shape1` and `shape2`.

    The code in this function should follow the pseudocode in the spec as
    closely as possible.
    """
    N1 = len(shape1)
    N2 = len(shape2)
    N = max(N1, N2)
    shape = [None]*N
    i = N - 1
    while i >= 0:
        if N1 - N + i >= 0:
            d1 = shape1[N1 - N + i] # This line is different
        else:
            d1 = 1
        if N2 - N + i >= 0:
            d2 = shape2[N2 - N + i] # This line is different
        else:
            d2 = 1

        if d1 == 1:
            shape[i] = d2
        elif d2 == 1:
            shape[i] = d1
        elif d1 == d2:
            shape[i] = d1
        else:
            raise BroadcastError

        i = i - 1

    return tuple(shape)

And here are the test cases from the spec:

from pytest import raises

def test_broadcast_shapes_explicit_spec():
    """
    Explicit broadcast shapes examples from the spec
    """
    shape1 = (8, 1, 6, 1)
    shape2 = (7, 1, 5)
    result = (8, 7, 6, 5)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (5, 4)
    shape2 = (1,)
    result = (5, 4)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (5, 4)
    shape2 = (4,)
    result = (5, 4)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (15, 3, 5)
    shape2 = (15, 1, 5)
    result = (15, 3, 5)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (15, 3, 5)
    shape2 = (3, 5)
    result = (15, 3, 5)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (15, 3, 5)
    shape2 = (3, 1)
    result = (15, 3, 5)
    assert broadcast_shapes(shape1, shape2) == result

    shape1 = (3,)
    shape2 = (4,)
    raises(BroadcastError, lambda: broadcast_shapes(shape1, shape2)) # dimension does not match

    shape1 = (2, 1)
    shape2 = (8, 4, 3)
    raises(BroadcastError, lambda: broadcast_shapes(shape1, shape2)) # second dimension does not match

    shape1 = (15, 3, 5)
    shape2 = (15, 3)
    raises(BroadcastError, lambda: broadcast_shapes(shape1, shape2)) # singleton dimensions can only be prepended, not appended
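As an extra sanity check (not one of the spec's test cases), the corrected algorithm can be cross-checked against NumPy's actual broadcasting behavior; the condensed `broadcast_shapes` below restates the corrected version from above:

```python
import numpy as np

def broadcast_shapes(shape1, shape2):
    # Condensed restatement of the corrected algorithm above
    N1, N2 = len(shape1), len(shape2)
    N = max(N1, N2)
    shape = []
    for i in range(N):
        d1 = shape1[N1 - N + i] if N1 - N + i >= 0 else 1
        d2 = shape2[N2 - N + i] if N2 - N + i >= 0 else 1
        if d1 == 1 or d2 == 1 or d1 == d2:
            shape.append(max(d1, d2))
        else:
            raise ValueError("shapes are not broadcast-compatible")
    return tuple(shape)

# Cross-check: the computed shape must match what NumPy actually produces
for s1, s2 in [((8, 1, 6, 1), (7, 1, 5)), ((5, 4), (4,)), ((15, 3, 5), (3, 1))]:
    assert broadcast_shapes(s1, s2) == (np.ones(s1) + np.ones(s2)).shape
```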

@kgryte commented Sep 1, 2020

@asmeurer Thanks for catching this. Will submit a patch to the spec shortly.

asmeurer added a commit that referenced this issue Sep 1, 2020
@kgryte commented Sep 2, 2020

@asmeurer Patch submitted: data-apis/array-api#30.

@asmeurer (Member, Author) commented Oct 4, 2021

We have tests for broadcasting, but they only cover the elementwise functions. We need to extend them to the operators (including the in-place operator logic), as well as other functions. Cf. #24

@honno (Member) commented Dec 22, 2021

We have tests for broadcasting but they only cover the elementwise functions. We need to extend it to operators (including the in-place operator logic), as well as other functions. c.f. #24

Fixed in #35.

The suite now tests broadcasting for all relevant functions in the current spec, so closing.

honno closed this as completed Dec 22, 2021