Type promotion tests #2
Yes we need to write those sections still. You can assume numpy names for dtypes and array creation functions (don't use
Yes indeed. That's why the set of dtypes is quite conservative - all libraries have the listed dtypes.
I think those two bullet points are a complete specification (the first one just emphasizes that 0-D arrays are not special - they're just arrays). NumPy's rules are much more complicated. Scalars in the spec are Python scalars, the NumPy "array scalars" are not a thing.
OK, that's a bit confusing, especially since Python ints don't fit any of the given dtypes. I would use the word "native Python types" rather than scalars. And just to clarify then, what is the spec behavior on Python scalars? Is it implementation dependent? Should a Python
Native types should never upcast. That is consistent with what libraries do now, I believe. (EDIT: modulo the "no mixed types" note below.) The main thing that isn't specified well is what happens for dtypes not present in the table. Most importantly, mixed float/int tables are left out. Those are all inconsistently implemented: TensorFlow raises, PyTorch has (e.g.)

@kgryte we have that discussion in the meeting minutes, but I'm thinking we should add both the decision and the rationale to the spec, maybe in a note box that can fold out (the theme has those). WDYT?
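For reference, a minimal NumPy sketch of the "native Python types never upcast" behavior described above (this shows NumPy's semantics; other libraries may differ, which is part of what the spec aims to pin down):

```python
import numpy as np

# A Python scalar operand does not change the array's dtype.
x = np.array([1, 2], dtype=np.int8)
print((x + 1).dtype)        # Python int operand: result stays int8

y = np.array([1.0], dtype=np.float32)
print((y + 1.0).dtype)      # Python float operand: result stays float32
```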
@rgommers I don't know if we ever came to a decision regarding dtypes which are not specified in the tables. Currently, it is implementation-dependent, but I cannot recall whether we came to consensus on whether, e.g., this should remain the case or whether these situations should be "disallowed"/trigger an exception. Regardless, agreed that we should add a section explaining the rationale for not including every combination and why we have not yet sought to standardize behavior in this particular circumstance.
Yes you are totally right, don't know what I was thinking when I wrote that. If we say "must raise an exception", that is standardizing it.
What's the best way to create empty arrays (arrays with 0 in their shape)?
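In NumPy terms (which the thread says can be assumed for now), a couple of ways to get arrays with a 0 in their shape; whether the standard will mandate these exact creation functions is still an open question above:

```python
import numpy as np

a = np.empty((0,))          # 1-D array with zero elements
b = np.zeros((0, 3))        # 2-D array with zero rows

print(a.shape, a.size)      # (0,) 0
print(b.shape, b.size)      # (0, 3) 0
```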
Interesting that |
I guess I'm still very confused by what is meant here. As far as I can tell NumPy treats
If the int doesn't fit in the type it gives an
Those are all "scalar + scalar". I thought you meant "scalar + array", which is the interesting case. This type of thing, Example:
So all there is is Python
Are shape () arrays not part of the spec? I didn't see anything about that. In NumPy shape () arrays seem to behave the same as "array scalars" in my examples, with the exception of
Yes they are, but those are different from your examples. You'd construct them with
They should have casting behavior identical to arrays with different shapes.
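A NumPy sketch of the distinction being made here: a shape () array is an ordinary array, and per the spec it should promote exactly like arrays of any other shape (the 1-D case below is consistent across NumPy versions; older NumPy applied value-based casting to 0-D operands, which is the inconsistency discussed next):

```python
import numpy as np

# A zero-dimensional (shape ()) array is still an array.
z = np.asarray(1, dtype=np.int8)
print(z.shape, z.ndim)      # () 0

# For operands with dimensions, promotion follows the dtype table:
result = np.ones(3, dtype=np.int8) + np.ones(3, dtype=np.int64)
print(result.dtype)         # int64

# Note: older NumPy used value-based casting when one operand is 0-D,
# so (z + np.ones(3, dtype=np.int64)).dtype may differ there.
```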
So this is all wrong in NumPy currently:
That should give back an |
According to the spec, it should raise an exception. Or at least that's how I read it. I hope I've convinced you that the spec is extremely ambiguous here: "Non-array ("scalar") operands are not permitted to participate in type promotion." "Scalar" can mean one of three things (Python built-in, "dtype scalar", or shape () array). Nothing is defined anywhere. It doesn't even clarify what "not permitted" means. Should it be an exception? If "scalar" does mean Python built-in, what are the assumed dtypes of int, float, and bool (int in particular doesn't match any of the dtypes listed in the document)? And the previous sentence, which also talks about zero-dimensional arrays but says the opposite thing, is even more confusing.
Yes I agree. Making it concrete ("scalar" means Python builtin types) should help. Also for array scalars, we can note that they are left out, with rationale.
I've pushed a proof of concept for type promotion here https://github.com/Quansight/array-api-tests/blob/master/array_api_tests/test_type_promotion.py#L74. It's not actually parameterized properly yet, and doesn't generate random array data. |
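A rough sketch of how such type promotion tests can be parameterized (the names `DTYPE_PAIRS` and `check_add_promotion` are illustrative, not taken from the linked test file; a real suite would feed each triple through `pytest.mark.parametrize` and generate random array data):

```python
import numpy as np

# Illustrative (dtype, dtype, expected result) triples from the
# spec's promotion table; not the actual table in the test suite.
DTYPE_PAIRS = [
    (np.int8, np.int16, np.int16),
    (np.uint8, np.int16, np.int16),
    (np.float32, np.float64, np.float64),
]

def check_add_promotion():
    # A plain loop keeps this sketch dependency-free; pytest would
    # turn each triple into a separate parameterized test case.
    for d1, d2, expected in DTYPE_PAIRS:
        result = (np.ones(3, dtype=d1) + np.ones(3, dtype=d2)).dtype
        assert result == expected, (d1, d2, result)

check_add_promotion()
```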
Actually it looks like
according to the spec? Should I use something like Also, just looking at some of the libraries, I'm a bit confused about what the API namespace is, e.g., for tensorflow or pytorch.
That's also a section that still needs to be added, but I'm almost certain we should only use dtype literals. The
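To illustrate the dtype-literal point in NumPy terms: NumPy happens to accept both the dtype object and its string name, and they compare equal, but the standard would only guarantee the literal (object) form:

```python
import numpy as np

a = np.zeros(3, dtype=np.float32)    # dtype literal (portable per the spec)
b = np.zeros(3, dtype="float32")     # string shorthand (NumPy-specific)

print(a.dtype == b.dtype)            # True
print(a.dtype == np.float32)         # True
```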
I think
Namespaces are not all lined up well. Given both that issue, and that there will be BC-breaking things in the standard for every library, the adoption plan will be to add some new namespace or function (e.g. `mod = get_array_module(api_version='1.0')` will give you a namespace with all the standard-compliant names and behaviour). And yes,
Agreed. I'd say you should go with
I may write up something today for the dtypes and import convention, so we get that sorted out. And Athan is working on the array creation and manipulation functions, I believe. Thanks for pointing out all the holes you find, @asmeurer; very helpful in prioritizing what to write and finding pieces that we forgot about.
In that case, shouldn't the type promotion document be written in terms of them rather than the string shorthands? Also mxnet doesn't seem to have them. |
Clarification on dtype usage done in data-apis/array-api#32. I think this should make clear that your example should be written as:
Note that the "array object" section is still TODO, but it's a given that that will have a |
Type promotion is now tested extensively for elementwise functions and operators. All that remains to do is tests for the non-elementwise functions that participate in type promotion. I will open a new issue for this. |
Here is the spec for type promotion rules: https://github.com/data-apis/array-api/blob/master/spec/API_specification/type_promotion.md
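The spec's same-kind and mixed signed/unsigned integer promotion rules can be sketched as a small lookup helper. This is an illustrative reimplementation under my reading of the linked document, not code from the spec or the test suite; dtype names are strings for brevity, and combinations the spec leaves unspecified (mixed int/float, `uint64` with signed ints) raise:

```python
_signed = ["int8", "int16", "int32", "int64"]
_unsigned = ["uint8", "uint16", "uint32", "uint64"]
_floats = ["float32", "float64"]

def promote(a, b):
    """Sketch of the spec's promotion lattice for a subset of cases."""
    if a == b:
        return a
    # Same-kind promotion: take the wider of the two dtypes.
    for family in (_signed, _unsigned, _floats):
        if a in family and b in family:
            return family[max(family.index(a), family.index(b))]
    # Mixed signed/unsigned: normalize so `a` is signed, `b` unsigned.
    if a in _unsigned and b in _signed:
        a, b = b, a
    if a in _signed and b in _unsigned:
        if b == "uint64":
            raise TypeError("uint64 with signed int is unspecified")
        # uintN behaves like a signed int of twice the width.
        widened = _signed[_unsigned.index(b) + 1]
        return _signed[max(_signed.index(a), _signed.index(widened))]
    # Mixed int/float (and bool with anything else) is left out of the
    # spec's tables, per the discussion above.
    raise TypeError(f"no promotion rule for {a} and {b}")
```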
Questions:
As with Broadcasting tests #1, how do I create inputs for the tests? I didn't see any specification on how to create arrays or how to specify dtypes.
Do I read the document correctly in that all the listed dtypes are required to be implemented?
The bullet points at the bottom specify different semantics for zero-dimensional arrays and scalars. Is the distinction between these two spelled out somewhere? As far as I know, NumPy distinguishes these types but they mostly behave the same (and they seem to both participate in type promotion).
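As a NumPy illustration of the distinction asked about in the last bullet (NumPy-specific; the spec, per the discussion above, only has arrays and Python scalars):

```python
import numpy as np

s = np.float64(1.0)          # NumPy "array scalar"
z = np.asarray(1.0)          # zero-dimensional array, shape ()
p = 1.0                      # plain Python float

print(s.ndim)                      # 0: array scalars mimic 0-D arrays
print(z.shape)                     # ()
print(isinstance(s, np.ndarray))   # False: array scalars are not arrays
```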