Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix passing a float for appropriate MXNet operations #22

Open
wants to merge 1 commit into
base: v0.x.x
Choose a base branch
from

Conversation

jens-mueller-sociomantic
Copy link
Contributor

When assigning each element of an n-dimensional array from a scalar, the
scalar should be of type float according to the MXNet documentation for
the operator _set_value. Also the operations with a scalar right-hand
side like _plus_scalar, _minus_scalar, _mul_scalar, _div_scalar,
etc. require the scalar to be of type float. This change replaces the
general type T with float for these cases.

When assigning each element of an n-dimensional array from a scalar, the
scalar should be of type float according to the MXNet documentation for
the operator `_set_value`. Also the operations with a scalar right-hand
side like `_plus_scalar`, `_minus_scalar`, `_mul_scalar`, `_div_scalar`,
etc. require the scalar to be of type float. This change replaces the
general type `T` with `float` for these cases.
@jens-mueller-sociomantic
Copy link
Contributor Author

I need to fix the unittest for applyScalarOp as I just figured out.

@jens-mueller-sociomantic
Copy link
Contributor Author

Probably I'll split it into two commits.

@joseph-wakeling-sociomantic
Copy link
Contributor

This change needs a bit more explanation and justification, I think.

@joseph-wakeling-sociomantic
Copy link
Contributor

Specifically:

  • Is the issue that MXNet expects float input to these operations even when the underlying NDArray contains non-float elements? Or is it that these operations are not supported at all for NDArrays whose elements are not float?

  • If these operations are supported for NDArrays with non-float elements, what happens in these cases?

  • If they are not supported for NDArrays with non-float elements, why expose them via our NDArray class except for T == float?

@joseph-wakeling-sociomantic
Copy link
Contributor

Blocking while these design questions remain unanswered.

@jens-mueller-sociomantic
Copy link
Contributor Author

  • Is the issue that MXNet expects float input to these operations even when the underlying NDArray contains non-float elements? Or is it that these operations are not supported at all for NDArrays whose elements are not float?

The former.

  • If these operations are supported for NDArrays with non-float elements, what happens in these cases?

I assume they are converted. The documentation doesn't say anything specific here.

  • If they are not supported for NDArrays with non-float elements, why expose them via our NDArray class except for T == float?

We should look into enabling non-float NDArrays. I believe they are not yet supported by MXNet.

@joseph-wakeling-sociomantic
Copy link
Contributor

I assume they are converted. The documentation doesn't say anything specific here.

Probably best not to assume anything. Going by experience it seems more likely that this is simply wrong and is not a use-case they support in practice.

As a starting point, can you check what happens when _set_value is invoked on a non-float NDArray, with a float value that doesn't convert nicely to the underlying data type?

@jens-mueller-sociomantic
Copy link
Contributor Author

As a starting point, can you check what happens when _set_value is invoked on a non-float NDArray, with a float value that doesn't convert nicely to the underlying data type?

You cannot have NDArrays of type int or ubyte. When you try you get e.g. [12:26:06] src/c_api/c_api.cc:362: This operation only support floating point types not uint8. double works. I could assign a float value to a double n-dimensional array. Note that the float is converted to a string in the process.

@jens-mueller-sociomantic
Copy link
Contributor Author

jens-mueller-sociomantic commented Sep 18, 2017

As a starting point, can you check what happens when _set_value is invoked on a non-float NDArray, with a float value that doesn't convert nicely to the underlying data type?

Apparently, the passed value is converted to the element type.
The function SetValueOp is called when setting all elements to a passed value using _set_value (see
https://github.com/apache/incubator-mxnet/blob/master/src/ndarray/ndarray.cc#L1207).
SetValueOp calls Eval (see
https://github.com/apache/incubator-mxnet/blob/master/src/ndarray/ndarray.cc#L298)
and Eval converts the right hand side to the element type (https://github.com/apache/incubator-mxnet/blob/master/src/ndarray/ndarray_function-inl.h#L373).

@jens-mueller-sociomantic
Copy link
Contributor Author

On closer inspection it turns out that you cannot call MXNDArrayGetData if the element type is int and ubyte. That seems to be where the above error originates.

@joseph-wakeling-sociomantic
Copy link
Contributor

On closer inspection it turns out that you cannot call MXNDArrayGetData if the element type is int and ubyte. That seems to be where the above error originates.

That's in line with what I was suspecting, i.e. that whole parts of the NDArray functionality just do not exist for certain element types.

Now, that said, I'm not sure that it's our responsibility to sugarcoat the MXNet API by hardcoding in our D wrapper constraints from the upstream implementation (which we should distinguish from the upstream API). Anyone attempting to perform these operations with the wrong element type will get a runtime error from the C API (which we convert to an exception), right?

@jens-mueller-sociomantic
Copy link
Contributor Author

jens-mueller-sociomantic commented Sep 18, 2017

Let me rephrase what the problem is. Whenever we allow passing a T instead of a float we communicate to the user that we support a case that we actually don't and this is us, not MXNet. For example this code

scope array = new NDArray!(int)(cpuContext(), [1u]);
array = int.max;

looks unsuspicious to the user. Because the signature of opAssign accepts a T, an int in this case, (see https://github.com/sociomantic-tsunami/dmxnet/blob/v0.x.x/src/mxnet/NDArray.d#L603). But then we convert this value to a float, which is then converted to a string for passing on to MXNet (https://github.com/sociomantic-tsunami/dmxnet/blob/v0.x.x/src/mxnet/NDArray.d#L612). This is where it goes wrong. We cannot silently convert non-float values to float values. The above is an example that it can go wrong. In the above example the array's elements are all -2147483648 after the assignment. There is a similar problem in case of double because our API suggests we assign a double when in fact we only use single precision. These are the cases I'd like to avoid.

What is true that we don't want to have these cases

scope array = new NDArray!(int)(cpuContext(), [1u]);
array = 1.5f; // sets the element to 1
scope array = new NDArray!(ubyte)(cpuContext(), [1u]);
array = 256f; // sets the element to 0

as well.

@joseph-wakeling-sociomantic
Copy link
Contributor

All fair enough. So I'm a bit confused why you think the appropriate solution is to rewrite the methods to accept float rather than T, instead of just making them unavailable when !is(T == float).

@joseph-wakeling-sociomantic
Copy link
Contributor

(That's assuming that we can't alternatively fix things by updating toNoLossString to handle types other than float, but if I understand you correctly, that's not sufficient to address the underlying MXNet API problem.)

@jens-mueller-sociomantic
Copy link
Contributor Author

So we decided to leave this alone for now. Fundamentally the issue is about MXNet only accepting float arguments in certain cases. One might argue that all non-float n-dimensional arrays cannot be properly supported at the moment.
What this PR shows though, is that currently a non-float value is silently converted to float (when passed to toNoLossString) which seems problematic.

@joseph-wakeling-sociomantic
Copy link
Contributor

What this PR shows though, is that currently a non-float value is silently converted to float (when passed to toNoLossString) which seems problematic.

Yup, we can distinguish here between our own internal issues (toNoLossString is float-specific and shouldn't be), the correctness of how the MXNet C API handles non-float-based NDArrays, and the question of what we expose from our public D API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants