Image trait wip #1
base: main
Conversation
Made some comments. I made at least one of these in the Discord when the markdown was first published, but it's good to have them here as well.
> sources. This trait can be implemented by crates which provide their own data types that could be used to hold image data.
>
> ### What's not so good
So one thing is there could (probably should) be convenience functions to access pixels at given (x, y) coordinates, or specific channel values at (x, y, c). And something to allow for moving windows, such as https://docs.rs/ndarray/latest/ndarray/struct.ArrayBase.html#method.windows.
These and the issues already mentioned in "What's not so good" are the only things that would stop me from moving the ndarray-vision algorithms over to be implemented in terms of `ImageStride`.
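To make the suggestion concrete, here is a minimal sketch of what such convenience accessors could look like as an extension trait with default methods. All names here (`PixelAccess`, `pixel`, `channel_value`, `RgbImage`) are hypothetical and not part of machine-vision-formats:

```rust
// Hypothetical sketch of pixel-access convenience methods layered on
// top of an ImageStride-like interface (assumed names throughout).
trait PixelAccess {
    fn data(&self) -> &[u8];
    fn stride(&self) -> usize;   // row stride in bytes
    fn channels(&self) -> usize; // bytes per pixel

    /// All channel values of the pixel at (x, y).
    fn pixel(&self, x: usize, y: usize) -> &[u8] {
        let start = y * self.stride() + x * self.channels();
        &self.data()[start..start + self.channels()]
    }

    /// A single channel value at (x, y, c).
    fn channel_value(&self, x: usize, y: usize, c: usize) -> u8 {
        self.pixel(x, y)[c]
    }
}

// Toy interleaved RGB image used to exercise the trait.
struct RgbImage {
    data: Vec<u8>,
    width: usize,
}

impl PixelAccess for RgbImage {
    fn data(&self) -> &[u8] { &self.data }
    fn stride(&self) -> usize { self.width * 3 }
    fn channels(&self) -> usize { 3 }
}

fn main() {
    // 2x2 RGB image with bytes 0..12.
    let img = RgbImage { data: (0u8..12).collect(), width: 2 };
    assert_eq!(img.pixel(1, 0), [3u8, 4, 5]);
    assert_eq!(img.channel_value(0, 1, 2), 8);
    println!("pixel (1,0) = {:?}", img.pixel(1, 0));
}
```

A moving-windows method in the style of ndarray's `windows` could be added to the same trait, but it would likely want to return an iterator type rather than slices, since a window is not contiguous in a strided buffer.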
> and `height()` methods return `u32` but should perhaps return `usize`. (The [`stride()`](https://docs.rs/machine-vision-formats/0.1.1/machine_vision_formats/trait.Stride.html#tymethod.stride) method does return `usize`.)
>
> ## [`ndarray_image::NdImage` v0.3.0](https://docs.rs/ndarray-image/0.3.0/ndarray_image/)
I dunno if you want to mention it but there's always `ndarray_vision::ImageBase<T, C>` where `C: ColorModel` and `T` is an `ndarray::Data`. It simply wraps an `ndarray::ArrayBase<T, Ix3>` and uses `Ix3` to make it easy to access different colour channels. Then, using type tags for the colour spaces, ndarray-vision basically implements `impl From<Rgb> for Hsv`, and then there's an auto-conversion so you can use `From`/`Into` to go from `ImageBase<T, Rgb>` to `ImageBase<T, Hsv>` automatically.
The issue with that colour modelling is that it makes it hard to handle things like gamma; they could potentially be const generics when those are stable, but just how ergonomic is that API? It does let image processing algorithms specify in the types which colour spaces they work on, i.e. `impl Into<ImageBase<T, Gray>>`, which might be a plus? But I haven't quite implemented it like that currently, preferring to do conversions inside the algorithms to make the API "friendlier".
Also, compile time with generics may be a concern, but personally I think it's worth it for numeric stuff.
Just mentioning it as it seems to be different enough from the existing things.
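For readers unfamiliar with the type-tag pattern described above, here is a stripped-down sketch. The names mirror ndarray-vision's idea, but this is not its actual API, and the conversion body is a placeholder rather than a real RGB-to-HSV transform:

```rust
use std::marker::PhantomData;

// Zero-sized tags standing in for colour models.
struct Rgb;
struct Hsv;

// Simplified stand-in for ImageBase<T, C>: pixel data plus a
// compile-time colour-model tag that costs nothing at runtime.
struct Image<C> {
    data: Vec<f32>, // flattened H x W x 3 channel values
    _model: PhantomData<C>,
}

impl<C> Image<C> {
    fn new(data: Vec<f32>) -> Self {
        Image { data, _model: PhantomData }
    }
}

// One From impl per colour-space pair gives callers From/Into
// conversions between the differently tagged image types.
impl From<Image<Rgb>> for Image<Hsv> {
    fn from(img: Image<Rgb>) -> Self {
        // A real implementation would map each RGB triple to HSV here;
        // this sketch passes the data through unchanged.
        Image::new(img.data)
    }
}

fn main() {
    let rgb: Image<Rgb> = Image::new(vec![0.2, 0.4, 0.6]);
    // The colour space changes in the type; the buffer moves, not copies.
    let hsv: Image<Hsv> = rgb.into();
    println!("converted image with {} values", hsv.data.len());
}
```

The appeal is that mixing up colour spaces becomes a compile error; the cost, as noted above, is that properties like gamma don't fit neatly into a finite set of tag types.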
Overall, I would like to mention a few things:
Let me elaborate on that last point a bit. For instance, for RGB, I would expect something like the following:

```rust
struct Channel {
    offsets: [usize; 2],
    strides: [usize; 2],
    dimensions: [usize; 2],
}

fn channels(&self) -> &[Channel];
```

I would expect to get back:

```rust
&[
    Channel { offsets: [0, 0], strides: [3, width * 3], dimensions: [width, height] },
    Channel { offsets: [1, 0], strides: [3, width * 3], dimensions: [width, height] },
    Channel { offsets: [2, 0], strides: [3, width * 3], dimensions: [width, height] },
]
```

Where for R the X offset is 0, for G the X offset is 1, and for B the X offset is 2, but they are otherwise the same. These offsets are in bytes, of course. We could ALSO associate more specific information with each channel; for instance, we could say "this channel is blue" or "this channel is luminosity" by using some enum that comes back in the `Channel` struct. This also allows each channel to be stored separately in memory, so long as they are in the same buffer. Some algorithms might be more efficient when operating on images in this way.
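To illustrate the "stored separately" point: the same hypothetical `Channel` descriptor can describe a planar RGB buffer, where each channel occupies its own contiguous plane. This sketch assumes `offsets[0]` doubles as the channel's starting byte offset, which is one possible reading of the proposal, not a settled design:

```rust
#[derive(Debug, PartialEq)]
struct Channel {
    offsets: [usize; 2],
    strides: [usize; 2],
    dimensions: [usize; 2],
}

// Descriptors for a planar 8-bit RGB buffer of the given size,
// assuming offsets[0] is the channel's starting byte offset.
fn planar_rgb_channels(width: usize, height: usize) -> [Channel; 3] {
    let plane = width * height; // bytes per channel plane
    [0usize, 1, 2].map(|i| Channel {
        offsets: [i * plane, 0], // each plane follows the previous one
        strides: [1, width],     // contiguous pixels, rows of `width` bytes
        dimensions: [width, height],
    })
}

fn main() {
    let chans = planar_rgb_channels(4, 2);
    // The green plane starts right after the 4*2-byte red plane.
    assert_eq!(chans[1].offsets, [8, 0]);
    println!("{:?}", chans);
}
```

An algorithm that processes one channel at a time could then walk each plane sequentially, which is exactly the cache-friendly access pattern the interleaved layout makes awkward.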
This is a very early draft made to gain visibility.