Image trait wip #1

Draft: astraw wants to merge 2 commits into main
Conversation


@astraw commented Feb 2, 2022

This is a very early draft made to gain visibility.

@xd009642 (Member) left a comment:

Made some comments. I made at least one of these in the Discord when the markdown was first published, but it's good to have them here as well.

sources. This trait can be implemented by crates which provide their own data types
that could be used to hold image data.

### What's not so good
A Member left a comment:

So one thing is there could (and probably should) be convenience functions to access pixels at given (x, y) coordinates, or specific channel values at (x, y, c). And something to allow for moving windows, such as https://docs.rs/ndarray/latest/ndarray/struct.ArrayBase.html#method.windows.
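For example, a minimal sketch of what such accessors might look like, assuming a hypothetical `StridedImage` trait standing in for the draft's `ImageStride` (all names below are illustrative, not the draft's):

```rust
// Sketch only: `StridedImage` is a hypothetical stand-in for the draft's
// ImageStride trait; the accessor names are illustrative.
trait StridedImage {
    fn width(&self) -> usize;
    fn height(&self) -> usize;
    fn stride(&self) -> usize; // bytes per image row
    fn bytes_per_pixel(&self) -> usize;
    fn data(&self) -> &[u8]; // raw image bytes

    /// Convenience accessor: the bytes of the pixel at (x, y).
    fn pixel(&self, x: usize, y: usize) -> &[u8] {
        let start = y * self.stride() + x * self.bytes_per_pixel();
        &self.data()[start..start + self.bytes_per_pixel()]
    }

    /// Convenience accessor: a single channel value at (x, y, c).
    fn channel(&self, x: usize, y: usize, c: usize) -> u8 {
        self.pixel(x, y)[c]
    }
}
```

A `windows`-style iterator would need more machinery, but it could be layered on the same accessors.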

These, and the issues already mentioned in the "what's not so good" section, are the only things that would stop me from moving the ndarray-vision algorithms over to be implemented in terms of `ImageStride`.

and `height()` methods return `u32` but should perhaps return `usize`. (The [`stride()`](https://docs.rs/machine-vision-formats/0.1.1/machine_vision_formats/trait.Stride.html#tymethod.stride)
method does return `usize`.)
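Purely as an illustration of that suggestion (only the return types below come from the text above; the trait names are made up for the sketch):

```rust
// As described above (trait names illustrative): dimensions come back as
// u32 while the stride is usize, so mixed arithmetic needs casts.
trait DimsAsU32 {
    fn width(&self) -> u32;
    fn height(&self) -> u32;
    fn stride(&self) -> usize;
}

// The suggested alternative: dimensions as usize as well, so expressions
// like `y * stride + x` type-check without `as` conversions.
trait DimsAsUsize {
    fn width(&self) -> usize;
    fn height(&self) -> usize;
    fn stride(&self) -> usize;
}
```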

## [`ndarray_image::NdImage` v0.3.0](https://docs.rs/ndarray-image/0.3.0/ndarray_image/)
A Member left a comment:

I dunno if you want to mention it, but there's always `ndarray_vision::ImageBase<T, C>` where `C: ColorModel` and `T` is an `ndarray` data storage type. It simply wraps an `ndarray::ArrayBase<T, Ix3>` and uses `Ix3` to make it easy to access different colour channels.

Then, using type tags for the colour spaces, ndarray-vision basically implements `impl From<Rgb> for Hsv`, and then there's an auto-conversion so you can use `From`/`Into` to go from `ImageBase<T, Rgb>` -> `ImageBase<T, Hsv>` automatically.
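A rough sketch of that pattern, using simplified stand-ins rather than ndarray-vision's actual definitions:

```rust
use std::marker::PhantomData;

// Zero-sized type tags naming the colour model.
struct Rgb;
struct Hsv;

// An image generic over its colour-model tag `C`; the Vec here is a
// simplified stand-in for ndarray::ArrayBase<T, Ix3>.
struct ImageBase<T, C> {
    data: Vec<T>,
    model: PhantomData<C>,
}

// Defining one conversion between the tagged image types...
impl<T> From<ImageBase<T, Rgb>> for ImageBase<T, Hsv> {
    fn from(img: ImageBase<T, Rgb>) -> Self {
        // A real implementation would remap every pixel from RGB to HSV here.
        ImageBase { data: img.data, model: PhantomData }
    }
}

fn main() {
    let rgb: ImageBase<u8, Rgb> = ImageBase { data: vec![0; 12], model: PhantomData };
    // ...lets callers convert with the ordinary From/Into machinery.
    let _hsv: ImageBase<u8, Hsv> = rgb.into();
}
```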

The issue with that colour modelling is that it makes it hard to handle things like gamma; they could potentially be const generics when those are stable, but just how ergonomic is that API? It does let image processing algorithms specify what colour spaces they work on in the types, i.e. `impl Into<ImageBase<T, Gray>>`, which might be a plus? But I haven't quite implemented it like that currently, preferring to do conversions inside the algorithms to make the API "friendlier".

Also, compile time with generics may be a concern, but personally I think it's worth it for numeric stuff.

Just mentioning as it seems to be different enough to the existing things.

@vadixidav (Member) commented Feb 5, 2022

Overall, I would like to mention a few things:

- I don't think this will let us write algorithms that abstract over GPU buffers, which is possibly okay; perhaps this is just the CPU format of the image (which can then be copied to the GPU with a different abstraction).
- My biggest gripe is that we may not have just one stride. We might have an image in YUV420 or a similar format where we have entirely unique dimensions for each channel of the image. In that case we cannot say "each pixel is X bits/bytes". I would think that we should have N offsets, dimensions, and strides for both X and Y.

Let me elaborate on that last point a bit.

For instance, for RGB, I would expect something like the following:

```rust
struct Channel {
    /// Byte offset of the channel's first sample, per axis (x, y).
    offsets: [usize; 2],
    /// Byte distance between successive samples, per axis (x, y).
    strides: [usize; 2],
    /// Number of samples, per axis (x, y).
    dimensions: [usize; 2],
}

fn channels(&self) -> &[Channel];
```

I would expect to get back:

```rust
&[
    Channel { offsets: [0, 0], strides: [3, width * 3], dimensions: [width, height] }, // R
    Channel { offsets: [1, 0], strides: [3, width * 3], dimensions: [width, height] }, // G
    Channel { offsets: [2, 0], strides: [3, width * 3], dimensions: [width, height] }, // B
]
```

Where for R the X offset is 0, G the X offset is 1, and B the X offset is 2, but they are otherwise the same. These offsets are in bytes, of course. We could ALSO associate more specific information with each channel. For instance, we could say "this channel is blue" or "this channel is luminosity" by using some enum that comes back in the Channel type. The color space information could be stored in the channel or it could be stored with the image entirely separately, but this is just one approach. This system lets us store even images in YUV420 or similar, since it employs a separate "ndarray slice" mechanism for each channel.

This also allows each channel to be stored separately in memory, so long as they are in the same buffer. Some algorithms might be more efficient when operating on images in this way.
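To make the YUV420 case concrete, here is a sketch in the same style as the RGB example, assuming an I420-style planar layout (a full-resolution Y plane followed by quarter-resolution U and V planes in one buffer) and assuming each plane's starting byte is folded into its y-axis offset; the proposal above leaves that encoding choice open:

```rust
// Hypothetical channels() result for planar YUV 4:2:0 in a single buffer.
&[
    // Y: full resolution, one byte per sample, tightly packed rows.
    Channel { offsets: [0, 0], strides: [1, width], dimensions: [width, height] },
    // U: half resolution in both axes, starting right after the Y plane.
    Channel { offsets: [0, width * height], strides: [1, width / 2], dimensions: [width / 2, height / 2] },
    // V: half resolution, starting after the U plane.
    Channel { offsets: [0, width * height + (width * height) / 4], strides: [1, width / 2], dimensions: [width / 2, height / 2] },
]
```

Each channel carries its own dimensions and strides, which is exactly the information a single image-wide stride cannot express.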

@vadixidav assigned vadixidav and unassigned vadixidav Feb 5, 2022
@vadixidav self-requested a review February 5, 2022 22:55