Show % of file/time/similar and/or expose as metric? #83

Open
RichiH opened this issue Feb 12, 2025 · 7 comments

Comments

@RichiH

RichiH commented Feb 12, 2025

If we get any progress information from e.g. ffmpeg, it might be nice to expose it as a metric. That way we could predict runtimes, which would help with e.g. downscaling the cluster.
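
As a rough illustration of how a progress metric could feed a runtime prediction, here is a minimal sketch using plain linear extrapolation. The function name and the assumption that a job keeps running at its average rate so far are illustrative only; nothing like this exists in SReview as far as this thread says.

```python
from typing import Optional


def estimate_remaining_seconds(elapsed_seconds: float,
                               fraction_done: float) -> Optional[float]:
    """Estimate the remaining runtime by linear extrapolation.

    Assumes the rest of the job proceeds at the average rate observed so far.
    Returns None when no estimate is possible (no progress yet, bad input).
    """
    if not 0.0 < fraction_done <= 1.0:
        return None
    estimated_total = elapsed_seconds / fraction_done
    return estimated_total - elapsed_seconds


# A job that is a quarter done after one minute needs about three more minutes.
remaining = estimate_remaining_seconds(60.0, 0.25)
```

Summing such estimates over the queue would give a crude lower bound on how long the current nodes are still needed.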

@yoe
Owner

yoe commented Feb 12, 2025

Have vague plans of doing something along those lines.

Ffmpeg does actually support sending progress information in a parseable format, and Media::Convert (the part of SReview that integrates with ffmpeg) has support for parsing that information and passing progress information on to a callback routine. This is in the test suite so we know it does work, but it is not hooked up in SReview yet.
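
For readers unfamiliar with that format: ffmpeg's `-progress` option writes blocks of `key=value` lines, each block ending with a `progress=continue` or `progress=end` line. The sketch below parses such text in Python. It is not the Media::Convert implementation (which is Perl), the sample input is made up, and it assumes the `out_time_us` key (output timestamp in microseconds) is present, which may not hold for every ffmpeg version.

```python
def parse_progress_blocks(text):
    """Yield one dict per progress report; a report ends at its `progress=` line."""
    current = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        current[key] = value
        if key == "progress":
            yield current
            current = {}


def fraction_done(report, duration_seconds):
    """Return progress in [0, 1], or None if it cannot be determined."""
    try:
        out_us = int(report["out_time_us"])  # may be missing or "N/A"
    except (KeyError, ValueError):
        return None
    if duration_seconds <= 0:
        return None
    return max(0.0, min(1.0, out_us / 1_000_000 / duration_seconds))


# Made-up sample of what `ffmpeg -progress` output might look like.
sample = """frame=250
out_time_us=10000000
progress=continue
frame=500
out_time_us=20000000
progress=end
"""

fractions = [fraction_done(r, 20.0) for r in parse_progress_blocks(sample)]
```

The total duration has to come from elsewhere (for example from probing the input beforehand), since the progress output itself only reports how far the output has advanced.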

It sounds straightforward to 'just' implement it, but every time I think about it I get lost in minutiae related to the fact that you want progress information of the whole script, not a single ffmpeg command, and there can be dozens in a single script, and I want to weigh the progress information of a single command against expected runtime, or maybe not, and maybe I should just enumerate all the commands, but then future changes become a pain, maybe I should just create a separate object to keep track of what needs to be done so it's automatic, but what needs to be done is dynamic based on the result of some of the commands and you can't know all that beforehand, so you need loops and conditionals in that object, and and and OMG it's becoming a DSL now maybe I should just stop.
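
To make the "weigh each command against its expected runtime" idea concrete, here is a minimal sketch of the simplest variant: a fixed list of steps, each with an assumed expected runtime used as its weight. The function and the numbers are hypothetical, and the sketch deliberately ignores the hard part described above, namely that the list of steps is dynamic and not known beforehand.

```python
def overall_progress(steps):
    """Combine per-command progress into one script-level fraction.

    `steps` is a list of (expected_seconds, fraction_done) tuples; the
    expected runtime of each command acts as its weight.
    """
    total_weight = sum(weight for weight, _ in steps)
    if total_weight <= 0:
        return 0.0
    done = sum(weight * fraction for weight, fraction in steps)
    return done / total_weight


# Three commands: one finished, one halfway, one not started yet.
progress = overall_progress([(60.0, 1.0), (30.0, 0.5), (10.0, 0.0)])
```

With unweighted enumeration the same state would report 50%, while the weighted version reports 75%, which is why the choice between the two matters for runtime prediction.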

But yeah, eventually I'll get over myself and just do it.

Patches welcome, I guess 😉

@yoe
Owner

yoe commented Feb 12, 2025

Also, I don't want to stop at progress information. I also want to implement better error handling, so that if an ffmpeg process aborts, we move the talk to a failed state and know that it happened. And while we're at it, we might want to capture stderr and stdout too and put them in the database so we can keep track of things more easily. Etc.
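
A minimal sketch of that idea in Python: run a command, capture both output streams, and map a non-zero exit status to a failed state. The state names `done` and `failed` and the returned dict are assumptions for illustration; they are not SReview's actual states or schema, and storing the result in a database is left out.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command and record its outcome along with captured output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Hypothetical state names; a real system would use its own.
    state = "done" if result.returncode == 0 else "failed"
    return {
        "state": state,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


# Stand-in for an aborting ffmpeg: a child process that exits with status 3.
outcome = run_step(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
```

Here `outcome["state"]` ends up as `failed` and the captured stderr is available for later inspection.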

This all gets very hairy and needs a lot of design work.

@RichiH
Author

RichiH commented Feb 13, 2025 via email

@johanvdw

Can we transcode quicker? That would remove the need for this metric.

@yoe
Owner

yoe commented Feb 13, 2025

> Can we transcode quicker?

Sure. Transcoding is always a tradeoff between transcode quality (reflected in file size and number of artifacts) and transcode time. Transcoding for longer will get you better results (smaller files, fewer artifacts).

I think the current settings are reasonable, but we can definitely revisit them.

The "vmaf" link that was provided in a different issue should also be useful (we don't use that currently, we probably should, but I need to figure out what the best way to do so is, first).

> That would remove the need for this metric.

No it wouldn't :)

@yoe
Owner

yoe commented Feb 13, 2025

> I think the current settings are reasonable

To expand on this a bit.

SReview is optimized to maximize throughput, not the speed of a single encode. This is also why I prefer to have a queue slot per CPU rather than only loading the machines to three quarters, as we do now (the difference is very clear in Grafana, fwiw).

With the Hetzner nodes, we had about 3 times the CPUs that we have now. With that, we managed to transcode somewhat over 300 videos in roughly 16 hours. This means that we should be able to transcode roughly 100 videos in 16 hours with what we have left, which I think is plenty.
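
The estimate above can be written out as a small calculation. This sketch assumes throughput scales linearly with the number of CPUs, which is an approximation; the function name is made up for illustration.

```python
def expected_videos(videos_before: float, cpu_reduction_factor: float) -> float:
    """Scale a measured throughput down by the factor by which CPUs shrank.

    Assumes throughput is proportional to the number of CPUs.
    """
    return videos_before / cpu_reduction_factor


# About 300 videos in 16 hours with roughly 3 times the current CPUs.
videos_now = expected_videos(300, 3)   # videos per 16 hours on current CPUs
videos_per_hour = videos_now / 16      # average rate on current CPUs
```

Under that assumption the current setup handles about 100 videos in 16 hours, or a little over 6 per hour.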

@johanvdw

> That would remove the need for this metric.
>
> No it wouldn't :)

You are right. I do want to have the throughput discussion, but I'll open another issue for that.
