Decouple blocking content validation from main Overlay Service loop #342

Closed
@mrferris

Description

This thought relates to content validation code that was merged in #331. Let me know if I'm wrong, or if this is a duplicate/already discussed.

I think we should remove any blocking async code from the main overlay_service.rs loop, which currently occurs here. The await on this line was added in order to call the chain process_content->process_received_content->validate_content. Because validate_content is a blocking async function, every function in that chain has to be async as well.

The reason to avoid this is that while this function is being awaited, the select! loop cannot proceed. No queries can progress, requests from the overlay cannot be acted upon, our responses to other nodes cannot be sent (leading to peer requests timing out and us being marked as disconnected), bucket refreshes cannot happen, etc.

Under nominal conditions this shouldn't block for long, but an Infura request taking longer than expected would turn our node into a brick for that time. Later it gets even worse if we need to request a new accumulator snapshot in order to validate data: the content query to do that won't be able to progress, leaving us deadlocked, unable to get the content that we need in order to validate.

An alternative would be to send our unvalidated content to a dedicated content validation thread, which will then store it if valid.
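
To make the proposal concrete, here is a minimal sketch of what such a hand-off could look like, using only the standard library. Everything in it is an assumption for illustration: `UnvalidatedContent`, `Store`, and `spawn_validator` are hypothetical names, the `validate_content` body is a placeholder (the real one may block on an Infura request), and the actual service would more likely use an async task and channel than `std::thread`.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for content received from the overlay.
pub struct UnvalidatedContent {
    pub key: Vec<u8>,
    pub value: Vec<u8>,
}

/// Hypothetical stand-in for the node's content store.
pub type Store = Arc<Mutex<HashMap<Vec<u8>, Vec<u8>>>>;

/// Placeholder for the real validation, which may block for a long time.
/// For this sketch, content counts as valid if its value is non-empty.
fn validate_content(content: &UnvalidatedContent) -> bool {
    !content.value.is_empty()
}

/// Spawns a dedicated validation thread.
///
/// The main loop keeps the returned `Sender` and hands content over with
/// `send`, which does not block on an unbounded channel. The thread
/// validates each item and stores it only if validation passes.
pub fn spawn_validator(store: Store) -> (Sender<UnvalidatedContent>, JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<UnvalidatedContent>();
    let handle = thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for content in rx {
            if validate_content(&content) {
                store.lock().unwrap().insert(content.key, content.value);
            }
        }
    });
    (tx, handle)
}
```

With this shape, the main loop would call `tx.send(content)` where it currently awaits validation and move straight on to the next event, so a slow validation only delays storing that one piece of content. Note that this sketch does not address how the validator would issue its own content queries (e.g. for an accumulator snapshot); that would need a separate channel back into the service.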
