Validate target environments #2806

Open
wants to merge 10 commits into
base: main
from

Conversation

itowlson
Contributor

@itowlson itowlson commented Sep 6, 2024

EXTREMELY WIP.

There are plenty of outstanding questions here (e.g. environment definition, Redis trigger, hybrid componentisation), and reintegration work (e.g. composing dependencies before validating) (ETA: component dependencies now work). In its current state, the PR is more for visibility than out of any sense of readiness. Also, it has not had any kind of tidying pass, so a lot of the naming is not great.

But it does work, sorta kinda, some of the time. So we progress.

cc @tschneidereit

@itowlson itowlson force-pushed the validate-target-environment branch 6 times, most recently from 30aa1fc to 3b25dad, September 9, 2024 22:10
@itowlson
Contributor Author

Checking environments adds an appreciable bump to the time to run spin build. I looked into caching the environments, but a robust approach didn't really help: the bulk of the time, in my unscientific test, was in the initial registry request, with retrieval of the Wasm bytes (for the Spin environment) taking only about a quarter as long. (In my debug build, from a land far away from Fermyon Cloud and GHCR where my test registry is hosted: approx 2500ms to get the digest, then 600ms to get the bytes. The cache eliminated only the 600ms.)

We could, of course, assume that environments are immutable, and key the cache by package reference instead of digest. But that would certainly be an assumption and not guaranteed to be true.
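
To make the trade-off concrete, here is a minimal sketch of the two cache keys. Everything in it is hypothetical: `FakeRegistry`, `load_by_digest`, and `load_by_reference` are illustration-only names, not Spin or wkg APIs, and the fake registry only counts round trips instead of doing any network I/O.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a registry client. It counts round trips so the
/// cost of each caching strategy is visible.
struct FakeRegistry {
    digest_lookups: u32,
    byte_fetches: u32,
}

impl FakeRegistry {
    /// The slow request in the measurements above (approx 2500ms).
    fn resolve_digest(&mut self, _reference: &str) -> String {
        self.digest_lookups += 1;
        "sha256:0000".to_owned() // placeholder digest
    }

    /// The faster request in the measurements above (approx 600ms).
    fn fetch_bytes(&mut self, _digest: &str) -> Vec<u8> {
        self.byte_fetches += 1;
        vec![0x00, 0x61, 0x73, 0x6d] // the "\0asm" magic bytes
    }
}

/// Robust: keyed by digest, so a republished environment is always noticed,
/// but every load still pays for the digest lookup.
fn load_by_digest(
    reg: &mut FakeRegistry,
    cache: &mut HashMap<String, Vec<u8>>,
    reference: &str,
) -> Vec<u8> {
    let digest = reg.resolve_digest(reference);
    if let Some(bytes) = cache.get(&digest) {
        return bytes.clone();
    }
    let bytes = reg.fetch_bytes(&digest);
    cache.insert(digest, bytes.clone());
    bytes
}

/// Fast: keyed by package reference, so a cache hit needs no network at all,
/// but it is only correct if environments are assumed to be immutable.
fn load_by_reference(
    reg: &mut FakeRegistry,
    cache: &mut HashMap<String, Vec<u8>>,
    reference: &str,
) -> Vec<u8> {
    if let Some(bytes) = cache.get(reference) {
        return bytes.clone();
    }
    let digest = reg.resolve_digest(reference);
    let bytes = reg.fetch_bytes(&digest);
    cache.insert(reference.to_owned(), bytes.clone());
    bytes
}
```

Loading the same reference twice through `load_by_digest` still performs two digest lookups and one byte fetch, whereas `load_by_reference` performs one of each.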

@itowlson
Contributor Author

This is now sort of a thing and can possibly be looked at.

Outstanding questions:

  • Where shall we publish the initial set of environments?
    • The wkg configuration will need to reflect this. At the moment it uses the user's default registry. Yeah nah.
  • Where shall we maintain the environment WITs?
    • Separate repo in the Fermyon org?
  • What does that initial set contain?
    • Currently I've created a Spin CLI "2.5" world with just the HTTP trigger (WASI 0.2 and WASI RC).
  • Testing

Possibly longer term questions:

  • How can we manage environments where the set of triggers is variable - specifically the CLI with trigger plugins?
  • How can we avoid a lengthy network round-trip on every build?
  • Better error reporting for environments where a trigger supports multiple worlds (like, y'know, the Spin CLI).

If folks want to play with this, add the following to your favourite spin.toml:

[application]
targets = ["spin:[email protected]"]

and set your wkg config (~/.config/wasm-pkg/config.toml) to:

default_registry = "registrytest-vfztdiyy.fermyon.app"

[registry."registrytest-vfztdiyy.fermyon.app"]
type = "oci"

(No, that is not the cat walking across the keyboard... this is my test registry which backs onto my ghcr.)

@itowlson itowlson marked this pull request as ready for review September 18, 2024 23:56
@lann
Collaborator

lann commented Sep 19, 2024

> Where shall we publish the initial set of environments?

fermyon.com?

> Where shall we maintain the environment WITs?

I'd suggest "next to the code that implements them"; ideally generated by that code.

Comment on lines +16 to +20
) -> Result<(
Vec<ComponentBuildInfo>,
DeploymentTargets,
Result<spin_manifest::schema::v2::AppManifest, spin_manifest::Error>,
)> {
Collaborator

This return type is getting unwieldy. Can we introduce a new type that gives a name to this collection of values?
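
One possible shape for such a type, as a sketch only. `ManifestBuildInfo` and `can_check_targets` are hypothetical names, and `ComponentBuildInfo`, `AppManifest`, and `ManifestError` are local stand-ins for the real types in the diff so that the example compiles on its own.

```rust
// Stand-ins for the real types; only their roles are taken from the diff.
#[derive(Debug)]
struct ComponentBuildInfo {
    id: String,
}

#[derive(Debug)]
struct AppManifest;

#[derive(Debug)]
struct ManifestError(String);

/// Hypothetical named replacement for the three-element tuple.
#[derive(Debug)]
struct ManifestBuildInfo {
    /// Build configuration for each component.
    components: Vec<ComponentBuildInfo>,
    /// Target environments listed under `[application]`.
    deployment_targets: Vec<String>,
    /// The outcome of fully parsing the manifest, kept so that the caller
    /// can still build components and report the parse error afterwards.
    manifest: Result<AppManifest, ManifestError>,
}

impl ManifestBuildInfo {
    /// Assumption for this sketch: targets are only checked when the
    /// manifest parsed successfully and at least one target is listed.
    fn can_check_targets(&self) -> bool {
        self.manifest.is_ok() && !self.deployment_targets.is_empty()
    }
}
```

Named fields also give each value a place for a doc comment, which a tuple position cannot carry.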

let dt = deployment_targets_from_manifest(&manifest);
Ok((bc, dt, Ok(manifest)))
}
Err(e) => {
Collaborator

It's not obvious to me what's happening here. It reads like we're trying to get build and deployment configs even when we can't parse the manifest? I think some terse one-line comments on each branch of this match would go a long way to helping this be a bit more understandable.

Collaborator

Reading the code more, I'm unsure why we go to such great lengths to read the targets config here when later on we only seem to run the targets check if the manifest was successfully parsed. Won't this information just be thrown away?

Ok((bc, dt, Ok(manifest)))
}
Err(e) => {
let bc = fallback_load_build_configs(&manifest_file).await?;
Collaborator

It seems like there's some need for better error messages here through liberal use of .context. As the code stands now, the component_build_configs function might return an error saying only "expected table found some other type", which would be very confusing.
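
A minimal sketch of the idea using only the standard library. The `WithContext` trait below is a hypothetical stand-in written for this example; it is not anyhow's `Context` trait and does not reproduce how anyhow formats error chains. `read_application_section` and `load_build_configs` are also made-up names.

```rust
use std::fmt::Display;

/// Hypothetical std-only stand-in for the kind of context wrapping that
/// anyhow's `.context(...)` provides.
trait WithContext<T> {
    fn context_msg(self, msg: &str) -> Result<T, String>;
}

impl<T, E: Display> WithContext<T> for Result<T, E> {
    fn context_msg(self, msg: &str) -> Result<T, String> {
        // Prefix the low-level error with what the caller was trying to do.
        self.map_err(|e| format!("{msg}: {e}"))
    }
}

/// Simulates a low-level failure whose message is unhelpful on its own.
fn read_application_section(is_table: bool) -> Result<(), String> {
    if is_table {
        Ok(())
    } else {
        Err("expected table found some other type".to_owned())
    }
}

/// Adds the file and section being read, so the user can tell where to look.
fn load_build_configs(manifest_file: &str, is_table: bool) -> Result<(), String> {
    read_application_section(is_table)
        .context_msg(&format!("failed to read [application] from {manifest_file}"))
}
```

With the wrapper in place, the failure reads "failed to read [application] from spin.toml: expected table found some other type" instead of the bare type mismatch.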

Comment on lines +85 to +96
let table: toml::value::Table = toml::from_str(&manifest_text)?;
let target_environments = table
.get("application")
.and_then(|a| a.as_table())
.and_then(|t| t.get("targets"))
.and_then(|arr| arr.as_array())
.map(|v| v.as_slice())
.unwrap_or_default()
.iter()
.filter_map(|t| t.as_str())
.map(|s| s.to_owned())
.collect();
Collaborator

Would deserializing into a type through serde::Deserialize make this easier to read?

@@ -57,6 +75,30 @@ async fn fallback_load_build_configs(
})
}

async fn fallback_load_deployment_targets(
Collaborator

Nit: is "deployment targets" the right nomenclature? That sounds like what you would be targeting for spin deploy and not the environment you're targeting. Perhaps we could play with the wording here?

Comment on lines +7 to +8
pub type DeploymentTarget = String;
pub type DeploymentTargets = Vec<DeploymentTarget>;
Collaborator

I think if we switch component_build_configs to returning a more descriptive type, then we can get rid of these type aliases. I personally find type aliases that alias core types like String and Vec more confusing than helpful.

Comment on lines +91 to +92
async fn load_component_source(&self, source: &Self::Component) -> anyhow::Result<Vec<u8>>;
async fn load_dependency_source(&self, source: &Self::Dependency) -> anyhow::Result<Vec<u8>>;
Collaborator

Are we implementing ComponentSourceLoader more than once? I'm a bit lost as to why we need the flexibility of defining the types of dependency and component instead of hard-coding them. Can you explain what this buys us?

"#
);

let doc = wac_parser::Document::parse(&wac_text)
Collaborator

We really should implement a programmatic API in wac for this so that we don't need to manipulate a wac script.

/// Equivalent to futures::future::join_all, but specialised for iterators of
/// fallible futures. It returns a Result<Vec<...>> instead of a Vec<Result<...>> -
/// this just moves the transposition boilerplate out of the main flow.
async fn join_all_result<T, I>(iter: I) -> anyhow::Result<Vec<T>>
Collaborator

Since we only care about the first error, wouldn't https://docs.rs/futures/0.3.30/futures/future/fn.try_join_all.html be a better fit here? As it's currently implemented, we wait for all futures to finish even though we only ever look at the first error.
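
The difference can be illustrated with a synchronous analogy that needs no async runtime. This is only an analogy under stated assumptions: real futures run concurrently, and the function names below are hypothetical. Collecting a lazy iterator of `Result`s stops at the first error, much as `try_join_all` returns on the first failure, while collecting everything and then transposing mirrors the `join_all`-then-transpose shape of `join_all_result`.

```rust
/// Lazy: stops at the first failure, so later items are never attempted.
/// Returns the transposed result and how many items were attempted.
fn parse_all_short_circuit(inputs: &[&str]) -> (Result<Vec<i32>, String>, usize) {
    let mut attempted = 0;
    let result = inputs
        .iter()
        .map(|s| {
            attempted += 1;
            s.parse::<i32>().map_err(|e| format!("{s}: {e}"))
        })
        .collect::<Result<Vec<_>, _>>();
    (result, attempted)
}

/// Eager: runs every item to completion, then transposes
/// Vec<Result<..>> into Result<Vec<..>> and keeps only the first error.
fn parse_all_then_transpose(inputs: &[&str]) -> (Result<Vec<i32>, String>, usize) {
    let mut attempted = 0;
    let results: Vec<Result<i32, String>> = inputs
        .iter()
        .map(|s| {
            attempted += 1;
            s.parse::<i32>().map_err(|e| format!("{s}: {e}"))
        })
        .collect();
    (results.into_iter().collect(), attempted)
}
```

For the input `["1", "x", "3"]`, both versions return an error for `"x"`, but the lazy version attempts two items and the eager version attempts all three.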

}

pub async fn load_and_resolve_all<'a>(
app: &'a spin_manifest::schema::v2::AppManifest,
Collaborator

Nit: seems like we pass app and resolution_context around a lot. It might be nicer to put those into a Resolver struct and implement these functions as methods on that struct.
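
A rough sketch of what that could look like. `AppManifest` and `ResolutionContext` here are simplified stand-ins for the real types, and the method names and the path-joining logic are invented purely so the example compiles and runs.

```rust
// Simplified stand-ins for the real types in the diff.
struct AppManifest {
    component_ids: Vec<String>,
}

struct ResolutionContext {
    base_dir: String,
}

/// Hypothetical struct bundling the two values that are passed around together.
struct Resolver<'a> {
    app: &'a AppManifest,
    resolution_context: &'a ResolutionContext,
}

impl<'a> Resolver<'a> {
    fn new(app: &'a AppManifest, resolution_context: &'a ResolutionContext) -> Self {
        Self { app, resolution_context }
    }

    /// Placeholder for per-component resolution; it only builds a path string.
    fn resolve_one(&self, component_id: &str) -> String {
        format!("{}/{}.wasm", self.resolution_context.base_dir, component_id)
    }

    /// Methods reach `app` and `resolution_context` through `self`, so
    /// neither needs to appear in every function signature.
    fn resolve_all(&self) -> Vec<String> {
        self.app
            .component_ids
            .iter()
            .map(|id| self.resolve_one(id))
            .collect()
    }
}
```

Callers would construct the resolver once with `Resolver::new(&app, &ctx)` and then call `resolve_all()` on it.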


3 participants