Build guide: Capabilities #532

Open
wants to merge 39 commits into
base: main

Collaborator

@pdaoust pdaoust commented Feb 13, 2025

Closes #524. I'm interested to hear what other people think about this -- there are so many moving parts involved in making a minimal functional capability setup in a hApp that the code samples ended up large. I tried to pare them down to the essentials (but still, there's a lot of LOC just to retrieve and exercise a claim), and added a section at the end that integrates it all together into a working system.

It was even worse than this before; this is a rewrite. Any suggestions for improvement/simplification of the code samples are welcome.

@pdaoust pdaoust marked this pull request as ready for review February 13, 2025 21:07
@pdaoust pdaoust enabled auto-merge (squash) February 13, 2025 21:08
@pdaoust pdaoust requested a review from a team February 13, 2025 21:09
---

::: intro
Access to zome functions is secured by a variant of **capability-based security** that adds the ability to restrict access to a given set of callers, identified and authenticated by their public/private key pair.
Contributor

Actually, there's also the "Transferable" option of having it only secret based, i.e. not tied to a public key. But maybe that's too much info for an intro?

Collaborator Author

Yes, what I'm saying here is that traditional cap security doesn't have that concept, so ours adds it. On a re-read, that doesn't seem clear to me either.


## Capability-based security, updated for agent-centric applications

Traditional [capability-based security](https://en.wikipedia.org/wiki/Capability-based_security) works on a simple concept: the owner of a resource grants access to other processes by giving out a handle to the resource rather than direct access to it. (Usually in a client/server system, the handle is an authorization secret such as an [OAuth2 token](https://auth0.com/intro-to-iam/what-is-oauth-2).) Thus they can control the way the resource is used without needing to deal with access control lists or other access control methods. When the owner no longer wants the process to access the resource, they destroy the handle.
Contributor

Suggested change
Traditional [capability-based security](https://en.wikipedia.org/wiki/Capability-based_security) works on a simple concept: the owner of a resource grants access to other processes by giving out a handle to the resource rather than direct access to it. (Usually in a client/server system, the handle is an authorization secret such as an [OAuth2 token](https://auth0.com/intro-to-iam/what-is-oauth-2).) Thus they can control the way the resource is used without needing to deal with access control lists or other access control methods. When the owner no longer wants the process to access the resource, they destroy the handle.
Traditional [capability-based security](https://en.wikipedia.org/wiki/Capability-based_security) works on a simple concept: the owner of a resource grants access to other processes by giving out a handle to the resource rather than direct access to it. (Usually in a client/server system, the handle is an authorization secret such as an [OAuth2 token](https://auth0.com/intro-to-iam/what-is-oauth-2).) Thus they can control the way the resource is used without needing to deal with access control lists or other access control methods. When the owner no longer wants the process to access the resource, they invalidate the handle.

Contributor

If you're saying above that they're "giving out a handle", then they can't destroy the handle itself, I guess? They can only invalidate it on their end, i.e. stop accepting handles of this kind, right? Might be a nit-pick though...

Collaborator Author

I like nit-picks like this. Thanks!


### Unrestricted

In a hApp where agents can call each other's zome functions, <!--TODO: expand this to other cells on agent's device when `UseExisting` provisioning strategy is implemented-->it's usually necessary to create an unrestricted grant for at least one zome function that allows an agent to ask another agent for more capabilities.
Contributor

it's usually necessary to create an unrestricted grant for at least one zome function that allows an agent to ask another agent for more capabilities

I think I disagree with this statement. It assumes that granting additional cap grants is at least mediated by remote calls, but I don't think that's necessary at all, and it may not even be common. Intuitively I'd rather assume there are coordinator functions that grant certain agents access to certain functions: a user picks an agent key from a profile dropdown in the hApp UI, say, and grants that agent access to something. For that to work, no prior remote calls are involved at all...

Contributor

I would probably rather frame it here that there are certain types of remote calls that, depending on the happ, may want to be accessible by default because they provide an important functionality of the app.

Contributor

So potentially a way to simplify/shorten the code examples would be to remove the remote call part of requesting a cap grant and assume that this needs to happen out of band, which in the case of ghost writing also doesn't necessarily seem implausible.

Contributor

But I'm not implying that I think the code example is not good the way it is right now...I'm just mentioning that thought since you're asking about simplification in the PR description.

Collaborator Author

okay, useful pushback. I can see it leading to a great deal of simplification. Thanks!

Collaborator Author

After this feedback I had a nagging feeling the comprehensive example really wasn't necessary -- not if we're not writing about patterns in this first pass of the Build Guide -- and ThetaSinner confirmed. After moving it around I just ended up deleting it: 7e5a96e

let mut functions = BTreeSet::new();
functions.insert((zome_info()?.name, "recv_remote_signal".into()));
create_cap_grant(CapGrantEntry {
    tag: "".into(),
Contributor

Suggested change
tag: "".into(),
tag: "remote signals".into(),


### Transferable

Sometimes it doesn't matter who's calling a zome function, as long as they can supply the right secret. This is useful when there's an open number of bots, system services, or other agents that should be authorized to call the function.
Contributor

an open number of bots, system services, or other agents

Interesting, do you see use cases for this primarily for non-human agents? It sounds like it from the formulation...but I haven't actually ever thought too much about use cases in which transferable secrets are useful.

Collaborator Author

@pdaoust pdaoust Feb 21, 2025

Useful question -- I guess I can see it'd be useful for human agents with multiple devices. Any other use cases you can think of, off the top of your head?

// Rather than returning the secret to Bob's UI so he can send it to her
// out-of-band, let's send it to her directly. In a more robust app, we'd
// want to check for failed delivery and set up a retry handler.
send_remote_signal(
Contributor

I think this is missing a vector of agent keys, which should be constructed from the requestor field of the DelegateAuthorRequest input.

---

::: intro
Access to zome functions is secured by **capability-based security**. Holochain extends this concept by adding the ability to restrict access to a given set of callers.
Member

I think the way this is implemented in Holochain is thoroughly confused (i.e. the implementers made things unclear). But I think the way this is being introduced here makes things more confusing rather than less.

What's going on here is that generally a capability is hard to revoke. In the example of OAuth2 the revocation is generally done based on time. The holder can destroy their own token but the issuer can't do anything. If you want to be able to revoke on demand rather than after a time interval, then you need to record what access tokens you have issued somehow.

The thing we call a CapSecret is just not a secret. It's an identifier for the capability. In the case of an assigned capability, it's completely safe to share. You could call it a secret in the transferable case but really it's used to identify the capability that the caller is wanting to use and access is granted to anyone. It's pretty hard to call a shared password a secret.

So the "unforgeable token" part comes from combining the capability identifier (CapSecret) with a signature from the agent's private key. Although it's actually the zome call that you sign, the message you send is unforgeable.

So in reality, this is really very close to capability-based access control. You can think of the CapSecret as the capability being granted. The capability has to be dynamic because you could permit access to any combination of your zome functions. Trying to name these things uniquely could get very confusing for the user. What we're adding on top of the simplest version of this is revocation. We can delete a capability whenever we want and no use of the "secret" or a valid agent signature will let you make calls.

Every time I look at this, I think that it looks more like an ACL with a pointless secret. If we renamed CapSecret to CapabilityId I think that would be a good start. Then we should document this as capability-based access control with revocation.

Collaborator Author

@pdaoust pdaoust Feb 24, 2025

Trying to wrap my head around this in light of what I know about ocap systems. Are you saying that Transferable lacks the property of confinement, whereas 'Assigned' gets you what Demolishing Capability Myths calls 'Property F: Access-controlled delegation channels' (as I understand it, in Holochain terms that means that in order for Alice to give to Bob a capability she received from Carol, Alice has to give Bob a capability to one of her own zome functions that then exercises the Carol capability)?

If so, then I get what you're talking about -- for Transferable, the secret looks more like an API token (security through obfuscation, but at least you can revoke it) and for Assigned, the secret is unnecessary but useful as an identifier for retrieving and exercising it later.

My question is, what's the next steps? Do we talk about this in the guide (or alternatively update the core concepts)? If the problem is in the design, I feel like the most we can do in the guide and the CC is to put up warnings that unrestricted = no capability, transferrable = capability but everyone gets a handle (if they can figure out the handle's identifier), and assigned = actual capability (per this comment), to help knowledgeable people map their understanding of ocaps to what Holochain does and doesn't do.

Member

the secret looks more like an API token (security through obfuscation, but at least you can revoke it)

I don't agree. An API token can generally be revoked and is supposed to uniquely identify its user. I'm saying that both in this case and the Assigned case, it's not really a secret. It's better thought of as an ID. Neither in the case of an API token nor with a cap secret is the value guessable. It's not about obfuscation. My objection comes from the need to share the value with multiple agents in order to "transfer" access. At that point, anybody who possesses it is free to share the "secret" and anyone who possesses it is free to make calls. To me that's not a secret, it's just an ID for the permissions record that opens up some list of functions.

for Assigned, the secret is unnecessary but useful as an identifier for retrieving and exercising it later.

I'm saying it's necessary but not secret. If I gave you my cap secret but it's assigned to me and not you, then you can't use it. It has no value as a secret. It functions as an identifier because you might grant me permissions to call different sets of functions at different times and I need to tell you which permissions I'm trying to use.

My question is, what's the next steps?

I think some things should be renamed in Holochain but as for how we document this, yes I think that's along the right lines.

  • Unrestricted: Anybody can call the named functions as long as they provide a signed payload.
  • Transferable: Some security. Anybody with the cap secret can call the named functions as long as they provide the cap secret within a signed payload. There is no way to control who the cap secret is transferred to once you've given it out to the initial agent(s) you chose to grant access to.
  • Assigned: More traditional security. Only the listed agents, with the cap secret, can call the listed functions. This is valid to think about as a capability because the agent can independently produce a payload that contains the cap secret to identify the capability, and sign their payload, making it unforgeable. We record the list of agents that the capability is assigned to, so that we have a way to revoke their access again. If you want more granular access and revocation, you can put a single agent in the assigned list.

Along those lines? What I'm really getting at is to not go too deep on comparing this with capability-based access control, and instead to describe what Holochain does. It's sufficiently contorted in Holochain that it's hard to describe. The TL;DR is that it's capability-based access control with revocation by the issuer, but I think it's clearer to just directly describe what Holochain does.

Holochain extends this concept for zome calls, first by requiring that the payload of every call be signed by a private key. Let's take a look at the [`CapAccess` enum](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/capability/enum.CapAccess.html) which defines the kinds of **capability grant** you can use in your hApp:

* `CapAccess::Unrestricted`: any signing key can access the function(s) covered by the capability.
* `CapAccess::Transferable`: any caller who possesses the secret can access the function(s). This is identical to traditional capability-based security.
Member

Both this and the one below are. In this case the assignee is just "anyone". So you still have to provide a valid signature and the "secret" just identifies the capability you want to use to get access to the function you're calling

* `CapAccess::Transferable`: any caller who possesses the secret can access the function(s). This is identical to traditional capability-based security.
* `CapAccess::Assigned`: a caller must possess the secret _and_ sign the call with a known key.
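
As a rough illustration of how the three access types listed above differ, here is a minimal plain-Rust sketch of the authorization decision. It is a toy model, not Holochain's implementation: `Access`, `Grant`, `is_authorized`, and the string-based `AgentKey`, `Secret`, and `ZomeFn` aliases are all hypothetical stand-ins for the real HDK types, and the sketch assumes the call's signature has already been verified against the caller's key.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-ins for agent keys, capability secrets, and
// (zome name, function name) pairs. The real types live in the HDK.
type AgentKey = &'static str;
type Secret = &'static str;
type ZomeFn = (&'static str, &'static str);

// Simplified mirror of the three `CapAccess` variants.
enum Access {
    Unrestricted,
    Transferable { secret: Secret },
    Assigned { secret: Secret, assignees: BTreeSet<AgentKey> },
}

// A grant covers a set of functions under one access type.
struct Grant {
    functions: BTreeSet<ZomeFn>,
    access: Access,
}

// Decide whether `grant` covers a call to `function` made by `caller`.
// Assumption: the payload signature was already checked against `caller`.
fn is_authorized(
    grant: &Grant,
    caller: AgentKey,
    secret: Option<Secret>,
    function: ZomeFn,
) -> bool {
    // A grant never covers functions it does not list.
    if !grant.functions.contains(&function) {
        return false;
    }
    match &grant.access {
        // Any signing key may call the listed functions.
        Access::Unrestricted => true,
        // Anyone who presents the matching secret may call.
        Access::Transferable { secret: expected } => secret == Some(*expected),
        // The secret must match and the caller must be on the list.
        Access::Assigned { secret: expected, assignees } => {
            secret == Some(*expected) && assignees.contains(&caller)
        }
    }
}
```

In this model, an assigned grant for `"alice"` rejects a call from `"bob"` even when he presents the correct secret, which is the property that distinguishes it from a transferable grant.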

There's a fourth kind of capability, called the **author grant**, which covers any call made by a caller with the same key as the cell's agent ID --- that is, _the agent who owns the cell_. It's essentially a combination of unrestricted (in terms of what functions can be called) plus assigned (in terms of who may call those functions).
Member

Holochain does implement things this way, but I wonder if that's the most helpful way for a user to think about it. Perhaps it'd be simpler to say that calls signed by the private key of the agent who owns a cell don't require an explicit grant: access is always permitted to all zome functions?


All zome-to-zome calls within a cell and cell-to-cell calls within an agent's hApp instance are covered by the author grant. UIs are also covered by the author grant if they're hosted by one of the [well-known Holochain runtimes](/build/happs/#package-a-happ-for-distribution).
Member

Suggested change
All zome-to-zome calls within a cell and cell-to-cell calls within an agent's hApp instance are covered by the author grant. UIs are also covered by the author grant if they're hosted by one of the [well-known Holochain runtimes](/build/happs/#package-a-happ-for-distribution).
All zome-to-zome calls within a cell and cell-to-cell calls within an agent's hApp instance are covered by the author grant. ~~UIs are also covered by the author grant if they're hosted by one of the [well-known Holochain runtimes](/build/happs/#package-a-happ-for-distribution).~~

I think documenting writing apps for our current runtime implementations should be a separate concern.

Collaborator Author

What about treating it as a given? As in, we assume that you're scaffolding a hApp and hence will be using one of our runtimes, so you don't need to worry about provisioning grants for the UI? In that case we don't need to say anything about the runtime here -- just say "this also applies to a UI that's bundled with a hApp".

Member

Good with me yes, just don't say anything and people shouldn't worry about why it works until they try to build their own runtime or use the client libraries outside of our tooling.


### Transferable

Sometimes you want to selectively grant access to a function but don't want to restrict the number of agents that can exercise the capability. This is useful when a person has multiple devices (and hence multiple agent IDs), or when there's a bot or background process whose signing key at call time is rotated on an unknown schedule.
Member

We should explicitly call out somewhere in this section that the security of this mechanism is minimal. It might be useful but anyone you pass the token to can pass the token on. Yes the name of the access type implies that but I still think we should call it out explicitly


### Assigned

If you're concerned about capability secrets being leaked, you can bind a secret to one or more public keys. The zome call's provenance must match one of these public keys, and the payload signature must be valid for the provenance. This function rewrites `approve_delegate_author_request` to create an assigned grant.
Member

If you're concerned about capability secrets being leaked

Based on my comments above, this should never be a concern. I might use Transferable to give a bot read access or something like that but if I care about security or I'm allowing writes, then assigned should be used and the security comes from the agent's signature, not from the secret.


## Store a capability claim

Once an has gotten a capability secret from a grantor, they need to store it as a [`CapClaim`](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/capability/struct.CapClaim.html) entry with the [`create_cap_claim`](https://docs.rs/hdk/latest/hdk/capability/fn.create_cap_claim.html) host function so they can use it later when they want to call the functions they've been granted access to.
Member

Suggested change
Once an has gotten a capability secret from a grantor, they need to store it as a [`CapClaim`](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/capability/struct.CapClaim.html) entry with the [`create_cap_claim`](https://docs.rs/hdk/latest/hdk/capability/fn.create_cap_claim.html) host function so they can use it later when they want to call the functions they've been granted access to.
Once an agent has got a capability secret from a grantor, they need to store it as a [`CapClaim`](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/capability/struct.CapClaim.html) entry with the [`create_cap_claim`](https://docs.rs/hdk/latest/hdk/capability/fn.create_cap_claim.html) host function so they can use it later when they want to call the functions they've been granted access to.


## Use a capability secret

To exercise a capability they've been granted, the agent needs to retrieve the claim from their source chain using the [`query`](https://docs.rs/hdk/latest/hdk/chain/fn.query.html) host function and supply the secret along with the zome call.
Member

The only tests for this in Holochain don't do that. They store the secret in memory during the test. That is very poor testing! Not even one example of how this would actually be used.

I'm not unhappy with this being documented this way. It's valid, it works. It's not great though. There's no reason you should have to load all your cap claims to find one that could easily be identified by a SQL query that returns a single row.

This should be a feature request against the HDK please.

Collaborator Author

agreed, way ahead of you :) holochain/holochain#4708

Member

Nice :) I remember seeing that issue now you say!

@pdaoust pdaoust force-pushed the feat/guide/capabilities branch from d8f42c2 to 7e5a96e Compare February 24, 2025 20:17

An agent generates a capability by storing a [`CapGrantEntry`](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/capability/struct.CapGrantEntry.html) system entry on their source chain using the [`create_cap_grant`](https://docs.rs/hdk/latest/hdk/capability/fn.create_cap_grant.html) host function.

!!! Capabilities have to be created in every cell
Member

@mattyg mattyg Feb 24, 2025

Suggested change
!!! Capabilities have to be created in every cell
!!! note Capabilities have to be created in every cell

It wasn't rendering this as a note section.

@@ -138,7 +138,7 @@ So Holochain manages the dependency mapping for you, allowing you to write code

When do you decide whether a hApp should have more than one DNA? Whenever it makes sense to have multiple separate networks or databases within the hApp. These are the most common use cases:

* **Dividing responsibilities.** For instance, a video sharing hApp may have one group of peers who are willing to index video metadata and offer search services and another group of peers who are willing to host and serve videos, along with people who just want to watch them. This DNA could have `search` and `storage` DNAs, along with a main DNA that allows video watchers to look up peers that are offering services and query them.
* **Dividing responsibilities.** For instance, a video sharing hApp may have one group of peers who are willing to index video metadata and offer search services and another group of peers who are willing to host and serve videos, along with people who just want to watch them. This DNA could have `search` and `storage` DNAs, along with a main DNA that allows video watchers to look up peers that are offering services and query them. {#dividing-responsibilities}
Member

Doesn't seem like this is linked to anywhere?

Development

Successfully merging this pull request may close these issues.

Build guide: call a zome function / capabilities
4 participants