
Deprecate and remove AsyncBackingParameters #5079

Open
eskimor opened this issue Jul 19, 2024 · 2 comments
eskimor commented Jul 19, 2024

It seems both parameters (max_candidate_depth and allowed_ancestry_len) have been superseded by the claim queue. The claim queue already provides everything needed to enforce these limits, and more accurately: in particular, it also accounts for parachains sharing a core and for elastic scaling.

How can we enforce those limits via the claim queue?

max_candidate_depth

For a given core it does not make sense to provide more candidates than there are entries in the claim queue for that parachain, as the excess candidates could never make it on chain. Backers should keep track of candidates already provided for claim queue entries, even across relay parents, and reject candidates once no free spot is left:

E.g. consider the claim queue [A, B, A, B]. If a collation for B was already provided at the previous relay chain block, it is still valid at this one; hence we should consider the first B in the queue occupied and accept only one more collation for B.

This lets us limit the number of provided candidates precisely, while also accounting for other paras sharing the core. Because the claim queue is per core, this also naturally covers elastic scaling: more cores, more candidates can be provided.
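The counting rule above can be sketched as follows. This is a hypothetical illustration, not polkadot-sdk code: `ParaId`, the flat claim-queue slice, and the `pending` counter are simplified assumptions.

```rust
// Hypothetical sketch: how many more collations a backer should accept
// for a para, given one core's claim queue and the candidates already
// provided (possibly at earlier relay parents but still valid).
type ParaId = u32;

fn free_spots(claim_queue: &[ParaId], para: ParaId, pending: usize) -> usize {
    // Spots for this para in the claim queue, minus spots already
    // occupied by pending candidates.
    let claims = claim_queue.iter().filter(|&&p| p == para).count();
    claims.saturating_sub(pending)
}

fn main() {
    // Claim queue [A, B, A, B], with A = 1 and B = 2.
    let queue = [1, 2, 1, 2];
    // A collation for B from the previous relay chain block is still
    // valid, so the first B spot counts as occupied: one spot left.
    assert_eq!(free_spots(&queue, 2, 1), 1);
    // Nothing pending for A: both A spots are free.
    assert_eq!(free_spots(&queue, 1, 0), 2);
    println!("ok");
}
```

Because the accounting is per core, a para with claims on several cores simply gets this check per core, which is how elastic scaling falls out naturally.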

allowed_ancestry_len

If there is still a free spot in the claim queue from the point of view of the provided relay parent, the collation will be accepted. This is sufficient for the collator protocol. We will still need to track the allowed relay parents in the runtime, but the buffer size can be derived from the claim queue length.
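A minimal sketch of such a buffer, assuming the capacity is derived from the claim queue length rather than a static allowed_ancestry_len; the struct and method names are illustrative, not actual runtime storage items.

```rust
// Hypothetical sketch: a ring buffer of allowed relay parents whose
// capacity comes from the claim queue length instead of a constant.
use std::collections::VecDeque;

struct AllowedRelayParents {
    buffer: VecDeque<u64>, // relay-parent block numbers, newest at the back
    capacity: usize,       // derived from the claim queue length
}

impl AllowedRelayParents {
    fn new(claim_queue_len: usize) -> Self {
        Self { buffer: VecDeque::new(), capacity: claim_queue_len }
    }

    /// Record a new relay parent, evicting the oldest once full.
    fn on_new_relay_parent(&mut self, block_number: u64) {
        if self.buffer.len() == self.capacity {
            self.buffer.pop_front();
        }
        self.buffer.push_back(block_number);
    }

    fn is_allowed(&self, block_number: u64) -> bool {
        self.buffer.contains(&block_number)
    }
}

fn main() {
    let mut allowed = AllowedRelayParents::new(3);
    for n in 1..=4 {
        allowed.on_new_relay_parent(n);
    }
    // Block 1 was evicted once the buffer of 3 filled up.
    assert!(!allowed.is_allowed(1));
    assert!(allowed.is_allowed(4));
    println!("ok");
}
```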

Prerequisite: #4776 - otherwise entries stay valid longer than they should, and the above reasoning is void.

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/elastic-scaling-mvp-launched/9392/4


sandreim commented Sep 23, 2024

We've been discussing implementation options in the context of collation fetching fairness and elastic scaling. The gist is that we target a few things with this change:

  • collation fetching fairness (already implemented in Collation fetching fairness #4880)
  • proper spam protection against a large number of candidates (We can't really bump the static async backing parameters on Polkadot, see Bump async backing parameters #4287)
  • ensure we avoid useless backing work (seconding more candidates than assignments, fetching too many collations, etc)
  • async backing parameters are obsoleted by the claim queue

I have looked a bit at the code and I want to propose the following:

  • move the throttling code into the backing subsystem + implement Collation fetching fairness - handle group rotations #5754
  • the backing subsystem will throttle CanSecond queries from the collator protocol and Seconded statements from statement distribution. This ensures we never back more collations than there are assignments in the claim queue
  • all other usages of async backing parameters are migrated to dynamic async backing parameters generated from the claim queue

In statement distribution and prospective parachains I propose to be less strict and just limit based on the dynamic async backing params. At the system level, honest nodes are stricter and limit the number of candidates in the collator protocol and backing. Malicious backing groups can still provide more candidates, but these will still be limited to the number of assignments times the scheduling_lookahead. If we consider this a problem, we could query the backing subsystem for throttling as well, but it might become a bottleneck preventing the node from doing useful work.
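The laxer bound described above can be sketched as follows; the function name and parameters are illustrative, not polkadot-sdk APIs.

```rust
// Hypothetical sketch: the upper bound on candidates a (possibly
// malicious) backing group can push into statement distribution /
// prospective parachains, per para. Honest nodes enforce the stricter
// claim-queue limit in the collator protocol and backing instead.
fn max_tracked_candidates(assignments_in_claim_queue: usize, scheduling_lookahead: usize) -> usize {
    assignments_in_claim_queue * scheduling_lookahead
}

fn main() {
    // A para with 2 claim-queue assignments and a scheduling lookahead
    // of 3 can have at most 6 candidates tracked at once.
    assert_eq!(max_tracked_candidates(2, 3), 6);
    println!("ok");
}
```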

The high-level diagram below shows the current flow and the throttling points (red arrows):

[Diagram: dynamic_async_backing_params.drawio - current flow with throttling points marked by red arrows]

Labels: None yet
Projects: Status: Backlog
Development: No branches or pull requests
4 participants