Move relay connection logic into main event loop #632

Merged
merged 11 commits into from
Jun 25, 2024

@sandreae sandreae commented Jun 22, 2024

Before this PR, a peer registered on its configured relay nodes only once, at startup, in an initialization phase that ran before the main networking event loop started. The main reason was that listening on the circuit relay would fail quite often, so we needed a retry loop which timed out after x seconds. This pattern was clunky and could not be fitted into the event loop. It also meant that we had to disable the Peers behaviour, re-enable it, and then force-connect to the relay again in order to trigger replication... ugh...

Since #631, listening on the circuit relay no longer seems to fail, so all of the relay connection and registration logic can now move into the main event loop (yay!). This unlocks some nicer patterns for handling relays and means we can register on relays dynamically at runtime. No more disabling the Peers behaviour, and no second forced connection to the relay.

  • introduce a Relay struct which holds the current state of a relay and of the node's registration on it
  • move relay dialing, namespace registration, circuit relay listening and the start of peer discovery into the main event loop
  • revise connection limits now that things are more stable and observable
  • improve logging messages
  • bonus: startup info is emitted via either println!(..) or info!(..), depending on the logging level
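
The "bonus" item can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the `Level` and `Output` enums and the `startup_output` and `print_or_info` functions are hypothetical names, and the rule "use the logger when info level is enabled, otherwise print plainly" is one plausible reading of the bullet, not necessarily what the PR implements. The real code presumably goes through the `log` crate (for example its `log_enabled!` macro), which is left out here so the example compiles on its own.

```rust
// Hypothetical sketch: none of these names are taken from aquadoggo.

/// Log verbosity, ordered from quietest to loudest.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Off,
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Where a startup message ends up.
#[derive(Debug, PartialEq, Eq)]
enum Output {
    /// Sent through the logger (timestamp, level, module path).
    Logger,
    /// Written plainly to stdout.
    Stdout,
}

/// Decide where startup info goes for the configured maximum level.
/// Assumption: if info logs are visible we use them, otherwise we still
/// want the user to see the message, so we fall back to stdout.
fn startup_output(max_level: Level) -> Output {
    if max_level >= Level::Info {
        Output::Logger
    } else {
        Output::Stdout
    }
}

/// Emit a startup message through the chosen output and report which one
/// was used.
fn print_or_info(max_level: Level, message: &str) -> Output {
    let output = startup_output(max_level);
    match output {
        // Stand-in for `info!(..)` so the sketch needs no logging crate.
        Output::Logger => println!("[INFO] {}", message),
        Output::Stdout => println!("{}", message),
    }
    output
}

fn main() {
    print_or_info(Level::Warn, "Node is listening on 0.0.0.0:2022");
    print_or_info(Level::Debug, "Node is listening on 0.0.0.0:2022");
}
```

The point of such a helper would be that essential startup details (peer id, ports) stay visible even when the log level is set below info, without showing them twice when info logging is on.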

check out these beautiful logs!!

[2024-06-22T15:45:36Z INFO  aquadoggo::manager] Start materializer service
[2024-06-22T15:45:36Z INFO  aquadoggo::materializer::worker] Register reduce worker with pool size 16
[2024-06-22T15:45:36Z INFO  aquadoggo::materializer::worker] Register dependency worker with pool size 16
[2024-06-22T15:45:36Z INFO  aquadoggo::materializer::worker] Register schema worker with pool size 16
[2024-06-22T15:45:36Z INFO  aquadoggo::materializer::worker] Register blob worker with pool size 16
[2024-06-22T15:45:36Z INFO  aquadoggo::materializer::worker] Register garbage_collection worker with pool size 16
[2024-06-22T15:45:36Z DEBUG aquadoggo::materializer::service] Dispatch 0 pending tasks from last runtime
[2024-06-22T15:45:36Z DEBUG aquadoggo::materializer::service] Materialiser service is ready
[2024-06-22T15:45:36Z INFO  aquadoggo::manager] Start http service
[2024-06-22T15:45:36Z DEBUG aquadoggo::graphql::schema] Subscribing GraphQL manager to schema provider
[2024-06-22T15:45:36Z DEBUG aquadoggo::graphql::schema] Finished building initial GraphQL schema
[2024-06-22T15:45:36Z INFO  aquadoggo] HTTP port 2020 was already taken, try random port instead ..
[2024-06-22T15:45:36Z INFO  aquadoggo] Go to http://0.0.0.0:40723/graphql to use GraphQL playground
[2024-06-22T15:45:36Z DEBUG aquadoggo::http::service] HTTP service is ready
[2024-06-22T15:45:36Z INFO  aquadoggo::manager] Start network service
[2024-06-22T15:45:36Z INFO  aquadoggo] Peer id: 12D3KooWHNwGZBRLBQqiJsQD8xLy9JmPfJiSptRX8d15ftiVxLrJ
[2024-06-22T15:45:36Z INFO  aquadoggo::network::service] Networking service initializing...
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::behaviour] Identify network behaviour enabled
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::behaviour] Rendezvous client network behaviour enabled
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::behaviour] Relay client network behaviour enabled
[2024-06-22T15:45:36Z INFO  aquadoggo] QUIC port 2022 was already taken, try random port instead ..
[2024-06-22T15:45:36Z INFO  aquadoggo::network::service] Network service ready!
[2024-06-22T15:45:36Z INFO  aquadoggo::manager] Start replication service
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Dial relay at address 127.0.0.1:2022
[2024-06-22T15:45:36Z INFO  aquadoggo] Node is listening on 0.0.0.0:59137
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Connected to /ip4/127.0.0.1/udp/2022/quic-v1 (1/1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Established connection with peer: 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Relay identified 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm /ip4/127.0.0.1/udp/2022/quic-v1
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Told relay 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm its public address
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Registration request sent to relay 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Registered on rendezvous 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm in namespace "aquadoggo"
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Relay 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm accepted circuit reservation request
[2024-06-22T15:45:36Z INFO  aquadoggo::network::service] Discovering peers in namespace "aquadoggo" on relay 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Discovered 2 addresses for peer 12D3KooWHNwGZBRLBQqiJsQD8xLy9JmPfJiSptRX8d15ftiVxLrJ
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Discovered 2 addresses for peer 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Dialed peer 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Connected to /ip4/127.0.0.1/udp/2022/quic-v1/p2p/12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm/p2p-circuit/p2p/12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (1/1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Established connection with peer: 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (2)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::manager] Initiate outbound replication session with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Finished replication with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z DEBUG aquadoggo::network::service] Connected to /ip4/127.0.0.1/udp/55554/quic-v1/p2p/12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (1/2)
[2024-06-22T15:45:36Z INFO  aquadoggo::network::service] Connection with 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba(3) upgraded to direct connection
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Established connection with peer: 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (3)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::manager] Initiate outbound replication session with peer 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (2)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::manager] Initiate outbound replication session with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Finished replication with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Finished replication with peer 12D3KooWJ5veLXRZ1zibpzMHFDfoPuFCMjKAUUHxWEsgQuyaFgba (2)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::manager] Accept inbound replication session with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
[2024-06-22T15:45:36Z INFO  aquadoggo::replication::service] Finished replication with peer 12D3KooWK2KQ1BQHzsQDEsHdGfhndDTDJqdFjFgwTSjCQZRNJ3Jm (1)
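
The relay steps visible in the logs above (identified, registered on the rendezvous, circuit reservation accepted, then peer discovery) can be sketched as a small piece of state that the event loop updates one event at a time. This is a hedged illustration only: the `Relay` fields, the `RelayEvent` enum and the `on_event` method are hypothetical and not aquadoggo's actual types, and the rule that discovery starts once all three steps have completed is an assumption read off the log ordering.

```rust
// Hypothetical sketch; type, field and event names are assumptions and do
// not mirror aquadoggo's real `Relay` struct.

/// Events the main event loop might observe for a single relay.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RelayEvent {
    Identified,
    Registered,
    ReservationAccepted,
}

/// Current state of one relay and of our registration on it.
#[derive(Debug, Default)]
struct Relay {
    identified: bool,
    registered: bool,
    reservation_accepted: bool,
    discovering: bool,
}

impl Relay {
    /// Record one event. Returns `true` exactly once: at the moment all
    /// prerequisites are met and peer discovery should be started.
    fn on_event(&mut self, event: RelayEvent) -> bool {
        match event {
            RelayEvent::Identified => self.identified = true,
            RelayEvent::Registered => self.registered = true,
            RelayEvent::ReservationAccepted => self.reservation_accepted = true,
        }

        let ready = self.identified && self.registered && self.reservation_accepted;
        if ready && !self.discovering {
            self.discovering = true;
            return true;
        }
        false
    }
}

fn main() {
    let mut relay = Relay::default();

    // Stand-in for events arriving in the main event loop.
    let events = [
        RelayEvent::Identified,
        RelayEvent::Registered,
        RelayEvent::ReservationAccepted,
    ];

    for event in events {
        if relay.on_event(event) {
            println!("Discovering peers on relay after {:?}", event);
        }
    }
}
```

Because the state lives in a struct rather than in a blocking startup routine, events can arrive in any order and repeated events are harmless, which is what makes registering on relays during runtime straightforward.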

📋 Checklist

  • Add tests that cover your changes
  • Add this PR to the Unreleased section in CHANGELOG.md
  • Link this PR to any issues it closes
  • New files contain a SPDX license header

codecov bot commented Jun 22, 2024

Codecov Report

Attention: Patch coverage is 17.52577% with 240 lines in your changes missing coverage. Please review.

Project coverage is 92.46%. Comparing base (ba416ea) to head (98848dc).
Report is 58 commits behind head on bump-libp2p.

Current head 98848dc differs from pull request most recent head 2798270

Please upload reports for the commit 2798270 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| aquadoggo/src/network/service.rs | 4.28% | 134 Missing ⚠️ |
| aquadoggo/src/network/relay.rs | 0.00% | 62 Missing ⚠️ |
| aquadoggo/src/network/utils.rs | 0.00% | 32 Missing ⚠️ |
| aquadoggo/src/network/swarm.rs | 60.00% | 6 Missing ⚠️ |
| aquadoggo/src/http/service.rs | 40.00% | 3 Missing ⚠️ |
| aquadoggo/src/lib.rs | 85.71% | 1 Missing ⚠️ |
| aquadoggo/src/network/behaviour.rs | 50.00% | 1 Missing ⚠️ |
| aquadoggo/src/network/peers/handler.rs | 66.66% | 1 Missing ⚠️ |

Additional details and impacted files
@@               Coverage Diff               @@
##           bump-libp2p     #632      +/-   ##
===============================================
+ Coverage        92.35%   92.46%   +0.11%     
===============================================
  Files              106      104       -2     
  Lines            17861    20474    +2613     
===============================================
+ Hits             16495    18931    +2436     
- Misses            1366     1543     +177     


@sandreae sandreae marked this pull request as ready for review June 22, 2024 15:00
@adzialocha adzialocha changed the base branch from bump-libp2p to main June 24, 2024 12:25
@adzialocha adzialocha self-requested a review June 24, 2024 12:25
@adzialocha adzialocha changed the base branch from main to bump-libp2p June 24, 2024 12:25

@adzialocha adzialocha left a comment

😍

@adzialocha

(needs a rebase against main and CHANGELOG.md entry)

@sandreae sandreae force-pushed the relay-connection-in-main-event-loop branch from 98848dc to 3b95ae9 Compare June 25, 2024 16:50
@sandreae sandreae changed the base branch from bump-libp2p to main June 25, 2024 17:09
@sandreae sandreae merged commit 776dfa8 into main Jun 25, 2024
8 checks passed
@sandreae sandreae deleted the relay-connection-in-main-event-loop branch June 25, 2024 19:06
jmanm added a commit to jmanm/aquadoggo that referenced this pull request Jul 13, 2024
* Make clippy happy

* Revert "Make clippy happy"

This reverts commit e250ccd.

* Try fmt and clippy again

* Add clippy suggestions

* Allow setting path to config file via env args (p2panda#611)

* Enable passing path to config file via env args

* Remove println

* Update comment

* Remove unwanted file

* Update CHANGELOG

* Accept domain name and ip addresses for peers (p2panda#612)

* Accept String for relay and direct peer addresses in config

* Use ToSocketAddress to handle ip and domain name addresses

* Clippy

* fmt

* Update CHANGELOG

* Update example config.toml

* Prepare CHANGELOG for release

* 0.7.2

* Fix: query for child relations fails when relation list empty (p2panda#614)

* Add test get_child_document_ids test case for document with empty relation list

* Account for null values when relation lists are empty

* Update test comment

* Update CHANGELOG

* 0.7.3

* Re-run tasks for partially materialized blobs (p2panda#618)

* Check materialized blob file is complete before aborting task

* Add test

* fmt

* Update CHANGELOG

* Clippy

* Correct cmp logic

* Remove double comment

---------

Co-authored-by: adz <x12@adz.garden>

* Fix: include all logs from target schema id during replication (p2panda#620)

* Include tombstoned documents when calculating local log heights

* Clippy

* Update CHANGELOG

* Make clippy happy

* Bump rust gh action to v1 and define toolchain version

* Introduce `PeerAddress` struct for improved address resolution patterns (p2panda#621)

* Introduce PeerAddress struct with socket and multiaddr resolution methods

* Don't pop off p2p protocol from relay address as it isn't there

* fmt

* Update CHANGELOG

* Cache socket addresses

* Remove Multiaddr from PeerAddress

* Remove serde traits from PeerAddress

* Add doc string to PeerAddress

* Rename methods

* Re-apply unhandled operations during startup of materializer service (p2panda#623)

* Store method to get all un-indexed operation ids

* Pick up un-indexed operations when starting materializer service, add a test

* Add entry to CHANGELOG.md

* Increase `max_pending_connections_*` (p2panda#628)

* Increase max pending connections

* Update CHANGELOG

* Dial all configured known relay and direct node addresses on schedule (p2panda#622)

* Poll all known peer addresses

* Update PeerAddress method name

* Update CHANGELOG

* WIP: poll known peers

* Check if a direct node was identified (and add comments)

* Don't dial direct node address on startup, rely on scheduler

* More comments

* Remove unused import

* fmt

* Doc strings for EventLoop struct

* Clippy

* 0.7.4

* Minor CHANGELOG.md formatting change

* Fix: handle connection ids greater than 9 in `Peer` impl of `Human` trait (p2panda#634)

* Handle connection ids greater than 9 in peer Human impl

* Clippy

* Update CHANGELOG

* Bump `libp2p` to version `0.53.2` (p2panda#631)

* Bump libp2p to version 0.53.2

* We don't need to listen on tcp port when in relay mode

* Listening on relay circuit no longer sometimes fails

* Remove tcp feature requirement from libp2p

* Refactor connection_keep_alive method

* Clippy

* Remove unnecessary connection_keep_alive method from peers behaviour

* Add CHANGELOG.md entry

---------

Co-authored-by: adz <x12@adz.garden>

* Move relay connection logic into main event loop (p2panda#632)

* Bump `libp2p` to version `0.53.2` (p2panda#631)

* Bump libp2p to version 0.53.2

* We don't need to listen on tcp port when in relay mode

* Listening on relay circuit no longer sometimes fails

* Remove tcp feature requirement from libp2p

* Refactor connection_keep_alive method

* Clippy

* Remove unnecessary connection_keep_alive method from peers behaviour

* Add CHANGELOG.md entry

---------

Co-authored-by: adz <x12@adz.garden>

* Move network service relay initialization into main event loop

* Clippy

* Add DCUTR event debug logging to swarm

* Change log message

* Adjust connection limits

* Even nicer log messages

* Helper to print or info log depending on log level

* Listening on relay circuit no longer sometimes fails

---------

Co-authored-by: adz <x12@adz.garden>

* Support private net with pre-shared key (p2panda#635)

* Swarm listens on both TCP and QUIC addresses

* Support both QUIC and TCP protocols

* TCP port_reuse should be false

* Establish a private net over TCP when psk provided in NetworkConfig

* Initiate swarm with private net when psk provided in config

* Update CHANGELOG

* Doc string fix

* Don't need to differentiate between transports when detecting port

* Update README

* Fix README formatting

* Update example config file

* Check if blob file exists before deleting it from fs (p2panda#636)

* Check if blob file exists before deleting it from fs

* Add entry to CHANGELOG.md

* Inconsistent blob storage warning was wrongly shown (p2panda#638)

* Inconsistent blob storage warning was wrongly shown

* Add entry to CHANGELOG.md

* Minor config.toml cleanup

* Safely handle missing document when retrieving document view from store (p2panda#637)

* Return None when document was deleted

* Add entry to CHANGELOG.md

* Introduce API to subscribe to peer connection events (p2panda#625)

* Introduce API to subscribe to peer connection events

* Add entry to CHANGELOG.md

* 0.8.0

* Also bump version in aquadoggo_cli, add note about that in RELEASE.md

* Adjust level of replication session and document materialization logs (p2panda#639)

* Remove relay and direct peer poll attempt logging

* Change document creation/update/delete logging to info level

* Lower level of replication session logs to debug

* Update CHANGELOG

* Remove incorrectly committed file

* Lower logging level for replication finished message

* Fix logging logic error in reducer

* Improve GraphQL re-build error

* Update README.md

* Expose NodeEvent to public API (p2panda#643)

* Expose NodeEvent to public API

* Add entry to CHANGELOG.md

---------

Co-authored-by: adz <x12@adz.garden>
Co-authored-by: Sam Andreae <contact@samandreae.com>
Co-authored-by: adz <adzialocha@users.noreply.github.com>