[indexer-alt] - add rpc api ingestion to indexer-alt #20787

Open: wants to merge 10 commits into main

patrickkuo
Contributor

Description

Add RPC API ingestion to indexer-alt. This unblocks a private-net app indexer.

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • gRPC:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:

@patrickkuo patrickkuo force-pushed the pat/indexer-rpc-api branch from 8b65939 to 5d6f78d Compare January 7, 2025 11:28
@patrickkuo patrickkuo force-pushed the pat/indexer-rpc-api branch from 5d6f78d to bf7ed0c Compare January 7, 2025 11:45
@patrickkuo patrickkuo marked this pull request as ready for review January 7, 2025 14:01
@patrickkuo patrickkuo requested review from bmwill, amnn and wlmyng January 7, 2025 14:01
Comment on lines 50 to 51
#[clap(long)]
pub basic_auth: Option<String>,
Contributor

Can you include some documentation of the expected format of this CLI arg? Below you can see that it's expected to be <username>:<password>.

Contributor

+1 on the docs here, and some other comments:

  • Let's also clarify that it is only relevant if --rpc-api-url is set.
  • Consider accepting this value via an environment variable as well.
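
A minimal sketch of the two suggestions above, using only the standard library: the credential is ignored unless an RPC API URL is configured, and an environment variable acts as a fallback for the flag. The variable name `RPC_API_BASIC_AUTH` and both function names are assumptions for illustration, not something this PR defines.

```rust
use std::env;

/// Hypothetical environment variable name (an assumption, not from the PR).
const BASIC_AUTH_ENV: &str = "RPC_API_BASIC_AUTH";

/// Returns the credential string to use, if any.
///
/// - Returns `None` when no RPC API URL is configured, because the
///   credential is only relevant together with `--rpc-api-url`.
/// - Prefers a non-empty flag value, then a non-empty environment value.
fn resolve_basic_auth<F>(
    rpc_api_url: Option<&str>,
    flag: Option<String>,
    lookup: F,
) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // Bail out early if there is no RPC endpoint to authenticate against.
    rpc_api_url?;

    flag.filter(|s| !s.is_empty())
        .or_else(|| lookup(BASIC_AUTH_ENV).filter(|s| !s.is_empty()))
}

/// Convenience wrapper that reads the real process environment.
fn resolve_basic_auth_from_env(
    rpc_api_url: Option<&str>,
    flag: Option<String>,
) -> Option<String> {
    resolve_basic_auth(rpc_api_url, flag, |key| env::var(key).ok())
}
```

Passing the lookup as a closure keeps the logic testable without touching the process environment.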

Comment on lines 79 to 82
let value: MetadataValue<Ascii> = format!("Basic {}", auth)
    .parse()
    .map_err(Into::into)
    .map_err(Status::from_error)?;
Contributor

Given this is a username/password, let's make sure to set the sensitive flag on the MetadataValue:

https://docs.rs/tonic/latest/tonic/metadata/struct.MetadataValue.html#method.set_sensitive
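
For context on why this value is sensitive: under HTTP Basic authentication (RFC 7617) the header carries `Basic ` followed by the base64 encoding of `username:password`, which is trivially reversible. The std-only sketch below builds such a value and wraps it in a type whose `Debug` output is redacted. The `Redacted` wrapper and the function names are hypothetical stand-ins for illustration; they are not tonic's mechanism, and whether `auth` in the snippet above is already encoded is not visible from this thread.

```rust
use std::fmt;

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with `=` padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into one 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Emit padding for the positions that had no input byte.
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Builds the value of an `Authorization: Basic ...` header.
fn basic_auth_header(username: &str, password: &str) -> String {
    let credentials = format!("{}:{}", username, password);
    format!("Basic {}", base64_encode(credentials.as_bytes()))
}

/// Hypothetical wrapper that keeps the secret out of debug output.
struct Redacted(String);

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Redacted(..)")
    }
}
```

Logging a `Redacted` value with `{:?}` prints only the placeholder, which is the kind of protection the sensitive flag is meant to give the real header value.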

.tls_config(
    ClientTlsConfig::new()
        .with_enabled_roots()
        .assume_http2(true),
Contributor

Is this needed? Any TLS negotiation should include h2 in ALPN.

Contributor Author

This is needed for the full node we set up in the private net.

Contributor

@amnn left a comment

Some high-level thoughts, but the change on the indexing framework side looks good here. I see that @bmwill has some unresolved comments on the RPC API side, so I will leave it with him for the final stamp.

Comment on lines 45 to 48
Ok(Bytes::from(
    Blob::encode(&data, BlobEncoding::Bcs)?.to_bytes(),
))
Contributor

I'm fairly sure we're going to decode this straight away, which is a little unfortunate -- is it worth changing the IngestionClientTrait so that fetch returns a CheckpointData?

Contributor Author

Added FetchData, which can be either raw bytes or CheckpointData
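
A rough sketch of the shape such a type could take, so that data fetched over RPC skips the encode-then-decode round trip. Only the names `FetchData` and `CheckPointData` appear in this PR; the `Raw` variant name, the stand-in `CheckpointData` struct, and the toy decoder are assumptions made for illustration (the real code decodes a BCS blob).

```rust
use std::convert::TryInto;

/// Stand-in for the real checkpoint type; the field is hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct CheckpointData {
    sequence_number: u64,
}

/// Either undecoded bytes (e.g. from a remote store) or an
/// already-decoded checkpoint (e.g. from an RPC client).
#[derive(Debug, Clone, PartialEq)]
enum FetchData {
    Raw(Vec<u8>),
    CheckPointData(CheckpointData),
}

/// Toy decoder: reads a little-endian u64 from exactly eight bytes.
fn decode(bytes: &[u8]) -> Result<CheckpointData, String> {
    let arr: [u8; 8] = bytes
        .try_into()
        .map_err(|_| format!("expected 8 bytes, got {}", bytes.len()))?;
    Ok(CheckpointData {
        sequence_number: u64::from_le_bytes(arr),
    })
}

impl FetchData {
    /// Decodes only when the data is still raw.
    fn into_checkpoint(self) -> Result<CheckpointData, String> {
        match self {
            FetchData::Raw(bytes) => decode(&bytes),
            FetchData::CheckPointData(data) => Ok(data),
        }
    }
}
```

With this shape the caller handles both sources through one method, and the decoded variant passes through untouched.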

use sui_storage::blob::{Blob, BlobEncoding};
use url::Url;

pub(crate) struct RpcIngestionClient {
Contributor

I'm in two minds about this myself, but do we need this type (as opposed to implementing IngestionClientTrait directly on sui_rpc_api::Client)?

Comment on lines 96 to 100
let basic_auth = args.basic_auth.map(|s| {
    let split = s.split(":").collect::<Vec<_>>();
    assert_eq!(2, split.len());
    (split[0].to_string(), split[1].to_string())
});
Contributor

Why not accept this as two different flags to avoid hand-rolling the parser?
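
If the single `<username>:<password>` value were kept, a parser that returns an error instead of panicking could look roughly like this. It splits on the first colon only, so a password that itself contains `:` is accepted, whereas the snippet above would hit the `assert_eq!`. The function name and error messages are illustrative assumptions.

```rust
/// Splits `<username>:<password>` on the first colon.
/// Returns an error message instead of panicking on malformed input.
fn parse_basic_auth(s: &str) -> Result<(String, String), String> {
    let (username, password) = s
        .split_once(':')
        .ok_or_else(|| "expected <username>:<password>".to_string())?;

    if username.is_empty() {
        return Err("username must not be empty".to_string());
    }

    Ok((username.to_string(), password.to_string()))
}
```

A function of this shape can also be reused as a custom value parser for a CLI argument, so malformed input is reported at argument-parsing time.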

Contributor Author

I am now using the username and password encoded in the URL instead of having separate args.

@@ -76,6 +77,17 @@ impl IngestionClient {
}
}

pub(crate) fn new_rpc(
Contributor

nit: move this above new_impl so it's with the other constructors?

Contributor

bmwill commented Jan 27, 2025

Note #20986 added support for configuring auth

Contributor Author

patrickkuo commented Jan 30, 2025

> Note #20986 added support for configuring auth

Thanks, that's great! I will rebase and update this PR.

@@ -150,7 +154,7 @@ impl Client {

         let (metadata, response, _extentions) = self
             .raw_client()
-            .max_decoding_message_size(64 * 1024 * 1024)
+            .max_decoding_message_size(128 * 1024 * 1024)
Contributor Author

One of the checkpoints in our test environment exceeded 64 MiB, so I bumped the max decoding message size to avoid the error. @bmwill, do you know what the correct value to set here is?
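
To make the numbers in this change concrete, here is the arithmetic as a small sketch. The 70 MiB figure in the usage note is a made-up example, not the size of the checkpoint from the test environment, and the inclusive comparison is an assumption rather than the client's documented boundary behaviour.

```rust
const MIB: usize = 1024 * 1024;

/// Limit before this change: 67,108,864 bytes.
const OLD_LIMIT: usize = 64 * MIB;

/// Limit after this change: 134,217,728 bytes.
const NEW_LIMIT: usize = 128 * MIB;

/// Whether an encoded message of `encoded_len` bytes fits under `limit`.
fn fits(encoded_len: usize, limit: usize) -> bool {
    encoded_len <= limit
}
```

For instance, `fits(70 * MIB, OLD_LIMIT)` is false while `fits(70 * MIB, NEW_LIMIT)` is true, which matches the failure described above.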

Contributor Author

This PR is ready for more review, thanks!

impl IngestionClientTrait for RpcClient {
    async fn fetch(&self, checkpoint: u64) -> FetchResult {
        let data = self.get_full_checkpoint(checkpoint).await.map_err(|e| {
            if e.message().contains("not found") {
Contributor

Is there a more structured way to identify that a checkpoint is not found based on the response, @bmwill?

Contributor

The errors are still a bit of a mess and I'm actively working on this as we speak. Right now I think the gRPC Code::Unknown is always returned (which is not great).

Contributor

There is a Code::NotFound though, which will eventually be returned here (hopefully by the next release).

Contributor

OK, #21163 should address this and allow you to look at the error code directly.
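
Once the status code is reliable, the classification could match on it rather than on the message text. The sketch below uses small stand-in types so it compiles on its own: `Code`, `Status`, and `FetchError` here are hypothetical mirrors, not the tonic or indexer types. It keeps the substring check only as a fallback for servers that still report `Unknown`.

```rust
/// Stand-in for a gRPC status code (a few variants only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Code {
    NotFound,
    Unknown,
    Unavailable,
}

/// Stand-in for a gRPC status returned by the client.
struct Status {
    code: Code,
    message: String,
}

/// Hypothetical error split: "checkpoint does not exist yet" vs. "retry later".
#[derive(Debug, PartialEq, Eq)]
enum FetchError {
    NotFound,
    Transient(String),
}

fn classify(status: &Status) -> FetchError {
    match status.code {
        // Preferred path: the server reports a structured code.
        Code::NotFound => FetchError::NotFound,
        // Fallback while the server still returns Unknown for everything.
        Code::Unknown if status.message.contains("not found") => FetchError::NotFound,
        // Everything else is treated as retryable in this sketch.
        _ => FetchError::Transient(status.message.clone()),
    }
}
```

The fallback arm can be deleted once every server the indexer talks to returns the structured code.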

FetchData::CheckPointData(data) => {
    self.metrics
        .total_ingested_bytes
        .inc_by(size_of_val(&data) as u64);
Contributor

size_of_val is not going to do the right thing here, because it's not going to look through pointers etc. @bmwill, is there a way to get similar information out from the gRPC client? If not, I would opt for not trying to replicate this metric in the case of RPC clients at all (but then we should update its documentation to mention that).

Contributor

Not presently, but we could hook something in if we needed to. Agreed that the way this is being done right now would produce inaccurate results.

Contributor Author

I will remove this for RPC client for now until we have a better way to get the size information.
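
For anyone revisiting this metric later, a short std-only illustration of the problem discussed in this thread: `size_of_val` reports only the inline size of a value and does not follow pointers to heap allocations. The `Checkpoint` struct and both helper functions are stand-ins invented for this sketch.

```rust
use std::mem::size_of_val;

/// Stand-in type with heap-allocated contents.
struct Checkpoint {
    contents: Vec<u8>,
}

/// What `size_of_val` measures: the struct itself (pointer, length,
/// capacity), independent of how many bytes the Vec holds.
fn shallow_size(c: &Checkpoint) -> usize {
    size_of_val(c)
}

/// A rough deep size for this toy type: inline size plus heap bytes in use.
fn deep_size(c: &Checkpoint) -> usize {
    size_of_val(c) + c.contents.len()
}
```

A checkpoint holding one byte and one holding a million bytes have the same `shallow_size`, so a byte counter fed from it would badly undercount.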

3 participants