Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime field count #352

Merged
merged 18 commits into from
Sep 20, 2023
Merged

Conversation

jtraglia
Copy link
Member

@jtraglia jtraglia commented Aug 24, 2023

With this PR, one build of c-kzg-4844 will work with both mainnet and minimal trusted setups. If merged, I will update the bindings in separate PRs. Until then, the CI tests for the bindings are expected to fail.

  • Remove the FIELD_ELEMENTS_PER_BLOB macro (which was either 4096 or 4).
    • This is now accessible via kzg_settings->field_elements_per_blob after loading the trusted setup.
  • Allocate blobs and polynomials on-demand on the heap.
    • We cannot allocate them on the stack anymore, since the count isn't a constant.
    • This will also reduce stack usage, which should prevent stack overflows.
  • Remove type definitions for Blob and Polynomial.
    • It will now be up to the bindings to define the blob type.
    • I don't want to do typedef uint8_t Blob because that could be confusing.
  • Remove the LIB_PREFIX hack for supporting both builds.
    • This is no longer necessary.

For the tests:

  • Use new MAINNET and MINIMAL macros to determine which to test.
  • Replace stack allocated blobs/polynomials with heap allocated ones.

@jtraglia jtraglia requested review from ppopth and asn-d6 and removed request for ppopth August 25, 2023 00:01
@jtraglia jtraglia changed the title Remove FIELD_ELEMENTS_PER_BLOB macro Runtime field count Aug 30, 2023
@asn-d6
Copy link
Contributor

asn-d6 commented Aug 31, 2023

IMO this is a 'nice to have'. It should get reviewed by all me, @ppopth and @GottfriedHerold . I can do a first review next week.

@ppopth
Copy link
Member

ppopth commented Sep 11, 2023

I'm not quite convinced that the LIB_PREFIX thing is a hack. I heard that the problem is that lighthouse has to use two dependent crates for this and it has to add another slash for the second crate, as follows.

c-kzg = { git = "https://github.com/ethereum/c-kzg-4844", rev = "fa3c62989527073fdce8b2138bb27a52bb2407c5" , features = ["mainnet-spec"]}
c_kzg_min = { package = "c-kzg", git = "https://github.com/ethereum//c-kzg-4844", rev = "fa3c62989527073fdce8b2138bb27a52bb2407c5", features = ["minimal-spec"], optional = true }

I think, this one is unnecessary because you can just use one crate for both mainnet and minimal, instead of using two. You can probably put the mainnet in one module and the minimal in another module.

Can you give a reason why you think we should eliminate the LIB_PREFIX thing?

For the stack overflow problem in Rust, I think we don't have to remove the Blob definition. It's better to keep it as is. If some bindings have the problem with it, they can just ignore it and redefine that type.

For example, the Rust binding can redefine the Blob type as some kind of Vec<u8> and, when it wants to use the c-kzg functions, it should be able to transmute the pointer of the new Blob type to the pointer of the original Blob type.

Please correct me if I'm wrong.

@jtraglia
Copy link
Member Author

Can you give a reason why you think we should eliminate the LIB_PREFIX thing?

Simplicity.

  • This change makes kzg code in lighthouse much simpler.
  • It would allow us to publish a crate which works for both presets.
  • It will reduce the size of distributions, as only one library per platform is required, instead of two.

For the stack overflow problem in Rust, I think we don't have to remove the Blob definition. It's better to keep it as is. If some bindings have the problem with it, they can just ignore it and redefine that type.

While this PR will result in less stack usage, the stack overflow problem with the rust bindings has been fixed:

Regarding the blob definition, I really wish we could keep this type but I don't think there's a strong reason it needs to exist. For example, nowhere in the C library (excluding the C tests) do we instantiate a blob. Only pointers are used.

We could create a blob type like: typedef uint8_t Blob; But I dislike this because it could be misused. For example, if someone were to instantiate a blob via Blob blob; it would be the incorrect size (just one byte). I'm definitely willing to compromise here though.

For example, the Rust binding can redefine the Blob type as some kind of Vec<u8> and, when it wants to use the c-kzg functions, it should be able to transmute the pointer of the new Blob type to the pointer of the original Blob type.

Sure, I suppose that's possible but it sounds a bit more complicated.

Copy link

@mratsim mratsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

In terms of API I would rename KZGSettings as a context using ctx for it and make it the first parameter as it's pretty standard in cryptographic libraries but that's another PR or RFC.

src/c_kzg_4844.c Outdated Show resolved Hide resolved
@jtraglia
Copy link
Member Author

In terms of API I would rename KZGSettings as a context using ctx for it and make it the first parameter as it's pretty standard in cryptographic libraries but that's another PR or RFC.

I've been wanting to rename KZGSettings to Context for a while. Good idea making it the first parameter. If the team is okay with it, I'll make another PR specifically for this later.

@nisdas
Copy link

nisdas commented Sep 14, 2023

This would be very useful for us in testing, a lot of our end to end tests rely on running minimal configs, so the ability to toggle the blob sizes would be very helpful for us and make maintaining the test infrastructure for blobs much easier.

@jtraglia jtraglia changed the base branch from main to runtime-fields September 14, 2023 14:11
@jtraglia
Copy link
Member Author

Changing the base branch to a new runtime-fields branch. I would like to:

  • Merge this PR into that branch.
  • Make additional PRs to that branch with binding updates.
  • Finally, merge runtime-fields into main.

This allows us to see the "final product" without breaking main while the bindings are being updated.

Copy link
Contributor

@asn-d6 asn-d6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a meta-comment but since this PR will get reviewed by a bunch of people, I think it would be a good idea to make it more pleasant to review.

In particular, ef592fc is the core of the feature but it actually does three things at once:

  • refactors FIELD_ELEMENTS_PER_BLOB
  • refactors Polynomial
  • refactors Blob

It would be great if the branch could be rebased such that these three operations are performed in separate commits. IMO it would make this PR easier to review.

src/c_kzg_4844.c Outdated Show resolved Hide resolved
@jtraglia jtraglia force-pushed the c-runtime-field-count branch from a0043b1 to 2df642d Compare September 14, 2023 19:35
@jtraglia
Copy link
Member Author

The rebase (splitting changes into smaller commits) is still a work in progress. Will finish tomorrow morning.

@jtraglia jtraglia marked this pull request as draft September 15, 2023 01:24
@jtraglia jtraglia marked this pull request as ready for review September 15, 2023 13:27
@jtraglia jtraglia requested a review from asn-d6 September 15, 2023 13:34
@asn-d6
Copy link
Contributor

asn-d6 commented Sep 15, 2023

Done an initial review and this looks really good! Thanks!

I pushed three commits. @jtraglia wanna review them and then squash them and rebase the PR so that it's clean again?
(You will have to squash eed8f3506 and 05c8d7406 into bdc1f0ef5)

It would be great if @GottfriedHerold and @ppopth can also give it a review as well.

@jtraglia
Copy link
Member Author

Your changes look great, thanks for that 🥇

@asn-d6 asn-d6 force-pushed the c-runtime-field-count branch from 3e0aa50 to c8b7d59 Compare September 15, 2023 17:32
@asn-d6
Copy link
Contributor

asn-d6 commented Sep 15, 2023

Your changes look great, thanks for that 🥇

Rebased and force-pushed as per our TG conversation.

I also pushed the pre-rebased branch at https://github.com/asn-d6/c-kzg-4844/tree/pr352_pre_rebase (the one with the three extra commits) in case you want to diff it with the force-pushed code to make sure I didn't screw up the rebase.

@ppopth
Copy link
Member

ppopth commented Sep 16, 2023

This may be a dump question, but why don't we put the size of polynomials and blobs into a field of a struct?
Something like this

typedef struct {
    fr_t *evals;
    uint64_t size;
} Polynomial;

and

typedef struct {
    uint8_t *bytes;
    uint64_t size;
} Blob;

I understand that all the polynomials and the blobs have the same size, but, by doing this, all the polynomials and the blobs will be self-contained and don't depend on external objects like KZGSettings.

Previously we didn't keep the size and we used FIELD_ELEMENTS_PER_BLOB instead which quite makes sense, because FIELD_ELEMENTS_PER_BLOB is just a global constant, not a variable. Now they depend on a variable not a constant, so I think it makes sense to rethink this.

This may need more assertions that the size of polynomials and blobs are equal to the global setting, but this may be worth it.

@jtraglia
Copy link
Member Author

This may be a dumb question, but why don't we put the size of polynomials and blobs into a field of a struct?

So we did try this with polynomials (see this commit), but decided against it for a few reasons. I prefer the way it is now, but I'm open to using poly_t and blob_t types. Not a strong opinion. I think this could be done in a separate PR.

  • Didn't feel that it really simplified that much.
  • Required new initialization/free functions which didn't quite match others in usage.
    • For example, in c_kzg_free we just pass the value, but free_poly takes a pointer.
  • Realized that it's easier to make a mistake with the structure.
    • It must be initialized with {NULL, 0} which is fine, but a little ugly.
    • If you forget to do this, you could free an invalid address (value from the stack).

@asn-d6
Copy link
Contributor

asn-d6 commented Sep 18, 2023

This may be a dumb question, but why don't we put the size of polynomials and blobs into a field of a struct?

So we did try this with polynomials (see this commit), but decided against it for a few reasons. I prefer the way it is now, but I'm open to using poly_t and blob_t types. Not a strong opinion. I think this could be done in a separate PR.

  • Didn't feel that it really simplified that much.

  • Required new initialization/free functions which didn't quite match others in usage.

    • For example, in c_kzg_free we just pass the value, but free_poly takes a pointer.
  • Realized that it's easier to make a mistake with the structure.

    • It must be initialized with {NULL, 0} which is fine, but a little ugly.
    • If you forget to do this, you could free an invalid address (value from the stack).

Yep agreed. I was part of this entire refactoring-then-rerefactoring operation, and I agree that the status quo (without poly_t) seems simpler to read.

Copy link
Contributor

@GottfriedHerold GottfriedHerold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, the changes look good to me.

* Two remarks about n1/n2 requirements.
* Move n1/n2 checks to inner function so they behave the same.
* A remark about the file's position before/after.
@jtraglia
Copy link
Member Author

Generally, the changes look good to me.

Thanks @GottfriedHerold. How do these (2328ac5) changes look? I also moved the n1 and n2 checks from load_trusted_setup_file to load_trusted_setup so they would both behave the same.

Copy link
Contributor

@asn-d6 asn-d6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

@jtraglia jtraglia merged commit 91cc602 into ethereum:runtime-fields Sep 20, 2023
@jtraglia jtraglia deleted the c-runtime-field-count branch September 20, 2023 17:02
@ppopth
Copy link
Member

ppopth commented Sep 28, 2023

  • This change makes kzg code in lighthouse much simpler.

  • It would allow us to publish a crate which works for both presets.

I looked at pawanjay176/lighthouse@a2c780a already. It seems to me that the commit makes things simpler because it changes from

pub struct Kzg<P: KzgPreset> {
    trusted_setup: P::KzgSettings,
}

to

pub struct Kzg {
    trusted_setup: KzgSettings,
}

I think the latter can be done without this PR. As I mentioned in #352 (comment), both mainnet and minimal presets can be combined into one crate which makes the latter possible.

I think, this one is unnecessary because you can just use one crate for both mainnet and minimal, instead of using two. You can probably put the mainnet in one module and the minimal in another module.

In the binding, you can do something like

enum KzgSettings {
    Mainnet(...),
    Minimal(...),
}

Now in the lighthouse, you can do the latter.

  • It will reduce the size of distributions, as only one library per platform is required, instead of two.

I think this is not an issue. The idea of duplicating codes is very common. We do it all the time when we use generics (like in Rust and C++).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants