Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass object name to AsmAnalyzer #15536

Merged
merged 1 commit into from
Oct 30, 2024

Conversation

rodiazet
Copy link
Contributor

@rodiazet rodiazet commented Oct 22, 2024

This is part of eof contract creation prep

Copy link

Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

@cameel

This comment was marked as resolved.

@rodiazet rodiazet force-pushed the object-name-to-analysis branch 2 times, most recently from 4bca0b1 to 510cdab Compare October 22, 2024 21:09
@rodiazet

This comment was marked as resolved.

cameel
cameel previously approved these changes Oct 22, 2024
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, but there are a few small tweaks that would be nice to have before we merge.

libyul/AsmAnalysis.h Outdated Show resolved Hide resolved
libyul/AsmAnalysis.h Outdated Show resolved Hide resolved
libsolidity/experimental/analysis/TypeInference.cpp Outdated Show resolved Hide resolved
@rodiazet rodiazet force-pushed the object-name-to-analysis branch 4 times, most recently from 68d9801 to bd29212 Compare October 23, 2024 14:50
libyul/AsmAnalysis.h Outdated Show resolved Hide resolved
libyul/AsmAnalysis.cpp Outdated Show resolved Hide resolved
libyul/Object.cpp Outdated Show resolved Hide resolved
libyul/optimiser/StackCompressor.cpp Outdated Show resolved Hide resolved
@rodiazet rodiazet force-pushed the object-name-to-analysis branch 4 times, most recently from d263069 to 6bc3aed Compare October 24, 2024 15:46
cameel
cameel previously approved these changes Oct 25, 2024
libyul/AsmAnalysis.cpp Outdated Show resolved Hide resolved
@rodiazet rodiazet force-pushed the object-name-to-analysis branch 2 times, most recently from f323d7f to 9c33cb0 Compare October 25, 2024 09:07
@rodiazet
Copy link
Contributor Author

Because of an issue with mixing data and object names in one set I implemented in a separated commit a change which separates these two name sets. This allows to properly report an error when trying to use Data instead of Object name in eofcreate or returncontract buildins.

Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some small issues here - m_objectName is still present, but now not getting initialized. Also a missing check against blocks with no debug data, which could lead to a segfault with --debug-info none.

Other than that, just a few minor things, like names or variables that you defined but left unused.

Overall I like the idea of the struct though. I initially wanted a different one (don't get scared by my giant comment there - it's just for context :)) but actually let's go with yours.

std::set<std::string> m_dataNames;
Object::QualifiedNames m_dataNames;
/// Current object name
std::string m_objectName;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the switch to Object::QualifiedNames you no longer initialize this field in constructor so any checks relying on it are broken.

You should remove the field and make the checks use QualifiedNames.rootObjectName instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also rename m_dataNames while at it, since it's no longer just data names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Fixed.

libyul/Object.h Outdated
Comment on lines 102 to 121
struct QualifiedNames
{
std::string rootObjectName;
std::set<std::string> objectPaths;
std::set<std::string> dataPaths;

bool contains(std::string const& _path) const { return containsObject(_path) || containsData(_path); }
bool containsObject(std::string const& _path) const { return objectPaths.count(_path) > 0; }
bool containsData(std::string const& _path) const { return dataPaths.count(_path) > 0; }
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea. I'd just rename some things:

  • Maybe it would be better to call the struct Object::Structure? And rename qualifiedDataNames() to something like summarizeStructure(). It will better reflect its purpose. The Paths bit in member names is enough to show that they contain whole paths, not just names (and some of the paths in it are actually unqualified names).
  • I'd also rename rootObjectName to just objectName or even name to avoid having it mistaken for the root of the whole object tree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record, initially I wanted to propose a slightly different structure that would separate things along different lines. Here's how it would have looked like:

Suggested change
struct QualifiedNames
{
std::string rootObjectName;
std::set<std::string> objectPaths;
std::set<std::string> dataPaths;
bool contains(std::string const& _path) const { return containsObject(_path) || containsData(_path); }
bool containsObject(std::string const& _path) const { return objectPaths.count(_path) > 0; }
bool containsData(std::string const& _path) const { return dataPaths.count(_path) > 0; }
};
/// Summarizes the structure of the subtree rooted at a given object,
/// in particular the paths that can be used from within to refer to nested nodes (objects and data).
/// WARNING: Be careful when changing the set of available paths.
/// It affects the language and removing paths a breaking change.
struct Structure
{
std::string objectName;
/// All valid paths that can be used to reference a subnode from within the current object.
/// This includes all ObjectNodes, i.e. both Object and Data.
/// Note that the same node may be reachable via multiple paths, some of them
/// being unqualified names and some fully-qualified paths.
/// Note also that the name of the object itself is a valid path.
std::set<std::string> allSubNodePaths;
/// Paths that can be used to reach children of the current object that are themselves Objects (and not Data).
/// This is a subset of allSubNodePaths.
std::set<std::string> immediateSubObjectPaths;
};

But after looking at the PR more I'm fine with yours. It has more regularity to it which makes it simpler in some ways.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, getTopLevelSubobjectNames() would actually fit as a helper in this struct now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about renaming the struct and moving the helper?

{
qualifiedNames.objectPaths.insert(subObjectNode->name);

auto const subObjectQualifiedDataNames = subObject->qualifiedDataNames();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused. Looks like you intended to use it to replace the two calls to subObject->qualifiedDataNames() below, but forgot?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this function is in libyul so solAssert -> yulAssert :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This commit is more like a draft. I will improve naming and add doc comments if it's fine for you

@@ -68,6 +68,8 @@ bool AsmAnalyzer::analyze(Block const& _block)
auto watcher = m_errorReporter.errorWatcher();
try
{
// FIXME: Pass location of the object name. Now it's a location of first code section in yul
validateObjectStructure(_block.debugData->nativeLocation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debugData is not always present. It may be null. You need a check against this. Or just use the helper that does that for you:

Suggested change
validateObjectStructure(_block.debugData->nativeLocation);
validateObjectStructure(nativeLocationOf(_block));

@@ -766,3 +768,35 @@ bool AsmAnalyzer::validateInstructions(FunctionCall const& _functionCall)
{
return validateInstructions(_functionCall.functionName.name.str(), nativeLocationOf(_functionCall.functionName));
}

bool AsmAnalyzer::validateObjectStructure(langutil::SourceLocation _astRootLocation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could drop the return value here? You're not checking if anyway.

More importantly, it's the opposite of what I'd expect - true for invalid. Are you maybe following what our weird validateInstruction() was doing? I wanted to change that actually, because that was confusing as hell. Let's not let it spread :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to be consisted with this convention "Some errors -> return true". :) But yeah let's skip return value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's good to be consistent, just not sure we actually have that convention. I only recall it in that one function :)

@cameel
Copy link
Member

cameel commented Oct 25, 2024

BTW, many of the problems we had here, like m_objectName not getting initialized or the source location not being present in test output would normally have been detected in CI by the test suite. We should really add a run with EOF enabled in soltest_all.sh (in a separate PR). Not all tests will pass yet (especially not semantic tests), but we could whitelist only the suites known to work, like yulSyntaxTests and objectCompilerTests.

@rodiazet rodiazet force-pushed the object-name-to-analysis branch 3 times, most recently from 3c7b9b7 to 3937a25 Compare October 28, 2024 09:26
@rodiazet
Copy link
Contributor Author

BTW, many of the problems we had here, like m_objectName not getting initialized or the source location not being present in test output would normally have been detected in CI by the test suite. We should really add a run with EOF enabled in soltest_all.sh (in a separate PR). Not all tests will pass yet (especially not semantic tests), but we could whitelist only the suites known to work, like yulSyntaxTests and objectCompilerTests.

I would do it too. Even started looking on solltest_all.sh but now idea where this should be added. Can someone assist with this?

@cameel
Copy link
Member

cameel commented Oct 28, 2024

I guess I can add it myself. Probably better to keep you focused on the actual EOF bits anyway.

@cameel
Copy link
Member

cameel commented Oct 29, 2024

PR that adds an EOF pass: #15552.

cameel
cameel previously approved these changes Oct 29, 2024
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine now in terms of correctness.

I still have some small nitpicks. Just some wording tweaks and also parts of what I suggested in #15536 (comment). But I don't want to add another round just for those so I did them myself and pushed 3 commits to the object-name-to-analysis branch. So just pull them in (all or some of them; I separated the rename so that it can be easily dropped if you don't like it) squash and then we can merge this.

@rodiazet
Copy link
Contributor Author

This is fine now in terms of correctness.

I still have some small nitpicks. Just some wording tweaks and also parts of what I suggested in #15536 (comment). But I don't want to add another round just for those so I did them myself and pushed 3 commits to the object-name-to-analysis branch. So just pull them in (all or some of them; I separated the rename so that it can be easily dropped if you don't like it) squash and then we can merge this.

Ok. thanks. I understood from the comment that it's fine finally. But yest your wording is better.

@cameel cameel merged commit fd96fcf into ethereum:develop Oct 30, 2024
73 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants