Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Variable deregistry #12554

Merged
merged 4 commits into from
Nov 13, 2024
Merged

[core] Variable deregistry #12554

merged 4 commits into from
Nov 13, 2024

Conversation

jcotela
Copy link
Member

@jcotela jcotela commented Jul 17, 2024

📝 Description
Each application should deregister the variables it registered. They are not being removed from the registry currently, and I suspect this is causing memory errors on tests. (It definitely did for the tests at Altair involving multiple applications).

I'm creating this as a draft for now, just to see if it passes CI. If this ends up going in, I think we need an implementation of something like RegistryItem::IsSameType (added in this PR) that does not rely on a try/catch.

FYI @loumalouomega @pooyan-dadvand @roigcarlo

🆕 Changelog

  • Add RegistryItem::IsSameType
  • Each application de-registers the variables it registered now.

@jcotela jcotela self-assigned this Jul 17, 2024
@jcotela jcotela requested a review from a team as a code owner July 17, 2024 15:13
@jcotela jcotela marked this pull request as draft July 17, 2024 15:15
Copy link
Member

@loumalouomega loumalouomega left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, okay from my side

Comment on lines +321 to +331
void Register() const {
const std::string variable_path("variables.all."+Name());
if (!Registry::HasItem(variable_path)) {
Registry::AddItem<VariableType>(variable_path, *this);
Registry::AddItem<VariableType>("variables."+Registry::GetCurrentSource()+"."+Name(), *this);
} else if(!Registry::GetItem(variable_path).IsSameType(*this)) {
KRATOS_ERROR << "Attempting to register " << Name()
<< " but a variable with the same name and different type already exists"
<< std::endl;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious, since registering the same variable from multiple apps causes some seg fault at the moment, shoudn't we throw an error if it exists, rather than checking the type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that Kratos should be able to support registering the same variable from multiple applications, as long as the type is the same. Using the registry in this way would be a step in that direction, since users of the registry are sure to always get the same thing when retrieving a given variable (something that is not so clear with components if Kratos allows overwriting variables as is happening now).

Comment on lines +322 to +332
void KratosApplication::DeregisterVariables() {
const std::string path = "variables."+mApplicationName;
if (Registry::HasItem(path)) {
auto& r_variables = Registry::GetItem(path);
// Iterate over items at path. For each item, remove it from the mappers.all branch too
for (auto i_key = r_variables.KeyConstBegin(); i_key != r_variables.KeyConstEnd(); ++i_key) {
Registry::RemoveItem("variables.all."+*i_key);
}
Registry::RemoveItem(path);
}
}
Copy link
Member

@sunethwarna sunethwarna Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes the variables registered by the application from the all, hence we don't allow registering same variable from multiple apps. So, I would as mentioned in the earlier comment, throw an error if the same variable name is registered from multiple places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that the scope of this PR is very narrow (it is intended only for testing, it is not a general "unload application" mechanism). What I am doing here is just ensuring that whoever registered the variable takes care of de-registering it, not ensuring that the kernel remains in a complete state after application de-registry. I understand that this is consistent with what the de-registration logic does for other types of objects.

@jcotela
Copy link
Member Author

jcotela commented Jul 18, 2024

Ok, it seems that this PR is not helping with the pipeline errors. We do need this capability it on the Altair side to avoid memory errors (since we have tests involving multiple applications and this becomes more of an issue), but we can work around it if it is not taken care of by the application de-registry automatically.

I will keep the PR open for a while for discussion.

@roigcarlo
Copy link
Member

roigcarlo commented Oct 14, 2024

I completely forgot about this sorry.

I am not against the changes (this is at least more consistent than what we have) but I see that the same problems is going to happen again.
Normally the usage that we do is to load everything and then finish the simulation. In scenarios where dynamic loading and unloading of applications is happening, the variables registered this way end up in an undefined status.

Ej:

  • A Registers V1
  • B Registers V1 (cannot because repeated)
  • A is unloaded
  • A unloads V1
    ---- V1 should exist here because is part of B but cannot -----

Which again is an ownership problem. For me the ideal scenario is that V1 still exists as a variable, which with components is not possible because it simply links to a memory address that is no longer alive (ownership problem), but with the registry we should be able to return a "handler" o call it as you want that holds the memory space regardless of which application has registered as long as types mach (the RegistryItem), so the application only registers how to construct that variable, but not the variable per se.

So doing something like:

Registry::AddItem<VariableType>(variable_path, [&](){return new Variable<TDataType>()});

and have the Registry (not the RegistyItem) GetValue as

// Syntax is wrong but you get the idea
TDataType GetValue<TDataType>() {
    if This->mValue == null
        mValue = this->GetValue()(); // Calling the lambda, by copy elision RegistryItem is the new owner
    return std::any_cast<TDataType>(mValue)
}

@jcotela
Copy link
Member Author

jcotela commented Oct 16, 2024

@roigcarlo I agree that avoiding the ownership problem would be an improvement, but I am not sure that we can achieve it with the current approach. Going by your example, if the lambda referenced in the registry is defined in application A, I believe you won't be able to call it once the application is unloaded.

In fact, the specific problem that motivated me to go ahead with this draft are variables with custom type. If the class of the variable is defined in a dynamically loaded application, the specific error we get is that destructor for the reference variable instance is being called when the registry is cleared, at the end of the run, but the library that defined the class it is no longer loaded. I would expect something similar to happen in your example when the lambda for V1 is called by the registry after A is unloaded.

Copy link
Member

@roigcarlo roigcarlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then lets go ahead with this for the moment until we find something more robust.

@jcotela jcotela marked this pull request as ready for review November 13, 2024 10:58
@jcotela jcotela merged commit cab1cb3 into master Nov 13, 2024
11 checks passed
@jcotela jcotela deleted the core/variable-deregistry branch November 13, 2024 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants