Add complexed enzymes as possible catalysts while building sim_data #1088

ggsun merged 4 commits into master from ecocyc-cleanup2 on Jun 15, 2021

ggsun (Contributor) commented Jun 12, 2021

This PR changes how sim_data.metabolism is built so that any protein complex containing an enzyme as a subunit always retains that enzyme's catalytic activity. This makes the manual changes to the metabolic_reactions.tsv file unnecessary. I also pulled two hard-coded parameters out of process/metabolism.py into a flat file of their own.
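
The idea described above can be illustrated with a minimal sketch. It is not the PR's actual implementation: the function name, the dict layouts, and the molecule IDs are hypothetical, and it assumes that activity also propagates through nested complexes, which the description does not state.

```python
from collections import defaultdict


def add_complexed_catalysts(reaction_catalysts, complex_subunits):
    """Return a new reaction -> catalysts dict in which every complex that
    contains a listed catalyst (directly or via nested complexes) is added.

    Both argument layouts are assumptions made for this sketch.
    """
    # Invert complex -> subunits into subunit -> parent complexes.
    parents = defaultdict(set)
    for cplx, subunits in complex_subunits.items():
        for subunit in subunits:
            parents[subunit].add(cplx)

    expanded = {}
    for reaction, catalysts in reaction_catalysts.items():
        seen = set(catalysts)
        stack = list(catalysts)
        # Walk upward from each catalyst to every complex that contains it.
        while stack:
            molecule = stack.pop()
            for parent in parents.get(molecule, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        expanded[reaction] = sorted(seen)
    return expanded


# Hypothetical toy data, not taken from the flat files.
reaction_catalysts = {"RXN-A": ["ENZ-1"], "RXN-B": ["ENZ-2"]}
complex_subunits = {
    "CPLX-1": ["ENZ-1", "PROT-X"],
    "CPLX-2": ["CPLX-1", "PROT-Y"],
}
expanded = add_complexed_catalysts(reaction_catalysts, complex_subunits)
```

In this toy case RXN-A gains both CPLX-1 and CPLX-2 as catalysts, while RXN-B is unchanged because ENZ-2 is not part of any complex.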

tahorst (Member) left a comment

Looks good! It's great automating out manual changes but this also makes it more important to have a way of easily looking at data produced from sim_data or create intermediate files from raw_data before sim_data as we've talked about before. In this case, seeing all enzymes for each reaction could be helpful for troubleshooting or analysis but we won't see that in the flat files.

Do you know how many reactions are affected by these changes? Is it just the ones you manually changed or a lot more?

Should we also do the same for equilibrium reactions in addition to complexation? Before #1065 we had to add PUTA-CPLXBND to some reactions catalyzed by PUTA-CPLX which would be included in equilibrium.

@@ -0,0 +1,3 @@
name value units _source _comments
"ppi_concentration" 5e-4 "units.mol/units.L" "multiple sources"
"pH" 7.2 ""

Member

Should we move other metabolism parameters from parameters.tsv here or is it better to just stick these in parameters.tsv? Not sure if we want to tradeoff a bunch of parameter files for more distinction between the parameters in them. We could also use comment rows to distinguish groups like metabolism, charging, ppGpp etc parameters in a single parameters.tsv file.

Contributor Author

Ahh that's a good point. I like the idea of just differentiating between different parameter types within a single file. I can also think of having just two different parameter files - one for parameters that are experimentally measured, and the other for "modeling parameters" like the kinetic objective weight.

Member

> I can also think of having just two different parameter files - one for parameters that are experimentally measured, and the other for "modeling parameters" like the kinetic objective weight.

That does seem like a good distinction to make
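
The single-file option discussed above, with comment rows separating groups, could be read along these lines. This is only a sketch under assumptions: the `#` comment-row convention, the group names, the column layout, and the `hypothetical_rate` row are all invented for illustration and do not describe the repository's actual parameters.tsv.

```python
import csv


def read_grouped_parameters(text):
    """Parse tab-separated parameter rows, using '#' comment rows as group
    labels. Returns {group: {name: row_dict}}. The format is hypothetical."""
    groups = {}
    current = "ungrouped"
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            # A comment row starts a new parameter group.
            current = line.lstrip("#").strip()
            continue
        row = next(csv.reader([line], delimiter="\t"))
        if header is None:
            header = row
            continue
        record = dict(zip(header, row))
        groups.setdefault(current, {})[record["name"]] = record
    return groups


# Illustrative content; 'hypothetical_rate' is a made-up parameter.
example = (
    "name\tvalue\tunits\n"
    "# metabolism\n"
    "ppi_concentration\t5e-4\tunits.mol/units.L\n"
    "pH\t7.2\t\n"
    "# charging\n"
    "hypothetical_rate\t1.0\t1/units.s\n"
)
grouped = read_grouped_parameters(example)
```

The same reader would work for the two-file split (measured versus modeling parameters) by calling it once per file, so the choice between the layouts is mostly about how people browse the data rather than how code loads it.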

ggsun (Contributor Author) commented Jun 14, 2021

> Looks good! It's great automating out manual changes but this also makes it more important to have a way of easily looking at data produced from sim_data or create intermediate files from raw_data before sim_data as we've talked about before. In this case, seeing all enzymes for each reaction could be helpful for troubleshooting or analysis but we won't see that in the flat files.

I totally agree with this. It could also be a good undergrad/rotation starter project. I'll open an issue proposing this.

> Do you know how many reactions are affected by these changes? Is it just the ones you manually changed or a lot more?

There are a total of 118 reactions that get more enzymes associated with them than what the original flat file suggests with this change.

> Should we also do the same for equilibrium reactions in addition to complexation? Before #1065 we had to add PUTA-CPLXBND to some reactions catalyzed by PUTA-CPLX which would be included in equilibrium.

I didn't expect to find any equilibrium proteins acting as enzymes so thanks for letting me know - do you know if this is a rare case or is more common (TFs also having enzymatic activity)? Would it be safe to assume metabolite-bound proteins have the same enzymatic activity?

tahorst (Member) commented Jun 14, 2021

> There are a total of 118 reactions that get more enzymes associated with them than what the original flat file suggests with this change.

Wow quite a bit! I wonder if any of them would help with cases where we have enzymes go to 0 counts. I think I've checked most of those enzymes on EcoCyc and they weren't part of larger complexes but I'd imagine this change makes things more robust!

> do you know if this is a rare case or is more common (TFs also having enzymatic activity)?

I would imagine it's fairly rare, but several more cases probably exist. I know AlaS is also supposed to regulate its own expression as well as act as a synthetase, but we don't model it as a TF and the Ala binding is also disabled to make handling it as a synthetase easier.

> Would it be safe to assume metabolite-bound proteins have the same enzymatic activity?

That is a great question. I feel like binding would change the conformation enough to change kinetics of reactions if not completely remove the ability to catalyze a reaction but it's also probably dependent on each molecule. We can probably assume the same activity for now unless we find an example otherwise.

ggsun force-pushed the ecocyc-cleanup2 branch from 808b77e to df45b8d on June 14, 2021 23:06
ggsun merged commit 7a8eac7 into master on Jun 15, 2021
ggsun deleted the ecocyc-cleanup2 branch on June 15, 2021 00:49
1fish2 (Contributor) commented Jun 16, 2021

> ... this also makes it more important to have a way of easily looking at data produced from sim_data or create intermediate files from raw_data before sim_data as we've talked about before. In this case, seeing all enzymes for each reaction could be helpful for troubleshooting or analysis but we won't see that in the flat files.

You can look through sim_data by running this in a Python Console in PyCharm:

import pickle
path = 'out/manual/kb/simData.cPickle'
# Open in binary mode; the context manager closes the file after loading.
with open(path, 'rb') as f:
    sim_data = pickle.load(f)

Then click the "Show Variables" eyeglasses icon if needed and expand the sim_data tree.

> Do you know how many reactions are affected by these changes? Is it just the ones you manually changed or a lot more?

Is compareParca useful for this? We could add a way to filter the comparison.

tahorst (Member) commented Jun 16, 2021

> Then click the "Show Variables" eyeglasses icon if needed and expand the sim_data tree.

That looks super useful! Might be a little overwhelming but much easier than doing things from the command line. There still might be some cases where an exported table format would be preferred but this definitely can make exploring sim_data a lot easier!
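
For the cases where an exported table would be preferred, a small helper along these lines could dump a reaction-to-enzymes mapping as tab-separated text. This is a sketch only: the function name and the assumed `{reaction: [enzymes]}` layout are hypothetical, and how such a mapping would be pulled out of sim_data is left open.

```python
import csv
import io


def write_reaction_enzyme_table(reaction_enzymes, handle):
    """Write one row per reaction, with its enzymes joined by ';'.

    `reaction_enzymes` is assumed to be a dict of reaction ID -> enzyme IDs.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["reaction", "enzymes"])
    # Sort both reactions and enzymes so the output is stable across runs.
    for reaction in sorted(reaction_enzymes):
        enzymes = ";".join(sorted(reaction_enzymes[reaction]))
        writer.writerow([reaction, enzymes])


# Hypothetical toy data written to an in-memory buffer.
buffer = io.StringIO()
write_reaction_enzyme_table(
    {"RXN-B": ["ENZ-2"], "RXN-A": ["ENZ-1", "CPLX-1"]}, buffer
)
table_text = buffer.getvalue()
```

Passing an open file handle instead of the in-memory buffer would write the same table to disk, where it can be diffed or opened in a spreadsheet.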

> Is compareParca useful for this? We could add a way to filter the comparison.

Probably. I forget exactly how the enzymes get stored; I think it's a dict, maybe mapping reactions to enzymes. I'd imagine it would pick up on all the differences, hopefully in an easy-to-read way.
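
If the enzymes really are stored as a dict from reaction to enzymes (an assumption; the comment above is not sure), a filtered comparison of two builds could be sketched as follows. This does not use compareParca or any of its options; the function name and the toy data are hypothetical.

```python
def reactions_with_added_enzymes(before, after):
    """Return {reaction: sorted list of enzymes present in `after` but not
    in `before`}, skipping reactions that gained nothing."""
    added = {}
    for reaction, enzymes in after.items():
        new = set(enzymes) - set(before.get(reaction, ()))
        if new:
            added[reaction] = sorted(new)
    return added


# Hypothetical mappings from two versions of the build.
before = {"RXN-A": ["ENZ-1"], "RXN-B": ["ENZ-2"]}
after = {"RXN-A": ["ENZ-1", "CPLX-1"], "RXN-B": ["ENZ-2"]}
added = reactions_with_added_enzymes(before, after)
n_affected = len(added)
```

Counting the keys of the result gives the number of reactions that gained enzymes, which is the kind of figure quoted earlier in the thread.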
