This repository has been archived by the owner on Nov 9, 2023. It is now read-only.
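I'm trying to use filter_otus_from_otu_table.py and filter_samples_from_otu_table.py with no success. The three files needed to reproduce the issues are here: test2git.zip.
If I filter with a file containing just one observation (prueba.txt), it works:
$ filter_otus_from_otu_table.py -i otu.2test.metagenomes.biom -o otu.metagenomes.prueba.biom -e prueba.txt --negate_ids_to_exclude
But if I want to get two observations (file prueba2.txt):
$ filter_otus_from_otu_table.py -i otu.2test.metagenomes.biom -o otu.metagenomes.prueba.biom -e prueba2.txt --negate_ids_to_exclude
it doesn't work, and it returns:
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
The same happens if I reuse the list with one observation (first example) but omit the option --negate_ids_to_exclude, so the filtering fails when the output should contain multiple observations/samples, but not when it contains only one. The error is also reproduced if I use biom directly:
$ biom subset-table -i otu.2test.metagenomes.biom -a observation -s prueba2.txt -o otu.2test.metagenomes.prueba.biom
A related issue in biom-format (#513) suggests that it may be a problem with the metadata. If I try to convert to JSON:
$ biom convert -i otu.2test.metagenomes.biom -o otu.2test.metagenomes.json.biom --table-type="OTU table" --to-json
I get this error:
TypeError: array([u'["cathepsin L [EC:3.4.22.15]"]'], dtype=object) is not JSON serializable
And if I try to convert it to HDF5 with the suggested option --collapsed-samples:
$ biom convert -i otu.2test.metagenomes.biom -o otu.2test.metagenomes.hdf5.biom --table-type="OTU table" --to-hdf5 --collapsed-samples
I get:
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Please note that I checked that the fixes for this bug (#759) were incorporated in my code. If it helps, I found a similar issue in the CellProfiler project (#995).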
The QIIME 1 Forum is likely your best bet because you have a mixture of QIIME 1 and biom-format commands, but you could instead try the biom-format issue tracker. Please don't post in both locations, since many of the same developers monitor both. Either way, we don't provide user support for QIIME 1 or biom-format on this issue tracker. Thanks!
Sorry for still answering here, but I think it is useful to post the following because it clarifies the problem, just in case someone finds it here.
I was able to perform the filtering by stitching together pieces of the code that PICRUSt uses to handle these tables. It confirms that the problem comes from the metadata:
import json
from biom import load_table
from picrust.util import write_biom_table, picrust_formatter

table = load_table('otu.2test.metagenomes.biom')

# Code found in categorize_by_function.py:
# metadata are not deserializing correctly. Duct tape it.
update_d = {}
for i, md in zip(table.ids(axis='observation'),
                 table.metadata(axis='observation')):
    # Each value holds a JSON-encoded string; decode it into a plain object.
    update_d[i] = {k: json.loads(v[0]) for k, v in md.items()}
table.add_metadata(update_d, axis='observation')

# Read the observation IDs to keep, one per line, skipping blank lines.
with open("prueba2.txt") as target:
    genes = [row.strip() for row in target if row.strip()]
table_red = table.filter(genes, axis='observation', inplace=False)

# Output in BIOM format, as found in predict_metagenomes.py
format_fs = {'KEGG_Description': picrust_formatter,
             'COG_Description': picrust_formatter,
             'KEGG_Pathways': picrust_formatter,
             'COG_Category': picrust_formatter}
write_biom_table(table_red, 'table.test.biom', format_fs=format_fs)  # HDF5
# write_biom_table(table_red, 'table.test.biom', write_hdf5=False, format_fs=format_fs)  # JSON
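To make the metadata repair step easier to see in isolation, here is a minimal sketch that uses only the standard library. It assumes each metadata value is a one-element sequence holding a JSON-encoded string, as the error message above suggests. The helper name decode_metadata and the sample dictionary are hypothetical, and plain lists stand in for the object-dtype NumPy arrays found in the real table.

```python
import json


def decode_metadata(md):
    """Return a copy of one metadata dict with each JSON-encoded value decoded.

    Each value is assumed to be a one-element sequence holding a JSON string,
    mirroring the v[0] access in the workaround above.
    """
    return {key: json.loads(value[0]) for key, value in md.items()}


# Hypothetical stand-in for one observation's metadata; plain lists are used
# here in place of the object-dtype NumPy arrays seen in the error message.
raw = {"KEGG_Description": ['["cathepsin L [EC:3.4.22.15]"]']}

decoded = decode_metadata(raw)
print(decoded)

# Once decoded, the metadata holds only plain Python objects, so the
# standard json module can serialize it again.
print(json.dumps(decoded))
```

After decoding, the values are ordinary lists of strings rather than strings that still contain JSON, which is presumably why the table can then be filtered and written out.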