What version of FIDO are you using?
opf-fido 1.4.1 from PyPI, and also confirmed in commit 6211d66 of the rc/1.6 branch.
How was FIDO installed?
In Ubuntu 18.04 with pip install opf-fido in a Python 2.7 virtual environment.
What did you do to cause this bug to happen?
Ran the fido-update-signatures command.
What did you expect to happen?
A file formats-v97.xml generated in the conf directory with the latest PRONOM file format definitions.
What did you see instead?
This error in the Preparing to convert PRONOM formats to FIDO signatures... step:
Traceback (most recent call last):
File "/tmp/fido-venv/bin/fido-update-signatures", line 8, in <module>
sys.exit(main())
File "/tmp/fido-venv/local/lib/python2.7/site-packages/fido/update_signatures.py", line 194, in main
run(opts)
File "/tmp/fido-venv/local/lib/python2.7/site-packages/fido/update_signatures.py", line 113, in run
prepare_pronom_to_fido()
File "/tmp/fido-venv/local/lib/python2.7/site-packages/fido/prepare.py", line 697, in run
info.load_pronom_xml(puid)
File "/tmp/fido-venv/local/lib/python2.7/site-packages/fido/prepare.py", line 129, in load_pronom_xml
format_ = self.parse_pronom_xml(stream, puid_filter)
File "/tmp/fido-venv/local/lib/python2.7/site-packages/fido/prepare.py", line 278, in parse_pronom_xml
sock = urlopen(url)
File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 435, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 548, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 467, in error
result = self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 654, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/lib/python2.7/urllib2.py", line 435, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 548, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 473, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
...
<ReferenceFile>
<ReferenceFileID>1</ReferenceFileID>
<ReferenceFileName>nurbcup2si.png</ReferenceFileName>
<ReferenceFileDescription>W3C PNG 1.0 reference file: Indexed color (palette) image. It is interlaced, so suitable software can give a progressive display.</ReferenceFileDescription>
<ReferenceFileDocumentation>
</ReferenceFileDocumentation>
<ReferenceFileIPR>
</ReferenceFileIPR>
<ReferenceFileNote>
</ReferenceFileNote>
<ReferenceFileIdentifier>
<Identifier>www.w3.org/Graphics/PNG/nurbcup2si.png</Identifier>
<IdentifierType>URL</IdentifierType>
</ReferenceFileIdentifier>
</ReferenceFile>
<ReferenceFile>
<ReferenceFileID>2</ReferenceFileID>
<ReferenceFileName>666.png</ReferenceFileName>
<ReferenceFileDescription>W3C PNG 1.0 reference file: Large truecolor image generated by a raytracer - a visualisation of a 6 by 6 by 6 color cube in CIE LUV color space.</ReferenceFileDescription>
<ReferenceFileDocumentation>
</ReferenceFileDocumentation>
<ReferenceFileIPR>
</ReferenceFileIPR>
<ReferenceFileNote>
</ReferenceFileNote>
<ReferenceFileIdentifier>
<Identifier>www.w3.org/Graphics/PNG/666.png</Identifier>
<IdentifierType>URL</IdentifierType>
</ReferenceFileIdentifier>
</ReferenceFile>
...
puid.fmt.569.xml
...
<ReferenceFile>
<ReferenceFileID>3</ReferenceFileID>
<ReferenceFileName>Matroska Test Suite - Wave 1</ReferenceFileName>
<ReferenceFileDescription>A set of 8 files meant to cover the basic features a player should support to be considered a good Matroska player.</ReferenceFileDescription>
<ReferenceFileDocumentation>
</ReferenceFileDocumentation>
<ReferenceFileIPR>
</ReferenceFileIPR>
<ReferenceFileNote>
</ReferenceFileNote>
<ReferenceFileIdentifier>
<Identifier>http://www.matroska.org/downloads/test_w1.html</Identifier>
<IdentifierType>URL</IdentifierType>
</ReferenceFileIdentifier>
</ReferenceFile>
...
The problem seems to be that the http://www.matroska.org/downloads/test_w1.html URL has moved to https://www.matroska.org/downloads/test_suite.html, so requesting the old URL fails with the 404 shown in the traceback.
Arguably this needs to be fixed in the PRONOM database, but maybe fido should handle the exception in parse_pronom_xml to guard against future cases.
Can you reproduce this reliably?
Yes.
Additional notes
As far as I understand, if a signature file contains a ReferenceFile/ReferenceFileIdentifier/IdentifierType element with the value URL, fido downloads the file to compute a checksum for it. Currently there are three cases of this: the three ReferenceFile entries quoted above.
Hi @replaceafill, I've actually hit this issue while refactoring and updating the signature generation/update code. I've now added a 404 catch for missing test resources, as these aren't essential to the functioning of FIDO. The upcoming release should fix this issue, but it will also change the way signatures are updated and provide extra options, including the download of pre-compiled signatures from a central site.