forked from cisocrgroup/ocrd_cis
Port to ocrd core version 3.0.0 #5
Open
MehmedGIT wants to merge 102 commits into bertsky:fix-alpha-shape from MehmedGIT:port-to-v3
Changes from 14 commits
Commits
2ed2c4f (MehmedGIT) add executable property
61e6caf (MehmedGIT) add setup method if missing
a0965c2 (MehmedGIT) add self.logger wherever missing
dbccae5 (kba) require core >= 3.0.0a1
8557a26 (kba) port part of binarize to core v3
911a4c1 (MehmedGIT) Merge pull request #1 from kba/port-to-v3
278b706 (MehmedGIT) move: determine_zoom to common.py
6beec17 (MehmedGIT) move: logger init to setup()
1b2fea3 (MehmedGIT) refactor: log -> logger
fe33494 (MehmedGIT) remove: unused imports
3368a53 (MehmedGIT) remove: file grp cardinality checks inside process()
ae97768 (MehmedGIT) remove: constructors, adapt setup()
60d02d2 (MehmedGIT) completed: OcropyBinarize
dcaccd4 (MehmedGIT) remove file grp cardinality asserts
b178227 (MehmedGIT) Update ocrd_cis/ocropy/binarize.py
67b6107 (MehmedGIT) Update ocrd_cis/ocropy/binarize.py
06a98b1 (MehmedGIT) Update ocrd_cis/ocropy/binarize.py
1e6cd7b (MehmedGIT) Update ocrd_cis/ocropy/binarize.py
71bb26d (MehmedGIT) fix: potentially wrong dpi in logs
64f02a3 (kba) binarize: don't conflate region/lines seg, pass output_file_id
d7c15c7 (MehmedGIT) Update binarize.py
156d79f (MehmedGIT) Merge pull request #2 from kba/fix-binarize-v3
19566c0 (MehmedGIT) try to migrate recognize
5f60976 (MehmedGIT) fix: migrate recognize
e8b2603 (MehmedGIT) fix: detect_zoom logging
7dfd496 (MehmedGIT) update: test_lib base url
033c38a (MehmedGIT) logging exception -> error
46d84d5 (MehmedGIT) refactor: logger as a first positional argument
f6fe4cf (MehmedGIT) fix: test_lib.bash data url
aed0f95 (MehmedGIT) fix: recognize OcrdPage import
804f031 (MehmedGIT) try to migrate clip
7bdff31 (MehmedGIT) remove: process() methods
03c2f15 (MehmedGIT) adapt: docstring of process_page_pcgts
90ac28e (MehmedGIT) refactor: other small things
f24f86b (MehmedGIT) fix: determine_zoom
5f8e1df (MehmedGIT) add missing Levenshtein req in setup
9a14e1d (MehmedGIT) fix: remove version req for Levenshtein
4ca4d14 (MehmedGIT) fix: Levenshtein import
fbaafcb (kba) update ocrd-cis-binarize to be compatible with bertsky/core#8
516ce4b (bertsky) binarize: use final v3 API
2e4f26f (bertsky) binarize: use correct types
21be941 (bertsky) clip: use final v3 API
9539ac9 (bertsky) clip: use correct types
734b5eb (bertsky) recognize: use final v3 API
28ad585 (bertsky) recognize: fix typing import
9a7c10a (bertsky) denoise: adapt to final v3 API
7c9f39f (bertsky) deskew: adapt to final v3 API
6698668 (bertsky) dewarp: adapt to final v3 API
48a3146 (bertsky) resegment: adapt to final v3 API
0dd6fba (MehmedGIT) ocropy_segment: implement process_page_pcgts
ad5ac7c (MehmedGIT) ocropy_segment: remove process
5d4007b (bertsky) segment: adapt to final v3 API
df1c35c (bertsky) train: adapt to final v3 API
c08b623 (bertsky) ocrd-tool.json: add v3 cardinalities
a18307d (MehmedGIT) fix: ocropy train errors
0ba6839 (MehmedGIT) remove: unused imports
7b4ebc6 (MehmedGIT) Merge branch 'port-to-v3' into port-to-v3-return-object
6b06e88 (MehmedGIT) Update binarize.py
6b19f35 (MehmedGIT) Merge pull request #3 from kba/port-to-v3-return-object
d1a14b7 (MehmedGIT) refactor: python strings v3
d8542c2 (MehmedGIT) spacing: train
d785971 (MehmedGIT) spacing: segment
7ca78a9 (MehmedGIT) spacing: resegment
1004b43 (MehmedGIT) spacing: rest
c5498a0 (MehmedGIT) spacing: dewarp
31e1245 (MehmedGIT) fix: dewarp return
f86c993 (MehmedGIT) improve str speed: precompute element_name_id
b8e3ad6 (MehmedGIT) fix: clip suffix
02724f2 (MehmedGIT) fix: denoise return
aac6fe0 (MehmedGIT) try to fix: ocropy denoise
5548d0e (MehmedGIT) fix: ocropy denoise
c9f0f56 (MehmedGIT) fix: resegment
fff9097 (MehmedGIT) optimize segment
8b92832 (MehmedGIT) optimize ocropy common
fceaffe (MehmedGIT) optimize ocrolib
3de2585 (MehmedGIT) optimize align cli
0949277 (bertsky) align: use final v3 API
d4f8483 (bertsky) use ocrd_utils instead of pkg_resources
ecc44c0 (bertsky) postcorrect: use final v3 API
2b310b4 (MehmedGIT) revert: ocropy.ocrolib changes
4420c6f (MehmedGIT) revert: ocropy.common changes
2d8650e (MehmedGIT) remove whitespaces in ocropy.common and ocropy.ocrolib
9a153b0 (bertsky) postcorrect: adapt to frozendict Processor.parameter in v3
bd0613a (bertsky) require ocrd>=3.0.0b1
f6e437f (MehmedGIT) add: simple github actions workflow
403781a (MehmedGIT) Update .github/workflow/tests.yml
97083bb (MehmedGIT) Update .github/workflow/tests.yml
2b20e0c (MehmedGIT) fix: checkout ref
86a08eb (MehmedGIT) Create GH Actions workflow: test.yml
231edf2 (MehmedGIT) Merge branch 'master' into port-to-v3
1d7e9a0 (MehmedGIT) delete: wrong path for workflows
224e86f (MehmedGIT) fix: NaN error for python3.9+
a397531 (MehmedGIT) fix: NaN in reading_order in morph.py
9cf8305 (bertsky) fix type hints
a0c734d (bertsky) dewarp: make thread-safe
66baaf0 (bertsky) recognize: disallow multithreading (impossible with current lstm impl…
32ce656 (bertsky) postcorrect: make work under METS Server
c4a5999 (bertsky) tests: use METS Server if OCRD_MAX_PARALLEL_PAGES>1
ae7dc67 (bertsky) make test: run serially and parallel, show times
e540b10 (bertsky) require ocrd>=3.0.0b4
99b3489 (bertsky) segment: adapt to numpy deprecation
dee1abf (kba) eval/stats: Levenshtein -> rapidfuzz.distance.Levenshtein
Shouldn't we migrate to `process_page_pcgts` here, too?
Yes, everywhere. For now, only binarize is migrated. I am lagging with the migration because I have no working tests locally: the server from which the test resources are downloaded is no longer available. Hence, I am adapting only the things I understand and am sure are right.
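To illustrate what migrating a processor to `process_page_pcgts` amounts to, here is a minimal sketch of the per-page hook pattern. It does not import ocrd core: `Page`, `PageResult`, `BaseProcessor`, `process_workspace` and `UppercaseRegions` are hypothetical stand-ins invented for this example, and the hook signature shown is an assumption rather than a quotation of the core v3 API. The point is only that the base class owns the loop over pages, so a processor overrides the per-page method instead of a monolithic `process()`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# NOTE: every class here is a hypothetical stand-in written for this sketch;
# nothing below is imported from ocrd core.
@dataclass
class Page:
    page_id: str
    regions: List[str] = field(default_factory=list)


@dataclass
class PageResult:
    page: Page


class BaseProcessor:
    """Stand-in base class that owns the iteration over pages."""

    def process_workspace(self, pages: List[Page]) -> List[PageResult]:
        # The loop lives here once, not in every processor.
        return [self.process_page_pcgts(page, page_id=page.page_id) for page in pages]

    def process_page_pcgts(self, *input_pcgts: Optional[Page],
                           page_id: Optional[str] = None) -> PageResult:
        raise NotImplementedError


class UppercaseRegions(BaseProcessor):
    """Toy processor: only the per-page hook is implemented."""

    def process_page_pcgts(self, *input_pcgts: Optional[Page],
                           page_id: Optional[str] = None) -> PageResult:
        page = input_pcgts[0]
        page.regions = [region.upper() for region in page.regions]
        return PageResult(page)
```

With this shape, dropping a processor's own `process()` method leaves the page loop, and anything built on top of it, to the base class.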
Oops! Strange that GitHub pulls the older release of our GT. Anyway, with OCR-D/gt_structure_text#2 out of the way, it should now suffice to change the base URL in https://github.com/MehmedGIT/ocrd_cis/blob/156d79fc051abeecf001cd6973e71c18efc659dd/tests/test_lib.bash#L8 to https://github.com/OCR-D/gt_structure_text/releases/tag/v1.5.0/
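When switching the download location for test data, it can help to verify that the fetched file really is a zip archive and not, for instance, an error page saved under a `.zip` name. The following is a minimal sketch using only the Python standard library; the helper name `looks_like_valid_zip` and the `min_bytes` threshold are assumptions made for this example and are not part of `test_lib.bash`.

```python
import zipfile
from pathlib import Path


def looks_like_valid_zip(path, min_bytes=1024):
    """Return True if `path` exists, is not suspiciously small,
    and is a readable zip archive whose members pass their CRC check."""
    candidate = Path(path)
    # A missing or tiny file usually means the download did not succeed.
    if not candidate.is_file() or candidate.stat().st_size < min_bytes:
        return False
    # Checks for the zip magic number / end-of-central-directory record.
    if not zipfile.is_zipfile(candidate):
        return False
    # testzip() returns the name of the first corrupt member, or None.
    with zipfile.ZipFile(candidate) as archive:
        return archive.testzip() is None
```

Running such a check right after the download step gives an explicit failure at the point where the data is fetched, instead of a less obvious error later in the test run.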
And again, thanks for being so thorough!
That still did not help.
To get more detailed errors, I did:
It seems the download fails for some reason, since the downloaded zip is much smaller than it should be. I manually downloaded the zip and placed it where it is expected. Hooray, the tests started passing, but then:
Yes, I have already done that. There was another missing import: Levenshtein. Should I add `'python-Levenshtein>=0.25.1'` to setup.py, or was that library supposed to come from core? After installing it manually, I am stuck on the same error again.
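One way to make a missing edit-distance dependency less disruptive is a guarded import with a fallback. The sketch below tries `rapidfuzz.distance.Levenshtein` (the module named in the last commit of this PR) and otherwise falls back to a small pure-Python implementation. The wrapper name `edit_distance` is an assumption for this example, and this is not how ocrd_cis itself handles the import.

```python
try:
    # Preferred: the compiled implementation from rapidfuzz, if installed.
    from rapidfuzz.distance import Levenshtein

    def edit_distance(a: str, b: str) -> int:
        return Levenshtein.distance(a, b)

except ImportError:
    # Fallback: textbook dynamic-programming Levenshtein distance,
    # keeping only the previous row of the table in memory.
    def edit_distance(a: str, b: str) -> int:
        previous = list(range(len(b) + 1))
        for i, char_a in enumerate(a, start=1):
            current = [i]
            for j, char_b in enumerate(b, start=1):
                current.append(min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                ))
            previous = current
        return previous[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Declaring the dependency in setup.py remains the cleaner fix; a fallback like this mainly keeps tests runnable in environments where the package is absent.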
It is strange, because all the other tests pass normally although they invoke the same method and would be expected to fail as well. o.0
I also noticed the missing Levenshtein, and also a broken `get_ocrd_tool`. I'll send a PR for that after finishing bertsky/core#8, and I'll try to reproduce then.
Just include the fix for the broken `get_ocrd_tool` in the PR. The missing Levenshtein was already fixed by Robert; only the align import was not replaced.
Oops! Indeed, I forgot to update the non-ocropy processors in that regard.