-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update, add workflow-configuration, deps-ubuntu and docker #6
Conversation
@kba, should we move the repository to OCR-D? I suggest to keep the name then because I think that just |
* `==5.4.1` (required by ocrd_typegroups_classifier) | ||
* `>=6.2.0` (required by all others) | ||
- Tensorflow: | ||
* `tensorflow-gpu==1.14.0` (required by ocrd_calamari and OCR-D-LAYoutERkennung) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjenckel, would tensorflow-gpu
2.0.0 also work or is a code update to support that possible for OCR-D-LAYoutERkennung
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this instance, it will be easy for me to hold ocrd_keraslm
and cor-asv-ann
below 2.0. But the problem will likely re-surface when the first module project wants to go 2.0. (Calamari is already, so ocrd_calamari might follow soon.) The problem then is not only changes in the API (that are rather easy to adapt to) but also that models must be retrained.
BTW, as I explained in #3, there is no tensorflow-gpu
for 2.0 (or even 1.15) anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this instance, it will be easy for me to hold
ocrd_keraslm
andcor-asv-ann
below 2.0.
Done.
@stweil, regarding the pip inconsistencies, as you can see at the bottom of the Dockerfile, I alreaddy added a workaround there (that can be updated as long as no actual incompatibilities are involved). Do you want me to move that to the Makefile so it is available for the local installation as well? |
Done. @stweil Can you merge? |
This does not work in containers, and is not good practise otherwise. (All other modules don't use it.) Plus: we delegate here now in stweil/ocrd_all#6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find the updated revision for ocrd_im6convert upstream.
It would be much easier for me to review and merge changes presented in small pull requests which address a single feature and only contain one or two commits.
@bertsky, I started with cherry-picking changes from this pull request, so you'll have to rebase your branch. |
Because it is not merged yet.
I see. But then those PRs would heavily depend on each other. And some things just belong in one PR. It's been quite difficult making this docker thing work. And now the automatic dependencies for local installation.
Please don't do this to me. |
@stweil When you take this PR apart, are you sure you want to test every single change here with the huge long docker build, which for performance reasons is non-incremental in the commited version? It was already extremely time-consuming having this repeated layer by layer when I made changes. And testing is a must here, you cannot rely on anything... |
Yeah, maybe keep the submodule updates out of this PR. |
I don't understand you guys. Is no-one seeing the connection? Almost all those updates were necessary to get the new functionality (automatic system dependencies, Docker integration, module filtering, workarounds for conflicts) working. In many cases, I conjured up the module updates just for this PR! With this attitude (keep it small, split it up and rebase), you are making it unnecessarily hard (and frustrating) to bring in solutions. It was already costly to fight through this (waiting for docker build to fail over and over again, wasting tons of network bandwidth, making all this work automatically and maintainably). I don't want go over all those steps and test them again for no good reason. You can easily |
@bertsky, I now get an ocrd/all docker of 5.02 GB when I run |
Well that's already a (partial) success. (It should be around 7 GB though. Maybe intermediate steps have failed? I recently added Next steps would be to run some workflow and validate. I am afraid there are no smoke tests for that in core or assets. But you can use docker run -it -u $(id -u) -v /path/to/workspaces:/data ocrd/all ocrd-make -f gt-binarize-page-olena-sauvola-deskew-page-ocropy-clip-deskew-region-tesseract-resegment-dewarp-ocr-ocropy-tesseract.mk |
@stweil Note, there seems to be connectivity problems with GH these days – you might have to repeat steps. |
- encapsulate `deps-ubuntu` from modules recursively (including those missing) - add a dockerfile which can include (up to) all executables from all modules without conflicts
- move fix-pip from Dockerfile to Makefile - allow filtering executables by modules they depend on (to easily unselect modules in Docker or pip) - use PIP_INSTALL variable instead of PIP_OPTIONS (and everywhere) - Docker build: increase pip network timeout
- install wget along with deps-ubuntu - don't depend on wget where not needed - document fix-pip and full installation dependencies
A new revision from a pull request only exists in the repository from which that pull request originated. Such a revision is unavailable for most people. The rule for a valid submodule revision is simple: the revision must exist in the repository of the submodule (not necessarily in the master branch). If it only exists in your fork, it is not valid. |
These commands might work for your environment:
|
That's simply not true: A PR's commits are automatically added to a new branch |
The updates seem to be ok (but I have not rebuilt and tested the docker images yet). Except for ocrd_im6convert, which definitely needs OCR-D/ocrd_im6convert@1b082f0 (because |
Signed-off-by: Stefan Weil <[email protected]>
I reverted to the needed version, which is now also on ocrd_im6convert:master. |
With even that out of the way, can you finally merge? |
Co-Authored-By: Stefan Weil <[email protected]>
When creating a new venv (because `VIRTUAL_ENV` was _not_ already set in the runtime context of `make`), don't just export `VIRTUAL_ENV`, but force usage of the `activate` script (even for cases of submodules which can hook into existing venvs at install time). Co-Authored-By: Stefan Weil <[email protected]>
Co-Authored-By: Stefan Weil <[email protected]>
@stweil do you want me to update core again here (significantly, we now have processor parameter validation), or are you going to apply that yourself after the PR? |
I have updated |
Co-Authored-By: Stefan Weil <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are nearly finished. One last typo is in the way. Thank you for your patience.
Co-Authored-By: Stefan Weil <[email protected]>
update, add workflow-configuration, deps-ubuntu and docker Signed-off-by: Stefan Weil <[email protected]>
@stweil, this is my attempt at a "fat" docker setup. Having difficulties getting versioning right for my slim docker proposal, I think this is at least a good stopgap. The result can be obtained under dockerhub as
bertsky/ocrd_all
for now (but I hope someone will publish it underocrd/all
soon).@mikegerber @kba, I also included an updated version of bertsky/workflow-configuration. (For those afraid of calling
make
I included a system-wide installation asocrd-make
, simply wrapping the make calls.)