Training Docker Image and Language Model Creation #3153

Closed
gokgozf opened this issue Jul 14, 2020 · 8 comments


gokgozf commented Jul 14, 2020

I really like the idea of separating the training and build images, though I have one concern.

In my opinion, being able to easily generate and test a new language model is a great use case for the training image. I believe the image currently neglects it, because it lacks the following statements and some dependency installations.

If this is aligned with your expectations as well, I can provide a quick pull request for it.

# Allow Python printing utf-8
ENV PYTHONIOENCODING UTF-8

# Build KenLM in /DeepSpeech/native_client/kenlm folder
WORKDIR /DeepSpeech/native_client
RUN rm -rf kenlm && \
	git clone https://github.com/kpu/kenlm && \
	cd kenlm && \
	git checkout 87e85e66c99ceff1fab2500a7c60c01da7315eec && \
	mkdir -p build && \
	cd build && \
	cmake .. && \
	make -j $(nproc)
Contributor

DanBmh commented Jul 14, 2020

What version are you using?
I already added the KenLM building part some time ago to the Dockerfile.train.tmpl file.

Collaborator

lissyx commented Jul 15, 2020

by not adding the following statements and some dependency installations

As @DanBmh said, we have that now. Can you please be clear in your wording? I'm not a big fan of mind reading, so "some dependency" is not really helpful.

Collaborator

lissyx commented Jul 21, 2020

So @gokgozf, can you elaborate explicitly on what is needed? The only code you pasted is already there...

lissyx added the waiting-on-reporter (Waiting on more information from reporter) label Jul 21, 2020
Contributor

DanBmh commented Jul 21, 2020

@lissyx Does scorer packaging still work? I've seen that the py file was replaced by another script with extra installation steps, but I haven't tested it yet.

Collaborator

lissyx commented Jul 21, 2020

@lissyx Does scorer packaging still work?

In the Dockerfile? It's possible we don't take care of that yet.

Collaborator

lissyx commented Jul 27, 2020

@gokgozf, could you please elaborate on what you are missing?

@kdavis-mozilla
Contributor

@gokgozf In order to help us help you, could you elaborate on what's missing?

Collaborator

lissyx commented Sep 9, 2020

With no further information or feedback, I'm closing this bug. Please reopen it or send a PR if you need to.

lissyx closed this as completed Sep 9, 2020
lissyx removed the waiting-on-reporter (Waiting on more information from reporter) label Sep 9, 2020