
thrift.py3.exceptions.TransportError #92

Open
eyuansu62 opened this issue May 9, 2022 · 11 comments
Labels
question Further information is requested

Comments

@eyuansu62

Hi, when I run the `make eval` command, I get the following error.
Do you have any idea how to fix it?

File "/app/t5-unc/utils/picard_model_wrapper.py", line 200, in with_picard
asyncio.run(_init_picard(), debug=False)
File "/opt/conda/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
return future.result()
File "/app/t5-unc/utils/picard_model_wrapper.py", line 127, in _init_picard
await _register_schema(db_id=db_id, db_info=db_info, picard_client=client)
File "/app/t5-unc/utils/picard_model_wrapper.py", line 133, in _register_schema
await picard_client.registerSQLSchema(db_id, sql_schema)
thrift.py3.exceptions.TransportError: (<TransportErrorType.UNKNOWN: 0>, 'Channel is !good()', 0, <TransportOptions.0: 0>)

@tscholak
Collaborator

Hi, that means that the picard backend is either not running (yet) or unresponsive. Does this happen reproducibly?
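A quick way to check whether the backend is reachable at all is a plain TCP probe (a minimal sketch; the host and port below are assumptions, so substitute whatever address you configure for the picard client):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the picard backend; 127.0.0.1:9090 is an assumption,
# use the host/port your model wrapper actually connects to.
print(is_port_open("127.0.0.1", 9090))
```

If this prints `False` while the model wrapper is trying to connect, the "Channel is !good()" error is expected: the client is talking to a socket nothing is listening on yet.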

@david-seekai

Ran into the same issue. I wasn't using the Makefile but running it with

docker run \
    -it \
    --rm \
    --user 13011:13011 \
    -p 8000:8000 \
    --mount type=bind,source=/Users/david/database,target=/database \
    --mount type=bind,source=/Users/david/transformers_cache,target=/transformers_cache \
    --mount type=bind,source=/Users/david/configs,target=/app/configs \
    tscholak/text-to-sql-eval:35f43caadadde292f84e83962fbe5320a65d338f \
    /bin/bash -c "python seq2seq/serve_seq2seq.py configs/serve.json"

trace

Traceback (most recent call last):
File "seq2seq/serve_seq2seq.py", line 151, in <module>
main()
File "seq2seq/serve_seq2seq.py", line 97, in main
model = model_cls_wrapper(AutoModelForSeq2SeqLM).from_pretrained(
File "seq2seq/serve_seq2seq.py", line 91, in <lambda>
model_cls=model_cls, picard_args=picard_args, tokenizer=tokenizer
File "/app/seq2seq/utils/picard_model_wrapper.py", line 199, in with_picard
asyncio.run(_init_picard(), debug=False)
File "/opt/conda/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
return future.result()
File "/app/seq2seq/utils/picard_model_wrapper.py", line 127, in _init_picard
await _register_tokenizer(picard_client=client)
File "/app/seq2seq/utils/picard_model_wrapper.py", line 145, in _register_tokenizer
await picard_client.registerTokenizer(json_str)
thrift.py3.exceptions.TransportError: (<TransportErrorType.UNKNOWN: 0>, 'Channel is !good()', 0, <TransportOptions.0: 0>)

@eyuansu62
Author

It happens when I use the UnifiedSKG code to train a T5-large model, after copying picard_model_wrapper into the UnifiedSKG codebase.
As far as I can tell, the UnifiedSKG code is the same as the picard code,
so I don't understand why this happens.

@tscholak
Collaborator

Hi, author here. It is not enough to just copy the picard model wrapper: the wrapper is only one small piece of the picard parsing approach, and there are library components and a picard executable as well. In most cases I recommend using the eval image with your model. That is, rather than taking picard and putting it somewhere else, bring what you have (e.g. a checkpoint) to this codebase and its Docker images. For most people, that approach is quicker and easier.

@tscholak
Collaborator

@david-seekai let me have a look.

@david-seekai

Great, thanks. I'm using the GKE Container-Optimized OS.

https://cloud.google.com/container-optimized-os/docs/concepts/features-and-benefits

Let me know if I can help in any way or if any other info is needed.

@eyuansu62
Author

eyuansu62 commented May 13, 2022

@tscholak yep, I do use the picard eval image as the Docker environment.
I did try to bring the checkpoint to this codebase, but I ran into a problem:
the same checkpoint gets different scores in the two codebases, for example 68.5 EM in UnifiedSKG but 66.2 EM in this codebase without picard mode.
I modified the code to make the inputs identical between the two codebases.
Have you ever seen this problem?

@tscholak
Collaborator

tscholak commented May 13, 2022

I recommend comparing the generated outputs. If they are the same, then the issue is in the evaluation; if they are not, then something differs in how they are generated.
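That comparison can be mechanized with a tiny helper (a sketch only; it assumes you dumped each codebase's predictions as one SQL string per line, and the file names in the usage comment are hypothetical):

```python
def first_mismatch(preds_a, preds_b):
    """Return (index, a, b) for the first differing prediction pair,
    or None if both sequences agree (ignoring surrounding whitespace)."""
    for i, (a, b) in enumerate(zip(preds_a, preds_b)):
        if a.strip() != b.strip():
            return i, a, b
    if len(preds_a) != len(preds_b):
        # One file has extra predictions; report where they diverge.
        return min(len(preds_a), len(preds_b)), None, None
    return None

# Usage (file names are assumptions about how you saved the outputs):
# with open("unifiedskg_preds.txt") as f_a, open("picard_preds.txt") as f_b:
#     print(first_mismatch(f_a.readlines(), f_b.readlines()))
```

Inspecting the first differing pair usually points directly at a tokenization or decoding difference between the two codebases.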

@eyuansu62
Author

Yep, I compared the generated outputs, and they are different.
I kept the same package versions, the same input format, and the same functions such as generate and evaluate,
but I still don't know why this happens. It is really confusing.

@tscholak tscholak added the question Further information is requested label Jul 31, 2022
@abharga2

Hey! I'm reproducibly getting the same error using `make prediction_output`.

Any ideas?

@Tomcatiiii

Hello, I fixed this problem by changing `time.sleep(1)` to `time.sleep(10)` on line 95 of seq2seq/utils/picard_model_wrapper.py. I guess the child process had not yet started successfully when the main process connected, which is why this error is reported.
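A bounded retry loop is a more robust variant of the same idea than a longer fixed sleep (a sketch only, not the wrapper's actual code; `ready` stands in for whatever connectivity check you use, e.g. a port probe or a trial RPC):

```python
import time

def wait_until(ready, retries: int = 20, delay: float = 0.5) -> bool:
    """Poll ready() until it returns True or the retry budget is spent.
    Returns the final readiness state."""
    for _ in range(retries):
        if ready():
            return True
        time.sleep(delay)
    return ready()
```

With something like `wait_until(lambda: is_port_open("127.0.0.1", 9090))` before the first thrift call, slow backend start-ups no longer race the client, and a genuinely dead backend still fails fast instead of hanging.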
