Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for MRPC dataset with unit tests #1712

Merged
merged 10 commits into from
May 18, 2022
Merged

Conversation

vcm2114
Copy link
Contributor

@vcm2114 vcm2114 commented May 11, 2022

Summary

  • Added support for MRPC dataset
  • Added mocked unit tests for MRPC dataset

Test

  • pytest test/datasets/test_mrpc.py

Context

See #1710

Copy link
Contributor

@parmeet parmeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please fix lint issues before merging :)

@Nayef211
Copy link
Contributor

@ejguan @NivekT it seems like some of the CI builds are failing at the step where we try to install torchdata from source (i.e. here). Do you know if there's been any recent changes that would cause this?

@ejguan
Copy link
Contributor

ejguan commented May 16, 2022

@ejguan @NivekT it seems like some of the CI builds are failing at the step where we try to install torchdata from source (i.e. here). Do you know if there's been any recent changes that would cause this?

This seems weird to me. TorchData is required PyTorch to run some codegen. Based on the script (https://github.com/pytorch/text/blob/main/.circleci/unittest/linux/scripts/install.sh), it seems PyTorch is installed before TorchData. Not sure why this happens.
Please correct me if I am wrong. It seems the CI shouldn't use source to install TorchData

printf "Installing torchdata nightly\n"
pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu
.

@NivekT
Copy link
Contributor

NivekT commented May 16, 2022

@ejguan @NivekT it seems like some of the CI builds are failing at the step where we try to install torchdata from source (i.e. here). Do you know if there's been any recent changes that would cause this?

This seems weird to me. TorchData is required PyTorch to run some codegen. Based on the script (https://github.com/pytorch/text/blob/main/.circleci/unittest/linux/scripts/install.sh), it seems PyTorch is installed before TorchData. Not sure why this happens. Please correct me if I am wrong. It seems the CI shouldn't use source to install TorchData

printf "Installing torchdata nightly\n"
pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Maybe switching to install nightly with conda will help?

@Nayef211
Copy link
Contributor

Nayef211 commented May 16, 2022

Please correct me if I am wrong. It seems the CI shouldn't use source to install TorchData

@ejguan my mistake I guess we are relying on torchdata nightlies from pypi in our unit tests setup and not installing directly from source.

Maybe switching to install nightly with conda will help?

@NivekT do you know what the correct command would be to install the nightly from conda?

printf "Installing torchdata nightly\n"
pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu

@Nayef211
Copy link
Contributor

@NivekT do you know what the correct command would be to install the nightly from conda?

Nvm I just found the following link https://anaconda.org/pytorch-nightly/torchdata. I assume it would be as follows

conda install -c pytorch-nightly torchdata

@ejguan
Copy link
Contributor

ejguan commented May 17, 2022

I think this command should be fine:
pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu
I just tested it on my local machine.

Copy link
Contributor

@Nayef211 Nayef211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Delete print lines before merging

test/datasets/test_mrpc.py Outdated Show resolved Hide resolved
test/datasets/test_mrpc.py Outdated Show resolved Hide resolved
test/datasets/test_mrpc.py Outdated Show resolved Hide resolved
@vcm2114 vcm2114 merged commit bb41e4f into pytorch:main May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants