some questions about code #5

Wongcheukwai opened this issue Jul 11, 2020 · 3 comments

Wongcheukwai commented Jul 11, 2020

Hi Sunil,

I really love your work. It's novel, relatively simple, and effective. After running your code, I have a few questions:

  1. I can't figure out why the abstained count before learn_epochs is always zero. The output tensor on line 419 of train_dac.py has shape [128, 11] during the learning epochs, so how come the argmax is never 10? After the learning epochs (20), the argmax can be 10. That's really interesting.

  2. I saw your previous discussion with @pingqingsheng regarding how to remove all the abstained data. Can you tell me how to get their indices and remove them? Here is the script I used:
    python train_dac.py --datadir ../dataset --dataset cifar10 --nesterov --net_type resnet --depth 34 -use-gpu --epochs 165 --loss_fn dac_loss --learn_epochs 20 --rand_labels 0.2 -cuda_device 0 --abst_rate 0.2 --save_train_scores
    After that, I got 165 .npy files. Which epoch's train_score should I use? And how should I deal with this [50000, 11] tensor? Max?

  3. How did you get the parenthetical numbers (especially the remaining noise level) in Table 1? For example, for CIFAR-10 with 80% symmetric noise, you report a remaining noise level of just 0.16, but after using my step 2 to remove the noisy data, the accuracy of the remaining (supposedly clean) labels is only 0.28, which makes the remaining noise level really high. I am really confused by that. Is there something wrong with my step 2 (how I remove the noisy data)?

  4. Do you know of any other novel approaches for distinguishing noisy from clean data? I have tried many methods and found a GMM works almost the best.

thulas (Owner) commented Jul 16, 2020

> 1. I can't figure out why the abstained count before learn_epochs is always zero. The output tensor on line 419 of train_dac.py has shape [128, 11] during the learning epochs, so how come the argmax is never 10? After the learning epochs (20), the argmax can be 10.

This is correct behavior. learn_epochs is a warm-up phase during which we train with regular cross-entropy; the abstention loss only kicks in after that, i.e. for all epochs > learn_epochs. See:

```python
if epoch <= self.learn_epochs or not self.model.training:
    #pdb.set_trace()
    loss = F.cross_entropy(input_batch, target_batch, reduction='none')
```

Since the ground-truth labels never include the abstention class, cross-entropy during the warm-up phase pushes probability mass onto the 10 real classes and drives the 11th (abstention) logit down, so the argmax is essentially never the abstention class.

Also, learn_epochs is a hyperparameter, which we set to 20 in all our experiments.
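
To see this concretely, here is a toy, self-contained sketch (not the repo's code; the sizes and names are made up) of the warm-up dynamics: training 11 outputs with cross-entropy over labels 0-9 suppresses the 11th logit, so it essentially never wins the argmax:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy setup: 11 outputs (10 real classes + 1 abstention), labels only in 0..9.
model = torch.nn.Linear(20, 11)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(512, 20)
y = torch.randint(0, 10, (512,))

# Warm-up phase: plain cross-entropy, as during learn_epochs.
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(model(x), y).backward()
    opt.step()

preds = model(x).argmax(dim=1)
print("abstained:", (preds == 10).sum().item())  # expected: 0
```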

> 2. Can you tell me how to get the indices of the abstained data and remove them? After running with --save_train_scores I got 165 .npy files. Which epoch's train_score should I use? And how should I deal with this [50000, 11] tensor? Max?

You should eliminate all the data points for which the argmax was 10 (i.e. the abstention class) at a selected epoch. Epoch 165, i.e. the final epoch, might itself be a good choice, but it's difficult to say what the best epoch is, since this depends, among other things, on the actual noise rate, which is not known in advance.
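
Roughly, the filtering step looks like this (a sketch, not code from the repo; the .npy file name is hypothetical, and we assume the saved scores are a [50000, 11] array of per-example outputs):

```python
import numpy as np

# Hypothetical file name: use whichever epoch's train_score file you pick.
scores = np.load("train_scores_epoch165.npy")   # shape [50000, 11]

preds = scores.argmax(axis=1)
keep_idx = np.where(preds != 10)[0]   # examples the DAC did NOT abstain on
drop_idx = np.where(preds == 10)[0]   # abstained (presumed noisy) examples
print(f"keeping {len(keep_idx)}, dropping {len(drop_idx)}")
```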

> 3. How did you get the parenthetical numbers (especially the remaining noise level) in Table 1? For CIFAR-10 with 80% symmetric noise you report a remaining noise level of just 0.16, but after my step 2 the accuracy of the remaining (supposedly clean) data is only 0.28.

See above: remove the abstained points, then retrain (with regular cross-entropy) on the cleaner set. But also see the discussion here about a few important details: #1 (comment)
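
Continuing the sketch above, retraining on the filtered set could look like this (an assumption-laden sketch, not the repo's pipeline; keep_idx comes from the previous snippet, and you must re-apply the same corrupted labels you trained the DAC on, since torchvision ships the true labels):

```python
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

train_set = datasets.CIFAR10("../dataset", train=True,
                             transform=transforms.ToTensor())
# NB: overwrite train_set.targets with the same noisy labels used in the DAC run
# before filtering, otherwise you are "cleaning" the already-clean labels.
clean_set = Subset(train_set, keep_idx.tolist())
clean_loader = DataLoader(clean_set, batch_size=128, shuffle=True)
# ...then train a fresh network on clean_loader with plain F.cross_entropy.
```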

> 4. Do you know of any other novel approaches for distinguishing noisy from clean data? I have tried many methods and found a GMM works almost the best.

GMMs and other mixture models (like the Beta mixture) work well when the noise model is symmetric, but real-world noise is seldom symmetric, and is usually feature-dependent. In that setting we find the DAC performs especially well. See Section 3 in our ICML paper for details.
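
For reference, the usual GMM recipe (not part of this repo, and the losses below are synthetic stand-ins) fits a two-component mixture to per-example training losses and treats the low-loss component as clean:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for per-example CE losses: clean = low-loss, noisy = high-loss.
per_sample_loss = np.concatenate([rng.gamma(2.0, 0.1, 40000),   # "clean"
                                  rng.gamma(6.0, 0.5, 10000)])  # "noisy"

losses = per_sample_loss.reshape(-1, 1)
gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
clean_comp = gmm.means_.argmin()                 # low-mean component = "clean"
p_clean = gmm.predict_proba(losses)[:, clean_comp]
clean_idx = np.where(p_clean > 0.5)[0]
print("flagged clean:", len(clean_idx))
```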

Wongcheukwai (Author) commented
Thank you for your detailed reply. Can you tell me how to run asymmetric CIFAR-10? Which args should I set?

thulas (Owner) commented Jul 21, 2020

Use the label_flip.py script inside the utils directory to generate a class-dependent label corruption. See Appendix C in our ICML paper (https://arxiv.org/pdf/1905.10964.pdf) for additional details.
Once you generate the flipped labels, re-run the experiment using the --label_noise_info argument, as in:
python train_dac.py --label_noise_info <flipped_label_pickle_file> [other arguments as before]
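
If it helps to see the shape of such a corruption, here is an illustrative sketch (this is not label_flip.py; the transition map, flip rate, and pickle layout are invented, so check the script for the format --label_noise_info actually expects):

```python
import pickle
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=50000)   # stand-in for the true CIFAR-10 labels

# Invented class-dependent map, e.g. truck->automobile, cat<->dog, deer->horse.
flip_map = {9: 1, 3: 5, 5: 3, 4: 7}
flip_rate = 0.4

noisy = labels.copy()
for src, dst in flip_map.items():
    idx = np.where(labels == src)[0]
    flip = idx[rng.random(len(idx)) < flip_rate]
    noisy[flip] = dst

with open("flipped_labels.pkl", "wb") as f:
    pickle.dump(noisy, f)
```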
