How to plot the Hessian max eigenvalue spectra? #12
Hi @Dong1P,

Thank you for your support. I did not release the code for the Hessian eigenvalue spectra visualization (e.g., Fig. 1c and Fig. 4) yet. Instead, I provide some useful information below.

Hessian Max Eigenvalue Spectrum: My implementation uses PyHessian (https://github.com/amirgholami/PyHessian), and the pseudo-code below is extremely simple. Source: Appendix A3 in "Blurs Behave Like Ensembles: Spatial Smoothings to Improve Accuracy, Uncertainty, and Robustness" (ICML 2022). It calculates and gathers the top-k (e.g., top-5) Hessian eigenvalues mini-batch-wise by using power iteration.

```python
from pyhessian import hessian
from tqdm import tqdm

max_eigens = []  # a list of batch-wise top-k Hessian max eigenvalues
model = model.cuda()
for xs, ys in tqdm(dataset_train):
    # measure Hessian max eigenvalues with NLL + L2 on data-augmented (`transform`) datasets
    hessian_comp = hessian(model, data=(xs, ys), transform=transform, weight_decay=weight_decay, cuda=True)
    # collect top-5 Hessian eigenvalues by using power iteration (https://en.wikipedia.org/wiki/Power_iteration)
    top_eigenvalues, top_eigenvector = hessian_comp.eigenvalues(top_n=5)
    max_eigens = max_eigens + top_eigenvalues  # aggregate top-5 max eigenvalues
```

PyHessian does not support … Visualization: Hessian spectra (a list of real values, i.e., `max_eigens` above) …
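As a minimal plotting sketch (not the released visualization code; it assumes seaborn and matplotlib, which are my choice here rather than part of the original implementation), the collected `max_eigens` list can be drawn as a density plot:

```python
# Minimal sketch (assumption: seaborn/matplotlib; not the released code).
# It draws the spectrum of the batch-wise top-k Hessian max eigenvalues.
import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(max_eigens, bins=50, kde=True, stat="density")
plt.xlabel("Hessian max eigenvalue")
plt.ylabel("Density")
plt.savefig("hessian_max_eigenvalue_spectrum.png", bbox_inches="tight")
```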
Thanks for your great work; I have learned a lot from it. @xxxnell
Hi @yukimmmmiao, thank you for the kind words. I assumed that the largest Hessian values have a dominant influence on optimization (Ghorbani et al. (ICML 2019); see also Liu et al. (NeurIPS 2020)). I agree that the smallest Hessian eigenvalues also play an important role in optimization. To be clear, the algorithm produces the eigenvalues that are greatest in absolute value, so the Hessian spectrum contains not only the largest eigenvalues but also the smallest (negative) eigenvalues. However, this algorithm neglects near-zero Hessian values, and I would like to leave a detailed analysis of near-zero Hessian values for future work. In my code, the NN weights are fixed values. The Hessian values were measured by using saved checkpoints in separate jobs, not during the optimization tasks, for simplicity.
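As a minimal sketch of that checkpoint-based setup (assumptions: `build_model()`, the checkpoint path, and `dataloader_train` are hypothetical placeholders, and the call below uses stock PyHessian's `hessian(model, criterion, data=...)` signature rather than the modified version in the pseudo-code above):

```python
import torch
from pyhessian import hessian

# Hypothetical placeholders: build_model(), "checkpoint.pth", dataloader_train.
model = build_model()
model.load_state_dict(torch.load("checkpoint.pth", map_location="cpu"))  # fixed, pre-trained weights
model = model.cuda().eval()  # no optimization here; only Hessian measurement

criterion = torch.nn.CrossEntropyLoss()  # NLL loss for classification
max_eigens = []
for xs, ys in dataloader_train:
    hessian_comp = hessian(model, criterion, data=(xs, ys), cuda=True)
    # power iteration returns the eigenvalues largest in absolute value, so large
    # negative eigenvalues are included, but near-zero eigenvalues are not covered
    top_eigenvalues, _ = hessian_comp.eigenvalues(top_n=5)
    max_eigens += top_eigenvalues
```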
Hi @xxxnell, do you have any tips on what arguments to use for the …
Hi @dgcnz, thank you for reaching out. The occurrence of negative Hessian eigenvalues largely depends on the dataset and model configuration. I suspect you're working with smaller datasets, e.g., CIFAR, with data augmentations, and using a small model, e.g., a Ti-sized model.
Thanks for your answer, @xxxnell 😄. We're currently testing on Rotational MNIST, which, as far as I understand, may be too small/easy to consistently find negative eigenvalues? Also, the datasets you tested for obtaining negative Hessian eigenvalues were 10% of CIFAR and ImageNet, right? Did you by any chance test on a smaller dataset? For context, we're comparing a CNN with a rotationally equivariant CNN, and we were hoping to find a pattern similar to the one your work found for ViT vs ResNet.
Unfortunately, I haven't tested on datasets smaller/easier than CIFAR. The conf… Please feel free to reach out via email ([email protected] or [email protected]) if you'd like to provide more detailed information about your settings. I'd be happy to discuss at some point.
I read your paper and learned a lot from it.
I would also like to see the code for plotting the Hessian max eigenvalue spectra.
May I know if you have any plans to update it?
Best,