md5 calculation does not match #35

Open
jeremysimmons opened this issue Feb 16, 2015 · 4 comments

Comments

@jeremysimmons

What is the MD5 implementation you're using to hash the href field from a post?
I cannot understand why my hashes are not matching your data.

                            Mine    Yours
766efaf951a21e30f5a6128256441879 == 766efaf951a21e30f5a6128256441879
http://stackoverflow.com/questions/202013/clickonce-and-isolatedstorage
b06c50f709e6003e2480799ba714917b != ed677fde4b8d36f7debfca524164211a
http://www.codemag.com/article/0611041
47b87d2e715ac0fcef2702dbe6bfd331 != 7aab575f1b8d39be0d7de56175538cfc
https://msdn.microsoft.com/en-us/library/bb397895.aspx
61dd26c913a23edf5850806db6597728 == 61dd26c913a23edf5850806db6597728
http://stackoverflow.com/questions/16864617/updating-the-build-definition-for-many-tfs-projects
9e29a21e93a62e4d03a58e52735d3dcd != f46fb6e130ee6bfa1d1f1a7e274879c4
http://www.codeproject.com/Articles/5724/Understanding-NET-Code-Access-Security#Tools

I have cross-checked my computed values against a third-party tool, and it produces the same hashes I do: http://www.miraclesalad.com/webtools/md5.php
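For anyone who wants to repeat the comparison, here is a minimal sketch of the local calculation. It assumes the hash is a plain MD5 over the UTF-8 bytes of the href with no normalization, which is exactly the assumption this issue is questioning. The helper name `md5_hex` is hypothetical; the URLs and API-reported hashes are copied from the table above.

```python
import hashlib


def md5_hex(url: str) -> str:
    """Return the lowercase hex MD5 digest of the URL's UTF-8 bytes.

    Assumption: no trimming, lowercasing, or other normalization is applied.
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# (href, hash reported by the API) pairs taken from the table in this issue.
reported = [
    ("http://stackoverflow.com/questions/202013/clickonce-and-isolatedstorage",
     "766efaf951a21e30f5a6128256441879"),
    ("http://www.codemag.com/article/0611041",
     "ed677fde4b8d36f7debfca524164211a"),
    ("https://msdn.microsoft.com/en-us/library/bb397895.aspx",
     "7aab575f1b8d39be0d7de56175538cfc"),
]

for url, api_hash in reported:
    local_hash = md5_hex(url)
    marker = "==" if local_hash == api_hash else "!="
    print(f"{local_hash} {marker} {api_hash}  {url}")
```

Rows marked `!=` are the ones where the service appears to hash something other than the raw href.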

@jeremysimmons changed the title from "md5 calculation" to "md5 calculation does not match" on Feb 16, 2015
@peoplemerge
Contributor

It's just the ordinary kind of md5... Which library / implementation? I'm not sure. But the miraclesalad link you sent gives the same result as ours for this example:
http://inlightapp.com/webArticle/article/53ca276ee4b0356b396e72ed
which miraclesalad calculates as: dd354c67d0b69c3c3ad6175c53f74bf2
And our link: https://delicious.com/link/dd354c67d0b69c3c3ad6175c53f74bf2

However it's not always that simple. If there are anchors (#placeholder) we strip them all off, so we don't end up storing unlimited variations of the same page forever. That does break some single page apps, so we're rethinking our strategy. There might be some other cases than anchors that we strip off as well.
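As a rough illustration of the anchor rule described above, the sketch below drops the `#fragment` before hashing. This covers only the one rule mentioned in this comment; any other stripping the service performs is unknown and not modeled here. The function names `strip_fragment` and `md5_without_fragment` are hypothetical, and nothing here has been checked against the API's output.

```python
import hashlib
from urllib.parse import urldefrag


def strip_fragment(url: str) -> str:
    """Return the URL with any trailing '#fragment' removed."""
    return urldefrag(url).url


def md5_without_fragment(url: str) -> str:
    """Hash the URL after dropping its fragment.

    Assumption: plain MD5 over UTF-8 bytes, with fragment removal as the
    only normalization step.
    """
    return hashlib.md5(strip_fragment(url).encode("utf-8")).hexdigest()


# The one URL in the issue that carries an anchor.
with_anchor = ("http://www.codeproject.com/Articles/5724/"
               "Understanding-NET-Code-Access-Security#Tools")
print(strip_fragment(with_anchor))
print(md5_without_fragment(with_anchor))
```

Under this rule, two hrefs that differ only in their anchor hash to the same value, which is the deduplication effect described above.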

Is that explanation good enough?

@jeremysimmons
Author

Hi Dave -
Thanks for the quick reply. Your explanation doesn't tell me how the API calculates the hash. If there are special rules for which part of the initial URL is used, I would like to know them.

Neither my hash nor the one I got from the Delicious API works with the link format you sent (https://delicious.com/link/:md5) for the URL http://www.codemag.com/article/0611041.

Most of the https://delicious.com/link/:md5 links just show the message "Delicious is processing this link, please try again later."

@peoplemerge
Contributor

That is correct. I don't think it's the md5 implementation, more likely some business logic on our side. I'm not sure what that could entail, but I suspect we have matched a URL to another version of the page. I'll try to check it out.

@jeremysimmons
Author

Hi Dave -
Any news from the inside on how the URL hash is being calculated?
I'd very much like to be able to compute the hash locally.
The api /v1/posts/get?hashes={md5} is another reason.
Thank you
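
To make the two uses of the hash concrete, here is a small sketch that builds the link-page URL and the `posts/get` query path from a locally computed value. The URL shapes are the ones quoted in this thread; the helper names are hypothetical, and the API host is left out because it is not stated here. The 32-hex-digit check is my own addition, not something the API is known to enforce.

```python
import re

LINK_BASE = "https://delicious.com/link/"   # format quoted in this thread
POSTS_GET_PATH = "/v1/posts/get"            # path quoted in this thread; host not given

_MD5_HEX = re.compile(r"[0-9a-f]{32}")


def require_md5_hex(value: str) -> str:
    """Return value unchanged if it is 32 lowercase hex digits, else raise."""
    if _MD5_HEX.fullmatch(value) is None:
        raise ValueError(f"not a 32-character hex MD5 digest: {value!r}")
    return value


def link_page_url(md5_hash: str) -> str:
    """Build the public link-page URL for a hash."""
    return LINK_BASE + require_md5_hex(md5_hash)


def posts_get_path(md5_hash: str) -> str:
    """Build the path and query string for a single-hash posts/get call."""
    return f"{POSTS_GET_PATH}?hashes={require_md5_hex(md5_hash)}"


print(link_page_url("dd354c67d0b69c3c3ad6175c53f74bf2"))
print(posts_get_path("dd354c67d0b69c3c3ad6175c53f74bf2"))
```

Both calls only return useful results if the local hash matches the one the service stores, which is why the normalization rules matter.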
