Patch: update crawling to not follow redirects when -disable-redirects is set #630

Merged

@ErikOwen ErikOwen commented Oct 17, 2023

This fixes #610, where HTTP redirects are still followed during a katana crawl even when the -disable-redirects flag is set.

"Standard" crawl:

# without changes introduced in this PR
> echo "http://projectdiscovery.io" | katana -silent -d 2 -disable-redirects
http://projectdiscovery.io
https://projectdiscovery.io/

# with changes introduced in this PR
> echo "http://projectdiscovery.io" | katana -silent -d 2 -disable-redirects
http://projectdiscovery.io

"Headless" crawl:

# without changes introduced in this PR
> echo "http://projectdiscovery.io" | katana -silent -d 2 -disable-redirects -headless
https://projectdiscovery.io/cdn-cgi/scripts/5c5dd728/cloudflare-static/email-decode.min.js
https://projectdiscovery.io/cdn-cgi/scripts/5c5dd728/cloudflare-static/'+e.replace(/
https://projectdiscovery.io/cdn-cgi/l/email-protection
http://projectdiscovery.io
https://projectdiscovery.io/
https://projectdiscovery.io/terms
https://projectdiscovery.io/aboutus
https://projectdiscovery.io/privacy
https://chaos.projectdiscovery.io/
https://projectdiscovery.io/nuclei
https://projectdiscovery.io/community
https://blog.projectdiscovery.io/
https://blog.projectdiscovery.io/stop-pentesting-start-programming/
https://projectdiscovery.io/requestdemo
https://blog.projectdiscovery.io/announcing-nuclei-cloud/
https://blog.projectdiscovery.io/hunting-c2-servers/
https://blog.projectdiscovery.io/the-best-defense-is-a-good-offensive-security-program/
https://projectdiscovery.io/cloudplatform

# with changes introduced in this PR
> echo "http://projectdiscovery.io" | ./katana -silent -d 2 -disable-redirects -headless
http://projectdiscovery.io

@ehsandeep ehsandeep changed the base branch from main to dev October 17, 2023 17:53
@ErikOwen ErikOwen changed the title patch: update 'standard' crawling to not follow redirects when -disable-redirects is set patch: update crawling to not follow redirects when -disable-redirects is set Oct 18, 2023
@ErikOwen ErikOwen changed the title patch: update crawling to not follow redirects when -disable-redirects is set Patch: update crawling to not follow redirects when -disable-redirects is set Oct 18, 2023
@ErikOwen

Can one of the maintainers review this pull request please 🙏?

@Mzack9999

@ErikOwen Apologies for the delay in getting back to you on this PR. The disable-redirects flag was designed mostly with non-headless crawling in mind, since there we retain control over the synchronous HTTP request flow.
The problem with headless mode is that we would block only the classic redirect via the Location header; we wouldn't be able to handle other cases such as JS-induced redirects or HTML meta refresh. If you are looking for a complete blocking solution, I think we need to implement more sophisticated detection mechanisms, as headless Chrome unfortunately doesn't expose a native flag to disable redirects globally. Your intention was to cover only the redirect via Location Header?
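
For readers unfamiliar with how a Location-header redirect can be suppressed in a non-headless Go client, here is a minimal sketch using only the standard library. It is not katana's actual implementation; the helper names (newNoRedirectClient, fetchStatus, newRedirectingServer) and the local test server are assumptions made for illustration. Returning http.ErrUseLastResponse from CheckRedirect makes the client hand back the 3xx response instead of following it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newNoRedirectClient (hypothetical helper) returns a client that hands back
// the 3xx response itself instead of following its Location header.
func newNoRedirectClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// fetchStatus issues a GET and reports the status code and Location header.
func fetchStatus(client *http.Client, url string) (int, string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

// newRedirectingServer starts a local test server where "/" redirects to
// "/landing". It stands in for an HTTP-to-HTTPS style redirect.
func newRedirectingServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/landing", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "landing page")
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/landing", http.StatusMovedPermanently)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newRedirectingServer()
	defer srv.Close()

	status, location, err := fetchStatus(newNoRedirectClient(), srv.URL+"/")
	if err != nil {
		panic(err)
	}
	// The redirect is reported, not followed.
	fmt.Println(status, location)
}
```

This only covers redirects signalled by an HTTP status code and Location header; it does nothing about redirects triggered by JavaScript or by a meta refresh tag in the response body.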

@ErikOwen

@Mzack9999 - thank you for taking the time to look at this PR!

Your intention was to cover only the redirect via Location Header?

No, my intention is to properly prevent following all types of redirects when the -disable-redirects flag is set. I didn't consider the other types of redirects that you mentioned (js-induced redirects or HTML meta refresh redirects). Thank you for bringing those to my attention.

Would it be fair to consider this PR as an incremental step in the right direction, since it properly disables following the most common types of redirect (HTTP status redirects which are often used to redirect HTTP requests to HTTPS), and I'll open up a new issue to track the less common redirects that you mentioned above?
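
As a rough illustration of one of the less common redirect types mentioned above, the sketch below detects an HTML meta refresh target in a response body. It is an assumption-laden example, not part of this PR: metaRefreshTarget is a hypothetical helper, and the regular expression only handles the simple case where http-equiv appears before a quoted content attribute that contains no nested quotes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// metaRefreshRe matches <meta http-equiv="refresh" content="..."> when
// http-equiv comes before content. Other attribute orders are not handled.
var metaRefreshRe = regexp.MustCompile(
	`(?i)<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*content\s*=\s*["']([^"']*)["']`)

// metaRefreshTarget (hypothetical helper) returns the URL a meta refresh tag
// points to, or false if the document has no such tag or the tag has no URL.
func metaRefreshTarget(html string) (string, bool) {
	m := metaRefreshRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	// The content attribute looks like "0; url=https://example.com/".
	parts := strings.SplitN(m[1], ";", 2)
	if len(parts) < 2 {
		return "", false // a bare delay reloads the same page
	}
	target := strings.TrimSpace(parts[1])
	if len(target) >= 4 && strings.EqualFold(target[:4], "url=") {
		target = strings.TrimSpace(target[4:])
	}
	if target == "" {
		return "", false
	}
	return target, true
}

func main() {
	page := `<html><head><meta http-equiv="refresh" content="0; url=https://example.com/next"></head></html>`
	if target, found := metaRefreshTarget(page); found {
		fmt.Println("meta refresh redirect to:", target)
	}
}
```

A crawler that wanted to honor a no-redirect setting could use a check like this to avoid enqueueing the target. JavaScript-induced redirects would need a different mechanism, since they are not visible in static markup.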

@Mzack9999 Mzack9999 added the Type: Bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Oct 31, 2023
@Mzack9999 Mzack9999 self-requested a review October 31, 2023 16:36
@Mzack9999 Mzack9999 merged commit 659a1f8 into projectdiscovery:dev Oct 31, 2023
13 checks passed
@Mzack9999

@ErikOwen I made some small changes and merged the PR, thanks for it! Unfortunately, during my tests I saw that in many cases the redirect URL is still visited indirectly when it is extracted elsewhere during the crawl, which effectively cancels out the redirect skip. Feel free to open a new GitHub issue if blocking the other kinds of redirects would be useful. Thanks!
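
One possible way to address the indirect-visit problem described above is to remember the targets of skipped redirects and drop them when they show up again among extracted links. The sketch below is purely illustrative and is not how katana works; redirectFilter, Block, and Allow are hypothetical names, URLs are compared by their parsed string form only, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"net/url"
)

// redirectFilter (hypothetical) remembers redirect targets that were skipped.
type redirectFilter struct {
	blocked map[string]struct{}
}

func newRedirectFilter() *redirectFilter {
	return &redirectFilter{blocked: make(map[string]struct{})}
}

// Block records the target of a redirect that was not followed. The Location
// value may be relative, so it is resolved against the requesting URL.
func (f *redirectFilter) Block(requestURL, location string) error {
	base, err := url.Parse(requestURL)
	if err != nil {
		return err
	}
	ref, err := url.Parse(location)
	if err != nil {
		return err
	}
	f.blocked[base.ResolveReference(ref).String()] = struct{}{}
	return nil
}

// Allow reports whether an extracted link may be enqueued for crawling.
func (f *redirectFilter) Allow(link string) bool {
	u, err := url.Parse(link)
	if err != nil {
		return false
	}
	_, isBlocked := f.blocked[u.String()]
	return !isBlocked
}

func main() {
	filter := newRedirectFilter()
	if err := filter.Block("http://projectdiscovery.io", "https://projectdiscovery.io/"); err != nil {
		panic(err)
	}
	for _, link := range []string{
		"https://projectdiscovery.io/",
		"https://projectdiscovery.io/terms",
	} {
		fmt.Println(link, "allowed:", filter.Allow(link))
	}
}
```

Whether dropping such links is desirable is a design question: a page may legitimately link to a URL that also happens to be a redirect target, so a real implementation would need to decide which behavior the flag should promise.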

@ErikOwen ErikOwen deleted the patch/respect-disable-redirects-flag branch October 31, 2023 18:28
Successfully merging this pull request may close these issues.

Enabling -disable-redirects flag follows HTTP redirects