Passive crawling from external sources support #139

Closed
3 tasks done
longnguyenhuynh opened this issue Nov 8, 2022 · 5 comments · Fixed by #781 or #824
Labels
Priority: High (After critical issues are fixed, these should be dealt with before any further issues.)
Status: Completed (Nothing further to be done with this issue. Awaiting to be closed.)
Type: Enhancement (Most issues will probably ask for additions or changes.)

Comments

longnguyenhuynh commented Nov 8, 2022

Please describe your feature request:

Add the ability to retrieve URLs / endpoints passively from the following sources:

  • Wayback Machine (headless)
  • Common Crawl (headless)
  • VirusTotal
  • AlienVault
  • URLScan

CLI Options -

   -ps, -passive                   enable passive sources to discover target endpoints
   -pss, -passive-source           passive source to use for url discovery (wayback,urlscan,commoncrawl,virustotal,alienvault)

JSON Output -

{
  "timestamp": "2022-11-05T22:33:27.745815+05:30",
  "endpoint": "https://mail.google.com/mail/u/0/?ik=a9f1fef565&view=pt&search=all&permthid=thread-f:1717382568649591026",
  "source": "https://otx.alienvault.com/api/v1/indicators/domain/google.com/url_list?limit=100&page=5",
  "mode": "passive"
}
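
For anyone consuming this output downstream, here is a minimal sketch of reading one such record. It assumes the four field names shown in the proposed JSON above; the `PassiveResult` class and `parse_result` function are hypothetical names used for illustration and are not part of katana.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PassiveResult:
    """One record of the proposed JSON output (hypothetical container)."""
    timestamp: datetime
    endpoint: str
    source: str
    mode: str


def parse_result(line: str) -> PassiveResult:
    """Parse a single JSON record with the fields proposed in this issue."""
    record = json.loads(line)
    return PassiveResult(
        # The sample timestamp is ISO 8601 with a UTC offset.
        timestamp=datetime.fromisoformat(record["timestamp"]),
        endpoint=record["endpoint"],
        source=record["source"],
        mode=record["mode"],
    )


# The sample record from this issue, as a single line.
SAMPLE = (
    '{"timestamp": "2022-11-05T22:33:27.745815+05:30", '
    '"endpoint": "https://mail.google.com/mail/u/0/?ik=a9f1fef565&view=pt'
    '&search=all&permthid=thread-f:1717382568649591026", '
    '"source": "https://otx.alienvault.com/api/v1/indicators/domain/'
    'google.com/url_list?limit=100&page=5", '
    '"mode": "passive"}'
)

result = parse_result(SAMPLE)
```

Keeping `source` alongside `endpoint` lets a consumer group or filter results by the passive source that reported them.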

Example run:

katana -u hackerone.com -passive -silent

https://hackerone.com/redirect?signature=147f892574d9bece120cd41a5c4539e3fa8e8066&url=https://vimeo.com/137725491
https://hackerone.com/teams/new
https://hackerone.com/sandbox
https://support.hackerone.com/hc/en-us/articles/211538803-Step-by-Step-How-to-write-a-good-vulnerability-report
https://www.hackerone.com/blog/H1-415-Recap-Oath-Pays-Over-400000-Hackers-One-Da%20y
https://hackerone.com/txt3rob
https://hackerone.com/redirect?signature=4d7211d04ad487ae4b5053792b10fe43badb57fe&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3D5iRylyJTzWc
http://hackerone.com/googleplay
https://hackerone.com/reports/104543
https://hackerone.com/reports/83803
https://hackerone.com/reports/253558
https://hackerone.com/users/confirmation?confirmation_token=z1E1oUpkrMnuBQpMHDd
https://www.hackerone.com/product/challenge
https://www.hackerone.com/blog/Sikurs-COO-Hacker-Diversity-Essential-Securing-SIKURPhone
https://hackerone.com/reports/199438
https://hackerone.com/workday
http://api.hackerone.com:8880/
https://hackerone.com/redirect?signature=c8ae58718e901ab4b54c7bcab54b924cb07a386b&url=http://blog.innerht.ml/overflow-trilogy/
https://hackerone.com/reports/269831
http://www.hackerone.com:8880/
http://hackerone.com:8880/
https://hackerone.com/redirect?signature=0486c622361ef174ec407e000fd0f4e54bdaec4a&url=https://www.owasp.org/index.php?title=Broken_Authentication_and_Session_Management&setlang=en
https://hackerone.com/users/sign_up
https://hackerone.com/reports/223609
https://www.hackerone.com/sites/default/files/2018-07/The%20Hacker-Powered%20Security%20Report%202018.pdf
http://hackerone.com/w2w
http://api.hackerone.com:2083/
http://api.hackerone.com:443/
http://api.hackerone.com:8443/
https://hackerone.com/notifications
https://www.hackerone.com/blog/How-to-Hack-Get-Started-Hacking-Mobile
https://hackerone.com/spotify
https://hackerone.com/users/confirmation?confirmation_token=AxJjSSHbxLuxE2wsKrLz
https://www.hackerone.com/privacy
https://hackerone.com/bugs?subject=user&report_id=
https://hackerone.com/hacktivity
https://hackerone.com/reports/180074
https://hackerone.com/glasswire
https://www.hackerone.com/sites/default/files/2018-06/HackerOne-BlackHat-Vegas-Week-Activities-2018_0.pdf
https://hackerone.com/augurproject
https://hackerone.com/leaderboard/all-time
https://hackerone.com/uber
https://hackerone.com/reports/244504
https://hackerone.com/egyptghost1&d=DwMFaQ&c=7DfhQjPWzR3PmWBQVpi%kw&r=nZr0nOaewW9j3jAt8xfGtw&m=R1VVkSZXns7lMVewqXGum%CDerCjoWKII9VPm54%kyk&s=unQflkqs62j/8P6jUmj6hUs5SNbLS8F53i0sZm4DZwE&e=
https://hackerone.com/hacktivity?sort_type=popular&filter=type:all&page=1&range=forever
https://hackerone.com/gcheng
https://hackerone.com/jrjn
https://hackerone.com/spyboy
https://hackerone.com/dchan

Note:

  1. In passive mode, all applicable options (scope, filters, etc.) will be supported; only active crawling is disabled.
  2. The -passive-source option accepts a single source or multiple comma-separated sources.
  3. By default, all supported passive sources are used in passive mode.
  4. Passive crawling mode is optional and is enabled with the -passive flag.
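
The source-selection rules in notes 2 to 4 can be sketched roughly as follows. This is only an illustration of the described behaviour, not katana's implementation: `resolve_sources` and `SUPPORTED_SOURCES` are hypothetical names, the source names are taken from the -passive-source help text above, and rejecting unknown names with an error is an assumption.

```python
from typing import List

# Source names as listed in the proposed -passive-source help text.
SUPPORTED_SOURCES = ("wayback", "urlscan", "commoncrawl", "virustotal", "alienvault")


def resolve_sources(passive: bool, passive_source: str = "") -> List[str]:
    """Return the passive sources to query for the given flag values."""
    if not passive:
        # Passive mode is optional; without -passive no source is used.
        return []
    if not passive_source.strip():
        # Default: every supported source.
        return list(SUPPORTED_SOURCES)
    # Accept a single name or a comma-separated list.
    requested = [name.strip().lower() for name in passive_source.split(",") if name.strip()]
    unknown = [name for name in requested if name not in SUPPORTED_SOURCES]
    if unknown:
        # Assumption: unknown names are an error rather than silently ignored.
        raise ValueError("unknown passive source(s): " + ", ".join(unknown))
    # Drop duplicates while preserving the order the user gave.
    return list(dict.fromkeys(requested))
```

For example, `resolve_sources(True, "wayback,urlscan")` selects only those two sources, while `resolve_sources(True)` falls back to all five.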

Describe the use case of this feature:

Katana misses some important URLs compared to other crawler tools.

Tasks

  1. dogancanbakir
  2. dogancanbakir
  3. dogancanbakir
@longnguyenhuynh longnguyenhuynh added the Type: Enhancement label Nov 8, 2022
@ehsandeep ehsandeep changed the title from "Add URLs from other sources" to "Passive source support" Nov 8, 2022
@ehsandeep ehsandeep added the Priority: Low label Nov 8, 2022
@ehsandeep
Member

@longnguyenhuynh Katana is primarily an active web crawler; however, thanks to your feature suggestion, passive sources will be added soon.

@Mzack9999
Member

@ehsandeep, could you detail how passive URLs should be handled? For example, are they only retrieved and listed?

@fail-open
Collaborator

It'd be cool if passive sources could be used to seed the spider: if a passive source records a page that is no longer linked from anywhere, the active spider wouldn't find it on its own, but it could crawl from the passive record and potentially discover new pages.
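
A rough sketch of the seeding idea, assuming a simple breadth-first frontier. `crawl_with_seeds` and the `fetch_links` callback are hypothetical names for illustration and do not reflect katana's internals.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_with_seeds(
    start_url: str,
    passive_urls: Iterable[str],
    fetch_links: Callable[[str], Iterable[str]],
    max_pages: int = 100,
) -> List[str]:
    """Breadth-first crawl whose frontier is pre-seeded with passive URLs."""
    frontier = deque()
    seen = set()
    # Seed the frontier with the start URL plus every passive record.
    for url in [start_url, *passive_urls]:
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        # fetch_links stands in for the active crawler's link extraction.
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited
```

With a seeded frontier, a page known only to a passive source is fetched like any other, so links that appear only on that page become reachable.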

@ehsandeep
Member

@Mzack9999 issue is now updated with details.

@fail-open good idea and something to consider/work on after passive support implementation.

@brenocss

In naabu we use -passive and -verify. A -verify option would be great here too, with match conditions such as status code, regex, etc.

@tarunKoyalwar tarunKoyalwar self-assigned this Feb 12, 2023
@ehsandeep ehsandeep added the Priority: Medium label and removed the Priority: Low label Mar 20, 2023
@ehsandeep ehsandeep changed the title from "Passive source support" to "Passive crawling from external sources support" Mar 20, 2023
@ehsandeep ehsandeep pinned this issue Mar 20, 2023
@tarunKoyalwar tarunKoyalwar added the Priority: High label and removed the Priority: Medium label Feb 22, 2024
@dogancanbakir dogancanbakir self-assigned this Feb 26, 2024
@dogancanbakir dogancanbakir linked a pull request Feb 27, 2024 that will close this issue
@ehsandeep ehsandeep added the Status: Completed label Mar 20, 2024
@ehsandeep ehsandeep linked a pull request Mar 26, 2024 that will close this issue
@ehsandeep ehsandeep unpinned this issue Mar 26, 2024

7 participants