Bug: Missing content_length in JSON output if server response lacks Content-Length header #1032
Labels
Status: Completed
Nothing further to be done with this issue. Awaiting to be closed.
Type: Bug
Inconsistencies or issues which will cause an issue or problem for users or implementors.
Katana Version:
v1.1.0 (latest)
Current Behavior:
When using Katana with JSON output mode, the `content_length` field is not present if the server response does not explicitly include the `Content-Length` header.
Example:
Command:
Output:
However, when piping the same URLs through HTTPX, the `content_length` field is populated even if the `Content-Length` header is absent.
Example with HTTPX:
Command:
Output:
Expected Behavior:
HTTPX handles missing `Content-Length` headers by calculating the content length from the response body, ensuring the `content_length` field is present in the JSON output. This behavior is achieved through logic like this:
In contrast, Katana does not appear to include a similar mechanism for ensuring `content_length` is always provided.
Katana Code Snippets:
In `pkg/engine/standard/crawl.go`, Katana seems to set `resp.ContentLength` based solely on the response data length:
And in `pkg/navigation/response.go`, the `Response` struct doesn't include a `content_length` field:
Why This Matters:
The `content_length` field is essential for understanding the structure of a website’s response, particularly to identify whether the response body is empty. Including this data in the output helps users assess the content and size of a response even when the server omits the `Content-Length` header.
Request:
Could you please add logic to Katana to calculate and include the `content_length` in the JSON output, similar to how HTTPX handles it? This would greatly improve the utility of Katana’s output when crawling websites that omit the `Content-Length` header.