Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

app-layer gap handling - v6.1 #2696

Closed
wants to merge 2 commits into from
Closed

Conversation

jasonish
Copy link
Member

@jasonish jasonish commented May 8, 2017

Previous PR: #2660

For parsers configured to handle gaps, pass stream gaps onto them by sending NULL data with a positive input length where the length if the number of bytes lost in the gap.

Add gap handling to DNS for an example. In the DNS case a flag is set on the state that a gap occurred. On the next input data the data is re-probed to see if it can continue.

Prscript:

jasonish added 2 commits May 8, 2017 13:43
I parse can now set a flag that will tell the application
layer that it is capable of handling gaps. If enabled, and a
gap occurred, the app-layer needs to be prepared to accept
input that is NULL with a length, where the length is the
number of bytes lost. It is up to the app-layer to
determine if it can sync up with the input data again.
On gap notification a flag is set, on the next call the input
data is reprobed to make sure it can be processed.
@@ -1081,12 +1088,15 @@ int StreamTcpReassembleAppLayer (ThreadVars *tv, TcpReassemblyThreadCtx *ra_ctx,
"(base %u, isn %u, last_ack %u => diff %u) p %"PRIu64,
stream->base_seq, stream->isn, stream->last_ack,
stream->last_ack - (stream->isn + ackadd), p->pcap_cnt);
gap = stream->last_ack - (stream->isn + ackadd);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue when we hit this case. The seg_list continues to be NULL, so its never passed to the app-layer. Not sure yet where I should be looking here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a preview/test PR #2702 I have removed this seg_list based gap check. I think that makes your PR easier to implement as well.

{
const uint64_t app_progress = STREAM_APP_PROGRESS(stream);
const uint64_t app_progress = STREAM_APP_PROGRESS(stream) + gap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of adding the gap here I think it makes more sense to update app_progress right after the GAP call to the applayer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll have to rebase against your changes then make this change.

Currently I am never able to get the gap check below this to trigger. It was also the gap check before this which triggered, so the gap had to be used here to get at the next data (as the gap call had already been made to the app-layer before calling ReassembleUpdateAppLayer).

@jasonish jasonish mentioned this pull request May 10, 2017
@jasonish
Copy link
Member Author

Closing, see #2704

@jasonish jasonish closed this May 10, 2017
@jasonish jasonish deleted the dev-gap-6.1 branch June 12, 2017 15:58
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request May 30, 2022
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Dec 30, 2022
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Feb 3, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Feb 3, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Feb 6, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Feb 6, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Jun 15, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Jun 16, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Jun 26, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Jun 26, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
cccs-rtmorti added a commit to cccs-rtmorti/suricata that referenced this pull request Jun 26, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 13, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 13, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 13, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 16, 2023
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 4, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 4, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 4, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 4, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 11, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 11, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 30, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Apr 30, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jun 27, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jun 27, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jun 27, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jun 27, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jul 4, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jul 9, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Jul 10, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 6, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 6, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 15, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 15, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 28, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Oct 28, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Nov 7, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Nov 7, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Dec 2, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Dec 2, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Dec 2, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Dec 3, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
catenacyber pushed a commit to catenacyber/suricata that referenced this pull request Dec 5, 2024
Ticket: OISF#2696

There are a lot of changes here, which are described below.

In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.

Constants renamed to correspond to cbindgen generated names:

HTP_OK => HTP_STATUS_OK
HTP_ERROR => HTP_STATUS_ERROR

HTP_SERVER_* => HTP_SERVER_PERSONALITY_*

HTP_REQUEST_INVALID_T_E => HTP_FLAGS_REQUEST_INVALID_T_E
HTP_REQUEST_INVALID_C_L => HTP_FLAGS_REQUEST_INVALID_C_L
HTP_HOST_MISSING => HTP_FLAGS_HOST_MISSING
HTP_HOST_AMBIGUOUS => HTP_FLAGS_HOST_AMBIGUOUS
HTP_HOSTU_INVALID => HTP_FLAGS_HOSTU_INVALID
HTP_HOSTH_INVALID => HTP_FLAGS_HOSTH_INVALID

HTP_AUTH_UNRECOGNIZED => HTP_AUTH_TYPE_UNRECOGNIZED

HTP_M_UNKNOWN => HTP_METHOD_UNKNOWN
HTP_M_GET => HTP_METHOD_GET
HTP_M_POST => HTP_METHOD_POST
HTP_M_PUT => HTP_METHOD_PUT
HTP_M_CONNECT => HTP_METHOD_CONNECT

HTP_STREAM_ERROR => HTP_STREAM_STATE_ERROR
HTP_STREAM_TUNNEL => HTP_STREAM_STATE_TUNNEL

HTP_REQUEST_LINE => HTP_REQUEST_PROGRESS_LINE
HTP_REQUEST_HEADERS => HTP_REQUEST_PROGRESS_HEADERS
HTP_REQUEST_BODY => HTP_REQUEST_PROGRESS_BODY
HTP_REQUEST_TRAILER => HTP_REQUEST_PROGRESS_TRAILER
HTP_REQUEST_COMPLETE => HTP_REQUEST_PROGRESS_COMPLETE
HTP_RESPONSE_LINE => HTP_RESPONSE_PROGRESS_LINE
HTP_RESPONSE_HEADERS => HTP_RESPONSE_PROGRESS_HEADERS
HTP_RESPONSE_BODY => HTP_RESPONSE_PROGRESS_BODY
HTP_RESPONSE_TRAILER => HTP_RESPONSE_PROGRESS_TRAILER
HTP_RESPONSE_COMPLETE => HTP_RESPONSE_PROGRESS_COMPLETE

HTP_PROTOCOL_1_1 => HTP_PROTOCOL_V1_1
HTP_PROTOCOL_1_0 => HTP_PROTOCOL_V1_0
HTP_PROTOCOL_0_9 => HTP_PROTOCOL_V0_9

Functions introduced to handle opaque htp_tx_t:
- tx->cfg => htp_tx_cfg(tx)
- tx->flags => htp_tx_flags(tx)
- tx->is_protocol_0_9 => htp_tx_is_protocol_0_9(tx)
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- tx->request_auth_type => htp_tx_request_auth_type(tx)
- tx->request_headers => htp_tx_request_headers(tx)
- tx->request_hostname => htp_tx_request_hostname(tx)
- tx->request_line => htp_tx_request_line(tx)
- tx->request_message_len => htp_tx_request_message_len(tx)
- tx->request_method => htp_tx_request_method(tx)
- tx->request_method_number => htp_tx_request_method_number(tx)
- tx->request_port_number => htp_tx_request_port_number(tx)
- tx->request_progress => htp_tx_request_progress(tx)
- tx->request_protocol => htp_tx_request_protocol(tx)
- tx->request_protocol_number => htp_tx_request_protocol_number(tx)
- tx->request_uri => htp_tx_request_uri(tx)
- tx->response_headers => htp_tx_response_headers(tx)
- tx->response_line => htp_tx_response_line(tx)
- tx->response_message => htp_tx_response_message(tx)
- tx->response_message_len => htp_tx_response_message_len(tx)
- tx->response_status => htp_tx_response_status(tx)
- tx->response_status_number => htp_tx_response_status_number(tx)
- tx->response_progress => htp_tx_response_progress(tx)
- tx->response_protocol_number => htp_tx_response_protocol_number(tx)
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_table_get_c(tx->request_headers, header) => htp_tx_request_header(tx, header)
- htp_table_get_c(tx->response_headers, header) => htp_tx_response_header(tx, header)
- htp_table_get_index(tx->request_headers, index) => htp_tx_request_header_index(tx, index)
- htp_table_size(tx->request_headers) => htp_tx_request_headers_size(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
  and “Upgrade: h2c" header.

Functions introduced to handle opaque htp_header_t:
- header->name => htp_header_name(header)
- bstr_ptr(header->name) => htp_header_name_ptr(header)
- bstr_len(header->name) => htp_header_name_len(header)
- header->value => htp_header_value(header)
- bstr_len(header->value) => htp_header_value_len(header)
- bstr_ptr(header->value) => htp_header_value_ptr(header)

Functions introduced to handle opaque htp_headers_t:
- htp_table_size(headers) => htp_headers_size(headers)
- htp_table_get_index(headers, index) => htp_headers_get_index(headers, index)

Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.

Functions introduced to handle opaque htp_connp_t:
- htp_list_get(connp->transactions, txid) => htp_connp_tx(connp, txid)
- htp_list_size(http_state->conn->transactions) => htp_connp_tx_size(connp)
- htp_connp_get_connection(connp) => htp_connp_connection(connp)
- htp_connp_req_data(connp) => htp_connp_request_data(connp)
- htp_connp_req_close(connp) => htp_connp_request_close(connp)
- htp_connp_res_data(connp) => htp_connp_response_data(connp)
- htp_connp_get_jn_tx(connp) => htp_connp_get_request_tx(connp)
- htp_connp_get_out_tx(connp) => htp_connp_get_response_tx(connp)
- htp_connp_req_data_consumed(connp) => htp_connp_request_data_consumed(connp)
- htp_connp_res_data_consumed(connp) => htp_connp_response_data_consumed(connp)
- htp_connp_get_user_data(connp) => htp_connp_user_data(connp)

Functions introduced to handle opaque htp_conn_t:
- conn->in_data_counter => htp_conn_request_data_counter(conn)
- conn->out_data_counter => htp_conn_response_data_counter(conn)

Other changes:

Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.

Make function arguments and return values const where possible

htp_tx_destroy(tx) will now free an incomplete transaction

htp_time_t replaced with standard struct timeval

Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.

SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID

htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.

libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values:

HTTP_DECODER_EVENT_UNKNOWN_ERROR => HTP_LOG_CODE_UNKNOWN
HTTP_DECODER_EVENT_GZIP_DECOMPRESSION_FAILED => HTP_LOG_CODE_GZIP_DECOMPRESSION_FAILED
HTTP_DECODER_EVENT_REQUEST_FIELD_MISSING_COLON => HTP_LOG_CODE_REQUEST_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_RESPONSE_FIELD_MISSING_COLON => HTP_LOG_CODE_RESPONSE_FIELD_MISSING_COLON
HTTP_DECODER_EVENT_INVALID_REQUEST_CHUNK_LEN => HTP_LOG_CODE_INVALID_REQUEST_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_RESPONSE_CHUNK_LEN => HTP_LOG_CODE_INVALID_RESPONSE_CHUNK_LEN
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE => HTP_LOG_CODE_INVALID_TRANSFER_ENCODING_VALUE_IN_RESPONSE
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_INVALID_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_REQUEST
HTTP_DECODER_EVENT_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE => HTP_LOG_CODE_DUPLICATE_CONTENT_LENGTH_FIELD_IN_RESPONSE
HTTP_DECODER_EVENT_100_CONTINUE_ALREADY_SEEN => HTP_LOG_CODE_CONTINUE_ALREADY_SEEN
HTTP_DECODER_EVENT_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST => HTP_LOG_CODE_UNABLE_TO_MATCH_RESPONSE_TO_REQUEST
HTTP_DECODER_EVENT_INVALID_SERVER_PORT_IN_REQUEST => HTP_LOG_CODE_INVALID_SERVER_PORT_IN_REQUEST
HTTP_DECODER_EVENT_INVALID_AUTHORITY_PORT => HTP_LOG_CODE_INVALID_AUTHORITY_PORT
HTTP_DECODER_EVENT_REQUEST_HEADER_INVALID => HTP_LOG_CODE_REQUEST_HEADER_INVALID
HTTP_DECODER_EVENT_RESPONSE_HEADER_INVALID => HTP_LOG_CODE_RESPONSE_HEADER_INVALID
HTTP_DECODER_EVENT_MISSING_HOST_HEADER => HTP_LOG_CODE_MISSING_HOST_HEADER
HTTP_DECODER_EVENT_HOST_HEADER_AMBIGUOUS => HTP_LOG_CODE_HOST_HEADER_AMBIGUOUS
HTTP_DECODER_EVENT_INVALID_REQUEST_FIELD_FOLDING => HTP_LOG_CODE_INVALID_REQUEST_FIELD_FOLDING
HTTP_DECODER_EVENT_INVALID_RESPONSE_FIELD_FOLDING => HTP_LOG_CODE_INVALID_RESPONSE_FIELD_FOLDING
HTTP_DECODER_EVENT_REQUEST_FIELD_TOO_LONG => HTP_LOG_CODE_REQUEST_FIELD_TOO_LONG
HTTP_DECODER_EVENT_RESPONSE_FIELD_TOO_LONG => HTP_LOG_CODE_RESPONSE_FIELD_TOO_LONG
HTTP_DECODER_EVENT_FILE_NAME_TOO_LONG => HTP_LOG_CODE_REQUEST_LINE_INVALID
HTTP_DECODER_EVENT_REQUEST_LINE_INVALID => HTP_LOG_CODE_REQUEST_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_BODY_UNEXPECTED => HTP_LOG_CODE_RESPONSE_BODY_UNEXPECTED
HTTP_DECODER_EVENT_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH => HTP_LOG_CODE_REQUEST_SERVER_PORT_TCP_PORT_MISMATCH
HTTP_DECODER_EVENT_URI_HOST_INVALID => HTP_LOG_CODE_URI_HOST_INVALID
HTTP_DECODER_EVENT_HEADER_HOST_INVALID => HTP_LOG_CODE_HEADER_HOST_INVALID
HTTP_DECODER_EVENT_AUTH_UNRECOGNIZED => HTP_LOG_CODE_AUTH_UNRECOGNIZED
HTTP_DECODER_EVENT_REQUEST_HEADER_REPETITION => HTP_LOG_CODE_REQUEST_HEADER_REPETITION
HTTP_DECODER_EVENT_RESPONSE_HEADER_REPETITION => HTP_LOG_CODE_RESPONSE_HEADER_REPETITION
HTTP_DECODER_EVENT_DOUBLE_ENCODED_URI => HTP_LOG_CODE_DOUBLE_ENCODED_URI
HTTP_DECODER_EVENT_URI_DELIM_NON_COMPLIANT => HTP_LOG_CODE_URI_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_METHOD_DELIM_NON_COMPLIANT => HTP_LOG_CODE_METHOD_DELIM_NON_COMPLIANT
HTTP_DECODER_EVENT_REQUEST_LINE_LEADING_WHITESPACE => HTP_LOG_CODE_REQUEST_LINE_LEADING_WHITESPACE
HTTP_DECODER_EVENT_TOO_MANY_ENCODING_LAYERS => HTP_LOG_CODE_TOO_MANY_ENCODING_LAYERS
HTTP_DECODER_EVENT_ABNORMAL_CE_HEADER => HTP_LOG_CODE_ABNORMAL_CE_HEADER
HTTP_DECODER_EVENT_RESPONSE_MULTIPART_BYTERANGES => HTP_LOG_CODE_RESPONSE_MULTIPART_BYTERANGES
HTTP_DECODER_EVENT_RESPONSE_ABNORMAL_TRANSFER_ENCODING => HTP_LOG_CODE_RESPONSE_ABNORMAL_TRANSFER_ENCODING
HTTP_DECODER_EVENT_RESPONSE_CHUNKED_OLD_PROTO => HTP_LOG_CODE_RESPONSE_CHUNKED_OLD_PROTO
HTTP_DECODER_EVENT_RESPONSE_INVALID_PROTOCOL => HTP_LOG_CODE_RESPONSE_INVALID_PROTOCOL
HTTP_DECODER_EVENT_RESPONSE_INVALID_STATUS => HTP_LOG_CODE_RESPONSE_INVALID_STATUS
HTTP_DECODER_EVENT_REQUEST_LINE_INCOMPLETE => HTP_LOG_CODE_REQUEST_LINE_INCOMPLETE
HTTP_DECODER_EVENT_LZMA_MEMLIMIT_REACHED => HTP_LOG_CODE_LZMA_MEMLIMIT_REACHED
HTTP_DECODER_EVENT_COMPRESSION_BOMB => HTP_LOG_CODE_COMPRESSION_BOMB

New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR

The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.

Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants