
WIP: libhtp-rs #6402

Closed
wants to merge 51 commits into from
Conversation

cccs-rtmorti
Contributor

Make sure these boxes are signed before submitting your Pull Request -- thank you.

Link to redmine ticket.

Describe changes:

WIP for libhtp-rs support. For discussion only. libhtp-rs still has some work to do before it is ready, but the current state is here:

https://github.com/cccs-rtmorti/libhtp-rs

Presently this still builds libhtp-rs as a shared library and links it like libhtp-c, but this will change to building and linking it as a crate. To build against libhtp-rs, just symlink the libhtp subdirectory to a checkout of libhtp-rs and the libhtp-rs Makefile will take care of the rest.

cccs-rtmorti and others added 30 commits July 13, 2020 11:47
This value appears to always be null, and does not appear to be set
anywhere.
The type changed in libhtp which was causing incompatible pointer type
warnings.
Suricata compatibility fixes. htp_list_array_t is now a Vector, and struct members holding lists are now accessed through accessor functions.

Approved-by: Todd Mortimer
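
As a rough illustration of the accessor-style list access the commit above describes (the "after" function names are assumptions, not taken from the actual diff):

    #include <stddef.h>
    #include <htp/htp.h>

    /* Illustrative only: walk a connection's transactions. */
    static void walk_transactions(htp_connp_t *connp, htp_conn_t *conn)
    {
        /* Before: direct access to the htp_list_array_t member. */
        for (size_t i = 0; i < htp_list_size(conn->transactions); i++) {
            htp_tx_t *tx = htp_list_get(conn->transactions, i);
            (void)tx;
        }

        /* After (assumed names): the list is a Vector on the Rust side, so
         * both the count and the elements go through accessor functions. */
        for (size_t j = 0; j < htp_connp_tx_size(connp); j++) {
            htp_tx_t *tx = htp_connp_tx(connp, j);
            (void)tx;
        }
    }
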
htp_tx_t is now an opaque type so we have to use functions instead of
direct member access.
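
A minimal sketch of what that change looks like from the Suricata side (the getter names are assumed, not quoted from the diff):

    #include <htp/htp.h>

    /* Illustrative only. */
    static void inspect_tx(htp_tx_t *tx)
    {
        /* Before: direct member access on a public struct, e.g.
         *     bstr *method = tx->request_method;
         */

        /* After: htp_tx_t is opaque, so everything goes through
         * functions (assumed names). */
        const bstr *method = htp_tx_request_method(tx);
        const bstr *uri = htp_tx_request_uri(tx);
        (void)method;
        (void)uri;
    }
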
Use renamed function from libhtp2.
Adds convenience methods for htp_headers and htp_tx_t that make
accessing headers much more concise.

Makes some functions const to support this. These function args aren't
being modified so they should be const. The header values themselves
should not be modified.
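
A sketch of the kind of shorthand this enables; htp_tx_request_header() and htp_header_value() are assumed names standing in for the new helpers:

    #include <htp/htp.h>

    /* Illustrative only: look up the Host header. */
    static const bstr *get_host_header(const htp_tx_t *tx)
    {
        /* Before: fetch the headers table, then search it by name, e.g.
         *     htp_header_t *h = htp_table_get_c(tx->request_headers, "host");
         *     return (h != NULL) ? h->value : NULL;
         */

        /* After: one const-correct call on the (now opaque) transaction. */
        const htp_header_t *h = htp_tx_request_header(tx, "host");
        return (h != NULL) ? htp_header_value(h) : NULL;
    }
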
Uses convenience functions for accessing lists inside of a connection.
Use accessor functions instead of directly using struct fields.
Use functions instead of struct fields.
Removed the '_get' to make it more consistent with other functions.
Approved-by: Richard Mortimer <[email protected]>
Approved-by: Duaa Abdelgadir <[email protected]>
Approved-by: Duaa Abdelgadir <[email protected]>
Approved-by: Todd Mortimer <[email protected]>
Approved-by: Todd Mortimer
Use a new ffi wrapper around lzma-rs to replace the previous lzma
implementation.
Callback handlers are now passed the connp directly in addition to the
transaction and / or transaction data.
Approved-by: Duaa Abdelgadir
Approved-by: Richard Mortimer
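
Roughly, the callback shape described in the commit above changes like this (prototypes simplified; the exact signatures are assumptions):

    /* Before: the transaction callback only received the transaction, e.g.
     *     static int HTPCallbackRequestLine(htp_tx_t *tx);
     */

    /* After (assumed): the connection parser is passed in directly, alongside
     * the transaction and/or the transaction data for *_data callbacks. */
    static int HTPCallbackRequestLine(htp_connp_t *connp, htp_tx_t *tx)
    {
        (void)connp;
        (void)tx;
        return 0;
    }
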
@cccs-rtmorti cccs-rtmorti requested review from norg and a team as code owners September 22, 2021 14:33
@catenacyber catenacyber marked this pull request as draft September 22, 2021 18:48
@catenacyber
Contributor

Thanks for this great work :-)

Generally speaking, I think this will need a bit of commit history cleaning, cf. "Fix conflict in output-json-http".

By the way, every commit must build, so I think the CI commit-check will fail here.

What is the general story for the commits here? (there seem to be many changes)
- Preliminary commits, making changes but keeping libhtp-c
- A commit switching to libhtp-rs
- A commit removing what is remaining of libhtp-c (doc)

Presently this still builds libhtp-rs as a shared library and links it like libhtp-c, but this will also change to build / link as a crate.

Are you working on this?

@victorjulien
Member

Could you do an update that fixes the compilation issues? Don't mind the commit-check and formatting-check failures at this point, but I would like to see the runtime tests pass. Then we can compare in our QA with some confidence.

@cccs-rtmorti
Contributor Author

No problem with cleaning up the commit history, etc. - we were expecting to do this anyway.

The general story is pretty straightforward. The c2rust conversion was initially functionally identical to the C libhtp and was a drop-in replacement, but as we have been refactoring libhtp-rs to be more idiomatic some things changed, so Suricata had to change to handle the new API. Most of the changes here just accommodate making most of the structures opaque, plus renames of constants to match the way cbindgen generates names based on the structures.
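
To make the constant renames concrete, the pattern looks roughly like this (the rename shown illustrates the cbindgen naming scheme and is not necessarily an exact one from the diff):

    /* Illustrative only. */
    static void check_stream_state(int rc)
    {
        /* Before (libhtp-c): short-prefixed enum constants, e.g.
         *     if (rc == HTP_STREAM_ERROR) { ... }
         */

        /* After (libhtp-rs): cbindgen derives the C name from the Rust enum,
         * so the type name ends up embedded in the constant. */
        if (rc == HTP_STREAM_STATE_ERROR) {
            /* handle parser error */
        }
    }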

Some of the other changes have been to make the API nicer to use, such as the change to the log API to eliminate the need to track the position inside the log message buffer:

6331a09
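
The before/after idea, roughly; htp_conn_next_log() below is a hypothetical stand-in for the new call, see the commit for the real API:

    #include <htp/htp.h>

    /* Illustrative only. */
    static void drain_logs(htp_conn_t *conn)
    {
        /* Before: Suricata remembered how far into conn->messages it had
         * already read (htp_state->htp_messages_offset) and iterated the
         * list from that offset on every call. */

        /* After (hypothetical name): the library hands back each unseen log
         * exactly once, so no offset needs to be tracked by the caller. */
        htp_log_t *log;
        while ((log = htp_conn_next_log(conn)) != NULL) {
            /* emit the message, then let the library reclaim it */
        }
    }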

Or to make URI normalization a libhtp function instead of something that Suricata implements and manages on the side in SCHTPGenerateNormalizedUri():

09ae960
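
In other words, rather than Suricata building and caching the normalized URI itself, it can simply ask the library for it (htp_tx_normalized_uri() is an assumed accessor name):

    #include <htp/htp.h>

    /* Illustrative only. */
    static void inspect_uri(htp_tx_t *tx)
    {
        /* Previously Suricata assembled this in SCHTPGenerateNormalizedUri()
         * and cached it on its own per-tx storage. */
        const bstr *uri = htp_tx_normalized_uri(tx);
        if (uri != NULL) {
            const unsigned char *data = bstr_ptr(uri);
            size_t len = bstr_len(uri);
            (void)data;
            (void)len;
        }
    }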

These kinds of changes were motivated by awkwardness we encountered in maintaining the existing API, which led us to consider how we could make the API easier to use and more idiomatic at the same time.

Presently this still builds libhtp-rs as a shared library and links it like libhtp-c, but this will also change to build / link as a crate.

Are you working on this?

Not right now - we have been focused on getting the last of the libhtp-rs TODOs and FIXMEs all sorted first.

Could you do an update that fixes the compilation issues?

No problem! Sorry - I thought you were more looking to just see the current diff.

@victorjulien
Member

Almost there, it seems:

Number of concurrent jobs: 2
===> bug-2511: Sub test #1: FAIL : expected 1 matches; got 0 for filter {'count': 1, 'match': {'event_type': 'alert'}}
===> bug-2511: Sub test #2: FAIL : expected 1 matches; got 0 for filter {'count': 1, 'match': {'event_type': 'alert', 'alert.signature_id': 1}}

PASSED:  913
FAILED:  1
SKIPPED: 35

@catenacyber catenacyber added the needs rebase Needs rebase to master label Mar 24, 2022
@suricata-qa

Warning: no commits in this PR have specified the following ticket(s):

Please update the commit(s) and submit a new PR.

@suricata-qa suricata-qa added the needs ticket Needs (link to) redmine ticket label Apr 26, 2022
Conflicts:
	src/app-layer-htp-file.h
	src/app-layer-htp.c
	src/app-layer-htp.h
	src/app-layer-http2.c
	src/detect-pcre.c
	src/detect-uricontent.c
@catenacyber
Contributor

Could you close this PR and open a new one with the rebased changes?
cf. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Github_work_flow#Create-a-new-branch-for-incorporating-feedback

@cccs-rtmorti
Contributor Author

Closed for #7438

victorjulien added a commit to victorjulien/suricata that referenced this pull request Oct 13, 2023
When the decoders encounter multiple layers of tunneling, multiple tunnel
packets are created. These are then stored in ThreadVars::decode_pq, where
they are processed after the current thread "slot" is done. However, due
to a logic error, the tunnel packets after the first were not called
for the correct position in the packet pipeline. This would lead to these
packets not going through the FlowWorker module, so skipping everything
from flow tracking, detection and logging.

This would only happen for the single and workers runmodes, due to how the
pipelines are constructed.

This patch addresses the issue by making sure only a "decode" thread
slot will service the ThreadVars::decode_pq.

Bug: OISF#6402.
victorjulien added a commit to victorjulien/suricata that referenced this pull request Oct 16, 2023
When the decoders encounter multiple layers of tunneling, multiple tunnel
packets are created. These are then stored in ThreadVars::decode_pq, where
they are processed after the current thread "slot" is done. However, due
to a logic error, the tunnel packets after the first were not called
for the correct position in the packet pipeline. This would lead to these
packets not going through the FlowWorker module, so skipping everything
from flow tracking, detection and logging.

This would only happen for the single and workers runmodes, due to how the
pipelines are constructed.

The "slot" holding the decoder would contain 2 packets in
ThreadVars::decode_pq. It would then call the pipeline on the first
packet with the next slot of the pipeline through an indirect call to
TmThreadsSlotVarRun(), so it would be called for the FlowWorker.
However, when that first (innermost) packet was done, the call
to TmThreadsSlotVarRun() would again service the ThreadVars::decode_pq
and process it, again moving the slot pointer forward, so past the
FlowWorker.

This patch addresses the issue by making sure only a "decode" thread
slot will service the ThreadVars::decode_pq, thus never moving the
slot past the FlowWorker.

Bug: OISF#6402.
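
For readers following along, the shape of the fix described above is roughly this (heavily simplified; member, flag and helper names are invented for illustration and do not match the actual patch):

    /* Simplified sketch using Suricata-like types; not the real code. */
    static void run_slot(ThreadVars *tv, TmSlot *slot, Packet *p)
    {
        slot->Func(tv, p); /* decode, FlowWorker, output, ... */

        /* Before the fix: any slot that saw packets in decode_pq would drain
         * it, so tunnel packets created by the decoder resumed the pipeline
         * past the FlowWorker and skipped flow tracking, detection and
         * logging. */

        /* After the fix: only the decode slot services its own decode_pq,
         * and every queued tunnel packet restarts the pipeline from the slot
         * after the decoder, so it still reaches the FlowWorker. */
        if (slot->is_decode_slot) {            /* invented flag */
            Packet *tp;
            while ((tp = PacketDequeueNoLock(&tv->decode_pq)) != NULL) {
                run_pipeline_from(tv, slot->slot_next, tp); /* invented helper */
            }
        }
    }
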
victorjulien added a commit to victorjulien/suricata that referenced this pull request Oct 17, 2023
Bug: OISF#6402.
(cherry picked from commit 15947f2)