-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove leading and trailing whitespaces #247
Conversation
from field-values to be RFC 7230 compliant.
@@ -215,6 +215,9 @@ tokenize_header_value(undefined) -> | |||
tokenize_header_value(V) -> | |||
reversed_tokens(trim_and_reverse(V, false), [], []). | |||
|
|||
trim_leading_and_trailing_ws(S) -> | |||
re:replace(S, "^[ \\t]*|[ \\t]*$", "", [global, {return, list}]). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest string:trim(String, both, [9,32]).
for performance;
RE input: ' hello there ', output: 'hello there', speed: 6.276318
TRIM input: ' hello there ', output: 'hello there', speed: 1.377677
Count = 1_000_000,
{Time, _} = timer:tc(fun() -> lists:foreach(fun(_) -> F() end, lists:seq(1, Count)) end),
Time / Count.```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same idea too but then noticed mochiweb supports Erlang 18+ versions and string:trim/1
is an Erlang 20+ function.
string:strip/3
can be used but it would have to be called twice once for space and once for the tab character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest
string:trim(String, both, [9,32]).
for performance;RE input: ' hello there ', output: 'hello there', speed: 6.276318 TRIM input: ' hello there ', output: 'hello there', speed: 1.377677
Thats significant. Can we make a runtime distinction and differentiate between the Erlang versions? So if Erlang > 20 use trim/3
else use strip/3
?
I had the same idea too but then noticed mochiweb supports Erlang 18+ versions and
string:trim/1
is an Erlang 20+ function.
If we want to be RFC compliant, we can't usetrim/1
because that would remove more chars thanspace
andtab
.
OWS = *( SP / HTAB )
; optional whitespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to know what the performance of trim_and_reverse(trim_and_reverse(S))
would be, that would have the desired effect and leverage code that's already in that module. A few tests to verify the new behavior would be nice too.
On closer read the -define(IS_WHITESPACE(C),
(C =:= $\s orelse C =:= $\t orelse C =:= $\r orelse C =:= $\n)).
is_whitespace(C) ->
?IS_WHITESPACE(C).
trim_leading_and_trailing_ws(S) ->
trim_trailing_ws(lists:dropwhile(fun is_whitespace/1, S), [], 0).
trim_trailing_ws([C | Rest], Acc, N) when ?IS_WHITESPACE(C) ->
trim_trailing_ws(Rest, [C | Acc], 1 + N);
trim_trailing_ws([C | Rest], Acc, N) ->
trim_trailing_ws(Rest, [C | Acc], 0);
trim_trailing_ws([], Acc, N) ->
lists:reverse(lists:nthtail(Acc, N)). |
But 2 times 1> mochiweb_headers_playground:test().
T1 (trim_leading_and_trailing_ws) : 13.43003
T2 (trim_and_reverse) : 0.404285
T3 (trim_leading_and_trailing_ws_2): 0.609125 I made a small playground file which can be used to test this (with all your work and ideas)! So, let us use the following? trim_and_reverse(S) ->
trim_and_reverse(trim_and_reverse(S, false), false). |
I'll merge this and apply the suggestions (a test and more performant implementation) in a follow-up |
from field-values to be RFC 7230 compliant.
Fix #246