Refactor parser to fix inconsistencies #180

bbc2 · 2019-05-21T21:56:18Z

This fixes inconsistencies reported after the release of version 0.10.0 (comments in #148):

* Valid escapes were interpreted as control characters even when in single-quoted strings.
* `#` was interpreted as the start of a comment even if there was no whitespace preceding it.

However, we are keeping the interpretation of escapes in double-quoted strings, since the way versions before 0.10.0 handled them didn't make sense.

The single large regular expression is replaced with a handwritten top-down parser using smaller regular expressions. The reason for this change is that it would have been very difficult or impossible to satisfy the parsing requirements with a single regex.
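To make the approach concrete, here is a minimal sketch of a top-down parser built from small regular expressions. It is not the code from this PR: `Reader`, `parse_value`, `parse_binding`, and every pattern below are hypothetical names chosen for illustration, and the sketch handles only single-line `KEY=value` bindings. It shows only the two behaviors listed above, under those assumptions.

```python
import re
from typing import Optional, Tuple

# Small single-purpose patterns (all hypothetical, for illustration only).
_WHITESPACE = re.compile(r"[ \t]*")
_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_EQUALS = re.compile(r"=")
_SINGLE_QUOTED = re.compile(r"'((?:\\.|[^'\\])*)'")
_DOUBLE_QUOTED = re.compile(r'"((?:\\.|[^"\\])*)"')
_UNQUOTED = re.compile(r"[^\r\n]*")

_SINGLE_ESCAPE = re.compile(r"\\(['\\])")
_DOUBLE_ESCAPE = re.compile(r"\\(.)", re.DOTALL)
_DOUBLE_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}


class Reader:
    """Tracks a position in the text and consumes one pattern at a time."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def peek(self) -> str:
        return self.text[self.pos:self.pos + 1]

    def read(self, pattern: "re.Pattern[str]") -> "re.Match[str]":
        match = pattern.match(self.text, self.pos)
        if match is None:
            raise ValueError("unexpected input at position %d" % self.pos)
        self.pos = match.end()
        return match


def parse_value(reader: Reader) -> str:
    leading = reader.read(_WHITESPACE).group(0)
    char = reader.peek()
    if char == "'":
        # Single quotes: only \' and \\ are unescaped; \n stays literal.
        raw = reader.read(_SINGLE_QUOTED).group(1)
        return _SINGLE_ESCAPE.sub(lambda m: m.group(1), raw)
    if char == '"':
        # Double quotes: escapes such as \n become control characters.
        raw = reader.read(_DOUBLE_QUOTED).group(1)
        return _DOUBLE_ESCAPE.sub(
            lambda m: _DOUBLE_ESCAPES.get(m.group(1), m.group(0)), raw
        )
    # Unquoted: '#' starts a comment only when whitespace precedes it.
    raw = leading + reader.read(_UNQUOTED).group(0)
    return re.sub(r"\s+#.*", "", raw).strip()


def parse_binding(line: str) -> Optional[Tuple[str, str]]:
    reader = Reader(line)
    reader.read(_WHITESPACE)
    if reader.peek() in ("", "#"):
        return None  # blank line or full-line comment
    key = reader.read(_KEY).group(0)
    reader.read(_WHITESPACE)
    reader.read(_EQUALS)
    return key, parse_value(reader)


if __name__ == "__main__":
    for line in [r"A='x\ny'", r'B="x\ny"', "C=a#b", "D=a #b", "# comment"]:
        print(repr(line), "->", parse_binding(line))
```

In this sketch, the `\n` in the single-quoted value of `A` stays as a backslash followed by `n`, the same sequence in the double-quoted value of `B` becomes a newline, `C` keeps `a#b` intact, and `D` is cut down to `a`. Because each rule lives in its own small pattern and branch, each can be adjusted without touching the others.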

See individual commits to understand the changes better.

Fixes #170.

coveralls · 2019-05-21T22:00:47Z

Coverage increased (+1.5%) to 90.217% when pulling e520f20 on bbc2:top-down-parser into 73124de on theskumar:master.

theskumar · 2019-05-22T08:15:06Z

I really liked the top-down parser approach; it will be a lot easier to maintain and debug now. Reviewed the code and it all looks great! Thank you so much!

AEHamrick · 2019-05-23T15:02:02Z

Would this be applicable to the behavior I'm seeing in pypa/pipenv#3757 with dotenv (by way of pipenv)? Will gladly open a new issue if not.

bbc2 · 2019-05-23T18:02:28Z

Yes, I believe we fixed the double-escaping issue in 0.10.0. See #170 for more information.

* master: (49 commits) Fix typo in README.md (#181) Refactor move run_command in cli Refractor: move 'to_env' to compat.py Refactor parser to fix inconsistencies (#180) chore(pypi): switch to markdown and twine for pypi upload + update supported version Bump version: 0.10.1 → 0.10.2 readme: Add new release stub Fix unicode/str inconsistency in Python 2 (#177) Add argument to choose .env file encoding, defaults to None Fix links in readme Restrict typing dependency to Python < 3.5 Add type hints and expose them to users Updated `.gitignore` with results from https://www.gitignore.io/api/python Added special case for `__file__` in Python 2. Fixes #130 Updated tests to show that this works. Moved `dotenv` package into `src` directory so that tests run against installed version. Added tox to run test suite against multiple versions of Python (with coverage) and run flake8. Updated Travis CI to use tox as well. Updated documentation about running tests. Fix ResourceWarning: unclosed file in setup.py and test_cli.py Clarify the usuages of export Isolate test files (#160) Deleted some temporary files that CLI tests were creating. Add python 3.7 to testsuite + udpate pypi creds ...

* Move parser to separate module
* Add tests
* Use unicode strings for unit tests in Python 2

  Using `str` (i.e. `bytes`) is inconsistent with the types and the implementation.

* Refactor parser

  This fixes inconsistencies reported after the release of version 0.10.0:

  * Valid escapes were interpreted as control characters even when in single-quoted strings.
  * `#` was interpreted as the start of a comment even if there was no whitespace preceding it.

  However, we are keeping the interpretation of escapes in double-quoted strings as they didn't make sense in versions before 0.10.0.

  The single large regular expression is replaced with a handwritten top-down parser using smaller regular expressions. The reason for this change is that it would have been very difficult or impossible to satisfy the parsing requirements with a single regex.

bbc2 added 4 commits May 14, 2019 00:33

Move parser to separate module

95c05d4

Add tests

b943e43

Use unicode strings for unit tests in Python 2

fd0a487

Using `str` (i.e. `bytes`) is inconsistent with the types and the implementation.

bbc2 requested a review from theskumar May 21, 2019 21:56

theskumar approved these changes May 22, 2019

theskumar merged commit 57f9639 into theskumar:master May 22, 2019

bbc2 deleted the top-down-parser branch May 23, 2019 18:02

AEHamrick mentioned this pull request May 23, 2019

Update vendored python-dotenv to fix doubles backslashes when loading .env file pypa/pipenv#3757

Closed
