Commit

chore: check compatibility with cpython 3.12, clean-up, drop windows support, etc.
tos-kamiya committed Oct 17, 2024
1 parent a71f1c2 commit fb4cbc1
Showing 12 changed files with 134 additions and 178 deletions.
14 changes: 0 additions & 14 deletions .github/FUNDING.yml

This file was deleted.

28 changes: 0 additions & 28 deletions .github/workflows/tests-windows.yaml

This file was deleted.

3 changes: 1 addition & 2 deletions .github/workflows/tests.yaml
@@ -9,7 +9,7 @@ jobs:
max-parallel: 15
matrix:
platform: [ubuntu-latest, macos-latest]
python-version: ['3.8', '3.9', '3.10']
python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']

steps:
- uses: actions/checkout@v2
@@ -21,7 +21,6 @@ jobs:
run: |
python -m pip install --upgrade pip setuptools wheel
python -m pip install tox tox-gh-actions
python -m pip install docopt-ng
- name: Install the package under test
run: python -m pip install -e .
- name: Test with tox
17 changes: 2 additions & 15 deletions README-pypi.md
@@ -22,24 +22,11 @@ Features:
## Installation

```sh
pip install dendro-text
pipx install dendro-text
```

If you run the dendro_text and get the following error message, please install dendro-text with docopt-ng.

```sh
$ dendro_text
Error: the Docopt module has not been installed. Install it with `pip install docopt-ng`.
```

```sh
pip install dendro-text[docopt-ng]
```

(To make `dendro-text` compatible with both `docopt` and `docopt-ng`, dependencies on them are now explicitly extra dependencies.)

To uninstall,

```sh
pip uninstall dendro-text
pipx uninstall dendro-text
```
48 changes: 16 additions & 32 deletions README.md
@@ -1,6 +1,6 @@
[![Tests](https://github.com/tos-kamiya/dendro_text/actions/workflows/tests.yaml/badge.svg)](https://github.com/tos-kamiya/dendro_text/actions/workflows/tests.yaml)
[![Tests](https://github.com/tos-kamiya/dendro-text/actions/workflows/tests.yaml/badge.svg)](https://github.com/tos-kamiya/dendro-text/actions/workflows/tests.yaml)

dendro_text
dendro-text
===========

Draw a dendrogram of similarity between text files.
@@ -22,46 +22,26 @@ Features:
## Installation

```sh
pip install dendro-text
pipx install dendro-text
```

If you run the dendro_text and get the following error message, please install dendro-text with docopt-ng.

```sh
$ dendro_text
Error: the Docopt module has not been installed. Install it with `pip install docopt-ng`.
```

```sh
pip install dendro-text[docopt-ng]
```

(To make `dendro-text` compatible with both `docopt` and `docopt-ng`, dependencies on them are now explicitly extra dependencies.)

To uninstall,

```sh
pip uninstall dendro-text
pipx uninstall dendro-text
```

### Numba (option)

**To enable jit compilation by Numba, install it according to the instructions on [Numba website](https://numba.pydata.org/).**

Note that the installation of Numba differs for each platform. For example, on Ubuntu 20.04, in addition to installing `numba` with pip:

```sh
pip install numba
```

The following is required:
To install dendro-text with Numba,

```sh
sudo apt install python3-testresources
pipx install dendro-text --preinstall numba
```

Numba is used transparently. When you run `dendro_text`, if it detects that Numba is installed on your system, `dendro_text` will call functions compiled in jit, otherwise it will call pure Python functions.

The speedup with Numba was approx. 5x in one example I tried.
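
The transparent fallback described above is commonly written as an optional import. The following is a minimal sketch of that pattern, not code taken from dendro-text: the `HAVE_NUMBA` flag and the `count_mismatches` kernel are hypothetical names used only for illustration, and the sketch assumes the accelerated functions are decorated with Numba's `njit`.

```python
# Sketch only: not taken from the dendro-text source.
try:
    from numba import njit  # real Numba decorator, used when available
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    def njit(*args, **kwargs):
        """Fallback that leaves the function uncompiled."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as a bare @njit

        def decorator(func):
            return func  # used as @njit(...)

        return decorator


@njit
def count_mismatches(a, b):
    """Hypothetical kernel: positions where two strings differ,
    plus the difference in their lengths."""
    n = min(len(a), len(b))
    diff = max(len(a), len(b)) - n
    for i in range(n):
        if a[i] != b[i]:
            diff += 1
    return diff
```

Callers use `count_mismatches` the same way in both cases; only the speed differs, which matches the behaviour the paragraph above describes.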

### picaf (option)
@@ -71,7 +51,7 @@ If you are doing tasks like investigating files in the dendrogram one by one (as
## Usage

```sh
dendro_text <file>...
dendro-text <file>...
```

### Options
@@ -160,7 +140,7 @@ abccfg
2. Create dendrograms showing file similarity by character-by-character comparison.

```sh
$ dendro_text -c *.txt
$ dendro-text -c *.txt
─┬─┬─┬── abcfg.txt
│ │ └── abcdfg.txt
│ └─┬── abccfg.txt
@@ -171,7 +151,7 @@ $ dendro_text -c *.txt
3. List files in order of similarity to a file `abccfg.txt`, with option `-N0`.

```sh
$ dendro_text -c -N0 abccfg.txt *.txt
$ dendro-text -c -N0 abccfg.txt *.txt
0 abccfg.txt
1 abcccfg.txt
1 abcdfg.txt
@@ -190,7 +170,7 @@ Tokens that are only in the first file are indicated by a red background color,
Note that the three files `abcccfg.txt`, `abccfg.txt`, and `abcfg.txt` are now grouped in one node, because they no longer differ.

```sh
$ dendro_text -c *.txt --prep 'sed s/c//g'
$ dendro-text -c *.txt --prep 'sed s/c//g'
─┬─┬── abcdfg.txt
│ └── abcccfg.txt,abccfg.txt,abcfg.txt
└── abdefg.txt
@@ -202,7 +182,7 @@ $ dendro_text -c *.txt --prep 'sed s/c//g'

The default tokenization (extracting words from the text) method is to split text at the point where the type of letter changes.
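
The splitting rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: the four character classes in the hypothetical `char_class` helper are a guess, and the real tool may classify characters differently (for example by Unicode block).

```python
from itertools import groupby
from typing import List


def char_class(ch: str) -> str:
    """Hypothetical classifier: the real tool may use finer classes."""
    if ch.isspace():
        return "space"
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "alpha"
    return "other"


def tokenize(text: str) -> List[str]:
    """Split text wherever the character class changes."""
    # groupby yields runs of consecutive characters sharing a class.
    return ["".join(run) for _, run in groupby(text, key=char_class)]
```

Note that in this sketch a run of adjacent punctuation characters stays together as one token, which is a simplification.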

For example, the text "The version of dendro_text is marked as v1.1.1." turns into the following token sequence:
For example, the text "The version of dendro-text is marked as v1.1.1." turns into the following token sequence:

```sh
["The", " ", "version", " ", "of", " ", "dendro", "_", "text", " ",
@@ -229,7 +209,7 @@ The base name of the temporary file is the same as the original input file, but
For example, in the following command line,

```sh
$ dendro_text --prep p1.sh --prep p2.sh t1.txt t2.txt t3.txt
$ dendro-text --prep p1.sh --prep p2.sh t1.txt t2.txt t3.txt
```

Preprocessing scripts `p1.sh` and `p2.sh` will get (such as) `some/temp/dir/t1.txt`, `some/temp/dir/t2.txt` or `some/temp/dir/t3.txt` as input file.
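
One way to picture the chain of preprocessors is the sketch below. It is not the tool's code. It assumes each preprocessor receives the temporary file path as its last argument and writes its result to standard output, as the `sed s/c//g` example suggests; the function name `run_preprocessors` is hypothetical.

```python
import shlex
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List


def run_preprocessors(input_file: str, preprocessors: List[str]) -> str:
    """Apply each preprocessor command in order; return the final text."""
    with tempfile.TemporaryDirectory() as temp_dir:
        # The temporary copy keeps the base name of the original file.
        temp_file = Path(temp_dir) / Path(input_file).name
        shutil.copyfile(input_file, temp_file)
        for command in preprocessors:
            # Assumption: the file path is appended as the last argument
            # and the processed text comes back on standard output.
            result = subprocess.run(
                shlex.split(command) + [str(temp_file)],
                check=True,
                capture_output=True,
            )
            # The output becomes the input of the next preprocessor.
            temp_file.write_bytes(result.stdout)
        return temp_file.read_text(encoding="utf-8")
```

With this sketch, `run_preprocessors("t1.txt", ["p1.sh", "p2.sh"])` would run `p1.sh` and then `p2.sh` on a temporary copy named `t1.txt`, leaving the original file untouched.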
@@ -242,3 +222,7 @@ Preprocessing scripts `p1.sh` and `p2.sh` will get (such as) `some/temp/dir/t1.t
* The file `Blocks.txt` is released under the [Unicode Data Files and Software License](https://www.unicode.org/license.txt).

* All of the other source code is released under [the BSD 2-Clause License](LICENSE).

## Changelog

* v2.0.0: The script is renamed to `dendro-text`. Drop windows support.
2 changes: 1 addition & 1 deletion dendro_text/VERSION
@@ -1 +1 @@
1.7.2
2.0.0