Commit

Merge pull request #135 from ku-nlp/develop
Release
hkiyomaru authored Jun 13, 2023
2 parents f97ae7a + d988c10 commit cdc5015
Showing 15 changed files with 317 additions and 289 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test-example.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10"]
python-version: ["3.8", "3.9", "3.10", "3.11"]
steps:
- name: Checkout repository
uses: actions/checkout@v3
7 changes: 3 additions & 4 deletions .github/workflows/test.yml
@@ -37,14 +37,13 @@ jobs:
path: /tmp/kwja
key: kwja-${{ env.KWJA_VERSION }}
- name: Run tests with KWJA
# KWJA does not support Python 3.7 and 3.11
if: ${{ matrix.python-version != 3.7 && matrix.python-version != 3.11 }}
# KWJA does not support Python 3.7
if: ${{ matrix.python-version != 3.7 }}
run: |
pipx install kwja
ulimit -n 4096
poetry run pytest --cov=./ --cov-report=xml -v ./tests
- name: Run tests without KWJA
if: ${{ matrix.python-version == 3.7 || matrix.python-version == 3.11 }}
if: ${{ matrix.python-version == 3.7 }}
run: |
poetry run pytest -v --ignore ./tests/processors/test_kwja.py ./tests
- name: Upload coverage to Codecov
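Reviewer note: to make the effect of the updated `if:` conditions concrete, here is a minimal Python sketch of which test step would be selected per interpreter version after this change. The helper `select_test_step` and the version list are hypothetical and exist only for illustration; the sketch compares plain strings and does not model how GitHub Actions itself coerces values in expressions.

```python
def select_test_step(python_version: str) -> str:
    """Return the name of the test step expected to run for a Python version.

    Hypothetical helper mirroring the updated conditions in test.yml:
    only Python 3.7 skips the KWJA tests after this commit.
    """
    if python_version != "3.7":
        return "Run tests with KWJA"
    return "Run tests without KWJA"


# Assumed set of versions, for illustration only.
for version in ["3.7", "3.8", "3.9", "3.10", "3.11"]:
    print(version, "->", select_test_step(version))
```

Before this commit, Python 3.11 would have taken the "without KWJA" branch as well; under the new conditions it joins the KWJA-enabled runs.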
7 changes: 4 additions & 3 deletions .pre-commit-config.yaml
@@ -25,7 +25,7 @@ repos:
hooks:
- id: absolufy-imports
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.2.0
rev: v1.3.0
hooks:
- id: mypy
additional_dependencies:
@@ -34,8 +34,9 @@ repos:
- rich
- uvicorn
- fastapi
- markdown-it-py==2.2.0
- repo: https://github.com/asottile/pyupgrade
rev: v3.3.2
rev: v3.6.0
hooks:
- id: pyupgrade
args:
@@ -45,6 +46,6 @@ repos:
hooks:
- id: pydocstyle
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v2.7.1
rev: v3.0.0-alpha.9-for-vscode
hooks:
- id: prettier
59 changes: 30 additions & 29 deletions README.md
@@ -16,14 +16,22 @@
<a href="https://github.com/psf/black"><img alt="Code style - black" src="https://img.shields.io/badge/code%20style-black-222222?style=flat-square"></a>
</p>

---

**Documentation**: [https://rhoknp.readthedocs.io/en/latest/](https://rhoknp.readthedocs.io/en/latest/)

**Source Code**: [https://github.com/ku-nlp/rhoknp](https://github.com/ku-nlp/rhoknp)

---

_rhoknp_ is a Python binding for [Juman++](https://github.com/ku-nlp/jumanpp), [KNP](https://github.com/ku-nlp/knp), and [KWJA](https://github.com/ku-nlp/kwja).[^1]

[^1]: The logo was originally generated using OpenAI DALL·E 2
[^1]: The logo was generated by OpenAI DALL·E 2.

```python
import rhoknp

# Perform language analysis by Juman++
# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence(
"電気抵抗率は電気の通しにくさを表す物性値である。"
Expand All @@ -33,47 +41,40 @@ sentence = jumanpp.apply_to_sentence(
for morpheme in sentence.morphemes: # a.k.a. keitai-so
...

# Save language analysis by Juman++
# Save the result
with open("result.jumanpp", "wt") as f:
f.write(sentence.to_jumanpp())

# Load language analysis by Juman++
# Load the result
with open("result.jumanpp", "rt") as f:
sentence = rhoknp.Sentence.from_jumanpp(f.read())
```

## Requirements

- Python 3.7+

## Optional requirements for language analysis

- [Juman++](https://github.com/ku-nlp/jumanpp) v2.0.0-rc3+
- [KNP](https://github.com/ku-nlp/knp) 5.0+
- [KWJA](https://github.com/ku-nlp/kwja) 1.0.0+
- (Optional) [Juman++](https://github.com/ku-nlp/jumanpp) v2.0.0-rc3+
- (Optional) [KNP](https://github.com/ku-nlp/knp) 5.0+
- (Optional) [KWJA](https://github.com/ku-nlp/kwja) 1.0.0+

## Installation

```shell
pip install rhoknp
```

## Documentation

[https://rhoknp.readthedocs.io/en/latest/](https://rhoknp.readthedocs.io/en/latest/)

## Quick tour

Let's start with using Juman++ with _rhoknp_.
Here is a simple example of using Juman++ to analyze a sentence.
Let's begin by using Juman++ with rhoknp.
Here, we present a simple example demonstrating how Juman++ can be used to analyze a sentence.

```python
# Perform language analysis by Juman++
# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```

You can easily access the morphemes that make up the sentence.
You can easily access the individual morphemes that make up the sentence.

```python
for morpheme in sentence.morphemes: # a.k.a. keitai-so
@@ -125,7 +126,7 @@ with open("sentence.knp", "rt") as f:
sentence = rhoknp.Sentence.from_knp(f.read())
```

_rhoknp_ also provides APIs for document-level language analysis.
Furthermore, rhoknp provides convenient APIs for document-level language analysis.

```python
document = rhoknp.Document.from_raw_text(
@@ -140,10 +141,10 @@ document = rhoknp.Document.from_sentences(
)
```

Document objects can be handled in almost the same way as Sentence objects.
Document objects can be handled in a similar manner as Sentence objects.

```python
# Perform language analysis by Juman++
# Perform morphological analysis by Juman++
document = jumanpp.apply_to_document(document)

# Access language units in the document
@@ -161,25 +162,25 @@ with open("document.jumanpp", "rt") as f:
document = rhoknp.Document.from_jumanpp(f.read())
```

For more information, explore the [examples](./examples) and [documentation](https://rhoknp.readthedocs.io/en/latest/).
For more information, please refer to the [examples](./examples) and [documentation](https://rhoknp.readthedocs.io/en/latest/).

## Main differences from [pyknp](https://github.com/ku-nlp/pyknp/)

[_pyknp_](https://pypi.org/project/pyknp/) has been developed as the official Python binding for Juman++ and KNP.
In _rhoknp_, we redesigned the API from the top-down, taking into account the current use cases of _pyknp_.
The main differences are as follows:
[_pyknp_](https://pypi.org/project/pyknp/) serves as the official Python binding for Juman++ and KNP.
In the development of rhoknp, we redesigned the API, considering the current use cases of pyknp.
The key differences between the two are as follows:

- **Support for document-level language analysis**: _rhoknp_ can load and instantiate the result of document-level language analysis (i.e., cohesion analysis and discourse relation analysis).
- **Strictly type-aware**: _rhoknp_ is thoroughly annotated with type annotations.
- **Extensive test suite**: _rhoknp_ is tested with an extensive test suite. See the code coverage at [Codecov](https://app.codecov.io/gh/ku-nlp/rhoknp).
- **Support for document-level language analysis**: rhoknp allows you to load and instantiate the results of document-level language analysis, including cohesion analysis and discourse relation analysis.
- **Strict type-awareness**: rhoknp has been thoroughly annotated with type annotations, ensuring strict type checking and improved code clarity.
- **Comprehensive test suite**: rhoknp is extensively tested with a comprehensive test suite. You can view the code coverage report on [Codecov](https://app.codecov.io/gh/ku-nlp/rhoknp).

## License

MIT

## Contributing

We welcome contributions to _rhoknp_.
We warmly welcome contributions to rhoknp.
You can get started by reading the [contribution guide](https://rhoknp.readthedocs.io/en/latest/contributing/index.html).

## Reference