Commit 9cc1eed

Merge pull request #1298 from PyThaiNLP/dev
Update Markdown fixes
2 parents 2f4428f + 05b0ceb commit 9cc1eed

6 files changed, +69 -43 lines changed

.github/ISSUE_TEMPLATE/feature_request.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -5,7 +5,7 @@ about: Propose a change or addition เสนอความสามารถ
 
 ## Detailed description
 
-<!--- Provide a detailed description of the change or addition you are proposing -->
+<!--- Provide a detailed description of the change or addition -->
 <!--- รายละเอียดการเปลี่ยนแปลงหรือสิ่งเพิ่มเติมที่คุณกำลังเสนอ -->
 
 ## Context
```
```diff
@@ -17,8 +17,8 @@ about: Propose a change or addition เสนอความสามารถ
 
 ## Possible implementation
 
-<!--- Not obligatory, but suggest an idea for implementing addition or change -->
-<!--- ไม่จำเป็นต้องใส่ แต่คุณสามารถแนะนำได้ว่าการเปลี่ยนแปลงหรือเพิ่มเติมดังกล่าวน่าจะทำได้ด้วยวิธีไหน -->
+<!--- Suggest an idea for implementing the change or addition (Optional) -->
+<!--- แนะนำว่าจะเปลี่ยนแปลงหรือเพิ่มเติมได้ด้วยวิธีไหน (ไม่จำเป็นต้องใส่) -->
 
 ## Your environment
 
```

README.md

Lines changed: 11 additions & 8 deletions
```diff
@@ -13,12 +13,12 @@
 [![Facebook](https://img.shields.io/badge/Facebook-0866FF?style=flat&logo=facebook&logoColor=white)](https://www.facebook.com/pythainlp/)
 [![Chat on Matrix](https://matrix.to/img/matrix-badge.svg)](https://matrix.to/#/#thainlp:matrix.org)
 
-[pythainlp.org](https://pythainlp.org/) |
-[Tutorials](https://pythainlp.org/tutorials) |
-[License info](https://pythainlp.org/dev-docs/notes/license.html) |
-[Model cards](https://github.com/PyThaiNLP/pythainlp/wiki/Model-Cards) |
-[Adopters](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTHEWILD.md) |
-*[เอกสารภาษาไทย](https://github.com/PyThaiNLP/pythainlp/blob/dev/README_TH.md)*
+[pythainlp.org](https://pythainlp.org/)
+| [Tutorials](https://pythainlp.org/tutorials)
+| [License info](https://pythainlp.org/dev-docs/notes/license.html)
+| [Model cards](https://github.com/PyThaiNLP/pythainlp/wiki/Model-Cards)
+| [Adopters](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTHEWILD.md)
+| *[เอกสารภาษาไทย](https://github.com/PyThaiNLP/pythainlp/blob/dev/README_TH.md)*
 
 Designed to be a Thai-focused counterpart to [NLTK](https://www.nltk.org/),
 **PyThaiNLP** provides standard tools for linguistic analysis under
```
```diff
@@ -186,9 +186,12 @@ with this BibTeX entry:
 ------
 
 <div align="center">
-<strong>We have only one official repository at https://github.com/PyThaiNLP/pythainlp and another mirror at https://gitlab.com/pythainlp/pythainlp</strong>
+<strong>We have only one official repository at
+https://github.com/PyThaiNLP/pythainlp and another mirror at
+https://gitlab.com/pythainlp/pythainlp</strong>
 </div>
 
 <div align="center">
-<strong>Beware of malware if you use code from mirrors other than the official two on GitHub and GitLab.</strong>
+<strong>Beware of malware if you use code from mirrors other than the
+official two on GitHub and GitLab.</strong>
 </div>
```

README_TH.md

Lines changed: 13 additions & 9 deletions
```diff
@@ -13,12 +13,12 @@
 [![Facebook](https://img.shields.io/badge/Facebook-0866FF?style=flat&logo=facebook&logoColor=white)](https://www.facebook.com/pythainlp/)
 [![Chat on Matrix](https://matrix.to/img/matrix-badge.svg)](https://matrix.to/#/#thainlp:matrix.org)
 
-[pythainlp.org](https://pythainlp.org/) |
-[วิธีใช้งาน](https://pythainlp.org/tutorials) |
-[ข้อมูลสัญญาอนุญาต](https://pythainlp.org/dev-docs/notes/license.html) |
-[ใบข้อมูลโมเดล](https://github.com/PyThaiNLP/pythainlp/wiki/Model-Cards) |
-[ใครใช้ PyThaiNLP บ้าง](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTHEWILD.md) |
-*[English](https://github.com/PyThaiNLP/pythainlp/blob/dev/README.md)*
+[pythainlp.org](https://pythainlp.org/)
+| [วิธีใช้งาน](https://pythainlp.org/tutorials)
+| [ข้อมูลสัญญาอนุญาต](https://pythainlp.org/dev-docs/notes/license.html)
+| [ใบข้อมูลโมเดล](https://github.com/PyThaiNLP/pythainlp/wiki/Model-Cards)
+| [ใครใช้ PyThaiNLP บ้าง](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTHEWILD.md)
+| *[English](https://github.com/PyThaiNLP/pythainlp/blob/dev/README.md)*
 
 **PyThaiNLP** ถูกออกแบบให้เป็นเครื่องมือมาตรฐานสำหรับการวิเคราะห์ภาษาศาสตร์ภาษาไทย
 ภายใต้สัญญาอนุญาต Apache-2.0 โดยข้อมูลและโมเดลอยู่ภายใต้ CC0-1.0 และ CC-BY-4.0
```
```diff
@@ -100,7 +100,8 @@ pip install "pythainlp[extra1,extra2,...]"
 
 </details>
 
-สำหรับรายละเอียด dependencies สามารถดูได้ที่ส่วน `[project.optional-dependencies]` ใน
+สำหรับรายละเอียด dependencies
+สามารถดูได้ที่ส่วน `[project.optional-dependencies]` ใน
 [`pyproject.toml`](https://github.com/PyThaiNLP/pythainlp/blob/dev/pyproject.toml)
 
 ## ไดเรกทอรีข้อมูล
```
```diff
@@ -225,9 +226,12 @@ PyThaiNLP ดาวน์โหลดข้อมูล (ดูแค็ตต
 ------
 
 <div align="center">
-<strong>เรามีที่เก็บข้อมูลอย่างเป็นทางการที่เดียวที่ https://github.com/PyThaiNLP/pythainlp และมีที่เก็บสำเนาอีกแห่งที่ https://gitlab.com/pythainlp/pythainlp</strong>
+<strong>เรามีที่เก็บข้อมูลอย่างเป็นทางการที่เดียวที่
+https://github.com/PyThaiNLP/pythainlp และมีที่เก็บสำเนาอีกแห่งที่
+https://gitlab.com/pythainlp/pythainlp</strong>
 </div>
 
 <div align="center">
-<strong>โปรดระมัดระวังซอฟต์แวร์ประสงค์ร้ายหรือมัลแวร์ ถ้าคุณใช้โค้ดจากที่เก็บข้อมูลอื่นนอกเหนือจากที่ GitHub และ GitLab ข้างต้น</strong>
+<strong>โปรดระมัดระวังซอฟต์แวร์ประสงค์ร้ายหรือมัลแวร์
+ถ้าคุณใช้โค้ดจากที่เก็บข้อมูลอื่นนอกเหนือจากที่ GitHub และ GitLab ข้างต้น</strong>
 </div>
```

SECURITY.md

Lines changed: 6 additions & 2 deletions
```diff
@@ -20,7 +20,11 @@
 
 The following security improvements are planned for future releases:
 
-- Migrate from pickle to a safer serialization format such as JSON or MessagePack.
-- Upgrade the hashing algorithm for integrity verification from MD5 to SHA-256 or SHA-3.
+- Migrate from pickle to a safer serialization format such as JSON or
+[MessagePack][].
+- Upgrade the hashing algorithm for integrity verification from MD5 to SHA-256
+or SHA-3.
 - Implement digital signatures for corpus files to ensure authenticity.
 - Add version tracking to the corpus to prevent rollback attacks.
+
+[MessagePack]: https://msgpack.org/
```
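
The planned move from MD5 to SHA-256 for integrity verification, listed in the SECURITY.md hunk above, might look roughly like the sketch below. It is illustrative only and is not PyThaiNLP's code: `sha256_of_file` and `verify_file` are hypothetical helper names, and only the standard-library `hashlib` and `hmac` modules are assumed.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large corpus files are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a known value in constant time."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

A caller would compare each downloaded file against a digest published alongside it and refuse to load the file when `verify_file` returns `False`.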

build_tools/analysis/README.md

Lines changed: 32 additions & 19 deletions
```diff
@@ -90,10 +90,12 @@ easy analysis in spreadsheet applications or data analysis tools.
 - `output/functions_no_hints.csv` - Functions without any type hints
 - `output/functions_incomplete_hints.csv` - Functions with partial hints
 - `output/class_variables_no_hints.csv` - Class variables without type hints
-- `output/instance_variables_no_hints.csv` - Instance variables without type hints
+- `output/instance_variables_no_hints.csv` -
+Instance variables without type hints
 - `output/module_variables_no_hints.csv` - Module variables without type hints
 - `output/type_aliases.csv` - All type aliases defined in the codebase
-- `output/submodule_summary.csv` - Summary statistics by submodule with mypy errors
+- `output/submodule_summary.csv` -
+Summary statistics by submodule with mypy errors
 
 **CSV Schema:**
 
```
```diff
@@ -155,9 +157,9 @@ cat output/submodule_summary.csv
 **Example Output:**
 
 ```text
-================================================================================
+==============================================================================
 TYPE ANNOTATION COVERAGE ANALYSIS FOR PYTHAINLP
-================================================================================
+==============================================================================
 
 Repository root: /path/to/pythainlp
 Output directory: ./output
```
```diff
@@ -181,17 +183,17 @@ Analyzed 426 instance variables
 Analyzed 508 module variables
 Analyzed 0 type aliases
 
-================================================================================
+==============================================================================
 OVERALL STATISTICS - FUNCTIONS/METHODS
-================================================================================
+==============================================================================
 Total functions/methods: 720
 Complete type hints: 592 (82.22%)
 Incomplete type hints: 56 ( 7.78%)
 No type hints: 72 (10.00%)
 
-================================================================================
+==============================================================================
 OVERALL STATISTICS - VARIABLES
-================================================================================
+==============================================================================
 Total variables: 959
 Class variables: 25
 Instance variables: 426
```
```diff
@@ -202,19 +204,22 @@ No type hints: 909 (94.79%)
 
 ### Automated Analysis
 
-The repository includes a GitHub Actions workflow that automatically runs the type hint analyzer on every push to the `dev` branch:
+The repository includes a GitHub Actions workflow that automatically runs
+the type hint analyzer on every push to the `dev` branch:
 
 - **Workflow**: `.github/workflows/type-hint-analysis.yml`
 - **Trigger**: Push to `dev` branch
 - **Environment**: ubuntu-latest, Python 3.9
 - **Artifacts**: JSON and CSV files (30-day retention)
 - **Summary**: Displayed in GitHub Actions UI
 
-The workflow provides continuous monitoring of type hint coverage as the codebase evolves.
+The workflow provides continuous monitoring of type hint coverage
+as the codebase evolves.
 
 ### Type Completeness Standards
 
-This analyzer follows the type completeness guidelines from the Python typing documentation:
+This analyzer follows the type completeness guidelines from
+the Python typing documentation:
 <https://typing.python.org/en/latest/guides/libraries.html#type-completeness>
 
 The analysis covers:
```
```diff
@@ -241,13 +246,16 @@ a library is considered to have complete type hints when:
 
 **Type hint status:**
 
-- **Complete:** All parameters and return value have type hints (for functions), or variable has type annotation (for variables)
-- **Incomplete:** Some parameters or return value missing type hints (for functions only)
+- **Complete:** All parameters and return value have type hints
+(for functions), or variable has type annotation (for variables)
+- **Incomplete:** Some parameters or return value missing type hints
+(for functions only)
 - **None:** No type hints at all
 
 **Analyzed Elements:**
 
-- **Functions/Methods:** Function signatures including parameters and return types
+- **Functions/Methods:** Function signatures including parameters
+and return types
 - Excludes `self` and `cls` parameters from parameter counts
 - Considers both parameters and return type for completeness
 - Tracks decorator usage (e.g., `@staticmethod`, `@lru_cache`)
```
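
The Complete / Incomplete / None statuses described in the hunk above can be sketched with the standard-library `ast` module. This is a minimal illustration under stated assumptions, not the analyzer's actual implementation: `hint_status` is a hypothetical name, and skipping `self` and `cls` by parameter name is a simplification.

```python
import ast
from typing import Union

FuncNode = Union[ast.FunctionDef, ast.AsyncFunctionDef]


def hint_status(func: FuncNode) -> str:
    """Classify a function as 'complete', 'incomplete', or 'none'."""
    args = func.args
    params = args.posonlyargs + args.args + args.kwonlyargs
    if args.vararg:
        params.append(args.vararg)
    if args.kwarg:
        params.append(args.kwarg)
    # Leave self/cls out of the count (matched by name here for brevity).
    params = [p for p in params if p.arg not in ("self", "cls")]

    # One slot per parameter, plus one for the return annotation.
    slots = [p.annotation is not None for p in params]
    slots.append(func.returns is not None)
    if all(slots):
        return "complete"
    if any(slots):
        return "incomplete"
    return "none"


source = '''
def full(a: int, b: str) -> bool: ...
def partial(a: int, b): ...
def bare(a, b): ...
class C:
    def method(self, x: int) -> None: ...
'''

results = {
    node.name: hint_status(node)
    for node in ast.walk(ast.parse(source))
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
}
```

Treating the return annotation as one more slot next to the parameters keeps the three-way classification to a single `all` / `any` check.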
```diff
@@ -361,7 +369,8 @@ The analyzer codebase maintains high documentation standards:
 
 **Key Design decisions:**
 
-1. **AST-based Analysis**: Uses Python's `ast` module rather than runtime inspection
+1. **AST-based Analysis**: Uses Python's `ast` module rather than runtime
+inspection
 - Pros: No need to import/execute code, faster, safer
 - Cons: Cannot detect dynamically generated code
 
```
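
The trade-off named in this hunk, no code execution but no view of dynamically generated code, can be seen in a small sketch. The module source below is made up for illustration; only the standard-library `ast` module is assumed.

```python
import ast

# A made-up module: it would fail on import, and it creates one function
# dynamically at runtime.
source = '''
raise RuntimeError("never runs during static analysis")

def static_func(x: int) -> int:
    return x

globals()["dynamic_func"] = lambda x: x
'''

# Parsing builds a syntax tree without executing the module body,
# so the RuntimeError above is never raised.
tree = ast.parse(source)

# Only functions written as `def` statements appear as FunctionDef nodes;
# the runtime-created `dynamic_func` is not found.
found = [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
```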

```diff
@@ -396,7 +405,8 @@ The analyzer codebase maintains high documentation standards:
 2. **More Accurate Reference Counting**
 - Use AST-based import analysis instead of text search
 - Track actual usage vs. string mentions
-- Distinguish between different types of references (call, attribute access, etc.)
+- Distinguish between different types of references
+(call, attribute access, etc.)
 
 3. **Additional Metrics**
 - Generic type parameterization completeness
```
```diff
@@ -447,9 +457,12 @@ type annotations should be added to:
 
 ### What should not be annotated
 
-1. **Reassignments** - Adding type annotations to reassignments causes `no-redef` errors
-2. **Dictionary subscript operations** - Cannot annotate `dict[key] = value` operations
-3. **Variables with obvious literal types** - Optional, but generally omitted for simple cases
+1. **Reassignments** -
+Adding type annotations to reassignments causes `no-redef` errors
+2. **Dictionary subscript operations** -
+Cannot annotate `dict[key] = value` operations
+3. **Variables with obvious literal types** -
+Optional, but generally omitted for simple cases
 
 ## Future Tools
 
```
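
The three "What should not be annotated" rules in the hunk above could be applied as in the sketch below. The variable names are made up for illustration; the `no-redef` behaviour is the mypy error that the README itself mentions.

```python
from typing import Dict

# Annotate a variable where it is first assigned.
count: int = 0

# Reassignment: leave the annotation off. Repeating `count: int = ...`
# here is what the README says triggers a `no-redef` error.
count = count + 1

# The dictionary carries the annotation...
scores: Dict[str, int] = {}
# ...while `dict[key] = value` assignments are left unannotated.
scores["thai"] = 1

# Obvious literal type: an annotation is optional and usually omitted.
name = "pythainlp"
```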

tokenization-benchmark.md

Lines changed: 4 additions & 2 deletions
```diff
@@ -3,7 +3,8 @@
 **Note: This benchmark framework is obsolete and no longer actively maintained.**
 
 A framework for benchmarking tokenization algorithms for Thai.
-It provides a command-line interface that allows users to conveniently execute the benchmarks,
+It provides a command-line interface that allows users to
+conveniently execute the benchmarks,
 as well as a module interface for use in development pipelines.
 
 ## Metrics
```
```diff
@@ -107,7 +108,8 @@ pip install "pythainlp[benchmarks]"
 
 ## Related Work
 
-- [Thai Tokenizers Docker][docker]: collection of Docker containers of pre-built Thai tokenizers.
+- [Thai Tokenizers Docker][docker]: collection of Docker containers of
+pre-built Thai tokenizers.
 
 ## Development
 
```
