PyThaiNLP / pythainlp / 11626163864

01 Nov 2024 07:49AM UTC coverage: 14.17% (+14.2%) from 0.0%
Pull #952

github

web-flow
Merge 8f2551bc9 into 89ea62ebc
Pull Request #952: Specify a limited test suite

44 of 80 new or added lines in 48 files covered. (55.0%)

1048 of 7396 relevant lines covered (14.17%)

0.14 hits per line

Source File
/pythainlp/tokenize/tltk.py (file coverage: 0.0%)
# -*- coding: utf-8 -*-
# SPDX-FileCopyrightText: 2016-2024 PyThaiNLP Project
# SPDX-License-Identifier: Apache-2.0
from typing import List

try:
    from tltk.nlp import syl_segment
    from tltk.nlp import word_segment as tltk_segment
except ImportError:
    raise ImportError("tltk not found. Install it with: pip install tltk")


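def segment(text: str) -> List[str]:
    """Split text into words using tltk's word segmenter."""
    if not text or not isinstance(text, str):
        return []
    # Encode spaces as <u/> before segmenting, then restore them afterwards.
    text = text.replace(" ", "<u/>")
    # Drop the <s/> markers; the remaining "|" characters delimit words.
    _temp = tltk_segment(text).replace("<u/>", " ").replace("<s/>", "")
    _temp = _temp.split("|")
    # A trailing "|" leaves an empty last element; remove it.
    if _temp[-1] == "":
        del _temp[-1]
    return _temp

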
def syllable_tokenize(text: str) -> List[str]:
    """Split text into syllables using tltk's syllable segmenter."""
    if not text or not isinstance(text, str):
        return []
    _temp = syl_segment(text)
    # Syllables are delimited by "~".
    _temp = _temp.split("~")
    # Remove a trailing <s/> marker if it ends up as the last element.
    if _temp[-1] == "<s/>":
        del _temp[-1]
    return _temp


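def sent_tokenize(text: str) -> List[str]:
    """Split text into sentences using tltk's word segmenter."""
    # Note: unlike segment(), there is no guard for empty or non-string input.
    # Encode spaces as <u/> before segmenting, then restore them afterwards.
    text = text.replace(" ", "<u/>")
    # Drop the "|" word delimiters; the <s/> markers delimit sentences.
    _temp = tltk_segment(text).replace("<u/>", " ").replace("|", "")
    _temp = _temp.split("<s/>")
    # A trailing <s/> leaves an empty last element; remove it.
    if _temp[-1] == "":
        del _temp[-1]
    return _temp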
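
The post-processing in segment() and sent_tokenize() can be tried without installing tltk. The sketch below is only an illustration: the helper names split_words and split_sentences are hypothetical, and the raw string is made up to match the delimiter format the code above assumes ("|" between words, "<s/>" after each sentence, "<u/>" for spaces). It is not real tltk output.

```python
from typing import List


def split_words(raw: str) -> List[str]:
    """Mirror segment()'s post-processing on an already segmented string."""
    # Restore spaces, drop sentence markers, then split on the word delimiter.
    tokens = raw.replace("<u/>", " ").replace("<s/>", "").split("|")
    # A trailing "|" leaves an empty last element; remove it.
    if tokens and tokens[-1] == "":
        tokens.pop()
    return tokens


def split_sentences(raw: str) -> List[str]:
    """Mirror sent_tokenize()'s post-processing on the same kind of string."""
    # Restore spaces, drop word delimiters, then split on the sentence marker.
    sentences = raw.replace("<u/>", " ").replace("|", "").split("<s/>")
    # A trailing "<s/>" leaves an empty last element; remove it.
    if sentences and sentences[-1] == "":
        sentences.pop()
    return sentences


# Hypothetical segmenter output, invented for illustration only.
raw = "hello|<u/>|world|<s/>good|<u/>|bye|<s/>"

print(split_words(raw))      # ['hello', ' ', 'world', 'good', ' ', 'bye']
print(split_sentences(raw))  # ['hello world', 'good bye']
```

Spaces come back as tokens of their own in the word list, because each "<u/>" sits between two "|" delimiters in this assumed format.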