PyThaiNLP / pythainlp / 5337431273

pending completion
5337431273

push

github

wannaphong
Add กาลพฤกษ์ to list words

3573 of 6329 relevant lines covered (56.45%)

0.56 hits per line

Source File
35.0
/pythainlp/parse/core.py
# -*- coding: utf-8 -*-
# Copyright (C) 2016-2023 PyThaiNLP Project
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import List, Union


_tagger = None
_tagger_name = ""


def dependency_parsing(
    text: str, model: str = None, tag: str = "str", engine: str = "esupar"
) -> Union[List[List[str]], str]:
    """
    Dependency Parsing

    :param str text: text to parse
    :param str model: model to use with the engine \
        (for esupar, transformers_ud, and ud_goeswith)
    :param str tag: output type (str or list)
    :param str engine: the name of the dependency parser
    :return: str (CoNLL-U) or List
    :rtype: Union[List[List[str]], str]

    **Options for engine**
        * *esupar* (default) - Tokenizer, POS-tagger, and dependency-parser \
            with BERT/RoBERTa/DeBERTa models. `GitHub \
                <https://github.com/KoichiYasuoka/esupar>`_
        * *spacy_thai* - Tokenizer, POS-tagger, and dependency-parser \
            for Thai language, working on Universal Dependencies. \
            `GitHub <https://github.com/KoichiYasuoka/spacy-thai>`_
        * *transformers_ud* - TransformersUD \
            `GitHub <https://github.com/KoichiYasuoka/>`_
        * *ud_goeswith* - POS-tagging and dependency-parsing \
            using `goeswith` for subwords

    **Options for model (esupar engine)**
        * *th* (default) - KoichiYasuoka/roberta-base-thai-spm-upos model \
            `Huggingface \
            <https://huggingface.co/KoichiYasuoka/roberta-base-thai-spm-upos>`_
        * *KoichiYasuoka/deberta-base-thai-upos* - DeBERTa(V2) model \
            pre-trained on Thai Wikipedia texts for POS-tagging and \
            dependency-parsing `Huggingface \
            <https://huggingface.co/KoichiYasuoka/deberta-base-thai-upos>`_
        * *KoichiYasuoka/roberta-base-thai-syllable-upos* - RoBERTa model \
            pre-trained on Thai Wikipedia texts for POS-tagging and \
            dependency-parsing. (syllable level) `Huggingface \
            <https://huggingface.co/KoichiYasuoka/roberta-base-thai-syllable-upos>`_
        * *KoichiYasuoka/roberta-base-thai-char-upos* - RoBERTa model \
            pre-trained on Thai Wikipedia texts for POS-tagging \
            and dependency-parsing. (char level) `Huggingface \
            <https://huggingface.co/KoichiYasuoka/roberta-base-thai-char-upos>`_

    To train a model for esupar, see \
    `GitHub <https://github.com/KoichiYasuoka/esupar>`_

    **Options for model (transformers_ud engine)**
        * *KoichiYasuoka/deberta-base-thai-ud-head* (default) - \
            DeBERTa(V2) model pre-trained on Thai Wikipedia texts \
            for dependency-parsing (head-detection on Universal \
            Dependencies) as question-answering, derived from \
            deberta-base-thai and \
            trained on th_blackboard.conll. `Huggingface \
            <https://huggingface.co/KoichiYasuoka/deberta-base-thai-ud-head>`_
        * *KoichiYasuoka/roberta-base-thai-spm-ud-head* - \
            RoBERTa model pre-trained on Thai Wikipedia texts \
            for dependency-parsing. `Huggingface \
            <https://huggingface.co/KoichiYasuoka/roberta-base-thai-spm-ud-head>`_

    **Options for model (ud_goeswith engine)**
        * *KoichiYasuoka/deberta-base-thai-ud-goeswith* (default) - \
            DeBERTa(V2) model pre-trained on Thai Wikipedia \
            texts for POS-tagging and dependency-parsing (using goeswith for subwords) \
            `Huggingface <https://huggingface.co/KoichiYasuoka/deberta-base-thai-ud-goeswith>`_

    :Example:
    ::

        from pythainlp.parse import dependency_parsing

        print(dependency_parsing("ผมเป็นคนดี", engine="esupar"))
        # output:
        # 1       ผม      _       PRON    _       _       3       nsubj   _       SpaceAfter=No
        # 2       เป็น     _       VERB    _       _       3       cop     _       SpaceAfter=No
        # 3       คน      _       NOUN    _       _       0       root    _       SpaceAfter=No
        # 4       ดี       _       VERB    _       _       3       acl     _       SpaceAfter=No

        print(dependency_parsing("ผมเป็นคนดี", engine="spacy_thai"))
        # output:
        # 1       ผม              PRON    PPRS    _       2       nsubj   _       SpaceAfter=No
        # 2       เป็น             VERB    VSTA    _       0       ROOT    _       SpaceAfter=No
        # 3       คนดี             NOUN    NCMN    _       2       obj     _       SpaceAfter=No
    """
    global _tagger, _tagger_name
    # Import and build a parser lazily, and only when the requested
    # engine differs from the one that is already loaded.
    if _tagger_name != engine:
        if engine == "esupar":
            from pythainlp.parse.esupar_engine import Parse

            _tagger = Parse(model=model)
        elif engine == "transformers_ud":
            from pythainlp.parse.transformers_ud import Parse

            _tagger = Parse(model=model)
        elif engine == "spacy_thai":
            from pythainlp.parse.spacy_thai_engine import Parse

            _tagger = Parse()
        elif engine == "ud_goeswith":
            from pythainlp.parse.ud_goeswith import Parse

            _tagger = Parse(model=model)
        else:
            raise NotImplementedError("The engine isn't supported.")
    # Remember which engine is loaded so that later calls can reuse it.
    _tagger_name = engine
    return _tagger(text, tag=tag)
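
The function above rebuilds its parser only when `engine` changes, so a call that keeps the same engine but passes a different `model` gets the previously loaded parser back. The following is a minimal sketch of one possible alternative, assuming a cache keyed on the `(engine, model)` pair. `_FakeParse`, `_LOADERS`, `_cache`, and `get_parser` are hypothetical names used for illustration only; they are not part of PyThaiNLP, and the stub class stands in for the real engine `Parse` classes so the sketch runs without any models installed.

```python
from typing import Callable, Dict, Optional, Tuple


class _FakeParse:
    """Hypothetical stand-in for an engine's Parse class (not PyThaiNLP code)."""

    def __init__(self, engine: str, model: Optional[str] = None) -> None:
        self.engine = engine
        self.model = model

    def __call__(self, text: str, tag: str = "str") -> str:
        # A real engine would return CoNLL-U text or a list of rows here.
        return f"{self.engine}:{self.model}:{text}"


# Hypothetical registry mapping an engine name to a factory for its parser.
_LOADERS: Dict[str, Callable[[Optional[str]], _FakeParse]] = {
    "esupar": lambda model: _FakeParse("esupar", model),
    # Mirrors the spacy_thai branch above, which ignores the model argument.
    "spacy_thai": lambda model: _FakeParse("spacy_thai"),
}

# Loaded parsers, keyed on (engine, model) so that a model change is noticed.
_cache: Dict[Tuple[str, Optional[str]], _FakeParse] = {}


def get_parser(engine: str = "esupar", model: Optional[str] = None) -> _FakeParse:
    """Return a cached parser for (engine, model), building it on first use."""
    if engine not in _LOADERS:
        raise NotImplementedError(f"The engine {engine!r} is not supported.")
    key = (engine, model)
    if key not in _cache:
        _cache[key] = _LOADERS[engine](model)
    return _cache[key]


first = get_parser("esupar", "th")
again = get_parser("esupar", "th")
other = get_parser("esupar", "KoichiYasuoka/deberta-base-thai-upos")
print(first is again)  # True: same engine and model, so the parser is reused
print(first is other)  # False: same engine, different model, so a new one is built
```

Keying on the pair keeps one loaded parser per distinct `model` value, which may cost noticeable memory with large transformer checkpoints; a single-slot cache that compares both `engine` and `model` would be a lighter variant of the same idea.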