manoss96 / pregex / 4356417821

pending completion
4356417821

push

github

manoss96
add logo and funding

1625 of 1625 relevant lines covered (100.0%)

3.0 hits per line

Source File
100.0
/src/pregex/core/classes.py
__doc__ = """
All classes within this module represent the so-called RegEx *character classes*,
which can be used in order to define a set or "class" of characters that can be matched.

Class types
-------------------------------------------
A character class can be either one of the following two types:

        1. **Regular class**: This type of class represents the `[...]` pattern, 
           which can be translated as "match every character defined within the
           brackets". You can tell regular classes by their name, which follows
           the `Any*` pattern.


        2. **Negated class**: This type of class represents the `[^...]` pattern, 
           which can be translated as "match every character except for those 
           defined within the brackets". You can tell negated classes by their name, 
           which follows the `AnyBut*` pattern.

Here is an example containing a regular class as well as its negated counterpart.

.. code-block:: python
   
   from pregex.core.classes import AnyLetter, AnyButLetter

   regular = AnyLetter()
   negated = AnyButLetter()

   regular.print_pattern() # This prints "[A-Za-z]"
   negated.print_pattern() # This prints "[^A-Za-z]"
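
To see what these two patterns mean at the regex level, the following minimal sketch uses only Python's standard ``re`` module rather than pregex itself. It checks that every ASCII character is matched by exactly one of ``[A-Za-z]`` and ``[^A-Za-z]``. The helper name ``matched_chars`` is made up for this illustration and is not part of pregex.

```python
import re


def matched_chars(pattern: str) -> set[str]:
    """Return the set of ASCII characters that `pattern` matches (hypothetical helper)."""
    compiled = re.compile(pattern)
    return {chr(i) for i in range(128) if compiled.fullmatch(chr(i))}


regular = matched_chars(r"[A-Za-z]")
negated = matched_chars(r"[^A-Za-z]")

# The two classes never overlap, and together they cover all 128 ASCII characters.
print(regular.isdisjoint(negated))   # True
print(len(regular), len(negated))    # 52 76
```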

Class unions
-------------------------------------------
Classes of the same type can be combined together in order to get the union of
the sets of characters they represent. This can be easily done through the use 
of the bitwise OR operator ``|``, as depicted within the code snippet below:

.. code-block:: python
   
   from pregex.core.classes import AnyDigit, AnyLowercaseLetter

   pre = AnyDigit() | AnyLowercaseLetter()
   pre.print_pattern() # This prints "[\da-z]"

The same goes for negated classes as well:

.. code-block:: python

   from pregex.core.classes import AnyButDigit, AnyButLowercaseLetter

   pre = AnyButDigit() | AnyButLowercaseLetter()
   pre.print_pattern() # This prints "[^\da-z]"
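
The printed patterns can be checked with plain ``re``, without pregex. The sketch below (again using a made-up ``matched_chars`` helper, restricted to ASCII) shows that ``[\da-z]`` matches exactly the characters matched by either operand. It also shows that, for negated classes, the union is taken over the characters listed inside the brackets, so ``[^\da-z]`` matches whatever is neither a digit nor a lowercase letter.

```python
import re


def matched_chars(pattern: str) -> set[str]:
    """Return the set of ASCII characters that `pattern` matches (hypothetical helper)."""
    compiled = re.compile(pattern)
    return {chr(i) for i in range(128) if compiled.fullmatch(chr(i))}


everything = {chr(i) for i in range(128)}
digits = matched_chars(r"\d")
lowercase = matched_chars(r"[a-z]")

# Regular classes: the union matches what either operand matches.
union = matched_chars(r"[\da-z]")
print(union == digits | lowercase)            # True

# Negated classes: the bracket contents are unioned, then excluded.
negated_union = matched_chars(r"[^\da-z]")
print(negated_union == everything - union)    # True
```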

However, attempting to get the union of a regular class and a negated class
causes a ``CannotBeUnionedException`` to be thrown.

.. code-block:: python

   from pregex.core.classes import AnyDigit, AnyButLowercaseLetter

   pre = AnyDigit() | AnyButLowercaseLetter() # This is not OK!

Lastly, it is also possible to union a regular class with a token, that is,
any string of length one or any instance of a class that is part of the
:py:mod:`pregex.core.tokens` module:

.. code-block:: python

   from pregex.core.classes import AnyDigit
   from pregex.core.tokens import Newline

   pre = AnyDigit() | "a" | Newline()
   
   pre.print_pattern() # This prints "[\da\\n]"

However, in the case of negated classes one is forced to wrap any tokens
within an :class:`AnyButFrom` class instance in order to achieve the same
result:

.. code-block:: python

   from pregex.core.classes import AnyButDigit, AnyButFrom
   from pregex.core.tokens import Newline

   pre = AnyButDigit() | AnyButFrom("a", Newline())
   
   pre.print_pattern() # This prints "[^\da\\n]"

Subtracting classes
-------------------------------------------
Subtraction is another operation that is exclusive to classes and it is made possible
via the overloaded subtraction operator ``-``. This feature comes in handy when one
wishes to construct a class that would be tiresome to create otherwise. Consider
for example the class of all word characters except for all characters in the
set `{C, c, G, g, 3}`. Constructing said class via subtraction
is extremely easy:

.. code-block:: python

   from pregex.core.classes import AnyWordChar, AnyFrom

   pre = AnyWordChar() - AnyFrom('C', 'c', 'G', 'g', '3')

Below we are able to see this operation's resulting pattern, from which it
becomes evident that building said pattern through multiple class unions would
be more time consuming, and more importantly, prone to errors.

.. code-block:: python

        [A-BD-FH-Za-bd-fh-z0-24-9_]
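
As a sanity check on the pattern above, the following sketch uses only the standard ``re`` module and assumes ASCII-only word characters (``re.ASCII``). It compares the characters matched by the resulting pattern against the ASCII word characters with ``C``, ``c``, ``G``, ``g`` and ``3`` removed.

```python
import re

ascii_chars = [chr(i) for i in range(128)]

# ASCII word characters, i.e. [A-Za-z0-9_], minus the excluded set.
word = re.compile(r"\w", re.ASCII)
expected = {c for c in ascii_chars if word.fullmatch(c)} - set("CcGg3")

# The pattern produced by the subtraction shown above.
result = re.compile(r"[A-BD-FH-Za-bd-fh-z0-24-9_]")
actual = {c for c in ascii_chars if result.fullmatch(c)}

print(actual == expected)   # True
print(len(actual))          # 58
```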

It should be noted that just like in the case of class unions, one is only 
allowed to subtract a regular class from a regular class or a negated class
from a negated class, as any other attempt will cause a 
``CannotBeSubtractedException`` to be thrown.

.. code-block:: python

   from pregex.core.classes import AnyWordChar, AnyButLowercaseLetter

   pre = AnyWordChar() - AnyButLowercaseLetter() # This is not OK!

Furthermore, applying the subtraction operation between a class and a token
is also possible, but just like in the case of class unions, this only works
with regular classes:

.. code-block:: python

   from pregex.core.classes import AnyWhitespace
   from pregex.core.tokens import Newline

   pre = AnyWhitespace() - Newline()
   
   pre.print_pattern() # This prints "[\\t \\x0b-\\r]"
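
The printed pattern can be verified independently of pregex. This sketch assumes that "whitespace" means the characters in Python's ``string.whitespace`` (the module imports that constant as ``_whitespace``) and checks that ``[\t \x0b-\r]`` matches exactly those characters minus the newline.

```python
import re
from string import whitespace

# The pattern printed above, written as a raw string for the `re` module.
pattern = re.compile(r"[\t \x0b-\r]")

expected = set(whitespace) - {"\n"}
actual = {chr(i) for i in range(128) if pattern.fullmatch(chr(i))}

print(actual == expected)            # True
print(pattern.fullmatch("\n"))       # None
```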

Negating classes
-------------------------------------------
Finally, it is useful to know that every regular class can be negated through
the use of the bitwise NOT operator ``~``:

.. code-block:: python

   from pregex.core.classes import AnyDigit

   pre = ~ AnyDigit()
   pre.print_pattern() # This prints "[^0-9]"

Negated classes can be negated as well, however you should probably avoid
this as it doesn't help much in making the code any easier to read.

.. code-block:: python

   from pregex.core.classes import AnyButDigit

   pre = ~ AnyButDigit()
   pre.print_pattern() # This prints "[0-9]"

Therefore, in order to create a negated class one can either negate a regular `Any*`
class or use its `AnyBut*` negated class equivalent. The result is entirely the same
and which one you'll use is just a matter of choice.
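
At the pattern level, negation amounts to toggling the leading ``^`` inside the brackets. The sketch below illustrates that idea with a hypothetical ``negate`` helper that works on plain bracketed strings. It is not pregex's implementation and assumes a well-formed pattern of the form ``[...]`` or ``[^...]``.

```python
import re


def negate(pattern: str) -> str:
    """Toggle the leading '^' of a bracketed class pattern (hypothetical helper)."""
    body = pattern[1:-1]          # drop the surrounding brackets
    if body.startswith("^"):      # negated class -> regular class
        return f"[{body[1:]}]"
    return f"[^{body}]"           # regular class -> negated class


print(negate("[0-9]"))            # [^0-9]
print(negate(negate("[0-9]")))    # [0-9]

# A character is matched by exactly one of a class and its negation.
print(bool(re.fullmatch(negate("[0-9]"), "7")))   # False
print(bool(re.fullmatch(negate("[0-9]"), "x")))   # True
```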

Classes & methods
-------------------------------------------

Below are listed all classes within :py:mod:`pregex.core.classes`
along with any possible methods they may possess.
"""


import re as _re
import pregex.core.pre as _pre
import pregex.core.exceptions as _ex
from string import whitespace as _whitespace


class __Class(_pre.Pregex):
    '''
    Constitutes the base class for every class within "classes.py".

    Operations defined for classes:
        - Union:
            AnyLowercaseLetter() | AnyDigit() => [a-z0-9]
        - Subtraction:
            AnyLetter() - AnyLowercaseLetter() => [A-Z]
        - Negation:
            ~ AnyLowercaseLetter() => [^a-z]

    :param str pattern: The RegEx class pattern.
    :param bool is_negated: Indicates whether this class instance represents \
        a regular or a negated class.
    :param bool simplify_word: In case class instance represents the `word` \
        character class, this parameter indicates whether this character class \
        should be simplified to `\\w` or not.

    :note: Union and subtraction can only be performed on a pair of classes of the same type, \
        that is, either a pair of regular classes or a pair of negated classes.
    '''


    '''
    A set containing characters that must be escaped when used within a class.
    '''
    _to_escape = ('\\', '^', '[', ']', '-', '/')


    def __init__(self, pattern: str, is_negated: bool, simplify_word: bool = False) -> '__Class':
        '''
        Constitutes the base class for every class within "classes.py".

        Operations defined for classes:
            - Union:
                AnyLowercaseLetter() | AnyDigit() => [a-z0-9]
            - Subtraction:
                AnyLetter() - AnyLowercaseLetter() => [A-Z]
            - Negation:
                ~ AnyLowercaseLetter() => [^a-z]

        :param str pattern: The RegEx class pattern.
        :param bool is_negated: Indicates whether this class instance represents \
            a regular or a negated class.
        :param bool simplify_word: In case class instance represents the `word` \
            character class, this parameter indicates whether this character class \
            should be simplified to `\\w` or not.

        :note: Union and subtraction can only be performed on a pair of classes of the same type, \
            that is, either a pair of regular classes or a pair of negated classes.
        '''
        self.__is_negated = is_negated
        self.__verbose, pattern = __class__.__process(pattern, is_negated, simplify_word)
        super().__init__(pattern, escape=False)


    def _get_verbose_pattern(self) -> str:
        '''
        Returns a verbose representation of this class's pattern.
        '''
        return self.__verbose


    @staticmethod
    def __process(pattern: str, is_negated: bool, simplify_word: bool) -> tuple[str, str]:
        '''
        Performs some modifications to the provided pattern and returns \
        it in both a verbose and a simplified form.

        :param str pattern: The pattern that is to be processed.
        :param bool is_negated: Determines whether the pattern \
            belongs to a negated class or a regular one.
        :param bool simplify_word: Indicates whether `[A-Za-z0-9_]` \
            should be simplified to `[\\w]` or not.
        '''
        if pattern == '.':
            return pattern, pattern
        # Separate ranges from chars.
        ranges, chars = __class__.__extract_classes(pattern)
        # Reduce chars to any possible ranges.
        ranges, chars = __class__.__chars_to_ranges(ranges, chars)
        # Combine classes back together.
        verbose_classes = ranges.union(chars)
        verbose_pattern = ''.join(f"[{'^' if is_negated else ''}{''.join(verbose_classes)}]")
        # Use shorthand notation for any classes that support this.
        simplified_classes = __class__.__verbose_to_shorthand(verbose_classes, simplify_word)
        simplified_pattern = ''.join(f"[{'^' if is_negated else ''}{''.join(simplified_classes)}]")
        # Replace any one-character classes with a single (possibly escaped) character
        simplified_pattern = _re.sub(r"\[([^\\]|\\.)\]", lambda m: str(__class__._to_pregex(m.group(1))) \
            if len(m.group(1)) == 1 else m.group(1), simplified_pattern)
        # Replace negated class shorthand-notation characters with their non-class shorthand-notation.
        return verbose_pattern, _re.sub(r"\[\^(\\w|\\d|\\s)\]", lambda m: m.group(1).upper(), simplified_pattern)


    @staticmethod
    def __chars_to_ranges(ranges: set[str], chars: set[str]) -> tuple[set[str], set[str]]:
        '''
        Checks whether the provided characters can be incorporated within ranges.
        Returns the newly constructed ranges and characters as sets.

        :param set[str] ranges: A set containing all ranges.
        :param set[str] chars: A set containing all characters.
        '''
        # 1. Un-escape any escaped characters and convert to list.
        chars: list[str] = list(__class__.__modify_classes(chars, escape=False))
        ranges: list[list[str]] = list(__class__.__split_range(rng) for rng in 
            __class__.__modify_classes(ranges, escape=False))

        # 2. Check whether ranges can be constructed from chars.
        i = 0
        while i < len(chars):
            for j in range(len(chars)):
                if i != j and len(chars[j]) == 1:
                    c_i, c_j = chars[i], chars[j]
                    if len(chars[i]) > 1:
                        start, end = c_i[0], c_i[-1]
                    else:
                        start, end = c_i, c_i
                    if ord(start) == ord(c_j) + 1:
                        chars[i] = c_j + end
                        chars.pop(j)
                        i = -1
                        break
                    elif ord(end) == ord(c_j) - 1:
                        chars[i] = start + c_j
                        chars.pop(j)
                        i = -1
                        break
            i += 1

        # Check whether these character-ranges can be incorporated into
        # any existing ranges. If two characters are next to each other
        # then keep them as characters.
        ranges_set = set(f"{rng[0]}-{rng[1]}" for rng in ranges)
        chars_set = set()
        for c in chars:
            if len(c) == 1:
                chars_set.add(c)
            else:
                if ord(c[1]) == ord(c[0]) + 1:
                    chars_set.add(c[0])
                    chars_set.add(c[1])
                else:
                    ranges_set.add(f"{c[0]}-{c[1]}")

        ranges = __class__.__modify_classes(ranges_set, escape=True)
        chars = __class__.__modify_classes(chars_set, escape=True)

        return ranges, chars


    @staticmethod
    def __verbose_to_shorthand(classes: set[str], simplify_word: bool) -> set[str]:
        '''
        This method searches the provided set for subsets of character classes that \
        correspond to a shorthand-notation character class, and if it finds any, it \
        replaces them with said character class, returning the resulting set at the end.

        :param set[str] classes: The set containing the classes as strings.
        :param bool simplify_word: Indicates whether `[A-Za-z0-9_]` should be simplified \
            to `[\\w]` or not.
        '''

        word_set = {'a-z', 'A-Z', '0-9', '_'}
        digit_set = {'0-9'}
        whitespace_set = {' ', '\t-\r'}
        # Raw strings avoid invalid-escape warnings for the shorthand classes.
        if classes.issuperset(word_set) and simplify_word:
            classes = classes.difference(word_set).union({r'\w'})
        elif classes.issuperset(digit_set):
            classes = classes.difference(digit_set).union({r'\d'})
        if classes.issuperset(whitespace_set):
            classes = classes.difference(whitespace_set).union({r'\s'})
        return classes


    def __invert__(self) -> '__Class':
        '''
        If this instance is a regular class, then converts it to its negated counterpart. \
        If this instance is a negated class, then converts it to its regular counterpart.
        '''
        s, rs = '' if self.__is_negated else '^', '^' if self.__is_negated else ''
        return __class__(f"[{s}{self.__verbose.lstrip('[' + rs).rstrip(']')}]", not self.__is_negated)


    def __or__(self, pre: '__Class | str') -> '__Class':
        '''
        Returns a `__Class` instance representing the union of the provided classes.

        :param __Class | str pre: The class that is to be unioned with this instance.

        :raises CannotBeUnionedException: `pre` is neither a `__Class` instance nor a token.
        '''
        if not self.__is_negated:
            if isinstance(pre, str) and (len(pre) == 1):
                pre = AnyFrom(pre)
            elif isinstance(pre, _pre.Pregex) and pre._get_type() == _pre._Type.Token:
                pre = AnyFrom(pre)
        if not issubclass(pre.__class__, __class__):
            raise _ex.CannotBeUnionedException(pre, False)
        return __class__.__or(self, pre)


    def __ror__(self, pre: '__Class | str') -> '__Class':
        '''
        Returns a `__Class` instance representing the union of the provided classes.

        :param __Class | str pre: The class that is to be unioned with this instance.

        :raises CannotBeUnionedException: `pre` is neither a `__Class` instance nor a token.
        '''
        if not self.__is_negated:
            if isinstance(pre, str) and (len(pre) == 1):
                pre = AnyFrom(pre)
            elif isinstance(pre, _pre.Pregex) and pre._get_type() == _pre._Type.Token:
                pre = AnyFrom(pre)
        if not issubclass(pre.__class__, __class__):
            raise _ex.CannotBeUnionedException(pre, False)
        return __class__.__or(pre, self)


    def __or(pre1: '__Class', pre2: '__Class') -> '__Class':
        '''
        Returns a `__Class` instance representing the union of the provided classes.

        :param __Class pre1: The first class of the union.
        :param __Class pre2: The second class of the union.

        :raises CannotBeUnionedException: `pre1` is a different type of class than `pre2`.
        '''
        if  pre1.__is_negated != pre2.__is_negated:
            raise _ex.CannotBeUnionedException(pre2, True)
        if isinstance(pre1, Any) or isinstance(pre2, Any):
            return Any()

        simplify_word = False
        if isinstance(pre1, (AnyWordChar, AnyButWordChar)):
            simplify_word = simplify_word or pre1._is_global()
        if isinstance(pre2, (AnyWordChar, AnyButWordChar)):
            simplify_word = simplify_word or pre2._is_global()

        ranges1, chars1 = __class__.__extract_classes(pre1.__verbose, unescape=True)
        ranges2, chars2 = __class__.__extract_classes(pre2.__verbose, unescape=True)

        ranges, chars = ranges1.union(ranges2), chars1.union(chars2)

        def reduce_ranges(ranges: list[str]) -> set[str]:
            '''
            Removes any sub-ranges if they are already included within another specified range,
            and returns the set of all remaining ranges.

            :param list[str] ranges: A list containing all specified ranges.
            '''
            if len(ranges) < 2:
                return set(ranges)

            ranges = [__class__.__split_range(rng) for rng in ranges]

            i = 0
            while i < len(ranges):
                start_i, end_i = ranges[i]
                j = 0
                while j < len(ranges):
                    if i != j:
                        start_j, end_j = ranges[j]
                        if start_i <= start_j and ord(end_i) + 1 >= ord(start_j):
                            ranges[i] = start_i, max(end_i, end_j)
                            ranges.pop(j)
                            i = -1
                            break
                    j += 1
                i += 1

            return set(f"{rng[0]}-{rng[1]}" for rng in ranges)

        def reduce_chars(ranges: list[str], chars: list[str]):
            '''
            Removes any characters if those are already included within another specified range,
            and returns the set of all remaining characters.

            :param list[str] ranges: A list containing all specified ranges.
            :param list[str] chars: A list containing all specified chars.
            '''

            ranges = [__class__.__split_range(rng) for rng in ranges]

            i = 0
            while i < len(chars):
                for j in range(len(ranges)):
                    start, end = ranges[j]
                    if chars[i] >= start and chars[i] <= end:
                        chars.pop(i)
                        i = -1
                        break
                    elif ord(start) == ord(chars[i]) + 1:
                        ranges[j][0] = chars[i]
                        chars.pop(i)
                        i = -1
                        break
                    elif ord(end) == ord(chars[i]) - 1:
                        ranges[j][1] = chars[i]
                        chars.pop(i)
                        i = -1
                        break
                i += 1

            return set(f"{rng[0]}-{rng[1]}" for rng in ranges), set(chars)

        ranges = reduce_ranges(ranges)

        ranges, chars = reduce_chars(list(ranges), list(chars))

        result =  __class__.__modify_classes(ranges.union(chars), escape=True)

        return  __class__(
            f"[{'^' if pre1.__is_negated else ''}{''.join(result)}]",
            pre1.__is_negated, simplify_word)


    def __sub__(self, pre: '__Class | str') -> '__Class':
        '''
        Returns a `__Class` instance representing the difference of the provided classes.

        :param __Class | str pre: The class that is to be subtracted from this instance.

        :raises CannotBeSubtractedException: `pre` is neither a `__Class` instance nor a token.
        '''
        if not self.__is_negated:
            if isinstance(pre, str) and (len(pre) == 1):
                pre = AnyFrom(pre)
            elif isinstance(pre, _pre.Pregex) and pre._get_type() == _pre._Type.Token:
                pre = AnyFrom(pre)
        if not issubclass(pre.__class__, __class__):
            raise _ex.CannotBeSubtractedException(pre, False)
        return __class__.__sub(self, pre)


    def __rsub__(self, pre: '__Class | str') -> '__Class':
        '''
        Returns a `__Class` instance representing the difference of the provided classes.

        :param __Class | str pre: The class from which this instance is to be subtracted.

        :raises CannotBeSubtractedException: `pre` is neither a `__Class` instance nor a token.
        '''
        if not self.__is_negated:
            if isinstance(pre, str) and (len(pre) == 1):
                pre = AnyFrom(pre)
            elif isinstance(pre, _pre.Pregex) and pre._get_type() == _pre._Type.Token:
                pre = AnyFrom(pre)
        if not issubclass(pre.__class__, __class__):
            raise _ex.CannotBeSubtractedException(pre, False)
        return __class__.__sub(pre, self)


    def __sub(pre1: '__Class', pre2: '__Class') -> '__Class':
        '''
        Returns a `__Class` instance representing the difference of the provided classes.

        :param __Class pre1: The class that is subtracted from.
        :param __Class pre2: The class that is to be subtracted from `pre1`.

        :raises CannotBeSubtractedException: `pre1` is a different type of class than `pre2`.
        :raises EmptyClassException: `pre2` is an instance of class "Any".
        '''
        if  pre1.__is_negated != pre2.__is_negated:
            raise _ex.CannotBeSubtractedException(pre2, True)
        if isinstance(pre2, Any):
            raise _ex.EmptyClassException(pre1, pre2)
        if isinstance(pre1, Any):
            return ~ pre2
        if isinstance(pre1, (AnyWordChar, AnyButWordChar)) and pre1._is_global():
            raise _ex.GlobalWordCharSubtractionException(pre1)


        # Subtract any ranges found in both pre2 and pre1 from pre1.
        def subtract_ranges(ranges1: set[str], ranges2: set[str]) -> tuple[set[str], set[str]]:
            '''
            Subtracts any range found within `ranges2` from `ranges1` and returns \
            the resulting ranges/characters in two separate sets.

            :note: This method might also produce characters, for example \
                [A-Z] - [B-Z] produces the character 'A'.
            '''
            ranges1 = [__class__.__split_range(rng) for rng in ranges1]
            ranges2 = [__class__.__split_range(rng) for rng in ranges2]

            i = 0
            while i < len(ranges1):
                start_1, end_1 = ranges1[i]
                for start_2, end_2 in ranges2:
                    if start_1 <= end_2 and end_1 >= start_2:
                        if start_1 == start_2 and end_1 == end_2:
                            ranges1.pop(i)
                            i -= 1
                            break
                        split_rng = list()
                        if start_1 <= start_2 and end_1 <= end_2:
                            split_rng.append((start_1, chr(ord(start_2) - 1)))
                        elif start_1 >= start_2 and end_1 >= end_2:
                            split_rng.append((chr(ord(end_2) + 1), end_1))
                        else:
                            split_rng.append((start_1, chr(ord(start_2) - 1)))
                            split_rng.append((chr(ord(end_2) + 1), end_1))
                        if len(split_rng) > 0:
                            ranges1.pop(i)
                            i -= 1
                            ranges1 = ranges1 + split_rng
                            break
                i += 1

            ranges, chars = set(), set()
            for start, end in ranges1:
                if start == end:
                    chars.add(start)
                else:
                    if ord(end) == ord(start) + 1:
                        chars.add(start)
                        chars.add(end)
                    else:
                        ranges.add(f"{start}-{end}")

            return ranges, chars

        # 1. Extract classes while unescaping them
        ranges1, chars1 = __class__.__extract_classes(pre1.__verbose, unescape=True)
        ranges2, chars2 = __class__.__extract_classes(pre2.__verbose, unescape=True)

        # 2.a. Subtract ranges2 from chars1.
        splt_ranges2 = [__class__.__split_range(rng) for rng in ranges2]
        lst_chars1 = list(chars1)

        for start, end in splt_ranges2:
            i = 0
            while i < len(lst_chars1):
                c = lst_chars1[i]
                # The original condition also tested the uncalled `c.isalnum`,
                # which is always truthy, so only the range check is kept.
                if start <= c <= end:
                    lst_chars1.pop(i)
                    i = -1
                i += 1
        chars1 = set(lst_chars1)

        # 2.b Subtract chars2 from chars1.
        chars1 = chars1.difference(chars2)

        # 2.c. Subtract any characters in chars2 from ranges1.
        ranges1, reduced_chars = subtract_ranges(ranges1, set(f"{c}-{c}" for c in chars2))
        chars1 = chars1.union(reduced_chars)

        # 2.d. Subtract ranges2 from ranges1.
        ranges1, reduced_chars = subtract_ranges(ranges1, ranges2)
        chars1 = chars1.union(reduced_chars)

        # 3. Union ranges and chars together while escaping them.
        result = __class__.__modify_classes(ranges1.union(chars1), escape=True)

        if len(result) == 0:
            raise _ex.EmptyClassException(pre1, pre2)

        return  __class__(
            f"[{'^' if pre1.__is_negated else ''}{''.join(result)}]",
            pre1.__is_negated)


    @staticmethod
    def __extract_classes(pattern: str, unescape: bool = False) -> tuple[set[str], set[str]]:
        '''
        Extracts all classes from the provided class pattern and returns them \
        separated into two different sets based on whether they constitute a range \
        or an individual character.

        :param str pattern: The pattern from which classes are to be extracted.
        :param bool unescape: If `True` then unescapes all escaped characters. \
            Defaults to `False`.
        '''

        def get_start_index(pattern: str):
            if pattern.startswith('[^'):
                return 2
            else:
                return 1

        # Remove brackets etc from string.
        start_index = get_start_index(pattern)
        classes = pattern[start_index:-1]

        # Extract classes separately.
        ranges, chars = __class__.__separate_classes(classes)

        # Unescape any escaped characters if you must.
        if unescape:
            ranges = __class__.__modify_classes(ranges, escape=False)
            chars = __class__.__modify_classes(chars, escape=False)
        # Return classes.
        return ranges, chars


    @staticmethod
    def __separate_classes(classes: 'str') -> tuple[set[str], set[str]]:
        '''
        Extracts all classes from the provided character class pattern and \
        returns them separated into ranges and characters.

        :param str classes: One or more string character class patterns.
        '''
        range_pattern = \
            r"(?:\\(?:\[|\]|\^|\$|\-|\/|[a-z]|\\)|[^\[\]\^\$\-\/\\])" + \
            r"-(?:\\(?:\[|\]|\^|\$|\-|\/|[a-z]|\\)|[^\[\]\^\$\-\/\\])"
        ranges = set(_re.findall(range_pattern, classes))
        classes = _re.sub(pattern=range_pattern, repl="", string=classes)
        return (ranges, set(_re.findall(r"\\?.", classes, flags=_re.DOTALL)))

    
686
    @staticmethod
3✔
687
    def __modify_classes(classes: set[str], escape: bool) -> set[str]:
3✔
688
        '''
689
        Either escapes or unescapes any character within the provided classes that \
690
        needs to be escaped.
691

692
        :param bool escape: Determines whether to "escape" or "unescape" characters.
693
        '''
694

695
        def escape_char(c):
3✔
696
            return "\\" + c if c in __class__._to_escape else c
3✔
697
        def unescape_char(c):
3✔
698
            return c.replace("\\", "", 1) if len(c) > 1 and c[1] in __class__._to_escape else c
3✔
699

700
        fun = escape_char if escape else unescape_char
3✔
701
        modified_classes = set()
3✔
702

703
        for c in classes:
3✔
704
            if len(c) >= 3: # classes: [.-.], [\.-.], [\.-\.]
3✔
705
                start, end = tuple(map(fun, __class__.__split_range(c)))
3✔
706
                modified_c = start + "-" + end
3✔
707
            else: # characters: [.]
708
                modified_c = fun(c)
3✔
709
            modified_classes.add(modified_c)
3✔
710

711
        return modified_classes
3✔
712


    @staticmethod
    def __split_range(pattern: str) -> list[str]:
        '''
        Splits the provided range pattern and returns the result as a list \
        of length two where the first element is the range's beginning \
        and the second element is the range's end.

        :param str pattern: The pattern that is to be split.

        :note: The provided range pattern must be in the form of \
            \\?.-\\?. where "\\?" specifies the option to escape the \
            character within the range, and "." includes newlines.
        '''

        count = pattern.count("-")

        if count == 1:
            return pattern.split("-")
        if count == 2:
            return pattern.split("-", 1) if pattern[-1] == "-" else pattern.rsplit("-", 1)
        return ["-", "-"]
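
# The two helpers above lend themselves to a small standalone illustration.
# What follows is only a sketch, not the library's code: "separate_classes"
# and "split_range" are hypothetical module-level stand-ins for the private
# methods, and a literal hyphen endpoint is assumed to be written as "\-".

```python
import re

# Same element and range patterns as in "__separate_classes" above.
_ELEMENT = r"(?:\\(?:\[|\]|\^|\$|\-|\/|[a-z]|\\)|[^\[\]\^\$\-\/\\])"
_RANGE = _ELEMENT + "-" + _ELEMENT


def separate_classes(classes: str) -> tuple[set[str], set[str]]:
    """Split the inside of a character class into ranges and single characters."""
    ranges = set(re.findall(_RANGE, classes))
    # Whatever is left once the ranges are removed consists of single characters.
    rest = re.sub(_RANGE, "", classes)
    return ranges, set(re.findall(r"\\?.", rest, flags=re.DOTALL))


def split_range(pattern: str) -> list[str]:
    """Split a range such as "a-z" into its two endpoints."""
    count = pattern.count("-")
    if count == 1:
        return pattern.split("-")
    if count == 2:
        # One endpoint is an escaped hyphen, so split on the separator only.
        return pattern.split("-", 1) if pattern[-1] == "-" else pattern.rsplit("-", 1)
    return ["-", "-"]


# The inside of the class "[a-z0-9_\-]".
ranges, chars = separate_classes(r"a-z0-9_\-")
print(sorted(ranges), sorted(chars))
print(split_range("a-z"), split_range(r"\--a"), split_range(r"!-\-"))
```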


class Any(__Class):
    '''
    Matches any possible character, including the newline character.
    '''

    def __init__(self) -> 'Any':
        '''
        Matches any possible character, including the newline character.
        '''
        super().__init__('.', is_negated=False)

    def __invert__(self) -> None:
        '''
        Raises a "CannotBeNegatedException".
        '''
        raise _ex.CannotBeNegatedException()


class AnyLetter(__Class):
    '''
    Matches any character from the Latin alphabet.
    '''

    def __init__(self) -> 'AnyLetter':
        '''
        Matches any character from the Latin alphabet.
        '''
        super().__init__('[a-zA-Z]', is_negated=False)


class AnyButLetter(__Class):
    '''
    Matches any character except for characters in the Latin alphabet.
    '''

    def __init__(self) -> 'AnyButLetter':
        '''
        Matches any character except for characters in the Latin alphabet.
        '''
        super().__init__('[^a-zA-Z]', is_negated=True)


class AnyLowercaseLetter(__Class):
    '''
    Matches any lowercase character from the Latin alphabet.
    '''

    def __init__(self) -> 'AnyLowercaseLetter':
        '''
        Matches any lowercase character from the Latin alphabet.
        '''
        super().__init__('[a-z]', is_negated=False)


class AnyButLowercaseLetter(__Class):
    '''
    Matches any character except for lowercase characters in the Latin alphabet.
    '''

    def __init__(self) -> 'AnyButLowercaseLetter':
        '''
        Matches any character except for lowercase characters in the Latin alphabet.
        '''
        super().__init__('[^a-z]', is_negated=True)


class AnyUppercaseLetter(__Class):
    '''
    Matches any uppercase character from the Latin alphabet.
    '''

    def __init__(self) -> 'AnyUppercaseLetter':
        '''
        Matches any uppercase character from the Latin alphabet.
        '''
        super().__init__('[A-Z]', is_negated=False)


class AnyButUppercaseLetter(__Class):
    '''
    Matches any character except for uppercase characters in the Latin alphabet.
    '''

    def __init__(self) -> 'AnyButUppercaseLetter':
        '''
        Matches any character except for uppercase characters in the Latin alphabet.
        '''
        super().__init__('[^A-Z]', is_negated=True)


class AnyDigit(__Class):
    '''
    Matches any numeric character.
    '''

    def __init__(self) -> 'AnyDigit':
        '''
        Matches any numeric character.
        '''
        super().__init__('[0-9]', is_negated=False)


class AnyButDigit(__Class):
    '''
    Matches any character except for numeric characters.
    '''

    def __init__(self) -> 'AnyButDigit':
        '''
        Matches any character except for numeric characters.
        '''
        super().__init__('[^0-9]', is_negated=True)


class AnyWordChar(__Class):
    '''
    Matches any alphanumeric character as well as the underscore character ``_``.

    :param bool is_global: Indicates whether to include foreign alphabetic \
        characters or not. Defaults to ``False``.

    :raises GlobalWordCharSubtractionException: There is an attempt to subtract \
        a regular character class from an instance of this class for which \
        parameter ``is_global`` has been set to ``True``.
    '''

    def __init__(self, is_global: bool = False) -> 'AnyWordChar':
        '''
        Matches any alphanumeric character as well as the underscore character ``_``.

        :param bool is_global: Indicates whether to include foreign alphabetic \
            characters or not. Defaults to ``False``.

        :raises GlobalWordCharSubtractionException: There is an attempt to subtract \
            a regular character class from an instance of this class for which \
            parameter ``is_global`` has been set to ``True``.
        '''
        super().__init__('[a-zA-Z0-9_]', is_negated=False, simplify_word=is_global)
        self.__is_global = is_global


    def _is_global(self) -> bool:
        '''
        Returns ``True`` if this instance supports matching foreign alphabetic \
        characters, else returns ``False``.
        '''
        return self.__is_global


    def __invert__(self) -> '__Class':
        '''
        Returns an instance of class "AnyButWordChar" where parameter "is_global" \
        is set according to the value of this instance's "is_global" parameter.
        '''
        return AnyButWordChar(is_global=self._is_global())


class AnyButWordChar(__Class):
    '''
    Matches any character except for alphanumeric characters \
    and the underscore character "_".

    :param bool is_global: Indicates whether foreign alphabetic characters \
        are also excluded from matching or not. Defaults to ``False``.

    :raises GlobalWordCharSubtractionException: There is an attempt to subtract \
        a negated character class from an instance of this class for which \
        parameter ``is_global`` has been set to ``True``.
    '''

    def __init__(self, is_global: bool = False) -> 'AnyButWordChar':
        '''
        Matches any character except for alphanumeric characters \
        and the underscore character "_".

        :param bool is_global: Indicates whether foreign alphabetic characters \
            are also excluded from matching or not. Defaults to ``False``.

        :raises GlobalWordCharSubtractionException: There is an attempt to subtract \
            a negated character class from an instance of this class for which \
            parameter ``is_global`` has been set to ``True``.
        '''
        super().__init__('[^a-zA-Z0-9_]', is_negated=True, simplify_word=is_global)
        self.__is_global = is_global


    def _is_global(self) -> bool:
        '''
        Returns ``True`` if this instance also excludes foreign alphabetic \
        characters from matching, else returns ``False``.
        '''
        return self.__is_global


    def __invert__(self) -> '__Class':
        '''
        Returns an instance of class "AnyWordChar" where parameter "is_global" \
        is set according to the value of this instance's "is_global" parameter.
        '''
        return AnyWordChar(is_global=self._is_global())


class AnyPunctuation(__Class):
    '''
    Matches any punctuation character as defined within the ASCII table.
    '''

    def __init__(self) -> 'AnyPunctuation':
        '''
        Matches any punctuation character as defined within the ASCII table.
        '''
        super().__init__(r'[!-\/:-@\[-`{-~]', is_negated=False)


class AnyButPunctuation(__Class):
    '''
    Matches any character except for punctuation characters \
    as defined within the ASCII table.
    '''

    def __init__(self) -> 'AnyButPunctuation':
        '''
        Matches any character except for punctuation characters \
        as defined within the ASCII table.
        '''
        super().__init__(r'[^!-\/:-@\[-`{-~]', is_negated=True)
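
# As a quick sanity check of the four ranges used by the punctuation classes
# above, the sketch below compares the pattern against "string.punctuation"
# from the standard library. It relies on "re" and "string" only; the names
# "PUNCTUATION" and "matched" are introduced here purely for illustration.

```python
import re
import string

# Pattern copied from "AnyPunctuation" above.
PUNCTUATION = r"[!-\/:-@\[-`{-~]"

# Collect every ASCII character (code points 0-127) that the pattern accepts.
matched = "".join(chr(i) for i in range(128) if re.fullmatch(PUNCTUATION, chr(i)))

print(matched == string.punctuation)
print(len(matched))
```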


class AnyWhitespace(__Class):
    '''
    Matches any whitespace character.
    '''

    def __init__(self) -> 'AnyWhitespace':
        '''
        Matches any whitespace character.
        '''
        super().__init__(f'[{_whitespace}]', is_negated=False)


class AnyButWhitespace(__Class):
    '''
    Matches any character except for whitespace characters.
    '''

    def __init__(self) -> 'AnyButWhitespace':
        '''
        Matches any character except for whitespace characters.
        '''
        super().__init__(f'[^{_whitespace}]', is_negated=True)


class AnyBetween(__Class):
    '''
    Matches any character within the provided range.

    :param str start: The first character of the range.
    :param str end: The last character of the range.

    :raises InvalidArgumentTypeException: At least one of the provided characters \
        is neither a *Token* class instance nor a single-character string.
    :raises InvalidRangeException: A non-valid range is provided.

    :note: Any pair of characters ``start``, ``end`` constitutes a valid range \
        as long as the code point of character ``end`` is greater than the code \
        point of character ``start``, as defined by the Unicode Standard.
    '''

    def __init__(self, start: str, end: str) -> 'AnyBetween':
        '''
        Matches any character within the provided range.

        :param str start: The first character of the range.
        :param str end: The last character of the range.

        :raises InvalidArgumentTypeException: At least one of the provided characters \
            is neither a *Token* class instance nor a single-character string.
        :raises InvalidRangeException: A non-valid range is provided.

        :note: Any pair of characters ``start``, ``end`` constitutes a valid range \
            as long as the code point of character ``end`` is greater than the code \
            point of character ``start``, as defined by the Unicode Standard.
        '''
        for c in (start, end):
            if isinstance(c, (str, _pre.Pregex)):
                # Ignore a single escaping backslash when measuring the length.
                if len(str(c).replace("\\", "", 1)) > 1:
                    message = f"Argument \"{c}\" is neither a string nor a token."
                    raise _ex.InvalidArgumentTypeException(message)
            else:
                message = f"Argument \"{c}\" is neither a string nor a token."
                raise _ex.InvalidArgumentTypeException(message)
        start, end = str(start), str(end)
        # The end of the range must have a strictly greater code point.
        if ord(start) >= ord(end):
            raise _ex.InvalidRangeException(start, end)
        start = f"\\{start}" if start in __class__._to_escape else start
        end = f"\\{end}" if end in __class__._to_escape else end
        super().__init__(f"[{start}-{end}]", is_negated=False)


class AnyButBetween(__Class):
    '''
    Matches any character except for those within the provided range.

    :param str start: The first character of the range.
    :param str end: The last character of the range.

    :raises InvalidArgumentTypeException: At least one of the provided characters \
        is neither a *Token* class instance nor a single-character string.
    :raises InvalidRangeException: A non-valid range is provided.

    :note: Any pair of characters ``start``, ``end`` constitutes a valid range \
        as long as the code point of character ``end`` is greater than the code \
        point of character ``start``, as defined by the Unicode Standard.
    '''

    def __init__(self, start: str, end: str) -> 'AnyButBetween':
        '''
        Matches any character except for those within the provided range.

        :param str start: The first character of the range.
        :param str end: The last character of the range.

        :raises InvalidArgumentTypeException: At least one of the provided characters \
            is neither a *Token* class instance nor a single-character string.
        :raises InvalidRangeException: A non-valid range is provided.

        :note: Any pair of characters ``start``, ``end`` constitutes a valid range \
            as long as the code point of character ``end`` is greater than the code \
            point of character ``start``, as defined by the Unicode Standard.
        '''
        for c in (start, end):
            if isinstance(c, (str, _pre.Pregex)):
                # Ignore a single escaping backslash when measuring the length.
                if len(str(c).replace("\\", "", 1)) > 1:
                    message = f"Argument \"{c}\" is neither a string nor a token."
                    raise _ex.InvalidArgumentTypeException(message)
            else:
                message = f"Argument \"{c}\" is neither a string nor a token."
                raise _ex.InvalidArgumentTypeException(message)
        start, end = str(start), str(end)
        # The end of the range must have a strictly greater code point.
        if ord(start) >= ord(end):
            raise _ex.InvalidRangeException(start, end)
        start = f"\\{start}" if start in __class__._to_escape else start
        end = f"\\{end}" if end in __class__._to_escape else end
        super().__init__(f"[^{start}-{end}]", is_negated=True)
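
# The range rule stated in the notes above (the code point of "end" must be
# strictly greater than that of "start") can be sketched with the standard
# library alone. This is an illustration, not the library's implementation:
# "build_range" is a hypothetical helper, "ValueError" stands in for
# "InvalidRangeException", and the contents of "TO_ESCAPE" are an assumption,
# since the real "_to_escape" set is not shown in this section.

```python
import re

# Assumed set of characters that must be escaped inside "[...]".
TO_ESCAPE = {"\\", "^", "[", "]", "-", "/"}


def build_range(start: str, end: str, negated: bool = False) -> str:
    """Return "[start-end]" or "[^start-end]" after validating the range."""
    if ord(start) >= ord(end):
        raise ValueError(f"Invalid range: {start!r} to {end!r}")
    start = "\\" + start if start in TO_ESCAPE else start
    end = "\\" + end if end in TO_ESCAPE else end
    return f"[{'^' if negated else ''}{start}-{end}]"


print(build_range("a", "f"))
print(build_range("a", "f", negated=True))
# "+" (U+002B) precedes "-" (U+002D), so this range is valid and covers ",".
print(build_range("+", "-"))
print(re.fullmatch(build_range("+", "-"), ",") is not None)
```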


class AnyFrom(__Class):
    '''
    Matches any one of the provided characters.

    :param str | Pregex \*chars: One or more characters to match from. \
        Each character must be either a string of length one or an instance \
        of a class defined within the :py:mod:`pregex.core.tokens` module.

    :raises NotEnoughArgumentsException: No arguments are provided.
    :raises InvalidArgumentTypeException: At least one of the provided \
        arguments is neither a string of length one nor an instance of \
        a class defined within :py:mod:`pregex.core.tokens`.
    '''

    def __init__(self, *chars: str or _pre.Pregex) -> 'AnyFrom':
        '''
        Matches any one of the provided characters.

        :param str | Pregex \*chars: One or more characters to match from. \
            Each character must be either a string of length one or an instance \
            of a class defined within the :py:mod:`pregex.core.tokens` module.

        :raises NotEnoughArgumentsException: No arguments are provided.
        :raises InvalidArgumentTypeException: At least one of the provided \
            arguments is neither a string of length one nor an instance of \
            a class defined within :py:mod:`pregex.core.tokens`.
        '''
        if len(chars) == 0:
            message = f"No characters were provided to \"{__class__.__name__}\"."
            raise _ex.NotEnoughArgumentsException(message)
        for c in chars:
            if isinstance(c, (str, _pre.Pregex)):
                # Ignore a single escaping backslash when measuring the length.
                if len(str(c).replace("\\", "", 1)) > 1:
                    message = f"Argument \"{c}\" is neither a string nor a token."
                    raise _ex.InvalidArgumentTypeException(message)
            else:
                message = f"Argument \"{c}\" is neither a string nor a token."
                raise _ex.InvalidArgumentTypeException(message)
        # Escape plain strings where needed; tokens already hold their pattern.
        chars = tuple((f"\\{c}" if c in __class__._to_escape else c) \
            if isinstance(c, str) else str(c) for c in chars)
        super().__init__(f"[{''.join(chars)}]", is_negated=False)


class AnyButFrom(__Class):
    '''
    Matches any character except for the provided characters.

    :param str | Pregex \*chars: One or more characters not to match from. \
        Each character must be either a string of length one or an instance \
        of a class defined within the :py:mod:`pregex.core.tokens` module.

    :raises NotEnoughArgumentsException: No arguments are provided.
    :raises InvalidArgumentTypeException: At least one of the provided \
        arguments is neither a string of length one nor an instance of \
        a class defined within :py:mod:`pregex.core.tokens`.
    '''

    def __init__(self, *chars: str or _pre.Pregex) -> 'AnyButFrom':
        '''
        Matches any character except for the provided characters.

        :param str | Pregex \*chars: One or more characters not to match from. \
            Each character must be either a string of length one or an instance \
            of a class defined within the :py:mod:`pregex.core.tokens` module.

        :raises NotEnoughArgumentsException: No arguments are provided.
        :raises InvalidArgumentTypeException: At least one of the provided \
            arguments is neither a string of length one nor an instance of \
            a class defined within :py:mod:`pregex.core.tokens`.
        '''
        if len(chars) == 0:
            message = f"No characters were provided to \"{__class__.__name__}\"."
            raise _ex.NotEnoughArgumentsException(message)
        for c in chars:
            if isinstance(c, (str, _pre.Pregex)):
                # Ignore a single escaping backslash when measuring the length.
                if len(str(c).replace("\\", "", 1)) > 1:
                    message = f"Argument \"{c}\" is neither a string nor a token."
                    raise _ex.InvalidArgumentTypeException(message)
            else:
                message = f"Argument \"{c}\" is neither a string nor a token."
                raise _ex.InvalidArgumentTypeException(message)
        # Escape plain strings where needed; tokens already hold their pattern.
        chars = tuple((f"\\{c}" if c in __class__._to_escape else c) \
            if isinstance(c, str) else str(c) for c in chars)
        super().__init__(f"[^{''.join(chars)}]", is_negated=True)


class AnyGermanLetter(__Class):
    '''
    Matches any character from the German alphabet.
    '''

    def __init__(self) -> 'AnyGermanLetter':
        '''
        Matches any character from the German alphabet.
        '''
        super().__init__('[a-zA-ZäöüßÄÖÜẞ]', is_negated=False)


class AnyButGermanLetter(__Class):
    '''
    Matches any character except for characters in the German alphabet.
    '''

    def __init__(self) -> 'AnyButGermanLetter':
        '''
        Matches any character except for characters in the German alphabet.
        '''
        super().__init__('[^a-zA-ZäöüßÄÖÜẞ]', is_negated=True)


class AnyGreekLetter(__Class):
    '''
    Matches any character from the Greek alphabet.
    '''

    def __init__(self) -> 'AnyGreekLetter':
        '''
        Matches any character from the Greek alphabet.
        '''
        # Start from 'Έ' and include "Ά" separately so that
        # Ano Teleia '·' is not included.
        super().__init__('[ΆΈ-ώ]', is_negated=False)


class AnyButGreekLetter(__Class):
    '''
    Matches any character except for characters in the Greek alphabet.
    '''

    def __init__(self) -> 'AnyButGreekLetter':
        '''
        Matches any character except for characters in the Greek alphabet.
        '''
        # Start from 'Έ' and include "Ά" separately so that
        # Ano Teleia '·' is not included.
        super().__init__('[^ΆΈ-ώ]', is_negated=True)
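
# The comment in the Greek classes above explains why "Ά" is listed apart from
# the "Έ-ώ" range. The sketch below, which uses only the standard library,
# shows the code points involved. "NAIVE" is a hypothetical single-range
# alternative introduced for comparison, and the patterns are written with
# explicit escapes on the assumption that the source uses the precomposed
# characters U+0386, U+0388 and U+03CE.

```python
import re
import unicodedata

PATTERN = "[\u0386\u0388-\u03ce]"   # "Ά" listed on its own, then "Έ" to "ώ"
NAIVE = "[\u0386-\u03ce]"           # hypothetical: one range from "Ά" to "ώ"

ANO_TELEIA = "\u0387"               # sits between "Ά" and "Έ"

# Print the code point and Unicode name of each character of interest.
for ch in ("\u0386", ANO_TELEIA, "\u0388", "\u03ce"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The split pattern rejects the Ano Teleia, the single range accepts it.
print(re.fullmatch(PATTERN, ANO_TELEIA) is None)
print(re.fullmatch(NAIVE, ANO_TELEIA) is not None)
```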


class AnyCyrillicLetter(__Class):
    '''
    Matches any character from the Cyrillic alphabet.
    '''

    def __init__(self) -> 'AnyCyrillicLetter':
        '''
        Matches any character from the Cyrillic alphabet.
        '''
        super().__init__('[Ѐ-ӿ]', is_negated=False)


class AnyButCyrillicLetter(__Class):
    '''
    Matches any character except for characters in the Cyrillic alphabet.
    '''

    def __init__(self) -> 'AnyButCyrillicLetter':
        '''
        Matches any character except for characters in the Cyrillic alphabet.
        '''
        super().__init__('[^Ѐ-ӿ]', is_negated=True)


class AnyCJK(__Class):
    '''
    Matches any character that is defined within the \
    `CJK Unified Ideographs <https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)>`_ \
    Unicode block.
    '''
    def __init__(self) -> 'AnyCJK':
        '''
        Matches any character that is defined within the \
        `CJK Unified Ideographs <https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)>`_ \
        Unicode block.
        '''
        super().__init__('[\u4e00-\u9fd5]', is_negated=False)


class AnyButCJK(__Class):
    '''
    Matches any character except for those defined within the \
    `CJK Unified Ideographs <https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)>`_ \
    Unicode block.
    '''
    def __init__(self) -> 'AnyButCJK':
        '''
        Matches any character except for those defined within the \
        `CJK Unified Ideographs <https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)>`_ \
        Unicode block.
        '''
        super().__init__('[^\u4e00-\u9fd5]', is_negated=True)


class AnyHebrewLetter(__Class):
    '''
    Matches any character that is defined within the \
    `Hebrew <https://en.wikipedia.org/wiki/Hebrew_(Unicode_block)>`_ \
    Unicode block.
    '''

    def __init__(self) -> 'AnyHebrewLetter':
        '''
        Matches any character that is defined within the \
        `Hebrew <https://en.wikipedia.org/wiki/Hebrew_(Unicode_block)>`_ \
        Unicode block.
        '''
        super().__init__('[\u0590-\u05ff]', is_negated=False)


class AnyButHebrewLetter(__Class):
    '''
    Matches any character except for those defined within the \
    `Hebrew <https://en.wikipedia.org/wiki/Hebrew_(Unicode_block)>`_ \
    Unicode block.
    '''

    def __init__(self) -> 'AnyButHebrewLetter':
        '''
        Matches any character except for those defined within the \
        `Hebrew <https://en.wikipedia.org/wiki/Hebrew_(Unicode_block)>`_ \
        Unicode block.
        '''
        super().__init__('[^\u0590-\u05ff]', is_negated=True)


class AnyKoreanLetter(__Class):
    '''
    Matches any character from the Korean alphabet.
    '''
    def __init__(self) -> 'AnyKoreanLetter':
        '''
        Matches any character from the Korean alphabet.
        '''
        super().__init__('[\u3131-\u314e\uac00-\ud7a3]', is_negated=False)


class AnyButKoreanLetter(__Class):
    '''
    Matches any character except for characters in the Korean alphabet.
    '''
    def __init__(self) -> 'AnyButKoreanLetter':
        '''
        Matches any character except for characters in the Korean alphabet.
        '''
        super().__init__('[^\u3131-\u314e\uac00-\ud7a3]', is_negated=True)