KenKundert / nestedtext / 27680631093

17 Jun 2026 09:52AM UTC coverage: 98.437% (-0.5%) from 98.937%
Pull #67

github

web-flow
Merge 33e279e2a into 5a3661ea2
Pull Request #67: fix comments being silently dropped when a collection is rendered inline

593 of 614 branches covered (96.58%)

Branch coverage included in aggregate %.

16 of 20 new or added lines in 1 file covered. (80.0%)

5 existing lines in 1 file now uncovered.

1296 of 1305 relevant lines covered (99.31%)

1.99 hits per line

Source File
98.43
/nestedtext/nestedtext.py
1
# encoding: utf8
2
"""
3
NestedText: A Human Readable and Writable Data Format
4

5
NestedText is a file format for holding structured data that is intended to be
6
entered, edited, or viewed by people.  It allows data to be organized into a
7
nested collection of itemized lists (dictionaries), ordered lists (lists), and
8
scalar text (strings).
9

10
It is easily created, modified, or viewed with a text editor and easily
11
understood and used by both programmers and non-programmers.
12
"""
13

14
# MIT License {{{1
15
# Copyright (c) 2020-2026 Ken and Kale Kundert
16
#
17
# Permission is hereby granted, free of charge, to any person obtaining a copy
18
# of this software and associated documentation files (the "Software"), to deal
19
# in the Software without restriction, including without limitation the rights
20
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
21
# copies of the Software, and to permit persons to whom the Software is
22
# furnished to do so, subject to the following conditions:
23
#
24
# The above copyright notice and this permission notice shall be included in all
25
# copies or substantial portions of the Software.
26
#
27
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
28
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
29
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
30
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
31
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
32
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
33
# SOFTWARE.
34

35
# Imports {{{1
36
from inform import (
2✔
37
    cull,
38
    full_stop,
39
    set_culprit,
40
    get_culprit,
41
    is_str,
42
    is_collection,
43
    is_mapping,
44
    join,
45
    plural,
46
    Error,
47
    Info,
48
)
49
import collections.abc
2✔
50
import io
2✔
51
import re
2✔
52
import unicodedata
2✔
53

54

55
# Utility functions {{{1
56
# convert_line_terminators {{{2
57
def convert_line_terminators(text):
2✔
58
    return text.replace("\r\n", "\n").replace("\r", "\n")
2✔
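# Illustration (not part of the module): a minimal, self-contained sketch of
# why the order of the two replacements matters.  The function body is copied
# from above; the sample text is made up.

```python
def convert_line_terminators(text):
    # "\r\n" must be replaced first; otherwise each Windows terminator
    # would be turned into two newlines by the "\r" replacement.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Sample text mixing Windows, classic Mac, and Unix line terminators.
mixed = "a: 1\r\nb: 2\rc: 3\n"
normalized = convert_line_terminators(mixed)
print(repr(normalized))
```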
59

60

61
# Unspecified {{{2
62
# class that is used as a default in functions to signal nothing was given
63
class _Unspecified:
2✔
64
    def __bool__(self):  # pragma: no cover
65
        return False
66

67

68
# OnDupCallback {{{2
69
class _OnDupCallback(_Unspecified):
2✔
70
    pass
2✔
71

72

73
# Exceptions {{{1
74
# NestedTextError {{{2
75
class NestedTextError(Error, ValueError):
2✔
76
    r'''
77
    The *load* and *dump* functions all raise *NestedTextError* when they
78
    discover an error. *NestedTextError* subclasses both the Python *ValueError*
79
    and the *Error* exception from *Inform*.  You can find more documentation on
80
    what you can do with this exception in the `Inform documentation
81
    <https://inform.readthedocs.io/en/stable/api.html#exceptions>`_.
82

83
    All exceptions provide the following attributes:
84

85
    Attributes:
86
        args:
87
            The exception arguments.  A tuple that usually contains the
88
            problematic value.
89

90
        template:
91
            The possibly parameterized text used for the error message.
92

93
    Exceptions raised by the :func:`loads()` or :func:`load()` functions provide
94
    the following additional attributes:
95

96
    Attributes:
97
        source:
98
            The source of the *NestedText* content, if given. This is often a
99
            filename.
100

101
        line:
102
            The text of the line of *NestedText* content where the problem was found.
103

104
        prev_line:
105
            The text of the meaningful line immediately before where the problem was
106
            found.  This will not be a comment or blank line.
107

108
        lineno:
109
            The number of the line where the problem was found.  Line numbers are
110
            zero based except when included in messages to the end user.
111

112
        colno:
113
            The number of the character where the problem was found on *line*.
114
            Column numbers are zero based.
115

116
        codicil:
117
            The line that contains the error decorated with the location of the
118
            error.
119

120
    The exception culprit is the tuple that indicates where the error was found.
121
    With exceptions from :func:`loads()` or :func:`load()`, the culprit consists
122
    of the source name, if available, and the line number.  With exceptions from
123
    :func:`dumps()` or :func:`dump()`, the culprit consists of the keys that
124
    lead to the problematic value.
125

126
    As with most exceptions, you can simply cast it to a string to get a
127
    reasonable error message.
128

129
    .. code-block:: python
130

131
        >>> from textwrap import dedent
132
        >>> import nestedtext as nt
133

134
        >>> content = dedent("""
135
        ...     name1: value1
136
        ...     name1: value2
137
        ...     name3: value3
138
        ... """).strip()
139

140
        >>> try:
141
        ...     print(nt.loads(content))
142
        ... except nt.NestedTextError as e:
143
        ...     print(str(e))
144
        2: duplicate key: name1.
145
               1 ❬name1: value1❭
146
               2 ❬name1: value2❭
147
                  ▲
148

149
    You can also use the *report* method to print the message directly. This is
150
    appropriate if you are using *inform* for your messaging as it follows
151
    *inform*’s conventions::
152

153
        >> try:
154
        ..     print(nt.loads(content))
155
        .. except nt.NestedTextError as e:
156
        ..     e.report()
157
        error: 2: duplicate key: name1.
158
            ❬name1: value2❭
159
             ▲
160

161
    The *terminate* method prints the message directly and exits::
162

163
        >> try:
164
        ..     print(nt.loads(content))
165
        .. except nt.NestedTextError as e:
166
        ..     e.terminate()
167
        error: 2: duplicate key: name1.
168
            ❬name1: value2❭
169
             ▲
170

171
    With exceptions generated from :func:`load` or :func:`loads` you may see
172
    extra lines at the end of the message that show the problematic lines if
173
    you have the exception report itself as above.  Those extra lines are
174
    referred to as the codicil and they can be very helpful in illustrating the
175
    actual problem. You do not get them if you simply cast the exception to a
176
    string, but you can access them using :meth:`NestedTextError.get_codicil`.
177
    The codicil or codicils are returned as a tuple.  You should join them with
178
    newlines before printing them.
179

180
    .. code-block:: python
181

182
        >>> try:
183
        ...     print(nt.loads(content))
184
        ... except nt.NestedTextError as e:
185
        ...     print(e.get_message())
186
        ...     print(*e.get_codicil(), sep="\n")
187
        duplicate key: name1.
188
           1 ❬name1: value1❭
189
           2 ❬name1: value2❭
190
              ▲
191

192
    Note the ❬ and ❭ characters in the codicil. They delimit the extent of the
193
    text on each line and help you see troublesome leading or trailing white
194
    space.
195

196
    Exceptions produced by *NestedText* contain a *template* attribute that
197
    contains the basic text of the message. You can change this message by
198
    overriding the attribute using the *template* argument when using *report*,
199
    *terminate*, or *render*.  *render* is like casting the exception to a
200
    string except that it allows for the passing of arguments.  For example, to
201
    convert a particular message to Spanish, you could use something like the
202
    following.
203

204
    .. code-block:: python
205

206
        >>> try:
207
        ...     print(nt.loads(content))
208
        ... except nt.NestedTextError as e:
209
        ...     template = None
210
        ...     if e.template == "duplicate key: {}.":
211
        ...         template = "llave duplicada: {}."
212
        ...     print(e.render(template=template))
213
        2: llave duplicada: name1.
214
               1 ❬name1: value1❭
215
               2 ❬name1: value2❭
216
                  ▲
217
    '''
218

219
# NestedTextDataError {{{2
220
class NestedTextDataError(NestedTextError):
2✔
221
    '''
222
    This exception is not emitted by NestedText itself, rather it is made
223
    available for reporting on errors that derive from loaded NestedText data.
224
    It includes information that would allow the user to easily find the
225
    offending datum.
226

227
    It supports the same arguments as the NestedTextError.  Specifically, you
228
    can specify any number of unnamed arguments, which will be joined together
229
    to form the message.  It also accepts a *template* keyword argument that can
230
    be used to combine the arguments into a message.  In addition, it supports
231
    the following keyword-only arguments.
232

233
        source (str):
234
            The name of the source file.
235
        keys (tuple(str)):
236
            The keys that uniquely identify the datum.
237
        keymap:
238
            The keymap created by the loader.
239
        kind (str):
240
            Either "value" or "key".  Identifies the offending part of the
241
            datum. Default is "value".
242
        offset (None or int):
243
            The location of the problem.  If not None, a pointer is added that
244
            points to the specified location (given as the number of characters
245
            from the start of the value).
246
        show_line (bool):
247
            Whether or not to show the line in the NestedText document that
248
            contains the error.
249
    '''
250

251
    # constructor {{{3
252
    def __init__(
2✔
253
        self, *args,
254
        source='', keymap=None, keys=None,
255
        kind="value", offset=None, show_line=True,
256
        culprit=(), codicil=(),
257
        **kwargs
258
    ):
259
        # Users can use NestedTextDataError for their own errors.  To do so they
260
        # would pass a message or a message template along with the source, the
261
        # keymap, the offending keys, whether the problem is in the key or the
262
        # value, and perhaps an offset, and the information will be converted
263
        # into a suitable culprit and codicil.
264
        if is_str(culprit):
2✔
265
            culprit = cull((culprit,))
2✔
266
        if is_str(codicil):
2✔
267
            codicil = cull((codicil,))
2✔
268

269
        if keys and keymap:
2!
270
            loc = get_location(keys, keymap)
2✔
271
            line_nums = loc.get_line_numbers(kind=kind, sep='-')
2✔
272
            if line_nums:
2!
273
                source += f"@{line_nums}"
2✔
274
            orig_keys = get_keys(keys, keymap, sep="›")
2✔
275
            culprit = (source or None,) + (orig_keys,) + culprit
2✔
276
            kwargs['culprit'] = culprit
2✔
277

278
            if show_line:
2✔
279
                codicil += loc.as_line(kind=kind, offset=offset),
2✔
280
            kwargs['codicil'] = codicil
2✔
281
        super().__init__(*args, **kwargs)
2✔
282

283

284
# NotSuitableForInline {{{2
285
# this is only intended for internal use
286
class NotSuitableForInline(Exception):
2✔
287
    pass
2✔
288

289

290
# NestedText Reader {{{1
291
# Converts NestedText into Python data hierarchies.
292

293
# constants {{{2
294
# regular expressions used to recognize dict items
295
dict_item_regex = r"""
2✔
296
    (?P<key>[^\s].*?)      # key (must start with non-space character)
297
    \s*                    # optional white space
298
    :                      # separator
299
    (?:\ (?P<value>.*))?   # value
300
"""
301
dict_item_recognizer = re.compile(dict_item_regex, re.VERBOSE)
2✔
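# Illustration (not part of the module): a small sketch of how the pattern
# above splits a dictionary item.  The regular expression is copied from the
# source; split_dict_item is a hypothetical helper, and mapping a missing
# value to "" mirrors what read_lines() does later in this file.

```python
import re

dict_item_recognizer = re.compile(
    r"""
    (?P<key>[^\s].*?)      # key (must start with non-space character)
    \s*                    # optional white space
    :                      # separator
    (?:\ (?P<value>.*))?   # value
    """,
    re.VERBOSE,
)


def split_dict_item(stripped):
    """Return (key, value) for a dict item, or None if the text is not one."""
    match = dict_item_recognizer.fullmatch(stripped)
    if not match:
        return None
    # A key with nothing after the colon yields value=None; treat it as "".
    return match.group("key"), match.group("value") or ""


for text in ["name: value", "name:", "a: b: c", "a:b"]:
    print(repr(text), "->", split_dict_item(text))
```

# The key is matched lazily, so the first ": " ends the key, and a colon that
# is not followed by a space or the end of the line is not a separator.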
302

303

304
# report {{{2
305
def report(message, line, *args, colno=None, **kwargs):
2✔
306
    message = full_stop(message)
2✔
307
    culprits = get_culprit()
2✔
308
    codicil = [kwargs.get("codicil", "")]
2✔
309
    if culprits:
2✔
310
        kwargs["source"] = culprits[0]
2✔
311
    if line:
2✔
312
        # line numbers are always 0 based unless included in a message to user
313
        include_prev_line = not (
2✔
314
            line.prev_line is None or kwargs.pop("suppress_prev_line", False)
315
        )
316
        if colno is not None:
2✔
317
            # build codicil that shows both the line and the preceding line
318
            if include_prev_line:
2✔
319
                codicil += [f"{line.prev_line.lineno+1:>4} ❬{line.prev_line.text}❭"]
2✔
320
            else:
321
                codicil += []
2✔
322
            # replace tabs with → so that arrow points to right location.
323
            text = line.text.replace("\t", "→")
2✔
324
            codicil += [
2✔
325
                f"{line.lineno+1:>4} ❬{text}❭",
326
                "      " + (colno*" ") + "▲",
327
            ]
328
            kwargs["codicil"] = "\n".join(cull(codicil))
2✔
329
            kwargs["colno"] = colno
2✔
330
        else:
331
            kwargs["codicil"] = f"{line.lineno+1:>4} ❬{line.text}❭"
2✔
332
        kwargs["culprit"] = get_culprit(line.lineno + 1)
2✔
333
        kwargs["line"] = line.text
2✔
334
        kwargs["lineno"] = line.lineno
2✔
335
        if include_prev_line:
2✔
336
            kwargs["prev_line"] = line.prev_line.text
2✔
337
    else:
338
        kwargs["culprit"] = culprits  # pragma: no cover
339
    raise NestedTextError(*args, template=message, **kwargs)
2✔
340

341

342
# unrecognized_line {{{2
343
def unrecognized_line(line):
2✔
344
    # line will not be recognized if there is invalid white space in indentation
345
    first_non_space = line.text.lstrip(" ")[0]
2✔
346
    index_of_first_non_space = line.text.index(first_non_space)
2✔
347
    if first_non_space.strip() == "":
2✔
348
        # first non-space is a white space character
349
        # treat it as invalid indentation
350
        desc = unicodedata.name(first_non_space, "")
2✔
351
        if desc:
2✔
352
            desc = f" ({desc})"
2✔
353
        report(
2✔
354
            f"invalid character in indentation: {first_non_space!r}{desc}.",
355
            line,
356
            colno = index_of_first_non_space,
357
            codicil = "Only simple spaces are allowed in indentation."
358
        )
359
    else:
360
        report("unrecognized line.", line, colno=index_of_first_non_space)
2✔
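# Illustration (not part of the module): a sketch of how the message for an
# invalid indentation character is assembled.  describe_indent_char is a
# hypothetical helper; it only builds the text and does not call report().

```python
import unicodedata


def describe_indent_char(ch):
    # unicodedata.name() returns the default ("") for characters without a
    # name, such as the tab control character.
    desc = unicodedata.name(ch, "")
    if desc:
        desc = f" ({desc})"
    return f"invalid character in indentation: {ch!r}{desc}."


print(describe_indent_char("\t"))       # control character, no Unicode name
print(describe_indent_char("\u00a0"))   # named white space character
```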
361

362

363
# Lines class {{{2
364
class Lines:
2✔
365
    # constructor {{{3
366
    def __init__(self, lines, support_inlines):
2✔
367
        self.lines = lines
2✔
368
        self.support_inlines = support_inlines
2✔
369
        self.generator = self.read_lines()
2✔
370
        self.first_value_line = None
2✔
371
        self.last_comment_line = None
2✔
372
            # a location is needed for the top of the data, keys = ()
373
            # use the first value given, if the data is not empty
374
            # use last comment given if data is empty
375
        # comment-capture state
376
        self.prev_data_line = None
2✔
377
        self.header_comments = []   # comments before the first data item
2✔
378
        self.eof_comments = []      # footer comments (after the last data item)
2✔
379
        self._comment_buffer = []   # raw comment/blank Lines awaiting attachment
2✔
380
        self.next_line = None
2✔
381
        self._advance_to_data_line()
2✔
382

383
    # Line class {{{3
384
    class Line(Info):
2✔
385
        def render(self, col=None):
2✔
386
            result = [f"{self.lineno+1:>4} ❬{self.text}❭"]
2✔
387
            if col is not None:
2✔
388
                l = len(self.text)
2✔
389
                if l < col:
2✔
390
                    col = l
2✔
391
                result += ["      " + (col*" ") + "▲"]
2✔
392
            return "\n".join(result)
2✔
393

394
        def __str__(self):
2✔
395
            return self.text
2✔
396

397
        def __repr__(self):
2✔
398
            return self.__class__.__name__ + f"({self.lineno+1}: ❬{self.text}❭)"
2✔
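    # Illustration (not part of the module): a standalone sketch of the layout
    # produced by render() above.  render_line is a hypothetical free function
    # that takes the text and zero-based line number directly instead of
    # reading them from a Line object.

```python
def render_line(text, lineno, col=None):
    # The line number is right-aligned in 4 columns and followed by " ❬",
    # so the text itself starts 6 characters in.
    result = [f"{lineno + 1:>4} ❬{text}❭"]
    if col is not None:
        # Clamp the column so the pointer never lands past the end of the text.
        col = min(col, len(text))
        result.append(" " * 6 + " " * col + "▲")
    return "\n".join(result)


print(render_line("name1: value2", 1, col=0))
```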
399

400
    # read_lines() {{{3
401
    def read_lines(self):
2✔
402
        prev_line = None
2✔
403
        last_line = None
2✔
404
        for lineno, line in enumerate(self.lines):
2✔
405
            key = None
2✔
406
            value = None
2✔
407
            try:
2✔
408
                # decode from utf8 if a byte string or binary file is given
409
                line = line.decode('utf8')
2✔
410
            except AttributeError:
2✔
411
                pass
2✔
412
            line = line.rstrip("\n")
2✔
413

414
            # compute indentation
415
            stripped = line.lstrip(" ")
2✔
416
            depth = len(line) - len(stripped)
2✔
417

418
            # determine line type and extract values
419
            if stripped == "":
2✔
420
                kind = "blank"
2✔
421
                value = None
2✔
422
                depth = None
2✔
423
            elif stripped[:1] == "#":
2✔
424
                kind = "comment"
2✔
425
                # remove the '#' and at most one following space (the canonical
426
                # form is '# text'), then strip trailing whitespace.  This preserves
427
                # additional leading spaces inside the comment text so that
428
                # the dumper can round-trip them faithfully.
429
                value = stripped[1:]
2✔
430
                if value.startswith(" "):
2✔
431
                    value = value[1:]
2✔
432
                value = value.rstrip()
2✔
433
                # depth stays at the computed indent; needed for comment partitioning
434
            elif stripped == "-" or stripped.startswith("- "):
2✔
435
                kind = "list item"
2✔
436
                value = stripped[2:]
2✔
437
            elif stripped == ">" or stripped.startswith("> "):
2✔
438
                kind = "string item"
2✔
439
                value = line[depth+2:]
2✔
440
            elif stripped == ":" or stripped.startswith(": "):
2✔
441
                kind = "key item"
2✔
442
                value = line[depth+2:]
2✔
443
            elif stripped[0:1] in ["[", "{"] and self.support_inlines:
2✔
444
                tag = stripped[0:1]
2✔
445
                kind = "inline dict" if tag == "{" else "inline list"
2✔
446
                value = line[depth:]
2✔
447
            else:
448
                matches = dict_item_recognizer.fullmatch(stripped)
2✔
449
                if matches:
2✔
450
                    kind = "dict item"
2✔
451
                    key = matches.group("key")
2✔
452
                    value = matches.group("value")
2✔
453
                    if value is None:
2✔
454
                        value = ""
2✔
455
                else:
456
                    kind = "unrecognized"
2✔
457
                    value = line
2✔
458

459
            # bundle information about line
460
            this_line = self.Line(
2✔
461
                text = line,
462
                lineno = lineno,
463
                kind = kind,
464
                depth = depth,
465
                key = key,
466
                value = value,
467
                prev_line = prev_line,
468
            )
469
            if kind.endswith(" item") or kind.startswith("inline "):
2✔
470
                # Create prev_line, which differs from last_line in that it
471
                # is a copy of the line without a prev_line attribute of its
472
                # own. This avoids keeping a chain of all previous lines.
473
                #
474
                # In contrast, last_line is the actual this_line from the previous
475
                # non-blank/comment iteration
476
                prev_line = self.Line(
2✔
477
                    text = this_line.text,
478
                    value = this_line.value,
479
                    kind = this_line.kind,
480
                    depth = this_line.depth,
481
                    lineno = this_line.lineno,
482
                )
483

484
                # add this line as next_line in prev_line if this is a continued
485
                # multiline key or multiline string.
486
                if (
2✔
487
                    last_line                 and
488
                    depth == last_line.depth  and
489
                    kind == last_line.kind    and
490
                    kind in ["key item", "string item"]
491
                ):
492
                    last_line.next_line = this_line
2✔
493

494
            if kind in ['blank', 'comment']:
2✔
495
                self.last_comment_line = this_line
2✔
496
            else:
497
                last_line = this_line
2✔
498
                if not self.first_value_line:
2✔
499
                    self.first_value_line = this_line
2✔
500
            yield this_line
2✔
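    # Illustration (not part of the module): a reduced sketch of the line
    # classification performed by read_lines() above.  classify is a
    # hypothetical helper that returns (kind, depth, value); inline
    # collections and dict items are lumped together as "other" to keep the
    # sketch short.

```python
def classify(line):
    stripped = line.lstrip(" ")
    depth = len(line) - len(stripped)   # indentation counts spaces only
    if stripped == "":
        return "blank", None, None
    if stripped[:1] == "#":
        value = stripped[1:]
        if value.startswith(" "):
            value = value[1:]           # drop at most one space after '#'
        return "comment", depth, value.rstrip()
    if stripped == "-" or stripped.startswith("- "):
        return "list item", depth, stripped[2:]
    if stripped == ">" or stripped.startswith("> "):
        return "string item", depth, line[depth + 2:]
    if stripped == ":" or stripped.startswith(": "):
        return "key item", depth, line[depth + 2:]
    return "other", depth, stripped


for text in ["  - apple", "#   note  ", "   ", "> text"]:
    print(repr(text), "->", classify(text))
```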
501

502
    # type_of_next() {{{3
503
    def type_of_next(self):
2✔
504
        if self.next_line:
2✔
505
            return self.next_line.kind
2✔
506

507
    # still_within_level() {{{3
508
    def still_within_level(self, depth):
2✔
509
        if self.next_line:
2✔
510
            return self.next_line.depth >= depth
2✔
511

512
    # still_within_string() {{{3
513
    def still_within_string(self, depth):
2✔
514
        if self.next_line:
2✔
515
            return (
2✔
516
                self.next_line.kind == "string item" and
517
                self.next_line.depth >= depth
518
            )
519

520
    # still_within_key() {{{3
521
    def still_within_key(self, line, depth):
2✔
522
        if not self.next_line:
2✔
523
            report("indented value must follow multiline key", line)
2✔
524
        return (
2✔
525
            self.next_line.kind == "key item" and
526
            self.next_line.depth == depth
527
        )
528

529
    # depth_of_next() {{{3
530
    def depth_of_next(self):
2✔
531
        if self.next_line:
2✔
532
            return self.next_line.depth
2✔
533
        return 0
2✔
534

535
    # get_next() {{{3
536
    def get_next(self):
2✔
537
        line = self.next_line
2✔
538
        if line.kind == "unrecognized":
2✔
539
            unrecognized_line(line)
2✔
540

541
        # ensure the staging lists exist (Info returns None for unset attrs,
542
        # so a fresh list is needed before the upcoming advance can append).
543
        line.trailing_comments = line.trailing_comments or []
2✔
544
        line.leading_comments = line.leading_comments or []
2✔
545

546
        self.prev_data_line = line
2✔
547

548
        # queue up the next useful line, capturing comments along the way.
549
        self._advance_to_data_line()
2✔
550

551
        return line
2✔
552

553
    # _advance_to_data_line() {{{3
554
    def _advance_to_data_line(self):
2✔
555
        """Pull from the generator until the next data line (or EOF).
556

557
        Buffered comment/blank lines are grouped into Comment objects and
558
        partitioned per the rules onto the surrounding data lines.
559
        """
560
        self.next_line = next(self.generator, None)
2✔
561
        while self.next_line and self.next_line.kind in ["blank", "comment"]:
2✔
562
            self._comment_buffer.append(self.next_line)
2✔
563
            self.next_line = next(self.generator, None)
2✔
564

565
        buffer_lines = self._comment_buffer
2✔
566
        self._comment_buffer = []
2✔
567

568
        if self.next_line is not None:
2✔
569
            self.next_line.leading_comments = self.next_line.leading_comments or []
2✔
570
            self.next_line.trailing_comments = self.next_line.trailing_comments or []
2✔
571

572
        if not buffer_lines:
2✔
573
            return
2✔
574

575
        if self.prev_data_line is None and self.next_line is not None:
2✔
576
            # before the first data line: rule 1 (Header / leading)
577
            # Partition on the raw buffer at the last blank line, then group
578
            # each half.  A comment block is leading on the first key only
579
            # when there is *no* blank between it and that key.
580
            header_lines, leading_lines = _partition_header_leading(buffer_lines)
2✔
581
            self.header_comments.extend(_group_comments(header_lines))
2✔
582
            self.next_line.leading_comments.extend(_group_comments(leading_lines))
2✔
583
        elif self.prev_data_line is None and self.next_line is None:
2✔
584
            # rule "No data" -- entire content is header
585
            self.header_comments.extend(_group_comments(buffer_lines))
2✔
586
        elif self.next_line is None:
2✔
587
            # after the last data line: rule 4 (Trailing / footer)
588
            last_indent = self.prev_data_line.depth or 0
2✔
589
            for c in _group_comments(buffer_lines):
2✔
590
                if c.indent > last_indent:
2✔
591
                    self.prev_data_line.trailing_comments.append(c)
2✔
592
                else:
593
                    self.eof_comments.append(c)
2✔
594
        else:
595
            # between two data lines: rule 2 (Leading / trailing)
596
            next_indent = self.next_line.depth or 0
2✔
597
            for c in _group_comments(buffer_lines):
2✔
598
                if c.indent <= next_indent:
2✔
599
                    self.next_line.leading_comments.append(c)
2✔
600
                else:
601
                    self.prev_data_line.trailing_comments.append(c)
2✔
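    # Illustration (not part of the module): a sketch of the rule applied
    # above to comments that sit between two data lines.  attach_between is a
    # hypothetical helper, and comments are modeled as (text, indent) pairs
    # rather than Comment objects.

```python
def attach_between(comments, next_indent):
    """Split comments into (leading for next line, trailing for prev line)."""
    leading, trailing = [], []
    for text, indent in comments:
        if indent <= next_indent:
            # At or left of the next data line: it introduces that line.
            leading.append((text, indent))
        else:
            # Indented deeper than the next data line: it belongs to the
            # block that just ended.
            trailing.append((text, indent))
    return leading, trailing


comments = [("note on the nested block", 4), ("about the next key", 0)]
leading, trailing = attach_between(comments, next_indent=0)
print(leading, trailing)
```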
602

603
    # indentation_error {{{3
604
    def indentation_error(self, line, depth):
2✔
605
        assert line.depth != depth
2✔
606
        prev_line = line.prev_line
2✔
607
        codicil = None
2✔
608
        if not line.prev_line and depth == 0:
2✔
609
            msg = "top-level content must start in column 1."
2✔
610
        elif (
2✔
611
            prev_line                     and
612
            prev_line.value               and
613
            prev_line.depth < line.depth  and
614
            prev_line.kind in ["list item", "dict item"]
615
        ):
616
            if prev_line.value.strip() == "":
2✔
617
                obs = ", which in this case consists only of whitespace"
2✔
618
            else:
619
                obs = ""
2✔
620
            msg = "invalid indentation."
2✔
621
            codicil = join(
2✔
622
                "An indent may only follow a dictionary or list item that does",
623
                f"not already have a value{obs}.",
624
                wrap = True
625
            )
626
        elif prev_line and prev_line.depth > line.depth:
2✔
627
            msg = "invalid indentation, partial dedent."
2✔
628
        else:
629
            msg = "invalid indentation."
2✔
630
        report(join(msg, wrap=True), line, colno=depth, codicil=codicil)
2✔
631

632

633
# KeyPolicy class {{{2
634
# Used to hold and implement the on_dup policy for dictionaries.
635
class KeyPolicy:
2✔
636
    @classmethod
2✔
637
    def set_policy(cls, on_dup):
2✔
638
        if callable(on_dup):
2✔
639
            # if on_dup is a function, convert it to a data structure that will
640
            # hold state during the load
641
            on_dup = {_OnDupCallback: on_dup}
2✔
642
        cls.on_dup = on_dup
2✔
643

644
    @classmethod
2✔
645
    def process_duplicate(cls, dictionary, key, keys, line=None, colno=None):
2✔
646
        if cls.on_dup is None or cls.on_dup == "error":
2✔
647
            report("duplicate key: {}.", line, key, colno=colno)
2✔
648
        if cls.on_dup == "ignore":
2✔
649
            return None
2✔
650
        if isinstance(cls.on_dup, dict):
2✔
651
            dup_handler = cls.on_dup.pop(_OnDupCallback)
2✔
652
            cls.on_dup.update(
2✔
653
                dict(dictionary=dictionary, keys=keys)
654
            )
655
            try:
2✔
656
                key = dup_handler(key=key, state=cls.on_dup)
2✔
657
                if key is None:
2✔
658
                    return None
2✔
659
            except KeyError:
2✔
660
                report("duplicate key: {}.", line, key, colno=colno)
2✔
661
            cls.on_dup[_OnDupCallback] = dup_handler  # restore dup_handler
2✔
662
        elif cls.on_dup != "replace":  # pragma: no cover
663
            raise AssertionError(
664
                f"{cls.on_dup}: unexpected value for on_dup."
665
            ) from None
666
        return key
2✔
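# Illustration (not part of the module): a sketch of a callable on_dup
# handler.  As process_duplicate() above shows, the handler is called with
# the keyword arguments key and state, where state carries "dictionary" and
# "keys"; returning None ignores the duplicate, raising KeyError reports it,
# and any other return value is used as the key.  rename_duplicates and the
# sample state below are hypothetical.

```python
def rename_duplicates(key, state):
    # Pick the first "<key> #<n>" that is not yet present in the dictionary
    # being built.
    dictionary = state["dictionary"]
    n = 2
    while f"{key} #{n}" in dictionary:
        n += 1
    return f"{key} #{n}"


# Simulate the call made by the loader when "name" is seen a second time.
state = {"dictionary": {"name": "first"}, "keys": ("name",)}
new_key = rename_duplicates(key="name", state=state)
state["dictionary"][new_key] = "second"
print(state["dictionary"])
```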
667

668

669
# Comment class {{{2
670
class Comment:
2✔
671
    """A single comment captured during load or built by the user.
672

673
    Holds the comment text (joined by ``\\n`` for multi-line comments) plus
674
    indent and per-comment blank-line metadata.
675

676
    *indent* is the absolute indentation (in spaces) at which the comment's
677
    ``#`` will be placed.  It is set by the loader and is used by the
678
    dumper when ``tab`` is None.
679

680
    *tab* is the alternative way to express indent: a tabstop offset
681
    relative to the slot's natural indent.  When *tab* is not None, the
682
    dumper computes the absolute indent at emit time as
683
    ``natural_for_slot + tab * dumps.indent``; the ``indent`` field is
684
    ignored.  The loader leaves *tab* as None; :func:`annotate` and
685
    user-built Comments may set it.
686

687
    *before* / *after* are the number of blank lines emitted before and
688
    after this comment.  The loader does not set these; they exist for
689
    user-built comments only.
690

691
    *text* may also be ``None``: a Comment with ``text=None`` emits no
692
    ``#`` line at all; only its ``before`` / ``after`` blank lines are
693
    rendered.  This is a convenient way to inject pure blank-line
694
    separators inside a comment list (for example, between two
695
    Comments returned by a provider).
696
    """
697

698
    def __init__(self, text="", indent=0, *, tab=None, before=0, after=0):
2✔
699
        self.text = text
2✔
700
        self.indent = indent
2✔
701
        self.tab = tab
2✔
702
        self.before = before
2✔
703
        self.after = after
2✔
704

705
    def __repr__(self):
2✔
706
        extras = []
2✔
707
        if self.tab is not None:
2✔
708
            extras.append(f", tab={self.tab}")
2✔
709
        else:
710
            extras.append(f", indent={self.indent}")
2✔
711
        if self.before:
2✔
712
            extras.append(f", before={self.before}")
2✔
713
        if self.after:
2✔
714
            extras.append(f", after={self.after}")
2✔
715
        return f"Comment({self.text!r}{''.join(extras)})"
2✔
716

717
    def __eq__(self, other):
2✔
718
        if not isinstance(other, Comment):
2✔
719
            return NotImplemented
2✔
720
        return (
2✔
721
            self.text == other.text
722
            and self.indent == other.indent
723
            and self.tab == other.tab
724
            and self.before == other.before
725
            and self.after == other.after
726
        )
727

728
    # Comments are mutable, so they must not be hashable.
729
    __hash__ = None
2✔
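# Illustration (not part of the module): a sketch of what the __eq__ and
# __hash__ definitions above imply for callers.  Note is a hypothetical
# stand-in with a single field.  Python already sets __hash__ to None when a
# class defines __eq__ without __hash__; the explicit assignment documents
# the intent.

```python
class Note:
    def __init__(self, text=""):
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, Note):
            return NotImplemented
        return self.text == other.text

    # Mutable value objects must not be hashable.
    __hash__ = None


same = Note("a") == Note("a")       # compared by value
mixed = Note("a") == "a"            # falls back to False for other types

try:
    {Note("a")}                     # sets require hashable members
    hashable = True
except TypeError:
    hashable = False

print(same, mixed, hashable)
```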
730

731

732
# _group_comments {{{2
733
def _group_comments(comment_lines):
2✔
734
    """Convert a list of blank/comment Line objects into a list of Comments.
735

736
    Rules:
737
    - Adjacent comment lines at the same indent (no blank line between)
738
      merge into one Comment whose text is the source lines joined by '\\n'.
739
    - A blank line, or an indent change, closes the current Comment and
740
      starts a new one.
741
    - Pure blank lines are otherwise discarded; their spacing is the
742
      dumper's concern.
743

744
    Same-indent comment blocks separated by blanks therefore remain as
745
    distinct Comment objects.
746
    """
747
    comments = []
2✔
748
    cur_text_parts = []
2✔
749
    cur_indent = 0
2✔
750

751
    def flush():
2✔
752
        nonlocal cur_text_parts, cur_indent
753
        if cur_text_parts:
2✔
754
            comments.append(
2✔
755
                Comment(text="\n".join(cur_text_parts), indent=cur_indent)
756
            )
757
            cur_text_parts = []
2✔
758
            cur_indent = 0
2✔
759

760
    saw_blank = False
2✔
761
    for line in comment_lines:
2✔
762
        if line.kind == "blank":
2✔
763
            saw_blank = True
2✔
764
            continue
2✔
765
        indent = line.depth
2✔
766
        if cur_text_parts and not saw_blank and indent == cur_indent:
2✔
767
            cur_text_parts.append(line.value)
2✔
768
        else:
769
            flush()
2✔
770
            cur_text_parts.append(line.value)
2✔
771
            cur_indent = indent
2✔
772
        saw_blank = False
2✔
773
    flush()
2✔
774
    return comments
2✔
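
The grouping rules in the docstring above can be tried out in isolation. The sketch below assumes a hypothetical ``Line`` named tuple with ``kind``, ``depth`` and ``value`` fields in place of the real Line objects, and returns ``(text, indent)`` pairs instead of Comment instances.

```python
from collections import namedtuple

# Hypothetical stand-in for the loader's Line objects.
Line = namedtuple("Line", "kind depth value")


def group_comments(lines):
    """Group comment lines into (text, indent) pairs using the rules above."""
    groups = []
    parts, indent = [], 0
    saw_blank = False
    for line in lines:
        if line.kind == "blank":
            # Blank lines are not kept, but they close the current group.
            saw_blank = True
            continue
        if parts and not saw_blank and line.depth == indent:
            parts.append(line.value)
        else:
            if parts:
                groups.append(("\n".join(parts), indent))
            parts, indent = [line.value], line.depth
        saw_blank = False
    if parts:
        groups.append(("\n".join(parts), indent))
    return groups


lines = [
    Line("comment", 0, "first"),
    Line("comment", 0, "second"),
    Line("blank", 0, ""),
    Line("comment", 0, "third"),
    Line("comment", 4, "indented"),
]
groups = group_comments(lines)
```

Here the first two lines merge, the blank line separates ``third`` from them, and the indent change starts a fourth line's own group.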
775

776

777
# _partition_header_leading {{{2
778
def _partition_header_leading(buffer_lines):
2✔
779
    """Split a pre-data-line buffer into (header_lines, leading_lines).
780

781
    The partition is at the LAST blank line in the buffer.  Everything before
782
    that blank becomes the header; everything after becomes the leading
783
    comment block on the first data item.  If no blank line is present, the
784
    entire buffer is leading.
785
    """
786
    last_blank = -1
2✔
787
    for i, line in enumerate(buffer_lines):
2✔
788
        if line.kind == "blank":
2✔
789
            last_blank = i
2✔
790
    if last_blank == -1:
2✔
791
        return [], list(buffer_lines)
2✔
792
    return list(buffer_lines[:last_blank]), list(buffer_lines[last_blank + 1:])
2✔
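
A small sketch of the same split, assuming the buffer is represented as ``(kind, text)`` tuples rather than Line objects (that representation is an assumption made only for this illustration):

```python
def partition_header_leading(buffer):
    """Split a list of (kind, text) tuples at the last blank entry."""
    last_blank = -1
    for i, (kind, _text) in enumerate(buffer):
        if kind == "blank":
            last_blank = i
    if last_blank == -1:
        # No blank line: nothing is treated as a document header.
        return [], list(buffer)
    return list(buffer[:last_blank]), list(buffer[last_blank + 1:])


with_blank = [
    ("comment", "# file header"),
    ("blank", ""),
    ("comment", "# describes the first key"),
]
header, leading = partition_header_leading(with_blank)

without_blank = [("comment", "# describes the first key")]
header2, leading2 = partition_header_leading(without_blank)
```

Only the last blank line is dropped; any earlier blank lines stay inside the header portion.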
793

794

795
# Location class {{{2
796
class Location:
2✔
797
    """Holds information about the location of a token.
798

799
    Returned from :func:`load` and :func:`loads` as the values in a *keymap*.
800
    Objects of this class hold the line and column numbers of the key and value
801
    tokens.
802
    """
803

804
    def __init__(self, line=None, col=None, key_line=None, key_col=None):
2✔
805
        self.line = line
2✔
806
        self.key_line = key_line
2✔
807
        self.col = col
2✔
808
        self.key_col = key_col
2✔
809
        # Set by _add_keymap to the last data line consumed within this value's
810
        # scope (the source of value-trailing comments).  For leaf entries
811
        # this is ``line``; for nested values, the deepest descendant line.
812
        self.value_end_line = None
2✔
813
        # Comments are stored on the Location itself (not on the underlying
814
        # Line) so that two Locations that share a Line (parent and its
815
        # first child) do not inadvertently share comment lists.
816
        self.key_leading_comments = []
2✔
817
        self.key_trailing_comments = []
2✔
818
        self.value_leading_comments = []
2✔
819
        self.value_trailing_comments = []
2✔
820
        # Document-level comments only live on the keymap[()] Location.
821
        self.header_comments = []
2✔
822
        self.footer_comments = []
2✔
823
        # Per-Location dump spacing: when non-empty, replaces the dumps()
824
        # *spacing* argument for this Location's entire subtree.  Integer
825
        # keys are relative to this Location (0 == blanks between this
826
        # Location's direct children).  See get_spacing / set_spacing.
827
        self.spacing = {}
2✔
828
        # Per-slot comment providers.  Each (if not None) is a callable
829
        # with the signature ``provider(child_key) -> list[Comment]``,
830
        # invoked by the dumper for every child of this Location.  The
831
        # returned Comments are prepended to the child's own static
832
        # comments at the matching slot.  Closures over the callable's
833
        # state can be used to dedup or build comments dynamically.
834
        self.key_leading_provider = None
2✔
835
        self.key_trailing_provider = None
2✔
836
        self.value_leading_provider = None
2✔
837
        self.value_trailing_provider = None
2✔
838

839
    def __repr__(self):
2✔
840
        components = []
2✔
841
        if self.line:
2!
842
            components.append(f"lineno={self.line.lineno}")
2✔
843
            components.append(f"colno={self.col}")
2✔
844
            key_line = self.key_line
2✔
845
            if key_line is None:
2✔
846
                key_line = self.line
2✔
847
            components.append(f"key_lineno={key_line.lineno}")
2✔
848
            key_col = self.key_col
2✔
849
            if key_col is None:
2✔
850
                key_col = self.col
2✔
851
            components.append(f"key_colno={key_col}")
2✔
852
        return f"{self.__class__.__name__}({', '.join(components)})"
2✔
853

854
    # as_tuple() {{{3
855
    def as_tuple(self, kind="value"):
2✔
856
        """
857
        Returns the location of either the value or the key token as a tuple
858
        that contains the line number and the column number.  The line and
859
        column numbers are 0 based.
860

861
        Args:
862
            kind:
863
                Specify either “key” or “value” depending on which token is
864
                desired.
865
        """
866
        if kind == "key":
2✔
867
            line = self.key_line
2✔
868
            col = self.key_col
2✔
869
            if line is None:
2✔
870
                line = self.line
2✔
871
            if col is None:
2✔
872
                col = self.col
2✔
873
        else:
874
            assert kind == "value"
2✔
875
            line = self.line
2✔
876
            col = self.col
2✔
877
        return line.lineno, col
2✔
878

879
    # as_line() {{{3
880
    def as_line(self, kind="value", offset=0):
2✔
881
        """
882
        Returns a string containing two lines that identify the token in
883
        context.  The first line contains the line number and text of the line
884
        that contains the token.  The second line contains a pointer to the
885
        token.
886

887
        Args:
888
            kind:
889
                Specify either “key” or “value” depending on which token is
890
                desired.
891
            offset:
892
                If *offset* is None, the error pointer is not added to the line.
893
                If *offset* is an integer, the pointer is moved to the right by
894
                this many characters.  The default is 0.
895
                If *offset* is a tuple, it must have two values.  The first is
896
                the row offset and the second is the column offset.  This is
897
                useful for annotating errors in multiline strings.
898

899
        Raises:
900
            *IndexError* if row offset is out of range.
901
        """
902
        # get the line and the column number of the key or value
903
        if kind == "key":
2✔
904
            line = self.key_line
2✔
905
            col = self.key_col
2✔
906
            if line is None:
2✔
907
                line = self.line
2✔
908
            if col is None:
2✔
909
                col = self.col
2✔
910
        else:
911
            assert kind == "value"
2✔
912
            line = self.line
2✔
913
            col = self.col
2✔
914

915
        if not line:  # this occurs if input is completely empty
2✔
916
            return ""
2✔
917

918
        # process the offset
919
        if offset is None:
2✔
920
            return line.render()
2✔
921
        col_offset = offset
2✔
922
        try:
2✔
923
            row_offset, col_offset = offset
2✔
924
            while row_offset > 0:
2✔
925
                line = line.next_line
2✔
926
                row_offset -= 1
2✔
927
                if line is None:
2✔
928
                    raise IndexError(offset[0])
2✔
929
        except TypeError:
2✔
930
            pass
2✔
931

932
        return line.render(col + col_offset)
2✔
933

934
    # get_line_numbers() {{{3
935
    def get_line_numbers(self, kind="value", sep=None):
2✔
936
        """
937
        Returns the line numbers of a token either as a pair of integers or as a
938
        string.
939

940
        Args:
941
            kind:
942
                Specify either “key” or “value” depending on which token is
943
                desired.
944
            sep:
945
                The separator string.
946

947
                If given, a string is returned and *sep* is inserted between two
948
                line numbers.  In this case the line numbers start at 1.
949

950
                If *sep* is not given, a tuple of integers is returned.  In this
951
                case the line numbers start at 0, but the second number returned
952
                is the last line number plus 1.  This form is suitable to use
953
                with the Python slice function to extract the lines from the
954
                *NestedText* source.
955
        """
956
        if kind == "key":
2✔
957
            line = self.key_line
2✔
958
            if line is None:
2✔
959
                line = self.line
2✔
960
        else:
961
            assert kind == "value"
2✔
962
            line = self.line
2✔
963

964
        # find line numbers
965
        first_lineno = line.lineno
2✔
966
        while line:
2✔
967
            last_lineno = line.lineno
2✔
968
            line = line.next_line
2✔
969

970
        if sep is None:
2✔
971
            return (first_lineno, last_lineno + 1)
2✔
972
        if first_lineno != last_lineno:
2✔
973
            return join(first_lineno+1, last_lineno+1, sep=sep)
2✔
974
        return str(first_lineno+1)
2✔
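
The two return forms described in the docstring can be illustrated without a loader. The numbers below are hand-picked for a value assumed to occupy the third and fourth lines of the source; they are not produced by NestedText.

```python
source = (
    "name: x\n"
    "address:\n"
    "    > line one\n"
    "    > line two\n"
    "phone: 1\n"
)
lines = source.splitlines()

# Tuple form (assumed values): 0-based first line, and last line plus 1.
first, stop = 2, 4
value_lines = lines[first:stop]

# String form: 1-based line numbers joined by the separator.
label = "-".join(str(n) for n in (first + 1, stop))
```

The half-open tuple is what makes the plain slice work, while the 1-based string is the form suited to messages shown to users.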
975

976
    # _get_original_key() {{{3
977
    def _get_original_key(self, key, strict):
2✔
978
        try:
2✔
979
            line = self.key_line
2✔
980
            if line.kind == "key item":
2✔
981
                # is multiline key (key fragment is actually held in line.text)
982
                key_frags = [line.text[line.depth+2:]]
2✔
983
                while line.next_line:
2✔
984
                    line = line.next_line
2✔
985
                    key_frags.append(line.text[line.depth+2:])
2✔
986
                key = "\n".join(key_frags)
2✔
987
            else:
988
                if line.kind != "list item":
2✔
989
                    key = line.key
2✔
990
            return key
2✔
991
        except AttributeError:
2✔
992
            # this occurs for list indexes
993
            return key
2✔
994

995
    # comment accessors {{{3
996
    # get_key_leading_comments() {{{4
997
    def get_key_leading_comments(self):
2✔
998
        """Return the leading Comments associated with this key."""
999
        return self.key_leading_comments
2✔
1000

1001
    # set_key_leading_comments {{{4
1002
    def set_key_leading_comments(self, comments):
2✔
1003
        """Replace the leading Comments for this key."""
1004
        self.key_leading_comments = list(comments)
2✔
1005

1006
    # add_key_leading_comment {{{4
1007
    def add_key_leading_comment(self, comment):
2✔
1008
        """Append a Comment to the leading list for this key."""
1009
        self.key_leading_comments.append(comment)
2✔
1010

1011
    # get_key_trailing_comments {{{4
1012
    def get_key_trailing_comments(self):
2✔
1013
        """Return Comments between the key line and the value line (multiline case)."""
1014
        return self.key_trailing_comments
2✔
1015

1016
    # set_key_trailing_comments {{{4
1017
    def set_key_trailing_comments(self, comments):
2✔
1018
        self.key_trailing_comments = list(comments)
2✔
1019

1020
    # add_key_trailing_comment {{{4
1021
    def add_key_trailing_comment(self, comment):
2✔
1022
        self.key_trailing_comments.append(comment)
2✔
1023

1024
    # get_value_leading_comments {{{4
1025
    def get_value_leading_comments(self):
2✔
1026
        """Return Comments leading the value (multiline case)."""
1027
        return self.value_leading_comments
2✔
1028

1029
    # set_value_leading_comments {{{4
1030
    def set_value_leading_comments(self, comments):
2✔
1031
        self.value_leading_comments = list(comments)
2✔
1032

1033
    # add_value_leading_comment {{{4
1034
    def add_value_leading_comment(self, comment):
2✔
1035
        self.value_leading_comments.append(comment)
2✔
1036

1037
    # get_value_trailing_comments {{{4
1038
    def get_value_trailing_comments(self):
2✔
1039
        """Return Comments trailing the value (after its last line)."""
1040
        return self.value_trailing_comments
2✔
1041

1042
    # set_value_trailing_comments {{{4
1043
    def set_value_trailing_comments(self, comments):
2✔
1044
        self.value_trailing_comments = list(comments)
2✔
1045

1046
    # add_value_trailing_comment {{{4
1047
    def add_value_trailing_comment(self, comment):
2✔
1048
        self.value_trailing_comments.append(comment)
2✔
1049

1050
    # get_header_comments {{{4
1051
    def get_header_comments(self):
2✔
1052
        """Return the document's header Comments.
1053

1054
        Header comments only ever live on the document-root Location, i.e.,
1055
        ``keymap[()]``.  On any other Location this list is empty.
1056
        """
1057
        return self.header_comments
2✔
1058

1059
    # set_header_comments {{{4
1060
    def set_header_comments(self, comments):
2✔
1061
        """Replace the document's header Comments."""
1062
        self.header_comments = list(comments)
2✔
1063

1064
    # add_header_comment {{{4
1065
    def add_header_comment(self, comment):
2✔
1066
        """Append a Comment to the document's header."""
1067
        self.header_comments.append(comment)
2✔
1068

1069
    # get_footer_comments {{{4
1070
    def get_footer_comments(self):
2✔
1071
        """Return the document's footer Comments.
1072

1073
        Footer comments only ever live on the document-root Location, i.e.,
1074
        ``keymap[()]``.  On any other Location this list is empty.
1075
        """
1076
        return self.footer_comments
2✔
1077

1078
    # set_footer_comments {{{4
1079
    def set_footer_comments(self, comments):
2✔
1080
        """Replace the document's footer Comments."""
1081
        self.footer_comments = list(comments)
2✔
1082

1083
    # add_footer_comment {{{4
1084
    def add_footer_comment(self, comment):
2✔
1085
        """Append a Comment to the document's footer."""
1086
        self.footer_comments.append(comment)
2✔
1087

1088
    # get_spacing {{{4
1089
    def get_spacing(self):
2✔
1090
        """Return the per-Location spacing dict (empty if none was set).
1091

1092
        When non-empty, this dict replaces the :func:`dumps` *spacing*
1093
        argument for this Location's entire subtree.  Integer keys count
1094
        relative depth below this Location: ``0`` is the number of blank
1095
        lines between this Location's direct children, ``1`` between its
1096
        grandchildren, and so on.  Absent depth keys default to ``0`` --
1097
        the global spacing is *not* consulted as a fallback.
1098

1099
        The ``"edges"`` key is only consulted on the document-root
1100
        Location (``keymap[()]``); it is ignored elsewhere.
1101
        """
1102
        return self.spacing
2✔
1103

1104
    # set_spacing {{{4
1105
    def set_spacing(self, spacing):
2✔
1106
        """Replace the per-Location spacing dict."""
1107
        self.spacing = dict(spacing)
2✔
1108

1109
    # get_key_leading_provider {{{4
1110
    def get_key_leading_provider(self):
2✔
1111
        """Return the per-child ``key_leading`` provider callable, if any.
1112

1113
        A provider has the signature ::
1114

1115
            provider(child_key) -> list[Comment]
1116

1117
        where *child_key* is the dict key (or list index) of one of this
1118
        Location's children.  When this Location's value is rendered the
1119
        dumper invokes the provider for every child, and prepends the
1120
        returned Comments to that child's static
1121
        :meth:`get_key_leading_comments` list.  Closures over the
1122
        callable's state can dedup or build Comments dynamically.
1123

1124
        Returned Comments whose ``tab`` is ``None`` are normalized to
1125
        ``tab=0`` at emit time (natural indent).  Providers are
1126
        callables and are *not* JSON-serializable; they are dropped on
1127
        :func:`keymap_to_jsonable` round-trips.
1128
        """
1129
        return self.key_leading_provider
2✔
1130

1131
    # set_key_leading_provider {{{4
1132
    def set_key_leading_provider(self, provider):
2✔
1133
        """Replace the per-child ``key_leading`` provider callable.
1134

1135
        Pass ``None`` to clear it.  See :meth:`get_key_leading_provider`
1136
        for the expected callable signature.
1137
        """
1138
        self.key_leading_provider = provider
2✔
1139

1140
    # get_key_trailing_provider {{{4
1141
    def get_key_trailing_provider(self):
2✔
1142
        """Return the per-child ``key_trailing`` provider; see
1143
        :meth:`get_key_leading_provider` for semantics."""
1144
        return self.key_trailing_provider
2✔
1145

1146
    # set_key_trailing_provider {{{4
1147
    def set_key_trailing_provider(self, provider):
2✔
1148
        """Replace the per-child ``key_trailing`` provider callable."""
1149
        self.key_trailing_provider = provider
2✔
1150

1151
    # get_value_leading_provider {{{4
1152
    def get_value_leading_provider(self):
2✔
1153
        """Return the per-child ``value_leading`` provider; see
1154
        :meth:`get_key_leading_provider` for semantics."""
1155
        return self.value_leading_provider
2✔
1156

1157
    # set_value_leading_provider {{{4
1158
    def set_value_leading_provider(self, provider):
2✔
1159
        """Replace the per-child ``value_leading`` provider callable."""
1160
        self.value_leading_provider = provider
2✔
1161

1162
    # get_value_trailing_provider {{{4
1163
    def get_value_trailing_provider(self):
2✔
1164
        """Return the per-child ``value_trailing`` provider; see
1165
        :meth:`get_key_leading_provider` for semantics."""
1166
        return self.value_trailing_provider
2✔
1167

1168
    # set_value_trailing_provider {{{4
1169
    def set_value_trailing_provider(self, provider):
2✔
1170
        """Replace the per-child ``value_trailing`` provider callable."""
1171
        self.value_trailing_provider = provider
2✔
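
The provider docstrings mention that closures can be used to dedup comments. A minimal sketch of that idea follows. It uses plain strings where a real provider would return Comment objects, and ``make_once_provider`` is a hypothetical helper, not part of NestedText.

```python
def make_once_provider(notes):
    """Return a provider that emits the note for each child key only once.

    notes maps child keys to the note to emit (plain strings in this sketch).
    """
    seen = set()

    def provider(child_key):
        if child_key in seen or child_key not in notes:
            return []
        seen.add(child_key)
        return [notes[child_key]]

    return provider


provider = make_once_provider({"host": "where the service listens"})
first = provider("host")    # note is emitted the first time
second = provider("host")   # suppressed on later calls
other = provider("port")    # no note registered for this key
```

A callable of this shape is what the ``set_*_provider`` methods above accept; the state lives in the closure, so the Location itself stores only the callable.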
1172

1173

1174
# Inline class {{{2
1175
class Inline:
2✔
1176
    # a recursive descent parser to interpret inline lists and dictionaries
1177

1178
    # constructor() {{{3
1179
    def __init__(self, line, keys, loader):
2✔
1180
        self.line = line
2✔
1181
        self.loader = loader
2✔
1182
        self.text = line.value
2✔
1183
        self.max_index = len(self.text)
2✔
1184
        self.starting_col = line.depth
2✔
1185
        try:
2✔
1186
            self.values, self.keymap, index = self.parse_inline_value(keys, 0)
2✔
1187
        except IndexError:
2✔
1188
            self.inline_error("line ended without closing delimiter", self.max_index)
2✔
1189
        if index < self.max_index:
2✔
1190
            extra = self.text[index:]
2✔
1191
            self.inline_error(
2✔
1192
                f"extra {plural(extra):character} after closing delimiter: ‘{{}}’.",
1193
                index, extra
1194
            )
1195
        assert index == self.max_index
2✔
1196

1197
    # parse_inline_value() {{{3
1198
    def parse_inline_value(self, keys, index, forbidden_chars=None):
2✔
1199
        if self.text[index] == "{":
2✔
1200
            return self.parse_inline_dict(keys, index)
2✔
1201
        elif self.text[index] == "[":
2✔
1202
            return self.parse_inline_list(keys, index)
2✔
1203
        else:
1204
            return self.parse_inline_str(keys, index, forbidden_chars)
2✔
1205

1206
    # parse_inline_dict() {{{3
1207
    def parse_inline_dict(self, keys, index):
2✔
1208
        starting_index = index
2✔
1209
        assert self.text[index] == "{"
2✔
1210
        index += 1
2✔
1211
        values = {}
2✔
1212
        need_another = False
2✔
1213

1214
        while self.text[index] != "}":
2✔
1215
            prev_index = index
2✔
1216
            orig_key, value, location, index = self.parse_inline_dict_item(keys, index)
2✔
1217
            key = self.loader.normalize_key(orig_key, keys)
2✔
1218
            if key in values:
2✔
1219
                key = KeyPolicy.process_duplicate(values, key, keys, self.line, prev_index)
2✔
1220
            if key is not None:
2✔
1221
                values[key] = value
2✔
1222
                self.loader._add_keymap(keys + (key,), location)
2✔
1223
            need_another = False
2✔
1224
            if self.text[index] not in ",}":
2✔
1225
                self.inline_error(
2✔
1226
                    "expected ‘,’ or ‘}}’, found ‘{}’", index, self.text[index]
1227
                )
1228
            if self.text[index] == ",":
2✔
1229
                index += 1
2✔
1230
                need_another = True
2✔
1231
        if need_another:
2✔
1232
            self.inline_error("expected value", index)
2✔
1233
        return (
2✔
1234
            values,
1235
            self.location(starting_index),
1236
            self.adjust_index(index+1)
1237
        )
1238

1239
    # parse_inline_dict_item() {{{3
1240
    def parse_inline_dict_item(self, keys, index):
2✔
1241
        forbidden_chars = "{}[],:"
2✔
1242
        key_index = self.adjust_index(index)
2✔
1243
        if self.text[index] in forbidden_chars:
2✔
1244
            key = ""
2✔
1245
        else:
1246
            key, _, index = self.parse_inline_value(keys, index, forbidden_chars)
2✔
1247
        if self.text[index] != ":":
2✔
1248
            self.inline_error(
2✔
1249
                "expected ‘:’, found ‘{}’", index, self.text[index], culprit=key
1250
            )
1251
        index = self.adjust_index(index+1)
2✔
1252
        if self.text[index] in ",}":
2✔
1253
            value = ""
2✔
1254
            loc = self.location(index)
2✔
1255
        else:
1256
            value, loc, index = self.parse_inline_value(keys, index, forbidden_chars)
2✔
1257
        self.add_key_location(loc, key_index)
2✔
1258
        return key, value, loc, index
2✔
1259

1260
    # parse_inline_list() {{{3
1261
    def parse_inline_list(self, keys, index):
2✔
1262
        forbidden_chars = "{}[],"
2✔
1263
        starting_index = index
2✔
1264
        assert self.text[index] == "["
2✔
1265
        index += 1
2✔
1266

1267
        # handle empty list
1268
        if self.text[index] == "]":
2✔
1269
            return [], self.location(starting_index), self.adjust_index(index+1)
2✔
1270

1271
        key = 0
2✔
1272
        values = []
2✔
1273
        value = ""
2✔
1274
        loc = self.location(index)
2✔
1275
        while True:
2✔
1276
            new_keys = keys + (key,)
2✔
1277
            c = self.text[index]
2✔
1278
            if c in ",]":
2✔
1279
                values.append(value)
2✔
1280
                self.loader._add_keymap(new_keys, loc)
2✔
1281
                key += 1
2✔
1282
                if c == "]":
2✔
1283
                    return (
2✔
1284
                        values,
1285
                        self.location(starting_index),
1286
                        self.adjust_index(index+1)
1287
                    )
1288
                index += 1
2✔
1289
                loc = self.location(index)
2✔
1290
                index = self.adjust_index(index)
2✔
1291
                value = ""
2✔
1292
            elif value:
2✔
1293
                self.inline_error(
2✔
1294
                    "expected ‘,’ or ‘]’, found ‘{}’", index, self.text[index]
1295
                )
1296
            elif c in "}],":
2✔
1297
                self.inline_error("expected value", index)
2✔
1298
            else:
1299
                value, loc, index = self.parse_inline_value(
2✔
1300
                    new_keys, index, forbidden_chars
1301
                )
1302

1303
    # parse_inline_str() {{{3
1304
    def parse_inline_str(self, keys, index, forbidden_chars):
2✔
1305
        starting_index = index
2✔
1306
        while self.text[index] not in forbidden_chars:
2✔
1307
            index = self.adjust_index(index+1)
2✔
1308
        value = self.text[starting_index:index].strip()
2✔
1309
        return value, self.location(starting_index), index
2✔
1310

1311
    # adjust_index() {{{3
1312
    def adjust_index(self, index):
2✔
1313
        # if desired index points to white space, shift right until it doesn’t
1314
        while index < self.max_index and self.text[index] in " \t":
2✔
1315
            index += 1
2✔
1316
        return index
2✔
1317

1318
    # location() {{{3
1319
    def location(self, index, **kwargs):
2✔
1320
        kwargs["line"] = self.line
2✔
1321
        return Location(col=index + self.starting_col, **kwargs)
2✔
1322

1323
    # add_key_location() {{{3
1324
    def add_key_location(self, loc, key_index):
2✔
1325
        loc.key_col = key_index + self.starting_col
2✔
1326

1327
    # inline_error {{{3
1328
    def inline_error(self, message, index, *args, culprit=None, **kwargs):
2✔
1329
        report(
2✔
1330
            full_stop(message),
1331
            self.line,
1332
            *args,
1333
            colno = index + self.starting_col,
1334
            culprit = culprit,
1335
            suppress_prev_line = True,
1336
            **kwargs,
1337
        )
1338

1339
    # get_values() {{{3
1340
    def get_values(self):
2✔
1341
        return self.values, self.keymap
2✔
1342

1343
    # render {{{3
1344
    def render(self, index):  # pragma: no cover
1345
        return f"❬{self.text}❭\n {index*' '}▲"
1346

1347
    # __repr__ {{{3
1348
    def __repr__(self):  # pragma: no cover
1349
        name = self.__class__.__name__
1350
        return f"{name}({self.text!r})"
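
The overall shape of a recursive descent parser for inline lists and dictionaries can be sketched in a few functions. This is a simplified illustration, not the parser above: it keeps no locations or keymap, raises ``ValueError`` rather than reporting through the loader, and makes no attempt to reproduce NestedText's rules for empty values. All function names are hypothetical.

```python
def skip_ws(text, i):
    """Advance past spaces and tabs."""
    while i < len(text) and text[i] in " \t":
        i += 1
    return i


def parse_str(text, i, stop_chars):
    start = i
    while text[i] not in stop_chars:
        i += 1
    return text[start:i].strip(), i


def parse_list(text, i):
    i = skip_ws(text, i + 1)          # step over "["
    items = []
    if text[i] == "]":
        return items, i + 1
    while True:
        value, i = parse_value(text, i, "[]{},")
        items.append(value)
        i = skip_ws(text, i)
        if text[i] == "]":
            return items, i + 1
        if text[i] != ",":
            raise ValueError(f"expected ',' or ']' at column {i}")
        i += 1


def parse_dict(text, i):
    i = skip_ws(text, i + 1)          # step over "{"
    result = {}
    if text[i] == "}":
        return result, i + 1
    while True:
        key, i = parse_str(text, i, "[]{},:")
        if text[i] != ":":
            raise ValueError(f"expected ':' at column {i}")
        value, i = parse_value(text, i + 1, "[]{},:")
        result[key] = value
        i = skip_ws(text, i)
        if text[i] == "}":
            return result, i + 1
        if text[i] != ",":
            raise ValueError(f"expected ',' or '}}' at column {i}")
        i = skip_ws(text, i + 1)


def parse_value(text, i, stop_chars):
    """Dispatch on the first character, recursing for nested collections."""
    i = skip_ws(text, i)
    if text[i] == "[":
        return parse_list(text, i)
    if text[i] == "{":
        return parse_dict(text, i)
    return parse_str(text, i, stop_chars)


def parse_inline(text):
    try:
        value, i = parse_value(text, 0, "[]{},:")
    except IndexError:
        # Running off the end of the text means a delimiter was never closed.
        raise ValueError("line ended without closing delimiter") from None
    if skip_ws(text, i) != len(text):
        raise ValueError("extra characters after closing delimiter")
    return value


nested_list = parse_inline("[a, [b, c]]")
nested_dict = parse_inline("{a: 1, b: [x, y]}")
```

As in the class above, running past the end of the text surfaces as an ``IndexError``, which is translated into a single "unterminated" error at the top level instead of being checked at every step.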
1351

1352

1353
# NestedTextLoader class {{{2
1354
class NestedTextLoader:
2✔
1355
    # __init__() {{{3
1356
    def __init__(self, lines, top, source, on_dup, keymap, normalize_key, dialect):
2✔
1357
        KeyPolicy.set_policy(on_dup)
2✔
1358
        self.source = source
2✔
1359
        self.keymap = keymap
2✔
1360
        assert self.keymap is None or is_mapping(self.keymap)
2✔
1361
        self.normalize_key = normalize_key if normalize_key else lambda k, ks: k
2✔
1362
        if dialect and "i" in dialect:
2✔
1363
            support_inlines = False
2✔
1364
        else:
1365
            support_inlines = True
2✔
1366

1367
        with set_culprit(source):
2✔
1368
            lines = self.lines = Lines(lines, support_inlines)
2✔
1369
            if keymap is not None:
2✔
1370
                # add a location for the top-level of the data set
1371
                if lines.first_value_line:
2✔
1372
                    keymap[()] = Location(line=lines.first_value_line, col=0)
2✔
1373
                else:
1374
                    keymap[()] = Location(line=lines.last_comment_line, col=0)
2✔
1375
            next_is = lines.type_of_next()
2✔
1376

1377
            if top in ["any", any]:
2✔
1378
                if next_is is None:
2✔
1379
                    self.values, self.keymap = None, None
2✔
1380
                else:
1381
                    self.values, self.keymap = self._read_value(0, ())
2✔
1382

1383
            elif top in ["dict", dict]:
2✔
1384
                if next_is in ["dict item", "key item", "inline dict"]:
2✔
1385
                    self.values, self.keymap = self._read_value(0, ())
2✔
1386
                elif next_is is None:
2✔
1387
                    self.values, self.keymap = {}, None
2✔
1388
                else:
1389
                    report(
2✔
1390
                        "content must start with key or brace ({{).",
1391
                        lines.get_next()
1392
                    )
1393

1394
            elif top in ["list", list]:
2✔
1395
                if next_is in ["list item", "inline list"]:
2✔
1396
                    self.values, self.keymap = self._read_value(0, ())
2✔
1397
                elif next_is is None:
2✔
1398
                    self.values, self.keymap = [], None
2✔
1399
                else:
1400
                    report(
2✔
1401
                        "content must start with dash (-) or bracket ([).",
1402
                        lines.get_next(),
1403
                    )
1404

1405
            elif top in ["str", str]:
2✔
1406
                if next_is == "string item":
2✔
1407
                    self.values, self.keymap = self._read_value(0, ())
2✔
1408
                elif next_is is None:
2✔
1409
                    self.values, self.keymap = "", None
2✔
1410
                else:
1411
                    report(
2✔
1412
                        "content must start with greater-than sign (>).",
1413
                        lines.get_next(),
1414
                    )
1415

1416
            else:
1417
                raise NotImplementedError(top)  # pragma: no cover
1418

1419
            if lines.type_of_next():
2✔
1420
                report('extra content', lines.get_next())
2✔
1421

1422
            # attach header / footer comments to the document-root Location
1423
            # (keymap[()]) so that every keymap key remains a tuple, keeping
1424
            # depth-based iteration (``len(keys)``) safe.
1425
            if keymap is not None and () in keymap:
2✔
1426
                root = keymap[()]
2✔
1427
                if lines.header_comments:
2✔
1428
                    root.header_comments = list(lines.header_comments)
2✔
1429
                if lines.eof_comments:
2✔
1430
                    root.footer_comments = list(lines.eof_comments)
2✔
1431

1432
    # get_decoded() {{{3
1433
    def get_decoded(self):
2✔
1434
        return self.values
2✔
1435

1436
    # # get_keymap() {{{3
1437
    # this method becomes useful when an interface that returns the loader develops
1438
    # def get_keymap(self):
1439
    #     return self.keymap
1440

1441
    # # get_source() {{{3
1442
    # this method becomes useful when an interface that returns the loader develops
1443
    # def get_source(self):
1444
    #     return self.source
1445

1446
    # # get_value() {{{3
1447
    # this method becomes useful when an interface that returns the loader develops
1448
    # def get_value(self, keys):
1449
    #     """
1450
    #     Return the value associated with a set of keys.
1451
    #     """
1452
    #     value = self.values
1453
    #     key = None
1454
    #     keys_used = ()
1455
    #     try:
1456
    #         for key in keys:
1457
    #             keys_used += (key,)
1458
    #             value = value[key]
1459
    #     except (KeyError, IndexError) as e:
1460
    #         raise NestedTextError(
1461
    #             key, template="key not found ({}).", culprit=keys_used
1462
    #         )
1463
    #     return value
1464

    # _add_keymap() {{{3
    def _add_keymap(self, keys, location):
        if self.keymap is not None:
            # The last data line consumed is the last line of this value's
            # scope, where trailing-after-value comments are staged.
            location.value_end_line = self.lines.prev_data_line
            # Claim any comments staged on the relevant Lines.  This makes
            # each Location own its comments outright -- two Locations that
            # share a Line (parent and its first child) cannot then
            # double-emit the same comment block.
            key_first = (
                location.key_line if location.key_line is not None
                else location.line
            )
            vl = location.line
            ve = location.value_end_line
            # For a multi-line key, walk the chain of fragment lines via
            # next_line (set by Lines.read_lines only between consecutive
            # ``key item`` lines at the same depth) so that comments
            # staged on later fragments are claimed too.
            key_last = key_first
            if key_first is not None:
                while (
                    getattr(key_last, "next_line", None) is not None
                    and key_last.next_line.kind == "key item"
                    and key_last.next_line.depth == key_last.depth
                ):
                    key_last = key_last.next_line
            if key_first is not None and key_first.leading_comments:
                location.key_leading_comments = list(key_first.leading_comments)
                key_first.leading_comments = []
            if (
                vl is not None
                and vl is not key_first
                and vl is not key_last
                and vl.leading_comments
            ):
                location.value_leading_comments = list(vl.leading_comments)
                vl.leading_comments = []
            # key_trailing collects (a) leading_comments staged on each
            # *intermediate* key-fragment line -- these are comments that
            # appeared between fragments of the multi-line key, the
            # multi-line-key analogue of inline-in-multi-line-string -- and
            # (b) the trailing_comments on each fragment, including the
            # last.  All are emitted at the key-trailing position.  When
            # the entire key+value is on one line (key_first == ve), the
            # trailing comments belong to value_trailing instead.
            kt = []
            # Inline-in-multi-line-key comments need to be indented past
            # the value's column so that a subsequent re-load classifies
            # them as key_trailing (rather than as value_leading on the
            # value or as a leading comment on the next sibling).  We
            # bump to value-depth + 4 -- one tabstop deeper than the
            # value's natural column at the default indent step.
            safe_inline_indent = (vl.depth + 4) if vl is not None else None
            cur = key_first
            while cur is not None:
                if cur is not key_first and cur.leading_comments:
                    if safe_inline_indent is not None:
                        for c in cur.leading_comments:
                            if c.indent <= vl.depth:
                                c.indent = safe_inline_indent
                    kt.extend(cur.leading_comments)
                    cur.leading_comments = []
                if cur is not ve and cur.trailing_comments:
                    kt.extend(cur.trailing_comments)
                    cur.trailing_comments = []
                if cur is key_last:
                    break
                cur = cur.next_line
            if kt:
                location.key_trailing_comments = kt
            if ve is not None and ve.trailing_comments:
                location.value_trailing_comments = list(ve.trailing_comments)
                ve.trailing_comments = []
            self.keymap[keys] = location

    # _read_value() {{{3
    def _read_value(self, depth, keys):
        lines = self.lines
        if lines.type_of_next() == "list item":
            return self._read_list(depth, keys)
        if lines.type_of_next() in ["dict item", "key item"]:
            return self._read_dict(depth, keys)
        if lines.type_of_next() == "string item":
            return self._read_string(depth, keys)
        if lines.type_of_next() in ["inline dict", "inline list"]:
            return self._read_inline(keys)
        unrecognized_line(lines.get_next())

    # _read_list() {{{3
    def _read_list(self, depth, keys):
        lines = self.lines
        values = []
        index = 0
        first_line = lines.next_line
        while lines.still_within_level(depth):
            line = lines.get_next()
            if line.depth != depth:
                lines.indentation_error(line, depth)
            if line.kind != "list item":
                report("expected list item.", line, colno=depth)
            new_keys = keys + (index,)
            if line.value:
                values.append(line.value)
                self._add_keymap(
                    new_keys, Location(line=line, key_col=depth, col=depth + 2)
                )
            else:
                # value may simply be empty, or it may be on next line, in which
                # case it must be indented.
                depth_of_next = lines.depth_of_next()
                if depth_of_next > depth:
                    value, loc = self._read_value(depth_of_next, new_keys)
                    loc.key_line = line
                    loc.key_col = depth
                else:
                    value = ""
                    loc = Location(line=line, key_col=depth, col=depth + 1)
                values.append(value)
                self._add_keymap(new_keys, loc)
            index += 1

        return values, Location(line=first_line, col=first_line.depth)

    # _read_dict() {{{3
    def _read_dict(self, depth, keys):
        lines = self.lines
        values = {}
        first_line = lines.next_line

        # process all items in dictionary
        while lines.still_within_level(depth):
            line = lines.get_next()
            key_line = line
            key_col = depth

            # error checking
            if line.depth != depth:
                lines.indentation_error(line, depth)
            if line.kind not in ["dict item", "key item"]:
                report("expected dictionary item.", line, colno=depth)

            # process key
            if line.kind == "key item":
                # multiline key
                original_key = self._read_key(line, depth)
                value = None
            else:
                # key and value on a single line
                original_key = line.key
                value = line.value
            key = self.normalize_key(original_key, keys)
            if key in values:
                # found duplicate key
                key = KeyPolicy.process_duplicate(values, key, keys, line, depth)
                if key is None:
                    continue
            new_keys = keys + (key,)

            # process value
            if value:
                # this is a single-line item, value was found above
                loc = Location(line=line, col=depth + len(key) + 2)
            else:
                # value is on subsequent lines
                depth_of_next = lines.depth_of_next()
                if depth_of_next > depth:
                    # read indented values
                    value, loc = self._read_value(depth_of_next, new_keys)
                elif line.kind == "dict item":
                    # found the next key in this dictionary, so value is empty
                    value = ""
                    loc = Location(line=line, col=depth + len(key) + 1)
                else:
                    report("multiline key requires a value.", line, None, colno=depth)

            values[key] = value
            loc.key_line = key_line
            loc.key_col = key_col
            self._add_keymap(new_keys, loc)
        return values, Location(line=first_line, col=first_line.depth)

    # _read_key() {{{3
    def _read_key(self, line, depth):
        lines = self.lines
        data = [line.value]
        while lines.still_within_key(line, depth):
            line = lines.get_next()
            data.append(line.value)
        return "\n".join(data)

    # _read_string() {{{3
    def _read_string(self, depth, keys):
        lines = self.lines
        data = []
        first_line = lines.next_line
        loc = Location(line=first_line, key_col=depth)
        last_line = first_line
        while lines.still_within_string(depth):
            line = lines.get_next()
            last_line = line
            data.append(line.value)
            if line.depth != depth:
                lines.indentation_error(line, depth)
        # Per the rules, inline comments (those that appear on continuation
        # lines of a multi-line string) are converted to trailing on the
        # value immediately on load.  After the loop, gather any leading
        # comments from continuation lines and move them onto the last
        # line's trailing slot, which is where the value's trailing
        # comments live (see Location.value_end_line).
        cur = first_line.next_line
        # Inline-in-multi-line-string comments need to be indented past
        # the value's column so that a subsequent re-load classifies
        # them as value_trailing (rather than as a leading comment on
        # the next sibling, or as a footer when at EOF).  Bump shallow
        # ones to value-depth + 4 -- one tabstop deeper than the value
        # at the default indent step.
        safe_inline_indent = depth + 4
        while cur is not None:
            inline = cur.leading_comments
            if inline:
                for c in inline:
                    if c.indent <= depth:
                        c.indent = safe_inline_indent
                last_line.trailing_comments = last_line.trailing_comments or []
                if cur is last_line:
                    # avoid clobbering when cur and last_line are the same
                    last_line.trailing_comments = list(inline) + last_line.trailing_comments
                else:
                    last_line.trailing_comments.extend(inline)
                cur.leading_comments = []
            cur = cur.next_line
        value = "\n".join(data)
        loc.col = depth + (2 if value else 1)
        return value, loc

    # _read_inline() {{{3
    def _read_inline(self, keys):
        lines = self.lines
        line = lines.get_next()
        return Inline(line, keys, self).get_values()


# loads {{{2
def loads(
    content,
    top = "dict",
    *,
    source = None,
    on_dup = None,
    keymap = None,
    normalize_key = None,
    dialect = None
):
    # description {{{3
    r'''
    Loads *NestedText* from string.

    Args:
        content (str):
            String that contains encoded data.

        top (str):
            Top-level data type. The NestedText format allows for a dictionary,
            a list, or a string as the top-level data container.  By specifying
            top as “dict”, “list”, or “str” you constrain both the type of
            top-level container and the return value of this function. By
            specifying “any” you enable support for all three data types, with
            the type of the returned value matching that of top-level container
            in content. As a short-hand, you may specify the *dict*, *list*,
            *str*, and *any* built-ins rather than specifying *top* with a
            string.

        source (str or Path):
            If given, this string is attached to any error messages as the
            culprit. It is otherwise unused. It is often the name of the file
            that originally contained the NestedText content.

        on_dup (str or func):
            Indicates how duplicate keys in dictionaries should be handled.
            Specifying "error" causes them to raise exceptions (the default
            behavior). Specifying "ignore" causes them to be ignored (first
            wins). Specifying "replace" results in them replacing earlier items
            (last wins). By specifying a function, the keys can be
            de-duplicated.  This call-back function returns a new key and takes
            two arguments:

            key:
                The new key (duplicates an existing key).

            state:
                A dictionary containing other possibly helpful information:

                dictionary:
                    The entire dictionary as it is at the moment the duplicate
                    key is found.  You should not change it.
                keys:
                    The keys that identify the dictionary.

                This dictionary is created as *loads* is called and deleted as
                it returns. Any values placed in it are retained and available
                on subsequent calls to this function during the load operation.

            This function should return a new key.  If the key duplicates an
            existing key, the value associated with that key is replaced.  If
            *None* is returned, this key is ignored.  If a *KeyError* is
            raised, the duplicate key is reported as an error.

            Be aware that key de-duplication occurs after key normalization.  As
            such you should generate keys during de-duplication that are
            consistent with your normalization scheme.

        keymap (dict):
            Specify an empty dictionary or nothing at all for the value of
            this argument.  If you give an empty dictionary it will be filled
            with location information for the values that are returned.  Upon
            return the dictionary maps a tuple containing the keys for the value
            of interest to the location of that value in the *NestedText* source
            document. The location is contained in a :class:`Location` object.
            You can access the line and column number using the
            :meth:`Location.as_tuple` method, and the line that contains the
            value annotated with its location using the :meth:`Location.as_line`
            method.

        normalize_key (func):
            A function that takes two arguments: the original key for a value
            and the tuple of normalized keys for its parent values.  It then
            transforms the given key into the desired normalized form.  Only
            called on dictionary keys, so the key will always be a string.

        dialect (str):
            Specifies support for particular variations in *NestedText*.

            In general you are discouraged from using a dialect as it can result
            in *NestedText* documents that are not compliant with the standard.

            The following deviant dialects are supported.

            *support inlines*:
                If "i" is included in *dialect*, support for inline lists and
                dictionaries is dropped.  The default is "I", which enables
                support for inlines.  The main effect of disabling inlines in
                the load functions is that keys may begin with ``[`` or ``{``.

    Returns:
        The extracted data.  The type of the return value is specified by the
        top argument.  If top is “any”, then the return value will match that of
        the top-level data container in the input content.  If content is
        empty, an empty data value of the type specified by top is returned;
        if top is “any”, None is returned.

    Raises:
        NestedTextError: if there is a problem in the *NestedText* document.

    Examples:

        A *NestedText* document is specified to *loads* in the form of a string:

        .. code-block:: python

            >>> import nestedtext as nt

            >>> contents = """
            ... name: Kristel Templeton
            ... gender: female
            ... age: 74
            ... """

            >>> try:
            ...     data = nt.loads(contents, "dict")
            ... except nt.NestedTextError as e:
            ...     e.terminate()

            >>> print(data)
            {'name': 'Kristel Templeton', 'gender': 'female', 'age': '74'}

        *loads()* takes an optional argument, *source*. If specified, it is
        added to any error messages. It is often used to designate the source
        of the *NestedText* document. For example, if *contents* were read from
        a file, *source* would be the file name.  Here is a typical example of
        reading *NestedText* from a file:

        .. code-block:: python

            >>> filename = "examples/duplicate-keys.nt"
            >>> try:
            ...     with open(filename, encoding="utf-8") as f:
            ...         addresses = nt.loads(f.read(), source=filename)
            ... except nt.NestedTextError as e:
            ...     print(e.render())
            examples/duplicate-keys.nt, 5: duplicate key: name.
                   4 ❬name:❭
                   5 ❬name:❭
                      ▲

        Notice in the above example the encoding is explicitly specified as
        "utf-8".  *NestedText* files should always be read and written using
        *utf-8* encoding.

        The following examples demonstrate the various ways of handling
        duplicate keys:

        .. code-block:: python

            >>> content = """
            ... key: value 1
            ... key: value 2
            ... key: value 3
            ... name: value 4
            ... name: value 5
            ... """

            >>> print(nt.loads(content))
            Traceback (most recent call last):
            ...
            nestedtext.nestedtext.NestedTextError: 3: duplicate key: key.
                   2 ❬key: value 1❭
                   3 ❬key: value 2❭
                      ▲

            >>> print(nt.loads(content, on_dup="ignore"))
            {'key': 'value 1', 'name': 'value 4'}

            >>> print(nt.loads(content, on_dup="replace"))
            {'key': 'value 3', 'name': 'value 5'}

            >>> def de_dup(key, state):
            ...     if key not in state:
            ...         state[key] = 1
            ...     state[key] += 1
            ...     return f"{key} — #{state[key]}"

            >>> print(nt.loads(content, on_dup=de_dup))
            {'key': 'value 1', 'key — #2': 'value 2', 'key — #3': 'value 3', 'name': 'value 4', 'name — #2': 'value 5'}

    '''

    # code {{{3
    if isinstance(content, bytes):
        content = content.decode('utf-8-sig', errors='strict')
    f = io.StringIO(content, newline=None)
    loader = NestedTextLoader(
        f, top, source, on_dup, keymap, normalize_key, dialect
    )
    return loader.get_decoded()

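# Illustrative sketch (not part of nestedtext): the on_dup callback contract
# described in the loads() docstring, exercised against a hypothetical
# stand-in for the loader.  insert_with_on_dup() and the sample pairs are
# assumptions made for this example; the real loader also normalizes keys
# and reports duplicates as NestedTextError rather than ValueError.

```python
def insert_with_on_dup(pairs, on_dup):
    """Hypothetical stand-in: build a dict, consulting on_dup for duplicates."""
    values = {}
    state = {}  # shared across callback invocations within one "load"
    for key, value in pairs:
        if key in values:
            state["dictionary"] = values  # callback should treat as read-only
            state["keys"] = ()            # keys that identify this dictionary
            try:
                key = on_dup(key, state)
            except KeyError:
                raise ValueError(f"duplicate key: {key}.") from None
            if key is None:
                continue                  # callback asked to drop this item
        values[key] = value               # a repeated key replaces the old value
    return values


def de_dup(key, state):
    # Same counting scheme as the doctest in the loads() docstring.
    if key not in state:
        state[key] = 1
    state[key] += 1
    return f"{key} #{state[key]}"


pairs = [
    ("key", "value 1"), ("key", "value 2"), ("key", "value 3"),
    ("name", "value 4"), ("name", "value 5"),
]
renamed = insert_with_on_dup(pairs, de_dup)
first_wins = insert_with_on_dup(pairs, lambda key, state: None)
last_wins = insert_with_on_dup(pairs, lambda key, state: key)
```

# Returning None reproduces the "ignore" policy and returning the key
# unchanged reproduces "replace"; de_dup keeps every value under a new key.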

# load {{{2
def load(
    f,
    top = "dict",
    *,
    source = None,
    on_dup = None,
    keymap = None,
    normalize_key = None,
    dialect = None
):
    # description {{{3
    r"""
    Loads *NestedText* from file or stream.

    Is the same as :func:`loads` except the *NestedText* is accessed by reading
    a file rather than directly from a string. It does not keep the full
    contents of the file in memory and so is more memory efficient with large
    files.

    Args:
        f (str, os.PathLike, io.TextIOBase, collections.abc.Iterator):
            The file to read the *NestedText* content from.  This can be
            specified either as a path (e.g. a string or a `pathlib.Path`),
            as a text IO object (e.g. an open file, or 0 for stdin), or as an
            iterator.  If a path is given, the file will be opened, read, and
            closed.  If an IO object is given, it will be read and not closed;
            utf-8 encoding should be used.  If an iterator is given, it should
            generate full lines in the same manner that iterating on a file
            descriptor would.
        kwargs:
            See :func:`loads` for optional arguments.

    Returns:
        The extracted data.
        See :func:`loads` description of the return value.

    Raises:
        NestedTextError: if there is a problem in the *NestedText* document.
        OSError: if there is a problem opening the file.

    Examples:

        Load from a path specified as a string:

        .. code-block:: python

            >>> import nestedtext as nt
            >>> print(open("examples/groceries.nt").read())
            groceries:
              - Bread
              - Peanut butter
              - Jam
            <BLANKLINE>

            >>> nt.load("examples/groceries.nt")
            {'groceries': ['Bread', 'Peanut butter', 'Jam']}

        Load from a `pathlib.Path`:

        .. code-block:: python

            >>> from pathlib import Path
            >>> nt.load(Path("examples/groceries.nt"))
            {'groceries': ['Bread', 'Peanut butter', 'Jam']}

        Load from an open file object:

        .. code-block:: python

            >>> with open("examples/groceries.nt") as f:
            ...     nt.load(f)
            ...
            {'groceries': ['Bread', 'Peanut butter', 'Jam']}

    """

    # code {{{3
    # Do not invoke the read method as that would read in the entire contents of
    # the file, possibly consuming a lot of memory. Instead pass the file
    # pointer into the loader; it will iterate through the lines, discarding
    # them once they are no longer needed, which reduces the memory usage.

    if isinstance(f, collections.abc.Iterator):
        if not source:
            source = getattr(f, "name", None)
        loader = NestedTextLoader(
            f, top, source, on_dup, keymap, normalize_key, dialect
        )
        return loader.get_decoded()
    else:
        if not source:
            if f == 0:
                source = '<stdin>'
            else:
                source = str(f)
        with open(f, encoding="utf-8-sig") as fp:
            loader = NestedTextLoader(
                fp, top, source, on_dup, keymap, normalize_key, dialect
            )
            return loader.get_decoded()

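# Illustrative sketch (not part of nestedtext): load() picks its branch with
# an isinstance() test against collections.abc.Iterator.  The helper
# classify_load_argument() is hypothetical and only reports which branch a
# given argument would take; it never opens or reads anything.

```python
import collections.abc
import io
from pathlib import Path


def classify_load_argument(f):
    """Hypothetical helper mirroring the isinstance() test in load()."""
    if isinstance(f, collections.abc.Iterator):
        return "iterate"   # consumed line by line, neither opened nor closed
    return "open"          # handed to open(): a path or a file descriptor


# An open text stream and a generator of lines are both iterators.
stream_branch = classify_load_argument(io.StringIO("key: value\n"))
generator_branch = classify_load_argument(line for line in ["key: value\n"])

# Paths and the integer 0 (stdin) are not iterators, so they are opened.
path_branch = classify_load_argument(Path("examples/groceries.nt"))
stdin_branch = classify_load_argument(0)

# A plain list is iterable but not an iterator; wrapping it changes the branch.
list_branch = classify_load_argument(["key: value\n"])
wrapped_branch = classify_load_argument(iter(["key: value\n"]))
```

# Under this test a bare list of lines would reach open() rather than being
# iterated, so passing iter(lines) looks like the safer way to supply
# in-memory lines.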

# NestedText Writer {{{1
# Converts Python data hierarchies to NestedText.

# add_leader {{{2
def add_leader(s, leader):
    # split into separate lines
    # add leader to each non-blank line
    # add right-stripped leader to each blank line
    #
    # When the leader is pure indentation (only spaces), comment lines (those
    # whose first non-space character is '#') are passed through unchanged --
    # they already carry their own absolute indentation from
    # _comments_to_lines, and re-indenting them at deeper levels would put
    # them out of position.  When the leader contains content like "> "
    # (multi-line string syntax) the leader is always applied -- the input
    # is user data being converted into string items, not pre-rendered
    # comments.
    indent_only = (leader.lstrip(" ") == "")
    rejoined = []
    for line in s.split("\n"):
        if indent_only and line and line.lstrip(" ")[:1] == "#":
            rejoined.append(line)
        elif line:
            rejoined.append(leader + line)
        else:
            rejoined.append(leader.rstrip())
    return "\n".join(rejoined)

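# Illustrative sketch: the three cases handled by add_leader().  The function
# is restated compactly here only so the block runs on its own; the sample
# strings are assumptions chosen for the example.

```python
def add_leader(s, leader):
    # Compact restatement of the function above, for a standalone demo.
    indent_only = leader.lstrip(" ") == ""
    rejoined = []
    for line in s.split("\n"):
        if indent_only and line and line.lstrip(" ")[:1] == "#":
            rejoined.append(line)             # pre-rendered comment: untouched
        elif line:
            rejoined.append(leader + line)    # ordinary line: leader prepended
        else:
            rejoined.append(leader.rstrip())  # blank line: no trailing spaces
    return "\n".join(rejoined)


# Pure-indentation leader: blank lines stay empty.
indented = add_leader("first\n\nsecond", "    ")

# Leader with content ("> "): always applied, even to lines starting with '#'.
quoted = add_leader("line 1\n\n# still data", "  > ")

# Pure-indentation leader: comment lines keep their own indentation.
mixed = add_leader("  # note\nkey: value", "    ")
```

# The distinction matters because text behind a "> " leader is user data,
# whereas comment lines under a spaces-only leader were positioned earlier.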

# add_prefix {{{2
def add_prefix(prefix, suffix):
    # A simple formatting of dict and list items will result in a space
    # after the colon or dash if the value is placed on the next line.
    # This function simply eliminates that space.
    if not suffix or suffix.startswith("\n"):
        return prefix + suffix
    return prefix + " " + suffix

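# Illustrative sketch: how add_prefix() avoids a trailing space after the
# colon or dash.  The function is restated so the block runs on its own; the
# sample keys and values are assumptions chosen for the example.

```python
def add_prefix(prefix, suffix):
    # Restatement of the function above, for a standalone demo.
    if not suffix or suffix.startswith("\n"):
        return prefix + suffix
    return prefix + " " + suffix


same_line = add_prefix("key:", "value")            # value follows on the same line
next_line = add_prefix("key:", "\n  - a\n  - b")   # value starts on the next line
empty_item = add_prefix("-", "")                   # empty value
```

# Only the same-line case gains a separating space; the other two leave the
# prefix without trailing whitespace.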

# grow {{{2
# add object to end of a tuple
def grow(base, ext):
    return base + (ext,)


# NestedTextDumper class {{{2
class NestedTextDumper:
    # constructor {{{3
    def __init__(
        self,
        indent,
        sort_keys,
        converters,
        default,
        spacing,
        map_keys,
        width,
        inline_level,
        inline_count,
        dialect,
    ):
        assert indent > 0
        self.indent = indent
        self.converters = converters
        self.map_keys = map_keys
        self.default = default
        self.spacing = spacing or {}
        self.width = width
        self.inline_level = inline_level
        self.inline_count = inline_count
        self.support_inlines = True
        if dialect and "i" in dialect:
            self.support_inlines = False

        # define key sorting function {{{4
        if sort_keys:
            if callable(sort_keys):
                def sort(items, keys):
                    return sorted(items, key=lambda k: sort_keys(k, keys))
            else:
                def sort(items, keys):
                    return sorted(items)
        else:
            def sort(items, keys):
                return items
        self.sort = sort

        # define object type identification functions {{{4
        if default == "strict":
            self.is_a_dict = lambda obj: isinstance(obj, dict)
            self.is_a_list = lambda obj: isinstance(obj, list)
            self.is_a_str = lambda obj: isinstance(obj, str)
            self.is_a_scalar = lambda obj: False
        else:
            self.is_a_dict = is_mapping
            self.is_a_list = is_collection
            self.is_a_str = is_str
            self.is_a_scalar = lambda obj: obj is None or isinstance(obj, (bool, int, float))
            if is_str(default):
                raise NotImplementedError(default)  # pragma: no cover

    # render_key {{{3
    def render_key(self, key, keys):
        key = self.convert(key, keys)
        if self.is_a_scalar(key):
            if key is None:
                key = ""
            else:
                key = str(key)
        if not self.is_a_str(key) and callable(self.default):
            key = self.default(key)
        if not self.is_a_str(key):
            raise NestedTextError(
                key, template="keys must be strings.", culprit=keys
            ) from None
        return convert_line_terminators(key)

    # render_dict_item {{{3
    def render_dict_item(self, key, value, keys, values):
        multiline_key_required = (
            not key
            or "\n" in key
            or key.strip() != key
            or key[:1] == "#"
            or (key[:1] in "[{" and self.support_inlines)
            or key[:2] in ["- ", "> ", ": "]
            or ": " in key
        )
        # The key_trailing and value_leading comment slots only have a
        # rendering position in the multi-line dict-item *value* form
        # (between the key line and the value's first line).  If either
        # slot has any contribution -- static or via a parent provider --
        # force the value onto its own line so those comments don't get
        # silently dropped.
        force_multiline_value = (
            not multiline_key_required
            and self._comments_force_multiline(keys)
        )
        if multiline_key_required:
            key = "\n".join(": "+l if l else ":" for l in key.split("\n"))
            if self.is_a_dict(value) or self.is_a_list(value):
                return key + self.render_value(value, keys, values)
            if is_str(value):
                # force use of multiline value with multiline keys
                value = convert_line_terminators(value)
            else:
                value = self.render_value(value, keys, values)
            return key + "\n" + add_leader(value, self.indent*" " + "> ")
        if force_multiline_value:
            # Plain "key:" syntax, but force the value onto its own line
            # so key_trailing / value_leading have a place to render.
            if self.is_a_dict(value) or self.is_a_list(value):
                return key + ":" + self.render_value(value, keys, values)
            if is_str(value):
                value_text = convert_line_terminators(value)
            else:
                value_text = self.render_value(value, keys, values)
            return key + ":\n" + add_leader(value_text, self.indent*" " + "> ")
        return add_prefix(key + ":", self.render_value(value, keys, values))
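    # Illustrative sketch (not part of the class): the multiline-key test
    # used by render_dict_item(), pulled out as free functions.  The names
    # needs_multiline_key() and render_multiline_key() are hypothetical, and
    # support_inlines stands in for self.support_inlines.

```python
def needs_multiline_key(key, support_inlines=True):
    """Hypothetical helper: True if the key cannot use plain "key:" syntax."""
    return bool(
        not key                                   # empty key
        or "\n" in key                            # spans several lines
        or key.strip() != key                     # leading/trailing whitespace
        or key[:1] == "#"                         # would read as a comment
        or (key[:1] in "[{" and support_inlines)  # would read as an inline
        or key[:2] in ["- ", "> ", ": "]          # would read as another item
        or ": " in key                            # ambiguous key/value split
    )


def render_multiline_key(key):
    """Hypothetical helper: prefix every key line with ':' as the method does."""
    return "\n".join(": " + l if l else ":" for l in key.split("\n"))


plain = needs_multiline_key("name")
padded = needs_multiline_key(" name")
bracket_with_inlines = needs_multiline_key("[x]")
bracket_without_inlines = needs_multiline_key("[x]", support_inlines=False)
embedded_tag = needs_multiline_key("a: b")
rendered = render_multiline_key("first\n\nsecond")
```

    # Disabling inlines is the only setting that changes the outcome here: a
    # key such as "[x]" then stays on a single "key:" line.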

    # _comments_force_multiline {{{3
    def _comments_force_multiline(self, keys):
        """Return True if any source -- static key_trailing/value_leading
        on this Location, or a parent provider for either slot -- will
        contribute Comments that need the multi-line dict-item form.
        """
        if not is_mapping(self.map_keys):
            return False
        loc = self.map_keys.get(keys)
        if loc is not None:
            if loc.get_key_trailing_comments() or loc.get_value_leading_comments():
                return True
        if keys:
            parent_loc = self.map_keys.get(keys[:-1])
            if parent_loc is not None:
                if (
                    parent_loc.get_key_trailing_provider() is not None
                    or parent_loc.get_value_leading_provider() is not None
                ):
                    return True
        return False

2198
    # _inline_would_drop_comments {{{3
2199
    def _inline_would_drop_comments(self, keys):
2✔
2200
        """Return True if rendering the collection at *keys* inline would
2201
        silently drop a comment.
2202

2203
        The inline forms (``{...}`` / ``[...]``) emit no comments for the
2204
        collection's own value slots nor for any of its descendants, so
2205
        inlining is only safe when neither carries any comment.  Comments
2206
        on the collection's *key* (the ``key_leading``/``key_trailing``
2207
        slots at *keys* itself) are still emitted by the parent, so they
2208
        do not force multi-line here; only value-side comments at *keys*
2209
        and any comment (or comment provider) at a strict descendant do.
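
        A minimal, self-contained sketch of the strict-descendant test
        described above.  It is not the library's code: the keymap is
        replaced by a plain dict (an assumption made for illustration) that
        maps each key path to True when that location carries a comment, and
        ``has_commented_descendant`` is a hypothetical name.

```python
# Hypothetical stand-in for a keymap: key path (tuple) -> True when that
# location carries a comment.  The real keymap maps paths to Location objects.
def has_commented_descendant(commented_paths, keys):
    depth = len(keys)
    for path, has_comment in commented_paths.items():
        # Skip the entry at *keys* itself and anything outside its subtree.
        if len(path) <= depth or path[:depth] != keys:
            continue
        if has_comment:
            return True
    return False


commented = {
    ("server",): False,
    ("server", "ports"): False,
    ("server", "ports", 0): True,
    ("client",): False,
}
print(has_commented_descendant(commented, ("server",)))             # True
print(has_commented_descendant(commented, ("client",)))             # False
print(has_commented_descendant(commented, ("server", "ports", 0)))  # False
```

        The entry at *keys* itself is skipped in this loop because it is
        examined separately, where only its value-side slots matter.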
        """
        if not is_mapping(self.map_keys):
            return False
        loc = self.map_keys.get(keys)
        if loc is not None:
            if loc.get_value_leading_comments() or loc.get_value_trailing_comments():
                return True
            if (
                loc.get_key_leading_provider() is not None
                or loc.get_key_trailing_provider() is not None
                or loc.get_value_leading_provider() is not None
                or loc.get_value_trailing_provider() is not None
            ):
                return True
        depth = len(keys)
        for path, loc in self.map_keys.items():
            if len(path) <= depth or path[:depth] != keys:
                continue
            if (
                loc.get_key_leading_comments()
                or loc.get_key_trailing_comments()
                or loc.get_value_leading_comments()
                or loc.get_value_trailing_comments()
                or loc.get_key_leading_provider() is not None
                or loc.get_key_trailing_provider() is not None
                or loc.get_value_leading_provider() is not None
                or loc.get_value_trailing_provider() is not None
            ):
                return True
        return False

    # render_inline_value {{{3
    def render_inline_value(self, obj, exclude, keys, values):
        obj = self.convert(obj, keys)
        if self.is_a_dict(obj):
            return self.render_inline_dict(obj, keys, values)
        if self.is_a_list(obj):
            return self.render_inline_list(obj, keys, values)
        return self.render_inline_scalar(obj, exclude, keys, values)

    # render_inline_dict {{{3
    def render_inline_dict(self, obj, keys, values):
        exclude = set("\n\r[]{}:,")
        rendered = []
        for k, v in obj.items():
            new_keys = grow(keys, k)
            new_values = grow(values, id(v))
            key = self.render_key(k, new_keys)
            mapped_key = self.map_key(key, new_keys)
            v = obj[k]
            rendered_value = self.render_inline_value(
                v, exclude, new_keys, new_values
            )
            rendered_key = self.render_inline_scalar(
                mapped_key, exclude, new_keys, new_values
            )
            rendered.append((mapped_key, key, f"{rendered_key}: {rendered_value}",))
        items = [v for mk, k, v in self.sort(rendered, keys)]
        return ''.join(["{", ", ".join(items), "}"])

    # render_inline_list {{{3
    def render_inline_list(self, obj, keys, values):
        rendered_values = []
        for i, v in enumerate(obj):
            rendered_value = self.render_inline_value(
                v, set("\n\r[]{},"), grow(keys, i), grow(values, id(v))
            )
            rendered_values.append(rendered_value)
        if len(rendered_values) == 1 and not rendered_values[0]:
            return "[ ]"
        content = ", ".join(rendered_values)
        leading_delimiter = "[ " if content[0:1] == "," else "["
        return leading_delimiter + content + "]"

    # render_inline_scalar {{{3
    def render_inline_scalar(self, obj, exclude, keys, values):
        obj = self.convert(obj, keys)
        if self.is_a_str(obj):
            value = obj
        elif self.is_a_scalar(obj):
            value = "" if obj is None else str(obj)
        elif self.default and callable(self.default):
            try:
                obj = self.default(obj)
            except TypeError:
                raise NestedTextError(
                    obj,
                    template = f"unsupported type ({type(obj).__name__}).",
                    culprit = keys
                ) from None
            return self.render_inline_value(obj, exclude, keys, values)
        else:
            raise NotSuitableForInline from None

        if exclude & set(value):
            raise NotSuitableForInline from None
        if value.strip() != value:
            raise NotSuitableForInline from None
        return value

    # _comments_to_lines {{{3
    def _comments_to_lines(self, comments, natural=0):
        """Render a list of Comment objects to a list of lines (no trailing \\n).

        *natural* is the natural indent (in spaces) for this slot at the
        current depth -- used to resolve any Comment whose ``tab`` field
        is not None to an absolute indent of
        ``natural + tab * self.indent`` (clamped to >= 0).  Comments whose
        ``tab`` is None render at their stored ``indent`` field absolutely.

        Per-comment ``before`` / ``after`` blank-line counts are honored.
        Adjacent same-indent Comments are emitted contiguously; if such a
        list is re-loaded the loader will merge them into a single
        Comment (text joined by ``\\n``).  Text and slot assignment are
        preserved; only the Comment-object granularity may change.
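
        The indent rule above can be sketched in isolation.  This is an
        illustration under stated assumptions, not the method itself: a
        Comment is modelled as plain arguments rather than the library's
        Comment class, the comment text is passed as a list of lines, and
        ``resolve_indent`` / ``comment_lines`` are hypothetical names.

```python
# Illustrative only: these helpers are not part of the library.

def resolve_indent(natural, tab, stored_indent, indent_step=4):
    # A relative tab wins when present: it is scaled by the indent step,
    # added to the natural indent, and clamped so it never goes negative.
    if tab is not None:
        return max(0, natural + tab * indent_step)
    # Otherwise the stored absolute indent is used as-is.
    return stored_indent


def comment_lines(text_lines, abs_indent):
    # Empty text lines become a bare "#" so no trailing space is emitted.
    ind = " " * abs_indent
    return [f"{ind}# {line}" if line else f"{ind}#" for line in text_lines]


print(resolve_indent(natural=8, tab=0, stored_indent=2))     # 8
print(resolve_indent(natural=8, tab=-1, stored_indent=2))    # 4
print(resolve_indent(natural=4, tab=-3, stored_indent=2))    # 0 (clamped)
print(resolve_indent(natural=8, tab=None, stored_indent=2))  # 2
print(comment_lines(["first", "", "last"], 4))
# ['    # first', '    #', '    # last']
```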
        """
        lines = []
        for c in comments:
            if c.tab is not None:
                abs_indent = max(0, natural + c.tab * self.indent)
            else:
                abs_indent = c.indent
            for _ in range(c.before):
                lines.append("")
            if c.text is not None:
                ind = " " * abs_indent
                for line in c.text.split("\n"):
                    if line:
                        lines.append(f"{ind}# {line}")
                    else:
                        lines.append(f"{ind}#")
            for _ in range(c.after):
                lines.append("")
        return lines

    # _resolve_spacing {{{3
    def _resolve_spacing(self, keys):
        """Determine the blank-line count for joining sibling items whose
        shared parent is at *keys* (absolute depth ``len(keys)``).

        Walks the keymap from the innermost prefix to the root looking for
        a Location with a non-empty ``spacing`` dict.  The first one found
        replaces the global *spacing* argument wholesale for that subtree:
        ``spacing.get(N - len(A), 0)`` where ``A`` is that Location's keys
        and ``N`` is the absolute depth.  Falls back to the global
        ``self.spacing[N]`` only when no Location in the walk has a
        non-empty spacing.
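
        The walk can be pictured with plain dictionaries.  The sketch below
        is an assumption-laden stand-in, not the method itself:
        ``local_spacing`` is a hypothetical dict that maps a key path to
        that location's spacing dict (playing the role of
        ``Location.spacing`` on a keymap), and ``resolve_spacing`` is a
        hypothetical free function.

```python
def resolve_spacing(local_spacing, global_spacing, keys):
    depth = len(keys)
    # Try the innermost prefix first, then walk outward to the root ().
    for i in range(depth, -1, -1):
        sp = local_spacing.get(keys[:i])
        if sp:
            # Depth is measured relative to the location that owns the
            # spacing; the global mapping is not consulted at all.
            return sp.get(depth - i, 0)
    # No location along the walk carries spacing: use the global mapping.
    return global_spacing.get(depth, 0) if global_spacing else 0


local_spacing = {("servers",): {0: 2, 1: 1}}
global_spacing = {0: 1}

print(resolve_spacing(local_spacing, global_spacing, ()))                           # 1
print(resolve_spacing(local_spacing, global_spacing, ("servers",)))                 # 2
print(resolve_spacing(local_spacing, global_spacing, ("servers", "web")))           # 1
print(resolve_spacing(local_spacing, global_spacing, ("servers", "web", "ports")))  # 0
print(resolve_spacing(local_spacing, global_spacing, ("clients",)))                 # 0
```

        The fourth call returns 0 even though a global mapping exists,
        because the local spacing found at ``("servers",)`` replaces the
        global one for the whole subtree.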
        """
        depth = len(keys)
        if is_mapping(self.map_keys):
            for i in range(depth, -1, -1):
                loc = self.map_keys.get(keys[:i])
                if loc is not None:
                    sp = getattr(loc, "spacing", None)
                    if sp:
                        return sp.get(depth - i, 0)
        return self.spacing.get(depth, 0) if self.spacing else 0

    # _join_items {{{3
    def _join_items(self, items, keys):
        """Join rendered sibling items with the configured *spacing*.

        ``keys`` is the path of the parent whose children are being joined;
        the per-Location keymap spacing (if any) at any prefix of ``keys``
        takes precedence over the global ``self.spacing`` (the dumps()
        *spacing* argument).
        """
        n = self._resolve_spacing(keys)
        return ("\n" + "\n" * n).join(items)

    # _wrap_with_comments {{{3
    def _wrap_with_comments(self, rendered_value, keys):
        """Inject the four comment slots from the keymap around the rendered item.

        - ``key_leading_comments``  emitted before the rendered item.
        - ``key_trailing_comments`` injected between the key line and the
          value's first line (multi-line value form only).
        - ``value_leading_comments`` injected at the same position, after
          ``key_trailing_comments`` (multi-line value form only).
        - ``value_trailing_comments`` emitted after the rendered item.

        For each slot the dumper also consults the *parent* Location
        (``keymap[keys[:-1]]``) for a per-child *provider* callable.  If
        present, the provider is invoked as ``provider(child_key)`` and
        the returned Comments are prepended to the child's static
        comments at that slot.  Comments returned by a provider with
        ``tab=None`` are normalized to ``tab=0``.

        Only applies when ``map_keys`` is a keymap dict (the form
        returned by load).  When ``map_keys`` is a callable (key
        transformer), there are no comments to apply.
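
        How the ``key_trailing`` / ``value_leading`` lines find their place
        between the key and the value can be sketched on its own.  This is a
        simplified illustration, not the method: it works on a list of
        already-rendered lines, and ``key_line_count`` and ``inject`` are
        hypothetical names.  The method itself only injects when the
        rendered item spans more than one line.

```python
def key_line_count(value_lines):
    # Count the leading lines that look like multi-line key fragments
    # (":" or ": text") sharing one indent.  A plain "key:" line matches
    # neither pattern, so the count falls back to 1.
    boundary = 0
    key_indent = None
    for line in value_lines:
        stripped = line.lstrip()
        if not (stripped == ":" or stripped.startswith(": ")):
            break
        indent = len(line) - len(stripped)
        if key_indent is None:
            key_indent = indent
        elif indent != key_indent:
            break
        boundary += 1
    return boundary or 1


def inject(value_lines, comment_lines):
    # Splice the comment lines between the key lines and the value lines.
    b = key_line_count(value_lines)
    return value_lines[:b] + comment_lines + value_lines[b:]


print(inject(["name:", "    > Kristel"], ["    # note"]))
# ['name:', '    # note', '    > Kristel']
print(inject([": first", ": second", "    > value"], ["    # note"]))
# [': first', ': second', '    # note', '    > value']
```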
        """
        if not is_mapping(self.map_keys):
            return rendered_value
        loc = self.map_keys.get(keys)
        parent_loc = self.map_keys.get(keys[:-1]) if keys else None

        # gather child's static comments (if any) {{{4
        if loc is not None:
            kl = list(loc.get_key_leading_comments())
            kt = list(loc.get_key_trailing_comments())
            vl = list(loc.get_value_leading_comments())
            vt = list(loc.get_value_trailing_comments())
        else:
            kl, kt, vl, vt = [], [], [], []

        # prepend parent's per-child provider output (if any) {{{4
        if parent_loc is not None and keys:
            child_key = keys[-1]
            for getter, target in (
                (parent_loc.get_key_leading_provider,  kl),
                (parent_loc.get_key_trailing_provider, kt),
                (parent_loc.get_value_leading_provider, vl),
                (parent_loc.get_value_trailing_provider, vt),
            ):
                provider = getter()
                if provider is None:
                    continue
                extras = list(provider(child_key) or [])
                for c in extras:
                    if c.tab is None:
                        c.tab = 0
                target[:0] = extras

        if not (kl or kt or vl or vt):
            return rendered_value

        n = len(keys)
        key_natural = max(0, (n - 1) * self.indent)
        val_natural = n * self.indent
        leading = self._comments_to_lines(kl, natural=key_natural)
        key_trailing = self._comments_to_lines(kt, natural=val_natural)
        value_leading = self._comments_to_lines(vl, natural=val_natural)
        trailing = self._comments_to_lines(vt, natural=val_natural)
        value_lines = rendered_value.split("\n")
        # Inject key_trailing and value_leading between the rendered key
        # (which may span several lines for multi-line keys) and the
        # value's first line.  We detect the key's line count by counting
        # consecutive leading lines that look like multi-line key
        # fragments (``: frag`` or ``:`` after lstrip) at the *same*
        # indent.  If no multi-line-key prefix is present, the key is the
        # first line (e.g. ``key:``).  Inline values (single-line output)
        # don't get key_trailing / value_leading -- those are forced into
        # multi-line by ``_comments_force_multiline``, which is what
        # ensures we never silently drop them here.
        if (key_trailing or value_leading) and len(value_lines) > 1:
            boundary = 0
            key_indent = None
            for line in value_lines:
                stripped = line.lstrip()
                if not (stripped == ":" or stripped.startswith(": ")):
                    break
                indent = len(line) - len(stripped)
                if key_indent is None:
                    key_indent = indent
                elif indent != key_indent:
                    break
                boundary += 1
            if boundary == 0:
                boundary = 1
            value_lines = (
                value_lines[:boundary]
                + key_trailing
                + value_leading
                + value_lines[boundary:]
            )
        return "\n".join(leading + value_lines + trailing)

    # render value {{{3
    def render_value(self, obj, keys, values):
        level = len(keys)
        error = None
        content = ""
        obj = self.convert(obj, keys)
        need_indented_block = is_collection(obj)

        if self.is_a_dict(obj):
            self.check_for_cyclic_reference(obj, keys, values)
            try:
                if not self.support_inlines:
                    raise NotSuitableForInline from None
                if obj and (self.width <= 0 or level < self.inline_level):
                    raise NotSuitableForInline from None
                if self._inline_would_drop_comments(keys):
                    raise NotSuitableForInline from None
                try:
                    if 0 < len(obj) < self.inline_count:
                        raise NotSuitableForInline from None
                    if obj and len(obj) > self.width/5:
                        raise NotSuitableForInline from None
                except TypeError:
                    pass  # does not have len()
                content = self.render_inline_dict(obj, keys, values)
                if obj and (len(content) > self.width):
                    raise NotSuitableForInline from None
            except NotSuitableForInline:
                rendered = []
                for k, v in obj.items():
                    new_keys = grow(keys, k)
                    new_values = grow(values, id(v))
                    key = self.render_key(k, new_keys)
                    mapped_key = self.map_key(key, new_keys)
                    rendered_value = self.render_dict_item(
                        mapped_key, obj[k], new_keys, new_values
                    )
                    rendered_value = self._wrap_with_comments(rendered_value, new_keys)
                    rendered.append((mapped_key, key, rendered_value))
                content = self._join_items(
                    [v for mk, k, v in self.sort(rendered, keys)], keys
                )
        elif self.is_a_list(obj):
            self.check_for_cyclic_reference(obj, keys, values)
            try:
                if not self.support_inlines:
                    raise NotSuitableForInline from None
                if obj and (self.width <= 0 or level < self.inline_level):
                    raise NotSuitableForInline from None
                if self._inline_would_drop_comments(keys):
                    raise NotSuitableForInline from None
                try:
                    if 0 < len(obj) < self.inline_count:
                        raise NotSuitableForInline from None
                    if obj and (self.width <= 0 or len(obj) > self.width/3):
                        raise NotSuitableForInline from None
                except TypeError:
                    pass  # does not have len()
                content = self.render_inline_list(obj, keys, values)
                if obj and (len(content) > self.width):
                    raise NotSuitableForInline from None
            except NotSuitableForInline:
                content = []
                for i, v in enumerate(obj):
                    new_keys = grow(keys, i)
                    rendered_v = self.render_value(v, new_keys, grow(values, id(v)))
                    item = add_prefix("-", rendered_v)
                    item = self._wrap_with_comments(item, new_keys)
                    content.append(item)
                content = self._join_items(content, keys)

        elif self.is_a_str(obj):
            text = convert_line_terminators(obj)
            if "\n" in text or level == 0:
                content = add_leader(text, "> ")
                need_indented_block = True
            else:
                content = text
        elif self.is_a_scalar(obj):
            if obj is None:
                content = ""
            else:
                content = str(obj)
                if level == 0:
                    content = add_leader(content, "> ")
                    need_indented_block = True
        elif self.default and callable(self.default):
            try:
                obj = self.default(obj)
            except TypeError:
                error = f"unsupported type ({type(obj).__name__})."
            else:
                content = self.render_value(obj, keys, values)
        else:
            error = f"unsupported type ({type(obj).__name__})."

        if need_indented_block and content and level:
            content = "\n" + add_leader(content, self.indent*" ")

        if error:
            raise NestedTextError(obj, template=error, culprit=keys) from None

        return content

    # check_for_cyclic_reference {{{3
    def check_for_cyclic_reference(self, obj, keys, values):
        if id(obj) in values[:-1]:
            raise NestedTextError(
                obj, template="circular reference.", culprit=keys
            )

    # convert {{{3
    # apply externally supplied converter to convert value to string
    def convert(self, obj, keys):
        converters = self.converters
        converter = getattr(obj, "__nestedtext_converter__", None)
        converter = converters.get(type(obj)) if converters else converter
        if converter:
            try:
                return converter(obj)
            except TypeError:
                # is bound method
                return converter()
        elif converter is False:
            raise NestedTextError(
                obj,
                template = f"unsupported type ({type(obj).__name__}).",
                culprit = keys,
            ) from None
        return obj

    # map_key {{{3
    # apply externally supplied mapping to convert key to desired form
    def map_key(self, key, keys):
        mapper = self.map_keys
        if not mapper:
            return key
        if callable(mapper):
            new_key = mapper(key, keys[:-1])
            if new_key is None:
                return key
            return new_key
        elif is_mapping(mapper):
            try:
                loc = mapper.get(keys)
                if loc:
                    return loc._get_original_key(key, strict=False)
                else:
                    return key
            except AttributeError:    # pragma: no cover
                raise AssertionError(
                    "if map_keys is a dictionary, it must be a keymap"
                ) from None
        else:  # pragma: no cover
            raise AssertionError(
                "map_keys must be a callable or a dictionary"
            ) from None


# dumps {{{2
def dumps(
    obj,
    *,
    indent = 4,
    sort_keys = False,
    converters = None,
    default = None,
    spacing = None,
    map_keys = None,
    width = 0,
    inline_level = 0,
    inline_count = 1,
    dialect = None,
):
    # description {{{3
    '''Recursively convert object to *NestedText* string.

    Args:
        obj:
            The object to convert to *NestedText*.

        indent (int):
            The number of spaces to use to represent a single level of
            indentation.  Must be one or greater.

        sort_keys (bool or func):
            Dictionary items are sorted by their key if *sort_keys* is *True*.
            In this case, keys at all levels are sorted alphabetically.  If
            *sort_keys* is *False*, the natural order of dictionaries is
            retained.

            If a function is passed in, it is expected to return the sort key.
            The function is passed two tuples, each consisting only of strings.
            The first contains the mapped key, the original key, and the
            rendered item.  So it takes the form::

                ('<mapped_key>', '<orig_key>', '<mapped_key>: <value>')

            The second contains the keys of the parent.

        converters (dict):
            A dictionary where the keys are types and the values are converter
            functions (functions that take an object and return it in a form
            that can be processed by *NestedText*).  If a value is False, an
            unsupported type error is raised.

            An object may provide its own converter by defining the
            ``__nestedtext_converter__`` attribute.  It may be False, or it may
            be a method that converts the object to a supported data type for
            *NestedText*.  A matching converter specified in the *converters*
            argument dominates over this attribute.

        default (func or “strict”):
            The default converter. Use to convert otherwise unrecognized objects
            to a form that can be processed. If not provided, an error will be
            raised for unsupported data types. Typical values are *repr* or
            *str*. If “strict” is specified then only dictionaries, lists,
            strings, and those types that have converters are allowed. If
            *default* is not specified then a broader collection of value types
            is supported, including *None*, *bool*, *int*, *float*, and *list*-
            and *dict*-like objects.  In this case Booleans are rendered as
            “True” and “False” and None is rendered as an empty string.  If
            *default* is a function, it acts as the default converter.  If
            it raises a TypeError, the value is reported as an
            unsupported type.

        spacing (dict):
            A mapping that controls vertical spacing in the rendered output.

            Integer keys specify the minimum number of blank lines between
            sibling items at that depth.  ``spacing={0: 1}`` requests one blank
            line between top-level items; ``spacing={0: 2, 1: 1}`` requests two
            blank lines between top-level items and one between items at the
            first nested level.  Depths not present in the mapping default to
            zero.

            The special key ``"edges"`` is the number of blank lines between
            the document's header comments and the first data item, and
            between the last data item and the document's footer comments.
            One number applies to both.  Defaults to zero.

        map_keys (func or keymap):
            This argument is used to modify the way keys are rendered, and,
            when it is a keymap, to preserve comments and blank-line spacing
            on round trip.

            It may be a keymap that was created by :func:`load` or
            :func:`loads`, in which case keys are rendered into their original
            form, before any normalization or de-duplication was performed by
            the load functions.  In addition, any comments captured by the
            loader and stored on the keymap are re-emitted around their
            associated keys.  Document-level header and footer comments are
            stored on the root Location (``keymap[()]``) and emitted at the
            top and bottom of the document.

            It may also be a function that takes two arguments: the key after
            any needed conversion has been performed, and the tuple of parent
            keys.  The value returned is used as the key and so must be a
            string.  If no value is returned, the key is not modified.

        width (int):
            Enables inline lists and dictionaries if greater than zero and if
            the resulting line would be no longer than the given width.

        inline_level (int):
            Recursion depth must be equal to this value or greater to be
            eligible for inlining.

        inline_count (int):
            The minimum number of items required of a dictionary or list to be
            eligible for inlining.

        dialect (str):
            Specifies support for particular variations in *NestedText*.

            In general you are discouraged from using a dialect as it can result
            in *NestedText* documents that are not compliant with the standard.

            The following deviant dialects are supported.

            *support inlines*:
                If "i" is included in *dialect*, support for inline lists and
                dictionaries is dropped.  The default is "I", which enables
                support for inlines.  The main effect of disabling inlines in
                the dump functions is that empty lists and dictionaries are
                output using an empty value, which is normally interpreted by
                *NestedText* as an empty string.

    Returns:
        The *NestedText* content without a trailing newline.  *NestedText* files
        should end with a newline, but unlike :func:`dump`, this function does
        not output that newline.  You will need to explicitly add that newline
        when writing the output of :func:`dumps` to a file.

    Raises:
        NestedTextError: if there is a problem in the input data.

    Examples:

        .. code-block:: python

            >>> import nestedtext as nt

            >>> data = {
            ...     "name": "Kristel Templeton",
            ...     "gender": "female",
            ...     "age": "74",
            ... }

            >>> try:
            ...     print(nt.dumps(data))
            ... except nt.NestedTextError as e:
            ...     print(str(e))
            name: Kristel Templeton
            gender: female
            age: 74

        The *NestedText* format only supports dictionaries, lists, and strings.
        By default, *dumps* is configured to be rather forgiving, so it will
        render many of the base Python data types, such as *None*, *bool*,
        *int*, *float* and list-like types such as *tuple* and *set* by
        converting them to the types supported by the format.  This implies that
        a round trip through *dumps* and *loads* could result in the types of
        values being transformed. You can restrict *dumps* to only supporting
        the native types of *NestedText* by passing `default="strict"` to
        *dumps*.  Doing so means that values that are not dictionaries, lists,
        or strings generate exceptions.

        .. code-block:: python

            >>> data = {"key": 42, "value": 3.1415926, "valid": True}

            >>> try:
            ...     print(nt.dumps(data))
            ... except nt.NestedTextError as e:
            ...     print(str(e))
            key: 42
            value: 3.1415926
            valid: True

            >>> try:
            ...     print(nt.dumps(data, default="strict"))
            ... except nt.NestedTextError as e:
            ...     print(str(e))
            key: unsupported type (int).

2824
        Alternatively, you can specify a function to *default*, which is used
2825
        to convert values to recognized types.  It is used if no suitable
2826
        converter is available.  Typical values are *str* and *repr*.
2827

2828
        .. code-block:: python
2829

2830
            >>> class Color:
2831
            ...     def __init__(self, color):
2832
            ...         self.color = color
2833
            ...     def __repr__(self):
2834
            ...         return f"Color({self.color!r})"
2835
            ...     def __str__(self):
2836
            ...         return self.color
2837

2838
            >>> data["house"] = Color("red")
2839
            >>> print(nt.dumps(data, default=repr))
2840
            key: 42
2841
            value: 3.1415926
2842
            valid: True
2843
            house: Color('red')
2844

2845
            >>> print(nt.dumps(data, default=str))
2846
            key: 42
2847
            value: 3.1415926
2848
            valid: True
2849
            house: red

        If *Color* is consistently used with *NestedText*, you can include the
        converter in *Color* itself.

        .. code-block:: python

            >>> class Color:
            ...     def __init__(self, color):
            ...         self.color = color
            ...     def __nestedtext_converter__(self):
            ...         return self.color.title()

            >>> data["house"] = Color("red")
            >>> print(nt.dumps(data))
            key: 42
            value: 3.1415926
            valid: True
            house: Red

        You can also specify a dictionary of converters. The dictionary maps the
        object type to a converter function.

        .. code-block:: python

            >>> class Info:
            ...     def __init__(self, **kwargs):
            ...         self.__dict__ = kwargs

            >>> converters = {
            ...     bool: lambda b: "yes" if b else "no",
            ...     int: hex,
            ...     float: lambda f: f"{f:0.3}",
            ...     Color: lambda c: c.color,
            ...     Info: lambda i: i.__dict__,
            ... }

            >>> data["attributes"] = Info(readable=True, writable=False)

            >>> try:
            ...     print(nt.dumps(data, converters=converters))
            ... except nt.NestedTextError as e:
            ...     print(str(e))
            key: 0x2a
            value: 3.14
            valid: yes
            house: red
            attributes:
                readable: yes
                writable: no

        The above example shows that *Color* in the *converters* argument
        takes precedence over the ``__nestedtext_converter__`` method.

        If the dictionary maps a type to *None*, then the default behavior is
        used for that type. If it maps to *False*, then an exception is raised.

        .. code-block:: python

            >>> converters = {
            ...     bool: lambda b: "yes" if b else "no",
            ...     int: hex,
            ...     float: False,
            ...     Color: lambda c: c.color,
            ...     Info: lambda i: i.__dict__,
            ... }

            >>> try:
            ...     print(nt.dumps(data, converters=converters))
            ... except nt.NestedTextError as e:
            ...     print(str(e))
            value: unsupported type (float).

        *converters* need not actually change the type of a value; it may simply
        transform the value.  In the following example, *converters* is used to
        transform dictionaries by removing empty dictionary fields.  It also
        converts dates and physical quantities to strings.

        .. code-block:: python

            >>> import arrow
            >>> from inform import cull
            >>> import quantiphy

            >>> class Dollars(quantiphy.Quantity):
            ...     units = "$"
            ...     form = "fixed"
            ...     prec = 2
            ...     strip_zeros = False
            ...     show_commas = True

            >>> converters = {
            ...     dict: cull,
            ...     arrow.Arrow: lambda d: d.format("D MMMM YYYY"),
            ...     quantiphy.Quantity: lambda q: str(q)
            ... }

            >>> transaction = dict(
            ...     date = arrow.get("2013-05-07T22:19:11.363410-07:00"),
            ...     description = "Incoming wire from Publisher’s Clearing House",
            ...     debit = 0,
            ...     credit = Dollars(12345.67)
            ... )

            >>> print(nt.dumps(transaction, converters=converters))
            date: 7 May 2013
            description: Incoming wire from Publisher’s Clearing House
            credit: $12,345.67

        Both *default* and *converters* may be used together. *converters* has
        priority over the built-in types and *default*.  When a function is
        specified as *default*, it is always applied as a last resort.

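The precedence rules described above can be sketched in plain Python.  This is
an illustration only, not the library's implementation: ``resolve``,
``SUPPORTED`` and ``UnsupportedType`` are hypothetical names, matching
*converters* entries with ``isinstance`` in dictionary order is an assumption,
and so is the position of ``__nestedtext_converter__`` ahead of the built-in
handling.  Only a single value is converted; nested values are ignored.

```python
SUPPORTED = (dict, list, str)  # the types NestedText holds natively


class UnsupportedType(Exception):
    # Hypothetical stand-in for the error raised on unsupported values.
    pass


def resolve(value, converters=None, default=None):
    # 1. converters has the highest priority.  Taking the first entry that
    #    matches with isinstance() is an assumption made for this sketch.
    for kind, converter in (converters or {}).items():
        if isinstance(value, kind):
            if converter is False:
                raise UnsupportedType(type(value).__name__)
            if converter is not None:
                return converter(value)
            break  # None: fall through to the default behavior
    # 2. A converter carried by the object itself (position assumed).
    method = getattr(value, "__nestedtext_converter__", None)
    if callable(method):
        return method()
    # 3. Natively supported types pass through unchanged.
    if isinstance(value, SUPPORTED):
        return value
    # 4. Forgiving built-in handling, reduced here to a few scalar types.
    if default != "strict" and isinstance(value, (bool, int, float)):
        return str(value)
    # 5. A function given as default is the last resort.
    if callable(default):
        return default(value)
    raise UnsupportedType(type(value).__name__)


class Color:
    def __init__(self, color):
        self.color = color

    def __nestedtext_converter__(self):
        return self.color.title()


print(resolve(Color("red")))                              # Red
print(resolve(Color("red"), {Color: lambda c: c.color}))  # red
print(resolve(42, {int: hex}))                            # 0x2a
print(resolve(3.14, default=repr))                        # 3.14
```

The last call shows why a function given as *default* rarely sees the simple
scalar types: in this sketch they are handled before *default* is consulted.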
        Use the *map_keys* argument to format the keys as you wish.  For
        example, you may wish to render the keys at the first level of hierarchy
        in upper case:

        .. code-block:: python

            >>> def map_keys(key, parent_keys):
            ...     if len(parent_keys) == 0:
            ...         return key.upper()

            >>> print(nt.dumps(transaction, converters=converters, map_keys=map_keys))
            DATE: 7 May 2013
            DESCRIPTION: Incoming wire from Publisher’s Clearing House
            CREDIT: $12,345.67

        It can also be used to map the keys back to their original form when
        round-tripping a dataset that uses key normalization or key
        de-duplication:

        .. code-block:: python

            >>> content = """
            ... Michael Jordan:
            ...     occupation: basketball player
            ... Michael Jordan:
            ...     occupation: actor
            ... Michael Jordan:
            ...     occupation: football player
            ... """

            >>> def de_dup(key, state):
            ...     if key not in state:
            ...         state[key] = 1
            ...     state[key] += 1
            ...     return f"{key}  ⟪#{state[key]}⟫"

            >>> keymap = {}
            >>> people = nt.loads(content, dict, on_dup=de_dup, keymap=keymap)
            >>> print(nt.dumps(people))
            Michael Jordan:
                occupation: basketball player
            Michael Jordan  ⟪#2⟫:
                occupation: actor
            Michael Jordan  ⟪#3⟫:
                occupation: football player

            >>> print(nt.dumps(people, map_keys=keymap))
            Michael Jordan:
                occupation: basketball player
            Michael Jordan:
                occupation: actor
            Michael Jordan:
                occupation: football player

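The round trip above rests on two pieces: a function that renames duplicate
keys, and a record of what each renamed key was originally.  The following is
a plain-Python sketch of that idea with no *NestedText* involved.
``load_pairs`` and ``restore_pairs`` are hypothetical helpers, and the flat
``originals`` dictionary only stands in for the richer keymap the library
builds.

```python
def de_dup(key, state):
    # Same renaming rule as the example above: the second occurrence of a
    # key gets the suffix ⟪#2⟫, the third ⟪#3⟫, and so on.
    if key not in state:
        state[key] = 1
    state[key] += 1
    return f"{key}  ⟪#{state[key]}⟫"


def load_pairs(pairs):
    # Build a dict from (key, value) pairs, renaming duplicate keys.
    data, originals, state = {}, {}, {}
    for key, value in pairs:
        new_key = de_dup(key, state) if key in data else key
        data[new_key] = value
        originals[new_key] = key  # remember the key as first written
    return data, originals


def restore_pairs(data, originals):
    # Map every key back to its original spelling.
    return [(originals[key], value) for key, value in data.items()]


pairs = [
    ("Michael Jordan", "basketball player"),
    ("Michael Jordan", "actor"),
    ("Michael Jordan", "football player"),
]
data, originals = load_pairs(pairs)
print(list(data))
print(restore_pairs(data, originals) == pairs)  # True
```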
    '''

    # code {{{3
    dumper = NestedTextDumper(
        indent, sort_keys, converters, default, spacing,
        map_keys, width, inline_level, inline_count, dialect
    )
    content = dumper.render_value(obj, (), ())

    # prepend header / append footer comments when map_keys is a keymap dict
    # carrying a document-root Location.  The blank-line gap between header
    # and body (and between body and footer) is taken from "edges" in the
    # root Location's spacing if it has any spacing, else from the spacing
    # argument; a missing "edges" entry means one blank line.
    if is_mapping(map_keys):
        root = map_keys.get(())
        header = root.get_header_comments() if root is not None else None
        footer = root.get_footer_comments() if root is not None else None
        root_spacing = root.get_spacing() if root is not None else None
        if root_spacing:
            edge_blanks = root_spacing.get("edges", 1)
        else:
            edge_blanks = (spacing or {}).get("edges", 1)
        edge_sep = "\n" + ("\n" * edge_blanks)
        if header:
            header_lines = dumper._comments_to_lines(header, natural=0)
            if header_lines:
                rendered = "\n".join(header_lines)
                content = rendered + edge_sep + content if content else rendered
        if footer:
            footer_lines = dumper._comments_to_lines(footer, natural=0)
            if footer_lines:
                rendered = "\n".join(footer_lines)
                content = content + edge_sep + rendered if content else rendered

    return content


# dump {{{2
def dump(obj, dest, **kwargs):
    # description {{{3
    """Write the *NestedText* representation of the given object to the given file.

    Args:
        obj:
            The object to convert to *NestedText*.
        dest (str, os.PathLike, io.TextIOBase):
            The file to write the *NestedText* content to.  The file can be
            specified either as a path (e.g. a string or a ``pathlib.Path``) or
            as a text IO instance (e.g. an open file, or 1 for stdout).  If a
            path is given, the file will be opened, written, and closed.  If an
            IO object is given, it must have been opened in a mode that allows
            writing (e.g.  ``open(path, "w")``), if applicable.  It will be
            written and not closed.

            The name used for the file is arbitrary but it is traditional to
            use a .nt suffix.  If you also wish to further distinguish the file
            type by giving the schema, it is recommended that you use two
            suffixes, with the suffix that specifies the schema given first and
            .nt given last. For example: flicker.sig.nt.
        kwargs:
            See :func:`dumps` for optional arguments.

    Returns:
        None.  The *NestedText* content is written to *dest* with a trailing
        newline.  This differs from :func:`dumps`, which returns the content
        without a trailing newline.

    Raises:
        NestedTextError: if there is a problem in the input data.
        OSError: if there is a problem opening the file.

    Examples:

        This example writes to an open file.

        .. code-block:: python

            >>> import nestedtext as nt
            >>> from inform import fatal, os_error

            >>> data = {
            ...     "name": "Kristel Templeton",
            ...     "gender": "female",
            ...     "age": "74",
            ... }

            >>> try:
            ...     with open("data.nt", "w", encoding="utf-8") as f:
            ...         nt.dump(data, f)
            ... except nt.NestedTextError as e:
            ...     e.terminate()
            ... except OSError as e:
            ...     fatal(os_error(e))

        This example writes to a file specified by file name.  In general, the
        file name and extension are arbitrary. However, by convention a
        ‘.nt’ suffix is generally used for *NestedText* files.

        .. code-block:: python

            >>> try:
            ...     nt.dump(data, "data.nt")
            ... except nt.NestedTextError as e:
            ...     e.terminate()
            ... except OSError as e:
            ...     fatal(os_error(e))

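How one *dest* argument can stand for a text stream, a binary stream, or a
path may be easier to see in isolation.  The sketch below uses only the
standard library; ``write_text`` is a hypothetical helper that mirrors the
try-then-fall-back idea, not a function provided by *NestedText*.

```python
import io
import os
import tempfile


def write_text(content, dest):
    # Write content plus a trailing newline to a stream or a path.
    text = content + "\n"
    try:
        dest.write(text)  # text stream: the common case
        return
    except TypeError:
        binary = True  # stream opened in binary mode
    except AttributeError:
        binary = False  # no write() method: treat dest as a path
    # Retry outside the except clauses so that a failure here is not
    # reported as a chained exception.
    if binary:
        dest.write(text.encode("utf-8"))
    else:
        with open(dest, "w", encoding="utf-8") as f:
            f.write(text)


text_stream, byte_stream = io.StringIO(), io.BytesIO()
write_text("key: value", text_stream)
write_text("key: value", byte_stream)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.nt")
    write_text("key: value", path)
    with open(path, encoding="utf-8") as f:
        from_path = f.read()

print(text_stream.getvalue() == from_path)                  # True
print(byte_stream.getvalue() == from_path.encode("utf-8"))  # True
```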
    """

    # code {{{3
    content = dumps(obj, **kwargs)

    try:
        dest.write(content + "\n")
    except (AttributeError, TypeError) as e:
        # Avoid nested try-except blocks, since they lead to chained exceptions
        # (e.g. if the file isn’t found, etc.) that unnecessarily complicate the
        # stack trace.
        exception = e
    else:
        return

    if isinstance(exception, TypeError):
        # file may be binary, encode in utf8 and try again
        dest.write((content + "\n").encode('utf8'))
    else:
        # dest is a file name rather than a file pointer
        with open(dest, "w", encoding="utf-8") as f:
            f.write(content + "\n")


# NestedText Utilities {{{1
# Extras that are useful when using NestedText.

# get_keys {{{2
def get_keys(keys, keymap, *, original=True, strict=True, sep=None):
    # description {{{3
    '''
    Returns a key sequence given a normalized key sequence.

    Keys in the dataset output by the load functions are referred to as
    normalized keys, even though no key normalization may have occurred.  This
    distinguishes them from the original keys, which are the keys given in the
    NestedText document read by the load functions.  The original keys are
    mapped to normalized keys by the *normalize_key* argument to the load
    function.  If normalization is not performed, the normalized keys are
    the same as the original keys.

    By default this function returns the original key sequence that corresponds
    to *keys*, a normalized key sequence.

    Args:
        keys:
            The sequence of normalized keys that identify a value in the
            dataset.
        keymap:
            The keymap returned from :func:`load` or :func:`loads`.
        original:
            If true, return keys as originally given in the NestedText document
            (pre-normalization). Otherwise return keys as they exist in the
            dataset (post-normalization).  The value of this argument has no
            effect if the keys were not normalized.
        strict:
            *strict* controls what happens if the given keys are not found in
            *keymap*.

            The various options can be helpful when reporting errors on key
            sequences that do not exist in the dataset.  Since they are not in
            the dataset, the original keys are not available.

            True or "error":
                A *KeyError* is raised.
            False or "all":
                All keys given in *keys* are returned.
            "found":
                Only the initial available keys are returned.
            "missing":
                Only the missing final keys are returned.

            When returning keys, the initial available keys are converted to
            their original form if *original* is true.  The missing keys are
            always returned as given.

        sep:
            A join string.  If given, the keys are joined with *sep* into a
            single string before being returned.

    Returns:
        A tuple containing the desired keys if *sep* is not given.
        A string formed by joining the keys with *sep* if *sep* is given.

    Examples:

        .. code-block:: python

            >>> import nestedtext as nt

            >>> contents = """
            ... Names:
            ...     Given: Fumiko
            ... """

            >>> def normalize_key(key, keys):
            ...     return key.lower()

            >>> data = nt.loads(contents, "dict", normalize_key=normalize_key, keymap=(keymap:={}))

            >>> print(nt.get_keys(("names", "given"), keymap))
            ('Names', 'Given')

            >>> print(nt.get_keys(("names", "given"), keymap, sep="❭"))
            Names❭Given

            >>> print(nt.get_keys(("names", "given"), keymap, original=False))
            ('names', 'given')

            >>> keys = nt.get_keys(("names", "surname"), keymap, strict=True)
            Traceback (most recent call last):
            ...
            KeyError: ('names', 'surname')

            >>> print(nt.get_keys(("names", "surname"), keymap, strict="found"))
            ('Names',)

            >>> print(nt.get_keys(("names", "surname"), keymap, strict="missing"))
            ('surname',)

            >>> print(nt.get_keys(("names", "surname"), keymap, strict="all"))
            ('Names', 'surname')

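The four *strict* behaviors can be compared side by side with a small
stand-alone sketch.  ``select_keys`` is a hypothetical function, and
``originals`` is a simplified stand-in for a keymap: it maps each normalized
key tuple directly to the original spelling of its final key.  The sketch
stops at the first missing key, which assumes that every prefix of a known
key tuple is itself known.

```python
def select_keys(keys, originals, strict=True):
    # Return keys with the found prefix mapped to its original spelling.
    keys = tuple(keys)
    found = []
    for i in range(len(keys)):
        prefix = keys[:i + 1]
        if prefix not in originals:
            break
        found.append(originals[prefix])
    found = tuple(found)
    missing = keys[len(found):]  # returned as given, never translated
    if missing and strict in (True, "error"):
        raise KeyError(keys)
    if strict == "found":
        return found
    if strict == "missing":
        return missing
    return found + missing  # False or "all"


originals = {("names",): "Names", ("names", "given"): "Given"}
print(select_keys(("names", "given"), originals))
print(select_keys(("names", "surname"), originals, strict="found"))
print(select_keys(("names", "surname"), originals, strict="missing"))
print(select_keys(("names", "surname"), originals, strict="all"))
```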
    '''

    # code {{{3
    assert strict in [True, False, "missing", "error", "all", "found"], strict
    if type(keys) is not tuple:
        keys = tuple(keys)

    to_return = ()
    for i in range(len(keys)):
        try:
            loc = keymap[tuple(keys[:i+1])]
            key = loc._get_original_key(keys[i], strict) if original else keys[i]
            if strict != "missing":
                to_return += key,
        except (KeyError, IndexError):
            if strict in [True, "error"]:
                raise
            if strict != "found":
                to_return += keys[i],
    if sep:
        return sep.join(str(k) for k in to_return)
    return to_return


# get_value {{{2
def get_value(data, keys):
    # description {{{3
    '''
    Get value from keys.

    Args:
        data:
            The dataset as returned by :func:`load` or :func:`loads`.
        keys:
            The sequence of normalized keys that identify a value in the
            dataset.

    Returns:
        The value that corresponds to a tuple of keys from a keymap.

    Examples:

        .. code-block:: python

            >>> import nestedtext as nt

            >>> contents = """
            ... names:
            ...     given: Fumiko
            ...     surname: Purvis
            ... """

            >>> data = nt.loads(contents, "dict")

            >>> nt.get_value(data, ("names", "given"))
            'Fumiko'

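The lookup amounts to indexing one level at a time, which also works when some
levels are lists and the corresponding keys are integers.  A minimal
stand-alone sketch follows; ``lookup`` is a hypothetical name and the data is
made up for the illustration.

```python
from functools import reduce


def lookup(data, keys):
    # Follow keys into nested dictionaries and lists, one level at a time.
    def step(node, key):
        try:
            return node[key]
        except TypeError:
            # e.g. a string key applied to a list or to a leaf string
            raise KeyError(key) from None

    return reduce(step, keys, data)


data = {"names": {"given": "Fumiko", "aliases": ["Fumi", "Miko"]}}
print(lookup(data, ("names", "given")))       # Fumiko
print(lookup(data, ("names", "aliases", 1)))  # Miko
```

An empty key sequence returns the data unchanged, since there are no levels
to descend.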
    '''

    # code {{{3
    for key in keys:
        try:
            data = data[key]
        except TypeError:
            raise KeyError(key)
    return data


# get_line_numbers {{{2
def get_line_numbers(keys, keymap, kind="value", *, strict=True, sep=None):
    # description {{{3
    '''
    Get line numbers from normalized key sequence.

    This function returns the line numbers of the key or value selected by
    *keys*.  It is used when reporting an error in a value that is possibly a
    multiline string.  If the location contained in a keymap were used, the user
    would only see the line number of the first line, which may confuse some
    users into believing the error is actually contained in the first line.
    Using this function gives both the starting and ending line number so the
    user focuses on the whole string and not just the first line.  This only
    happens for multiline keys and multiline strings.

    If *sep* is given, either one line number or both the beginning and ending line
    numbers are returned, joined with the separator. In this case the line numbers
    start from line 1.

    If *sep* is not given, the line numbers are returned as a pair of integers
    suitable for use as arguments to the Python slice function (see example):
    the beginning line number and 1 plus the ending line number.  In this case
    the line numbers start at 0.

    If *keys* corresponds to a composite value (a dictionary or list), the
    line on which it ends cannot be easily determined, so the value is treated
    as if it consists of a single line.  This is considered tolerable as it is
    expected that this function is primarily used to return the line number of
    leaf values, which are always strings.

    Args:
        keys:
            The sequence of normalized keys that identify a value in the
            dataset.
        keymap:
            The keymap returned from :func:`load` or :func:`loads`.
        kind:
            Specify either “key” or “value” depending on which token is
            desired.
        strict:
            If *strict* is true, a *KeyError* is raised if *keys* is not found.
            Otherwise the line number returned is that of the composite value
            that would contain *keys* if it existed.  This composite value
            corresponds to the longest sequence of keys that does actually exist
            in the dataset.
        sep:
            The separator string. If given, a string is returned and *sep* is
            inserted between two line numbers.  Otherwise a tuple is returned.

    Raises:
        KeyError:
            If keys are not in *keymap* and *strict* is true.

    Example:
        >>> import nestedtext as nt

        >>> doc = """
        ... key:
        ...     > this is line 1
        ...     > this is line 2
        ...     > this is line 3
        ... """

        >>> data = nt.loads(doc, keymap=(keymap:={}))
        >>> keys = ("key",)
        >>> lines = nt.get_line_numbers(keys, keymap, sep="-")
        >>> text = doc.splitlines()
        >>> print(
        ...     f"Lines {lines}:",
        ...     *text[slice(*nt.get_line_numbers(keys, keymap))],
        ...     sep="\\n"
        ... )
        Lines 3-5:
            > this is line 1
            > this is line 2
            > this is line 3

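The two numbering conventions describe the same lines.  The sketch below
converts the 0-based, end-exclusive pair into the 1-based, inclusive label.
``describe`` is a hypothetical helper, and the pair ``(2, 5)`` is simply the
value consistent with the example above rather than output taken from the
library.

```python
def describe(pair, sep="-"):
    # pair is 0-based and end-exclusive, as accepted by slice().
    first, last = pair[0] + 1, pair[1]
    # A single line collapses to one number.
    return str(first) if first == last else f"{first}{sep}{last}"


lines = [
    "",
    "key:",
    "    > this is line 1",
    "    > this is line 2",
    "    > this is line 3",
]
pair = (2, 5)
print(describe(pair))            # 3-5
print(len(lines[slice(*pair)]))  # 3
print(describe((1, 2)))          # 2
```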
    '''

    # code {{{3
    loc = get_location(keys, keymap)
    if not loc:
        if strict:
            raise KeyError(keys)
        else:
            found = get_keys(keys, keymap, original=False, strict="found")
            loc = keymap[found]
    return loc.get_line_numbers(kind, sep)


# get_location {{{2
def get_location(keys, keymap):
    # description {{{3
    '''
    Returns :class:`Location` information from the keys.
    None is returned if location is unknown.

    Args:
        keys:
            The sequence of normalized keys that identify a value in the
            dataset.
        keymap:
            The keymap returned from :func:`load` or :func:`loads`.
    '''

    # code {{{3
    if type(keys) is not tuple:
        keys = tuple(keys)

    try:
        return keymap[keys]
    except KeyError:
        return None


# annotate {{{2
def annotate(
    keys,
    keymap,
    *,
    key_leading=(),
    key_trailing=(),
    value_leading=(),
    value_trailing=(),
    header=(),
    footer=(),
    spacing=None,
):
    '''Create or update ``keymap[tuple(keys)]`` with comments and
    per-Location spacing in a single call.

    This is the from-scratch counterpart to the keymap that :func:`load`
    builds.  Each of the four per-key slot kwargs --
    ``key_leading``, ``key_trailing``, ``value_leading``,
    ``value_trailing`` -- accepts either:

    - an iterable of :class:`Comment` objects (static), which become
      the comments attached to *this* Location at that slot.  Each
      Comment is interpreted in *tab mode*: its ``tab`` field
      (defaulting to 0 when ``None``) is the tabstop offset from the
      slot's natural indent, resolved by the dumper at emit time using
      the ``dumps(indent=...)`` setting; or
    - a callable (a *provider*) with the signature ::

          provider(child_key) -> list[Comment]

      installed on this Location to be invoked by the dumper for **each
      child** of this Location's value.  The returned Comments are
      prepended to the matching child's static comments at the same
      slot, before rendering.  This is how dynamic section / group
      headers are produced (the closure can dedup over previously-seen
      keys).  Comments returned by a provider with ``tab=None`` are
      normalized to ``tab=0`` at emit time.  Providers are not
      JSON-serializable and are dropped on :func:`keymap_to_jsonable`
      round-trips.

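A provider that emits a section header only the first time a section appears
might look like the sketch below.  ``StubComment`` is a hypothetical stand-in
that carries just two fields so the example runs without the library, and
``make_section_provider`` and the key-naming scheme are invented for the
illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubComment:
    # Hypothetical stand-in for Comment, reduced to the fields used here.
    text: str
    tab: Optional[int] = None


def make_section_provider(section_of):
    # Return a provider that yields a header comment the first time each
    # section is seen and nothing afterwards.
    seen = set()

    def provider(child_key):
        section = section_of(child_key)
        if section in seen:
            return []
        seen.add(section)
        return [StubComment(f"{section} settings")]

    return provider


provider = make_section_provider(lambda key: key.split("_")[0])
for key in ["db_host", "db_port", "cache_ttl"]:
    print(key, [c.text for c in provider(key)])
```

The closure keeps its ``seen`` set between calls, so a provider built this way
is good for a single rendering pass; build a fresh one for each dump.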
    The natural indent for each slot, given ``N = len(keys)`` and ``S`` the
    ``dumps(indent=...)`` setting:

    +-------------------+---------------------+
    | slot              | natural indent      |
    +===================+=====================+
    | ``key_leading``   | ``(N - 1) * S``     |
    +-------------------+---------------------+
    | ``key_trailing``, |                     |
    | ``value_leading``,| ``N * S``           |
    | ``value_trailing``|                     |
    +-------------------+---------------------+
    | ``header``,       | ``0``               |
    | ``footer``        |                     |
    +-------------------+---------------------+

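The table can be restated as a small function.  ``natural_indent`` and
``comment_column`` are hypothetical helpers, and treating one tabstop as one
indent step wide is an assumption about how the ``tab`` offset is resolved.

```python
PER_KEY_SLOTS = ("key_trailing", "value_leading", "value_trailing")


def natural_indent(slot, keys, indent=4):
    # Column at which a comment in the given slot starts by default.
    n = len(keys)
    if slot in ("header", "footer"):
        return 0
    if slot == "key_leading":
        return (n - 1) * indent
    if slot in PER_KEY_SLOTS:
        return n * indent
    raise ValueError(f"unknown slot: {slot}")


def comment_column(slot, keys, tab=0, indent=4):
    # Assumes one tabstop is one indent step wide.
    return natural_indent(slot, keys, indent) + tab * indent


keys = ("server", "port")
print(natural_indent("key_leading", keys))          # 4
print(natural_indent("value_leading", keys))        # 8
print(comment_column("key_trailing", keys, tab=1))  # 12
print(natural_indent("header", ()))                 # 0
```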
    Static lists in the per-key slots are not allowed at the root
    (``keys == ()``) since the root has no key line to attach to.  A
    provider callable, however, *is* allowed at the root -- it
    decorates each top-level child.

    ``spacing``, if given, is applied via :meth:`Location.set_spacing`.

    Args:
        keys:
            The keys tuple identifying the Location.  Use ``()`` for the
            document-root Location.
        keymap:
            The keymap dict to mutate.
        key_leading, key_trailing, value_leading, value_trailing:
            Either an iterable of :class:`Comment` objects (stored on
            this Location) or a callable provider (invoked per child of
            this Location; see above).  Static lists are not allowed
            when ``keys == ()``.
        header, footer:
            Iterables of :class:`Comment` objects for the document
            header and footer.  Only allowed when ``keys == ()``.
        spacing:
            Per-Location spacing dict; see :meth:`Location.set_spacing`.

    Returns:
        The :class:`Location` that was created or updated.

    Raises:
        ValueError: if a static list is supplied for any of
            ``key_leading``/``key_trailing``/``value_leading``/
            ``value_trailing`` at the root, or if ``header``/``footer``
            is supplied for non-root keys.
    '''
    if not isinstance(keys, tuple):
        keys = tuple(keys)

    is_root = (keys == ())

    def _is_provider(value):
        # A provider is any callable that is not a list, tuple, set, or
        # frozenset; those containers are treated as static comment lists.
        return callable(value) and not isinstance(
            value, (list, tuple, set, frozenset)
        )

    per_key_slots = (
        ("key_leading",    key_leading,    "set_key_leading_comments",    "set_key_leading_provider"),
        ("key_trailing",   key_trailing,   "set_key_trailing_comments",   "set_key_trailing_provider"),
        ("value_leading",  value_leading,  "set_value_leading_comments",  "set_value_leading_provider"),
        ("value_trailing", value_trailing, "set_value_trailing_comments", "set_value_trailing_provider"),
    )

    if is_root:
        for name, value, _, _ in per_key_slots:
            if value and not _is_provider(value):
                raise ValueError(
                    f"{name}= as a static Comment list is not allowed at the"
                    " document root (keys=()); use header/footer, or pass a"
                    " provider callable to decorate each top-level child."
                )
    if not is_root and (header or footer):
        raise ValueError(
            "header/footer are only allowed at the document root (keys=())."
        )

    loc = keymap.get(keys)
    if loc is None:
        loc = Location()
        keymap[keys] = loc

    def _tab_mode(comments):
        # Normalize tab=None to tab=0.  The Comment objects are modified in
        # place, so the caller's objects are affected as well.
        out = []
        for c in comments:
            if c.tab is None:
                c.tab = 0
            out.append(c)
        return out

    for _name, value, set_static, set_provider in per_key_slots:
        if not value:
            continue
        if _is_provider(value):
            getattr(loc, set_provider)(value)
        else:
            getattr(loc, set_static)(_tab_mode(value))
    if header:
        loc.set_header_comments(_tab_mode(header))
    if footer:
        loc.set_footer_comments(_tab_mode(footer))

    if spacing is not None:
        loc.set_spacing(spacing)

    return loc


# keymap_to/from_json {{{2
class _RestoredLocation(Location):
    """A Location reconstructed from JSON.

    Carries only what :func:`dumps` reads from a keymap: the original key
    string (for ``_get_original_key``) and the comment slots.
    """
    def __init__(self, original_key=None):
        super().__init__()
        self._original_key = original_key

    def _get_original_key(self, key, strict):
        if self._original_key is not None:
            return self._original_key
        return key


def _comment_to_dict(c):
    d = {"text": c.text, "indent": c.indent}
    if c.tab is not None:
        d["tab"] = c.tab
    if c.before:
        d["before"] = c.before
    if c.after:
        d["after"] = c.after
    return d


def _comment_from_dict(d):
    return Comment(
        text=d["text"],
        indent=d["indent"],
        tab=d.get("tab"),
        before=d.get("before", 0),
        after=d.get("after", 0),
    )


def keymap_to_jsonable(keymap, **kwargs):
    '''Reduce a keymap to a JSON-serializable structure for use with :func:`dumps`.

    Captures only what :func:`dumps` needs from the keymap to reconstruct
    the original file: the original key strings (so ``map_keys`` can
    restore them) and the per-entry comment slots, plus the document
    header / footer on ``keymap[()]``.  Source line/column information is
    discarded.  Per-slot provider callables (set via
    :meth:`Location.set_key_leading_provider` and the matching
    ``set_*_provider`` methods) are also dropped because callables are
    not JSON-serializable; rebuilt keymaps therefore omit any
    provider-driven decoration.

    The returned object is built from ``dict``, ``list``, ``str``, ``int``,
    and ``None``, so it is safe to pass through :mod:`json`, :mod:`msgpack`,
    or any similar encoder.
3636

3637
    Args:
3638
        keymap:
3639
            The keymap returned from :func:`load` or :func:`loads`, or any
3640
            equivalent mapping from key-tuples to :class:`Location`
3641
            objects.
3642
        **kwargs:
3643
            Any extra keyword arguments are included in the returned structure
3644
            under a top-level "meta" key.  These values are not used by
3645
            :func:`keymap_from_jsonable` but are included to allow you to
3646
            include any extra metadata you wish in the JSON-serializable
3647
            structure.  No attempt is made to ensure that the values in
3648
            **kwargs** are themselves JSON-serializable, so you should ensure
3649
            that they are if you intend to pass the output through a JSON
3650
            encoder.
3651

3652
    Returns:
3653
        A JSON-serializable ``dict``.  Pass it to :func:`keymap_from_jsonable`
3654
        to rebuild a keymap that can be given to :func:`dumps` as
3655
        ``map_keys=``.
3656
    '''
3657
    entries = []
2✔
3658
    for keys, loc in keymap.items():
2✔
3659
        entry = {"keys": list(keys)}
2✔
3660
        if keys and isinstance(keys[-1], str):
2✔
3661
            entry["original_key"] = loc._get_original_key(keys[-1], strict=False)
2✔
3662
        for attr, label in (
2✔
3663
            ("key_leading_comments",   "key_leading"),
3664
            ("key_trailing_comments",  "key_trailing"),
3665
            ("value_leading_comments", "value_leading"),
3666
            ("value_trailing_comments","value_trailing"),
3667
        ):
3668
            comments = getattr(loc, attr, None)
2✔
3669
            if comments:
2✔
3670
                entry[label] = [_comment_to_dict(c) for c in comments]
2✔
3671
        if not keys:
2✔
3672
            if loc.header_comments:
2✔
3673
                entry["header"] = [_comment_to_dict(c) for c in loc.header_comments]
2✔
3674
            if loc.footer_comments:
2✔
3675
                entry["footer"] = [_comment_to_dict(c) for c in loc.footer_comments]
2✔
3676
        sp = getattr(loc, "spacing", None)
2✔
3677
        if sp:
2✔
3678
            # JSON object keys must be strings; integer depth keys are
3679
            # stringified here and parsed back in keymap_from_jsonable.
3680
            entry["spacing"] = {str(k): v for k, v in sp.items()}
2✔
3681
        entries.append(entry)
2✔
3682
    return cull(dict(keymap=entries, meta=kwargs))
2✔
3683


def keymap_from_jsonable(data):
    '''Rebuild a keymap from the output of :func:`keymap_to_jsonable`.

    The returned mapping is suitable for passing to :func:`dumps` (or
    :func:`dump`) as ``map_keys=``; it will restore the original key
    strings and inject the captured comments.  Locations in the rebuilt
    keymap do *not* carry source line/column information.

    Args:
        data:
            The JSON-serializable structure produced by
            :func:`keymap_to_jsonable` (or an equivalent reconstruction of
            it, e.g., from ``json.loads``).

    Returns:
        A ``dict`` that maps key-tuples to :class:`Location` objects.
    '''
    keymap = {}
    for entry in data["keymap"]:
        keys = tuple(entry["keys"])
        loc = _RestoredLocation(original_key=entry.get("original_key"))
        loc.key_leading_comments = [
            _comment_from_dict(d) for d in entry.get("key_leading", [])
        ]
        loc.key_trailing_comments = [
            _comment_from_dict(d) for d in entry.get("key_trailing", [])
        ]
        loc.value_leading_comments = [
            _comment_from_dict(d) for d in entry.get("value_leading", [])
        ]
        loc.value_trailing_comments = [
            _comment_from_dict(d) for d in entry.get("value_trailing", [])
        ]
        # Header and footer comments are only stored on the root entry.
        if not keys:
            loc.header_comments = [
                _comment_from_dict(d) for d in entry.get("header", [])
            ]
            loc.footer_comments = [
                _comment_from_dict(d) for d in entry.get("footer", [])
            ]
        raw_spacing = entry.get("spacing")
        if raw_spacing:
            # Convert numeric string keys back to int (depth keys); leave
            # non-numeric keys -- e.g. "edges" -- as strings.
            loc.spacing = {
                (int(k) if k.lstrip("-").isdigit() else k): v
                for k, v in raw_spacing.items()
            }
        keymap[keys] = loc
    return keymap

# vim: set sw=4 sts=4 tw=80 fo=croqj foldmethod=marker et spell:
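
The spacing handling in keymap_to_jsonable and keymap_from_jsonable can be tried in isolation. The following is a minimal sketch that uses only the standard library and does not import nestedtext. The helper names spacing_to_jsonable and spacing_from_jsonable and the sample spacing values are hypothetical; only the two dictionary comprehensions mirror the listing above.

```python
import json


def spacing_to_jsonable(spacing):
    # Hypothetical helper: stringify every key, as keymap_to_jsonable does,
    # because JSON object keys must be strings.
    return {str(k): v for k, v in spacing.items()}


def spacing_from_jsonable(raw):
    # Hypothetical helper: numeric strings (optionally negative) become ints
    # again; any other key, such as "edges", stays a string.
    return {
        (int(k) if k.lstrip("-").isdigit() else k): v
        for k, v in raw.items()
    }


# Made-up spacing values that mix integer depth keys with a string key.
spacing = {0: 2, 1: 1, -1: 0, "edges": 1}

encoded = json.dumps(spacing_to_jsonable(spacing))
restored = spacing_from_jsonable(json.loads(encoded))
assert restored == spacing

# Without the conversion step, integer keys come back from JSON as strings.
assert json.loads(json.dumps({0: 2})) == {"0": 2}
```

The conversion relies on the key's text alone, so a string key that happens to look like an integer would also come back as an int; the sketch assumes that string keys are never purely numeric.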