
winter-telescope / winterdrp / 3768040465

pending completion
3768040465

Pull #255

GitHub
Merge e0578770f into 677197e2e
Pull Request #255: Fix #254

10 of 10 new or added lines in 1 file covered. (100.0%)

4645 of 6143 relevant lines covered (75.61%)

0.76 hits per line

Source File

89.04
/winterdrp/data/image_data.py
"""
Module to specify the input data classes for
:class:`winterdrp.processors.base_processor.ImageHandler`

The basic idea of the code is to pass
:class:`~winterdrp.data.base_data.DataBlock` objects
through a series of :class:`~winterdrp.processors.BaseProcessor` objects.
Since a given image can easily be ~10-100 MB, and there may be several hundred raw images
from a typical survey in a given night, the total data volume for these processors
could be several tens of GB or more. Storing these all in RAM would be very
inefficient/slow for a typical laptop or many larger processing machines.

To mitigate this, the code can be operated in **cache mode**. In that case,
after raw images are loaded, only the header data is stored in memory.
The actual image data itself is stored temporarily as a npy file
in a dedicated cache directory, and only loaded into memory when needed.
When the data is updated, the npy file is overwritten.
The file name is a unique hash, built in part from the read time of the file,
so multiple copies of an image can be read and modified independently.

In cache mode, all of the image data is temporarily stored in a cache,
and this cache can therefore reach a size of tens of GB.
The cache is located in the configurable
**output data directory**. Left alone, it would grow linearly with successive code executions.
To mitigate that, and to avoid cleaning the cache by hand,
the code tries to automatically delete cache files as needed.

Python supports a `__del__()` method for handling clean up when an object
is deleted. Images automatically delete their cache in this method. However, Python has a
somewhat-complicated method of 'garbage collection' (see
`the official description <https://devguide.python.org/internals/garbage-collector>`_
for more info), and it is not guaranteed that Image objects will
clean up after themselves.

As a fallback, when you run the code from the command line (and therefore call
__main__), we use the standard python
`tempfile library <https://docs.python.org/3/library/tempfile.html>`_ to create a
temporary directory, and set this as a cache. We open the directory using a `with`
context manager, ensuring that cleanup runs automatically before exiting,
even if the code crashes/raises errors. We also use `tempfile` and careful cleaning
for the unit tests, as provided by the base test class.
**If you try to interact with the code in any other way, please be mindful of this
behaviour, and ensure that you clean your cache in a responsible way!**

If you don't like this feature, you don't need to use it. Cache mode is entirely
optional, and can be disabled by setting an environment variable to false:

.. code-block:: bash

    export USE_WINTER_CACHE=false

See :doc:`usage` for more information about selecting cache mode,
and setting the output data directory.
"""
import copy
import hashlib
import logging
import threading
from pathlib import Path
from typing import Optional

import numpy as np
from astropy.io.fits import Header
from astropy.time import Time

from winterdrp.data.base_data import DataBatch, DataBlock
from winterdrp.data.cache import USE_CACHE, cache

logger = logging.getLogger(__name__)


class Image(DataBlock):
    """
    A subclass of :class:`~winterdrp.data.base_data.DataBlock`,
    containing an image and header.

    This class serves as input for
    :class:`~winterdrp.processors.base_processor.BaseImageProcessor` and
    :class:`~winterdrp.processors.base_processor.BaseCandidateGenerator` processors.
    """

    cache_files = []

    def __init__(self, data: np.ndarray, header: Header):
        self._data = None
        self.header = header
        super().__init__()
        self.cache_path = self.get_cache_path()
        if USE_CACHE:
            self.cache_files.append(self.cache_path)
        self.set_data(data=data)

    def get_cache_path(self) -> Path:
        """
        Get a unique cache path for the image (.npy file).
        This is a hash of the name, the current time and the thread id,
        so it should be unique even when rerunning on the same image.

        :return: unique cache file path
        """
        base = "".join([str(Time.now()), self.get_name(), str(threading.get_ident())])
        name = f"{hashlib.sha1(base.encode()).hexdigest()}.npy"
        return cache.get_cache_dir().joinpath(name)

    def __str__(self):
        return f"<An {self.__class__.__name__} object, built from {self.get_name()}>"

    def set_data(self, data: np.ndarray):
        """
        Set the data, in the cache or in RAM depending on the mode

        :param data: Updated image data
        :return: None
        """
        if USE_CACHE:
            self.set_cache_data(data)
        else:
            self.set_ram_data(data)

    def set_cache_data(self, data: np.ndarray):
        """
        Set the data in the cache, overwriting the npy file

        :param data: Updated image data
        :return: None
        """
        np.save(self.cache_path.as_posix(), data, allow_pickle=False)

    def set_ram_data(self, data: np.ndarray):
        """
        Set the data in RAM

        :param data: Updated image data
        :return: None
        """
        self._data = data

    def get_data(self) -> np.ndarray:
        """
        Get the image data, from the cache or from RAM depending on the mode

        :return: image data (numpy array)
        """
        if USE_CACHE:
            return self.get_cache_data()

        return self.get_ram_data()

    def get_cache_data(self) -> np.ndarray:
        """
        Get the image data from cache

        :return: image data (numpy array)
        """
        # The file is written with allow_pickle=False, so it can be read the same way
        return np.load(self.cache_path.as_posix(), allow_pickle=False)

    def get_ram_data(self) -> np.ndarray:
        """
        Get the image data from RAM

        :return: image data (numpy array)
        """
        return self._data

    def get_header(self) -> Header:
        """
        Get the image header

        :return: astropy Header
        """
        return self.header

    def set_header(self, header: Header):
        """
        Update the header

        :param header: updated header
        :return: None
        """
        self.header = header

    def __getitem__(self, item):
        return self.header.__getitem__(item)

    def __setitem__(self, key, value):
        self.header.__setitem__(key, value)

    def keys(self):
        """
        Get the header keys

        :return: Keys of header
        """
        return self.header.keys()

    def __del__(self):
        self.cache_path.unlink(missing_ok=True)
        # The path is only registered in cache mode, so guard against a ValueError
        if self.cache_path in self.cache_files:
            self.cache_files.remove(self.cache_path)

    def __deepcopy__(self, memo):
        new = type(self)(
            data=copy.deepcopy(self.get_data()), header=copy.deepcopy(self.get_header())
        )
        return new

    def __copy__(self):
        new = type(self)(
            data=self.get_data().__copy__(), header=self.get_header().__copy__()
        )
        return new


class ImageBatch(DataBatch):
    """
    A subclass of :class:`~winterdrp.data.base_data.DataBatch`,
    which contains :class:`~winterdrp.data.image_data.Image` objects
    """

    data_type = Image

    def __init__(self, batch: Optional[list[Image] | Image] = None):
        super().__init__(batch=batch)

    def append(self, item: Image):
        self._append(item)

    def __str__(self):
        return (
            f"<An {self.__class__.__name__} object, "
            f"containing {[x.get_name() for x in self.get_batch()]}>"
        )

    def get_batch(self) -> list[Image]:
        """Returns the :class:`~winterdrp.data.image_data.Image`
        items within the batch

        :return: list of :class:`~winterdrp.data.image_data.Image` objects
        """
        return self.get_data_list()
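
The module docstring describes a fallback in which the cache lives in a temporary directory opened with a `with` block, and each cache file is named by a hash. The sketch below illustrates that idea using only the standard library. It is not the winterdrp implementation: `unique_cache_path` and `run_with_temporary_cache` are hypothetical names, `time.time_ns()` stands in for `Time.now()`, and placeholder bytes stand in for the array that `np.save` would write.

```python
import hashlib
import tempfile
import threading
import time
from pathlib import Path


def unique_cache_path(cache_dir: Path, name: str) -> Path:
    """Hypothetical helper: hash the time, a name and the thread id into a file name."""
    base = "".join([str(time.time_ns()), name, str(threading.get_ident())])
    return cache_dir / f"{hashlib.sha1(base.encode()).hexdigest()}.npy"


def run_with_temporary_cache(names: list[str]) -> tuple[Path, int]:
    """Hypothetical helper: write one placeholder cache file per name in a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp)
        for name in names:
            # Placeholder bytes stand in for the image array saved by np.save
            unique_cache_path(cache_dir, name).write_bytes(b"placeholder")
        n_files = len(list(cache_dir.iterdir()))
    # Leaving the `with` block deletes the directory and its contents,
    # and the same cleanup runs if the body raises an exception.
    return cache_dir, n_files


if __name__ == "__main__":
    cache_dir, n_files = run_with_temporary_cache(["a.fits", "b.fits"])
    print(n_files, cache_dir.exists())
```

Because cleanup is tied to the `with` block rather than to `__del__`, it does not depend on when the garbage collector runs, which is the motivation the docstring gives for the fallback.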