• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

gwastro / gwin / 89

pending completion
89

Pull #19

travis-ci

web-flow
likelihood: added missing class to __all__
Pull Request #19: Wrapped tex strings to 80-character limit

4 of 4 new or added lines in 2 files covered. (100.0%)

899 of 2223 relevant lines covered (40.44%)

0.4 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

71.14
/gwin/likelihood.py
1
# Copyright (C) 2016  Collin Capano
2
# This program is free software; you can redistribute it and/or modify it
3
# under the terms of the GNU General Public License as published by the
4
# Free Software Foundation; either version 3 of the License, or (at your
5
# option) any later version.
6
#
7
# This program is distributed in the hope that it will be useful, but
8
# WITHOUT ANY WARRANTY; without even the implied warranty of
9
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
10
# Public License for more details.
11
#
12
# You should have received a copy of the GNU General Public License along
13
# with this program; if not, write to the Free Software Foundation, Inc.,
14
# 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
15

16

17
#
18
# =============================================================================
19
#
20
#                                   Preamble
21
#
22
# =============================================================================
23
#
24

25
"""
1✔
26
This modules provides classes and functions for evaluating the log likelihood
27
for parameter estimation.
28
"""
29

30
import numpy
1✔
31

32
from scipy import (stats, special)
1✔
33

34
from pycbc import (conversions, filter, transforms)
1✔
35
from pycbc.waveform import NoWaveformError
1✔
36
from pycbc.types import Array
1✔
37
from pycbc.io import FieldArray
1✔
38

39
# Used to manage a likelihood instance across multiple cores or MPI
40
_global_instance = None
1✔
41

42

43
def _call_global_likelihood(*args, **kwds):
1✔
44
    return _global_instance(*args, **kwds)  # pylint:disable=not-callable
×
45

46

47
class _NoPrior(object):
1✔
48
    """Dummy class to just return 0 if no prior is provided in a
49
    likelihood generator.
50
    """
51
    @staticmethod
1✔
52
    def apply_boundary_conditions(**params):
53
        return params
1✔
54

55
    def __call__(self, **params):
1✔
56
        return 0.
1✔
57

58

59
class BaseLikelihoodEvaluator(object):
1✔
60
    r"""Base container class for generating waveforms, storing the data, and
61
    computing posteriors.
62

63
    The nomenclature used by this class and those that inherit from it is as
64
    follows: Given some model parameters :math:`\Theta` and some data
65
    :math:`d` with noise model :math:`n`, we define:
66

67
     * the **likelihood function**: :math:`p(d|\Theta)`
68

69
     * the **noise likelihood**: :math:`p(d|n)`
70

71
     * the **likelihood ratio**:
72
       :math:`\mathcal{L}(\Theta) = \frac{p(d|\Theta)}{p(d|n)}`
73

74
     * the **prior**: :math:`p(\Theta)`
75

76
     * the **posterior**: :math:`p(\Theta|d) \propto p(d|\Theta)p(\Theta)`
77

78
     * the **prior-weighted likelihood ratio**:
79
       :math:`\hat{\mathcal{L}}(\Theta) = \frac{p(d|\Theta)p(\Theta)}{p(d|n)}`
80

81
     * the **SNR**: :math:`\rho(\Theta) = \sqrt{2\log\mathcal{L}(\Theta)}`;
82
       for two detectors, this is approximately the same quantity as the
83
       coincident SNR used in the CBC search.
84

85
    .. note::
86

87
        Although the posterior probability is only proportional to
88
        :math:`p(d|\Theta)p(\Theta)`, here we refer to this quantity as the
89
        posterior. Also note that for a given noise model, the prior-weighted
90
        likelihood ratio is proportional to the posterior, and so the two can
91
        usually be swapped for each other.
92

93
    When performing parameter estimation we work with the log of these values
94
    since we are mostly concerned with their values around the maxima. If
95
    we have multiple detectors, each with data :math:`d_i`, then these values
96
    simply sum over the detectors. For example, the log likelihood ratio is:
97

98
    .. math::
99

100
        \log \mathcal{L}(\Theta) =
101
            \sum_i \left[\log p(\Theta|d_i) - \log p(n|d_i)\right]
102

103
    This class provides boiler-plate methods and attributes for evaluating the
104
    log likelihood ratio, log prior, and log likelihood. This class
105
    makes no assumption about the detectors' noise model :math:`n`. As such,
106
    the methods for computing these values raise ``NotImplementedErrors``. These
107
    functions need to be monkey patched, or other classes that inherit from
108
    this class need to define their own functions.
109

110
    Instances of this class can be called like a function. The default is for
111
    this class to call its ``logposterior`` function, but this can be changed
112
    with the ``set_callfunc`` method.
113

114
    Parameters
115
    ----------
116
    variable_args : (tuple of) string(s)
117
        A tuple of parameter names that will be varied.
118
    waveform_generator : generator class, optional
119
        A generator class that creates waveforms.
120
    data : dict, optional
121
        A dictionary of data, in which the keys are the detector names and the
122
        values are the data.
123
    prior : callable, optional
124
        A callable class or function that computes the log of the prior. If
125
        None provided, will use ``_noprior``, which returns 0 for all parameter
126
        values.
127
    sampling_parameters : list, optional
128
        Replace one or more of the variable args with the given parameters
129
        for sampling.
130
    replace_parameters : list, optional
131
        The variable args to replace with sampling parameters. Must be the
132
        same length as ``sampling_parameters``.
133
    sampling_transforms : list, optional
134
        List of transforms to use to go between the variable args and the
135
        sampling parameters. Required if ``sampling_parameters`` is not None.
136
    waveform_transforms : list, optional
137
        List of transforms to use to go from the variable args to parameters
138
        understood by the waveform generator.
139

140
    Attributes
141
    ----------
142
    waveform_generator : dict
143
        The waveform generator that the class was initialized with.
144
    data : dict
145
        The data that the class was initialized with.
146
    lognl : {None, float}
147
        The log of the noise likelihood summed over the number of detectors.
148
    return_meta : {True, bool}
149
        If True, ``prior``, ``logposterior``, and ``logplr`` will return the
150
        value of the prior, the loglikelihood ratio, and the log jacobian,
151
        along with the posterior/plr.
152

153
    Methods
154
    -------
155
    logjacobian :
156
        Returns the log of the jacobian needed to go from the parameter space
157
        of the variable args to the sampling args.
158
    prior :
159
        A function that returns the log of the prior.
160
    loglikelihood :
161
        A function that returns the log of the likelihood function.
162
    logposterior :
163
        A function that returns the log of the posterior.
164
    loglr :
165
        A function that returns the log of the likelihood ratio.
166
    logplr :
167
        A function that returns the log of the prior-weighted likelihood ratio.
168
    snr :
169
        A function that returns the square root of twice the log likelihood
170
        ratio. If the log likelihood ratio is < 0, will return 0.
171
    evaluate :
172
        Maps a list of values to their parameter names and calls whatever the
173
        call function is set to.
174
    set_callfunc :
175
        Set the function to use when the class is called as a function.
176
    """
177
    name = None
1✔
178
    required_kwargs = []
1✔
179

180
    def __init__(self, variable_args,
1✔
181
                 waveform_generator=None, data=None, prior=None,
182
                 sampling_parameters=None, replace_parameters=None,
183
                 sampling_transforms=None, waveform_transforms=None,
184
                 return_meta=True):
185
        if isinstance(variable_args, basestring):
1✔
186
            variable_args = (variable_args,)
×
187
        if not isinstance(variable_args, tuple):
1✔
188
            variable_args = tuple(variable_args)
1✔
189
        self._variable_args = variable_args
1✔
190
        # store data, waveform generator
191
        self._waveform_generator = waveform_generator
1✔
192
        # we'll store a copy of the data
193
        if data is not None:
1✔
194
            self._data = dict([[ifo, 1*data[ifo]] for ifo in data])
1✔
195
        else:
196
            self._data = None
1✔
197
        # store prior
198
        if prior is None:
1✔
199
            self._prior = _NoPrior()
1✔
200
        else:
201
            # check that the variable args of the prior evaluator is the same
202
            # as the waveform generator
203
            if prior.variable_args != variable_args:
1✔
204
                raise ValueError("variable args of prior and waveform "
×
205
                                 "generator do not match")
206
            self._prior = prior
1✔
207
        # initialize the log nl to None
208
        self._lognl = None
1✔
209
        self.return_meta = return_meta
1✔
210
        # store sampling parameters and transforms
211
        if sampling_parameters is not None:
1✔
212
            if replace_parameters is None or \
×
213
                    len(replace_parameters) != len(sampling_parameters):
214
                raise ValueError("number of sampling parameters must be the "
×
215
                                 "same as the number of replace parameters")
216
            if sampling_transforms is None:
×
217
                raise ValueError("must provide sampling transforms for the "
×
218
                                 "sampling parameters")
219
            # pull out the replaced parameters
220
            self._sampling_args = [arg for arg in self._variable_args if
×
221
                                   arg not in replace_parameters]
222
            # add the samplign parameters
223
            self._sampling_args += sampling_parameters
×
224
            self._sampling_transforms = sampling_transforms
×
225
        else:
226
            self._sampling_args = self._variable_args
1✔
227
            self._sampling_transforms = None
1✔
228
        self._waveform_transforms = waveform_transforms
1✔
229

230
    @property
1✔
231
    def variable_args(self):
232
        """Returns the variable arguments."""
233
        return self._variable_args
1✔
234

235
    @property
1✔
236
    def waveform_generator(self):
237
        """Returns the waveform generator that was set."""
238
        return self._waveform_generator
×
239

240
    @property
1✔
241
    def data(self):
242
        """Returns the data that was set."""
243
        return self._data
1✔
244

245
    @property
1✔
246
    def sampling_args(self):
247
        """Returns the sampling arguments."""
248
        return self._sampling_args
1✔
249

250
    @property
1✔
251
    def sampling_transforms(self):
252
        """Returns the sampling transforms."""
253
        return self._sampling_transforms
1✔
254

255
    def apply_sampling_transforms(self, samples, inverse=False):
1✔
256
        """Applies the sampling transforms to the given samples.
257

258
        If ``sampling_transforms`` is None, just returns the samples.
259

260
        Parameters
261
        ----------
262
        samples : dict or FieldArray
263
            The samples to apply the transforms to.
264
        inverse : bool, optional
265
            Whether to apply the inverse transforms (i.e., go from the sampling
266
            args to the variable args). Default is False.
267

268
        Returns
269
        -------
270
        dict or FieldArray
271
            The transformed samples, along with the original samples.
272
        """
273
        if self._sampling_transforms is None:
1✔
274
            return samples
1✔
275
        return transforms.apply_transforms(samples, self._sampling_transforms,
×
276
                                           inverse=inverse)
277

278
    @property
1✔
279
    def lognl(self):
280
        """Returns the log of the noise likelihood."""
281
        return self._lognl
×
282

283
    def set_lognl(self, lognl):
1✔
284
        """Set the value of the log noise likelihood."""
285
        self._lognl = lognl
1✔
286

287
    def logjacobian(self, **params):
1✔
288
        r"""Returns the log of the jacobian needed to transform pdfs in the
289
        ``variable_args`` parameter space to the ``sampling_args`` parameter
290
        space.
291

292
        Let :math:`\mathbf{x}` be the set of variable parameters,
293
        :math:`\mathbf{y} = f(\mathbf{x})` the set of sampling parameters, and
294
        :math:`p_x(\mathbf{x})` a probability density function defined over
295
        :math:`\mathbf{x}`.
296
        The corresponding pdf in :math:`\mathbf{y}` is then:
297

298
        .. math::
299

300
            p_y(\mathbf{y}) =
301
                p_x(\mathbf{x})\left|\mathrm{det}\,\mathbf{J}_{ij}\right|,
302

303
        where :math:`\mathbf{J}_{ij}` is the Jacobian of the inverse transform
304
        :math:`\mathbf{x} = g(\mathbf{y})`. This has elements:
305

306
        .. math::
307

308
            \mathbf{J}_{ij} = \frac{\partial g_i}{\partial{y_j}}
309

310
        This function returns
311
        :math:`\log \left|\mathrm{det}\,\mathbf{J}_{ij}\right|`.
312

313

314
        Parameters
315
        ----------
316
        \**params :
317
            The keyword arguments should specify values for all of the variable
318
            args and all of the sampling args.
319

320
        Returns
321
        -------
322
        float :
323
            The value of the jacobian.
324
        """
325
        if self._sampling_transforms is None:
1✔
326
            return 0.
1✔
327
        else:
328
            return numpy.log(abs(transforms.compute_jacobian(
×
329
                params, self._sampling_transforms, inverse=True)))
330

331
    def prior(self, **params):
1✔
332
        """This function should return the prior of the given params.
333
        """
334
        logj = self.logjacobian(**params)
1✔
335
        logp = self._prior(**params) + logj
1✔
336
        if numpy.isnan(logp):
1✔
337
            logp = -numpy.inf
×
338
        return self._formatreturn(logp, prior=logp, logjacobian=logj)
1✔
339

340
    def prior_rvs(self, size=1, prior=None):
1✔
341
        """Returns random variates drawn from the prior.
342

343
        If the ``sampling_args`` are different from the ``variable_args``, the
344
        variates are transformed to the `sampling_args` parameter space before
345
        being returned.
346

347
        Parameters
348
        ----------
349
        size : int, optional
350
            Number of random values to return for each parameter. Default is 1.
351
        prior : JointDistribution, optional
352
            Use the given prior to draw values rather than the saved prior.
353

354
        Returns
355
        -------
356
        FieldArray
357
            A field array of the random values.
358
        """
359
        # draw values from the prior
360
        if prior is None:
1✔
361
            prior = self._prior
1✔
362
        p0 = prior.rvs(size=size)
1✔
363
        # transform if necessary
364
        if self._sampling_transforms is not None:
1✔
365
            ptrans = self.apply_sampling_transforms(p0)
×
366
            # pull out the sampling args
367
            p0 = FieldArray.from_arrays([ptrans[arg]
×
368
                                         for arg in self._sampling_args],
369
                                        names=self._sampling_args)
370
        return p0
1✔
371

372
    def loglikelihood(self, **params):
1✔
373
        """Returns the natural log of the likelihood function.
374
        """
375
        raise NotImplementedError("Likelihood function not set.")
×
376

377
    def loglr(self, **params):
1✔
378
        """Returns the natural log of the likelihood ratio.
379
        """
380
        return self.loglikelihood(**params) - self.lognl
×
381

382
    # the names and order of data returned by _formatreturn when
383
    # return_metadata is True
384
    metadata_fields = ["prior", "loglr", "logjacobian"]
1✔
385

386
    def _formatreturn(self, val, prior=None, loglr=None, logjacobian=0.):
1✔
387
        """Adds the prior to the return value if return_meta is True.
388
        Otherwise, just returns the value.
389

390
        Parameters
391
        ----------
392
        val : float
393
            The value to return.
394
        prior : float, optional
395
            The value of the prior.
396
        loglr : float, optional
397
            The value of the log likelihood-ratio.
398
        logjacobian : float, optional
399
            The value of the log jacobian used to go from the variable args
400
            to the sampling args.
401

402
        Returns
403
        -------
404
        val : float
405
            The given value to return.
406
        *If return_meta is True:*
407
        metadata : (prior, loglr, logjacobian)
408
            A tuple of the prior, log likelihood ratio, and logjacobian.
409
        """
410
        if self.return_meta:
1✔
411
            return val, (prior, loglr, logjacobian)
1✔
412
        else:
413
            return val
1✔
414

415
    def logplr(self, **params):
1✔
416
        """Returns the log of the prior-weighted likelihood ratio.
417
        """
418
        if self.return_meta:
1✔
419
            logp, (_, _, logj) = self.prior(**params)
×
420
        else:
421
            logp = self.prior(**params)
1✔
422
            logj = None
1✔
423
        # if the prior returns -inf, just return
424
        if logp == -numpy.inf:
1✔
425
            return self._formatreturn(logp, prior=logp, logjacobian=logj)
1✔
426
        llr = self.loglr(**params)
1✔
427
        return self._formatreturn(llr + logp, prior=logp, loglr=llr,
1✔
428
                                  logjacobian=logj)
429

430
    def logposterior(self, **params):
1✔
431
        """Returns the log of the posterior of the given params.
432
        """
433
        if self.return_meta:
×
434
            logp, (_, _, logj) = self.prior(**params)
×
435
        else:
436
            logp = self.prior(**params)
×
437
            logj = None
×
438
        # if the prior returns -inf, just return
439
        if logp == -numpy.inf:
×
440
            return self._formatreturn(logp, prior=logp, logjacobian=logj)
×
441
        ll = self.loglikelihood(**params)
×
442
        return self._formatreturn(ll + logp, prior=logp, loglr=ll-self._lognl,
×
443
                                  logjacobian=logj)
444

445
    def snr(self, **params):
1✔
446
        """Returns the "SNR" of the given params. This will return
447
        imaginary values if the log likelihood ratio is < 0.
448
        """
449
        return conversions.snr_from_loglr(self.loglr(**params))
×
450

451
    _callfunc = logposterior
1✔
452

453
    @classmethod
1✔
454
    def set_callfunc(cls, funcname):
455
        """Sets the function used when the class is called as a function.
456

457
        Parameters
458
        ----------
459
        funcname : str
460
            The name of the function to use; must be the name of an instance
461
            method.
462
        """
463
        cls._callfunc = getattr(cls, funcname)
1✔
464

465
    def evaluate(self, params, callfunc=None):
1✔
466
        """Evaluates the call function at the given list of parameter values.
467

468
        Parameters
469
        ----------
470
        params : list
471
            A list of values. These are assumed to be in the same order as
472
            variable args.
473
        callfunc : str, optional
474
            The name of the function to call. If None, will use
475
            ``self._callfunc``. Default is None.
476

477
        Returns
478
        -------
479
        float or tuple :
480
            If ``return_meta`` is False, the output of the call function. If
481
            ``return_meta`` is True, a tuple of the output of the call function
482
            and the meta data.
483
        """
484
        params = dict(zip(self._sampling_args, params))
1✔
485
        # apply inverse transforms to go from sampling parameters to
486
        # variable args
487
        params = self.apply_sampling_transforms(params, inverse=True)
1✔
488
        # apply boundary conditions
489
        params = self._prior.apply_boundary_conditions(**params)
1✔
490
        # apply waveform transforms
491
        if self._waveform_transforms is not None:
1✔
492
            params = transforms.apply_transforms(params,
×
493
                                                 self._waveform_transforms,
494
                                                 inverse=False)
495
        # apply any boundary conditions to the parameters before
496
        # generating/evaluating
497
        if callfunc is not None:
1✔
498
            f = getattr(self, callfunc)
1✔
499
        else:
500
            f = self._callfunc
1✔
501
        return f(**params)
1✔
502

503
    __call__ = evaluate
1✔
504

505

506
#
507
# =============================================================================
508
#
509
#                              Test distributions
510
#
511
# =============================================================================
512
#
513
class TestNormal(BaseLikelihoodEvaluator):
1✔
514
    r"""The test distribution is an multi-variate normal distribution.
515

516
    The number of dimensions is set by the number of ``variable_args`` that are
517
    passed. For details on the distribution used, see
518
    ``scipy.stats.multivariate_normal``.
519

520
    Parameters
521
    ----------
522
    variable_args : (tuple of) string(s)
523
        A tuple of parameter names that will be varied.
524
    mean : array-like, optional
525
        The mean values of the parameters. If None provide, will use 0 for all
526
        parameters.
527
    cov : array-like, optional
528
        The covariance matrix of the parameters. If None provided, will use
529
        unit variance for all parameters, with cross-terms set to 0.
530
    **kwargs :
531
        All other keyword arguments are passed to ``BaseLikelihoodEvaluator``.
532

533
    """
534
    name = "test_normal"
1✔
535

536
    def __init__(self, variable_args, mean=None, cov=None, **kwargs):
1✔
537
        # set up base likelihood parameters
538
        super(TestNormal, self).__init__(variable_args, **kwargs)
×
539
        # set the lognl to 0 since there is no data
540
        self.set_lognl(0.)
×
541
        # store the pdf
542
        if mean is None:
×
543
            mean = [0.]*len(variable_args)
×
544
        if cov is None:
×
545
            cov = [1.]*len(variable_args)
×
546
        self._dist = stats.multivariate_normal(mean=mean, cov=cov)
×
547
        # check that the dimension is correct
548
        if self._dist.dim != len(variable_args):
×
549
            raise ValueError("dimension mis-match between variable_args and "
×
550
                             "mean and/or cov")
551

552
    def loglikelihood(self, **params):
1✔
553
        """Returns the log pdf of the multivariate normal.
554
        """
555
        return self._dist.logpdf([params[p] for p in self.variable_args])
×
556

557

558
class TestEggbox(BaseLikelihoodEvaluator):
1✔
559
    r"""The test distribution is an 'eggbox' function:
560

561
    .. math::
562

563
        \log \mathcal{L}(\Theta) = \left[
564
            2+\prod_{i=1}^{n}\cos\left(\frac{\theta_{i}}{2}\right)\right]^{5}
565

566
    The number of dimensions is set by the number of ``variable_args`` that are
567
    passed.
568

569
    Parameters
570
    ----------
571
    variable_args : (tuple of) string(s)
572
        A tuple of parameter names that will be varied.
573
    **kwargs :
574
        All other keyword arguments are passed to ``BaseLikelihoodEvaluator``.
575

576
    """
577
    name = "test_eggbox"
1✔
578

579
    def __init__(self, variable_args, **kwargs):
1✔
580
        # set up base likelihood parameters
581
        super(TestEggbox, self).__init__(variable_args, **kwargs)
×
582

583
        # set the lognl to 0 since there is no data
584
        self.set_lognl(0.)
×
585

586
    def loglikelihood(self, **params):
1✔
587
        """Returns the log pdf of the eggbox function.
588
        """
589
        return (2 + numpy.prod(numpy.cos([
×
590
            params[p]/2. for p in self.variable_args]))) ** 5
591

592

593
class TestRosenbrock(BaseLikelihoodEvaluator):
1✔
594
    r"""The test distribution is the Rosenbrock function:
595

596
    .. math::
597

598
        \log \mathcal{L}(\Theta) = -\sum_{i=1}^{n-1}[
599
            (1-\theta_{i})^{2}+100(\theta_{i+1} - \theta_{i}^{2})^{2}]
600

601
    The number of dimensions is set by the number of ``variable_args`` that are
602
    passed.
603

604
    Parameters
605
    ----------
606
    variable_args : (tuple of) string(s)
607
        A tuple of parameter names that will be varied.
608
    **kwargs :
609
        All other keyword arguments are passed to ``BaseLikelihoodEvaluator``.
610

611
    """
612
    name = "test_rosenbrock"
1✔
613

614
    def __init__(self, variable_args, **kwargs):
1✔
615
        # set up base likelihood parameters
616
        super(TestRosenbrock, self).__init__(variable_args, **kwargs)
×
617

618
        # set the lognl to 0 since there is no data
619
        self.set_lognl(0.)
×
620

621
    def loglikelihood(self, **params):
1✔
622
        """Returns the log pdf of the Rosenbrock function.
623
        """
624
        l = 0
×
625
        p = [params[p] for p in self.variable_args]
×
626
        for i in range(len(p) - 1):
×
627
            l -= ((1 - p[i])**2 + 100 * (p[i+1] - p[i]**2)**2)
×
628
        return l
×
629

630

631
class TestVolcano(BaseLikelihoodEvaluator):
1✔
632
    r"""The test distribution is a two-dimensional 'volcano' function:
633

634
    .. math::
635
        \Theta =
636
            \sqrt{\theta_{1}^{2} + \theta_{2}^{2}} \log \mathcal{L}(\Theta) =
637
            25\left(e^{\frac{-\Theta}{35}} +
638
                    \frac{1}{2\sqrt{2\pi}} e^{-\frac{(\Theta-5)^{2}}{8}}\right)
639

640
    Parameters
641
    ----------
642
    variable_args : (tuple of) string(s)
643
        A tuple of parameter names that will be varied. Must have length 2.
644
    **kwargs :
645
        All other keyword arguments are passed to ``BaseLikelihoodEvaluator``.
646

647
    """
648
    name = "test_volcano"
1✔
649

650
    def __init__(self, variable_args, **kwargs):
1✔
651
        # set up base likelihood parameters
652
        super(TestVolcano, self).__init__(variable_args, **kwargs)
×
653

654
        # make sure there are exactly two variable args
655
        if len(self.variable_args) != 2:
×
656
            raise ValueError("TestVolcano distribution requires exactly "
×
657
                             "two variable args")
658

659
        # set the lognl to 0 since there is no data
660
        self.set_lognl(0.)
×
661

662
    def loglikelihood(self, **params):
1✔
663
        """Returns the log pdf of the 2D volcano function.
664
        """
665
        p = [params[p] for p in self.variable_args]
×
666
        r = numpy.sqrt(p[0]**2 + p[1]**2)
×
667
        mu, sigma = 5.0, 2.0
×
668
        return 25 * (
×
669
            numpy.exp(-r/35) + 1 / (sigma * numpy.sqrt(2 * numpy.pi)) *
670
            numpy.exp(-0.5 * ((r - mu) / sigma) ** 2))
671

672

673
#
674
# =============================================================================
675
#
676
#                              Data-based likelihoods
677
#
678
# =============================================================================
679
#
680

681
class GaussianLikelihood(BaseLikelihoodEvaluator):
1✔
682
    r"""Computes log likelihoods assuming the detectors' noise is Gaussian.
683

684
    With Gaussian noise the log likelihood functions for signal
685
    :math:`\log p(d|\Theta)` and for noise :math:`\log p(d|n)` are given by:
686

687
    .. math::
688

689
        \log p(d|\Theta) &=  -\frac{1}{2} \sum_i
690
            \left< h_i(\Theta) - d_i | h_i(\Theta) - d_i \right> \\
691
        \log p(d|n) &= -\frac{1}{2} \sum_i \left<d_i | d_i\right>
692

693
    where the sum is over the number of detectors, :math:`d_i` is the data in
694
    each detector, and :math:`h_i(\Theta)` is the model signal in each
695
    detector. The inner product is given by:
696

697
    .. math::
698

699
        \left<a | b\right> = 4\Re \int_{0}^{\infty}
700
            \frac{\tilde{a}(f) \tilde{b}(f)}{S_n(f)} \mathrm{d}f,
701

702
    where :math:`S_n(f)` is the PSD in the given detector.
703

704
    Note that the log prior-weighted likelihood ratio has one less term
705
    than the log posterior, since the :math:`\left<d_i|d_i\right>` term cancels
706
    in the likelihood ratio:
707

708
    .. math::
709

710
        \log \hat{\mathcal{L}} = \log p(\Theta) + \sum_i \left[
711
            \left<h_i(\Theta)|d_i\right> -
712
            \frac{1}{2} \left<h_i(\Theta)|h_i(\Theta)\right> \right]
713

714
    For this reason, by default this class returns ``logplr`` when called as a
715
    function instead of ``logposterior``. This can be changed via the
716
    ``set_callfunc`` method.
717

718
    Upon initialization, the data is whitened using the given PSDs. If no PSDs
719
    are given the data and waveforms returned by the waveform generator are
720
    assumed to be whitened. The likelihood function of the noise,
721

722
    .. math::
723

724
        p(d|n) = \frac{1}{2} \sum_i \left<d_i|d_i\right>,
725

726
    is computed on initialization and stored as the `lognl` attribute.
727

728
    By default, the data is assumed to be equally sampled in frequency, but
729
    unequally sampled data can be supported by passing the appropriate
730
    normalization using the ``norm`` keyword argument.
731

732
    For more details on initialization parameters and definition of terms, see
733
    ``BaseLikelihoodEvaluator``.
734

735
    Parameters
736
    ----------
737
    variable_args : (tuple of) string(s)
738
        A tuple of parameter names that will be varied.
739
    waveform_generator : generator class
740
        A generator class that creates waveforms. This must have a ``generate``
741
        function which takes parameter values as keyword arguments, a
742
        detectors attribute which is a dictionary of detectors keyed by their
743
        names, and an epoch which specifies the start time of the generated
744
        waveform.
745
    data : dict
746
        A dictionary of data, in which the keys are the detector names and the
747
        values are the data (assumed to be unwhitened). The list of keys must
748
        match the waveform generator's detectors keys, and the epoch of every
749
        data set must be the same as the waveform generator's epoch.
750
    f_lower : float
751
        The starting frequency to use for computing inner products.
752
    psds : {None, dict}
753
        A dictionary of FrequencySeries keyed by the detector names. The
754
        dictionary must have a psd for each detector specified in the data
755
        dictionary. If provided, the inner products in each detector will be
756
        weighted by 1/psd of that detector.
757
    f_upper : {None, float}
758
        The ending frequency to use for computing inner products. If not
759
        provided, the minimum of the largest frequency stored in the data
760
        and a given waveform will be used.
761
    norm : {None, float or array}
762
        An extra normalization weight to apply to the inner products. Can be
763
        either a float or an array. If ``None``, ``4*data.values()[0].delta_f``
764
        will be used.
765
    **kwargs :
766
        All other keyword arguments are passed to ``BaseLikelihoodEvaluator``.
767

768
    Examples
769
    --------
770
    Create a signal, and set up the likelihood evaluator on that signal:
771

772
    >>> from pycbc import psd as pypsd
773
    >>> from pycbc.waveform.generator import (FDomainDetFrameGenerator,
774
                                              FDomainCBCGenerator)
775
    >>> import gwin
776
    >>> seglen = 4
777
    >>> sample_rate = 2048
778
    >>> N = seglen*sample_rate/2+1
779
    >>> fmin = 30.
780
    >>> m1, m2, s1z, s2z, tsig, ra, dec, pol, dist = (
781
            38.6, 29.3, 0., 0., 3.1, 1.37, -1.26, 2.76, 3*500.)
782
    >>> generator = FDomainDetFrameGenerator(
783
            FDomainCBCGenerator, 0.,
784
            variable_args=['tc'], detectors=['H1', 'L1'],
785
            delta_f=1./seglen, f_lower=fmin,
786
            approximant='SEOBNRv2_ROM_DoubleSpin',
787
            mass1=m1, mass2=m2, spin1z=s1z, spin2z=s2z,
788
            ra=ra, dec=dec, polarization=pol, distance=dist)
789
    >>> signal = generator.generate(tc=tsig)
790
    >>> psd = pypsd.aLIGOZeroDetHighPower(N, 1./seglen, 20.)
791
    >>> psds = {'H1': psd, 'L1': psd}
792
    >>> likelihood_eval = gwin.GaussianLikelihood(
793
            ['tc'], generator, signal, fmin, psds=psds, return_meta=False)
794

795
    Now compute the log likelihood ratio and prior-weighted likelihood ratio;
796
    since we have not provided a prior, these should be equal to each other:
797

798
    >>> likelihood_eval.loglr(tc=tsig)
799
    ArrayWithAligned(277.92945279883855)
800
    >>> likelihood_eval.logplr(tc=tsig)
801
    ArrayWithAligned(277.92945279883855)
802

803
    Compute the log likelihood and log posterior; since we have not
804
    provided a prior, these should both be equal to zero:
805

806
    >>> likelihood_eval.loglikelihood(tc=tsig)
807
    ArrayWithAligned(0.0)
808
    >>> likelihood_eval.logposterior(tc=tsig)
809
    ArrayWithAligned(0.0)
810

811
    Compute the SNR; for this system and PSD, this should be approximately 24:
812

813
    >>> likelihood_eval.snr([tsig])
814
    ArrayWithAligned(23.576660187517593)
815

816
    Using the same likelihood evaluator, evaluate the log prior-weighted
817
    likelihood ratio at several points in time, check that the max is at tsig,
818
    and plot (note that we use the class as a function here, which defaults
819
    to calling ``logplr``):
820

821
    >>> from matplotlib import pyplot
822
    >>> times = numpy.arange(seglen*sample_rate)/float(sample_rate)
823
    >>> lls = numpy.array([likelihood_eval([t]) for t in times])
824
    >>> times[lls.argmax()]
825
    3.10009765625
826
    >>> fig = pyplot.figure(); ax = fig.add_subplot(111)
827
    >>> ax.plot(times, lls)
828
    [<matplotlib.lines.Line2D at 0x1274b5c50>]
829
    >>> fig.show()
830

831
    Create a prior and use it (see distributions module for more details):
832

833
    >>> from pycbc import distributions
834
    >>> uniform_prior = distributions.Uniform(tc=(tsig-0.2,tsig+0.2))
835
    >>> prior_eval = gwin.JointDistribution(['tc'], uniform_prior)
836
    >>> likelihood_eval = gwin.GaussianLikelihood(
837
            generator, signal, 20., psds=psds, prior=prior_eval,
838
            return_meta=False)
839
    >>> likelihood_eval.logplr([tsig])
840
    ArrayWithAligned(278.84574353071264)
841
    >>> likelihood_eval.logposterior([tsig])
842
    ArrayWithAligned(0.9162907318741418)
843
    """
844
    name = 'gaussian'
1✔
845
    required_kwargs = ['waveform_generator', 'data', 'f_lower']
1✔
846

847
    def __init__(self, variable_args, waveform_generator=None, data=None,
1✔
848
                 f_lower=None, psds=None, f_upper=None, norm=None,
849
                 **kwargs):
850
        if waveform_generator is None:
1✔
851
            raise ValueError("waveform_generator must be provided")
×
852
        if data is None:
1✔
853
            raise ValueError("data must be provided")
×
854
        if f_lower is None:
1✔
855
            raise ValueError("f_lower must be provided")
×
856
        # set up the boiler-plate attributes; note: we'll compute the
857
        # log evidence later
858
        super(GaussianLikelihood, self).__init__(
1✔
859
            variable_args,
860
            waveform_generator=waveform_generator, data=data,
861
            **kwargs)
862
        # check that the data and waveform generator have the same detectors
863
        if (sorted(waveform_generator.detectors.keys()) !=
1✔
864
                sorted(self._data.keys())):
865
            raise ValueError(
×
866
                "waveform generator's detectors ({0}) does not "
867
                "match data ({1})".format(
868
                    ','.join(sorted(waveform_generator.detector_names)),
869
                    ','.join(sorted(self._data.keys()))))
870
        # check that the data and waveform generator have the same epoch
871
        if any(waveform_generator.epoch != d.epoch
1✔
872
               for d in self._data.values()):
873
            raise ValueError("waveform generator does not have the same epoch "
×
874
                             "as all of the data sets.")
875
        # check that the data sets all have the same lengths
876
        dlens = numpy.array([len(d) for d in data.values()])
1✔
877
        if not all(dlens == dlens[0]):
1✔
878
            raise ValueError("all data must be of the same length")
×
879
        # we'll use the first data set for setting values
880
        d = data.values()[0]
1✔
881
        N = len(d)
1✔
882
        # figure out the kmin, kmax to use
883
        kmin, kmax = filter.get_cutoff_indices(f_lower, f_upper, d.delta_f,
1✔
884
                                               (N-1)*2)
885
        self._kmin = kmin
1✔
886
        self._kmax = kmax
1✔
887
        if norm is None:
1✔
888
            norm = 4*d.delta_f
1✔
889
        # we'll store the weight to apply to the inner product
890
        if psds is None:
1✔
891
            w = Array(numpy.sqrt(norm)*numpy.ones(N))
1✔
892
            self._weight = {det: w for det in data}
1✔
893
        else:
894
            # temporarily suppress numpy divide by 0 warning
895
            numpysettings = numpy.seterr(divide='ignore')
1✔
896
            self._weight = {det: Array(numpy.sqrt(norm/psds[det]))
1✔
897
                            for det in data}
898
            numpy.seterr(**numpysettings)
1✔
899
        # whiten the data
900
        for det in self._data:
1✔
901
            self._data[det][kmin:kmax] *= self._weight[det][kmin:kmax]
1✔
902
        # compute the log likelihood function of the noise and save it
903
        self.set_lognl(-0.5*sum([
1✔
904
            d[kmin:kmax].inner(d[kmin:kmax]).real
905
            for d in self._data.values()]))
906
        # set default call function to logplor
907
        self.set_callfunc('logplr')
1✔
908

909
    def loglr(self, **params):
1✔
910
        r"""Computes the log likelihood ratio,
911

912
        .. math::
913

914
            \log \mathcal{L}(\Theta) = \sum_i
915
                \left<h_i(\Theta)|d_i\right> -
916
                \frac{1}{2}\left<h_i(\Theta)|h_i(\Theta)\right>,
917

918
        at the given point in parameter space :math:`\Theta`.
919

920
        Parameters
921
        ----------
922
        \**params :
923
            The keyword arguments should give the values of each parameter to
924
            evaluate.
925

926
        Returns
927
        -------
928
        numpy.float64
929
            The value of the log likelihood ratio evaluated at the given point.
930
        """
931
        lr = 0.
1✔
932
        try:
1✔
933
            wfs = self._waveform_generator.generate(**params)
1✔
934
        except NoWaveformError:
×
935
            # if no waveform was generated, just return 0
936
            return lr
×
937
        for det, h in wfs.items():
1✔
938
            # the kmax of the waveforms may be different than internal kmax
939
            kmax = min(len(h), self._kmax)
1✔
940
            # whiten the waveform
941
            if self._kmin >= kmax:
1✔
942
                # if the waveform terminates before the filtering low frequency
943
                # cutoff, there is nothing to filter, so just go onto the next
944
                continue
×
945
            slc = slice(self._kmin, kmax)
1✔
946
            h[self._kmin:kmax] *= self._weight[det][slc]
1✔
947
            lr += (
1✔
948
                # <h, d>
949
                self.data[det][slc].inner(h[slc]).real -
950
                # <h, h>/2.
951
                0.5*h[slc].inner(h[slc]).real
952
            )
953
        return numpy.float64(lr)
1✔
954

955
    def loglikelihood(self, **params):
1✔
956
        r"""Computes the log likelihood of the paramaters,
957

958
        .. math::
959

960
            p(d|\Theta) = -\frac{1}{2}\sum_i
961
                \left<h_i(\Theta) - d_i | h_i(\Theta) - d_i\right>
962

963
        Parameters
964
        ----------
965
        \**params :
966
            The keyword arguments should give the values of each parameter to
967
            evaluate.
968

969
        Returns
970
        -------
971
        float
972
            The value of the log likelihood evaluated at the given point.
973
        """
974
        # since the loglr has fewer terms, we'll call that, then just add
975
        # back the noise term that canceled in the log likelihood ratio
976
        return self.loglr(**params) + self._lognl
1✔
977

978
    def logposterior(self, **params):
1✔
979
        """Computes the log-posterior probability at the given point in
980
        parameter space.
981

982
        Parameters
983
        ----------
984
        \**params :
985
            The keyword arguments should give the values of each parameter to
986
            evaluate.
987

988
        Returns
989
        -------
990
        float
991
            The value of the log-posterior evaluated at the given point in
992
            parameter space.
993
        metadata : tuple
994
            If ``return_meta``, the prior and likelihood ratio as a tuple.
995
            Otherwise, just returns the log-posterior.
996
        """
997
        # since the logplr has fewer terms, we'll call that, then just add
998
        # back the noise term that canceled in the log likelihood ratio
999
        logplr = self.logplr(**params)
1✔
1000
        if self.return_meta:
1✔
1001
            logplr, (pr, lr, lj) = logplr
×
1002
        else:
1003
            pr = lr = lj = None
1✔
1004
        return self._formatreturn(logplr + self._lognl, prior=pr, loglr=lr,
1✔
1005
                                  logjacobian=lj)
1006

1007

1008
class MarginalizedPhaseGaussianLikelihood(GaussianLikelihood):
1✔
1009
    r"""The likelihood is analytically marginalized over phase.
1010

1011
    This class can be used with signal models that can be written as:
1012

1013
    .. math::
1014

1015
        \tilde{h}(f; \Theta, \phi) = A(f; \Theta)e^{i\Psi(f; \Theta) + i \phi},
1016

1017
    where :math:`\phi` is an arbitrary phase constant. This phase constant
1018
    can be analytically marginalized over with a uniform prior as follows:
1019
    assuming the noise is stationary and Gaussian (see `GaussianLikelihood`
1020
    for details), the posterior is:
1021

1022
    .. math::
1023

1024
        p(\Theta,\phi|d)
1025
            &\propto p(\Theta)p(\phi)p(d|\Theta,\phi) \\
1026
            &\propto p(\Theta)\frac{1}{2\pi}\exp\left[
1027
                -\frac{1}{2}\sum_{i}^{N_D} \left<
1028
                    h_i(\Theta,\phi) - d_i, h_i(\Theta,\phi) - d_i
1029
                \right>\right].
1030

1031
    Here, the sum is over the number of detectors :math:`N_D`, :math:`d_i`
1032
    and :math:`h_i` are the data and signal in detector :math:`i`,
1033
    respectively, and we have assumed a uniform prior on :math:`phi \in [0,
1034
    2\pi)`. With the form of the signal model given above, the inner product
1035
    in the exponent can be written as:
1036

1037
    .. math::
1038

1039
        -\frac{1}{2}\left<h_i - d_i, h_i- d_i\right>
1040
            &= \left<h_i, d_i\right> -
1041
               \frac{1}{2}\left<h_i, h_i\right> -
1042
               \frac{1}{2}\left<d_i, d_i\right> \\
1043
            &= \Re\left\{O(h^0_i, d_i)e^{-i\phi}\right\} -
1044
               \frac{1}{2}\left<h^0_i, h^0_i\right> -
1045
               \frac{1}{2}\left<d_i, d_i\right>,
1046

1047
    where:
1048

1049
    .. math::
1050

1051
        h_i^0 &\equiv \tilde{h}_i(f; \Theta, \phi=0); \\
1052
        O(h^0_i, d_i) &\equiv 4 \int_0^\infty
1053
            \frac{\tilde{h}_i^*(f; \Theta,0)\tilde{d}_i(f)}{S_n(f)}\mathrm{d}f.
1054

1055
    Gathering all of the terms that are not dependent on :math:`\phi` together:
1056

1057
    .. math::
1058

1059
        \alpha(\Theta, d) \equiv \exp\left[-\frac{1}{2}\sum_i
1060
            \left<h^0_i, h^0_i\right> + \left<d_i, d_i\right>\right],
1061

1062
    we can marginalize the posterior over :math:`\phi`:
1063

1064
    .. math::
1065

1066
        p(\Theta|d)
1067
            &\propto p(\Theta)\alpha(\Theta,d)\frac{1}{2\pi}
1068
                     \int_{0}^{2\pi}\exp\left[\Re \left\{
1069
                         e^{-i\phi} \sum_i O(h^0_i, d_i)
1070
                     \right\}\right]\mathrm{d}\phi \\
1071
            &\propto p(\Theta)\alpha(\Theta, d)\frac{1}{2\pi}
1072
                     \int_{0}^{2\pi}\exp\left[
1073
                         x(\Theta,d)\cos(\phi) + y(\Theta, d)\sin(\phi)
1074
                     \right]\mathrm{d}\phi.
1075

1076
    The integral in the last line is equal to :math:`2\pi I_0(\sqrt{x^2+y^2})`,
1077
    where :math:`I_0` is the modified Bessel function of the first kind. Thus
1078
    the marginalized log posterior is:
1079

1080
    .. math::
1081

1082
        \log p(\Theta|d) \propto \log p(\Theta) +
1083
            I_0\left(\left|\sum_i O(h^0_i, d_i)\right|\right) -
1084
            \frac{1}{2}\sum_i\left[ \left<h^0_i, h^0_i\right> -
1085
                                    \left<d_i, d_i\right> \right]
1086

1087
    This class computes the above expression for the log likelihood.
1088
    """
1089
    name = 'marginalized_phase'
1✔
1090

1091
    def loglr(self, **params):
1✔
1092
        r"""Computes the log likelihood ratio,
1093

1094
        .. math::
1095

1096
            \log \mathcal{L}(\Theta) =
1097
                I_0 \left(\left|\sum_i O(h^0_i, d_i)\right|\right) -
1098
                \frac{1}{2}\left<h^0_i, h^0_i\right>,
1099

1100
        at the given point in parameter space :math:`\Theta`.
1101

1102
        Parameters
1103
        ----------
1104
        \**params :
1105
            The keyword arguments should give the values of each parameter to
1106
            evaluate.
1107

1108
        Returns
1109
        -------
1110
        numpy.float64
1111
            The value of the log likelihood ratio evaluated at the given point.
1112
        """
1113
        try:
1✔
1114
            wfs = self._waveform_generator.generate(**params)
1✔
1115
        except NoWaveformError:
×
1116
            # if no waveform was generated, just return 0
1117
            return 0.
×
1118
        hh = 0.
1✔
1119
        hd = 0j
1✔
1120
        for det, h in wfs.items():
1✔
1121
            # the kmax of the waveforms may be different than internal kmax
1122
            kmax = min(len(h), self._kmax)
1✔
1123
            # whiten the waveform
1124
            if self._kmin >= kmax:
1✔
1125
                # if the waveform terminates before the filtering low frequency
1126
                # cutoff, there is nothing to filter, so just go onto the next
1127
                continue
×
1128
            h[self._kmin:kmax] *= self._weight[det][self._kmin:kmax]
1✔
1129
            hh += h[self._kmin:kmax].inner(h[self._kmin:kmax]).real
1✔
1130
            hd += self.data[det][self._kmin:kmax].inner(h[self._kmin:kmax])
1✔
1131
        hd = abs(hd)
1✔
1132
        return numpy.log(special.i0e(hd)) + hd - 0.5*hh
1✔
1133

1134

1135
likelihood_evaluators = {cls.name: cls for cls in (
1✔
1136
    TestEggbox,
1137
    TestNormal,
1138
    TestRosenbrock,
1139
    TestVolcano,
1140
    GaussianLikelihood,
1141
    MarginalizedPhaseGaussianLikelihood,
1142
)}
1143

1144
__all__ = ['BaseLikelihoodEvaluator',
1✔
1145
           'TestNormal', 'TestEggbox', 'TestVolcano', 'TestRosenbrock',
1146
           'GaussianLikelihood', 'MarginalizedPhaseGaussianLikelihood',
1147
           'likelihood_evaluators']
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2025 Coveralls, Inc