
google / benchmark / build 1605
Coverage: 92%

Build on default branch: master
Ran 29 May 2018 10:29AM UTC · Jobs: 1 · Files: 36 · Run time: 3s
Build #1605 · push · travis-ci · dominichamon · pending completion

Benchmarking is hard. Making sense of the benchmarking results is even harder. (#593)

The first problem you have to solve yourself. The second one can be aided.
The benchmark library can compute some statistics over the repetitions,
which helps with grasping the results somewhat.
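
For concreteness, the aggregates in question are just summary statistics over the per-repetition timings. A tiny sketch of the same math in Python, on made-up numbers (the library itself computes these internally, in C++):

```python
# Illustrative only: hypothetical per-repetition real_time samples, in ns.
# The benchmark library reports these same aggregates over N repetitions.
import statistics

repetition_times_ns = [102.1, 99.8, 101.5, 100.2, 103.0]

print("mean:  ", statistics.mean(repetition_times_ns))
print("median:", statistics.median(repetition_times_ns))
print("stddev:", statistics.stdev(repetition_times_ns))
```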

But that only covers a single set of results. It does not really help with comparing
two benchmark results, which is the interesting bit. Thankfully, there are
the bundled `tools/compare.py` and `tools/compare_bench.py` scripts.
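
Roughly, they are invoked like so (the paths and the mode name are illustrative, from memory; check each script's `--help` for the exact interface):

```
tools/compare_bench.py ./old_results.json ./new_results.json
tools/compare.py benchmarks ./old_results.json ./new_results.json
```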

They can provide a diff between two benchmarking results. Yay!
Except not really: it's just a diff. While that is very informative and better than
nothing, it does not really help answer The Question - am I just looking at noise?
It's like not having the per-benchmark statistics at all...

Roughly, we can formulate the question as:
> Are these two benchmarks the same?
> Did my change actually change anything, or is the difference below the noise level?

Well, this really sounds like a [null hypothesis](https://en.wikipedia.org/wiki/Null_hypothesis), does it not?
So maybe we can use statistics here and solve all our problems?
lol, no, it won't solve all the problems. But maybe it can act as a tool
to better understand the output, just like the usual statistics on the repetitions...
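
To spell out the framing (this formulation is mine, not from any docs):

```python
# Hypothesis-testing framing, spelled out. alpha = 0.05 is just the usual
# convention, not something mandated by the benchmark tooling.
#   H0: both runs sample the same timing distribution (difference is noise)
#   H1: they do not (the change had a real effect)
ALPHA = 0.05

def looks_like_a_real_change(p_value: float, alpha: float = ALPHA) -> bool:
    # Reject H0 only if results at least this extreme would occur
    # less than `alpha` of the time when H0 is true.
    return p_value < alpha
```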

I'm making an assumption here that most people care about the change in the
average value, not in the standard deviation. Thus I believe we can use a t-test,
be it either [Student's t-test](https://en.wikipedia.org/wiki/Student%27s_t-test) or [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test).
**EDIT**: however, after @dominichamon's review, it was decided that it is better
to use the more robust [Mann–Whitney U test](https://en.wikipedia.org/wiki/Mann–Whitney_U_test).
I'm using [scipy.stats.mannwhitneyu](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html#scipy.stats.mannwhitneyu).
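
A minimal sketch of how that call looks (the sample data is made up; in the real tool the inputs are the measured per-repetition timings):

```python
from scipy import stats

# Hypothetical per-repetition timings for the baseline and the contender.
old_times = [10.2, 10.4, 10.3, 10.5, 10.1, 10.3, 10.4, 10.2, 10.3]
new_times = [ 9.8,  9.9, 10.0,  9.7,  9.9, 10.1,  9.8, 10.0,  9.9]

# Mann-Whitney U is nonparametric: it does not assume the timings are
# normally distributed, which benchmark noise rarely is.
u_stat, p_value = stats.mannwhitneyu(old_times, new_times, alternative='two-sided')
print("Mann-Whitney p-value:", p_value)

# For contrast, Welch's t-test (the parametric option mentioned above):
t_stat, t_pvalue = stats.ttest_ind(old_times, new_times, equal_var=False)
print("Welch's t-test p-value:", t_pvalue)
```

A small p-value then suggests the observed difference is unlikely to be noise alone.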

There are two new use... (continued)

1503 of 1724 relevant lines covered (87.18%)

5006333.93 hits per line

Jobs

ID  Job ID                                                      Ran                      Files  Coverage
1   1605.1 (COMPILER=g++ C_COMPILER=gcc BUILD_TYPE=Coverage)    29 May 2018 10:29AM UTC  0      87.18%

Travis Job 1605.1

Source Files on build 1605
Detailed source file information is not available for this build.
  • Travis Build #1605
  • Commit a6a1b0d7 on GitHub
  • Prev Build on master (#1601)
  • Next Build on master (#1613)