• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

Displayr / flipChart / 2943

pending completion
2943

push

travis-ci-com

chschan
Untracked: fix tests to export warnings about sorting

1550 of 1718 relevant lines covered (90.22%)

152.75 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

91.37
/R/preparedata.R
1
#' PrepareData
2
#'
3
#' Prepares input data for charting.
4
#' @param chart.type Character; chart type to be plotted.
5
#' @param subset subset An optional vector specifying a subset of
6
#'     observations to be used in the fitting process, or, the name of
7
#'     a variable in \code{data}. It may not be an expression.
8
#' @param weights An optional vector of sampling weights, or, the name
9
#'     of a variable in \code{data}. It may not be an expression.
10
#' @param input.data.table Array; typically a table of some kind,
11
#'     which is then processed using
12
#'     \code{\link[flipTables]{AsTidyTabularData}}.
13
#' @param input.data.tables List of array; each component is assumed
14
#'     to be a Qtable and will be processed using.
15
#'     \code{\link[flipTables]{AsTidyTabularData}}
16
#' @param input.data.raw List, containing variables or data.frames or Regression outputs from flipRegression.
17
#'     In the case of multiple Regression outputs, the labels default to the R name of the Regression output.
18
#' @param input.data.pasted List of length six; the first component of
19
#'     which is assumed to be from a user-entered/pasted table; will
20
#'     be processed by \code{\link{ParseUserEnteredTable}}.
21
#' @param input.data.other A PickAny Multi Q variable.
22
#' @param data.source Where multiple data inputs are provided, a text
23
#'     string can be provided to disambiguate. Refer to the source
24
#'     code for a precise understanding of how this works (it is not
25
#'     obvious and is not likely to be of any use for most cases, so
26
#'     should usually be left as a \code{NULL}).
27
#' @param signif.append Append attributes used to show statistical test for significance.
28
#' @param signif.symbol Character; Symbol used on chart to indicate significance. This can "Arrow" or "Caret".
29
#' @param signif.symbol.size Numeric; size of symbol in pixels.
30
#' @param signif.p.cutoffs Numeric; vector of p-values used to determine color of symbols.
31
#'     These values should be supplied in decreasing order. The colors used will correspond
32
#'     to the smallest cutoff larger than the p-value of that cell.
33
#' @param signif.colors.pos Character; vector of colors, of the same length as \code{signif.p.cutoffs}.
34
#' @param signif.colors.neg Character; vector of colors, of the same length as \code{signif.p.cutoffs}.
35
#' @param signif.colors.on.font Boolean; whether signif colors should also affect data label font colors.
36
#' @param first.aggregate Logical; whether or not the input data needs
37
#'     to be aggregated in this function. A single variable is
38
#'     tabulated, 2 variables are crosstabbed if \code{group.by.last} is selected,
39
#'     and otherwise the mean is computed. If \code{input.data.raw} contains
40
#'     two an 'X' variable and a 'Y' variable in the first two elements of the list,
41
#'     the data is automatically aggregated and crosstabbed.
42
#' @param scatter.input.columns.order (deprecated) Use \code{scatter.mult.yvals} instead.
43
#' @param scatter.mult.yvals Logical; When \code{chart.type} is "Scatter',
44
#'     a \code{TRUE} value indicates that columns of input.data.table or input.data.pasted
45
#'     should be considered multiple series instead of different attributes (default).
46
#' @param group.by.last Logical; \code{TRUE} and \code{first.aggregate} and there is data
47
#'     in either of \code{input.data.table} or \code{input.data.pasted}, the data is aggregated
48
#'     using the last variable
49
#' @param tidy Logical; whether or not the input data needs to be
50
#'     aggregated in this function (e.g., if an x and y variable have
51
#'     been provided, a contingency table is used to aggregate. This
52
#'     defaults to \code{TRUE}. It aggressively seeks to turn the data
53
#'     into a named vector or a matrix using
54
#'     \code{\link[flipTables]{TidyTabularData}}. This is not applied
55
#'     when \code{data.input.tables} are provided, or when the chart
56
#'     type is any of \code{"Scatter"}, \code{"Bean"},
57
#'     \code{"Histogram"}, \code{"Density"}, \code{"Box"}, or
58
#'     \code{"Violin"}.
59
#' @param tidy.labels Logical; whether to remove common prefixes from the
60
#'     labels of the input data.
61
#' @param transpose Logical; should the resulting matrix (of created)
62
#'     be transposed?
63
#' @param row.names.to.remove Character vector or delimited string of
64
#'     row labels specifying rows to remove from the returned table;
65
#'     default is \code{c("NET", "SUM")}
66
#' @param column.names.to.remove Character vector or delimited string
67
#'     of column labels specifying columns to remove from the returned
68
#'     table; default is \code{c("NET", "SUM")}.
69
#' @param split Character delimiter to split
70
#'     \code{row.names.to.remove} and \code{col.names.to.remove}
71
#'     on. Default is to split on either of \code{","} or \code{";"}.
72
#'     Assumed to be a regular expression; see \code{\link{strsplit}}.
73
#' @param hide.empty.rows.and.columns Logical; if \code{TRUE} empty
74
#'     rows and columns will be removed from the data.  Empty here
75
#'     meaning that a row or column contains all \code{NA} values, or
76
#'     in the case of percentages, that a row or column contains only
77
#'     0's. Retained for backwards-compatibility but is superseded by
78
#'     \code{hide.empty.rows} and \code{hide.empty.columns}.
79
#' @param hide.empty.rows Logical; hide rows with only NAs or 0's (percentages).
80
#' @param hide.empty.columns Logical; hide columns with only NAs or 0's (percentages).
81
#' @param select.rows String; Comma separated list of rows, by name or index
82
#'     to select from input table. If blank (default), then all rows are selected.
83
#' @param select.columns String; Comma separated list of columns, by name or index
84
#'     to select from input table. If blank (default), then all columns are selected.
85
#' @param auto.order.rows Logical; Automatically order rows by correspondence analysis.
86
#' @param sort.rows Logical; whether to sort the rows of the table. This operation is
87
#'     performed after row selection. (Ignored if \code{auto.order.rows} is true).
88
#' @param sort.rows.column String; If \code{sort.rows} is true, this column
89
#'     (specified by name or index) is used for sorting the rows. If not specified,
90
#'     the column with the largest \code{Column n} or the right-most column
91
#'     will be used for sorting.
92
#' @param sort.rows.exclude String; If \code{sort.rows} is \code{TRUE}, then rows
93
#'      in \code{sort.rows.exclude} will be excluded from sorting and
94
#'      appended at the bottom of the table.
95
#' @param sort.rows.decreasing Logical; Whether rows should be sorted in decreasing order.
96
#' @param auto.order.columns Logical; Automatically order columns by correspondence analysis.
97
#' @param sort.columns Logical; whether to sort the columns of the table.
98
#'      This operation is performed after column selection (Ignored if
99
#'      \code{auto.order.columns} is true.
100
#' @param sort.columns.row String; If \code{sort.columns} is true, this row
101
#'      (specified by name or index) is used for sorting the columns. If not specified,
102
#'      the row with the largest \code{n} or the bottom row
103
#'      will be used for sorting.
104
#' @param sort.columns.exclude String; If \code{sort.columns} is \code{TRUE}, then columns
105
#'      in \code{sort.columns.exclude} will be excluded from sorting and
106
#'      appended at the right of the table.'
107
#' @param sort.columns.decreasing Logical; Whether columns should be sorted in decreasing order.
108
#' @param hide.output.threshold Integer; If sample size ('Column n' or 'n') is provided
109
#'      then each cell in the input table will be checked to ensure
110
#'      'n' or 'Column n' is larger than specified threshold, otherwise an error
111
#'      message is given.
112
#' @param hide.values.threshold Integer; If sample size ('Column n' or 'n') is provided
113
#'      then each cell in the input table will be checked to ensure
114
#'      'n' or 'Column n' is larger than specified threshold,
115
#'      otherwise the cell will be set to \code{NA}.
116
#' @param hide.rows.threshold Integer; If sample size ('Column n' or 'n')
117
#'      is provided, then rows and with sample sizes smaller than threshold
118
#'      will be removed from table. Vectors will be treated as 1-d matrices
119
#' @param hide.columns.threshold Integer; If sample size ('Column n' or 'n')
120
#'      is provided, then columns with sample sizes smaller than threshold
121
#'      will be removed from table. Vectors will not be affected.
122
#' @param first.k.rows Integer; Number of rows to select from the top of the input table. This occurs after select and sort.
123
#' @param last.k.rows Integer; Number of rows to select from the bottom of the input table. This occurs after select and sort.
124
#' @param first.k.columns Integer; Number of columns to select from the left of the input table. This occurs after select and sort.
125
#' @param last.k.columns Integer; Number of columns to select from the right of the input table. This occurs after select and sort.
126
#' @param reverse.rows Logical; Whether to reverse order of rows. This operation is
127
#'      performed after row selection and sorting.
128
#' @param reverse.columns Logical; Whether to reverse order of columns. This operation
129
#'      is peformed after column selection and sorting.
130
#' @param show.labels Logical; If \code{TRUE}, labels are used for
131
#'     names in the data output if raw data is supplied.
132
#' @param as.percentages Logical; If \code{TRUE}, aggregate values in the
133
#' output table are given as percentages summing to 100. If \code{FALSE},
134
#' column sums are given.
135
#' @param hide.percent.symbol Percentage data is shown without percentage symbols and the symbol
136
#'  is also removed from the statistic attribute.
137
#' @param categorical.as.binary If data is aggregated and this is true, then categorical variables will be converted into indicator variables for each level in the factor.
138
#' @param date.format One of \code{"Automatic", "US", "International" or "No date formatting"}.
139
#' This is used to determine whether strings which are interpreted as dates
140
#' in the (row)names will be read in the US (month-day-year) or the
141
#' International (day-month-year) format. By default US format is used
142
#' if it cannot be deduced from the input data.
143
#' @param values.title The title for the values axis of a chart (e.g.,
144
#'     the y-axis of a column chart or the x-axis of a bar chart).
145
#' @param column.labels A comma separated list of names to replace the default column names
146
#'      of \code{pd$data}. This is applied after all other data manipulations
147
#' @param row.labels A comma separated list of names to replace the default row names
148
#'      of \code{pd$data}. This is applied after all other data manipulations
149
#' @details It is assumed that only one of \code{input.data.pasted},
150
#'     \code{input.data.table}, \code{input.data.tables},
151
#'     \code{input.data.other}, \code{input.data.raw} is non-NULL.
152
#'     They are checked for nullity in that order.
153
#' @importFrom flipU ConvertCommaSeparatedStringToVector
154
#' @importFrom flipTransformations ParseUserEnteredTable
155
#'     SplitVectorToList
156
#' @importFrom flipTables TidyTabularData RemoveRowsAndOrColumns SelectRows SelectColumns SortRows SortColumns ReverseRows ReverseColumns HideOutputsWithSmallSampleSizes HideValuesWithSmallSampleSizes HideRowsWithSmallSampleSizes HideColumnsWithSmallSampleSizes AutoOrderRows AutoOrderColumns ConvertQTableToArray
157
#' @importFrom flipData TidyRawData
158
#' @importFrom flipFormat Labels Names ExtractCommonPrefix
159
#' @importFrom flipStatistics Table WeightedTable
160
#' @importFrom verbs Sum
161
#' @importFrom stats setNames
162
#' @return A list with components \itemize{ \item \code{data} - If
163
#'     possible, a named vector or matrix, or if that is not posible
164
#'     or a data.frame is requested, a data.frame.  \item
165
#'     \code{weights} - Numeric vector of user-supplied weights.
166
#'     \item \code{values.title} - Character string to be used for the
167
#'     y-axis title; will only be a non-empty string if some
168
#'     aggregation has been performed on \code{data} \item
169
#'     \code{scatter.variable.indices} A named vector indicating which
170
#'     columns in \code{data} should be plotted in a scatterplot as
171
#'     \code{x}, \code{y}, \code{sizes}, and \code{colors}. Is
172
#'     \code{NULL} if \code{chart.type} does not contain
173
#'     \code{"Scatter"} or \code{"Bubble"}. \code{NA} is used when the
174
#'     data does not exist.  }
175
#' @export
176
#' @seealso \code{\link[flipTables]{AsTidyTabularData}},
177
#'     \code{\link[flipData]{TidyRawData}},
178
#'     \code{\link[flipTransformations]{ParseUserEnteredTable}}
179
PrepareData <- function(chart.type,
180
                        subset = TRUE,
181
                        weights = NULL,
182
                        input.data.table = NULL,
183
                        input.data.tables = NULL,
184
                        input.data.raw = NULL,
185
                        input.data.pasted = NULL,
186
                        input.data.other = NULL,
187
                        data.source = NULL,
188
                        signif.append = FALSE,
189
                        signif.symbol = "Arrow",
190
                        signif.symbol.size = 12,
191
                        signif.p.cutoffs = c(0.5, 0.2, 0.1, 0.05, 0.01, 0.005, 0.001, 1e-04, 1e-05, 1e-06),
192
                        signif.colors.pos = rep("#0000FF", 10),
193
                        signif.colors.neg = rep("#FF0000", 10),
194
                        signif.colors.on.font = FALSE,
195
                        first.aggregate = NULL,
196
                        scatter.input.columns.order = NULL,
197
                        scatter.mult.yvals = FALSE,
198
                        group.by.last = FALSE,
199
                        tidy = TRUE,
200
                        tidy.labels = FALSE,
201
                        transpose = FALSE,
202
                        select.rows = NULL,
203
                        first.k.rows = NA,
204
                        last.k.rows = NA,
205
                        select.columns = NULL,
206
                        first.k.columns = NA,
207
                        last.k.columns = NA,
208
                        auto.order.rows = FALSE,
209
                        sort.rows = FALSE,
210
                        sort.rows.exclude = c("NET", "SUM", "Total"),
211
                        sort.rows.column = NULL,
212
                        sort.rows.decreasing = FALSE,
213
                        auto.order.columns = FALSE,
214
                        sort.columns = FALSE,
215
                        sort.columns.exclude = c("NET", "SUM", "Total"),
216
                        sort.columns.row = NULL,
217
                        sort.columns.decreasing = FALSE,
218
                        hide.output.threshold = 0,
219
                        hide.values.threshold = 0,
220
                        hide.rows.threshold = 0,
221
                        hide.columns.threshold = 0,
222
                        reverse.rows = FALSE,
223
                        reverse.columns = FALSE,
224
                        row.names.to.remove = c("NET", "SUM", "Total"),
225
                        column.names.to.remove = c("NET", "SUM", "Total"),
226
                        split = "[;,]",
227
                        hide.empty.rows.and.columns = TRUE,
228
                        hide.empty.rows = hide.empty.rows.and.columns,
229
                        hide.empty.columns = hide.empty.rows.and.columns,
230
                        hide.percent.symbol = FALSE,
231
                        as.percentages = FALSE,
232
                        categorical.as.binary = NULL,
233
                        date.format = "Automatic",
234
                        show.labels = TRUE,
235
                        column.labels = "",
236
                        row.labels = "",
237
                        values.title = "")
238
{
239

240
    # Scenarios to address
241
    # - User provides a single numeric variable and wants to plot a bar for each value.
242
    # - User provides a single categorical variable and wants to plot a bar for each value.
243
    # - User provides two numeric variables and wants to plot a stacked bar plot of the unique values.
244
    # - User provides two numeric variables and wants to plot a stacked column chart of the crosstab.
245
    # - Data is in a weird format (e.g., JSON) for Venn diagram.
246
    # - User wants to treat variables or variable sets NOT as 'raw' data. E.g., performing a correspondence analysis of raw data.
247
    # - User wants to treat pasted data as raw data.
248
    # - User wants to treat otherData as raw data
249
    # - Scatterplots of raw data, where separate drop boxes have been used as inputs.
250
    # - Scatterplots of raw data, where a table has been used as an input.
251
    # - Scatterplots of raw data, where pasted data has been used as an input.
252
    # - Venn diagrams of JSON.
253
    # - Venn diagrams of multiple binary variables
254
    # - Histogram, Density, Bean, Violin, and Box plots of numeric variables
255
    # - Histogram, Density, Bean, Violin, and Box plots of an x and a y variable, where the histograms are conditional on the X.
256
    # - Aggregation by crosstabbing
257
    # - Sankey requires a data.frame
258
    # - means of multiple variables of raw data if aggregating
259
    ## Other things for the future...
260
    # - Taking the average of multiple numeric variables.
261
    # - Frequencies of multiple categorical variables (Pick One - Multi)
262

263
    #### This function does the following things:
264
    # 0. Checks if an input contains a subscripted Q Table and removes attributes
265
    #    if the Viz output was created before the release of Q Table subscripting.
266
    # 1. Converts the data inputs into a single data object called 'data'.
267
    # 2. Filters the data and/or removes missing values
268
    # 3. Aggregate the data if so required.
269
    # 4. Tailoring the data for the chart type.
270
    # 5. Transformations of the tidied data (sorting, transposing, removing rows).
271

272
    # This function needs to be frequently understood and generalized
273
    # by multiple people. Consequently, the goal has been to write the code in such a
274
    # way as to make it as easy to read and maintain as possible. In particular,
275
    # many obvious ways to make this code more efficent have been ignored in the interests
276
    # of making it easy to read (and in recognition that the efficiency gains would be trivial anyway).
277

278
    ###########################################################################
279
    # 0. Check subscripted QTables unclassed and attr removed for legacy outputs.
280
    ###########################################################################
281

282
    allow.qtables <- get0("ALLOW.QTABLE.CLASS", ifnotfound = FALSE, envir = .GlobalEnv)
634✔
283

284
    if (!allow.qtables)
634✔
285
    {
286
        input.data.table <- unclassQTable(input.data.table)
633✔
287
        input.data.tables <- unclassQTable(input.data.tables)
633✔
288
        input.data.raw <- unclassQTable(input.data.raw)
633✔
289
        input.data.other <- unclassQTable(input.data.other)
633✔
290
    }
291

292
    ###########################################################################
293
    # 1. Converts the data inputs into a single data object called 'data'.
294
    ###########################################################################
295
    data.source.index <- if (is.null(data.source)) NULL else
634✔
296
        switch(data.source,
634✔
297
                "Link to a table" = 1,
634✔
298
                "Link to a table in 'Pages'" = 1,
634✔
299
                "Link to multiple tables" = 2,
634✔
300
                "Link to multiple tables in 'Pages'" = 2,
634✔
301
                "Link to a variable" = 3,
634✔
302
                "Link to a variable in 'Data'" = 3,
634✔
303
                "Link to variables" = 3,
634✔
304
                "Link to variables in 'Data'" = 3,
634✔
305
                "Question Type: Pick Any" = 3,
634✔
306
                "Variable Set: Binary - Multi" = 3,
634✔
307
                "Question Type: Number - Multi" = 3,
634✔
308
                "Variable Set: Numeric - Multi" = 3,
634✔
309
                "Type or paste in data" = 4,
634✔
310
                "Use an existing R Output" = 5,
634✔
311
                "Use an existing R Output in 'Pages'" = 5,
634✔
312
                "Link to questions" = 3,
634✔
313
                "Link to variable sets in 'Data'" = 3,
634✔
314
                "Link to a question" = 3,
634✔
315
                "Link to a variable in 'Data'" = 3,
634✔
316

317
                       { # Default
634✔
318
                           warning("'", data.source, "' is not a recognized data source.")
×
319
                           3
×
320
                       }
321
                )
322
    # Convert lists of NULLs into single NULLs.
323
    if (all(sapply(input.data.raw, is.null)))
634✔
324
        input.data.raw <- NULL
297✔
325
    # Ignore colors/sizes/labels if x and y are not supplied
326
    if (length(input.data.raw) >= 2 && all(vapply(input.data.raw[1:2], is.null, logical(1L))))
634✔
327
        input.data.raw <- NULL
×
328
    if (all(vapply(input.data.pasted, is.null, logical(1L))))
634✔
329
        input.data.pasted <- NULL
572✔
330
    # Check that there is no ambiguity regarding which input to use.
331
    checkNumberOfDataInputs(data.source.index, input.data.table, input.data.tables,
634✔
332
                            input.data.raw, input.data.pasted, input.data.other)
634✔
333
    # Assign the data to 'data'
334
    data <- processInputData(input.data.table, subset, weights)
626✔
335
    if (is.null(data))
626✔
336
        data <- input.data.tables
449✔
337
    if (is.null(data))
626✔
338
        data <- coerceToDataFrame(input.data.raw, chart.type)
445✔
339
    if (is.null(data))
620✔
340
        data <- input.data.other
115✔
341
    if (is.null(data))
620✔
342
        data <- processPastedData(input.data.pasted,
62✔
343
                                  warn = tidy,
62✔
344
                                  date.format, subset, weights)
62✔
345

346
    # Replacing variable names with variable/question labels if appropriate
347
    if (is.data.frame(data))
620✔
348
        names(data) <- if (show.labels) Labels(data) else Names(data)
364✔
349
    chart.title <- attr(data, "title")
620✔
350

351
    ###########################################################################
352
    # 2. Filters the data and/or removes missing values
353
    ###########################################################################
354
    if (isScatter(chart.type) && !is.null(input.data.raw) && containsQTable(input.data.raw))
620✔
355
        subset <- TRUE
103✔
356
    filt <- length(subset) > 1 && NROW(subset) == NROW(data)
620✔
357
    if (!is.null(input.data.raw) || filt || NROW(weights) == NROW(data))
620✔
358
    {
359
        missing <- if (chart.type %in% c("Venn", "Sankey") && !any(checkRegressionOutput(input.data.raw)))
325✔
360
            "Exclude cases with missing data" else "Use partial data"
325✔
361
        n <- NROW(data)
325✔
362
        if (invalid.joining <- !is.null(attr(data, "InvalidVariableJoining")))
325✔
363
        {
364
            if (!isDistribution(chart.type) && length(subset) > 1 || NROW(weights) > 1)
12✔
365
                warning("The variables have been automatically spliced together without ",
4✔
366
                        "any knowledge of which case should be matched with which. ",
4✔
367
                        "This may cause the results to be misleading.")
4✔
368
        }
369
        # As we can potentially use the variable in two different ways, we suppress the warning
370
        if (isScatter(chart.type))
325✔
371
        {
372
            # Make sure column names are unique otherwise TidyData will remove
373
            # them WITHOUT warning
374
            data <- suppressWarnings(TidyRawData(data, subset = subset,
123✔
375
                    weights = weights, missing = missing, error.if.insufficient.obs = FALSE,
123✔
376
                    remove.missing.levels = FALSE))
123✔
377
        }
378
        if (!isScatter(chart.type))
325✔
379
            data <- TidyRawData(data, subset = subset, weights = weights,
202✔
380
                        missing = missing, error.if.insufficient.obs = FALSE,
202✔
381
                        remove.missing.levels = isDistribution(chart.type))
202✔
382
        if (invalid.joining)
323✔
383
            attr(data, "InvalidVariableJoining") <- TRUE
10✔
384
        n.post <- NROW(data)
323✔
385
        if (missing == "Exclude cases with missing data" && n.post < n)
323✔
386
            warning("After removing missing values and/or filtering, ", n.post,
1✔
387
                    " observations remain.")
1✔
388
        weights <- setWeight(data, weights)
323✔
389
    }
390
    if (filt)
618✔
391
        attr(data, "assigned.rownames") <- FALSE
43✔
392

393

394
    ###########################################################################
395
    # 3. Aggregate the data if so required.
396
    ###########################################################################
397
    crosstab <- !(chart.type %in% c("Scatter", "Venn") || isDistribution(chart.type)) &&
618✔
398
                 (rawDataLooksCrosstabbable(input.data.raw) || group.by.last)
618✔
399
    if (is.null(first.aggregate))
618✔
400
        first.aggregate <- crosstab
470✔
401
    if ((chart.type %in% c("Scatter", "Venn") || isDistribution(chart.type)) &&
618✔
402
        first.aggregate)
618✔
403
    {
404
        warning("Data is not aggregated for this chart type.")
1✔
405
        first.aggregate <- FALSE
1✔
406
    }
407
    if (crosstab || first.aggregate)
618✔
408
    {
409
        #crosstab <- NCOL(data) == 2 || group.by.last
410
        if (crosstab && !is.null(attr(data, "InvalidVariableJoining")))
111✔
411
            warning("The variables being crosstabbed have different lengths; ",
2✔
412
                    "it is likely that the crosstab is invalid.")
2✔
413
        data <- aggregateDataForCharting(data, weights, chart.type,
111✔
414
                    crosstab, categorical.as.binary, as.percentages)
111✔
415
        if (crosstab)
111✔
416
            group.by.last <- TRUE
48✔
417
    }
418

419
    ###########################################################################
420
    # 4. Tailoring the data for the chart type.
421
    ###########################################################################
422
    multiple.tables <- isTableList(input.data.table) || isTableList(input.data.tables)
618✔
423
    data <- prepareForSpecificCharts(data, multiple.tables, input.data.raw, chart.type,
618✔
424
                                     weights, show.labels, scatter.mult.yvals)
618✔
425
    weights <- setWeight(data, weights)
618✔
426
    scatter.mult.yvals <- isTRUE(attr(data, "scatter.mult.yvals"))
618✔
427

428
    ###########################################################################
429
    # 5. Transformations of the tidied data (e.g., sorting, transposing, removing rows).
430
    ###########################################################################
431
    original.dim.names <- dimnames(data)
618✔
432
    if (isTRUE(transpose) && isScatter(chart.type))
618✔
433
    {
434
        warning("Data was not transposed. This option is incompatible with Scatter charts")
1✔
435
        transpose <- FALSE
1✔
436
    }
437

438
    # Add info about significance arrows - this needs to occur here
439
    # so that the stat testing info makes use of RearrangeRowsColumn
440
    if (!is.null(attr(input.data.table, "QStatisticsTestingInfo", exact = TRUE)) && signif.append)
618✔
441
        data <- addStatTesting(data, attr(data, "QStatisticsTestingInfo"), signif.p.cutoffs,
13✔
442
                    signif.colors.pos, signif.colors.neg, signif.colors.on.font, signif.symbol, signif.symbol.size)
13✔
443

444

445
    # Do not drop 1-column table to keep name for legend
446
    drop <- (tidy && (chart.type %in% c("Pie", "Donut") ||
618✔
447
            !any(nchar(select.columns), na.rm = TRUE) &&
618✔
448
            !any(nchar(column.labels), na.rm = TRUE)))
618✔
449
    data <- transformTable(data, chart.type, multiple.tables, tidy, drop,
617✔
450
                   is.raw.data = !is.null(input.data.raw) || !is.null(input.data.pasted) || !is.null(input.data.other),
617✔
451
                   hide.output.threshold, hide.values.threshold, hide.rows.threshold, hide.columns.threshold,
617✔
452
                   transpose, group.by.last || first.aggregate,
617✔
453
                   hide.empty.rows, hide.empty.columns, date.format)
617✔
454

455
    # Sort must happen AFTER tidying
456
    data <- RearrangeRowsColumns(data,
617✔
457
                                 multiple.tables =  multiple.tables,
617✔
458
                                 select.rows, first.k.rows, last.k.rows,
617✔
459
                                 select.columns, first.k.columns, last.k.columns,
617✔
460
                                 row.names.to.remove, column.names.to.remove, split,
617✔
461
                                 auto.order.rows, auto.order.columns,
617✔
462
                                 sort.rows, sort.rows.decreasing, sort.rows.column,
617✔
463
                                 sort.rows.exclude, reverse.rows,
617✔
464
                                 sort.columns, sort.columns.decreasing, sort.columns.row,
617✔
465
                                 sort.columns.exclude, reverse.columns)
617✔
466

467

468

469
    # Calculate percentages after all the select/hide operations are completed
470
    data <- convertPercentages(data, as.percentages, hide.percent.symbol, chart.type, multiple.tables)
616✔
471

472
    # Update QStatisticsTestingInfo to match data manipulations
473
    # This is not used by R-viz or PPT, only for Excel exporting
474
    if (!is.null(attr(input.data.table, "QStatisticsTestingInfo", exact = TRUE)) && signif.append)
616✔
475
        data <- updateQStatisticsInfo(data, original.dim.names, transpose)
13✔
476

477
    if (any(nchar(column.labels)))
616✔
478
        data <- replaceDimNames(data, 2, column.labels)
3✔
479
    if (any(nchar(row.labels)))
616✔
480
        data <- replaceDimNames(data, 1, row.labels)
1✔
481

482
    if (scatter.mult.yvals)
616✔
483
        data <- convertScatterMultYvalsToDataFrame(data, input.data.raw, show.labels, date.format)
28✔
484

485

486
    ###########################################################################
487
    # Finalizing the result.
488
    ###########################################################################
489

490

491
    if (tidy.labels)
616✔
492
        data <- tidyLabels(data, chart.type)
17✔
493
    if (isScatter(chart.type)) # to remove span NETS
616✔
494
        data <- RemoveRowsAndOrColumns(data,
180✔
495
                row.names.to.remove = row.names.to.remove,
180✔
496
                column.names.to.remove = column.names.to.remove, split = split)
180✔
497
    if (filt && !is.null(attr(subset, "label")) && !is.null(input.data.raw) && NCOL(data) == 1 &&
616✔
498
        chart.type %in% c("Table", "Area", "Bar", "Column", "Line", "Radar", "Palm", "Time Series"))
616✔
499
    {
500
        # Do not drop 1-column table (from aggregated data) to keep name for legend
501
        data <- CopyAttributes(as.matrix(data), data)
7✔
502
        colnames(data) <- attr(subset, "label")
7✔
503
        drop <- FALSE
7✔
504
    }
505
    data <- setAxisTitles(data, chart.type, drop, values.title)
616✔
506
    values.title <- attr(data, "values.title")
616✔
507
    categories.title <- attr(data, "categories.title")
616✔
508
    attr(data, "values.title") <- NULL
616✔
509
    attr(data, "categories.title") <- NULL
616✔
510
    if (multiple.tables)
616✔
511
    {
512
        for (i in seq_along(data))
9✔
513
        {
514
            attr(data[[i]], "values.title") <- NULL
22✔
515
            attr(data[[i]], "categories.title") <- NULL
22✔
516
            if (NCOL(data[[i]]) > 2)
22✔
517
                attr(data[[i]], "statistic") <- NULL
3✔
518
        }
519
    }
520
    if (isScatter(chart.type) && !is.null(input.data.raw))
616✔
521
        data <- rmScatterDefaultNames(data)
122✔
522
    if (scatter.mult.yvals)
616✔
523
        attr(data, "scatter.mult.yvals") <- TRUE
28✔
524

525
    # Do not re-assign scatter variable indices if it already
526
    # exists (this is sometimes set in ExtractChartData
527
    # for some S3 classes) unless specifically requested
528
    if (isScatter(chart.type) && !scatter.mult.yvals &&
616✔
529
        (is.null(attr(data, "scatter.variable.indices")) ||
616✔
530
         any(nchar(select.columns), na.rm = TRUE)))
616✔
531
        attr(data, "scatter.variable.indices") <- scatterVariableIndices(input.data.raw, data, show.labels)
12✔
532

533
    # This is a work around bug RS-3402
534
    # This is now fixed in Q 5.2.7+, but we retain support for older versions
535
    # by converting to a matrix if necessary
536
    if (chart.type == "Table" && !is.null(attr(data, "statistic")) &&
616✔
537
        (is.null(dim(data)) || length(dim(data)) == 1))
616✔
538
    {
539
        tmp <- attr(data, "statistic")
3✔
540
        data <- as.matrix(data)
3✔
541
        attr(data, "statistic") <- tmp
3✔
542
    }
543

544
    # Modify multi-stat QTables so they are 3 dimensional arrays
545
    # and statistic attribute from the primary statistic
546
    # This is needed to correctly export chart to powerpoint and
547
    # R GUI code checks the statistic attribute to determine axis formatting
548
    if (!tidy && is.array(data) && !is.null(attr(data, "questions")) &&
616✔
549
        is.null(attr(data, "statistic")))
616✔
550
    {
551
        data <- ConvertQTableToArray(data)
30✔
552
        #attr(data, "statistic") <- dimnames(data)[[3]][1]
553
        #attr(data, "multi-stat") <- TRUE
554
    }
555
    if (sort.rows)
616✔
556
        attr(data, "sorted.rows") <- TRUE
9✔
557
    if (!is.null(input.data.table))
616✔
558
        attr(data, "footerhtml") <- attr(input.data.table, "footerhtml", exact = TRUE)
177✔
559

560
    list(data = data,
616✔
561
         weights = weights,
616✔
562
         values.title = values.title,
616✔
563
         categories.title = categories.title,
616✔
564
         chart.title = chart.title,
616✔
565
         chart.footer = attr(data, "footer", exact = TRUE),
616✔
566
         scatter.variable.indices = attr(data, "scatter.variable.indices"))
616✔
567
}
568

569
replaceDimNames <- function(x, dim, labels)
570
{
571
    if (length(dim(x)) < dim)
4✔
572
        x <- CopyAttributes(as.matrix(x), x)
×
573

574
    new.labels <- paste0(dimnames(x)[[dim]], rep("", dim(x)[dim])) # get length right
4✔
575
    tmp.labels <- ConvertCommaSeparatedStringToVector(labels)
4✔
576
    tmp.len <- min(length(tmp.labels), length(new.labels))
4✔
577
    new.labels[1:tmp.len] <- tmp.labels[1:tmp.len]
4✔
578
    dimnames(x)[[dim]] <- new.labels
4✔
579
    return(x)
4✔
580
}
581

582

583
#' Handle input of table or tables
584
#' @noRd
585
#' @description This function allows a list of tables to be supplied
586
#'  via the \code{input.data.table} argument in the same way as
587
#'  \code{input.data.tables}.
588
#' @param x Input data which may be a matrix or list of matrix
589
unlistTable <- function(x)
590
{
591
    if (is.null(x))
×
592
        return(x)
×
593
    if (is.list(x) && !is.data.frame(x) && length(x) == 1)
×
594
        return(x[[1]])
×
595
    else
596
        return(x)
×
597
}
598

599
isTableList <- function(x)
600
{
601
    inherits(x, "list") && !is.data.frame(x) && is.list(x) && length(x) > 1 &&
1,233✔
602
    (is.matrix(x[[1]]) || is.data.frame(x[[1]]) || is.numeric(x[[1]]))
1,233✔
603
}
604

605
isScatter <- function(chart.type)
606
{
607
    grepl("Scatter|Bubble", chart.type)
8,387✔
608
}
609

610
#' @importFrom verbs Sum
611
crosstabOneVariable <- function(x, group, weights = NULL,
612
        categorical.as.binary = FALSE, as.percentages = FALSE)
613
{
614
    data <- data.frame(x = x, y = group)
53✔
615
    data$w <- if (is.null(weights)) rep.int(1L, NROW(data)) else weights
53✔
616

617
    if (is.numeric(x) || !categorical.as.binary)
53✔
618
    {
619
        data$x <- AsNumeric(data$x, binary = FALSE)
20✔
620
        if (!is.null(weights))
20✔
621
        {
622
            data$xw <- data$x * weights
1✔
623
            out <- Table(xw ~ y, data = data, FUN = sum) / Table(w ~ y, data = data, FUN = sum)
1✔
624

625
        } else
626
            out <- Table(x ~ y, data = data, FUN = mean)
19✔
627
        attr(out, "statistic") <- "Average"
20✔
628
        return(out)
20✔
629
    }
630
    out <- Table(w ~ x + y, data = data, FUN = sum)
33✔
631
    if (as.percentages)
33✔
632
    {
633
        out <- out / Sum(data$w * !is.na(data$x), remove.missing = FALSE) * 100
1✔
634
        attr(out, "statistic") <- "%"
1✔
635
    } else
636
        attr(out, "statistic") <- "Counts"
32✔
637
    out
33✔
638
}
639

640

641

642

643

644
#' Aggregrate Raw Data For Charting
645
#' @param data \code{data.frame} containing raw data
646
#' @param weights numeric vector of weights
647
#' @param chart.type character; type of chart to be plotted
648
#' @param crosstab Aggregate using a contingency table.
649
#' @param categorical.as.binary Whether to convert factors to indicator variables
650
#' @param as.percentages Whether to return percentages instead of counts.
651
#'     This is only used if the chart.type is "Heat". The difference between these
652
#'     calculations is this percentage uses the number of observations in the dataframe
653
#'     as the denomicator. For bar/column charts, it is computing row percentages.
654
#' @return aggregated data
655
#' @noRd
656
#' @importFrom flipStatistics Table WeightedTable
657
#' @importFrom flipTransformations AsNumeric
658
aggregateDataForCharting <- function(data, weights, chart.type, crosstab,
659
        categorical.as.binary, as.percentages)
660
{
661
    if (chart.type != "Heat")
111✔
662
        as.percentages <- FALSE
109✔
663

664
    # In tables that show aggregated tables, only the x-axis title is
665
    # taken from dimnames. But both names should be set in case
666
    # the table is transposed
667
    if (NCOL(data) == 1)
111✔
668
    {
669
        out <- as.matrix(WeightedTable(unlist(data), weights = weights))
21✔
670
        names(dimnames(out)) <- c(names(data)[1], "")
21✔
671
        attr(out, "statistic") <- "Count"
21✔
672
    }
673
    else if (crosstab)
90✔
674
    {
675
        if (is.null(categorical.as.binary))
48✔
676
            categorical.as.binary <- TRUE
45✔
677

678
        data <- as.data.frame(data)
48✔
679
        tmp.names <- names(data)
48✔
680
        k <- NCOL(data)
48✔
681
        group.var <- data[, k]
48✔
682

683
        if (k <= 2)
48✔
684
        {
685
            out <- crosstabOneVariable(data[, 1], group.var, weights,
43✔
686
                        categorical.as.binary, as.percentages)
43✔
687
            if (attr(out, "statistic") == "Average")
43✔
688
                attr(out, "categories.title") <- tmp.names[2]
14✔
689
            else
690
                names(dimnames(out)) <- tmp.names
29✔
691
        }
692
        else
693
        {
694
            res <- lapply(data[, -k], crosstabOneVariable, group = group.var,
5✔
695
                        weights = weights, categorical.as.binary = categorical.as.binary,
5✔
696
                        as.percentages = as.percentages)
5✔
697
            out <- do.call("rbind", res)
5✔
698

699
            if (chart.type == "Heat")
5✔
700
                names(dimnames(out)) <- c("", attr(group.var, "question", exact = TRUE))
×
701
            else
702
                names(dimnames(out)) <- c("", tmp.names[2])
5✔
703

704
            attr.list <- lapply(res, attr, "statistic", exact = TRUE)
5✔
705
            if (all(attr.list == attr.list[[1]]))
5✔
706
                attr(out, "statistic") <- setNames(attr.list[[1]], names(attr.list[1]))
5✔
707
        }
708
    }
709
    else # first.aggregate
710
    {
711
        if (is.null(categorical.as.binary))
42✔
712
            categorical.as.binary <- FALSE
40✔
713

714
        if (categorical.as.binary)
42✔
715
        {
716
            tmp.dat <- data
1✔
717
            tmp.names <- Names(data)
1✔
718
            tmp.numeric <- sapply(data, is.numeric)
1✔
719
        }
720
        if (is.data.frame(data))
42✔
721
            data <- AsNumeric(data, binary = categorical.as.binary)
36✔
722
        if (!is.null(weights))
42✔
723
        {
724
            xw <- sweep(data, 1, weights, "*")
3✔
725
            sum.xw <- apply(xw, 2, sum, na.rm = TRUE)
3✔
726
            w <- matrix(weights, nrow(data), ncol(data))
3✔
727
            w[is.na(data)] <- 0
3✔
728
            sum.w <- apply(w, 2, sum)
3✔
729
            out <- sum.xw / sum.w
3✔
730
        } else
731
           out <- apply(data, 2, mean, na.rm = TRUE)
39✔
732

733
        if (categorical.as.binary && any(!tmp.numeric))
42✔
734
        {
735
            ind <- which(!tmp.numeric)
1✔
736
            for (ii in ind)
1✔
737
            {
738
                tmp.pos <- grep(paste0("^", tmp.names[ii]), names(out))
2✔
739
                names(out)[tmp.pos] <- levels(tmp.dat[[ii]])
2✔
740
            }
741
        }
742
        out <- as.matrix(out)
42✔
743

744
        # If ANY of the variables have been converted to percentages
745
        # label 'statistic' attribute to prevent mixed summary
746
        # statistics from being summed
747
        if (categorical.as.binary && any(!tmp.numeric))
42✔
748
            attr(out, "statistic") <- "%"
1✔
749
        else if (!categorical.as.binary || all(tmp.numeric))
41✔
750
            attr(out, "statistic") <- "Average"
41✔
751
    }
752
    attr(out, "assigned.rownames") <- TRUE
111✔
753
    out
111✔
754
}
755

756
#' coerceToDataFrame
757
#'
758
#' @description Takes various formats of data (in particular, lists of variables and
759
#' data.frames, and forces them to become a data frame. Where the coercion
760
#' involves creating rows in the data frame that are unlikely to be from the same analysis unit, a warning
761
#' is provided.
762
#' @param x Input data which may be a list of variables or dataframe
763
#' @param chart.type For any value except \code{"Scatter"}, x$Y will be
764
#'      ignored if x$X contains more than one variable
765
#' @param remove.NULLs Logical; whether to remove null entries
766
#' @importFrom flipTables TidyTabularData
767
#' @return A \code{\link{data.frame}})
768
#' @importFrom stats sd
769
#' @importFrom flipChartBasics MatchTable
770
#' @importFrom flipFormat TidyLabels
771
#' @importFrom flipU MakeUniqueNames
772
coerceToDataFrame <- function(x, chart.type = "Column", remove.NULLs = TRUE)
773
{
774
    if (is.null(x))
445✔
775
        return(x)
115✔
776
    if (is.data.frame(x))
330✔
777
        return(x)
9✔
778
    if (is.list(x) && length(x) == 1 && is.matrix(x[[1]])) # List only contains a matrix
321✔
779
    {
780
        tmp.names <- getFullRowNames(x[[1]])
7✔
781
        x <- as.data.frame(x[[1]])
7✔
782
        rownames(x) <- tmp.names
7✔
783
        return(x)
7✔
784
    }
785
    if (is.character(x))
314✔
786
    {
787
        x <- TidyTabularData(x)
×
788
        rownames(x) <- getFullRowNames(x)
×
789
        return(as.data.frame(x))
×
790
    }
791

792
    # For plotting regression output in a scatterplot, coerce regression object to chart data
793
    if (any(reg.outputs <- checkRegressionOutput(x)) && isScatter(chart.type) && is.list(x))
314✔
794
    {
795
        if (reg.outputs[1])
86✔
796
            x[[1]] <- extractRegressionScatterData(x[[1]])
56✔
797
        if (reg.outputs[2])
86✔
798
        {
799
            # Always expect names attributes of the models or table to be passed by Q/Displayr
800
            # However, catch case where names arent provided in the Y element of input.data.raw
801
            reg.names <- if (!is.null(names(x[[2]]))) names(x[[2]]) else LETTERS[seq_along(x[[2]])]
40✔
802
            x[[2]] <- mapply(extractRegressionScatterData,
40✔
803
                             x = x[[2]], y.axis = TRUE, name = reg.names, SIMPLIFY = FALSE)
40✔
804
        }
805

806
    }
807

808

809
    # if labels are present in raw data, extract and store for later
810
    rlabels <- x$labels
314✔
811
    x$labels <- NULL
314✔
812

813
    # Dealing with situation where x$X is a list containing only one thing.
814
    if (is.list(x[[1]]) && length(x[[1]]) == 1)
314✔
815
        x[[1]] <- x[[1]][[1]]
19✔
816

817
    # For Scatterplot, y-coordinates are entered by a multi comboBox
818
    # Remove duplicates before rownames are messed up
819
    if (isScatter(chart.type) && length(x) >= 2 && is.list(x[[2]]))
314✔
820
    {
821
        if (length(x[[2]]) > 1 || NCOL(x[[2]][[1]]) > 1)
71✔
822
            names(x[[2]]) <- NULL
25✔
823
        for (i in seq_along(x[[2]]))
71✔
824
        {
825
            # Replace rownames to preserve rowspans and duplicated labels
826
            y.rnames <- getFullRowNames(x[[2]][[i]])
86✔
827
            if (!is.null(nrow(x[[2]][[i]])))
86✔
828
                rownames(x[[2]][[i]]) <- MakeUniqueNames(y.rnames)
79✔
829
            else
830
                names(x[[2]][[i]]) <- MakeUniqueNames(y.rnames)
7✔
831
        }
832
        # Remap all Y elements to common array and keep attributes
833
        if (!is.null(unlist(lapply(x[[2]], rownames))) && length(x[[2]]) >= 2 && any(reg.outputs))
71✔
834
        {
835
            y.all.rownames <- unique(unlist(lapply(x[[2]], getFullRowNames)))
12✔
836
            base.values <- rep(NA, length(y.all.rownames))
12✔
837
            x[[2]] <- lapply(seq_along(x[[2]]), function(i) {
12✔
838
                vals <- base.values
24✔
839
                indices <- match(names(x[[2]][[i]]), y.all.rownames, nomatch = 0)
24✔
840
                vals[indices] <- x[[2]][[i]]
24✔
841
                names(vals) <- y.all.rownames
24✔
842
                CopyAttributes(vals, x[[2]][[i]])
24✔
843
            })
844
        }
845
        x[[2]] <- data.frame(x[[2]], check.names = FALSE, check.rows = FALSE,
71✔
846
                             fix.empty.names = FALSE, stringsAsFactors = FALSE)
71✔
847
        ind.autonames <- grep("^structure\\(|^c\\(", colnames(x[[2]]), perl = TRUE)
71✔
848
        for (ii in ind.autonames)
71✔
849
        {
850
            tmp.name <- attr(x[[2]][, ii], "name")
29✔
851
            colnames(x[[2]])[ii] <- if (!is.null(tmp.name)) tmp.name else " "
29✔
852
        }
853
    }
854

855
    if (!isScatter(chart.type) && (length(x) == 1 && is.list(x) && (is.matrix(x[[1]]) || !is.atomic(x[[1]]))))
314✔
856
    {
857
        x <- x[[1]]
84✔
858
        if (is.null(rlabels) && !is.atomic(x))
84✔
859
        {
860
            rlabels <- x$labels
84✔
861
            x$labels <- NULL
84✔
862
        }
863
    }
864

865
    # Checking to see if all the elements of x are single variables.
866
    all.variables <- all(sapply(x, NCOL) == 1)
314✔
867
    # Remove entries in the list which are null
868
    if (remove.NULLs)
314✔
869
        x <- Filter(Negate(is.null), x)
314✔
870
    x.rows <- sapply(x, function(m) NROW(as.data.frame(m)))
314✔
871
    k <- length(x.rows)
314✔
872
    extra.cols <- NULL
314✔
873
    if (isScatter(chart.type))
314✔
874
    {
875
        # Trim Y if sizes or color variable is provided
876
        if (NCOL(x$Y) > 1 && (!is.null(x$Z1) || !is.null(x$Z2) || !is.null(x$groups)))
126✔
877
        {
878
            warning("Only the first column of '", scatterDefaultNames(2),
2✔
879
                    "' variables is used'")
2✔
880
            extra.cols <- x$Y[, -1, drop = FALSE]
2✔
881
            x$Y <- x$Y[, 1, drop = FALSE]
2✔
882
        }
883
        for (i in 1:k)
126✔
884
        {
885
            tmp.names <- getFullRowNames(x[[i]])
259✔
886
            if (!is.null(names(x)) && names(x)[i] != "Y" && NCOL(x[[i]]) > 1)
259✔
887
            {
888
                warning("Only the first column of '", scatterDefaultNames(i),
6✔
889
                        "' variables is used")
6✔
890
                x[[i]] <- x[[i]][, 1, drop = FALSE]
6✔
891
            }
892
            if (!is.null(nrow(x[[i]])))
259✔
893
                rownames(x[[i]]) <- tmp.names
182✔
894
        }
895
    }
896

897
    # Extracting variable names
898
    if (isScatter(chart.type))
314✔
899
        nms <- unlist(lapply(1:k, function(i) {
126✔
900
            if (length(dim(x[[i]])) < 2) tidyScatterDefaultNames(names(x)[i])
160✔
901
            else                         colnames(x[[i]]) }))
99✔
902
    else
903
        nms <- if (all.variables) names(x) else unlist(lapply(x, names)) # i.e. 'X', 'Y', 'labels'
188✔
904

905
    # Check for row names to match on
906
    x.all.rownames <- NULL
314✔
907
    removed.rownames <- NULL
314✔
908
    if (isScatter(chart.type) && length(x) > 1)
314✔
909
    {
910
        # Check for row names to match on
911
        x.all.rownames <- getFullRowNames(x[[1]])
116✔
912
        for (i in 2:k)
116✔
913
        {
914
            if (length(x.all.rownames) == 0)
133✔
915
                x.all.rownames <- getFullRowNames(x[[i]])
15✔
916
            else
917
            {
918
                tmp.names <- getFullRowNames(x[[i]])
118✔
919
                if (length(tmp.names) > 0)
118✔
920
                {
921
                    removed.rownames <- unique(c(setdiff(x.all.rownames, tmp.names),
118✔
922
                                                 setdiff(tmp.names, x.all.rownames)))
118✔
923
                    x.all.rownames <- intersect(x.all.rownames, tmp.names)
118✔
924
                }
925

926
            }
927
        }
928

929
        # This is only rearranging the tables into the right order/dimensions
930
        # Note that we don't use MergeTables because this forces tables into the same type
931
        if (length(x.all.rownames) > 0)
116✔
932
        {
933
            for (i in 1:k)
103✔
934
                x[[i]] <- MatchTable(x[[i]], ref.names = x.all.rownames,
218✔
935
                                as.matrix = FALSE, trim.whitespace = FALSE,
218✔
936
                                silent.remove.duplicates = TRUE)
218✔
937

938
            if (!is.null(extra.cols))
103✔
939
                extra.cols <- MatchTable(extra.cols, ref.names = x.all.rownames,
1✔
940
                                as.matrix = FALSE, trim.whitespace = FALSE,
1✔
941
                                silent.remove.duplicates = TRUE)
1✔
942

943
            if (length(x.all.rownames) < max(x.rows))
103✔
944
            {
945
                discarded.rows <- if (length(removed.rownames) == 0) NULL else {
65✔
946
                    paste0(": ", paste0(removed.rownames, collapse = ", "))
64✔
947
                }
948
                if (any(reg.outputs))
65✔
949
                    base.warning <- paste0("Y input coefficients that did not appear in the list of X input ",
63✔
950
                                           "coefficients were discarded")
63✔
951
                else
952
                {
953
                    # Suppress warnings when removed rows are named "NET"
954
                    # This happens often when inputs are BANNERS
955
                    if (length(removed.rownames) > 0)
2✔
956
                        removed.rownames <- removed.rownames[trimws(removed.rownames) != "NET"]
1✔
957
                    base.warning <- "Rows that did not occur in all of the input tables were discarded"
2✔
958
                }
959

960
                if (length(removed.rownames) > 0)
65✔
961
                    warning(base.warning, discarded.rows)
64✔
962
            }
963
            if (length(rlabels) > 0)
103✔
964
                warning("The 'Labels' variable has been ignored. Using row names of ",
×
965
                        "'X-coordinates' and 'Y-coordinates' instead")
×
966
            rlabels <- x.all.rownames
103✔
967
        }
968
    }
969

970
    if (any(reg.outputs) && length(x.all.rownames) == 0 && length(x) > 1)
314✔
971
    {
972
        x.names <- paste0(sQuote(names(x[[1]])), collapse = ", ")
5✔
973
        y.names <- paste0(sQuote(rownames(x[[2]])), collapse = ", ")
5✔
974
        stop("The X coordinate and Y coordinate inputs don't have any variables with matching names. ",
5✔
975
             "Please ensure that there is matching input for both the X and Y coordinate input. ",
5✔
976
             "The X coordinate input has names: ", x.names, ". ",
5✔
977
             "The Y coordinate input has names: ", y.names, ".")
5✔
978
    }
979

980
    num.obs <- sapply(x, NROW)
309✔
981
    if (isScatter(chart.type) && is.null(x.all.rownames) &&
309✔
982
        length(unique(num.obs[num.obs > 0])) > 1)
309✔
983
    {
984
        # If data is aggregated (e.g. the mean of each variable) then
985
        # the length can differ
986
        names(num.obs) <- sapply(names(num.obs), tidyScatterDefaultNames)
1✔
987
        ind.diff <- which(num.obs > 0 & num.obs != num.obs[1])
1✔
988
        stop("Variables for '", paste(names(num.obs)[ind.diff], collapse = "', '"),
1✔
989
            "' differ in length from variables for '", names(num.obs)[1], "'. ",
1✔
990
            "Check that all variables are from the same data set.")
1✔
991
    }
992

993
    # Splicing together elements of the input list if lengths vary
994
    # Note that elements of x can contain lists of variables
995
    invalid.joining <- FALSE
308✔
996
    if (!isScatter(chart.type) && (NCOL(x) > 1 || is.list(x) && length(x) > 1))
308✔
997
    {
998
        if (invalid.joining <- sd(x.rows) != 0)
145✔
999
        {
1000
            k <- length(x.rows)
12✔
1001
            out <- matrix(NA, max(x.rows), k)
12✔
1002
            for (i in 1:k)
12✔
1003
                out[1:x.rows[i], i] <- x[[i]]
40✔
1004
            x <- out
12✔
1005
        }
1006
    }
1007
    x <- data.frame(x, stringsAsFactors = FALSE, check.names = FALSE)
308✔
1008
    names(x) <- MakeUniqueNames(nms)
308✔
1009
    if (!is.null(extra.cols))
308✔
1010
        x <- data.frame(x, extra.cols, stringsAsFactors = FALSE, check.names = FALSE)
2✔
1011

1012
    # Set rownames
1013
    if (!is.null(rlabels) && nrow(x) == length(rlabels))
308✔
1014
         rownames(x) <- MakeUniqueNames(as.character(rlabels))
104✔
1015
    if (invalid.joining)
308✔
1016
        attr(x, "InvalidVariableJoining") <- TRUE
12✔
1017
    return(x)
308✔
1018
}
1019

1020

1021
isDistribution <- function(chart.type)
1022
{
1023
    grepl("Bean|Box|Histogram|Density|Violin", chart.type)
2,505✔
1024
}
1025

1026
#' @importFrom flipStatistics ExtractChartData
1027
#' @importFrom verbs FlattenTableAndDropStatisticsIfNecessary
1028
processInputData <- function(x, subset, weights)
1029
{
1030
    if (is.null(x))
626✔
1031
        return(x)
449✔
1032

1033
    if (length(subset) > 1)
177✔
1034
    {
1035
        msg <- paste("Filters have been applied to this visualization. They have been ignored.",
×
1036
                "To apply filters you need to instead filter the source data that is being visualized.")
×
1037
        tb.desc <- attr(x, "basedescription")
×
1038
        if (is.null(tb.desc) || tb.desc$FilteredProportion == 0)
×
1039
            warning(msg)
×
1040
        else if ((mean(subset) * 100) + tb.desc$FilteredProportion != 100)
×
1041
            warning(msg)
×
1042
    }
1043
    if (length(weights) > 0)
177✔
1044
    {
1045
        msg <- paste("Weights have been applied to this visualization. They have been ignored.",
×
1046
                "To apply weights you need to instead weight the source data that is being visualized.")
×
1047
        if (is.null(attr(x, "basedescription")) || is.null(attr(x, "weight.name")))
×
1048
            warning(msg)
×
1049
        else if (!isTRUE(attr(weights, "name") == attr(x, "weight.name")))
×
1050
            warning(msg)
×
1051
    }
1052

1053
    # Simplify input if only a single table has been specified
1054
    if ("list" %in% class(x) && is.list(x) && !is.data.frame(x))
177✔
1055
    {
1056
        if (length(x) == 1)
17✔
1057
            x <- x[[1]]
12✔
1058
    }
1059

1060
    # Try to use S3 method to extract data
1061
    x <- ExtractChartData(x)
177✔
1062

1063
    # Flatten tables with spans or grid questions
1064
    has.mult.stats <- is.null(attr(x, "statistic")) && !is.null(attr(x, "questiontypes"))
177✔
1065
    ndim <- length(dim(x)) - has.mult.stats
177✔
1066
    if (ndim >= 2)
177✔
1067
    {
1068
        if (has.mult.stats)
127✔
1069
            x <- flattenMultiStatTable(x)
11✔
1070
        else
1071
            x <- FlattenTableAndDropStatisticsIfNecessary(x)
116✔
1072
    }
1073

1074
    if (hasUserSuppliedRownames(x))
177✔
1075
        attr(x, "assigned.rownames") <- TRUE
122✔
1076

1077
    return(x)
177✔
1078
}
1079

1080
# Function is only called when we know it is a QTable with questiontype attributes and multiple stats
1081
#' @importFrom stats ftable
1082
flattenMultiStatTable <- function(x)
1083
{
1084
    # Set dimnames of flattened table using function in verbs package
1085
    # This will handle row/column spans from banners
1086
    x0 <- suppressWarnings(FlattenTableAndDropStatisticsIfNecessary(x))
11✔
1087
    n.dims <- length(dim(x))
11✔
1088
    if (n.dims < 4)
11✔
1089
    {
1090
        rownames(x) <- rownames(x0)
4✔
1091
        colnames(x) <- colnames(x0)
4✔
1092
        return(x)
4✔
1093
    }
1094

1095
    stat.names <- dimnames(x)[[n.dims]]
7✔
1096
    new.dnames <- dimnames(x0)
7✔
1097
    new.dnames[[length(new.dnames) + 1]] <- stat.names
7✔
1098
    new.x <- array(NA, dim = c(dim(x0), length(stat.names)), dimnames = new.dnames)
7✔
1099

1100
    qtypes <- attr(x, "questiontypes")
7✔
1101
    for (i in 1:length(stat.names))
7✔
1102
    {
1103
        if (n.dims == 4){
14✔
1104
            ## Multi is in rows, combine 2nd and 3rd dimensions of table
1105
            if (qtypes[1] %in% c("PickOneMulti", "PickAnyGrid", "NumberGrid"))
14✔
1106
                new.x[,,i] <- ftable(x[,,,i], row.vars = 1, col.vars = 2:3)
×
1107
            else
1108
                new.x[,,i] <- ftable(x[,,,i], row.vars = 2:1, col.vars = 3)
14✔
1109
        } else if (n.dims == 5)  # e.g. Nominal - Multi by Binary - Grid
×
1110
            new.x[,,i] <- ftable(x[,,,,i], row.vars = c(1, 3), col.vars = c(2, 4))
×
1111
    }
1112
    return(CopyAttributes(new.x, x))
7✔
1113
}
1114

1115
processPastedData <- function(input.data.pasted, warn, date.format, subset, weights)
1116
{
1117
    if (length(subset) > 1)
62✔
1118
        warning("Filters have been applied to this visualization. They have been ignored. ",
×
1119
            "To apply filters you need to instead filter the source data that is being visualized.")
×
1120
    if (length(weights) > 0)
62✔
1121
        warning("Weights have been applied to this visualization. They have been ignored. ",
×
1122
            "To apply weights you need to instead weight the source data that is being visualized.")
×
1123

1124
    us.format <- switch(date.format, US = TRUE, International = FALSE, Automatic = NULL, "No date formatting")
62✔
1125
    want.data.frame <- length(input.data.pasted) > 1L && isTRUE(input.data.pasted[[2]])
62✔
1126
    processed <- tryCatch(ParseUserEnteredTable(input.data.pasted[[1]],
62✔
1127
                                  want.data.frame = want.data.frame,
62✔
1128
                                  want.factors = FALSE, #input.data.pasted[[2]], #charts has no concept of factors
62✔
1129
                                  want.col.names = input.data.pasted[[3]],
62✔
1130
                                  want.row.names = input.data.pasted[[4]],
62✔
1131
                                  us.format = us.format,
62✔
1132
                                  warn = warn),
62✔
1133
             error = function(e) {input.data.pasted[[1]]})
×
1134
    if (!is.null(processed) && length(input.data.pasted) > 3)
62✔
1135
        attr(processed, "assigned.rownames") <- input.data.pasted[[4]]
54✔
1136
    if (!is.null(processed) && want.data.frame)
62✔
1137
        attr(processed, "assigned.rownames") <- TRUE
23✔
1138
    if (!is.null(attr(processed, "row.column.names")))
62✔
1139
        names(dimnames(processed)) <- attr(processed, "row.column.names")
2✔
1140
    return(processed)
62✔
1141
}
1142

1143
#' @importFrom verbs Sum
1144
checkNumberOfDataInputs <- function(data.source.index, table, tables, raw, pasted, other)
1145
{
1146
    data.provided <- !vapply(list(table, tables, raw, pasted, other), is.null, logical(1L))
634✔
1147
    n.data <- sum(data.provided)
634✔
1148
    if (n.data == 0)
634✔
1149
        stop("No data has been provided.")
1✔
1150
    else if (is.null(data.source.index))
633✔
1151
    {
1152
        if (n.data > 1)
605✔
1153
            stop("There are ", n.data, " data inputs. One and only one data argument may be supplied.")
×
1154

1155
    } else if (!data.provided[data.source.index])
28✔
1156
        stop("The data provided does not match the 'data.source.index'.")
7✔
1157
}
1158

1159
# For error messages, etc
1160
scatterDefaultNames <- function(i)
1161
{
1162
    return(switch(i,
8✔
1163
         "X",
8✔
1164
         "Y",
8✔
1165
         "Sizes",
8✔
1166
         "Colors",
8✔
1167
         "Groups"))
8✔
1168
}
1169

1170
tidyScatterDefaultNames <- function(x)
1171
{
1172
    return(switch(x,
162✔
1173
        X = "X coordinates",
162✔
1174
        Y = "Y coordinates",
162✔
1175
        Z1 = "Sizes",
162✔
1176
        Z2 = "Colors"))
162✔
1177
}
1178

1179
rmScatterDefaultNames <- function(data)
1180
{
1181
    # Remove default names so they are not shown in the axis
1182
    if (is.data.frame(data) && !is.null(colnames(data)))
122✔
1183
    {
1184
        if (colnames(data)[1] == "X coordinates")
122✔
1185
            colnames(data)[1] <- " "
89✔
1186
        if (NCOL(data) >= 2 && colnames(data)[2] == "Y coordinates")
122✔
1187
            colnames(data)[2] <- "  "
42✔
1188
    }
1189
    return(data)
122✔
1190
}
1191

1192

1193
scatterVariableIndices <- function(input.data.raw, data, show.labels)
1194
{
1195
    # Use ExtractChartData to convert any raw Regression input
1196
    if (any(reg.outputs <- checkRegressionOutput(input.data.raw)))
155✔
1197
    {
1198
        if(reg.outputs[1])
72✔
1199
            input.data.raw[[1]] <- extractRegressionScatterData(input.data.raw[[1]])
43✔
1200
        if(reg.outputs[2])
72✔
1201
            input.data.raw[[2]] <- lapply(input.data.raw[[2]], extractRegressionScatterData, y.axis = TRUE)
29✔
1202
    }
1203

1204
    # Creating indices in situations where the user has provided a table.
1205
    len <- length(input.data.raw)
155✔
1206
    indices <- c(x = 1,
155✔
1207
                 y = 2,
155✔
1208
                 sizes = if (NCOL(data) >= 3) 3 else NA,
155✔
1209
                 colors = if (NCOL(data) >= 4) 4 else NA,
155✔
1210
                 groups = NCOL(data))
155✔
1211
    if (is.null(input.data.raw) || is.data.frame(input.data.raw) || is.list(input.data.raw) && len == 1)
155✔
1212
        return(indices)
53✔
1213

1214
    .getColumnIndex <- function(i)
102✔
1215
    {
1216
        if (i > len)
510✔
1217
            return(NA)
227✔
1218
        if (raw.is.null[i])
283✔
1219
            return(NA)
74✔
1220
        ind <- cumsum(!raw.is.null)[i]
209✔
1221
        lst <- input.data.raw[[i]]
209✔
1222
        if (is.null(lst))
209✔
1223
            return(NA)
×
1224
        nms <- names(data)
209✔
1225

1226
        # If inputs are variables, match on label/variable name to avoid problems with duplicates
1227
        # This should not be applied on tables which do not necessarily have unique names
1228
        if (!is.null(attr(lst, "label")) && is.null(attr(lst, "questions")))
209✔
1229
        {
1230
            nm <- if (show.labels) Labels(lst) else Names(lst)
16✔
1231
            if (is.null(nm) || length(nm) != 1)
16✔
1232
                return(ind)
×
1233
            pos <- match(nm, nms)
16✔
1234
            if (!is.na(pos))
16✔
1235
                return(pos)
15✔
1236
        }
1237
        return(ind)
194✔
1238
    }
1239

1240

1241
    # Indices corresponding to selections in input.raw.data
1242
    raw.is.null <- sapply(input.data.raw, is.null)
102✔
1243
    indices["x"] <- .getColumnIndex(1)
102✔
1244
    indices["y"] <- .getColumnIndex(2)
102✔
1245
    indices["sizes"] <- .getColumnIndex(3)
102✔
1246
    indices["colors"] <- .getColumnIndex(4)
102✔
1247
    indices["groups"] <- .getColumnIndex(5)
102✔
1248
    indices
102✔
1249
}
1250

1251
checkForNegPercent <- function(x)
1252
{
1253
    ind.negative <- which(x < 0)
39✔
1254
    if (length(ind.negative) > 0)
39✔
1255
    {
1256
        warning("Percentages calculated ignoring negative values.")
×
1257
        x[ind.negative] <- 0
×
1258
    }
1259
    return(x)
39✔
1260
}
1261

1262

1263
asPercentages <- function(data)
1264
{
1265
    if (length(dim(data)) == 2 && is.null(attr(data, "statistic")) &&
39✔
1266
        length(attr(data, "questions")) == 2 && attr(data, "questions")[2] == "SUMMARY")
39✔
1267
    {
1268
        # 1-dimensional table with multiple statistics
1269
        data[,1] <- checkForNegPercent(data[,1])
1✔
1270
        data[,1] <- prop.table(data[,1]) * 100
1✔
1271
    }
1272
    else if (length(dim(data)) > 2)
38✔
1273
    {
1274
        # 2-dimensional table with statistics
1275
        data[,,1] <- checkForNegPercent(data[,,1])
4✔
1276
        if (NCOL(data) == 1)
4✔
1277
            data[,,1] <- suppressWarnings(prop.table(data[,,1])) * 100
1✔
1278
        else
1279
            data[,,1] <- prop.table(suppressWarnings(TidyTabularData(data)), 1) * 100
3✔
1280
        dimnames(data)[[3]][1] <- "%"
4✔
1281
    }
1282
    else if (NCOL(data) > 1)
34✔
1283
    {
1284
        # 2-dimensional table without statistics
1285
        data <- checkForNegPercent(data)
11✔
1286
        data <- prop.table(data, 1) * 100
11✔
1287
        attr(data, "statistic") <- "Row %"
11✔
1288
    }
1289
    else
1290
    {
1291
        # 1-dimensional table without statistics
1292
        data <- checkForNegPercent(data)
23✔
1293
        data <- prop.table(data) * 100
23✔
1294
        attr(data, "statistic") <- "%"
23✔
1295
    }
1296
    data
39✔
1297
}
1298

1299
RearrangeRowsColumns <- function(data,
1300
                                 multiple.tables,
1301
                                 select.rows, first.k.rows, last.k.rows,
1302
                                 select.columns, first.k.columns, last.k.columns,
1303
                                 row.names.to.remove, column.names.to.remove, split,
1304
                                 auto.order.rows, auto.order.columns,
1305
                                 sort.rows, sort.rows.decreasing, sort.rows.column,
1306
                                 sort.rows.exclude, reverse.rows,
1307
                                 sort.columns, sort.columns.decreasing, sort.columns.row,
1308
                                 sort.columns.exclude, reverse.columns)
1309
{
1310
    if (multiple.tables)
639✔
1311
    {
1312
        for(i in seq_along(data))
9✔
1313
            data[[i]] = RearrangeRowsColumns(data[[i]], FALSE,
22✔
1314
                                 select.rows, first.k.rows, last.k.rows,
22✔
1315
                                 select.columns, first.k.columns, last.k.columns,
22✔
1316
                                 row.names.to.remove, column.names.to.remove, split,
22✔
1317
                                 auto.order.rows, auto.order.columns,
22✔
1318
                                 sort.rows, sort.rows.decreasing, sort.rows.column,
22✔
1319
                                 sort.rows.exclude, reverse.rows,
22✔
1320
                                 sort.columns, sort.columns.decreasing, sort.columns.row,
22✔
1321
                                 sort.columns.exclude, reverse.columns)
22✔
1322
        return(data)
9✔
1323
    }
1324

1325
    # Select first so that sorting only occurs in rows/columns of interest
1326
    data <- SelectRows(data, select = select.rows)
630✔
1327
    data <- SelectColumns(data, select = select.columns)
630✔
1328

1329
    if (auto.order.rows)
630✔
1330
    {
1331
        data <- try(AutoOrderRows(data))
1✔
1332
        if (inherits(data, "try-error"))
1✔
1333
            stop("Could not perform correspondence analysis on table. Try hiding empty rows.")
×
1334
    }
1335
    else if (sort.rows)
629✔
1336
        data <- SortRows(data, sort.rows.decreasing, sort.rows.column, sort.rows.exclude)
9✔
1337
    if (reverse.rows)
630✔
1338
        data <- ReverseRows(data)
2✔
1339

1340
    if (auto.order.columns)
630✔
1341
    {
1342
        data <- try(AutoOrderColumns(data))
×
1343
        if (inherits(data, "try-error"))
×
1344
            stop("Could not perform correspondence analysis on table. Try hiding empty columns.")
×
1345
    }
1346
    else if (sort.columns)
630✔
1347
        data <- SortColumns(data, sort.columns.decreasing, sort.columns.row, sort.columns.exclude)
3✔
1348
    if (reverse.columns)
630✔
1349
        data <- ReverseColumns(data)
1✔
1350

1351
    # Keep hidden rows/columns until after sorting
1352
    # Sort is often performed on the NET values
1353
    data <- RemoveRowsAndOrColumns(data,
630✔
1354
                row.names.to.remove = row.names.to.remove,
630✔
1355
                column.names.to.remove = column.names.to.remove, split = split)
630✔
1356

1357
    # Keep last to retain order from sorting
1358
    data <- SelectRows(data, first.k = first.k.rows, last.k = last.k.rows)
629✔
1359
    data <- SelectColumns(data, first.k = first.k.columns, last.k = last.k.columns)
629✔
1360

1361
}
1362

1363
#' @importFrom flipTables RemoveRowsAndOrColumns HideEmptyRows HideEmptyColumns
1364
#' @importFrom flipTime AsDate AsDateTime IsDateTime
1365
#' @importFrom flipU CopyAttributes
1366
#' @importFrom verbs Sum
1367
transformTable <- function(data,
1368
                           chart.type,
1369
                           multiple.tables,
1370
                           tidy,
1371
                           drop,
1372
                           is.raw.data,
1373
                           hide.output.threshold,
1374
                           hide.values.threshold,
1375
                           hide.rows.threshold, hide.columns.threshold,
1376
                           transpose,
1377
                           first.aggregate,
1378
                           hide.empty.rows, hide.empty.columns,
1379
                           date.format,
1380
                           table.counter = 1)
1381
{
1382
    if (multiple.tables)
639✔
1383
    {
1384
        for (i in seq_along(data))
9✔
1385
            data[[i]] = transformTable(data[[i]],
22✔
1386
                                       chart.type,
22✔
1387
                                       FALSE,
22✔
1388
                                       FALSE,
22✔
1389
                                       FALSE,
22✔
1390
                                       is.raw.data,
22✔
1391
                                       0, 0, 0, 0, # sample size not used
22✔
1392
                                       transpose,
22✔
1393
                                       first.aggregate,
22✔
1394
                                       hide.empty.rows, hide.empty.columns,
22✔
1395
                                       date.format,
22✔
1396
                                       i)
22✔
1397
        return(data)
9✔
1398
    }
1399

1400
    if (hide.empty.rows)
630✔
1401
        data <- if (isListOrRaggedArray(data)) lapply(data, HideEmptyRows)
620✔
1402
                else HideEmptyRows(data)
620✔
1403

1404
    if (hide.empty.columns)
630✔
1405
    {
1406
        if (isScatter(chart.type))
618✔
1407
            old.names <- colnames(data)
187✔
1408
        data <- if (isListOrRaggedArray(data)) lapply(data, HideEmptyColumns)
618✔
1409
                else HideEmptyColumns(data)
618✔
1410
    }
1411

1412
    # Switching rows and columns
1413
    # This is the first operation performed to ensure that both
1414
    # hide.rows.threshold and row.names.to.remove refer to rows AFTER tranposing
1415
    if (isTRUE(transpose))
630✔
1416
    {
1417
        if (length(dim(data)) > 2)
10✔
1418
            new.data <- aperm(data, c(2, 1, 3))
5✔
1419
        else
1420
            new.data <- t(data)
5✔
1421
        data <- CopyAttributes(new.data, data)
10✔
1422
        attr(data, "questions") <- rev(attr(data, "questions"))
10✔
1423
    }
1424

1425
    # Checking sample sizes (if available)
1426
    # This needs to happen after row/columns have been (de)selected
1427
    if (any(as.integer(hide.output.threshold), na.rm = TRUE))
630✔
1428
        data <- HideOutputsWithSmallSampleSizes(data, hide.output.threshold)
×
1429
    if (any(as.integer(hide.values.threshold), na.rm = TRUE))
630✔
1430
        data <- HideValuesWithSmallSampleSizes(data, hide.values.threshold)
×
1431
    if (any(as.integer(hide.rows.threshold), na.rm = TRUE))
630✔
1432
        data <- HideRowsWithSmallSampleSizes(data, hide.rows.threshold)
1✔
1433
    if (any(as.integer(hide.columns.threshold), na.rm = TRUE))
630✔
1434
        data <- HideColumnsWithSmallSampleSizes(data, hide.columns.threshold)
×
1435

1436
    # Set axis names before dropping dimensions (but AFTER transpose)
1437
    data <- setAxisTitles(data, chart.type, drop)
630✔
1438
    if (chart.type == "Scatter" && is.null(dim(data)))
630✔
1439
    {
1440
        tmp.names <- names(data)
21✔
1441
        dim(data) <- c(length(data), 1)
21✔
1442
        if (!is.null(tmp.names))
21✔
1443
            rownames(data) <- tmp.names
21✔
1444
    }
1445

1446
    # Convert to matrix to avoid state names from being turned into numeric values
1447
    # when TidyTabularData is called
1448
    if (gsub(" ", "", chart.type) == "GeographicMap" && is.data.frame(data))
630✔
1449
        data <- CopyAttributes(as.matrix(data), data)
1✔
1450

1451
    # This must happen after sample sizes have been used
1452
    # (only first statistic is retained after tidying)
1453
    if (tidy && !chart.type %in% c("Venn", "Sankey", "Heat") &&
630✔
1454
        !isScatter(chart.type) && !isDistribution(chart.type))
630✔
1455
            data <- tryCatch(TidyTabularData(data), error = function(e) { data })
2✔
1456

1457

1458
    if (!grepl("^No date", date.format) && date.format != "Automatic")
630✔
1459
    {
1460
        input.us.format <- !grepl("International", date.format)
7✔
1461
        output.format.str <- if (!grepl("International", date.format)) "%b %d %Y" else "%d %b %Y"
7✔
1462
        if (!is.null(rownames(data)) && IsDateTime(rownames(data)))
7✔
1463
        {
1464
            tmp.dates <- try(suppressWarnings(AsDate(rownames(data), us.format = input.us.format)), silent = TRUE)
3✔
1465
            if (inherits(tmp.dates, "try-error"))
3✔
1466
                tmp.dates <- suppressWarnings(AsDate(rownames(data)))
×
1467
            rownames(data) <- format(tmp.dates, output.format.str)
3✔
1468
        }
1469
        else if (IsDateTime(names(data)))
4✔
1470
        {
1471
            tmp.dates <- try(suppressWarnings(AsDate(names(data), us.format = input.us.format)), silent = TRUE)
2✔
1472
            if (inherits(tmp.dates, "try-error"))
2✔
1473
                tmp.dates <- suppressWarnings(AsDate(names(data)))
×
1474
            names(data) <- format(tmp.dates, output.format.str)
2✔
1475
        }
1476
    }
1477
    return(data)
630✔
1478
}
1479

1480
convertPercentages <- function(data, as.percentages, hide.percent.symbol, chart.type,
1481
                               multiple.tables, table.counter = 1)
1482
{
1483
    if (multiple.tables)
638✔
1484
    {
1485
        for (i in seq_along(data))
9✔
1486
            data[[i]] <- convertPercentages(data[[i]], as.percentages, hide.percent.symbol,
22✔
1487
                            chart.type, FALSE, i)
22✔
1488
        return(data)
9✔
1489
    }
1490

1491
    ### If data is already percentages in Qtable then divide by 100
1492
    ### Note that R outputs and pasted data will already be in decimals
1493
    #stat <- attr(data, "statistic")
1494
    #qst <- attr(data, "questions")
1495
    #if (!is.null(stat) && !is.null(qst) && grepl("%)?$", stat))
1496
    #    data <- data / 100
1497

1498
    # Convert to percentages - this must happen AFTER transpose and RemoveRowsAndOrColumns
1499
    if (as.percentages && chart.type != "Venn")
629✔
1500
    {
1501
        percentages.warning <- paste0("The data has not been converted to percentages/proportions. ",
43✔
1502
        "To convert to percentages, first convert to a more suitable type (e.g., create a table).")
43✔
1503
        if (!is.numeric(data) && !is.data.frame(data) &&
43✔
1504
            (is.null(attr(data, "questions")) || chart.type %in% c("Pie", "Donut", "Heat")))
43✔
1505
            warning(percentages.warning)
×
1506
        else if (chart.type %in% c("Pie", "Donut"))
43✔
1507
        {
1508
            data <- data / Sum(data) * 100
3✔
1509
            attr(data, "statistic") <- "%"
3✔
1510
        }
1511
        else if (chart.type == "Heat" && isTRUE(grepl("%$", attr(data, "statistic"))))
40✔
1512
            data <- data
1✔
1513
        else
1514
            data <- asPercentages(data) # converts character QTables to numeric
39✔
1515

1516
        if (isTRUE(attr(data, "values.title") == "n") || isTRUE(attr(data, "values.title") == "Count"))
43✔
1517
            attr(data, "values.title") <- "%"
7✔
1518
    }
1519

1520
    if (hide.percent.symbol)
629✔
1521
    {
1522
        if (isTRUE(grepl("%", attr(data, "statistic"))))
2✔
1523
            attr(data, "statistic") <- "Percent"
2✔
1524
        else if (!is.null(attr(data, "questions")) && !is.null(attr(data, "name")) &&
×
1525
                  is.null(attr(data, "statistic")))
×
1526
        {
1527
            dlen <- length(dim(data))
×
1528
            primary.stat <- dimnames(data)[[dlen]][1]
×
1529
            if (grepl("%", primary.stat))
×
1530
                dimnames(data)[[dlen]][1] <- gsub("%", "Percent", primary.stat)
×
1531
        }
1532
    }
1533
    return(data)
629✔
1534
}
1535

1536
#' @importFrom flipTables TidyTabularData
1537
#' @importFrom flipTransformations AsNumeric
1538
#' @importFrom flipU MakeUniqueNames
1539
#' @importFrom verbs SumEachRow
1540
prepareForSpecificCharts <- function(data,
1541
                                     multiple.tables,
1542
                                     input.data.raw,
1543
                                     chart.type,
1544
                                     weights,
1545
                                     show.labels,
1546
                                     scatter.mult.yvals)
1547
{
1548
    if (!isDistribution(chart.type) && chart.type != "Table" && !is.null(input.data.raw) &&
618✔
1549
        is.list(input.data.raw$X) && length(input.data.raw$X) > 10 && !inherits(input.data.raw$X, "Regression"))
618✔
1550
        warning("With a large number of variables, it may be better to first create ",
×
1551
                 "a table and then create a visualization using the table.")
×
1552

1553
    # Multiple tables
1554
    if (multiple.tables)
618✔
1555
    {
1556
        data <- lapply(data, TidyTabularData)
9✔
1557
        # flipStandardCharts::Scatterplot takes an array input, with column numbers indicating how to plot.
1558
        if (isScatter(chart.type))
9✔
1559
            attr(data, "scatter.variable.indices") <- c(x = 1, y = 2, sizes = 3, colors = 4)
7✔
1560
    }
1561
    else if (chart.type == "Table" || chart.type == "Heat")
609✔
1562
    {
1563
        # Do nothing
1564
    }
1565
    else if (chart.type == "Venn")
578✔
1566
    {
1567
        missing.data.rows <- SumEachRow(as.matrix(is.na(data))) > 0
14✔
1568
        if (any(missing.data.rows))
14✔
1569
        {
1570
            data <- data[!missing.data.rows, ]
×
1571
            warning(Sum(missing.data.rows), " case(s) with missing data have been removed.")
×
1572
        }
1573
    }
1574
    else if (chart.type == "Sankey")
564✔
1575
    {
1576
        data <- coerceToDataFrame(data)
×
1577
    }
1578
    # Scatterplots
1579
    else if (isScatter(chart.type))
564✔
1580
    {
1581
        .isQTableWithMultStatistic <- function(x)
173✔
1582
        {
1583
            !is.null(attr(x, "questions")) && !is.null(attr(x, "name")) && is.null(attr(x, "statistic"))
12✔
1584
        }
1585

1586

1587
        if (isTRUE(scatter.mult.yvals) ||
173✔
1588
            (is.list(input.data.raw$Y) && length(input.data.raw$Y) > 1))
173✔
1589
        {
1590
            # Tag data for reformatting but this is preformed later after
1591
            # Row/column manipulations
1592
            attr(data, "scatter.mult.yvals") <- TRUE
24✔
1593

1594
        } else if (NCOL(input.data.raw$Y[[1]]) > 1 && is.null(input.data.raw$Z1) &&
149✔
1595
            is.null(input.data.raw$Z2) && is.null(input.data.raw$groups))
149✔
1596
        {
1597
            if (!(.isQTableWithMultStatistic(input.data.raw$Y[[1]]) &&
6✔
1598
                  length(dim(input.data.raw$Y[[1]])) < 3))
6✔
1599
                attr(data, "scatter.mult.yvals") <- TRUE
4✔
1600

1601
            if (.isQTableWithMultStatistic(input.data.raw$Y[[1]]))
6✔
1602
            {
1603
                if (length(dim(input.data.raw$Y[[1]])) < 3)
3✔
1604
                    attr(data, "ycol") <- 1
2✔
1605
                else
1606
                    attr(data, "ycol") <- NCOL(input.data.raw$Y[[1]])
1✔
1607
            }
1608

1609
        } else
1610
        {
1611
            if (!is.data.frame(data) && !is.matrix(data))
143✔
1612
                data <- TidyTabularData(data)
24✔
1613

1614
            # Removing duplicate columns
1615
            if (length(dim(data)) == 2 && any(d <- duplicated(names(data))))
143✔
1616
                data <- data[, !d]
×
1617

1618
            # flipStandardCharts::Scatterplot takes an array input, with column numbers indicating how to plot.
1619
            if (is.null(attr(data, "scatter.variable.indices")))
143✔
1620
                attr(data, "scatter.variable.indices") <- scatterVariableIndices(input.data.raw, data, show.labels)
143✔
1621
        }
1622
    }
1623
    # Charts that plot the distribution of raw data (e.g., histograms)
1624
    else if (isDistribution(chart.type))
391✔
1625
    {
1626
        # input.data.raw could be NULL and the result below be a logical of zero length.
1627
        if (is.null(input.data.raw))
121✔
1628
            input.data.raw <- list(NULL)
66✔
1629
        len <- Sum(!vapply(input.data.raw, is.null, FALSE))
121✔
1630
        if (len > 1L)  # variables from multiple GUI controls
121✔
1631
        {
1632
            if (NCOL(input.data.raw[[1]]) > 1 && (NCOL(input.data.raw[[2]]) == 1 || len > 2))
12✔
1633
                stop("If using a grouping variable, you may only have one additional variable.")
×
1634
            # Splitting the first variable by the second
1635
            else if (#!is.null(input.data.raw[[2]]) &&
1636
                NCOL(input.data.raw[[1]]) == 1 && NCOL(input.data.raw[[2]]) == 1)
12✔
1637
            {
1638
                if (!is.null(weights))
12✔
1639
                    weights <- SplitVectorToList(weights, data[[2]])
5✔
1640
                data <- SplitVectorToList(data[[1]], data[[2]])
12✔
1641
                attr(data, "weights") <- weights
12✔
1642
            }
1643
        }
1644
        else # Coercing data to numeric format, if required
1645
            data <- AsNumeric(data, binary = FALSE)
109✔
1646
        #if (!is.list(data))
1647
        #    data <- list(data)
1648
    }
1649
    else
1650
    {
1651
        # Set rownames before TidyTabularData so that factor are not converted to numeric
1652
        tmp.stat <- attr(data, "statistic")
270✔
1653
        data <- useFirstColumnAsLabel(data,
270✔
1654
            allow.numeric.rownames = chart.type %in% c("Area", "Bar", "Column", "Line", "Stream"))
270✔
1655
        attr(data, "statistic") <- tmp.stat
270✔
1656
    }
1657
    data
618✔
1658
}
1659

1660

1661
setWeight <- function(x, weights)
1662
{
1663
    if (!is.null(w <-  attr(x, "weights")))
941✔
1664
        return(w)
39✔
1665
    weights
902✔
1666
}
1667

1668
#' Check for object of class list or a \emph{ragged} array
1669
#' @noRd
1670
isListOrRaggedArray <- function(x)
1,238✔
1671
    inherits(x, "list") || (inherits(x, "array") && !all(vapply(x, length, 1L) == 1))
1,238✔
1672

1673

1674
#' @noRd
1675
useFirstColumnAsLabel <- function(x, remove.duplicates = TRUE,
1676
    allow.numeric.rownames = TRUE, allow.duplicate.rownames = TRUE)
1677
{
1678
    if (length(dim(x)) != 2 || ncol(x) == 1)
270✔
1679
        return(x)
138✔
1680
    if (NROW(x) == 1) # single row input
132✔
1681
        return(x)
5✔
1682
    if (hasUserSuppliedRownames(x))
127✔
1683
        return(x)
86✔
1684

1685
    if (!allow.numeric.rownames && is.numeric(x[,1]))
41✔
1686
        return(x)
×
1687

1688
    # Catch Q Tables which have numeric row names but are
1689
    # not raw data tables. It is not appropriate to use
1690
    # the first column as a label in this case because
1691
    # it contains a statistic.
1692
    if (allow.numeric.rownames
41✔
1693
        && IsQTable(x)
41✔
1694
        && !isRawDataQTable(x))
41✔
1695
        return(x)
2✔
1696

1697
    # What to do with duplicate rownames?
1698
    ind.dup <- duplicated(x[,1])
39✔
1699

1700
    # Duplicated numeric vectors are most likely data variables, not rownames
1701
    if (any(ind.dup) && is.numeric(x[,1]))
39✔
1702
        return(x)
14✔
1703

1704
    # For duplicated character vectors, we remove duplicates
1705
    if (any(ind.dup))
25✔
1706
    {
1707
        if (!allow.duplicate.rownames) # scatterplot
8✔
1708
        {
1709
            warning("First column was not used as labels ",
×
1710
                    "because it contains duplicated values: ",
×
1711
                    paste(unique(x[ind.dup,1]), collapse=", "))
×
1712
            return(x)
×
1713
        }
1714

1715
        # If too many duplicates, then assume it is not expected to be a rowname
1716
        # The exception is when the rownames are QDates
1717
        is.date <- is.factor(x[,1]) &&
8✔
1718
            all(!is.na(suppressWarnings(AsDate(levels(x[,1]), on.parse.failure = "silent"))))
8✔
1719
        if ((!is.date) && mean(ind.dup, na.rm = T) > 0.9) # too many duplicates
8✔
1720
            return(x)
8✔
1721
        wmsg <- if (IsDateTime(x[,1])) ". Check aggregation level of date variable '"
×
1722
                else                   ". Consider aggregating on '"
×
1723

1724
        warning("Duplicated entries in '", colnames(x)[1], "': ",
×
1725
            paste(unique(x[ind.dup,1]), collapse = ", "),
×
1726
            wmsg, colnames(x)[1], "'.")
×
1727
        if (remove.duplicates)
×
1728
        {
1729
            warning("Only the first unique entry is shown.")
×
1730
            x <- x[!ind.dup, ]
×
1731
        }
1732
        else
1733
            return(x)
×
1734
    }
1735
    if (inherits(x[,1], 'Date') || inherits(x[,1], 'POSIXct') ||
17✔
1736
        inherits(x[,1], 'POSIXlt') || inherits(x[,1], 'POSIXt'))
17✔
1737
        r.tmp <- format(x[,1], "%b %d %Y")
3✔
1738
    else if (is.factor(x[,1])) # QDates are also factors
14✔
1739
        r.tmp <- make.unique(as.character(x[,1]))
×
1740
    else
1741
        r.tmp <- make.unique(as.character(x[,1]))
14✔
1742

1743
    is.missing <- is.na(r.tmp)
17✔
1744
    if (any(is.missing))
17✔
1745
        warning("Rows ", paste(which(is.missing), collapse = ","),
×
1746
                " have been omitted because of missing values.")
×
1747
    ind <- which(!is.missing)
17✔
1748

1749
    c.title <- colnames(x)[1]
17✔
1750
    c2.title <- if (NCOL(x) == 2) colnames(x)[2]
17✔
1751
    x <- x[ind, -1, drop = FALSE]
17✔
1752
    rownames(x) <- r.tmp[ind]
17✔
1753
    attr(x, "categories.title") <- c.title
17✔
1754
    if (!is.null(c2.title))
17✔
1755
        attr(x, "values.title") <- c2.title
7✔
1756
    return(x)
17✔
1757
}
1758

1759
setAxisTitles <- function(x, chart.type, drop, values.title = "")
1760
{
1761
    if (isScatter(chart.type))
1,246✔
1762
    {
1763
        # Charting functions will automatically use column names
1764
        attr(x, "categories.title") <- ""
370✔
1765
        attr(x, "values.title") <- ""
370✔
1766

1767
    } else if (chart.type == "Heat")
876✔
1768
    {
1769
        # No default axis labels for summary tables
1770
        # Because it depends on the question type used to create the table
1771
        if (length(attr(x, "questions")) == 2 &&
12✔
1772
            "SUMMARY" %in% attr(x, "questions"))
12✔
1773
        {
1774
            attr(x, "categories.title") <- ""
4✔
1775
            attr(x, "values.title") <- ""
4✔
1776
        }
1777

1778
        if (is.null(attr(x, "categories.title")))
12✔
1779
            attr(x, "categories.title") <- names(dimnames(x))[2]
5✔
1780
        if (is.null(attr(x, "categories.title")))
12✔
1781
            attr(x, "categories.title") <- attr(x, "questions")[2]
3✔
1782

1783
        if (is.null(attr(x, "values.title")))
12✔
1784
            attr(x, "values.title") <- names(dimnames(x))[1]
4✔
1785
        if (is.null(attr(x, "values.title")))
12✔
1786
            attr(x, "values.title") <- attr(x, "questions")[1]
2✔
1787
    } else
1788
    {
1789
        # Extract categories.title from aggregated data
1790
        if (is.null(attr(x, "categories.title")))
864✔
1791
            attr(x, "categories.title") <- names(dimnames(x))[1]
681✔
1792
        # Extract categories.title from Qtables
1793
        if (is.null(attr(x, "categories.title")) && !is.null(attr(x, "questions")))
864✔
1794
            attr(x, "categories.title") <- attr(x, "questions")[1]
70✔
1795
        if (!is.null(attr(x, "statistic")) && grepl("%$", attr(x, "statistic")))
864✔
1796
            attr(x, "values.title") <- "%"
91✔
1797
        else if (!is.null(attr(x, "statistic")) && grepl("Percent", attr(x, "statistic")))
773✔
1798
            attr(x, "values.title") <- ""
2✔
1799
        else if (any(nchar(attr(x, "statistic"))))
771✔
1800
            attr(x, "values.title") <- attr(x, "statistic")
210✔
1801
        if (is.null(attr(x, "values.title")) && length(dimnames(x)) == 3)
864✔
1802
            attr(x, "values.title") <- dimnames(x)[[3]][1]
37✔
1803
    }
1804
    if (sum(nchar(values.title)) > 0)
1,246✔
1805
        attr(x, "values.title") <- values.title
4✔
1806
    if (is.null(attr(x, "values.title")))
1,246✔
1807
        attr(x, "values.title") <- ""
280✔
1808
    if (drop && !is.data.frame(x) && !chart.type %in% c("Scatter", "Heat"))
1,246✔
1809
    {
1810
        # only drop 1 dimension from a 2d matrix
1811
        if (length(dim(x)) == 2 && (dim(x)[2] == 1 || dim(x)[1] == 1)) {
571✔
1812
            if (dim(x)[2] == 1) {
101✔
1813
                tmp.vec <- x[, 1]
93✔
1814
                names(tmp.vec) <- rownames(x)
93✔
1815
            }
1816
            else if (dim(x)[1] == 1) {
8✔
1817
                tmp.vec <- x[1, ]
8✔
1818
                names(tmp.vec) <- colnames(x)
8✔
1819
            }
1820
            attr(tmp.vec, "statistic") <- attr(x, "statistic")
101✔
1821
            attr(tmp.vec, "questions") <- attr(x, "questions")
101✔
1822
            attr(tmp.vec, "categories.title") <- attr(x, "categories.title")
101✔
1823
            attr(tmp.vec, "values.title") <- attr(x, "values.title")
101✔
1824
            x <- tmp.vec
101✔
1825
        }
1826
        else
1827
            x <- CopyAttributes(drop(x), x)
470✔
1828
    }
1829
    x
1,246✔
1830
}
1831

1832
getFullRowNames <- function(x)
1833
{
1834
    if (!is.null(attr(x, "span")))
627✔
1835
        return(apply(attr(x, "span")$rows, 1, paste, collapse = " - "))
51✔
1836
    else if (!is.null(nrow(x)) && hasUserSuppliedRownames(x))
576✔
1837
        return(MakeUniqueNames(rownames(x)))
400✔
1838
    else if (!is.list(x) && is.null(nrow(x)))
176✔
1839
        return(MakeUniqueNames(names(x)))
158✔
1840
    else if (is.list(x) && length(x) == 1)
18✔
1841
        return(getFullRowNames(x[[1]]))
2✔
1842
    else
1843
        return(NULL)
16✔
1844
}
1845

1846

1847
#' Helps tidy Q variables and tables
1848
#' @description Inputs supplied via input.data.raw can be in a range of
1849
#'  formats. This function does a minimal job of checking for attribute
1850
#'  and using these as names when appropriate. Currently, it does
1851
#'  two functions. (1) Returns the span instead of the values and
1852
#'  (2) assigns column names to 1-dimensional Q tables. Inputs which
1853
#'  cannot be safely converted to a matrix (e.g. date/time or factors)
1854
#'  are returned as is without any changes.
1855
#'
1856
#' @param x Q table or variable
1857
#' @param use.span Logical; Whether the span categories should be returned
1858
#'  instead of the values in the table. Row names will be preserved.
1859
#'  A warning will be given if this option is selected but no span
1860
#'  attribute is found in \code{x}.
1861
#' @param show.labels This option is only relevant for Q variables.
1862
#'   For tables, the resulting variable will always be named by
1863
#'   by the 'name' attribute, but for variables both the 'label' and
1864
#'   'name' attribute can be used.
1865
#' @param is.scatter.annot.data This condition is applied to input
1866
#'   data expected to be used for annotation data for scatterplots.
1867
#'   it checks that the data is one-dimensional and stops immediately
1868
#'   and gives an error if this condition is not met. This avoid
1869
#'   some nonsense output or misleading error messages that might
1870
#'   be given by PrepareData.
1871
#'
1872
#' @export
1873
PrepareForCbind <- function(x, use.span = FALSE, show.labels = TRUE,
1874
                        is.scatter.annot.data = FALSE)
1875
{
1876
    if (is.null(x))
13✔
1877
        return(x)
×
1878
    if (is.scatter.annot.data && NCOL(x) > 1)
13✔
1879
        stop("Annotation data for Scatterplots should be a single-column table ",
×
1880
             "or variable with the same number of values as the number of ",
×
1881
             "points in the chart")
×
1882

1883
    allow.qtables <- get0("ALLOW.QTABLE.CLASS", ifnotfound = FALSE, envir = .GlobalEnv)
13✔
1884

1885
    if (!allow.qtables)
13✔
1886
        x <- unclassQTable(x)
11✔
1887

1888
    if (use.span && is.null(attr(x, "span")))
13✔
1889
        warning("Spans were not used as this attribute was not found in the data.")
×
1890

1891
    new.dat <- NULL
13✔
1892
    if (inherits(x, c("POSIXct", "POSIXt", "Date")) || is.factor(x))
13✔
1893
    {
1894
        # For variables, this function is not really required
1895
        # and for non-atomic types it results in info being lost
1896
        new.dat <- data.frame(x)
1✔
1897

1898
    } else if (use.span && !is.null(attr(x, "span")))
12✔
1899
    {
1900
        # Q tables can always be converted to a matrix
1901
        new.dat <- as.matrix(attr(x, "span")$rows[, 1])
2✔
1902
        rownames(new.dat) <- if (!is.null(rownames(x))) rownames(x) else names(x)
2✔
1903

1904
        # Assign a blank name, so this column is not
1905
        # accidentally used for another variable
1906
        # The space is needed to avoid ugly R defaults
1907
        colnames(new.dat) <- " "
2✔
1908
        new.dat <- CopyAttributes(new.dat, x)
2✔
1909
        return(new.dat)
2✔
1910
    }
1911
    else if (!is.list(x))
10✔
1912
    {
1913
        # Avoid trying to convert complex data structures
1914
        # including dataframes which might have different types
1915
        new.dat <- as.matrix(x)
8✔
1916

1917
    } else
1918
        new.dat <- x
2✔
1919

1920
    # Multi-column tables are generally already correctly named
1921
    if ((is.data.frame(x) || !is.list(x)) && ncol(new.dat) == 1)
11✔
1922
    {
1923
        if (!is.null(attr(x, "label")) && show.labels)     # x is a variable
8✔
1924
            colnames(new.dat) <- Labels(x)
1✔
1925
        else if (!is.null(attr(x, "name")) && length(Names(x)) == 1) # x is a table or a variable
7✔
1926
            colnames(new.dat) <- Names(x)
5✔
1927
        else
1928
            colnames(new.dat) <- " "
2✔
1929
    }
1930
    CopyAttributes(new.dat, x)
11✔
1931
}
1932

1933

1934

1935
rawDataLooksCrosstabbable <- function(input.data.raw, data)
1936
{
1937
    if (is.null(input.data.raw))
314✔
1938
        return(FALSE)
175✔
1939
    if (is.null(input.data.raw))
139✔
1940
        return(FALSE)
×
1941
    not.nulls <- !vapply(input.data.raw, is.null, logical(1L))
139✔
1942
    if (length(not.nulls) == 1)
139✔
1943
        return(FALSE)
79✔
1944
    if (!not.nulls[1] || !not.nulls[2])
60✔
1945
        return(FALSE)
5✔
1946
    if (length(not.nulls) > 2)
55✔
1947
    {
1948
        if (Sum(not.nulls) != 2)
19✔
1949
            return(FALSE)
7✔
1950
        input.data.raw <- input.data.raw[1:2]
12✔
1951
    }
1952
    nms <- names(input.data.raw)
48✔
1953
    ncols <- vapply(input.data.raw, NCOL, integer(1L))
48✔
1954
    if (any(ncols != 1))
48✔
1955
        return(FALSE)
×
1956
    return(all(nms == c("X", "Y")))
48✔
1957
}
1958

1959
hasUserSuppliedRownames <- function(data)
1960
{
1961
    if (is.null(rownames(data)))
762✔
1962
        return(FALSE)
61✔
1963
    tmp <- attr(data, "assigned.rownames")
701✔
1964
    if (isTRUE(tmp))
701✔
1965
        return(TRUE)
93✔
1966
    if (!is.null(tmp) && !tmp)
608✔
1967
        return(FALSE)
3✔
1968
    if (length(dim(data)) < 2 && is.null(names(data)))
605✔
1969
        return(FALSE)
×
1970

1971
    # Default row names
1972
    rnames <- gsub("Row ", "", rownames(data))
605✔
1973
    if (all(rnames == as.character(1:nrow(data))))
605✔
1974
        return(FALSE)
58✔
1975

1976
    return(TRUE)
547✔
1977
}
1978

1979

1980
#' @importFrom utils tail
1981
tidyLabels <- function(data, chart.type)
1982
{
1983
    tmp <- NULL
17✔
1984
    vertical.chart <- isDistribution(chart.type) || chart.type == "Venn"
17✔
1985
    if (length(dim(data)) >= 2)
17✔
1986
    {
1987
        orig.names <- if (vertical.chart) colnames(data)
12✔
1988
                      else                rownames(data)
12✔
1989
        if (!IsDateTime(orig.names))
12✔
1990
        {
1991
            tmp <- ExtractCommonPrefix(orig.names)
11✔
1992
            if (!is.na(tmp$common.prefix))
11✔
1993
            {
1994
                warning(sprintf("'%s' has been removed from labels. To turn off de-select 'DATA MANIPULATION > Tidy labels'", tmp$common.prefix))
3✔
1995
                if (vertical.chart)
3✔
1996
                    colnames(data) <- tmp$shortened.labels
2✔
1997
                else
1998
                {
1999
                    rownames(data) <- tmp$shortened.labels
1✔
2000
                    if (is.null(attr(data, "categories.title")))
1✔
2001
                        attr(data, "categories.title") <- tmp$common.prefix
1✔
2002
                }
2003
            }
2004
        }
2005
    }
2006
    else if (!is.null(names(data))) # lists and vectors
5✔
2007
    {
2008
        if (!IsDateTime(names(data)))
5✔
2009
        {
2010
            tmp <- ExtractCommonPrefix(names(data))
5✔
2011
            if (!is.na(tmp$common.prefix))
5✔
2012
            {
2013
                warning(sprintf("'%s' has been removed from labels. To turn off de-select 'DATA MANIPULATION > Tidy labels'", tmp$common.prefix))
1✔
2014
                names(data) <- tmp$shortened.labels
1✔
2015
                if (is.null(attr(data, "categories.title")))
1✔
2016
                    attr(data, "categories.title") <- tmp$common.prefix
1✔
2017
            }
2018
        }
2019
    }
2020

2021
    # Remove span labels
2022
    if (isScatter(chart.type) &&  !is.null(rownames(data)) &&
17✔
2023
        all(grepl(" - ", rownames(data), fixed = TRUE)))
17✔
2024
    {
2025
        rownames(data) <- MakeUniqueNames(sapply(rownames(data),
3✔
2026
            function(x) tail(strsplit(x, " - ")[[1]], n = 1)))
3✔
2027
    }
2028
    data
17✔
2029
}
2030

2031

2032
checkRegressionOutput <- function(x)
2033
{
2034
    # First element always a single element
2035
    # Second element is a list of elements
2036
    # Last four elements are Z1, Z2, groups and labels that should never be regression outputs
2037
    return(c(inherits(x$X, "Regression"), any(sapply(x$Y, function(e) inherits(e, "Regression")))))
507✔
2038
}
2039

2040
#' @importFrom flipFormat TidyLabels
2041
extractRegressionScatterData <- function(x, y.axis = FALSE, name = NULL)
2042
{
2043
    if (!inherits(x, "Regression"))
178✔
2044
        return(x)
9✔
2045
    chart.data <- ExtractChartData(x)
169✔
2046
    if (!is.null(x$importance))
169✔
2047
        names(chart.data) <- TidyLabels(names(chart.data))
93✔
2048
    if (y.axis)
169✔
2049
    {
2050
        chart.data <- as.array(chart.data)
70✔
2051
        attr(chart.data, "name") <- name
70✔
2052
    }
2053
    return(chart.data)
169✔
2054
}
2055

2056

2057
# This function is used when scatter.mult.yvals = TRUE
2058
# It converts the input data frame which a data series in each column
2059
# into a the standard input format, where the data series
2060
# is indicated by the value in the "Groups" column
2061
# If x-coordinates are supplied in input.data.raw$X, then
2062
# rownames attached to the x-coordinates will be used as
2063
# rownames of the resulting data frame
2064
# Otherwise, the rownames of data will be used as the
2065
# x-coordinates and the rownames of the output data
2066
# will be blank (with spaces as padding for uniqueness)
2067
# This function also updates the attribute "scatter.variable.indices"
2068
# to describe the format of the output data frame
2069

2070
convertScatterMultYvalsToDataFrame <- function(data, input.data.raw, show.labels, date.format)
2071
{
2072
    data.row.labels <- rownames(data)
28✔
2073
    n <- nrow(data)
28✔
2074
    if (any(reg.outputs <- sapply(input.data.raw$Y, function(e) inherits(e, "Regression"))))
28✔
2075
    {
2076
        extracted.data.raw.Y <- input.data.raw$Y
8✔
2077
        extracted.data.raw.Y[reg.outputs] <- lapply(input.data.raw$Y[reg.outputs], ExtractChartData)
8✔
2078
        regression.names <- names(input.data.raw$Y)
8✔
2079
        idx <- which(reg.outputs)
8✔
2080
        for(i in seq_along(idx))
8✔
2081
            attr(extracted.data.raw.Y[[idx[i]]], "label") <- regression.names[idx[i]]
8✔
2082
        y.names <- if (show.labels) Labels(extracted.data.raw.Y) else Names(extracted.data.raw.Y)
8✔
2083
    } else
2084
        y.names <- if (show.labels) Labels(input.data.raw$Y) else Names(input.data.raw$Y)
20✔
2085

2086
    # Figure out which columns to use as the X and Y coordinates
2087
    if (is.list(input.data.raw$Y) && is.null(input.data.raw$X))
28✔
2088
    {
2089
        # No X-coordinates supplied in variables
2090
        m <- length(input.data.raw$Y)
×
2091
        y.ind <- 1:m
×
2092
        xvar <- rep(1:n, m)
×
2093

2094
    } else if (is.null(input.data.raw$Y) && hasUserSuppliedRownames(data))
28✔
2095
    {
2096
        # Use rowlabels as X-coordinate if character labels given
2097
        m <- ncol(data)
9✔
2098
        if (!is.null(attr(data, "ycol")))
9✔
2099
            m <- attr(data, "ycol")
×
2100
        y.ind <- 1:m
9✔
2101
        xvar <- rep(rownames(data), m)
9✔
2102
        data.row.labels <- rep("", nrow(data))
9✔
2103
    } else
2104
    {
2105
        # Otherwise use first column as X-coordinates
2106
        m <- ncol(data) - 1
19✔
2107
        if (!is.null(attr(data, "ycol")))
19✔
2108
            m <- attr(data, "ycol") - 1
1✔
2109
        y.ind <- (1:m) + 1
19✔
2110
        xvar <- rep(data[,1], m)
19✔
2111
    }
2112

2113
    if (!hasUserSuppliedRownames(data))
28✔
2114
        data.row.labels <- rep("", nrow(data))
5✔
2115
    if (length(y.names) < m)
28✔
2116
        y.names <- colnames(data)[y.ind]
17✔
2117
    if (length(y.names) < m)
28✔
2118
        y.names <- paste("Group", 1:m)
1✔
2119
    if (any(checkRegressionOutput(input.data.raw)) && length(y.names) >= m)
28✔
2120
        y.names <- colnames(data)[y.ind]
9✔
2121

2122
    # Data from other statistics is restructured and appended separately
2123
    extravar <- NULL
28✔
2124
    if (!is.null(attr(data, "ycol")))
28✔
2125
    {
2126
        # Other statistics are in the rest of input.data.raw$Y[[1]]
2127
        # But we need to take from data because we may have removed row/cols
2128
        yvar <- as.vector(unlist(data[,y.ind]))
1✔
2129
        y.names <- dimnames(input.data.raw$Y[[1]])[[2]]
1✔
2130
        tmp.ind <- charmatch(y.names, colnames(data)[y.ind])
1✔
2131
        y.names <- y.names[!is.na(tmp.ind)]
1✔
2132
        y.names.patt <- paste(paste0("\\Q", y.names, "\\E"), collapse = "|")
1✔
2133
        stat.names <- dimnames(input.data.raw$Y[[1]])[[3]]
1✔
2134
        extravar <- matrix(NA, nrow = length(yvar), ncol = length(stat.names) - 1)
1✔
2135

2136
        for (i in 2:length(stat.names))
1✔
2137
        {
2138
            stat.names.patt <- paste0("\\Q.", stat.names[i], "\\E$") # make patt strict (e.g 'p')!
1✔
2139
            tmp.ind <- intersect(grep(stat.names.patt, colnames(data)),
1✔
2140
                                 grep(y.names.patt, colnames(data)))
1✔
2141
            if (length(tmp.ind) > 0)
1✔
2142
                extravar[,i-1] <- unlist(data[,tmp.ind])
1✔
2143
        }
2144
        colnames(extravar) <- stat.names[-1]
1✔
2145

2146
    } else if (length(dim(data)) >= 3)
27✔
2147
    {
2148
        # Other statistics are in the 3rd dimension of table
2149
        yvar <- as.vector(unlist(data[,y.ind,1]))
6✔
2150
        extravar <- apply(data[, y.ind, -1, drop = FALSE], 3, unlist)
6✔
2151

2152
    } else # simple case with no other statistics
2153
        yvar <- as.vector(unlist(data[,y.ind]))
21✔
2154

2155

2156
    # newdata needs to use data rather than input.data.raw
2157
    # otherwise it will not handle filters etc
2158
    newdata <- data.frame(X = xvar,
28✔
2159
                          Y = yvar,
28✔
2160
                          Groups = factor(rep(y.names, each = n), levels = y.names),
28✔
2161
                          stringsAsFactors = FALSE)
28✔
2162

2163
    if (length(extravar) > 0)
28✔
2164
        newdata <- cbind(newdata, extravar)
7✔
2165
    rownames(newdata) <- if (length(unique(data.row.labels)) <= 1) NULL
28✔
2166
                         else                                      MakeUniqueNames(rep(data.row.labels, m))
28✔
2167
    if (!grepl("^No date", date.format) && date.format != "Automatic")
28✔
2168
    {
2169
        if (IsDateTime(as.character(newdata[,1])))
1✔
2170
            newdata[,1] <- format(AsDate(as.character(newdata[,1]),
1✔
2171
            us.format = !grepl("International", date.format)), "%b %d %Y")
1✔
2172
    }
2173

2174
    # Preserve column names where possible
2175
    if (!is.null(input.data.raw$X))
28✔
2176
        colnames(newdata)[1] <- colnames(data)[1]
16✔
2177
    else if (!is.null(qst <- attr(data, "questions")))
12✔
2178
    {
2179
        colnames(newdata)[1] <- qst[1]
7✔
2180
        if (length(qst) >= 2)
7✔
2181
            colnames(newdata)[3] <- qst[2]
7✔
2182
    }
2183
    if (length(dim(data)) == 3)
28✔
2184
        colnames(newdata)[2] <- dimnames(data)[[3]][1]
6✔
2185
    else if (!is.null(attr(data, "statistic")))
22✔
2186
        colnames(newdata)[2] <- attr(data, "statistic")
2✔
2187

2188

2189
    data <- newdata
28✔
2190
    attr(data, "scatter.variable.indices") <- c(x = 1, y = 2, sizes = 0, colors = 3, groups = 3)
28✔
2191
    return(data)
28✔
2192
}
2193

2194
#' @importFrom flipU IsQTable
2195
containsQTable <- function(x)
2196
{
2197
    if (is.data.frame(x)) return(FALSE)
266✔
2198
    if (!is.list(x)) return(IsQTable(x))
9,613✔
2199
    any(vapply(x, containsQTable, logical(1L)))
1,611✔
2200
}
2201

2202
#' @importFrom abind abind
2203
addStatTesting <- function(x, x.siginfo, p.cutoffs, colors.pos, colors.neg, colors.on.font, symbol, symbol.size)
2204
{
2205
    arrow.dir <- x.siginfo$significancedirection
13✔
2206
    if (all(arrow.dir == "None"))
13✔
2207
        return(x)
×
2208

2209
    arrow.pval <- x.siginfo$pcorrected
13✔
2210
    arrow.colors <- rep("", length(arrow.dir))
13✔
2211
    ind.pos <- which(arrow.dir == "Up")
13✔
2212
    for (ii in ind.pos)
13✔
2213
    {
2214
        j <- max(which(arrow.pval[ii] < p.cutoffs))
60✔
2215
        arrow.colors[ii] <- colors.pos[j]
60✔
2216
    }
2217
    ind.neg <- which(arrow.dir == "Down")
13✔
2218
    for (ii in ind.neg)
13✔
2219
    {
2220
        j <- max(which(arrow.pval[ii] < p.cutoffs))
49✔
2221
        arrow.colors[ii] <- colors.neg[j]
49✔
2222
    }
2223

2224
    dn <- dim(x)
13✔
2225
    if (is.null(dn)) # vector
13✔
2226
    {
2227
        tmp.x <- matrix(x, length(x), 1, dimnames = list(names(x), NULL))
×
2228

2229
    } else if (length(dn) == 1)
13✔
2230
    {
2231
        dn <- c(dn, 1)
4✔
2232
        tmp.x <- matrix(x, ncol = 1, dimnames = list(rownames(x), NULL))
4✔
2233
    } else if (is.null(attr(x, "statistic", exact = TRUE)) && length(dn) == 2)
9✔
2234
    {
2235
        tmp.x <- array(x, c(dn[1], 1, dn[2]))
1✔
2236
        dimnames(tmp.x) <- list(dimnames(x)[[1]], NULL, dimnames(x)[[2]])
1✔
2237
    } else
2238
        tmp.x <- x
8✔
2239

2240
    # Append new cell-statistic and annotation for each differently colored arrow
2241
    # Usually, this is one color each for each direction but there are theoretically 10 levels each
2242
    mat.list <- list(tmp.x)
13✔
2243
    annot.list <- list()
13✔
2244
    signames <- c()
13✔
2245
    k <- 1
13✔
2246
    for (tmp.dir in c("Up", "Down"))
13✔
2247
    {
2248
        tmp.col <- unique(arrow.colors[which(arrow.dir == tmp.dir)])
26✔
2249
        for (cc in tmp.col)
26✔
2250
        {
2251
            j <- length(mat.list)
26✔
2252
            mat.list[[j+1]] <- matrix(arrow.dir == tmp.dir & arrow.colors == cc,
26✔
2253
                nrow=nrow(tmp.x), ncol=ncol(tmp.x), byrow = TRUE)
26✔
2254
            tmp.signame <- paste0("signif", tmp.dir, cc)
26✔
2255
            signames <- c(signames, tmp.signame)
26✔
2256

2257
            # Add annotation for font colors
2258
            if (colors.on.font)
26✔
2259
            {
2260
                annot.list[[k]] <- list(type = "Recolor text",
2✔
2261
                    data = tmp.signame, threstype = "above threshold", threshold = 0,
2✔
2262
                    color = cc)
2✔
2263
                k <- k + 1
2✔
2264
            }
2265

2266
            # Add annotation for symbol (arrow or caret)
2267
            if (symbol != "None")
26✔
2268
            {
2269
                annot.list[[k]] <- list(type = paste(symbol, "-", tolower(tmp.dir)),
24✔
2270
                    data = tmp.signame, threstype = "above threshold", threshold = 0,
24✔
2271
                    color = cc, size = symbol.size)
24✔
2272
                k <- k + 1
24✔
2273
            }
2274
        }
2275
    }
2276
    new.dat <- abind(mat.list, along = 3)
13✔
2277
    if (length(dim(tmp.x)) == 3)
13✔
2278
        dimnames(new.dat)[[3]] <- c(dimnames(tmp.x)[[3]], signames)
4✔
2279
    else
2280
        dimnames(new.dat)[[3]] <- c(attr(x, "statistic", exact = TRUE), signames)
9✔
2281
    new.dat <- CopyAttributes(new.dat, x)
13✔
2282
    attr(new.dat, "statistic") <- NULL
13✔
2283
    attr(new.dat, "signif-annotations") <- annot.list
13✔
2284
    return(new.dat)
13✔
2285
}
2286

2287
updateQStatisticsInfo <- function(x, original.dim.names, transpose)
2288
{
2289
    x.siginfo <- attr(x, "QStatisticsTestingInfo")
13✔
2290
    if (is.null(x.siginfo))
13✔
2291
        return(x)
×
2292
    if (length(dim(x)) < 2 || NCOL(x) == 1)
13✔
2293
    {
2294
        curr.names <- if (length(dim(x)) == 0) names(x) else rownames(x)
5✔
2295
        ind <- match(curr.names, original.dim.names[[1]])
5✔
2296
        attr(x, "QStatisticsTestingInfo") <- x.siginfo[ind,]
5✔
2297
        return(x)
5✔
2298
    }
2299
    if (transpose)
8✔
2300
        original.dim.names <- original.dim.names[2:1]
2✔
2301
    rows.changed <- length(dimnames(x)[[1]]) != length(original.dim.names[[1]]) ||
8✔
2302
        any(dimnames(x)[[1]] != original.dim.names[[1]])
8✔
2303
    cols.changed <- length(dimnames(x)[[2]]) != length(original.dim.names[[2]]) ||
8✔
2304
        any(dimnames(x)[[2]] != original.dim.names[[2]])
8✔
2305
    if (!rows.changed && !cols.changed)
8✔
2306
        return(x)
×
2307

2308
    x.siginfo <- attr(x, "QStatisticsTestingInfo")
8✔
2309
    cind <- if (transpose) 1 else 2
8✔
2310
    rind <- if (transpose) 2 else 1
8✔
2311
    nc <- length(original.dim.names[[cind]])
8✔
2312
    row.ord <- match(dimnames(x)[[1]], original.dim.names[[1]])
8✔
2313
    col.ord <- match(dimnames(x)[[2]], original.dim.names[[2]])
8✔
2314
    ind2d <- expand.grid(col.ord, row.ord)
8✔
2315
    ind <- 1:nrow(ind2d)
8✔
2316
    for (ii in 1:length(ind))
8✔
2317
        ind[ii] <- (ind2d[ii, cind] - 1) * nc + ind2d[ii, rind]
426✔
2318
    attr(x, "QStatisticsTestingInfo") <- x.siginfo[ind,]
8✔
2319

2320
    # Remove QStatisticsTestingInfo if any of it is invalid
2321
    # to avoid an exception getting thrown on export
2322
    if (any(!is.finite(attr(x, "QStatisticsTestingInfo")$significancearrowratio)))
8✔
2323
        attr(x, "QStatisticsTestingInfo") <- NULL
×
2324
    return(x)
8✔
2325
}
2326

2327
isQTableClass <- function(x) inherits(x, "QTable") || inherits(x, "qTable")
1,492✔
2328

2329
unclassQTable <- function(data)
2330
{
2331
    if (is.null(data)) return(data)
1,960✔
2332
    if (is.list(data) && !is.data.frame(data))
587✔
2333
    {
2334
        qtable.elements <- vapply(data, isQTableClass, logical(1L))
384✔
2335
        data[qtable.elements] <- lapply(data[qtable.elements], unclassQTable)
384✔
2336
        return(data)
384✔
2337
    }
2338
    if (isQTableClass(data))
203✔
2339
    {
2340
        data <- unclass(data)
21✔
2341
        data.attributes <- attributes(data)
21✔
2342
        is.subscripted.table <- !is.null(data.attributes[["original.questiontypes"]])
21✔
2343
        if (!is.subscripted.table) return(data)
18✔
2344
        data.attribute.names <- names(data.attributes)
3✔
2345
        attr.to.remove <- qTableAttributesToRemove(data.attribute.names)
3✔
2346
        attributes(data)[attr.to.remove] <- NULL
3✔
2347
        return(data)
3✔
2348
    }
2349
    data
182✔
2350
}
2351

2352
#' @importFrom verbs IsQTableAttribute
2353
qTableAttributesToRemove <- function(attr.names)
2354
{
2355
    qtable.attr.names <- eval(formals(IsQTableAttribute)[["qtable.attrs"]])
3✔
2356
    qtable.attr.names <- c(qtable.attr.names, paste0("original.", qtable.attr.names))
3✔
2357
    attr.names %in% qtable.attr.names & !attr.names %in% c("dim", "dimnames", "names")
3✔
2358
}
2359

2360
#' Check the questions and statistics attribute,
2361
#' and the dimnames in the last dimension,
2362
#' to work out if this table is likely to be
2363
#' a raw data table in Q/Displayr
2364
#' @noRd
2365
isRawDataQTable <- function(x) {
2366
    questions <- attr(x, "questions")
3✔
2367
    if ("RAW DATA" %in% questions)
3✔
2368
        return(TRUE)
1✔
2369

2370
    statistic <- attr(x, "statistic")
2✔
2371
    if (statistic %in% c("Values", "Labels"))
2✔
2372
        return(TRUE)
×
2373

2374
    dn <- dimnames(x)
2✔
2375
    if (is.null(dn))
2✔
2376
        return(FALSE)
×
2377

2378
    last.dn <- dn[length(dn)]
2✔
2379
    if (last.dn %in% c("Values", "Labels"))
2✔
2380
        return(TRUE)
×
2381

2382
    return(FALSE)
2✔
2383
}
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2025 Coveralls, Inc