
uwescience / myria
59%
master: 27%

LAST BUILD BRANCH: blob_expr_UDF
DEFAULT BRANCH: master
Repo Added 11 Feb 2014 06:30PM UTC
Files 0

LAST BUILD ON BRANCH uda-rewrite-script-eval

pending completion
396

push

travis-ci

dhalperi
UserDefinedAggregate: combine expressions into a script

In many cases, the updaters for user-defined aggregates share code. E.g., for
an argmax, you may do something like this:

def higher(max1, max2, val1, val2):
   case when max1 > max2 then val1 else val2 end;

Then, given a table Student(name, gpa), we can define argmax with this updater
to pick the {name, gpa} of the student with the highest gpa.

update = [higher(s.gpa, state.gpa, s.name, state.name),
          higher(s.gpa, state.gpa, s.gpa, state.gpa)]

Currently, we compile each of the update expressions individually and then
execute them in series. Both expressions above evaluate the same comparison
(s.gpa > state.gpa), so unless Java's JIT eliminates the duplicate work across
separately compiled expressions, this leads to redundant execution.
(Performance results indicate that the JIT does not optimize this redundancy
away.)

Instead, we should generate the entire updater script as a single block of
code, and compile it as a single method. The execution code gets simpler and we
expose more optimization opportunities to the compiler. In my experiments, UDA
execution time decreases by ~20% or more.
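
As an illustration of the redundancy described above, here is a minimal Java sketch of the argmax update in both forms. It is not Myria's generated code: the class name, the method names, and the comparison counter are all hypothetical, and the single-method form hoists the comparison by hand to show the kind of sharing a compiler can only find once both expressions live in one method.

```java
// Hypothetical sketch; none of these names come from Myria.
final class ArgmaxUpdateSketch {
    // Counts gpa comparisons so the duplicated work is visible.
    static int comparisons = 0;

    // Stand-in for higher(max1, max2, val1, val2) from the commit message.
    static <T> T higher(double max1, double max2, T val1, T val2) {
        comparisons++;
        return max1 > max2 ? val1 : val2;
    }

    // Per-expression form: one call per state column, so the same
    // comparison runs once for name and once for gpa.
    static Object[] updateSeparately(String name, double gpa,
                                     String stateName, double stateGpa) {
        String newName = higher(gpa, stateGpa, name, stateName);
        Double newGpa = higher(gpa, stateGpa, gpa, stateGpa);
        return new Object[] {newName, newGpa};
    }

    // Single-method form: the comparison runs once and feeds both columns.
    static Object[] updateAsScript(String name, double gpa,
                                   String stateName, double stateGpa) {
        comparisons++;
        boolean takeInput = gpa > stateGpa;
        return new Object[] {takeInput ? name : stateName,
                             takeInput ? gpa : stateGpa};
    }

    public static void main(String[] args) {
        comparisons = 0;
        updateSeparately("ann", 3.9, "bob", 3.5);
        System.out.println("separate expressions: " + comparisons + " comparisons");

        comparisons = 0;
        updateAsScript("ann", 3.9, "bob", 3.5);
        System.out.println("single script: " + comparisons + " comparisons");
    }
}
```

Both forms return the same new state; only the number of comparisons per input tuple differs.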

- Add a new ScriptEvalInterface for compiled script objects, and rename the old
  EvalInterface to ExpressionEvalInterface.
- Rename Evaluator.getJavaExpression() to reflect the fact that it always
  includes code to append to an input column.
- Refactor UserDefinedAggregator (and associated Factory) to use the new script
  interface.
- Add the AppendableTable interface to Tuple so that we can use it with the new
  ScriptEvalInterface.
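
The split between the two evaluation interfaces listed above can be pictured roughly as follows. This is only a sketch under assumptions: the commit message does not show the real signatures, so every type here is a hypothetical, simplified stand-in (marked with a "Sketch" suffix) rather than Myria's actual ScriptEvalInterface, ExpressionEvalInterface, or AppendableTable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an appendable output; not Myria's AppendableTable.
interface AppendableSketch {
    void append(Object value);
}

// Old style (assumed shape): one compiled object per output column;
// each call appends one value.
interface ExpressionEvalSketch {
    void evaluate(Object[] input, Object[] state, AppendableSketch out);
}

// New style (assumed shape): one compiled object for the whole updater;
// a single call appends every value of the new state.
interface ScriptEvalSketch {
    void evaluate(Object[] input, Object[] state, AppendableSketch out);
}

// Minimal appendable row backed by a list.
final class RowSketch implements AppendableSketch {
    final List<Object> values = new ArrayList<>();

    @Override
    public void append(Object value) {
        values.add(value);
    }
}

final class UdaDriverSketch {
    // Per-expression driver: runs the evaluators in series.
    static List<Object> updateWithExpressions(
            List<ExpressionEvalSketch> evaluators, Object[] input, Object[] state) {
        RowSketch next = new RowSketch();
        for (ExpressionEvalSketch e : evaluators) {
            e.evaluate(input, state, next);
        }
        return next.values;
    }

    // Script driver: a single call produces the whole new state.
    static List<Object> updateWithScript(
            ScriptEvalSketch script, Object[] input, Object[] state) {
        RowSketch next = new RowSketch();
        script.evaluate(input, state, next);
        return next.values;
    }
}
```

In this picture the aggregator holds one script object instead of a list of expression objects, which is one way to read the claim that the execution code gets simpler.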

55 of 55 new or added lines in 5 files covered. (100.0%)

13647 of 23051 relevant lines covered (59.2%)

1.74 hits per line


Recent builds

Build: 396
Branch: uda-rewrite-script-eval
Commit: UserDefinedAggregate: combine expressions into a script In many cases, the updaters for user-defined aggregates share code. E.g., for an argmax, you may do something like this: def higher(max1, max2, val1, val2): case when max1 > max2 then va...
Type: push
Ran: 06 Mar 2015 12:22AM UTC
Committer: dhalperi
Via: travis-ci
Coverage: pending completion
See All Builds (1778)