IBM / unitxt
Coverage: 80%
main: 81%

LAST BUILD BRANCH: fix/disable-milu-test-gated-dataset
DEFAULT BRANCH: main
Repo Added 24 Dec 2024 03:17PM UTC
Files 64
LAST BUILD ON BRANCH codecov

24 Dec 2024 03:22PM UTC coverage: 80.313%. First build
12483476006

Pull #1456

github

web-flow
Merge 66ec1f4d0 into 0aa2f7395
Pull Request #1456: Fix coverage tests

1332 of 1649 branches covered (80.78%)

Branch coverage included in aggregate %.

8414 of 10486 relevant lines covered (80.24%)
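
The aggregate figure can be checked against the line and branch counts above. The sketch below assumes that "included in aggregate" means covered lines and covered branches are pooled over the pooled totals; the page does not spell out the formula, but this reading reproduces the reported 80.313%. The helper name pct is illustrative only.

```python
def pct(covered: int, total: int) -> float:
    """Return coverage as a percentage."""
    return 100.0 * covered / total

# Counts taken from this build report.
lines_covered, lines_total = 8414, 10486
branches_covered, branches_total = 1332, 1649

line_pct = pct(lines_covered, lines_total)
branch_pct = pct(branches_covered, branches_total)

# Assumption: the aggregate pools lines and branches together.
aggregate_pct = pct(
    lines_covered + branches_covered,
    lines_total + branches_total,
)

print(f"lines:     {line_pct:.2f}%")
print(f"branches:  {branch_pct:.2f}%")
print(f"aggregate: {aggregate_pct:.3f}%")
```

Under this assumption the aggregate falls between the line and branch percentages, weighted toward lines because there are far more relevant lines than branches.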

0.8 hits per line

Source Files on codecov
Detailed source file information is not available for this build.

Recent builds

Builds | Branch | Commit | Type | Ran | Committer | Via | Coverage
12483476006 | codecov | Merge 66ec1f4d0 into 0aa2f7395 | Pull #1456 | 24 Dec 2024 03:27PM UTC | web-flow | github | 80.31
12483404214 | codecov | Merge 85bba19ff into 0aa2f7395 | Pull #1456 | 24 Dec 2024 03:20PM UTC | web-flow | github | 80.32
See All Builds (1863)