MITLibraries / timdex-dataset-api
main: 94%

LAST BUILD BRANCH: USE-143-record-metadata-join
DEFAULT BRANCH: main
Repo Added 26 Nov 2024 09:20PM UTC
Build 335 Last
Files 8
LAST BUILD ON BRANCH USE-143-always-metadata-join

09 Dec 2025 03:52PM UTC coverage: 93.861%. Remained the same
Build 20069671054 · Type: push · Via: github · Committer: ghukill

Join embeddings queries on record metadata

Why these changes are being introduced:

Two nice-to-have features were missing from the first pass of embeddings
reading:

1. filter by record metadata columns not in embeddings schema
2. retrieve record metadata columns in embeddings read methods

There was a deliberate choice to keep embeddings read methods simple in the first
pass.  This builds on that work.

How this addresses that need:

For TIMDEXEmbeddings.read_batches_iter(), the base method underlying all embeddings
read methods, perform a join to record metadata via the composite key (timdex_record_id,
run_id, run_record_offset).  Given that record metadata reads are very fast and memory safe,
this join is too.  By performing this join, we can expose record metadata columns that
intentionally don't exist in the embeddings schema -- e.g. 'source' or 'run_timestamp' --
for filtering and selecting.
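
The join described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it uses sqlite3 from the Python standard library as a stand-in for the DuckDB layer, and the table names (records, embeddings), the embedding_strategy column, the sample rows, and the read_embeddings() helper are all hypothetical. Only the three composite key columns and the 'source' and 'run_timestamp' columns come from the commit message.

```python
import sqlite3

# Hypothetical schema for illustration only. The commit message names the
# composite key columns plus 'source' and 'run_timestamp'; everything else
# here (table names, embedding_strategy, sample rows) is assumed.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE records (
        timdex_record_id TEXT,
        run_id TEXT,
        run_record_offset INTEGER,
        source TEXT,
        run_timestamp TEXT
    );
    CREATE TABLE embeddings (
        timdex_record_id TEXT,
        run_id TEXT,
        run_record_offset INTEGER,
        embedding_strategy TEXT
    );
    INSERT INTO records VALUES
        ('alma:1', 'run-a', 0, 'alma', '2025-12-01T00:00:00'),
        ('dspace:1', 'run-b', 0, 'dspace', '2025-12-02T00:00:00');
    INSERT INTO embeddings VALUES
        ('alma:1', 'run-a', 0, 'strategy-x'),
        ('dspace:1', 'run-b', 0, 'strategy-x');
    """
)


def read_embeddings(conn, source):
    """Return embedding rows joined to record metadata, filtered by source.

    'source' and 'run_timestamp' live only in the records table, so the
    join is what makes them available for filtering and selecting.
    """
    query = """
        SELECT e.timdex_record_id, e.embedding_strategy, r.source, r.run_timestamp
        FROM embeddings AS e
        JOIN records AS r
          ON e.timdex_record_id = r.timdex_record_id
         AND e.run_id = r.run_id
         AND e.run_record_offset = r.run_record_offset
        WHERE r.source = ?
        ORDER BY e.timdex_record_id
    """
    return conn.execute(query, (source,)).fetchall()


rows = read_embeddings(conn, "alma")
```

Joining on all three key columns, rather than on timdex_record_id alone, keeps an embedding tied to the exact record version it was produced from when the same record appears in more than one run.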

Side effects of this change:
* Read methods for TIMDEXEmbeddings can filter on and return columns found only in
record metadata tables/views.

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/USE-143

33 of 33 new or added lines in 1 file covered. (100.0%)

6 existing lines in 1 file now uncovered.

688 of 733 relevant lines covered (93.86%)

0.94 hits per line

Recent builds

Build | Branch | Commit | Type | Ran | Committer | Via | Coverage
20069671054 | USE-143-always-metadata-join | Join embeddings queries on record metadata | push | 09 Dec 2025 03:55PM UTC | ghukill | github | 93.86
20069569452 | USE-143-always-metadata-join | Join embeddings queries on record metadata | push | 09 Dec 2025 03:52PM UTC | ghukill | github | 93.86
