Version: 0.0.0-dev

TEMS Source Explorer — Content Ingestion API

The Content Ingestion API is the write side of the TEMS Source Explorer Backend-for-Frontend (BFF). A media provider uses it to push enriched TemsCore content into the Source Explorer, which indexes it (embeddings into Milvus, metadata into PEACH) so that it can be discovered through the Source Explorer catalogue.

Authentication — two modes on the same endpoint

The endpoint accepts both dataspace-native (EDR) and OAuth2 bearer tokens; the BFF dispatches by inspecting the JWT's iss claim. Both modes are usually active in parallel in production; either can be disabled by leaving its env vars empty.
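The issuer-based dispatch can be sketched as follows. This is a minimal illustration, not the BFF's actual code: the function names and the EDR issuer value are hypothetical, and the payload is only peeked at here, so signature verification still has to happen afterwards in the selected mode.

```python
import base64
import json

# Issuer values are deployment configuration. The EDR issuer below is a
# made-up placeholder; an empty string stands for a disabled mode.
EDR_ISSUER = "did:web:source-explorer.example"
OAUTH2_ISSUER = "https://ebu.home.dndm.ch/auth/realms/edc"


def peek_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT")
    # Restore the base64url padding that JWTs strip.
    payload = parts[1] + "=" * (-len(parts[1]) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def select_auth_mode(token, edr_issuer=EDR_ISSUER, oauth2_issuer=OAUTH2_ISSUER):
    """Return "edr" or "oauth2" depending on the token's iss claim."""
    iss = peek_claims(token).get("iss")
    if edr_issuer and iss == edr_issuer:
        return "edr"
    if oauth2_issuer and iss == oauth2_issuer:
        return "oauth2"
    raise ValueError(f"no enabled auth mode for issuer {iss!r}")
```

Passing an empty issuer for one mode mirrors leaving its env vars empty: tokens from that issuer no longer match any enabled mode.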

Mode 1 — EDR-native (dataspace-native callers)

The bearer is an EDR (Endpoint Data Reference): a short-lived JWT minted by the Source Explorer's own Eclipse Dataspace Connector (EDC) when a provider acquires the ingestion offering through the dataspace.

  1. The provider negotiates and acquires the Source Explorer's ingestion offering through their EDC connector (Dataspace Protocol).
  2. The provider's EDC connector receives an EDR for the granted transfer.
  3. The provider calls this API directly, carrying the EDR on the Authorization header (no intermediate data plane).
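Step 3 might look like this from the provider's side. The sketch only builds the request; the endpoint URL is a hypothetical placeholder (the real path is not given on this page), and build_ingest_request is an illustrative helper name.

```python
import json
import urllib.request

# Hypothetical URL: substitute the ingestion endpoint of your deployment.
INGEST_URL = "https://source-explorer.example/ingest"


def build_ingest_request(edr_jwt: str, temscore: dict, url: str = INGEST_URL):
    """Build a POST that carries the EDR on the Authorization header."""
    return urllib.request.Request(
        url,
        data=json.dumps(temscore).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {edr_jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of passing the returned object to urllib.request.urlopen.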

The BFF verifies the EDR's signature against the Source Explorer's JWKS, checks aud / iss / expiry, and projects the authenticated DID into the request identity.
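The claim checks that follow signature verification can be sketched as below. This assumes the claims were already extracted from a token whose signature was verified against the JWKS (that step is omitted here), and that a list-valued aud must contain exactly one entry; both the function name and that single-audience rule are assumptions for illustration.

```python
import time


def check_edr_claims(claims: dict, expected_issuer: str, now: float = None) -> str:
    """Check iss, exp and aud on already-verified claims; return the aud value."""
    now = time.time() if now is None else now
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or exp missing")
    aud = claims.get("aud")
    # JWT allows aud to be a string or a list of strings (RFC 7519).
    if isinstance(aud, list):
        if len(aud) != 1:
            raise ValueError("expected exactly one audience")
        aud = aud[0]
    if not isinstance(aud, str) or not aud:
        raise ValueError("aud missing")
    return aud
```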

Mode 2 — OAuth2 against the SE-internal Keycloak

The bearer is a standard OAuth2 JWT signed by the Source Explorer's internal Keycloak (issuer https://ebu.home.dndm.ch/auth/realms/edc, audience ingestion-api). The BFF reads the JWT's scope claim to choose between two sub-modes:

  • proxy sub-mode (scope ingest:proxy): used by the Source Explorer's own EDC data plane when an offering is configured to be proxied through EDC's "data plane → backend" pattern (the canonical Mode A from the EDC spec). The data plane validates the upstream EDR, obtains a JWT from the SE-internal Keycloak via client_credentials, and forwards the request with the consumer DID it derived from the EDR in the X-Dataspace-Identity header. The BFF re-enforces anti-spoofing against that header.

  • internal sub-mode (scope ingest:internal): used by an EBU-controlled internal service (typically the local indexer pulling content from a provider's public API and pushing enriched output here). The service authenticates as a trusted client via client_credentials; the JWT subject identifies it for the audit trail. No X-Dataspace-Identity header is required, and anti-spoofing on mediaWork.hasProvider is skipped: the internal service is trusted to set the provider DID correctly for whichever upstream it ingested from.

The scope claim is mandatory. A JWT issued by the SE Keycloak without either configured scope is rejected with 401.
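A sketch of the scope-based selection, assuming the scope claim is the usual space-delimited string that Keycloak emits. The AuthError type is hypothetical, and the precedence when a token carries both scopes is not specified on this page; this sketch arbitrarily prefers the stricter proxy sub-mode.

```python
class AuthError(Exception):
    """Hypothetical error type carrying the HTTP status to return."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status


def select_sub_mode(claims: dict) -> str:
    """Map the JWT scope claim to "proxy" or "internal", else 401."""
    scopes = set(str(claims.get("scope", "")).split())
    # Assumption: if both scopes are present, proxy wins.
    if "ingest:proxy" in scopes:
        return "proxy"
    if "ingest:internal" in scopes:
        return "internal"
    raise AuthError(401, "token carries neither ingest:proxy nor ingest:internal")
```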

Anti-spoofing invariant

Depending on the auth mode, the authenticated DID compared against mediaWork.hasProvider is sourced differently:

Auth mode               | DID source                     | Anti-spoofing check
EDR-native              | aud claim of the validated EDR | hasProvider == aud (403 on mismatch)
OAuth2 ingest:proxy     | X-Dataspace-Identity header    | hasProvider == X-Dataspace-Identity (403 on mismatch)
OAuth2 ingest:internal  | n/a (trusted internal service) | skipped (the service is trusted)

A mismatch in the non-internal modes is rejected with 403. In the proxy sub-mode, a missing X-Dataspace-Identity header is rejected with 400.
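The invariant can be expressed as one small check. This is a sketch under assumptions: mediaWork.hasProvider is treated as a plain DID string, the mode labels ("edr", "proxy", "internal") and the IngestionError type are illustrative names, and the comparison is an exact string match.

```python
class IngestionError(Exception):
    """Hypothetical error type carrying the HTTP status to return."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status


def check_provider(mode: str, has_provider: str, *, edr_aud=None, identity_header=None):
    """Enforce hasProvider == authenticated DID for the non-internal modes."""
    if mode == "internal":
        return  # trusted internal service: check skipped
    if mode == "edr":
        expected = edr_aud
    elif mode == "proxy":
        if not identity_header:
            raise IngestionError(400, "missing X-Dataspace-Identity header")
        expected = identity_header
    else:
        raise ValueError(f"unknown auth mode {mode!r}")
    if has_provider != expected:
        raise IngestionError(403, "mediaWork.hasProvider does not match caller DID")
```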

Enrichment invariant

The canonical article of the media work MUST carry hasEmbedding.hasVector with at least one vector. The canonical article is resolved from mediaWork.hasCanonicalVersion; when the payload does not declare one, the first entry in articles[] is treated as canonical. A payload whose canonical article has not been enriched is rejected with 400.

Other articles (translations, alternate versions) are accepted without their own embedding: the Source Explorer's downstream index keys on the canonical article's embedding to represent the whole work. Non-canonical articles may be sent for completeness and are not required to include hasEmbedding.
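A sketch of how the canonical article might be resolved and checked. Several details are assumptions, since the TemsCore schema is not reproduced on this page: hasCanonicalVersion is taken to hold an identifier matched against a hypothetical id field on each article, hasEmbedding is taken to be an object whose hasVector is a list, and the 400 for an empty or unmatched articles[] is a guess (only the unenriched case is documented).

```python
class IngestionError(Exception):
    """Hypothetical error type carrying the HTTP status to return."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status


def resolve_canonical(payload: dict) -> dict:
    """Pick the canonical article, falling back to the first entry."""
    articles = payload.get("articles") or []
    if not articles:
        raise IngestionError(400, "payload carries no articles")
    ref = (payload.get("mediaWork") or {}).get("hasCanonicalVersion")
    if ref is None:
        return articles[0]
    for article in articles:
        if article.get("id") == ref:  # "id" is an assumed field name
            return article
    raise IngestionError(400, "hasCanonicalVersion matches no article")


def require_enriched(payload: dict) -> dict:
    """Return the canonical article, or raise 400 if it has no vector."""
    article = resolve_canonical(payload)
    vectors = (article.get("hasEmbedding") or {}).get("hasVector") or []
    if len(vectors) < 1:
        raise IngestionError(400, "canonical article has no embedding vector")
    return article
```

Non-canonical articles are deliberately not inspected, matching the rule that only the canonical article must be enriched.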

Embeddings are produced by the local indexer, the EBU service that turns TemsCore articles into embedding vectors and writes them back under hasEmbedding.hasVector. It can be deployed on-premises by the provider, or consumed as a hosted (SaaS) instance. Run a payload through the local indexer before pushing it to this endpoint.

Request body shape

The endpoint accepts two equivalent body shapes:

  • Canonical — TemsCore at the root: { "mediaWork": {…}, "articles": […] }.
  • Wrapped alias — the same TemsCore object nested under an item key: { "item": { "mediaWork": {…}, "articles": […] } }. Tolerated for compatibility with callers that follow the legacy "PEACH /index" body convention (notably the EBU local indexer); the BFF unwraps and treats it as the canonical shape. Use the canonical shape for new integrations.
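The unwrapping can be sketched in a few lines. The function name is illustrative, and the rule used to detect the wrapped shape (an item object at the root with no root-level mediaWork) is an assumption about how the BFF tells the two apart.

```python
def unwrap_body(body: dict) -> dict:
    """Return the TemsCore object whether or not it is nested under "item"."""
    item = body.get("item")
    if isinstance(item, dict) and "mediaWork" not in body:
        return item  # legacy wrapped alias
    return body  # canonical shape


canonical = {"mediaWork": {}, "articles": []}
wrapped = {"item": {"mediaWork": {}, "articles": []}}
# Both shapes normalise to the same object.
same = unwrap_body(canonical) == unwrap_body(wrapped)
```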

Authentication

EDR-native. The bearer is an Endpoint Data Reference (EDR): a JWT signed by the Source Explorer's EDC connector and issued to a provider when it acquires the ingestion offering through the dataspace. It is not an API key and cannot be obtained out-of-band.

Two header shapes are accepted:

  • RFC 6750: Authorization: Bearer <edr-jwt>
  • EDC data plane convention: Authorization: <edr-jwt> (raw token)
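Accepting both header shapes could be handled as sketched here. The function name is illustrative; rejecting any other value that contains a space (for example a different auth scheme) is an assumption about the desired strictness.

```python
def extract_token(authorization: str) -> str:
    """Return the JWT from either "Bearer <jwt>" or a raw "<jwt>" header."""
    value = authorization.strip()
    scheme, _, rest = value.partition(" ")
    if scheme.lower() == "bearer":
        token = rest.strip()  # RFC 6750 shape
    else:
        token = value  # EDC data plane shape: raw token
    if not token or " " in token:
        raise ValueError("malformed Authorization header")
    return token
```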

Validation: signature against the Source Explorer JWKS, plus aud (the acquiring participant's DID), iss, and expiry. The DID used for the anti-spoofing check is read from the validated aud claim.

Security Scheme Type: http

HTTP Authorization Scheme: bearer

Bearer format: JWT

Contact

European Broadcasting Union — TEMS:

URL: https://tech.ebu.ch