Dataspace challenges

Draft Community Group Report

More details about this document
This version:
https://w3c-cg.github.io/dataspaces/
Feedback:
public-dataspaces@w3.org with subject line “[Challenges] … message topic …” (archives)
Issue Tracking:
GitHub
Editor:
Pieter Colpaert

Abstract

The W3C community group on dataspaces maintains a list of challenges.

Status of this document

1. Introduction

This document lists the challenges the W3C Dataspaces Community Group wants to see discussed.

Note: Although “data spaces” is often written as two words, we write “dataspaces” as one word, by analogy with “database”. We position a dataspace as a concept that is similar to a database, yet distinct and equally important.

Several definitions of dataspaces exist. This community group has not (yet) put forward its own. It acknowledges, however, that their common denominator is data sharing among multiple participants in the dataspace.

Data sharing across multiple participants requires an interdisciplinary approach: it is as much about governance, and about business models for the actors that support such ecosystems, as it is about technical aspects.

Dataspaces bring a renewed interest in topics such as interoperability, trust and ecosystem governance.

2. Challenges

Each challenge below is introduced with a short description.

The challenges are revisited every year at the Semantics in Dataspaces (SDS) workshop.

2.1. Federating Vocabulary Hubs

The challenge calls for a (federated) vocabulary hub architecture so multiple participants can publish and govern vocabularies without a single centralized hub. The goal is sovereignty, scalability, and resilience while enabling semantic interoperability using standards like Linked Data and SPARQL federation.
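As an illustration only, the sketch below shows how a client could ask several independently governed vocabulary hubs for a term in one SPARQL 1.1 federated query. The function name, the hub endpoint URLs, and the choice to match on rdfs:label are assumptions made for this example; they are not part of the challenge text.

```python
from typing import List

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_federated_query(label: str, hub_endpoints: List[str]) -> str:
    """Return one SPARQL 1.1 query that asks every hub for terms with `label`."""
    if not hub_endpoints:
        raise ValueError("at least one vocabulary hub endpoint is required")
    # Escape backslashes and quotes so the label is a valid SPARQL string literal.
    literal = label.replace("\\", "\\\\").replace('"', '\\"')
    blocks = []
    for endpoint in hub_endpoints:
        # SERVICE SILENT lets the query continue when one hub is unreachable.
        blocks.append(
            "  {\n"
            f"    SERVICE SILENT <{endpoint}> {{\n"
            "      ?term rdfs:label ?label .\n"
            f'      FILTER(STR(?label) = "{literal}")\n'
            f"      BIND(<{endpoint}> AS ?hub)\n"
            "    }\n"
            "  }"
        )
    body = "\n  UNION\n".join(blocks)
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT ?hub ?term ?label WHERE {\n"
        f"{body}\n"
        "}"
    )


# Hypothetical hub endpoints, used only to show the shape of the query.
HUBS = ["https://hub-a.example/sparql", "https://hub-b.example/sparql"]
print(build_federated_query("Temperature", HUBS))
```

Each hub stays the authority for its own vocabulary; the client only needs the list of endpoints, which is itself something a federation would have to publish and govern.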

Because not every participant will use the same vocabulary from day one, alignments between different vocabularies are also needed.
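One possible reading of such alignments is a set of equivalence links between terms, in the spirit of skos:exactMatch. The following minimal sketch assumes that alignments are given as pairs of term IRIs and that equivalence is symmetric and transitive; the vocabulary namespaces and function names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Optional, Set, Tuple


def build_index(exact_matches: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Index alignment pairs in both directions (equivalence is symmetric)."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for left, right in exact_matches:
        index[left].add(right)
        index[right].add(left)
    return index


def translate(term: str, target_namespace: str,
              index: Dict[str, Set[str]]) -> Optional[str]:
    """Follow alignment links breadth-first until a term in the target vocabulary is found."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current.startswith(target_namespace):
            return current
        for neighbour in sorted(index.get(current, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return None  # No alignment path leads to the target vocabulary.


# Hypothetical alignments between three vocabularies.
ALIGNMENTS = [
    ("https://vocab-a.example/Temperature", "https://vocab-b.example/temp"),
    ("https://vocab-b.example/temp", "https://vocab-c.example/airTemperature"),
]
INDEX = build_index(ALIGNMENTS)
```

Here a term from vocabulary A can be translated to vocabulary C through B, even though nobody published a direct A-to-C alignment.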

Related SDS papers

2.2. Blueprints for actors, roles and responsibilities

This challenge asks for conceptual mapping between dataspace standards and their substitutes or complements (including technologies outside the dataspace ecosystem). The goal is to improve interoperability, avoid repeated mistakes, and identify gaps via peer-reviewed analysis and consensus.

Discussion points to the DSSC Blueprint and toolbox as existing conceptual clarifications, and references an upcoming report comparing FIWARE, EDC, and SIMPL.

Related SDS papers

2.3. End-User Confidence in Data Integrity

This challenge focuses on integrity guarantees for data stored by third parties: checksums for individual objects and collections, stable across migrations between dataspaces. It emphasizes preventing tampering or data loss for end users.

Discussion asks whether this targets Linked Web Storage, how to define a "package" (HTTP response vs. triples/quads), and suggests applying integrity checks to metadata as well as data in a distributed setting.
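To make the idea of migration-stable checksums concrete, here is a minimal sketch that assumes a "package" is a set of named byte strings. The collection digest depends only on relative names and content, not on the storage URL or listing order, so it should stay the same when the collection moves to another dataspace. SHA-256 and the function names are choices made for this example; for triples or quads, a canonical serialization (for instance via RDF Dataset Canonicalization) would be needed first, which this sketch does not attempt.

```python
import hashlib
from typing import Dict


def object_digest(content: bytes) -> str:
    """Checksum of one object, computed over its bytes only."""
    return hashlib.sha256(content).hexdigest()


def collection_digest(objects: Dict[str, bytes]) -> str:
    """Checksum of a collection, independent of storage location and listing order."""
    hasher = hashlib.sha256()
    for name in sorted(objects):  # Sorting makes the result order-independent.
        hasher.update(name.encode("utf-8"))
        hasher.update(b"\0")  # Separator so name and digest cannot run together.
        hasher.update(object_digest(objects[name]).encode("ascii"))
        hasher.update(b"\n")
    return hasher.hexdigest()


def verify(objects: Dict[str, bytes], expected: str) -> bool:
    """Return True when the collection still matches the recorded checksum."""
    return collection_digest(objects) == expected
```

An end user could record the collection digest before a migration and call verify afterwards; a changed or missing object yields a different digest.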

Related SDS papers

2.4. Interoperable Policy Engines

The challenge calls for a formal, interoperable evaluation model for ODRL policies so usage control decisions are consistent across dataspaces. It requests a reference implementation, documentation, and a shared test suite of policies, requests, and expected results.
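The shape of such a shared test suite can be sketched as triples of policy, request, and expected result, run against any candidate evaluator. The evaluator below is deliberately simplistic and is not the ODRL evaluation model: it ignores constraints, duties, and world state, and it assumes that a matching prohibition overrides any permission. The dictionary layout and all identifiers are hypothetical.

```python
from typing import Callable, Dict, List


def rule_applies(rule: Dict, request: Dict) -> bool:
    """A rule applies when action, target, and (if given) assignee all match."""
    same_action = rule["action"] == request["action"]
    same_target = rule["target"] == request["target"]
    # A rule without an assignee is treated as applying to every requester.
    same_party = rule.get("assignee") in (None, request["assignee"])
    return same_action and same_target and same_party


def evaluate(policy: Dict, request: Dict) -> str:
    """Toy evaluator: prohibitions win over permissions (an assumption)."""
    if any(rule_applies(r, request) for r in policy.get("prohibition", [])):
        return "deny"
    if any(rule_applies(r, request) for r in policy.get("permission", [])):
        return "permit"
    return "not-applicable"


POLICY = {
    "permission": [{"target": "urn:example:dataset1", "action": "use"}],
    "prohibition": [{"target": "urn:example:dataset1", "action": "use",
                     "assignee": "urn:example:bob"}],
}

TEST_SUITE = [
    {"name": "alice-may-use", "policy": POLICY, "expected": "permit",
     "request": {"assignee": "urn:example:alice", "action": "use",
                 "target": "urn:example:dataset1"}},
    {"name": "bob-is-prohibited", "policy": POLICY, "expected": "deny",
     "request": {"assignee": "urn:example:bob", "action": "use",
                 "target": "urn:example:dataset1"}},
    {"name": "no-rule-for-distribute", "policy": POLICY, "expected": "not-applicable",
     "request": {"assignee": "urn:example:alice", "action": "distribute",
                 "target": "urn:example:dataset1"}},
]


def run_suite(suite: List[Dict], evaluator: Callable[[Dict, Dict], str]) -> List[str]:
    """Return the names of the test cases on which the evaluator disagrees."""
    return [case["name"] for case in suite
            if evaluator(case["policy"], case["request"]) != case["expected"]]
```

Two policy engines are consistent on this suite when run_suite returns an empty list for both; the suite, not any single engine, is what the participants would need to agree on.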

Discussion notes the need to model requests and world state (possibly via SHACL profiles), points to EDC policy engine work and DSSC enforcement frameworks, and highlights an ESWC25 paper providing a compliance report model, evaluator, test suite, and the FORCE demo application.

Related SDS papers

2.5. Data Discovery

This challenge targets automated discovery of interoperable and trustworthy datasets over time. It proposes a criteria language (schema, provenance, geo-temporal scope, usage conditions), a compatible catalog data model, and an evaluation algorithm with a reference implementation.
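The three proposed parts (criteria, catalog data model, evaluation algorithm) can be illustrated with the rough sketch below. The field names are assumptions loosely inspired by common catalog metadata, provenance is reduced to a list of trusted publishers, and usage conditions are reduced to a license IRI; a real criteria language would need to be considerably richer.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass(frozen=True)
class CatalogEntry:
    identifier: str
    conforms_to: FrozenSet[str]  # schema or vocabulary IRIs
    publisher: str               # stand-in for provenance
    start: date
    end: date
    bbox: BBox
    license: str                 # stand-in for usage conditions


@dataclass(frozen=True)
class Criteria:
    schema: str
    trusted_publishers: FrozenSet[str]
    start: date
    end: date
    bbox: BBox
    accepted_licenses: FrozenSet[str]


def periods_overlap(a_start: date, a_end: date, b_start: date, b_end: date) -> bool:
    return a_start <= b_end and b_start <= a_end


def boxes_intersect(a: BBox, b: BBox) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches(entry: CatalogEntry, criteria: Criteria) -> bool:
    """An entry matches only when every criterion is satisfied."""
    return (
        criteria.schema in entry.conforms_to
        and entry.publisher in criteria.trusted_publishers
        and periods_overlap(entry.start, entry.end, criteria.start, criteria.end)
        and boxes_intersect(entry.bbox, criteria.bbox)
        and entry.license in criteria.accepted_licenses
    )


def discover(catalog: Iterable[CatalogEntry], criteria: Criteria) -> List[str]:
    """Return the identifiers of all catalog entries that satisfy the criteria."""
    return [entry.identifier for entry in catalog if matches(entry, criteria)]
```

Because the criteria are plain data, a consumer could store them and re-run discover whenever a catalog changes, which is one way to read "discovery over time".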

Discussion cites VoID, DCAT-AP, DSSC building blocks, and RDF Data Cube for metadata, debates whether federated querying is feasible given "data at source" policies, and argues that richer metadata (e.g., CSVW-based structural descriptions) is required beyond linking to vocabularies. Examples and public SPARQL endpoints are suggested to show feasibility.

Related SDS papers

2.6. Pipelining Workflows Across Participants

This challenge argues that dataspaces duplicate processing pipelines and could benefit from reuse of derived datasets or shared processing services. It proposes expressing desired outcomes, mapping them to processing plans, and allowing participants to advertise which plan steps they can perform, plus a techno-economic analysis of value-adding roles.
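As a minimal sketch of this proposal, a desired outcome can be mapped to an ordered processing plan, and each plan step can be matched against the steps that participants advertise. The outcome name, step names, and participant identifiers below are hypothetical, and the sketch ignores pricing, policies, and trust, which the techno-economic analysis would have to cover.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mapping from a desired outcome to an ordered processing plan.
PLANS: Dict[str, List[str]] = {
    "hourly-air-quality-map": ["ingest", "clean", "aggregate", "render"],
}

# Hypothetical advertisements: which plan steps each participant can perform.
OFFERS: Dict[str, Set[str]] = {
    "urn:example:sensor-operator": {"ingest"},
    "urn:example:analytics-provider": {"clean", "aggregate"},
    "urn:example:city-portal": {"aggregate"},
}


def assign_steps(plan: List[str],
                 offers: Dict[str, Set[str]]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Match each plan step to the participants offering it; report uncovered steps."""
    assignment: Dict[str, List[str]] = {}
    uncovered: List[str] = []
    for step in plan:
        providers = sorted(p for p, steps in offers.items() if step in steps)
        if providers:
            assignment[step] = providers
        else:
            uncovered.append(step)
    return assignment, uncovered


assignment, uncovered = assign_steps(PLANS["hourly-air-quality-map"], OFFERS)
```

In this example the "render" step is offered by nobody, so it is reported as uncovered, while "aggregate" has two candidate providers and therefore a point where derived datasets or a shared service could be reused.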

Earlier discussions stressed that “intermediaries” are better seen as participants: pipelines are business-layer concerns, while dataspace protocols remain peer-to-peer. They also highlighted the role of trust anchors, policy negotiation, and provenance models, and suggested aligning with Gaia-X service offerings and adopting clearer layer/plane models that separate business from technical concerns.

From a higher-level perspective, however, the Data Governance Act (as well as the Digital Omnibus draft) describes an intermediary as a company that acts as a trusted party bringing data from source to consumer.

Related SDS papers

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119