Support for provenance in 3D Tiles #805

Open
@javagl

The term "provenance" here means what is sometimes also referred to as data lineage. Specifically, it covers approaches for keeping track of the origin of data and of the changes made to it, in a structured, machine-processable form.

Introduction

Many application areas of 3D Tiles involve planning and decision making. Examples are architecture, engineering, construction, and mission support. The decision-making processes here crucially depend on knowledge about the reliability of the data. This includes information about the origin of the data (e.g. whether it is a CAD application or a drone scan), possible preprocessing steps (like simplification or optimizations for visualization), or manual modifications (like annotations that have been added and stored as metadata). The concept of data provenance could therefore be applied to 3D Tiles on many different levels.

Scope

For now, this issue focuses on a very narrow subset of the data, namely metadata. On some level, there isn't even a clear technical distinction between "metadata" and "geometry" - namely, when metadata is represented in binary form with EXT_structural_metadata. So for now, the focus is specifically on the JSON-structured metadata. In this narrow context, the goal of provenance could be compared to that of a version control system - namely, to know which modifications have been applied to the metadata.

Goals

The goal of preserving provenance information in 3D Tiles in the given scope would be

  • knowing the original data (including information about its origin)
  • for each modification:
    • when was the modification?
    • who did the modification?
    • maybe 'metadata' (e.g. reasons for the modification - similar to a commit message)
    • what was the old state and what is the new state?
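To make the goals above more concrete, here is a minimal sketch of what a single provenance entry and a log of such entries could look like. All names (`ModificationRecord`, `ProvenanceLog`, `path`, and so on) are hypothetical and purely for illustration. They are not part of 3D Tiles or of any existing extension.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ModificationRecord:
    """One modification: when, who, why, and the old/new state (hypothetical)."""
    timestamp: datetime
    author: str
    path: str            # which metadata entry was touched (illustrative format)
    old_value: Any
    new_value: Any
    message: Optional[str] = None   # similar to a commit message


@dataclass
class ProvenanceLog:
    """The original data, its origin, and all modifications in order."""
    origin: str                          # e.g. "CAD application" or "drone scan"
    original_state: Dict[str, Any]
    modifications: List[ModificationRecord] = field(default_factory=list)

    def record(self, author: str, path: str, old_value: Any, new_value: Any,
               message: Optional[str] = None,
               timestamp: Optional[datetime] = None) -> ModificationRecord:
        """Append a new record, defaulting the timestamp to the current UTC time."""
        entry = ModificationRecord(
            timestamp=timestamp or datetime.now(timezone.utc),
            author=author,
            path=path,
            old_value=old_value,
            new_value=new_value,
            message=message,
        )
        self.modifications.append(entry)
        return entry
```

Whether `old_value` has to be stored explicitly, or can be derived from the previous state, is one of the open questions about representation and storage.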

Representation and Storage

The provenance information could either be tracked and stored externally, or be part of the tileset itself (as meta-metadata...). In both cases, one could very broadly categorize two representations:

  • Storing the new state (and deriving the difference by comparing the new state to the old state)
  • Storing the difference (and deriving the new state by applying the 'change' to the old state)

The best choice here will depend on the granularity and modification types (see below). Common 'event databases' store the initial state, and all modifications as a sequence of 'transactions/events' that modify the data. If the goal is to store the provenance information in the tileset itself, then the size of that data may grow considerably (particularly, when certain "bulk operations" are applied to binary metadata). Further options could be considered - like only storing the "initial" and the "latest" state, or only storing the "previous" and "current" state.
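The two representations, and the 'event database' idea, could be sketched roughly as follows. This assumes flat, JSON-like property dictionaries and a made-up change format (`{"set": ..., "removed": ...}`). The function names are hypothetical, and nested or binary metadata would need a more elaborate approach.

```python
import copy
from typing import Any, Dict, List

Change = Dict[str, Any]


def diff(old: Dict[str, Any], new: Dict[str, Any]) -> Change:
    """Derive the difference from a stored old state and a stored new state."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = sorted(k for k in old if k not in new)
    return {"set": changed, "removed": removed}


def apply_change(old: Dict[str, Any], change: Change) -> Dict[str, Any]:
    """Derive the new state by applying a stored difference to the old state."""
    result = copy.deepcopy(old)
    for key in change["removed"]:
        result.pop(key, None)
    result.update(copy.deepcopy(change["set"]))
    return result


def replay(initial: Dict[str, Any], changes: List[Change]) -> Dict[str, Any]:
    """Event-log style: rebuild the latest state from the initial state
    by applying all stored changes in order."""
    state = copy.deepcopy(initial)
    for change in changes:
        state = apply_change(state, change)
    return state
```

With this, `apply_change(old, diff(old, new))` yields `new` again, so either representation can be derived from the other as long as the old state is available. The trade-off is mainly one of storage size and of how expensive it is to reconstruct intermediate states.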

Granularity

A first differentiation could be the granularity level on which the modifications take place:

  • coarse-grained: on the level of tileset-, tile-, and content metadata
  • fine-grained: in tile content - i.e. in EXT_structural_metadata in glTF assets

Modification Types

Another differentiation could be whether only values are modified, or whether there can also be structural modifications.

  • values: Setting a new value for an existing metadata property (like tileset.metadata.properties["lastModificationDate"] = "2024-01-23")
  • structure: Adding a new property to an existing metadata class (like schema.classes["..."].properties.add("lastModificationDate", "STRING"))
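As a rough illustration of the two modification types, the following sketch operates on a tileset JSON that has been loaded into a plain dictionary. It assumes a 3D Tiles 1.1-style layout with `schema.classes` and `metadata.properties`. The helper functions and the shape of the returned change descriptions are hypothetical.

```python
from typing import Any, Dict


def set_value(tileset: Dict[str, Any], name: str, value: Any) -> Dict[str, Any]:
    """Value modification: set a tileset metadata property value and
    return a description of the change (old and new value)."""
    properties = tileset["metadata"]["properties"]
    old = properties.get(name)
    properties[name] = value
    return {"kind": "value", "property": name, "old": old, "new": value}


def add_property(tileset: Dict[str, Any], class_id: str, name: str,
                 type_name: str) -> Dict[str, Any]:
    """Structural modification: add a new property definition to an
    existing metadata class in the schema."""
    class_properties = tileset["schema"]["classes"][class_id].setdefault(
        "properties", {})
    if name in class_properties:
        raise ValueError(f"Property {name!r} already exists in {class_id!r}")
    definition = {"type": type_name}
    class_properties[name] = definition
    return {"kind": "structure", "class": class_id, "property": name,
            "definition": dict(definition)}
```

A structural change like `add_property` will usually have to happen before the corresponding value can be set, which suggests that the order of recorded modifications matters for any provenance representation.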

Maybe this issue can be used to

  • gather more specific requirements
  • think about approaches that could cover "as many of them as reasonably(!) possible"
  • identify the limits of these approaches
  • draft out possible implementations (for data producers and consumers)
