
5. Chronology through trusted attestation

Date: 2025-02-27

Status

Accepted

Context

A key challenge we face when merging state is providing a shared sense of chronological ordering. Depending upon which event occurred first, the resultant merged state may differ.

Because clocks drift across machines and synchronising them is fraught with issues, capturing a local timestamp is not enough.

One solution is to use trusted timestamp attestation servers as per RFC 3161, relying upon trusted remote machines to provide the timestamps. We can run these servers ourselves, or rely upon third-party servers such as FreeTSA.

An alternative is matrix clocks, which can provide a logical sense of chronological ordering.

Decision

Relying upon third-party attestation servers is undesirable, and running a fleet of those servers ourselves introduces another component within the stack and increases deployment complexity.

Matrix clocks, on the other hand, would make storage and replication significantly more complex to implement and reason about.

In the end, we have decided on an approach similar to timestamp attestation servers, but built into Data Mesher itself.

One or more instances within a Data Mesher cluster will be designated as trusted. After each merge operation, each instance within the cluster will append an Ed25519-based signature to relevant sections of its new local state.

Only state signed and verified as coming from a trusted instance is to be used when determining the chronological ordering of events.

Unverified state should be ignored and reported.

Where more than one trusted signature is available, the earliest one (the entry with the earliest signed_at value) is to be used to determine chronological ordering.

The signatures in question will look something like the following, with each entry keyed by an Ed25519 public key:

{
    "foo": "bar",
    "fizz": 1,
    "signatures": {
        "4/rs1S8OqrcB1m6OTyQ/9L5W9mJ7hmknqnQ+ZQVHmS4=": {
            "signed_at": "2025-02-18T17:01:27.88Z",
            "signature": "hqyEMpT+oQEysX1eJ2iKAznzvmAGoZc5cfDTKB3NKyr8yKtGV87ZKNsBMYnq0LEGKDzUjcnOKGFoEXttEab4Dw=="
        },
        "uD6UZzmMqr7spmAYTO9UI6qCCfsHClFjMGRqxLN6Fjs=": {
            "signed_at": "2025-02-18T17:01:30.489Z",
            "signature": "QHPLEMK3S5Hl0gYvV7QffY+R2Kg9wAIcBbEbcJMVArdvM+nWpu3bC42jQSmmcO+zHtaOKWEgUYT2Why2MZ/WCQ=="
        }
    }
}
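
As an illustration of the "earliest trusted signature wins" rule, here is a minimal Python sketch. It assumes the signatures have already been verified, that signed_at is always a UTC timestamp ending in Z, and that trusted instances are identified by a set of base64 public keys. The function names parse_signed_at and earliest_trusted_signature are hypothetical and not part of Data Mesher.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Set, Tuple


def parse_signed_at(value: str) -> datetime:
    """Parse a UTC timestamp such as '2025-02-18T17:01:27.88Z'."""
    if not value.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    base, _, fraction = value[:-1].partition(".")
    # Pad or truncate the fractional part to microsecond precision.
    micros = int((fraction + "000000")[:6]) if fraction else 0
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


def earliest_trusted_signature(
    signatures: Dict[str, dict], trusted_keys: Set[str]
) -> Optional[Tuple[str, datetime]]:
    """Return (public_key, signed_at) for the earliest trusted entry, or None.

    Assumption: every entry passed in has already been cryptographically verified.
    """
    candidates = [
        (parse_signed_at(entry["signed_at"]), key)
        for key, entry in signatures.items()
        if key in trusted_keys
    ]
    if not candidates:
        return None
    signed_at, key = min(candidates)
    return key, signed_at


# Usage with the two entries from the example above (signature values omitted).
signatures = {
    "4/rs1S8OqrcB1m6OTyQ/9L5W9mJ7hmknqnQ+ZQVHmS4=": {"signed_at": "2025-02-18T17:01:27.88Z"},
    "uD6UZzmMqr7spmAYTO9UI6qCCfsHClFjMGRqxLN6Fjs=": {"signed_at": "2025-02-18T17:01:30.489Z"},
}
winner = earliest_trusted_signature(signatures, set(signatures))
# winner[0] is the "4/rs1S8O..." key, since 17:01:27.88 precedes 17:01:30.489.
```

If no entry belongs to a trusted key, the function returns None, which matches the rule that only trusted state takes part in ordering.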

Signing and verification should be performed on the payload without the signatures field, e.g.:

{
    "foo": "bar",
    "fizz": 1
}
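
Deriving the bytes to sign can be sketched in Python as follows. The record does not specify a serialisation, so the canonical form used here (sorted keys, compact separators, UTF-8) is purely an assumption for illustration, and signing_payload is a hypothetical helper name. The resulting bytes would then be passed to whichever Ed25519 library performs the signing or verification.

```python
import json


def signing_payload(document: dict) -> bytes:
    """Return the bytes to sign or verify: the document minus 'signatures'."""
    unsigned = {k: v for k, v in document.items() if k != "signatures"}
    # Assumption: sorted keys and compact separators give a deterministic
    # encoding, so signer and verifier produce identical bytes.
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


document = {
    "foo": "bar",
    "fizz": 1,
    "signatures": {},  # contents are irrelevant; the field is dropped
}
payload = signing_payload(document)
# payload == b'{"fizz":1,"foo":"bar"}'
```

Whatever serialisation is chosen, it needs to be deterministic: signer and verifier must derive byte-identical payloads, otherwise valid signatures will fail to verify.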

Consequences

This decision weakens the decentralised properties of Data Mesher by introducing trusted nodes.

If the trusted instances were to become compromised, or their private keys obtained, a bad actor could inject malicious data.

In addition, this design choice introduces a propagation delay, as state from untrusted instances needs to merge through trusted instances before other nodes in the cluster will respect the changes.

If no trusted instances are reachable within the cluster, no state changes can be made until they are restored, reducing the cluster to a read-only mode.

That being said, Data Mesher is intended to track hostnames and other types of application data that are updated infrequently and have relatively long times to live.