cc @piet
@hoodownr This is exactly the problem that tangles solve.
In the most trivial implementation, each message would refer to the newest available entry of all known logs. That includes a lot of redundant information though, you might want to go for the transitive reduction: Each entry should have at most one incoming reference from among all messages you currently know about.
This is the scheme that gives you the best possible ordering information with the least amount of references. This might still be problematic if there is a large number of updates, so you might want to put put a cap on the maximal number of tangle-tip references per message. If there have been more updates than the cap, you'd need some tiebreaker to decide which ones to reference. This means that the partial order induces formed by the tangle is not the tightest one possible, but you can probably approximate it well enough to be useful. Tiebreakers could be metrics (e.g. pointing to the entries such that the largest number of nodes becomes part of the tangle, or minimizing the number of non-tangle nodes per log), random choices, or any combinations thereof.
When you want to order stuff, you'd compute a topological order on the tangle, with a tiebreaker of your choice. Doing this dynamically) can become complicated (and is a longstanding open research problem), but at least batch topsort is simple.
@hoodownr Very quickly:
- yup, tangle backlinks would be in the payloads
- vector clock: technically not the same, but they are pretty similar. In some sense sense, if you made vector clocks sparse and transitive, you'd get tangles.
- backlinks for the tangle should include a hash of the target's metadata, to guarantee that the resulting graphs are acyclic. Without the hash, you could point the tangle to an entry that doesn't exist yet and later create that entry pointing to the first one.
- pseudocode topsort: https://en.wikipedia.org/wiki/Topological_sorting#Algorithms
- getting stuff done within two days: don't spend too much time on heuristics for which feeds to include in the tangle if there are too many, just pick a random subset of the appropriate size. If you really need to save time, don't compute where tangle backlinks are needed, just point to the newest entries of randomly selected logs (this will have a noticeable effect on the quality of the sorting though).
- cats > computers