@andrestaltz %hEOaREi6NZJzJLYZn1WrKFJOWXS3CmwoR/Mk6FRCrwM=.sha256
Re: %HPMQEUbUL

The other article I want to write is all about massively collaborative open source. Here's the bullet list draft of it:

  • The GitHub model of collaboration is now engraved in the minds of many programmers
  • It's not the only model, though. The Linux kernel is still coordinated through emails
  • Even pull requests, "patches", are sent via plain text emails
  • Although email is arguably a less attractive technology, it is a free cyberspace
  • As a result, the Linux kernel project is entirely unaffected by the GitHub acquisition
  • However, interestingly, it's hard to say whether the Linux kernel is a free cyberspace, because it has a military-style chain of command with one general: Linus Torvalds
  • Either way, it's an interesting and important model of collaboration, and one of the oldest git repos as well.
  • GH model:
    • repo: contains git, issues/PRs stack, releases
    • maintainer(s) of the repo
    • requesters: create issues and PRs. I'll call these simply "requests"
  • The maintainer has 3 roles:
    • author commits in the git history
    • manage (as in workplace "Manager" type of manage) the backlog of requests
    • publish releases (which consist of choosing a name for the project, sending to npm/maven/etc, managing changelog and version names)
  • Let's scrutinize these 3 different hats/roles: Authoring, Requests management, Publishing and versioning
  • Authoring
    • This is basically when the maintainer is writing git commits
    • There are two types of commits: Convergence and Extension
    • Convergence is merge commits or anything that incorporates other branches/forks
    • Extension is every other commit. We're extending the history of the codebase
    • Extension is not necessarily a divergence, but can be
    • (Every divergence/disagreement began as a naive extension)
    • (Divergence/disagreement is only between people, not between commits)
    • In GH model, convergence is usually through pull requests
    • Convergence could be different though
    • One example: I published and versioned my own fork of react-native-workers, but never sent a PR. Later garrettm found the fork and merged it into his repo https://github.com/garrettm/react-native-workers/commit/97dbe7917c2b079ec2bdd08625067910a5c0115c
    • This hints at a different social dynamic for convergence
    • It also hints at splitting the Author role into two roles: Extender and Converger
    • Of course it will often be the case that Author = Extender+Converger
    • But open source collaboration is always voluntary, never an obligation
    • It's a service in the original sense of the word: altruistic, selfless
    • Some people may volunteer for extension, others may volunteer for convergence
  • Request management:
    • Bug reports, issues, pull requests are sent to an inbox
    • The inbox is a hierarchical cyberspace
    • The inbox is a community
    • GitHub Inc. sits at the (invisible) top of the hierarchy, with repo maintainer(s) on the second tier
    • This hierarchical power can be used for unfair decisions in the community
    • Email (as in mailing lists in the Linux kernel) is one alternative for community communication
    • Other free cyberspaces could be explored for community discussions
    • Requests are only "requests" because of hierarchy
    • In a free cyberspace, issues, bug reports, PRs, would be just public reports
    • No one should be obliged to work on any of those
    • Extension and convergence work would be volunteered to address those public reports
  • Publishing and versioning
    • In the GH model, the maintainer is also expected to run npm publish or something equivalent
    • This involves three responsibilities: distribution, naming, versioning
    • Distribution is about choosing a package repository: npm or ssb-npm-registry, etc
    • Distribution has typically been location-addressed
    • Location-addressing, e.g. https://registry.npmjs.org, encodes hierarchy and so implies a hierarchical cyberspace
    • Distribution could be content-addressed, e.g. Dat or IPFS, so the package is in a free cyberspace
    • "Distributors" are people taking the voluntary responsibility of seeding the package
    • Naming is about giving an alias for the package, usable by the community to discuss public reports (issues, bug reports, PRs)
    • Naming is often just a human-friendly alias for a vague idea
    • Naming is hard; it is often a game of digging through the semantics of a word and how well it matches a concept
    • Naming is actually Onomatology and a bit of Ontology
    • Versioning is used to reduce vagueness
    • E.g. react is a vague idea, react@16.3.0 is a specific package
    • Versioning has been mostly monotonic x.y.z numbers
    • One problem with monotonic versioning in a free cyberspace is that it encodes only one linear history
    • Versioning in a free cyberspace cannot be a monotonic code
    • Another problem, unrelated to freedom in cyberspace, is using monotonic versioning for communication
    • SemVer has failed at its own objectives
    • Quote: "What you can do is let Semantic Versioning provide you with a sane way to release and upgrade packages without having to roll new versions of dependent packages". This is a lie.
    • ComVer: https://github.com/staltz/comver is about admitting that versioning is mostly about compatibility
    • Spec-ulation talk by Rich Hickey
    • In that talk, Hickey talks about the contract of a package: what it requires (preconditions) and what it provides (postconditions)
    • "Change" is therefore the evolution of that contract
    • Compatibility is maintained when the preconditions are made more general OR the postconditions are made more specific
    • A contract X is said to be "compatible" with contract Y if X is derived from Y in this way
    • Versioning in a free cyberspace should be a way of encoding a directed acyclic graph (DAG) of library contracts
    • One naive approach is to formally write the contract, create a cryptohash of it, and then forks (extensions) of a project can then refer to that hash
    • This approach is imperfect, though, because compatible contracts would have different hashes
    • Maybe there is a crypto approach here where compatible contracts would yield the same identifier
    • Maybe formally writing the contract of a library is hopeless, maybe contracts will always be informal
    • Example: part of the contract can be encoded as a specification, part as unit tests, part as property-based tests (generative, in the style of Haskell's QuickCheck), and part through formal verification. That last option tends towards mathematical proofs of algorithms, which is hard
    • Conclusion: contracts are hard to enforce
    • Versioning will always (?) be soft guarantees of compatibility
    • But at least it seems clear now that versioning in a free cyberspace would not be monotonic x.y.z versioning, but instead it would (somehow!) communicate compatible contracts between package X and package Y
  • Conclusion: in a free cyberspace, the social dynamics of open source collaboration would be voluntary with ephemeral roles, not permanent titles
  • Roles: git Extenders, git Convergers, Reporters, Community Gardeners, Distributors, Namers (Onomatologists), and Contract Compatibility Analysts.
  • We need to figure out a way of doing all of that without fixed hierarchies
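
To make the contract idea from the versioning bullets a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Contract class, the contract_id and is_compatible functions, and the idea of modelling a contract as two sets of plain-text labels are all hypothetical, not an existing tool or spec. It only shows the naive approach (hash the contract) and why it is imperfect: a fork can be compatible with the original and still get a different hash.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical toy model: a contract is two sets of plain-text labels."""
    requires: frozenset  # preconditions the user of the package must satisfy
    provides: frozenset  # postconditions the package guarantees


def contract_id(contract: Contract) -> str:
    """Content-address a contract: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(
        {
            "requires": sorted(contract.requires),
            "provides": sorted(contract.provides),
        },
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_compatible(new: Contract, old: Contract) -> bool:
    """True if `new` can stand in for `old`: it requires no more
    (preconditions at least as general) and provides no less
    (postconditions at least as specific)."""
    return new.requires <= old.requires and new.provides >= old.provides


# Original package contract (labels are made up for the example).
v1 = Contract(
    requires=frozenset({"node>=8", "input is a string"}),
    provides=frozenset({"parse()"}),
)

# A fork that drops one requirement and adds one guarantee.
fork = Contract(
    requires=frozenset({"input is a string"}),
    provides=frozenset({"parse()", "stringify()"}),
)

print(is_compatible(fork, v1))               # True: fork can replace v1
print(is_compatible(v1, fork))               # False: v1 cannot replace fork
print(contract_id(fork) == contract_id(v1))  # False: compatible, yet different hashes
```

In this toy model compatibility is a partial order over contracts, which is the kind of structure a DAG of contracts would encode. The hash alone carries none of that ordering, so a real scheme would need something more than a plain cryptohash.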