You are reading content from Scuttlebutt
@cel %VG8tIpqIInWAxbBlFXdrk9h+bOacGcRIH5CSc0ue8yI=.sha256

Request for sync-up on markdown newline rendering

In ssb-markdown v4.0.0 the newline rendering was changed, from GFM-style to CommonMark-style. Before v4.0.0, ssb-markdown used ssb-marked with the breaks: true option to use GFM-style linebreak-rendering. "Commonmark needs two spaces or a <br> to create a new line, GFM does newlines when you do a newline" [%glPwQU6...]. This is a breaking change in post rendering. I was reminded of this today when I saw %vZwqyoj... where it appears to me as a shell icon on each line but I think maybe they were intended to be on one line.
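
To make the difference concrete, here is a toy sketch of how a single paragraph comes out under each style. It is not the code of ssb-marked or ssb-markdown; `renderParagraph` is a hypothetical helper, and its `breaks` option only mirrors the name of the ssb-marked option mentioned above.

```javascript
// Toy illustration only: real Markdown renderers handle far more than this.
// `renderParagraph` is a hypothetical helper, not part of any SSB library.
function renderParagraph(text, opts) {
  const breaks = Boolean(opts && opts.breaks);
  const lines = text.split('\n');
  const out = [];
  lines.forEach((line, i) => {
    const isLast = i === lines.length - 1;
    // CommonMark hard break: two or more trailing spaces, or a trailing backslash.
    const hard = /( {2,}|\\)$/.test(line);
    out.push(line.replace(/( {2,}|\\)$/, '').trim());
    // With breaks on, every single newline becomes <br>; otherwise only hard breaks do.
    if (!isLast) out.push(breaks || hard ? '<br>\n' : '\n');
  });
  return '<p>' + out.join('') + '</p>';
}

const post = 'first line\nsecond line';
console.log(renderParagraph(post, { breaks: true }));  // a <br> between the two lines
console.log(renderParagraph(post, { breaks: false })); // one flowing paragraph, no <br>
```

The same post text therefore yields two visibly different results depending only on the renderer setting, which is the ambiguity described here.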

@christianbundy considered the old behavior a bug [%v6Cj1y1...], but I don't think it is a bug. Either way, I think it is safe to say that most of the hundreds of thousands of posts I have replicated so far were authored with the assumption they would be rendered that way. Changing it means that old messages are not rendered as they were intended. It also means that messages composed with ssb-markdown v4.0.0 are not rendered as they were intended by old or existing clients. Right now there is no straightforward way to render posts correctly, because to do so means you have to know which style of newline rendering they should use, and that can only be guessed at.

Previously I wrote: "Part of the value of the network I think is the corpus of message content, the data-lake, so to deprecate large amounts of existing content I would consider a shame" [%mTboZT8...]. I still think this. I don't think SSB should be a place where we look back on old content and think, oh, it's broken because it's old. SSB messages are immutable: they are supposed to stand the test of time. Things in this world are not permanent, but we can still try, and respect our attempts by treating historical content well and being mindful of breaking compatibility.

Current state of the clients: Patchbay uses ssb-markdown v4.0.0, and Patchwork uses it but not yet in the latest release. patchfoo and ssb-viewer use ssb-marked. git-ssb-web uses ssb-marked, patched [%Z2dr7bn...]. Decent uses ssb-markdown@^3.6.0. As for go-sbot, patchless, ngx-ssb-client or others, I don't know. But this is potentially an issue of schema, not just clients.

I would like to request that ssb-markdown revert the line break behavior to GFM-style, in which a newline makes a newline (usually), since that is what most of the content on the network has been composed to assume.

If that is not acceptable, may I ask that ssb-markdown accept a flag to set which style of newline rendering to use. (ssb-marked has such an option, breaks, which is set to true in patchfoo, ssb-viewer, git-ssb-web and ssb-markdown ≤ v3.6.0.) Then I would also ask Patchbay and Patchwork, if they still want to use CommonMark-style linebreak-rendering, to set a flag on posts they publish to indicate they are composed with that linebreak setting, and then render posts according to that flag, which patchfoo and ssb-viewer would then also do.

Either of those options would allow us to get back to more unified message rendering, except for posts made recently in Patchbay and patchwork@master. Without such changes, we are stuck in the status quo, which is inconsistent message rendering and what I think are unnecessary trade-offs between prioritizing old vs. new messages and users of different ssb clients/apps.

cc: #patchbay-dev #patchwork-dev #ssb-markdown @mix @mmckegg @christianbundy @arj @cryptix @ev

@mix %o9Hq24WNW9JfJRelyb3m3Tik9DfAuFfkpLG+/h9fR6M=.sha256

I don't have a strong feeling about this either way.

I appreciate and understand the perspective you're taking. I also appreciate the work that @christianbundy did upgrading our markdown setup.

Thanks for starting this conversation @cel

User has chosen not to be hosted publicly
@kas %EyOz/ZoWg1RQV+XAEe6fdQALJEK99S4toHleBTlmCqo=.sha256
Voted **Request for sync-up on markdown newline rendering** In [ssb-markdown](ht
@andrestaltz %esBOb8ji1K3lIDMCfSPdCfbvSrUz8ujTGpL0KZN9+nk=.sha256

One client you forgot to mention is Manyverse, and there I use remark with a couple of plugins:

  • remark-linkify-regex to linkify @ feeds and % msgs
  • remark-images-to-ssb-serve-blobs to linkify & blobs
  • remark-gemoji-to-emoji
  • mdast-normalize-react-native to fix the markdown for some react native quirks
@ev %pBdarpoG913Nb9HEw7XdBJd/pAvmzr1Fem1WOM65E7k=.sha256

@cel I agree that we should keep newline rendering the same, if possible.

@Christian Bundy %8fnvgls5yEtz8Uzoj7jMfDynPwW0KL1FTEHoQ0AtNzs=.sha256

Hey @cel, thanks for pinging me on this and bringing up this regression. It's a bummer that Markdown has suffered from fragmentation, but you all make a great point that we should support the author's intention whenever possible.

Do we have any sort of count for how many messages were written in Markdown versus GFM? For example, I've always been under the impression that our messages were written in Markdown, so that's what I've written. The Daring Fireball spec is very specific about newlines, so I've followed that and ignored the message preview, but I understand that others have probably treated the message preview as canonical.

I feel a bit down that my posts written in Markdown will go back to being rendered in GFM, but I think it's best to cater to folks who have used the message preview as canon. My intuition is that anyone geeky enough to ignore the message preview and think "that's a bug" is probably technical enough to [eventually] edit their old posts to specify that they aren't GFM, and I may be the only one.

I'm not completely clear which parts of this should be handled in ssb-markdown and which should be handled in the client, but I think the ideal path forward could look something like:

  • Switch back to GitHub-flavored newlines for posts that don't specify a media type.
  • Agree on how to specify media types and variants (see RFC 7764 identifiers)
  • Add Markdown metadata when posting Markdown content (e.g. { type: 'text/markdown', variant: 'GFM' })
  • Support both GFM and CommonMark variants in ssb-markdown.

Currently we're only passing the message content to ssb-markdown, so the "which Markdown renderer do I use" would have to be handled by the client, which isn't easy or fun for anyone. I hate to say it, but at this point it seems like it would be simpler to just take user input as Markdown, render to HTML, and post that to the feed rather than assume that the clients can handle Markdown variants like GFM.

Another less-complex alternative would be to drop Markdown/CommonMark support and just use GFM across the board, but I'll admit that I'm not super excited to drop the standard in favor of GitHub's implementation. Anyway, what do you all think?

@cel %EaBrLmds+JpF4VmyETpqBsKw4fBFb6If7Vxct3ZRYrQ=.sha256
Voted One client you forgot to mention is Manyverse, and there I use `remark` wit
@Christian Bundy %l34PSvxMfKXFz+Wopk/A98nCKYq6RKzlsfbEUU8k+84=.sha256

@cel

P.S. I wanted to make sure I told you about how much I appreciated the way you brought up the problem, described the context, and made suggestions (multiple!) for how we might go about resolving the regression. It would've been super easy to just make a post saying "I'm bummed that ssb-markdown broke newlines" but I thought it was really cool how you went way past that so that we could all properly sync up and focus on resolving this. ❤

@cel %/PXKvhjN3iBcElYckxbZJd1+iu5PpShEtROLaP1HoAA=.sha256

@andrestaltz thanks, I don't know why I missed that. It is a good sign I think that we could have three or more different markdown engines rendering posts in a mostly mutually compatible way. I read in the readme for remark-parse that remark has a GFM option which defaults to true, and a CommonMark option which defaults to false, but it doesn't mention line break handling, nor was I able to find out about that from looking in the code. I'm not able to check right now but I think I recall seeing GFM-style line break rendering in Manyverse.

@christianbundy thanks for being open to this; I do wish to resolve it. And also about RFC 7764 (and 7763), I didn't know about those.

I count 153440 posts today in my local flumelog (so not exactly "hundreds of thousands"). How many were composed with the assumption of Markdown or GFM is hard to say. Many posts may not make use of single-linebreak significance, GFM features, or Markdown at all. To best estimate, I think it would require laboriously making a guess for each feed to pick a post at which it looks like they started using ssb-markdown v4.0.0 or Patchbay v7.15.6, and assume that they used that from that point until now. But for an upper bound, here are some dates and calculations:

  • ssb-markdown v4.0.0 published 2018-11-25
  • patchcore depending on ssb-markdown@markdown-it on 2018-11-19, ssb-markdown@^4.0.0 in release v2.1.0 on 2018-11-25
  • patchbay v7.15.6 released 2018-12-04
  • patchwork using ssb-markdown@^4.0.0 on 2018-11-25, untagged

Counting my local posts received since 2018-11-19, I get 11334; since 2018-12-04, 8544. Of those messages, ones that contain the pattern /[^\n]\n[^\n]/, which I think should match a solitary newline, number 2286 and 1622, respectively. Source: &oq+ax1i.... Personally I have only noticed a handful of recent messages that appeared to have line breaks incorrectly rendered, but there are probably more I did not notice.
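
For reference, a minimal sketch of that kind of count, run against a made-up sample instead of the flumelog. This is not the script in the blob linked above; `countSolitary` and `share` are illustrative names, and the sample posts are invented. The percentages at the end are computed from the counts reported above.

```javascript
// Matches one newline with a non-newline character on each side.
const solitaryNewline = /[^\n]\n[^\n]/;

// Count posts whose text contains at least one solitary newline.
function countSolitary(posts) {
  return posts.filter(
    (p) => typeof p.text === 'string' && solitaryNewline.test(p.text)
  ).length;
}

// Invented sample: only the first and last entries contain a solitary newline.
const sample = [
  { text: 'one\ntwo' },
  { text: 'one\n\ntwo' },
  { text: 'no newline at all' },
  { text: 'a\n\nb\nc' }
];
console.log(countSolitary(sample)); // 2

// Shares of recent posts with a solitary newline, from the counts above.
const share = (part, whole) => (100 * part / whole).toFixed(1) + '%';
console.log(share(2286, 11334)); // 20.2%
console.log(share(1622, 8544));  // 19.0%
```

So roughly a fifth of the posts in either window contain a solitary newline, and only some of those were necessarily composed with CommonMark-style rendering in mind.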

I think it should be possible to render markdown from a message content as its author intended, without user intervention, if the message includes the information needed to resolve ambiguities, such as a variant property like you mentioned. Or at least, I think it is a worthwhile goal and probably attainable. Even if such messages are not rendered completely correctly immediately by all clients, just having the needed context in the raw message would mean the clients could be eventually changed to render them consistently.

ssb-markdown accepts an opts parameter. A client could take a variant property from the message content and pass the renderer an option based on that. Something similar could probably be done for ssb-marked and marked's functions. When publishing a message, I think it would be okay for the client to use a hard-coded variant according to the developers' judgment. Or the client could take it from config or the user's feed, or the composer UI - but just for consistent message rendering this would not be necessary.
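
A minimal sketch of the render-side half, assuming posts carry a `variant` string as proposed in this thread. No client publishes such a property today, `rendererOptsFor` is an illustrative name, and whether ssb-markdown would accept an option called `breaks` (as ssb-marked does) is also an assumption.

```javascript
// Assumption: a post may carry a `variant` string such as 'GFM' or 'CommonMark'.
// The property name and its values are illustrative, not an existing convention.
function rendererOptsFor(content) {
  const variant =
    content && typeof content.variant === 'string' ? content.variant : null;
  // Posts without a variant predate the proposal, so they fall back to the
  // GFM-style breaks that most existing content was composed for.
  return { breaks: variant !== 'CommonMark' };
}

console.log(rendererOptsFor({ type: 'post', text: 'a\nb' }));
// { breaks: true }
console.log(rendererOptsFor({ type: 'post', text: 'a\nb', variant: 'CommonMark' }));
// { breaks: false }
```

The returned object would then be handed to whichever renderer the client uses, so the choice of line break style comes from the message itself and not from a guess.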

I'm not sure about type: 'text/markdown', unless it was a second-level property, or a blob mention, since we already have type: 'post' as a top-level property which is well-supported. If posts use HTML, I'm not sure that would be simpler. If a client is not targeting a web browser, it would have to parse the HTML. If it was web-based, it would still need to sanitize the HTML, unless its sandbox takes care of that. Also it might have to parse the HTML to rewrite SSB refs to URLs, and to render a message as plain text for inline use. Also, all implementations would need to be changed to accommodate this as a new feature - and they would still probably need to handle old/existing messages.

I think GFM is a standard as is CommonMark, as they were both added to IANA's Markdown Variants registry. SSB Markdown as it is might not be exactly any of these, since it uses SSB-specific features for link rendering. Would it be appropriate to modify the variant to indicate it is SSB-specific, such as SSB-GFM and SSB-CommonMark? I think it could be useful to distinguish them from the pure variants until/unless the message text can be handled by a non-SSB markdown implementation, i.e. the message text uses links only with URIs, not sigil-refs/mentions.
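
One way a client might pick between a pure and an SSB-specific label is to check whether the text links to any sigil-refs. This is only a sketch: the `SSB-` prefix is the naming floated above and not a registered variant, `variantLabel` is a hypothetical helper, and the pattern only looks at Markdown link targets, so it would miss bare mentions.

```javascript
// Markdown link whose target starts with an SSB sigil (@ feed, % msg, & blob)
// followed by base64 and a known suffix.
const sigilLink = /\]\(\s*[@%&][A-Za-z0-9+\/]+=*\.(ed25519|sha256)/;

// Hypothetical helper: prefix the variant when the text depends on SSB link handling.
function variantLabel(baseVariant, text) {
  return sigilLink.test(text) ? 'SSB-' + baseVariant : baseVariant;
}

console.log(variantLabel('GFM', 'see [the spec](https://example.com/spec)'));
// GFM
console.log(
  variantLabel('GFM', 'hi [@cel](@f/6sQ6d2CMxRUhLpspgGIulDxDCwYD7DzFzPNr7u5AU=.ed25519)')
);
// SSB-GFM
```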

@mix %Bh0YiMLBtraasfup2p//hLxW04HbpNOSo/WZbEPHxV0=.sha256
Voted [@cel](@f/6sQ6d2CMxRUhLpspgGIulDxDCwYD7DzFzPNr7u5AU=.ed25519) P.S. I wante
@cel %tDOfQSuehhZiEtXBbNUjqB5V5E4RILNtRDRh4gL6sQM=.sha256

Another format difference besides newlines: CommonMark allows footnotes/link-definitions within a blockquote. Example in %ithrMtt....

More complete differences between GFM and CommonMark (and Original), according to remark-parse:

GFM mode (boolean, default: true) turns on:

CommonMark mode (boolean, default: false) allows:

  • Empty lines to split blockquotes
  • Parentheses (( and )) around link and image titles
  • Any escaped ASCII-punctuation character
  • Closing parenthesis ()) as an ordered list marker
  • URL definitions (and footnotes, when enabled) in blockquotes

CommonMark mode disallows:

  • Code directly following a paragraph
  • ATX-headings (# Hash headings) without spacing after opening hashes
    or before closing hashes
  • Setext headings (Underline headings\n---) when following a paragraph
  • Newlines in link and image titles
  • White space in link and image URLs in auto-links (links in brackets,
    < and >)
  • Lazy blockquote continuation, lines not preceded by a closing angle
    bracket (>), for lists, code, and thematicBreak
User has not chosen to be hosted publicly
@Christian Bundy %5QReCI7+x7XrXp2BskB0QOpZGLg845rhGZlEr87fkxY=.sha256

@masukomi

For provenance, I think @cel forked this thread here: %yT4SaZf...
