Markdown

Saved by Tantek
on March 19, 2019 at 10:47:45 am
 

Thoughts on Markdown

Markdown is a clever and popular set of conventions for lightweight text markup, yet it has some really annoying warts. By following Markdown's own goals and principles, we can fix some problems and make additional improvements.

 

Markdown principles

From the main description:

 

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.

 

Brilliant scoping.

 

I've looked at many Markdown extensions, and most violate the principles implied by the design goal(s) description. Most extensions look like line-noise gibberish (much worse than HTML markup), thus their use-cases would be better solved with HTML - no need for such extensions to exist. Just use HTML.

 

Another excerpt:

 

...the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.

 

Problems summary

The problems I have with Markdown all stem from violating the stated goals / implied principles of Markdown itself, in particular "formatting instructions" that look like line-noise (link syntax, image embedding, ## headers, backticks).

 

1. Asterisk demarcation for italic (never what I've seen it mean in plain text email).

2. No-one uses double asterisks or underlines in plain text email to mean something specific. It just looks like a typo.

3. Hyperlink syntax is unnatural, deceptively difficult to recall, and worse than MediaWiki [ ] style.

4. Image embedding uses a "!" prefix in a readability-hurting manner when a simpler syntax was possible. Again, never seen in plain text email.

5. Reference styles don't allow for end-note/foot-note style links, which they should both allow and prefer.

6. ## header styles are unnatural and no better than MediaWiki == style.

7. Backticks are not commonly used for code markup (ever? I've never seen them in plain text email either).

8. Indented text should mean blockquote or just pre-wrap (friendlier), not code block (based on normal readability expectations and normal non-coding usage being preferred over coder-specific usage).

 

See also: https://en.wikipedia.org/wiki/Lightweight_markup_language for comparisons of alternatives.

 

Modest solutions summary

Solutions for Markdown's problems can be derived by going back to Markdown's original design principle, and (re)starting from that.

 

A summary of modest proposals for fixing these problems based on what people type and/or what looks *good* in plain text:

1. Asterisk demarcation should mean bold (Slack, IM clients, and G+ already do this)

2. No idea what to do with double (or more) asterisks or underlines. So ignore them.

3. Hyperlinking can/should be done automatically by just having links that use URLs in general (see 4 for exceptions)

4. Image embedding can/should be done automatically by just having links that use URLs that end in .jpg, .jpeg, .gif, .png (similarly with video embedding for .mov .mp4 .ogv, audio embedding for .mp3 .wav).

5. Add ^1 to reference styling to create inline [1] style links that link to end notes rather than linking entirety of a phrase.

6. Drop ## header styles

7. Drop backticks as meaning anything

8. Treat double-space or more indented text as pre (was blockquote).

8a. Prefix/suffix a block of code with line(s) of comments in whatever programming language is being used. /*...*/ #... //... <!-- ... --> etc. A blank line after a similar line of commenting ends the block.

 

Blocks of code reasoning

Need a better solution to embedding blocks of code. Research needed. Providing a hint to the language of the code being used would be useful for syntax highlighting. Blocks of code should be copy/pasteable directly into code editing environments for immediate re-use (any solution requiring prefixing each line with some punctuation like a "#" would fail this). Perhaps prefix/suffix a block of code with line(s) of comments in whatever programming language is being used. /*...*/ #... //... <!-- ... --> etc.
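
As a rough illustration of the comment-delimited idea, here is a minimal Python sketch. It is not from any existing implementation: the names find_code_blocks and comment_style are hypothetical, and the rule it applies (a block opens at a line that is entirely a comment and closes at the next comment line of the same style that is followed by a blank line or the end of the text) is one possible reading of the proposal. The comment style doubles as a coarse language hint.

```python
import re
from typing import List, Optional

# Whole-line comment styles named in the proposal. The keys are hypothetical labels.
COMMENT_STYLES = {
    "c-block": re.compile(r"^/\*.*\*/$"),
    "c-line": re.compile(r"^//.*$"),
    "hash": re.compile(r"^#.*$"),
    "markup": re.compile(r"^<!--.*-->$"),
}


def comment_style(line: str) -> Optional[str]:
    """Return the style label if the whole line is a comment, else None."""
    stripped = line.strip()
    for name, pattern in COMMENT_STYLES.items():
        if pattern.match(stripped):
            return name
    return None


def find_code_blocks(text: str) -> List[List[str]]:
    """Collect blocks that start and end with comment lines of the same style."""
    lines = text.split("\n")
    blocks = []
    i = 0
    while i < len(lines):
        style = comment_style(lines[i])
        end = None
        if style is not None:
            for j in range(i + 1, len(lines)):
                closes = comment_style(lines[j]) == style
                blank_after = j + 1 == len(lines) or lines[j + 1].strip() == ""
                if closes and blank_after:
                    end = j
                    break
        if end is None:
            i += 1  # not a block opener (or no closer found); move on
        else:
            blocks.append(lines[i:end + 1])
            i = end + 1
    return blocks
```

Because the delimiters are ordinary comments, a block found this way can be pasted straight into an editor. A known weakness of this sketch: an inner comment line followed by a blank line would end the block early, and a prose line starting with "#" would be mistaken for an opener.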

 

Before and After

Here is a table of examples of existing problematic Markdown syntax along with suggested improvements for comparison:

Each pair below shows the existing problematic syntax first, then the suggested improved syntax.

Existing: *italic*
Suggested: /italic/ - some email prior art for this, no personal use-case, but requested

Existing: **bold**
Suggested: *bold*

Existing: __underline__
Suggested: _underline_

Existing:
*[abbr]: abbreviation
*[HTML]: Hyper Text Markup Language
Suggested:
abbr (abbreviation)
HTML (Hyper Text Markup Language)

Existing: [](https://tantek.com/)
Suggested: https://tantek.com/

Existing: [Tantek](https://tantek.com/)
Suggested: Tantek https://tantek.com/

Existing: [Tantek's site](https://tantek.com/)
Suggested:
Tantek's site
https://tantek.com/
or
_Tantek's_site_ https://tantek.com/

Existing: ![](http://w3.org/Icons/text.gif)
Suggested: http://w3.org/Icons/text.gif

Existing: ![text icon](http://w3.org/Icons/text.gif)
Suggested: http://w3.org/Icons/text.gif (text icon)

Existing: [![](http://w3.org/Icons/text.gif)](http://indiewebcamp.com/text)
Suggested: http://w3.org/Icons/text.gif http://indiewebcamp.com/text

Existing: [![text icon](http://w3.org/Icons/text.gif)](http://indiewebcamp.com/text)
Suggested: http://w3.org/Icons/text.gif (text icon) http://indiewebcamp.com/text

Existing:
# H1 #
## H2 ##
### H3 ###
#### H4 ####
Suggested:
H1
==
H2
--
H3
....
H4
. . .

Existing: `<code>`
Suggested: inline code use-cases are infrequent, so no replacement currently. Plain text inline code examples in the wild welcome.

Existing:
  code block
Suggested:
  pre-wrap block
or, for code:
/* start of a CSS, JS, C++ or similar code block */
/* end of the same block, trailing blank line required */

<!-- start of some markup to display literally -->
<p>some literal markup and text</p>
<!-- end of markup code block, including trailing blank -->

 

Note: *[abbr]: syntax is not in Markdown itself but is from Markdown Extra.

 

Re-use from Markdown

Aside from reasonable whitespace treatment, there may be other features of Markdown that are worth keeping, such as treatment of lists and blockquotes. In particular I'm explicitly thinking of re-using the following (having used them naturally in plain text and had them work as expected in Markdown).

 

Blockquote

Prefixing a line with "> " indicates a blockquote. Example post of mine with "> " lines that was POSSEd to GitHub Markdown as expected: http://tantek.com/2018/064/t1/

Additional '>' chars for nesting blockquotes, e.g. ">> "

This use of '>' for quoting has been in use by plain text email/usenet since the 1980s if not earlier.
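
For illustration, a minimal Python sketch of the ">" convention, assuming nesting is written as a run of adjacent ">" characters followed by one optional space (the spaced "> > " form is not handled). The names quote_depth and render_quotes are hypothetical, and the HTML output is just one plausible rendering.

```python
import re
from html import escape
from typing import Tuple

# A run of ">" at the start of the line, then one optional space.
QUOTE_PREFIX = re.compile(r"^(>+) ?")


def quote_depth(line: str) -> Tuple[int, str]:
    """Return (nesting depth, line text with the quote prefix removed)."""
    m = QUOTE_PREFIX.match(line)
    if not m:
        return 0, line
    return len(m.group(1)), line[m.end():]


def render_quotes(text: str) -> str:
    """Open and close <blockquote> tags as the quote depth changes."""
    out = []
    depth = 0
    for line in text.split("\n"):
        new_depth, body = quote_depth(line)
        while depth < new_depth:
            out.append("<blockquote>")
            depth += 1
        while depth > new_depth:
            out.append("</blockquote>")
            depth -= 1
        out.append(escape(body, quote=False))
    out.extend(["</blockquote>"] * depth)  # close anything still open
    return "\n".join(out)
```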

 

Naming

Naming is hard. Nonetheless, people have asked me for a name for this fixed, corrected, or replacement version of Markdown, so here are some thoughts (names I've searched for without finding any critical collisions):

  • tmark
  • VITA / -vita- a mnemonic for the order of video image text-alt/alink or text alink
    • this is growing on me, especially for the mnemonic usefulness 
  • markdownt
  • markdoubt
  • markdont
  • markdonot / markdonut
  • ...

 

 

Auto linking

In general auto-link any URLs. See below for exceptions for auto link embedding instead.

 

Auto link embed URLs

(implemented in http://tantek.com/github/cassis auto_link)

If the URL ends in .jpg, .jpeg, .gif, .png, .svg then make it an img,

Ending with .mp4 .mov .ogv .webm, use a video tag,

Ending with .mp3 .wav, use an audio tag,

Else just hyperlink the URL to itself.
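
The extension rules above can be sketched in a few lines of Python. This is an illustration only, not the cassis auto_link code: the function name auto_link_embed is hypothetical, and checking the extension against the URL path (so that a query string or fragment is ignored) is an assumption the rules above do not spell out.

```python
from html import escape
from urllib.parse import urlsplit

IMAGE_EXTS = (".jpg", ".jpeg", ".gif", ".png", ".svg")
VIDEO_EXTS = (".mp4", ".mov", ".ogv", ".webm")
AUDIO_EXTS = (".mp3", ".wav")


def auto_link_embed(url: str) -> str:
    """Return an HTML fragment for a bare URL, chosen by file extension."""
    # Assumption: only the path counts, so ?query and #fragment are ignored.
    path = urlsplit(url).path.lower()
    src = escape(url, quote=True)
    if path.endswith(IMAGE_EXTS):
        return f'<img src="{src}" alt=""/>'
    if path.endswith(VIDEO_EXTS):
        return f'<video src="{src}"></video>'
    if path.endswith(AUDIO_EXTS):
        return f'<audio src="{src}"></audio>'
    # Anything else is hyperlinked to itself.
    return f'<a href="{src}">{src}</a>'
```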

 

Good ideas

Here are a few additions that I'm convinced are good ideas, based on real-world evidence for their need and real-world publishing experience.

 

Hyperlinks with link text

We could allow a hyperlinked text block as such:

 

linktext

URL

 

or optionally with URL parenthesized with < > (a common plain text email convention) or ( ) (English parenthetical applying to previous word/phrase)

 

linktext

<URL>

 

linktext

(URL)

 

is converted to:

 

<a href=URL>linktext</a>

 

Details:

  • If linktext ends with ":" (consider any (sentence or phrase termination) punctuation like:".!?,;:-—*/" - others?) then the linktext is NOT linked to the URL. "something: URL" is a common enough plain text pattern that expresses an intent to show the URL visibly inline that it is better left as-is, with the URL visibly linked to itself.
  • This applies where linktext and URL are each on a line by themselves and linktext is the entirety of its line, or inline as: linktext URL, or linktext <URL>, or linktext (URL), where linktext is only one word. (How else do you know how many words to auto-link? Any use of "" etc. just to mean "link this text" is an abuse of punctuation, because the linktext is not a quotation.)
  • If linktext includes underlining: like _link_text_ URL then the entirety of "link text" is linked, without any explicit underlining as that is presumed to be subsumed by the default hyperlink styling.
  • Use of other stylistic phrasings like *bold* or /italic/ maintains their styling (since they're not part of default hyperlink styling) while permitting inline multiword linktext like: *bold text* URL, or /italic text/ URL.
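
To make the inline cases concrete, here is a minimal Python sketch. It covers only part of the details above: one-word linktext, _underlined_ multiword linktext, the three URL forms (bare, < >, and ( )), and the trailing-punctuation exception. The name link_pairs is hypothetical, the set of terminators leaves out "*" and "/" as an assumption so they cannot clash with *bold* and /italic/, and a bare URL followed directly by sentence punctuation is not handled.

```python
import re
from html import escape

# Assumption: terminators from the list above, minus "*" and "/".
# \u2014 is the long dash from that list.
TERMINATORS = ".!?,;:-\u2014"

# One word of link text, a space or linebreak, then a URL in one of three forms.
PAIR = re.compile(
    r"(?P<text>\S+)(?P<sep>[ \n])"
    r"(?:<(?P<angle>https?://[^\s<>]+)>"
    r"|\((?P<paren>https?://[^\s()]+)\)"
    r"|(?P<bare>https?://\S+))"
)


def link_pairs(text: str) -> str:
    """Turn 'linktext URL' pairs into hyperlinks."""
    def repl(m: re.Match) -> str:
        word = m.group("text")
        url = m.group("angle") or m.group("paren") or m.group("bare")
        href = escape(url, quote=True)
        if word[-1] in TERMINATORS or "://" in word:
            # "something: URL" stays as-is, with the URL linked to itself.
            return f'{escape(word, quote=False)}{m.group("sep")}<a href="{href}">{href}</a>'
        if len(word) > 2 and word.startswith("_") and word.endswith("_"):
            # _link_text_ becomes "link text"; the underlining is dropped.
            word = word[1:-1].replace("_", " ")
        return f'<a href="{href}">{escape(word, quote=False)}</a>'

    return PAIR.sub(repl, text)
```

Unlike the unquoted <a href=URL> shown earlier, the sketch quotes and escapes the attribute value so that unusual URLs still produce valid markup.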

 

 

Hyperlinked images

(now supported in https://tantek.com/cassis.js auto_link)

Expanding upon the previous, if the "linktext" were a URL ending in .jpg .gif .png (URL2), then make a hyperlinked image, e.g.:

 

URL2.png

URL

 

where each is on a line by itself, or inline: URL2.png URL

is converted to:

 

<a href=URL><img src="URL2.png" alt=""/></a>

 

Video with hyperlink

Expanding upon the previous, if the URL2 ended in .mp4 .mov .ogv, then make a video with fallback hyperlink, e.g.:

 

URL2.mp4

URL

 

where each is on a line by itself, or inline: URL2.mp4 URL

is converted to:

 

<video src="URL2.mp4"><a href=URL>a video</a></video>

 

Video with poster

Expanding upon the previous, if URL ended in .jpg .gif .png, then make a video with poster image, e.g.:

 

URL2.mp4

URL.jpg

 

where each is on a line by itself, or inline: URL2.mp4 URL.jpg

is converted to:

 

<video src="URL2.mp4" poster="URL.jpg">a video</video>

 

Question: should it also automatically make a fallback image?

<video src="URL2.mp4" poster="URL.jpg"><img src="URL.jpg" alt="a video"/></video>

 

Video with poster and hyperlink

 

Expanding upon the previous, if there was a third URL3 in-between URL2 and URL, that ended in .jpg .gif .png, then make a video with poster image URL3 and fallback hyperlink, e.g.:

 

URL2.mp4

URL3.jpg

URL

 

where each is on a line by itself, or inline: URL2.mp4 URL3.jpg URL

is converted to:

 

<video src="URL2.mp4" poster="URL3.jpg"><a href=URL>a video</a></video>

 

Question: should it make a fallback hyperlinked image?

<video src="URL2.mp4" poster="URL3.jpg"><a href=URL><img src="URL3.jpg" alt="a video"/></a></video>

 

Detail: separators may be mixed space / linebreak. e.g.

URL2.mp4 URL3.jpg

URL

 

or

 

URL2.mp4

URL3.jpg URL
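
The four multi-URL cases (hyperlinked image, video with hyperlink, video with poster, video with poster and hyperlink) can be combined in one small dispatcher. The Python sketch below is an illustration under assumptions, not the cassis.js code: embed_group is a hypothetical name, the input is assumed to be an already isolated group of two or three URLs, and the open questions about fallback images are left unanswered (no fallback image is generated).

```python
from html import escape
from typing import Optional

IMAGE_EXTS = (".jpg", ".gif", ".png")
VIDEO_EXTS = (".mp4", ".mov", ".ogv")


def is_image(url: str) -> bool:
    return url.lower().endswith(IMAGE_EXTS)


def is_video(url: str) -> bool:
    return url.lower().endswith(VIDEO_EXTS)


def embed_group(group: str) -> Optional[str]:
    """Render a group of 2 or 3 URLs, or return None if no rule applies."""
    # split() accepts spaces and linebreaks alike, so separators may be mixed.
    urls = [escape(u, quote=True) for u in group.split()]
    if len(urls) == 2:
        first, second = urls
        if is_image(first):
            return f'<a href="{second}"><img src="{first}" alt=""/></a>'
        if is_video(first) and is_image(second):
            return f'<video src="{first}" poster="{second}">a video</video>'
        if is_video(first):
            return f'<video src="{first}"><a href="{second}">a video</a></video>'
    if len(urls) == 3 and is_video(urls[0]) and is_image(urls[1]):
        video, poster, link = urls
        return (f'<video src="{video}" poster="{poster}">'
                f'<a href="{link}">a video</a></video>')
    return None
```

For example, embed_group("http://w3.org/Icons/text.gif http://indiewebcamp.com/text") yields a hyperlinked image, while a single URL returns None and would fall through to plain auto-linking.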

 

Other ideas

Here are a few possible additions that I'd only really want to consider after there was sufficient real-world evidence for their need, and perhaps some informal plain text publishing experience first.

 

Alt text for images

Another variant of the above: if the URL ends in .jpg .gif .png and the text following is not a URL, then use the text following as the image's alternate text:

 

URL.png

(alt-text)

 

where each is on a line by itself, or inline: URL.png (alt-text)

is converted to:

 

<img src="URL.png" alt="alt-text"/>

 

Detail: there may be linebreaks in the alt-text

 

Hyperlinked images with alt text

Finally, combine the previous two to allow hyperlinked images with alt text

 

Since alt text is necessarily more strongly tied to image contents, it should be closer to image, whereas an image could be reasonably hyperlinked to various different URLs

 

URL2.png

(alt-text)

URL

 

where each is on a line by itself, or inline: URL2.png (alt-text) URL

is converted to:

 

<a href=URL><img src="URL2.png" alt="alt-text"/></a>

 

Details: there may be linebreaks in the alt text, separators may be mixed space and linebreak, e.g.

URL2.png (alt-text)

URL

 

or

 

URL2.png

(alt-text) URL
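
Both alt-text forms (an image with alt text, and a hyperlinked image with alt text) fit one regular expression. The Python sketch below is a hypothetical illustration: image_with_alt is an invented name, linebreaks inside the alt text are collapsed to single spaces as an assumption, and alt text containing its own parentheses is not handled.

```python
import re
from html import escape

IMAGE_ALT = re.compile(
    r"(?P<src>https?://\S+\.(?:jpg|gif|png))"  # image URL
    r"\s+\((?P<alt>[^()]*)\)"                   # (alt-text), may span lines
    r"(?:\s+(?P<href>https?://\S+))?",          # optional link target
    re.IGNORECASE,
)


def image_with_alt(text: str) -> str:
    """Convert 'URL.png (alt-text)' and 'URL2.png (alt-text) URL'."""
    def repl(m: re.Match) -> str:
        src = escape(m.group("src"), quote=True)
        # Collapse any linebreaks inside the alt text to single spaces.
        alt = escape(" ".join(m.group("alt").split()), quote=True)
        img = f'<img src="{src}" alt="{alt}"/>'
        if m.group("href"):
            href = escape(m.group("href"), quote=True)
            return f'<a href="{href}">{img}</a>'
        return img

    return IMAGE_ALT.sub(repl, text)
```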

 

 

Note: previous alternative - not as good since it separates the (alt-text) from the image

inline: URL2.png URL (alt-text)

or:

URL2.png

URL

(alt-text)

 

 

More header styles

Added and incorporated more header styles like a line of periods or spaced periods into summary table based on positive feedback in person 2019-077.
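
As an illustration of how the underline-style headers could be recognized, here is a minimal Python sketch. It assumes the mapping suggested by the H1 to H4 examples (a line of "=" for H1, "-" for H2, periods for H3, spaced periods for H4); the minimum run lengths and the names heading_level and convert_headings are assumptions made for the example.

```python
import re
from html import escape
from typing import Optional

# Underline patterns and the heading level each one implies (assumed mapping).
UNDERLINES = [
    (re.compile(r"^={2,}$"), 1),       # ==
    (re.compile(r"^-{2,}$"), 2),       # --
    (re.compile(r"^\.{2,}$"), 3),      # ....
    (re.compile(r"^\.(?: \.)+$"), 4),  # . . .
]


def heading_level(underline: str) -> Optional[int]:
    """Return 1-4 if the line is a heading underline, else None."""
    stripped = underline.strip()
    for pattern, level in UNDERLINES:
        if pattern.match(stripped):
            return level
    return None


def convert_headings(text: str) -> str:
    """Replace 'Title' + underline line pairs with <hN> elements."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        level = heading_level(lines[i + 1]) if i + 1 < len(lines) else None
        if level and lines[i].strip():
            title = escape(lines[i].strip(), quote=False)
            out.append(f"<h{level}>{title}</h{level}>")
            i += 2  # skip the underline
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)
```

One trade-off worth noting: with a minimum of two periods, a paragraph followed by a line consisting only of "..." would be read as an H3, so a real implementation might require the underline to be longer.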

 

 

Partial Implementation

 

 

Other possible needs

Inline code snippets

Though I do use inline code markup in HTML and wiki syntax quite often, I'm not convinced this is a general need that deserves a special syntax. Might possibly need a (better than Markdown) solution to embedding inline code snippets. Gathering examples for now:

  • class names inline in documentation, e.g.:
    • validate your h-card with indiewebify.me, and make sure you have a u-url to your home page!
  • examples of HTML attributes with values inline in documentation, e.g.:
    • make sure to put rel=me on hyperlinks on your home page to your other profiles

 

 

Past improvement ideas

Some ideas for improving Markdown that seemed like decent incremental improvements, but later I decided they weren't that much better, or far more drastic solutions/changes were needed.

 

Expand Link Styling Syntax

Update: the Markdown link syntax is not like anything anyone ever types in email. Ditch it.

Previous thoughts on improving it:

Current Markdown link styling:

 [example](http://example.com/ "Title")

Suggested additions:

 [example](http://example.com/ "Title" .class1 .class2 #id rel=rel-value boolean_attribute attribute1=one-word-value attribute2="quoted multi-word value")

 

Re-using .class #id from CSS and jQuery.

 

Re-using rel=rel-value boolean_attribute attribute1=one-word-value attribute2="quoted multi-word value" from simple HTML attribute syntax.

 

In all cases, all the "extra" information is contained in the parentheses, and thus reasonably easy to always skip when just reading over the content.

 

If there is no URL, then just use span for the markup rather than an a href.

If there is no URL but there is a src attribute, then just use an img (or video or audio if the src has an extension that better maps to those.)

 

Allow multiple URLs, the first for an href, and a second URL for the src of a linked img (or video or audio if the second URL has an extension that better maps to those). Similarly if there is a URL and an explicit src= attribute (for the second URL).  Could even allow multiple src= attributes, turning them into source elements to provide multiple video or audio embedded sources. A final image src could provide a fallback image.

 

Adoption

Markdown appears to be growing in popularity among the IndieWeb crowd, as both an authoring format, and a storage format. However, every time I've looked at it, the above-mentioned problems irked me sufficiently to not want to adopt it, and I've stubbornly stuck with using HTML (currently HTML5+hAtom) as my de facto structured storage format (e.g. for tantek.com posts). I'm considering forking Markdown and making the improvements noted above.

 

See Also
