 

Markdown

Page history last edited by Tantek 1 week, 4 days ago

 

Thoughts on Markdown

Markdown is a clever and popular set of conventions for lightweight text markup, yet it has some really annoying warts. By following Markdown's own goals and principles, we can fix some problems and make additional improvements.

 

Markdown principles

 

Prime Directive

 

From the main description:

 

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.

 

Brilliant scoping. Since "overriding design goal" literally means it overrides or takes precedence or comes first before any other aspect of its design, I call this the Markdown Prime Directive.

 

I've looked at many Markdown extensions, and most violate the principles implied by the design goal(s) description. Most extensions look like line-noise gibberish (much worse than HTML markup), thus their use-cases would be better solved with HTML - no need for such extensions to exist. Just use HTML.

 

Inspiration from empirical usage

 

Another excerpt:

 

...the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.

 

 

Problems summary

 

The problems I have with Markdown all stem from violating the stated goals / implied principles of Markdown itself, in particular "formatting instructions" that look like line-noise (link syntax, image embedding, ## headers, backticks).

 

When your Markdown becomes more plain text pollution than punctuation, just use HTML.

 

In particular:

 

1. Poor inline-style design/assumptions

     a. Asterisk demarcation for italics (never what I've seen it mean in plain text email).

     b. No-one uses double asterisks or underlines in plain text email to mean something specific. It just looks like a typo.

     c. tilde (~) cannot be assumed to mean strike-through, e.g. this tweet.

2. Hyperlink syntax unnatural, deceptively difficult to recall, worse than MediaWiki [ ] style.

3. Image embedding uses a "!" prefix in a readability-hurting manner when a simpler syntax was possible. Again, never seen in plain text email.

4. Reference styles don't allow for (let alone prefer, as they should) end-note/foot-note style links.

5. ## header styles are unnatural and no better than MediaWiki == style.

6. backticks are not commonly used for code markup (ever? never seen in plain text email either).

7. Indented text should mean blockquote or just pre-wrap (friendlier), not code block (based on normal readability expectations and normal non-coding usage being preferred over coder-specific usage).

8. line breaks immediately after text characters should mean hard line breaks (e.g. poetry), while Markdown interprets them as soft-breaks and rewraps them into a single "line", breaking existing plain text poetry

 

Outline of use-cases clusters

Markdown's problems are a key motivation for replacing it with something simpler that better fits its primary design goal. However, a good replacement should have its own set of direct user use-cases to help fulfill its scope and not exceed it. These fall into the following categories (which could be implemented as modules, or even function options):

  1. inline text semantics & styling & shortcodes (abbreviation, italic, bold, underline, emoticons to emoji)
  2. block semantics & styling (headings, lists, paragraphs, blockquotes, preformatted, figures, code blocks)
  3. auto-linking (URLs with or without http, @-mentions, @-@s, ^n footnotes)
  4. auto-embedding (images, videos, iframes)
  5. hypermedia (linked text, images) and other attributes on content (e.g. alt text)

 

Considering also:

  • more HTML inline semantics (time, data)
  • more HTML block semantics ?
  • additional use-cases from BBCode, MyCode, AsciiDoc, reStructuredText, e.g.
    • inline styling: strikethrough, monospace
    • block semantics: citations / bibliography
  • use-cases brought up in IndieWeb chat:
    • citations / bibliography
    • call-out

Leaving out for now:

  • table layout & styling. possible temporary placeholder guidance for a 1.0: use preformatted text feature to preserve ASCII plain text table presentation, then later extend to incorporate table semantics.

 

Consider sorting use-cases by how many different lightweight markup languages (https://en.wikipedia.org/wiki/Lightweight_markup_language, https://en.wikipedia.org/wiki/Category:Lightweight_markup_languages) attempt solutions for them.

 

 

Modest solutions summary

Solutions for Markdown's problems can be derived by going back to Markdown's original design principle, and (re)starting from that.

 

A summary of modest proposals for fixing these problems based on what people type and/or what looks *good* in plain text:

1. Asterisk demarcation should mean bold (Slack, IM clients, and others already do this)

2. No idea what to do with double (or more) asterisks or underlines demarcation, and they’re ugly. So ignore and discourage them.

3. ✅ Hyperlinking can/should be done automatically by just having links that use URLs in general (see 4 for exceptions)

4. ✅ Image embedding can/should be done automatically by just having links that use URLs that end in .jpg, .jpeg, .gif, .png (similarly with video embedding for .mov .mp4 .ogv, audio embedding for .mp3 .wav).

5. ✅ Add ^1 to reference styling to create inline footnote ¹ style links that link to end notes rather than linking entirety of a phrase.

6. Drop ## header styles (they’re ugly)

7. Drop backticks as meaning anything (no "obvious" meaning in text)

8. Treat double-space or more indented text as pre (was blockquote).

8a. Prefix/suffix a block of code with line(s) of comments in whatever programming language is being used. /*...*/ #... //... <!-- ... --> etc. A blank line after a similar line of commenting ends the block.

9. Treat line breaks after non-space characters as hard line breaks, while treating a single trailing space before a line break as a soft line break (as someone would have typed when manually breaking a prose paragraph into multiple lines)

 

✅ = implemented in CASSIS https://tantek.com/cassis.js auto_link function.
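
Proposal 9 above can be sketched in a few lines. This is only a minimal illustration, not CASSIS code: the function name render_line_breaks is made up here, the input is assumed to be a single paragraph with no blank lines, and any run of trailing spaces (not just one) is treated as a soft break.

```python
import html

def render_line_breaks(paragraph: str) -> str:
    """Render one paragraph (no blank lines) using the proposed break rule."""
    lines = paragraph.split("\n")
    out = []
    for index, line in enumerate(lines):
        text = html.escape(line.rstrip(" "))
        if index == len(lines) - 1:
            out.append(text)              # last line: no break follows it
        elif line.endswith(" "):
            out.append(text + " ")        # trailing space: soft break, re-wrap
        else:
            out.append(text + "<br>\n")   # break right after text: hard break
    return "".join(out)
```

With this rule plain text poetry keeps its line breaks, while a prose paragraph that was wrapped by hand (with the space left before each break) flows back into one line.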

 

Blocks of code reasoning

Need a better solution to embedding blocks of code. Research needed. Providing a hint to the language of the code being used would be useful for syntax highlighting. Blocks of code should be copy/pasteable directly into code editing environments for immediate re-use (any solution requiring prefixing each line with some punctuation like a "#" would fail this). Perhaps prefix/suffix a block of code with line(s) of comments in whatever programming language is being used. /*...*/ #... //... <!-- ... --> etc.
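
As a starting point for that research, here is a rough sketch of how comment-delimited blocks might be detected. Everything in it is an assumption for illustration: the style names, the function names, and the rule that a block starts at a comment-only line following a blank line (or the start of the text) and ends at the first later comment-only line of the same style that is followed by a blank line or the end of the text.

```python
import re

# Comment-only line patterns, keyed by a hypothetical style name.
COMMENT_STYLES = {
    "c": re.compile(r"^/\*.*\*/$"),
    "slashes": re.compile(r"^//.*$"),
    "hash": re.compile(r"^#.*$"),
    "markup": re.compile(r"^<!--.*-->$"),
}

def comment_style(line):
    """Return the style name if the whole line is a comment, else None."""
    stripped = line.strip()
    for name, pattern in COMMENT_STYLES.items():
        if pattern.match(stripped):
            return name
    return None

def extract_code_blocks(text):
    """Return (style, code) pairs; code includes both delimiting comments."""
    lines = text.split("\n")
    blocks, i = [], 0
    while i < len(lines):
        style = comment_style(lines[i])
        at_block_start = i == 0 or lines[i - 1].strip() == ""
        end = None
        if style and at_block_start:
            for j in range(i + 1, len(lines)):
                followed_by_blank = j + 1 == len(lines) or lines[j + 1].strip() == ""
                if comment_style(lines[j]) == style and followed_by_blank:
                    end = j
                    break
        if end is None:
            i += 1
        else:
            blocks.append((style, "\n".join(lines[i:end + 1])))
            i = end + 1
    return blocks
```

Because the delimiters are ordinary comments, the extracted block can be pasted into an editor unchanged, and the comment style gives a coarse hint about the language. The "hash" style would also catch lines that start with a hashtag, which is one of the false positives that still needs research.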

 

Before and After

Here is a table of examples of existing problematic Markdown syntax along with suggested improvements for comparison:

 

Existing Problematic Syntax
Suggested Improved Syntax
*italic*
/italic/ - with word boundaries on both sides. some email prior art for this, no personal use-case, but requested (btrem, others previously)
**bold**
*bold* - with word boundaries on both sides
__underline__

_underline_ , _underlined phrase_ - with word boundaries on both sides

 

Allow but discourage:

_underlined_phrase_ - less readable than _underlined phrase_

*[ABBR]: abbreviation

*[HTML]: Hyper Text Markup Language

ABBR (abbreviation) - restrict to all caps single word, word boundaries on both sides, e.g.

HTML (Hyper Text Markup Language)

 

Yet rather than generating:

<abbr title="abbreviation">ABBR</abbr>

<abbr title="Hyper Text Markup Language">HTML</abbr>

Which has some accessibility challenges and may sometimes not be the intended presentation (hiding of the expansion).

 

A perhaps more accessible and presentation preserving alternative may be to generate:

<abbr aria-labelledby="abbr_abbreviation">ABBR</abbr> (<span id="abbr_abbreviation">abbreviation</span>)

<abbr aria-labelledby="html_hyper_text_markup_language">HTML</abbr> (<span id="html_hyper_text_markup_language">Hyper Text Markup Language</span>)

though that should be tested to see if it works well in screen readers and other accessibility tools

[](https://tantek.com/) https://tantek.com/
[Tantek](https://tantek.com/)

 

Tantek (https://tantek.com/) — literally what iOS pastes into a txt from linkedtext, this would roundtrip as a single linkedword. This would also roughly "support" the Markdown link syntax, while however preserving the brackets in the visible link text, which may also be useful. This also has the least punctuation necessary to distinguish an intent to link (parenthetical URL) from the linktext (just the prior space delimited word or hyphenated-phrase). Also compatible with APA simple in-prose link styling: https://microformats.org/wiki/citation-formats#APA_site_or_profile

 

_Tantek_ (https://tantek.com/) — combining prior underline syntax and parenthetical URL work as well. This notation could indicate an explicit preference to attach the link to the previous underlined phrase, avoiding potential false positives of links inline after underscores (unsure of any cases of that).

 

Consider restricting the non-underlined syntax to:

 

Capitalizedword (https://example.com/)

hyphenated-phrase (https://example.com/)

 

Only auto-linking the "simple" cases of a Capitalized word OR a hyphenated-phrase followed by a (parenthetical URL), to avoid the false positive of a lowercase word followed by a (parenthetical URL), e.g. in the first example in: https://microformats.org/wiki/citation-formats#APA_site_or_profile

 

 

Rejected:

 

Tantek https://tantek.com/ — will likely have too many false positives, and URLs delineated by spaces probably indicate an author intent to keep the URL explicitly visible

 

_Tantek_ https://tantek.com/ — this would work better than the prior example syntax, however still indicates an author intent to keep the URL explicitly visible

 

_Tantek_ <https://tantek.com/> — since <URL> was/is widely used in plain text emails (before Markdown), and the <...> notation could indicate an explicit preference to attach the link to the previous underlined phrase, avoiding potential false positives of links inline after underscores. However rejecting this because CMoS 6.110 discourages use of angle brackets to delineate URLs (or anything besides "typesetting instructions", which is quite fitting for HTML tags)

 

 

[Tantek's site](https://tantek.com/)

_Tantek's site_ (https://tantek.com/) — minimal use of _ to markup a phrase to be underlined, trailing paren link is used by iOS paste linkedtext into txt.

 

Tantek's site

(https://tantek.com/)

— minimum use of punctuation, an entire line being a parenthesized URL visually attaches it to the prior entire line of text. Useful for linking an entire block or list item of text

 

_Tantek's site_

(https://tantek.com/)

 

Allow but discourage:

_Tantek's_site_ (https://tantek.com/) — this is better, however the added "_"  between words still feels forced (and inconsistent with use of * or /)

 

_Tantek's_site_

(https://tantek.com/)

 

Rejected:

_Tantek's site_ https://tantek.com/ — URLs delineated by spaces probably indicate an author intent to keep the URL explicitly visible

_Tantek's_site_ https://tantek.com/ — all the added "_" feel forced, and seem to reduce readability, plus URLs delineated by spaces probably indicate an author intent to keep the URL explicitly visible

_Tantek’s_site_ <https://tantek.com/> — rejecting due to CMoS 6.110 discouraging use of angle brackets to delineate URLs.

 

Tantek's site

https://tantek.com/

— this might have too many false positives, and a URL on a line by itself may indicate an author intent to keep the URL explicitly visible in the context of lines before and after

![](http://w3.org/Icons/text.gif) http://w3.org/Icons/text.gif
![text icon](http://w3.org/Icons/text.gif)

http://w3.org/Icons/text.gif (text icon)

[![](http://w3.org/Icons/text.gif)](http://indiewebcamp.com/text)

http://w3.org/Icons/text.gif (https://indieweb.org/text)

http://w3.org/Icons/text.gif https://indieweb.org/text

[![text icon](http://w3.org/Icons/text.gif)](http://indiewebcamp.com/text)

http://w3.org/Icons/text.gif (text icon, https://indieweb.org/text)

http://w3.org/Icons/text.gif (https://indieweb.org/text, text icon)

http://w3.org/Icons/text.gif (text icon) https://indieweb.org/text

<iframe src="URL-to-embed">

 

(Markdown has no explicit syntax of its own for iframes)

[URL-to-embed] — note: if the URL-to-embed has an image file extension, convert to an <img> tag, similarly to <video> or <audio>. The use of explicit brackets [...] around an image or video filename may make it more obvious in plaintext that the URL is intended to be viewed inline rather than hyperlinked

<iframe aria-label="text summary of embed." src="URL-to-embed">

 

(Markdown has no explicit syntax of its own for iframes)

[URL-to-embed] (text summary of embed) — a single space between URL & parenthetical alt is preferred for plain text readability but could be optional

 

[URL-to-embed] (text summary of embed, https://indieweb.org/embed) — hyperlink the entire embedded resource to a link

[URL-to-embed] (https://indieweb.org/embed, text summary of embed)

[URL-to-embed] (text summary of embed) https://indieweb.org/embed

# H1 #

 

## H2 ##

 

### H3 ###

 

#### H4 ####

 

H1

== — any 2+ =s in order to approximate length under heading text, similar with - , . , .(sp) below, preceded by blank line or at start, followed by blank line

 

H2

--

 

H3

....

 

H4

. . .

`<code>` inline code use-cases are infrequent, so no replacement currently. plain text inline code examples in the wild welcome.
  code block

  pre-wrap block

/* start of a CSS, JS, C++ or similar code block */

/* end of the same block, trailing blank line required */

 

<!-- start of some markup to display literally -->

<p>some literal markup and text</p>

<!-- end of markup code block, including trailing blank -->

 

Note: *[abbr]: syntax is not in Markdown itself but is from Markdown Extra.
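
A few of the simpler rows of the table can be sketched with regular expressions. This is a minimal illustration under stated assumptions, not a reference implementation: the names render_inline and _delimited are invented here, the input is assumed to be plain text with no HTML in it, "word boundaries on both sides" is read as "no word character or repeated delimiter touching the outside of the pair", and the ABBR expansion is restricted to letters, spaces and hyphens to keep things like (1982) or (URL) from matching.

```python
import re

def _delimited(mark):
    """Pattern for mark + text + mark, with no word character (or second
    mark) touching the outside, and no space just inside the marks."""
    m = re.escape(mark)
    return re.compile(
        rf"(?<![\w{m}]){m}([^\s{m}](?:[^{m}\n]*[^\s{m}])?){m}(?![\w{m}])"
    )

# Italic runs first so the "/" rule never sees the "/" of a closing tag.
INLINE_STYLES = [
    (_delimited("/"), "i"),
    (_delimited("_"), "u"),
    (_delimited("*"), "b"),
]

# All-caps single word followed by a parenthesized expansion.
ABBR = re.compile(r"\b([A-Z]{2,}) \(([A-Za-z][A-Za-z -]*)\)")

def _abbr_markup(match):
    abbr, expansion = match.group(1), match.group(2)
    ident = (abbr + "_" + expansion).lower().replace(" ", "_")
    return (f'<abbr aria-labelledby="{ident}">{abbr}</abbr> '
            f'(<span id="{ident}">{expansion}</span>)')

def render_inline(text):
    for pattern, tag in INLINE_STYLES:
        text = pattern.sub(rf"<{tag}>\1</{tag}>", text)
    return ABBR.sub(_abbr_markup, text)
```

Double asterisks, in-word slashes like and/or, and arithmetic like 2*3*4 are left untouched, which matches the "ignore and discourage" stance above. The discouraged _underlined_phrase_ form is also left as-is by this sketch.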

 

Re-use from Markdown

Aside from reasonable whitespace treatment, there may be other features of Markdown that are worth keeping, such as treatment of lists and blockquotes. In particular I'm explicitly thinking of re-using the following (having used them naturally in plain text and had them work as expected in Markdown).

 

Blockquote

Prefixing a line with "> " indicates a blockquote. Example post of mine with "> " lines that was POSSEd to GitHub Markdown as expected: http://tantek.com/2018/064/t1/

Additional '>' chars for nesting blockquotes, e.g. ">> "

This use of '>' for quoting has been in use by plain text email/usenet since the 1980s if not earlier.
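
The quoting convention is simple enough to sketch directly. The function names below are hypothetical, and the sketch assumes the strict form described above: one or more '>' characters followed by a space (or a bare run of '>' for an empty quoted line).

```python
import html
import re

# One or more '>' then a space and the text, or a bare run of '>'.
QUOTE_LINE = re.compile(r"(>+)(?: (.*))?$")

def quote_depth(line):
    """Return (depth, text); depth 0 means the line is not quoted."""
    match = QUOTE_LINE.match(line)
    if not match:
        return 0, line
    return len(match.group(1)), match.group(2) or ""

def render_blockquotes(text):
    out, depth = [], 0
    for line in text.split("\n"):
        new_depth, content = quote_depth(line)
        while depth < new_depth:          # open blockquotes to go deeper
            out.append("<blockquote>")
            depth += 1
        while depth > new_depth:          # close blockquotes to come back out
            out.append("</blockquote>")
            depth -= 1
        out.append(html.escape(content))
    out.extend("</blockquote>" for _ in range(depth))
    return "\n".join(out)
```

For example, a "> " line followed by a ">> " line produces one blockquote nested inside another, and the next unquoted line closes both.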

 

Auto linking

In general auto-link any URLs. See below for exceptions for auto link embedding instead.

 

Auto link embed URLs

(implemented in http://tantek.com/github/cassis auto_link)

If the URL ends in .jpg, .jpeg, .gif, .png, .svg then make it an img,

Ending with .mp4 .mov .ogv .webm , use a video tag,

Ending with .mp3 .wav, use an audio tag,

Else just hyperlink the URL to itself.
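
The rules above amount to a dispatch on file extension. Here is a minimal sketch of that dispatch; it is not the CASSIS auto_link code, the name auto_embed is made up, and two details are assumptions of this sketch: the extension is checked on the URL path (so a query string does not hide it), and the video and audio tags get a controls attribute.

```python
import html
from urllib.parse import urlparse

IMAGE_EXT = (".jpg", ".jpeg", ".gif", ".png", ".svg")
VIDEO_EXT = (".mp4", ".mov", ".ogv", ".webm")
AUDIO_EXT = (".mp3", ".wav")

def auto_embed(url):
    """Return embed markup for media URLs, or a self-linked URL otherwise."""
    path = urlparse(url).path.lower()     # compare extensions case-insensitively
    href = html.escape(url, quote=True)
    if path.endswith(IMAGE_EXT):
        return f'<img src="{href}" alt=""/>'
    if path.endswith(VIDEO_EXT):
        return f'<video src="{href}" controls></video>'
    if path.endswith(AUDIO_EXT):
        return f'<audio src="{href}" controls></audio>'
    return f'<a href="{href}">{href}</a>'
```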

 

Good ideas

Here are a few additions that I'm convinced are good ideas based on real-world evidence for their need, and real-world publishing experience

 

Hyperlinks with link text

We could allow a hyperlinked text block as such:

 

linktext

(URL)

 

with URL parenthesized with ( ) (English parenthetical applying to previous word/phrase).

 

is converted to:

 

<a href=URL>linktext</a>

 

Details:

  • If linktext ends with ":" (consider any (sentence or phrase termination) punctuation like:".!?,;:-—*/" - others?) then the linktext is NOT linked to the URL. "something: URL" is a common enough plain text pattern that expresses an intent to show the URL visibly inline that it is better left as-is, with the URL visibly linked to itself.
  • Where each is on a line by itself and linktext is the entirety of that line, or inline: linktext URL, or linktext (URL), where linktext is only one word (how else do you know how many words to auto-link? and any use of "" etc. just to mean "link this text" is an abuse of punctuation because the linktext is not a quotation.).
  • If linktext includes underlining: like _link text_ (URL) then the entirety of "link text" is linked, without any explicit underlining as that is presumed to be subsumed by the default hyperlink styling.
  • Use of other style phrasings like *bold* or /italic/ maintain their styling (since they're not part of default hyperlink styling) while permitting inline multiword linktext like: *bold text* URL, or /italic text/ URL.
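
The inline cases can be sketched as two regular expressions. This is only an illustration under assumptions: the names are invented, only http(s) URLs are recognized, the single-word form is limited to a Capitalized word or a hyphenated-phrase (the restriction suggested in the Before and After table), and the linktext is assumed to be plain text that needs no escaping.

```python
import html
import re

URL = r"https?://[^\s()<>]+"

# _link text_ (URL): the whole underlined phrase becomes the link text.
UNDERLINED_LINK = re.compile(rf"(?<!\w)_([^_\n]+)_ \(({URL})\)")

# Capitalizedword (URL) or hyphenated-phrase (URL): one word is linked.
SIMPLE_LINK = re.compile(rf"(?<![\w-])([A-Z]\w*|\w+(?:-\w+)+) \(({URL})\)")

def _anchor(match):
    text, url = match.group(1), match.group(2)
    return f'<a href="{html.escape(url)}">{text}</a>'

def link_text(text):
    text = UNDERLINED_LINK.sub(_anchor, text)
    return SIMPLE_LINK.sub(_anchor, text)
```

A lowercase word before a parenthetical URL is left alone, and so is linktext that ends in punctuation such as ":" because the word must be followed directly by the space and opening parenthesis.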

 

Rejected hyperlinked text ideas

 

linktext

URL


Reason: too likely to have false positives, and URL on its line by itself may be intended as part of a visible linebreaks list.

 

linktext <URL>

 

URL delineated with < > (a common plain text email convention)

 

Reason: CMoS 6.110 discourages use of angle brackets to delineate URLs.

 

linktext (URL)

 

Reason: false positives intending to visibly display parenthesized URL adjacent to linktext, see https://microformats.org/wiki/citation-formats#APA_site_or_profile

 

Hyperlinked quoted names

Quite often named media are quoted, like names of movies, songs, books, albums, even blog posts, and given subsequent URLs, thus it makes sense to hyperlink them. E.g.

 

“Leonardo Da Vinci: The Universal Man” (https://www.amazon.com/gp/video/detail/B08HJP5LKN/)

from https://tantek.com/2023/365/t3/watched-leonard-da-vinci-universal-man

should be auto-linked to:

Leonardo Da Vinci: The Universal Man

 

Headlines of reviews use singlequotes around movie titles, sometimes with a parenthesized year afterwards, e.g.:

'TRON' (1982)

 

while that should be left as-is, it does hint at how to express that sort of thing with a link as well, e.g. perhaps:

'TRON' (1982 https://en.wikipedia.org/wiki/Tron)

or

'TRON' (1982, https://en.wikipedia.org/wiki/Tron)

can be auto-linked to:

'TRON' (1982)

 

This would also work for songs, albums, and books, which are sometimes disambiguated with a parenthesized year.
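
A sketch of the quoted-name patterns above, under assumptions that go beyond what is written here: the function name is invented, only curly double quotes and straight single quotes are recognized, the year is exactly four digits, and the quotation marks (and the parenthesized year, if any) are kept inside the link.

```python
import html
import re

QUOTED_NAME = re.compile(
    r"(“[^”\n]+”|'[^'\n]+')"                      # quoted title
    r" \((?:(\d{4}),? )?(https?://[^\s()]+)\)"    # (URL) or (year[,] URL)
)

def link_quoted_names(text):
    def repl(match):
        title, year, url = match.groups()
        label = f"{title} ({year})" if year else title
        return f'<a href="{html.escape(url)}">{label}</a>'
    return QUOTED_NAME.sub(repl, text)
```

A plain 'TRON' (1982) with no URL inside the parentheses does not match and is left as-is.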

 

There is potential to expand the parenthetical syntax beyond year and link to other citational information, like author, precise date of publication. see citation styles documented in https://microformats.org/wiki/citation-formats#styles for ideas for extending this, or separate citation text formats that are recognizable for auto-marking up with h-cite.

 

This could work for anything that accepts (URL) for hypermedia. E.g.

_Tantek Çelik_ (https://tantek.com/ .h-card .vcard #main-card)

for classnames and ID attributes.

 

The appending of classnames and ID attribute values inside the URL parenthetical would appear like noise to plain text readers so it's not great, but it would be a way to minimally invasively add the ability to add class and ID values to any bit of hypermedia. It could potentially be used for any inline phrase content like:

 

*bold* (.class1 .class2 #id)

 

Which would be discouraged for general text comms usage, however could be useful for folks using this for writing notes or a markdown replacement.

 

Ideally there would be a more human-reader-friendly way (emojis?) to indicate type of a Capitalized Proper Noun (person, place, etc.) which could be turned into a class name in the markup. (not sure how to unobtrusively visually indicate an ID), to avoid many/most uses of class names and ID attributes in normal readable plain text. Emoji space Capitalized Proper Noun?

👤 Tantek -> h-card

📅 IndieWebCamp -> h-event

 

Or we use human readable labels (name:value pairs) derived from citation and library formats, e.g. (Category: cat1, Category: cat2, ID:A123), and then have a separate mapping for a set of common Categories (from libraries) into classnames. Category: Person (or Organization) -> h-card, then a catch-all for random one-word categories to h-* equivalents which would handle some like: Category: Event -> h-event, Category: Feed -> h-feed.

Could also work for additional properties like start/end datetimes (From: time1, To: time2) or shorter (time1-time2). Needs research into plain text equivalents of specific kinds/pieces of structured data to see which are common enough to be worth doing something with or building on.

 

 

Hyperlinked images

(now supported in https://tantek.com/cassis.js auto_link)

Expanding upon the previous, if the "linktext" were a URL ending in .jpg .gif .png (URL2), then make a hyperlinked image, e.g.:

 

URL2.png

URL

 

where each is on a line by itself, or inline: URL2.png URL

is converted to:

 

<a href=URL><img src="URL2.png" alt=""/></a>

 

Consider instead:

 

URL2.png

(URL)

 

and inline: URL2.png (URL)

 

Alt text for images

(2019-078 supported in https://tantek.com/cassis.js auto_link)

Another variant of the above: if the URL ends in .jpg .gif .png and the text following is not a URL, then use the text following as the image's alt text:

 

URL.png

(alt-text)

 

where each is on a line by itself, or inline: URL.png (alt-text)

is converted to:

 

<img src="URL.png" alt="alt-text"/>

 

Detail: there may be linebreaks (and nested parentheticals!) in the alt-text, however any URL will terminate the alt text (URLs don't belong in alt text).
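
A rough sketch of the alt text rule, including nested parentheticals and the line-break separator. It is not the cassis.js implementation; the names are made up, only the first image URL in the text is handled, and the rule that a URL terminates the alt text is left out to keep the example short.

```python
import html
import re

IMAGE_URL = re.compile(r"https?://\S+\.(?:jpe?g|gif|png)(?!\S)", re.IGNORECASE)

def read_parenthetical(s):
    """Return (inner, length) for a balanced '(...)' at the start of s, else None."""
    if not s.startswith("("):
        return None
    depth = 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return s[1:i], i + 1
    return None

def image_with_alt(text):
    """Convert the first 'image-URL (alt-text)' in text into an <img> tag."""
    match = IMAGE_URL.search(text)
    if not match:
        return text
    src, alt, end = match.group(0), "", match.end()
    # The alt text may follow after one space or one line break.
    if text[end:end + 1] in (" ", "\n"):
        parsed = read_parenthetical(text[end + 1:])
        if parsed:
            alt, length = parsed
            end += 1 + length
    img = f'<img src="{html.escape(src)}" alt="{html.escape(alt)}"/>'
    return text[:match.start()] + img + text[end:]
```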

 

Hyperlinked images with alt text

(2019-078 supported in https://tantek.com/cassis.js auto_link)

Combine the previous two to allow hyperlinked images with alt text

 

Since alt text is necessarily more strongly tied to image contents, it should be closer to image, whereas an image could be reasonably hyperlinked to various different URLs

 

URL2.png

(alt-text)

URL

 

where each is on a line by itself, or inline: URL2.png (alt-text) URL

is converted to:

 

<a href="URL"><img src="URL2.png" alt="alt-text"/></a>

 

Details: there may be linebreaks (and nested parentheticals!) in the alt text, however any URL will terminate the alt text (URLs don't belong in alt text). Separators may be mixed space and linebreak, e.g.

URL2.png (alt-text)

URL

 

or

 

URL2.png

(alt-text) URL

 

 

Note: previous alternative - not as good since it separates the (alt-text) from the image

inline: URL2.png URL (alt-text)

or:

URL2.png

URL

(alt-text)

 

Consider instead:

 

URL2.png

(alt-text)

(URL)

 

but the two parenthesized lines in a row look awkward.

 

URL2.png

(alt-text URL)

 

or also allowing

 

URL2.png

(URL alt-text)

 

or inline: URL2.png (alt-text URL)

or URL2.png (URL alt-text)

 

To be forgiving of allowing either order since it may be easy to forget a specific order.

 

Also allow comma delimited inside the parentheses for syntactic sugar:

 

URL2.png

(alt-text, URL)

 

or also allowing

 

URL2.png

(URL, alt-text)

 

or inline: URL2.png (alt-text, URL)

or URL2.png (URL, alt-text)

 

Make similar adjustments for Video with hyperlink, Video with poster, Video with poster and hyperlink.

 

Perhaps the parenthetical space/comma delimited items have potential beyond two values, especially if each item inside has a "type" that can imply its usage, e.g.

 

URL2.png (alt-text, URL, .class, #id)

video-URL.mp4 (alt-text, poster-URL.jpg, URL, .class, #id)

 

Only works to have one "free text" item inside (and only for alt-text for a media URL), all other items must have no spaces, or be URLs to poster image(s), or alternate formats, or a page hyperlink.
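
The idea that each item's shape implies its role can be sketched as a small classifier. The function name, the role names in the returned dict, and the comma-only splitting are all assumptions of this sketch; the space-delimited forms are not handled, and alt text that itself contains a comma would be cut short.

```python
import re

IMAGE_EXT = (".jpg", ".jpeg", ".gif", ".png")

def classify_items(parenthetical, media_is_video=False):
    """Sort comma-delimited parenthetical items into roles by their shape."""
    roles = {"alt": None, "href": None, "poster": None, "classes": [], "id": None}
    for item in (part.strip() for part in parenthetical.split(",")):
        if re.fullmatch(r"https?://\S+", item):
            is_image = item.lower().endswith(IMAGE_EXT)
            if media_is_video and is_image and roles["poster"] is None:
                roles["poster"] = item        # first image URL after a video
            else:
                roles["href"] = item          # any other URL is the hyperlink
        elif re.fullmatch(r"\.[\w-]+", item):
            roles["classes"].append(item[1:])
        elif re.fullmatch(r"#[\w-]+", item):
            roles["id"] = item[1:]
        elif item and roles["alt"] is None:
            roles["alt"] = item               # the one free-text item
    return roles
```

Because roles come from shape rather than position, (alt-text, URL) and (URL, alt-text) give the same result, which is the forgiving behavior described above.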

 

 

Video with hyperlink

Expanding upon Hyperlinked images, if the URL2 ended in .mp4 .mov .ogv, then make a video with fallback hyperlink, e.g.:

 

URL2.mp4

URL

 

where each is on a line by itself, or inline: URL2.mp4 URL

is converted to:

 

<video src="URL2.mp4"><a href=URL>a video</a></video>

 

Video with poster

Expanding upon the previous, if URL ended in .jpg .gif .png, then make a video with poster image, e.g.:

 

URL2.mp4

URL.jpg

 

where each is on a line by itself, or inline: URL2.mp4 URL.jpg

is converted to:

 

<video src="URL2.mp4" poster="URL.jpg">a video</video>

 

Question: should it also automatically make a fallback image?

<video src="URL2.mp4" poster="URL.jpg"><img src="URL.jpg" alt="a video"/></video>

 

Video with poster and hyperlink

 

Expanding upon the previous, if there was a third URL3 in-between URL2 and URL, that ended in .jpg .gif .png, then make a video with poster image URL3 and fallback hyperlink, e.g.:

 

URL2.mp4

URL3.jpg

URL

 

where each is on a line by itself, or inline: URL2.mp4 URL3.jpg URL

is converted to:

 

<video src="URL2.mp4" poster="URL3.jpg"><a href=URL>a video</a></video>

 

Question: should it make a fallback hyperlinked image?

<video src="URL2.mp4" poster="URL3.jpg"><a href=URL><img src="URL3.jpg" alt="a video"/></a></video>

 

Detail: separators may be mixed space / linebreak. e.g.

URL2.mp4 URL3.jpg

URL

 

or

 

URL2.mp4

URL3.jpg URL
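
The three video variants above differ only in which URLs follow the video URL, so one function can cover them. A minimal sketch, with a made-up function name and without the optional fallback image raised in the questions above:

```python
import html

VIDEO_EXT = (".mp4", ".mov", ".ogv")
POSTER_EXT = (".jpg", ".gif", ".png")

def video_markup(run):
    """Build <video> markup from a run of one to three URLs."""
    parts = run.split()   # spaces and line breaks both separate the URLs
    if not 1 <= len(parts) <= 3 or not parts[0].lower().endswith(VIDEO_EXT):
        return None
    poster, link = None, None
    for url in parts[1:]:
        if poster is None and link is None and url.lower().endswith(POSTER_EXT):
            poster = url      # an image URL right after the video is the poster
        elif link is None:
            link = url        # the remaining URL is the fallback hyperlink
    poster_attr = f' poster="{html.escape(poster)}"' if poster else ""
    fallback = f'<a href="{html.escape(link)}">a video</a>' if link else "a video"
    return f'<video src="{html.escape(parts[0])}"{poster_attr}>{fallback}</video>'
```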

 

 

Lists

Design based on monospace typing/display conventions, e.g. classic email archives. Following guidance from CMoS 6.127, 6.130

 

* unordered

* list

* items

  * with two-spaces-indented

  * nested list items

    with two spaces further indented continuations

 

1. ordered numbered

2. lists

   a. with three-spaces-indented

   b. lettered items

      and three spaces further indented continuations

 

 1. unnested but space indented single digit

 2. numbered

 3. list

 4. items

 5. imply

 6. count

 7. of

 8. at

 9. least

10. ten

11. items

 

Additional numbering styles e.g.: Roman I. II. III. IV. …, Capital A. B. C. …, trailing paren a) b) c) … i) ii) iii) iv) …, parenthesized (1) (2) (3) … (a) (b) (c) …
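
Recognizing the markers listed above is mostly a matter of one pattern. The sketch below is an assumption-laden illustration: the names are invented, Roman numerals are limited to the letters i v x l c, and no attempt is made to tell a Roman "i." from a lettered "i." or to rule out sentences that start with an initial such as "A. Smith".

```python
import re

# A label is a number, a Roman numeral, or a single letter.
LABEL = r"(?:\d+|[ivxlc]+|[IVXLC]+|[a-z]|[A-Z])"

LIST_ITEM = re.compile(
    rf"^(?P<indent> *)(?P<marker>\*|{LABEL}\.|{LABEL}\)|\({LABEL}\)) (?P<text>.+)$"
)

def parse_list_item(line):
    """Return indent, marker and text for a list item line, else None."""
    match = LIST_ITEM.match(line)
    if not match:
        return None
    marker = match.group("marker")
    return {
        "indent": len(match.group("indent")),
        "ordered": marker != "*",
        "marker": marker,
        "text": match.group("text"),
    }
```

The indent width is returned as-is so that a caller can apply the two-space (unordered) and three-space (ordered) nesting conventions shown above; indented continuation lines have no marker and return None.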

 

Consider run-in (AKA inline) lists, if/when we get HTML markup for them like a <list> element (better than <il> bc too similar to <li> typo, or <rl> too close to Ruby markup). See CMoS 6.126 for more details.

 

 

Aside

Design based on plain text usage of "Aside: " as a sentence prefix, or a paragraph prefix.

Replace it with <aside> markup, perhaps with a nested element surrounding the "Aside: " prefix, so that it can be hidden with CSS if desired, e.g. if there’s CSS to style the aside element as a float, or with a background or both to distinguish it from the rest of the text.

 

A variant to consider: "(Aside:" … ")" — similarly convert it to an <aside> element, except drop the parentheses. This could also be used for a multiparagraph aside.
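
Both forms can be sketched together. The function name and the aside-label class name are hypothetical; the class is only there so that the prefix can be hidden with CSS as described above.

```python
import html

def render_aside(paragraph):
    """Convert an 'Aside: ...' or '(Aside: ...)' paragraph to <aside> markup."""
    prefix = "Aside: "
    body = paragraph
    if body.startswith("(" + prefix) and body.endswith(")"):
        body = body[1:-1]     # parenthesized variant: drop the parentheses
    if not body.startswith(prefix):
        return None
    rest = html.escape(body[len(prefix):])
    return f'<aside><span class="aside-label">{prefix}</span>{rest}</aside>'
```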

 

 

Figure image with alt text and caption

 

Plaintext figures, another block formatting use-case (presume full-width for typical use-case and mobile-centric)


Like linked images with alt text, followed by a line/paragraph of caption text, except with a double line break (blank line) both before and after:

 

imageurl.jpg
(Alt text)
Caption

 

imageurl.jpg (Alt text)
Caption

 

turns into:

<figure><img src="imageurl.jpg" alt="Alt text" />

<figcaption>Caption</figcaption>

</figure>

imageurl.jpg (Alt text, hyperlink)
Caption

imageurl.jpg (hyperlink, Alt text)
Caption

 

turns into:

<figure><a href="hyperlink"><img src="imageurl.jpg" alt="Alt text" /></a>

<figcaption>Caption</figcaption>

</figure>

 

Postpone: for now not worrying about hyperlinking the entire figure+caption unless there are substantial examples showing a need.

 

 

Not doing: all on one line like: imageurl.jpg (hyperlink, alt text) Caption, because likely false positives, and Figures in common presentation typically have their caption below the figure, thus it's reasonable to expect (depend on) that in the plain text version
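
The figure forms above can be sketched as a single pattern over one block of text. Assumptions in this sketch: the block has already been separated from its neighbors by blank lines, the caption is a single line, the parenthetical is comma-delimited, only http(s) hyperlinks are recognized, and the function name is invented.

```python
import html
import re

FIGURE = re.compile(
    r"^(?P<src>\S+\.(?:jpe?g|gif|png))[ \n]\((?P<paren>[^()\n]+)\)\n(?P<caption>[^\n]+)$",
    re.IGNORECASE,
)

def render_figure(block):
    """Convert 'image (alt[, link])' plus a caption line to <figure> markup."""
    match = FIGURE.match(block)
    if not match:
        return None
    alt, href = "", None
    for item in (part.strip() for part in match.group("paren").split(",")):
        if re.fullmatch(r"https?://\S+", item):
            href = item       # a URL inside the parenthetical is the hyperlink
        else:
            alt = item        # anything else is the alt text
    src = html.escape(match.group("src"))
    img = f'<img src="{src}" alt="{html.escape(alt)}" />'
    if href:
        img = f'<a href="{html.escape(href)}">{img}</a>'
    caption = html.escape(match.group("caption"))
    return f"<figure>{img}\n<figcaption>{caption}</figcaption>\n</figure>"
```

The all-on-one-line form returns None because the pattern requires the caption on its own line, in keeping with the "Not doing" decision above.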

 

 

Other ideas

Here are a few possible additions that I'd only really want to consider after there was sufficient real-world evidence for their need, and perhaps some informal plain text publishing experience first.

 

More header styles

Added and incorporated more header styles like a line of periods or spaced periods into summary table based on positive feedback in person 2019-077.
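
For reference, the four underline styles from the summary table can be recognized as follows. This is a sketch with invented names, and it makes one assumption beyond the table: an underline at the very end of the text counts as "followed by blank line".

```python
import html
import re

# Underline pattern -> heading level, as in the Before and After table.
UNDERLINES = [
    (re.compile(r"={2,}"), 1),
    (re.compile(r"-{2,}"), 2),
    (re.compile(r"\.{2,}"), 3),
    (re.compile(r"\.(?: \.)+"), 4),
]

def heading_level(underline):
    for pattern, level in UNDERLINES:
        if pattern.fullmatch(underline.strip()):
            return level
    return None

def render_headings(text):
    lines = text.split("\n")
    out, i = [], 0
    while i < len(lines):
        level = heading_level(lines[i + 1]) if i + 1 < len(lines) else None
        blank_before = i == 0 or lines[i - 1].strip() == ""
        blank_after = i + 2 >= len(lines) or lines[i + 2].strip() == ""
        if level and lines[i].strip() and blank_before and blank_after:
            out.append(f"<h{level}>{html.escape(lines[i].strip())}</h{level}>")
            i += 2            # skip the underline as well
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)
```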

 

 

Partial Implementation

 

 

Other possible needs

Whether incomplete or postponed, gathering a few additional use-cases here for further consideration, iteration, or explicit postponement.

 

Inline code snippets

Though I do use inline code markup in HTML and wiki syntax quite often, I'm not convinced this is a general need that deserves a special syntax.

Certainly the common use of backticks (`code`) is ugly and a source of plain text punctuation pollution I wish to avoid.

 

Might possibly need a (better than Markdown) solution to embedding inline code snippets. Gathering examples for now:

  • class names inline in documentation, e.g.:
    • validate your h-card with indiewebify.me, and make sure you have a u-url to your home page!
  • examples of HTML attributes with values inline in documentation, e.g.:
    • make sure to put rel=me on hyperlinks on your home page to your other profiles

 

Feels like inline code may be a special form of inline quotations, so maybe there's a possibility there. Perhaps that's how the use of backticks developed, except backticks are meaningless or even confusing to typical readers. In prose CSS discussions I have seen the names of HTML attributes quoted with a single straight quote like: 'class' or 'href' or 'rel'.

 

Table layout & styling

Maybe only possible temporary placeholder guidance for a 1.0: use preformatted text feature to preserve ASCII plain text table presentation, then later extend to incorporate table semantics.

Looking for a good "comparison of table markup across lightweight markup languages" but haven’t found one. Would like an analysis of plain text tables in e.g. email archives. Also worth exploring and documenting text style guides for tables (e.g. Chicago Manual) and Tufte’s guidance for presenting tables. https://people.inf.ethz.ch/markusp/teaching/guides/guide-tables.pdf has some examples and analysis.

 

I remember ages ago brainstorming with Tab Atkins about how to extend the HTML PRE element for "simple" use to wrap plain text ASCII tables, or maybe CSV, and maybe indicate enough information (delimiters?) to parse CSV to then present a TABLE-like DOM to access rows, columns, cells of data inside. He wrote up a proposal for using PRE with CSV tables: https://www.xanthir.com/etc/csv.html 

 

Naming

Naming is hard. Nonetheless people have asked me for a name for this fixed, corrected, or replacement Markdown, so here are some thoughts (that I've searched and not found any critical collisions for):

  • tmark
  • VITA / -vita- a mnemonic for the order of video image text-alt/alink or text alink
    • this is growing on me, especially for the mnemonic usefulness
  • VITAL - similar mnemonic, Video Image Text-Alternative Link, and when fully expanded:
    • VITAL (Video Image Text-Alternative Link) - it also expresses the ABBR (Abbreviation) construct.
  • textup — focusing on the positive, uplifting text, rather than being down about mark(up)
  • writeup — similarly, focused on writing rather than markup or not marking up

Name to frame it as explicitly excluding markup:

  • markdownt
  • markdont
  • markdonot / markdonut
  • markzero — distinguish clearly that "down" is not enough, zero apparent markup is the goal, it should look like plain text punctuation and formatting
  • nomark — make it very clear that the key design principle is NO markup, only punctuation that reads well on its own. like NoSQL.
  • zeromark — similarly, zero markup is the goal, like not even "a little bit" (as crept immediately into classic Markdown)
  • ...

Rejected — ideas at one point that don't work

  • MarkDoubt — why inject doubt into a name in any means?
  • CommonText — already overloaded with libraries, functions, npm. too wordy. positive: clear alternative to CommonMark
  • ...

Avoid — naming patterns to avoid

  • negative words — avoid negative associations
  • lots of syllables — 3+ syllables take longer to speak and will likely be shortened if adopted, so shorten in advance
  • very common word or phrase — likely to be overloaded and hard to discover via web search

Good words (or pairs of words) to consider — short, positive, 1-2 syllables, evocative of functionality

  • text
  • write
  • link
  • rich
  • mark — potential association / complementary / competitive with "markdown"
  • chat — association with broad implementations
  • ...

 

 

Past improvement ideas

Some ideas for improving Markdown that seemed like decent incremental improvements, but later I decided they weren't that much better, or far more drastic solutions/changes were needed.

 

Expand Link Styling Syntax

Update: the Markdown link syntax is not like anything anyone ever types in email. Ditch it.

Previous thoughts on improving it:

Current Markdown link styling:

 [example](http://example.com/ "Title")

Suggested additions:

 [example](http://example.com/ "Title" .class1 .class2 #id rel=rel-value boolean_attribute attribute1=one-word-value attribute2="quoted multi-word value")

 

Re-using .class #id from CSS and jQuery.

 

Re-using rel=rel-value boolean_attribute attribute1=one-word-value attribute2="quoted multi-word value" from simple HTML attribute syntax.

 

In all cases, all the "extra" information is contained in the parentheses, and thus reasonably easy to always skip when just reading over the content.

 

If there is no URL, then just use span for the markup rather than an a href.

If there is no URL but there is a src attribute, then just use an img (or video or audio if the src has an extension that better maps to those.)

 

Allow multiple URLs, the first for an href, and a second URL for the src of a linked img (or video or audio if the second URL has an extension that better maps to those). Similarly if there is a URL and an explicit src= attribute (for the second URL).  Could even allow multiple src= attributes, turning them into source elements to provide multiple video or audio embedded sources. A final image src could provide a fallback image.

 

Adoption

Markdown appears to be growing in popularity among the IndieWeb crowd, as both an authoring format, and a storage format. However, every time I've looked at it, the above-mentioned problems irked me sufficiently to not want to adopt it, and I've stubbornly stuck with using HTML (currently HTML5+hAtom) as my de facto structured storage format (e.g. for tantek.com posts). I'm considering forking Markdown and making the improvements noted above.

 

Comparison

Compare this simplification effort and syntaxes with the others listed at https://en.wikipedia.org/wiki/Lightweight_markup_language#Comparison_of_language_features (https://en.wikipedia.org/wiki/Category:Lightweight_markup_languages) and see which if any additional features are worth considering (re)including in this syntax, to show up "green" in that table. Consider making a more comprehensive feature comparison table starting with the features listed in the Before/After table near the top of this page, perhaps similar to https://docs.asciidoctor.org/asciidoc/latest/asciidoc-vs-markdown/ 

 

See Also
