I love markdown and use it for all my notes, however it really needs a native wa...

ambivalence · 2025-02-25T23:39:13 1740526753

Markdown is plaintext so you decide what it means. I personally write *italic* and **bold**, so I can use _underline_. Most Markdown to HTML converters would make the last example into italic, but you can customize many of them.

CommonMark doesn't even mention "bold", "italic", or "underline". It just says "emphasis" and "strong emphasis". You can style them however you want.
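
To illustrate the point about customizing a converter, here is a minimal Python sketch that maps `**...**` to strong emphasis, `*...*` to emphasis, and `_..._` to underline. It is not any real converter's API: the name `render_inline` and the regexes are hypothetical, and the sketch ignores most CommonMark edge cases (nested or escaped delimiters, code spans, and so on).

```python
import html
import re

# Order matters: handle ** before * so strong emphasis is not split in two.
_RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
    # Personal convention (an assumption, not CommonMark): _text_ is underline.
    # The lookarounds leave intraword underscores such as snake_case alone.
    (re.compile(r"(?<!\w)_(.+?)_(?!\w)"), r"<u>\1</u>"),
]


def render_inline(text: str) -> str:
    """Convert a single line of lightly marked-up text to HTML."""
    # Escape &, <, > first so literal text cannot inject markup.
    out = html.escape(text, quote=False)
    for pattern, replacement in _RULES:
        out = pattern.sub(replacement, out)
    return out


print(render_inline("*italic* and **bold** and _underline_"))
```

Swapping the last rule's replacement for `<em>\1</em>` would restore the more common reading of underscores as emphasis, which is the sense in which the choice is made at rendering time.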

paulryanrogers · 2025-02-26T00:13:25 1740528805

This kind of undercuts the advantage of a semi-universal format. Though I'd agree underscore wrappers are quite reasonable and natural.

ratorx · 2025-02-26T16:18:08 1740586688

Markdown isn’t really meant to be a universal markup format. Its primary goal is to document conventions of annotating plain text which keep the plaintext semi-consistent and readable.

So the purpose of _, * etc. is purely emphasis. If you need to represent something specific (bold, italic, etc.) then that’s a job for the Markdown parser (or embedded HTML, etc.). The result of the parser (HTML, etc.) will be less human-readable, but actually able to specify formatting.

I agree that CommonMark could be extended, but I think the focus should be on semantic relevance rather than markup specification.

voltaireodactyl · 2025-02-26T04:23:18 1740543798

I love the Fountain spec for exactly this reason. I primarily began using it since it’s Markdown for screenwriting, but it has bold, underline, and italics along with the usual markdown stuff like comments etc. I find it to be by far the best way to write plaintext anything other than code. It’s also a bit more opinionated than Markdown which I highly prefer.

Tagbert · 2025-02-25T23:40:20 1740526820

It might depend on what you want to do with the underline. Does it just indicate some kind of emphasis?

Could you use the convention in your documents that "_" is the underline delimiter? I know that the default is to render it as italic/emphasis but that is just a decision at rendering time. The semantics of emphasize/underline could easily overlap.

Of course if you want 3 levels of emphasis with bold, italic, and underline, then yes you need to look elsewhere.

Markdown isn't really a formatting tool. It is a way to structure text in the minimal way that a person would interpret it and a machine could render it.

fsckboy · 2025-02-26T05:25:54 1740547554

>converting some older books and lectures...If anyone has a good solution I'm all ears.

I don't know if this helps you, but you said "older": in the 20th century world of typewriters--which had no italics--underlining was used as a substitute for italics. Transforming underlines to italics or going the other way was considered normal. You wouldn't use both in the same document.

dredmorbius · 2025-02-26T05:50:43 1740549043

There's notional underlining, which in typewritten documents is effectively the equivalent of italic, and there is typographical underlining, where "underline" means "there is a line under this element and/or text".

Both matter, and although Markdown flavours handle the notional case well, they fall down at this (and several other) typographical capabilities. Expressing text in a particular colour (or greyshade) is another example. It's possible to achieve this in practice through embedded HTML and/or CSS tags, or through augmented Markdown variants (Pandoc's Markdown can achieve some things CommonMark or DaringFireball Markdown cannot).

Ultimately though I find I need to switch to a more capable and consistent text-layout engine, usually LaTeX in my case.

Though for even quite large and modestly complex works, Markdown is either sufficient entirely or is useful in getting the work off the ground before switching to a more powerful option.

fsckboy · 2025-02-26T20:38:06 1740602286

i said "typewriter", and there is only one kind of underline on a typewriter.

converting old typewritten notes, they may contain typewriter underlining, and it may represent italics. Markdown would be entirely sufficient to handle that.

there was no need to de-clarify my comment.

dredmorbius · 2025-02-27T03:26:22 1740626782

The typewriter is a distinct and often intermediate writing device, standing between the markedly free-form though also variable handwriting and the much more standardised, though fairly developed, capabilities of typeset documents.

Unlike handwriting, typewriting is uniform (both in type and spacing), and markedly faster.

Unlike printing, typewriting is limited (generally a single typeface, with no variability in face, size, or styling, e.g., roman, bold, italic), and requires further guidance to define specifically what result is desired where a typewritten work is not a document's final form.

It's worth noting that print itself differs from handwriting: when we write letters, forms and sizes vary; different writers often differ markedly in their own scripts; trained copyists may achieve a high level of standardisation, but that itself requires significant training and is achievable only by a limited number of artisans;[1] and letterforms themselves are not discrete but individually instanced each time they are created. With the advent of moveable-type printing,[2] letterforms became fixed, and with digital typesetting and computer fonts, language-specific forms of what looks like the same shape, say, the Roman A, Greek Α (alpha), and Cyrillic А (Azǔ/Азъ), are represented by distinct code points but are nearly or entirely indistinguishable when rendered on-screen or in print. Further, over the history of both handwriting and typesetting, conventions have emerged for the textual representation of language, including spacing of words (versus scriptio continua), punctuation, paragraphs, page numbering, division of books into chapters, sections, parts, subsections, etc., of lists, tables, indices, (foot|end|side)notes, (parentheses), drop-caps, figure captions, cataloguing, etc., etc. All of those were inventions and conventions not inherent to language, writing, printing, document preparation, or archival and retrieval themselves. There's still considerable variation between different print language representations, e.g., many scripts lack equivalents of italic, bold, or even upper/lower case letterform distinctions.
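
The code-point distinction can be checked directly. The following is a small Python sketch using only the standard library; the helper name `describe` is made up for illustration.

```python
import unicodedata

# Three capital letters that usually render alike but are distinct code points:
# Latin A, Greek Alpha, and Cyrillic A.
lookalikes = ["A", "\u0391", "\u0410"]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in lookalikes:
    print(describe(ch))
# U+0041 LATIN CAPITAL LETTER A
# U+0391 GREEK CAPITAL LETTER ALPHA
# U+0410 CYRILLIC CAPITAL LETTER A
```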

Typewriting itself occupies an interesting space, being a primary endpoint for some types of documents (correspondence, forms, and the like) and an intermediate form for others, most notably published articles and books. Given that typewriting has both capabilities and limitations which aren't present in typeset documents (whether moveable type or digital), it's not possible to draw a distinct correspondence between what a typewriter outputs and how that might be represented in a derived document. Yes, typewriters can generate underlines, but that might be represented in typeset print as italic, bold, underline, or something else entirely. In practice, editors' proofing marks were inserted (as handwritten notations) on a typed manuscript to indicate the preferred presentation, generally following the author's intent and/or the publisher's own house style conventions. See: <https://en.wikipedia.org/wiki/List_of_proofreader%27s_marks>.

________________________________

Notes:

1. An anecdote which sticks with me: among the 1001 Arabian Nights stories is one in which a character makes specific reference not only to his literacy and scribal capabilities, but to the types of scripts he could produce. That is, this was a specific and valued skill of that age worth noting, even in a general-audience work.

2. As distinguished from earlier monoblock printing, in which a whole work was engraved on a wood block or metal plate, typified by the early Pamphilus, seu de Amore, from which we have the word pamphlet; see: <https://www.etymonline.com/word/pamphlet>. Such monoblock prints were more like a photocopied handwritten letter, in which variations in individual letterforms are replicated, than they are like standardised print obtained from moveable type or, more recently and familiarly, computer-based digital typesetting or Web documents, in which fonts are standardised and each given character is identical to all others matching that style.

alpaca128 · 2025-02-25T23:44:26 1740527066

The solution is HTML tags like <u></u>

dredmorbius · 2025-02-26T05:53:54 1740549234

Markdown is often used for, and was originally intended for, HTML generation. But that's not the only target which can be achieved, particularly with such tools as Pandoc, a document format interchange Swiss Army knife.

Relying on format-specific tags imposes stronger constraints on endpoints and/or increases complexity of your document build process.

alpaca128 · 2025-02-26T10:25:47 1740565547

Inline HTML is part of the standard Markdown syntax, not a complication. If your tool doesn't support HTML it doesn't support Markdown. The format can be so simple in the first place because it allows this escape hatch for anything non-trivial. And tools like Pandoc can handle that just fine.

dredmorbius · 2025-02-26T14:00:05 1740578405

My point is that Markdown conversion tools, notably Pandoc, whilst they will incorporate inline HTML when generating HTML endpoints, will not convert such inlined code to other endpoints, e.g., LaTeX, DocBook, OpenDocument, etc.

If you want those outputs to faithfully represent formatting, you either need to juggle multiple inline directives for each desired output format, or find some universal Markdown-based mechanism for achieving the same result.
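
To make the juggling concrete, one hypothetical workaround is a preprocessing pass that rewrites inline `<u>` tags for each output target before the document reaches the converter. The Python sketch below is an assumption-laden illustration, not Pandoc's behaviour: `retarget_underline` and the target names are invented here, only `\underline{...}` is real LaTeX markup, and the sketch does not escape LaTeX special characters.

```python
import re

# Matches a simple, non-nested inline underline tag.
_U_TAG = re.compile(r"<u>(.*?)</u>", re.DOTALL)


def retarget_underline(markdown: str, target: str) -> str:
    """Rewrite inline <u>...</u> for a given (hypothetical) output target."""
    if target == "html":
        # Inline HTML can pass through unchanged for HTML output.
        return markdown
    if target == "latex":
        # Use a function replacement so the backslash is taken literally.
        return _U_TAG.sub(lambda m: "\\underline{" + m.group(1) + "}", markdown)
    # Any other target: degrade to Markdown emphasis so the intent survives.
    return _U_TAG.sub(lambda m: "_" + m.group(1) + "_", markdown)


source = "A <u>key term</u> appears here."
for fmt in ("html", "latex", "docbook"):
    print(fmt, "->", retarget_underline(source, fmt))
```

Every additional target needs its own branch, which is the build-process complexity being described.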

I'd like to make clear that I'm familiar with Markdown; the fact that its original design intent was streamlining HTML generation; that inline "native" code is a feature, not a bug, but all the same a rather fraught one; and that actual practice has moved far beyond Markdown merely being used to generate HTML, not least my own such practice.

I've discussed this situation previously on HN (ironically from the PoV of using LaTeX embeds within Markdown creating problems when attempting to generate other-than-LaTeX outputs), see: <https://news.ycombinator.com/item?id=29690056> (2021).

And I'd asked about the HTML and/or LaTeX conditional generation in a StackOverflow post about seven years ago: <https://stackoverflow.com/questions/4820502a9/pandoc-have-ei...>.

oneeyedpigeon · 2025-02-26T10:44:24 1740566664

Ah, the good old Unarticulated Annotation element!

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/u

grimui · 2025-02-25T23:35:09 1740526509

Not the best solution but you could use the HTML underline tag