I've been frustrated by Markdown previews not supporting Pandoc features, so I created a Pandoc-based Markdown preview for VS Code [1]. The preview supports all Pandoc extensions to Markdown syntax, because Pandoc itself generates the preview. There is also optional support for code execution with Jupyter kernels. I'm currently in the process of adding support for non-Markdown formats (including scroll sync), plus taking advantage of some of the new Pandoc 3.0 features.
Old is new: the editor and the extension are now defunct. The best part of the exercise was that I got so well versed in the Markdown and Pandoc features of the time that I didn't need the preview at all.
Release 3.0 is unusable on Windows. Running `pandoc --version` spawns many instances and never returns. Has anyone else encountered this issue?
Yes, I'm adding support for arbitrary text-based formats, including LaTeX. So it will be possible to write LaTeX and see a live HTML preview generated by Pandoc.
In principle, it should be possible to create a PDF preview with proper SyncTeX support for synchronizing LaTeX source and PDF preview locations, but that gets complicated when Pandoc+LaTeX generate the PDF. It may be best to leave LaTeX-PDF previews to dedicated LaTeX previewers that don't involve Pandoc.
I actually released my extension around the same time that the Quarto extension came out. Quarto is great for documents running R code or needing some of Quarto's advanced document features. My extension has scroll sync and the preview updates live while you type. If you need code execution, you can use multiple Jupyter kernels per document and execute inline code. Also, code execution is non-blocking, so the preview still updates when you type, and code output appears live as it becomes available.
Pandoc is a great piece of software. As a university teacher and researcher, I use it in three ways:
1. I write markdown for my website and for the websites for my research projects and simply generate standalone html out of it. Done.
2. When we create electronic exams, the exam platform takes questions using an HTML-backed rich text editor. We write our exam questions in markdown and create HTML document fragments, which we simply paste into the exam platform.
3. When students take electronic exams, we receive XML files from our exam platform. We use Python to pass submissions on to different submission checkers (akin to autograders or static analysis) and create YAML files with the student submission, grading suggestions, and static analysis annotations. We manually review, grade, and comment within the YAML file (that works incredibly well), collect all the data using Python, and generate markdown reports for each student, including their submission, our comments, and scoring. We pass this markdown through pandoc, creating well-laid-out PDFs, which we either print and hand out or send out electronically.
Pandoc fits our yaml+markdown-based processes very well. Only for the actual research papers we still write LaTeX and build pdfs without pandoc.
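The three uses above correspond to three short pandoc invocations. A minimal sketch, assuming plain Markdown input and caller-supplied file names (the function names are illustrative, not the commenter's scripts): `-s` yields a standalone HTML document, omitting it yields a fragment, and PDF output requires a PDF engine such as pdflatex to be installed.

```shell
# Illustrative wrappers for the three workflows; file names come from the caller.

# 1. Standalone HTML page (-s adds the surrounding <html>, <head>, <body>).
md_to_page() {
    pandoc -s -f markdown -t html -o "${1%.md}.html" "$1"
}

# 2. HTML fragment on stdout, ready to paste into a rich text editor.
md_to_fragment() {
    pandoc -f markdown -t html "$1"
}

# 3. PDF report; pandoc infers PDF output from the .pdf extension and
#    needs a PDF engine (pdflatex by default) on the PATH.
md_to_pdf() {
    pandoc -f markdown -o "${1%.md}.pdf" "$1"
}
```

For example, `md_to_fragment question.md` prints the fragment, and `md_to_pdf report.md` writes `report.pdf` next to the source.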
Interesting! I use a very similar process for creating exams and student projects, but am the only one in my department who does so. Are any of your processes/tools publicly available? (Mine are basically cobbled together in Haskell and Python.)
Sorry, same. There's such a myriad of e-learning platforms in Germany and I guess it's the same for most countries.
I would guess the same goes for our own research static analysis and autograding platform (in our case for SQL), of which probably every CS department has quite a few.
I wouldn't get my hopes up for anything publicly available that fits your platform and has a bus factor higher than 1.
Every now and then, I ponder putting some of my scripts together into something I could actually hand over to someone else, but have not yet had the time.
For the research papers which you write in LaTeX, you should have a look at MonsterWriter.
Disclaimer: I'm the creator of MonsterWriter and very keen to receive feedback and learn about how universities and their students write papers, theses, ...
Though I don't really like you advertising, thank you for the suggestion. As a computer science researcher I'll give you some feedback why your application is a total deal-breaker for me and my colleagues:
* It's not running on Linux. Nobody in our department runs Windows or Mac.
* We already have huge BibTeX citation libraries that we use in papers, and we just reference the necessary papers. These citation library files grow and grow. I won't manually add citations for each paper.
* We collaborate and version through git. If collaborative writing and version control do not work at least as easily as our plaintext-git handling, that's a hard no.
* You do know that conferences and journals provide Word and LaTeX templates with page limits for submissions, right? How would I use, say, LNCS in MonsterWriter? Writing seems not to be page-based. How do I know that I'm over the limit?
* My wife is a researcher in the social sciences, and they extensively use MS Word's change tracking and merging feature to write papers. If MonsterWriter does not support this in an accessible and visually appealing manner, it would be a hard no for her as well.
With your feature set, you're not really targeting researchers, even if you think you do.
Just tried opening it. It looks nice, but I'm going to write some quick, slightly negative comments, based on your claims about using it.
The table formatting is not good enough. It's not obvious how to left-justify a column. It's also not clear how to line a column up along "." (which I often use for numbers). Both of these are fairly easy in LaTeX.
The outputted LaTeX looks OK, but it's not obvious how to format -- most journals, and Universities (for PhDs) will have a fixed style you have to use. I suppose I could take the LaTeX and randomly hack it, but then I need to learn LaTeX to fix any issues that causes.
Regarding the outputted LaTeX, the idea is to grow the number of supported templates, so there would be templates for every important journal. For now the focus is on making the thesis template flexible enough that it works for most bachelor's and master's theses.
For HN reading along, SetApp is a way to distribute apps and get paid outside the app store. Really, that exists.
// Disclosure: Unless you are disavowing your ability as author to offer a recommendation that can be trusted, you probably mean "Disclosure" not "Disclaimer". Disclosure = here is my potential bias. Disclaimer = YMMV, no warranties express or implied.
I love Pandoc. I don't often write "documents for office consumption" but when I do, I just write a markdown file and spit out docx or PDF. I was congratulated more than once on how coherent my documents are in their structure.
It's also not too difficult to hook up a GH Actions job to generate the documents with pandoc and spit them out directly into Dropbox/SharePoint for "non-techie consumption". Great for semi-technical documentation that biz/sales/support people need to be in the loop on.
Oh that's a great idea! I wish I had pandoc in my university days--I ended up writing a lot of (non-technical) papers in latex just because I hated using word for the task.
Out of every tool I’ve ever used to make a .docx file from Markdown, Pandoc is the only one that has consistent results with converting Markdown headers to Word styles rather than just a bigger font size. Lots of Markdown tools in my tool belt, and would love to know of any more that can do this, because it’s really useful on the (unfortunate) occasions something needs to live as a Word doc.
I do wish there was an easy way to create Word document titles from H1s in Markdown. It makes sense that they should be converted to top-level headings, but it adds an annoying bit of friction to my workflow.
Oh really? I've tried `--shift-heading-level-by` in the past and it worked to move headings up a level, but not to the title. I'll have to read the docs more carefully and give it another go. Thank you.
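For reference, the Pandoc manual says that with `--shift-heading-level-by=-1`, a level-1 heading at the very beginning of the document replaces the metadata title, while all other headings move up one level. A minimal sketch, assuming the H1 is the first element of the file (the function name is illustrative):

```shell
# Convert Markdown to .docx, promoting a leading H1 to the document title
# and shifting every other heading up one level.
md_to_docx_with_title() {
    pandoc -f markdown --shift-heading-level-by=-1 -o "${1%.md}.docx" "$1"
}
```

If anything precedes the first H1, the promotion to title does not apply, which may explain why it appeared not to work.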
That's great news. I've been waiting for years for a dedicated 'Figure' element. The workaround was pretty brittle. It'll make pandoc-plot [0] easier to maintain as well.
Does it still automatically generate "smart" quotes (which are anything but) from traditional ones during conversion?
Love the tool, but this is the most awful default setting I've seen in a program in a while, especially if you include any code that depends on quotes not being mangled.
This I agree with. I don't know the exact current status, but having debugged related rendering issues many times over the years, I wish it had always been hard to enable conversion to so-called smart quotes, rather than hard to prevent it.
I should add: the above is about the only quibble I can think of, which is impressive. I love love love pandoc! It's a highly dependable and capable swiss army docs tool. I use it constantly, eg to help generate CLI help text and HTML, man, info and plain text manuals from (mostly) markdown sources. Huge congrats and thanks to the developers for their hard work and for this latest release.
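On the smart-quotes complaint: since Pandoc 2.0 the behavior is controlled by the `smart` extension, which is on by default for pandoc's Markdown reader and can be switched off by appending `-smart` to the format name. A minimal sketch (the function name is illustrative):

```shell
# Convert Markdown to HTML with the `smart` extension disabled, so straight
# quotes, "--", and "..." pass through without typographic conversion.
md_to_html_plain_quotes() {
    pandoc -f markdown-smart -t html "$@"
}
```

Code spans and code blocks are left alone either way, so the flag mainly matters for quotes in running text.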
I love pandoc. With its Lua filters, I love using it for generating HTML and blog posts. One thing that always annoys me about most static website generators is that they make you use some very limited templating language, when I just want to use a proper programming language.
My only irritation -- while I understand why one would want to do it for neatness, it's annoying that the "pandoc" package no longer provides the "pandoc" program! Maybe instead introducing "pandoc-core" and renaming "pandoc-cli" to "pandoc" would be better (it would certainly avoid breaking existing scripts, like mine).
This (I also generate my blog using pandoc). In another case, I wanted to go from Markdown to groff -mom and it was a totally straightforward matter with a custom Writer in Lua.
I've been looking for a tech stack to replace latex for decades now. As a very recent development, the combination of pandoc+weasyprint (plus a little bit of homebrewed pandoc filter magic) has now become good enough for my needs, and I have finally been able to take the plunge. Feels great.
For those who are a little less adventurous and who happen to be in the social sciences, humanities, journalism, etc., pandoc+msword is also definitely worth looking into. It's a much better tech stack than standalone msword. -- It's really only in the STEM fields that, in my mind, there really is no way around latex.
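The pandoc+weasyprint route mentioned above goes through HTML as the intermediate format, so the PDF is styled with CSS rather than LaTeX. A minimal sketch, assuming WeasyPrint is installed and a stylesheet of your own (the function name and arguments are illustrative; the commenter's custom filters are not reproduced here):

```shell
# $1: Markdown source, $2: CSS file used to style the intermediate HTML.
md_to_pdf_weasyprint() {
    pandoc -f markdown --pdf-engine=weasyprint --css "$2" -o "${1%.md}.pdf" "$1"
}
```

For example, `md_to_pdf_weasyprint paper.md print.css` writes `paper.pdf`.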
__shite_templating_compile_source_to_html() {
    # If content has front matter metadata, it is presumed to be in a format
    # that the content compiler can safely process and elide or ignore.
    local file_type=${1:?"Fail. We expect file type of content like html, org, md etc."}
    case ${file_type} in
        html )
            pandoc -f html -t html
            ;;
        md )
            pandoc -f markdown -t html
            ;;
        org )
            pandoc -f org -t html
            ;;
    esac
}
Templates look like this. Notice the $(cat -) in the middle. That's how the HTML content produced by Pandoc gets injected in the middle of everything else.
Since I've rolled my own SSG, I wanted it to compile $FORMAT -> HTML directly, so I write fewer bugs :D
I chose Pandoc because it does a reasonably OK job compiling orgmode, _and_ has good support for other formats I use from time to time (e.g. markdown, ASCIIDOC).
Before this, I was using hugo, with a compile cycle similar to your setup, viz. org -> markdown (via ox-hugo), and then hugo did the md -> HTML thing. hugo sort of supports org -> html, but their batteries-included compiler is not very good. Points for trying, though.
I've been using Pandoc to write Latex-lite for a couple years now. Just write .md files with basic Markdown syntax for all the major text content, and add some Latex when I need to do something more particular. Best of both worlds, really.
One day I wish to see the AsciiDoc(tor) Reader. I'd love to be freed from Ruby as AsciiDoc is superior to Markdown and most other lightweight markup syntax options in features and syntax. This lack of features is why we have an incompatible group of Markdown syntax forks (aka "flavors" to mask that forks are incompatible).
As others wrote, Pandoc is Haskell so it compiles to a fairly efficient binary.
But more importantly, unlike the various Markdown flavors or AsciiDoc, it is incredibly extensible thanks to the combination of custom filters and the possibility to add HTML classes and attributes. One can write filters to leverage the class/attribute information and perform transformations at the AST level, which basically lets you define a DSL with an arbitrary number of custom elements.
I wrote a collection of filters for the publication of a large online legal playbook. Not only did Pandoc make it possible to introduce different kind of custom elements that don't exist in plain Markdown or AsciiDoc, but by using different filters it was possible to use a single Markdown source to generate both the book and various summaries such as a list of examples, a list of civil code clauses etc.
I don't know Haskell that well so I used Rust for the filters, but that worked very well.
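To illustrate the class-driven approach described above, here is a minimal sketch using a Lua filter instead of the commenter's Rust filters. It assumes custom elements are written as fenced divs such as `::: example`, and it keeps only those divs, which is one way to derive a "list of examples" from the same source. The class name, function names, and file paths are all hypothetical.

```shell
# Write a small Lua filter (illustrative, not the commenter's) to the path in $1.
write_examples_filter() {
    cat > "$1" <<'EOF'
-- Collect every Div carrying the class "example".
local examples = {}

function Div(el)
  if el.classes:includes('example') then
    table.insert(examples, el)
  end
end

-- Runs after all Div calls: replace the body with the collected Divs.
function Pandoc(doc)
  return pandoc.Pandoc(examples, doc.meta)
end
EOF
}

# Render only the examples from a Markdown source ($2) using the filter ($1).
list_examples() {
    pandoc -f markdown -t html --lua-filter="$1" "$2"
}
```

Running the same source without the filter produces the full book, so one Markdown file can feed several outputs.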
> Pandoc is Haskell so it compiles to a fairly efficient binary.
This is nebulous. Haskell's compiled binaries are not ideal, for a number of reasons.[^1] GHC does very little to optimise for many typical metrics of "efficient". The binaries it produces are enormous because it (unavoidably) bundles the runtime along with the program itself, and there is a lot of empty space in the binaries. Shrinking them can improve startup times significantly especially on spinning rust drives.
That said, Haskell programs are at least _compiled_, and they do result in binaries which, if well written, can result in running times comparable to (or, sometimes, shorter than) your average hand-rolled C code that achieves the same goals.
Of course, none of this casts any shadow on the fact that Pandoc is, indeed, an excellently engineered piece of software that stands as a testament to the value of Haskell for real-world business logic and problem solving.
It does do some things better and I appreciate calling it a new name and 'starting over' instead of another fork, but what's not covered is metadata. If I want to add author, license, tags, keywords, description, etc. there is no in-document way to do this. Almost all other media format types from images, audio, to other documents like ODF have a way to do metadata and this (and Markdown) doesn't cover said important use case.
Seems there's a long-standing (for the project) open issue where it's still being mulled over.
Imports are also very nice for writing longer texts--especially how AsciiDoc lets you +1 all of your headings so the heading hierarchy works as a standalone document and a part of a larger whole.
I appreciate that you pointed it out though to give me a chance to reevaluate my thoughts on the project. It seems it has a better trajectory than when I had last looked.
I've successfully used it from Clojure (I think it's through JRuby). With a few lines of code you can configure AsciiDoctor to whatever you need. It's way easier than fiddling at the command-line (I couldn't immediately understand how to get extensions and how it played with whatever version of the software I got through `apt`). It'd be good to have alternatives just for the sake of it - but I didn't find anything particularly lacking
The maintainers seem very responsive and active on Github. It's not as nice as a spec and multiple implementations - and I guess you're locked in to one library, but at least it's not as bad as Orgmode - where you're locked in to an editor as well
Yes, the maintainer is great. I have at this point just used Nix and post-processed Asciidoctor instead of trying any sort of other tools, but it gets trickier as you noted if you want to use it inside something else. It's not a compiled binary nor is it a C lib other languages could get at. Much of that could be attributed to the spec being quite.
Though when it comes to annoyance with Markdown forks: AsciiDoctor is basically that to AsciiDoc. It's mostly compatible, but when it isn't, it really bites.
The more important part is that you end up with a binary instead of needing an interpreted language which makes the tooling a mess. Python and Ruby are the same thing to most people.
fantastic software, never build it from source, or if you have to, make sure you have an OS that bundles all the Haskell dependencies into a single meta package
Haskell tooling went from awful to best in class after stack came out. Look for “Quick stack method” on the installing page. It should be easy to build from source now with just a few commands. No doubt it will take a long time to compile all the packages, and you might still have issues tracking down any non-Haskell dependencies (C libraries).
It or one of its libraries did not compile for me OOTB in Gentoo. Granted it's marked as nonstable (~*). That's unfortunate, because I really liked to use it when I was mainly running Debian. Though I didn't really put any debugging effort into making it work.
One new feature that will make Python documentarians happy is the `--list-tables` flag for rST output: you can now convert any table to the list table syntax of reStructuredText, which is, in the opinion of many, superior to classic tables with ASCII borders.
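A minimal sketch of the flag in use, assuming Pandoc 3.0 or later (the function name is illustrative). Input is read from the given files, or from stdin if none are passed:

```shell
# Convert input to reStructuredText, emitting tables as `list-table`
# directives instead of ASCII-bordered grid tables.
to_rst_list_tables() {
    pandoc -t rst --list-tables "$@"
}
```

For example, `to_rst_list_tables -f markdown README.md` converts a Markdown file with pipe tables.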
I'm in a job where I pretty much never need to output a PDF, but whenever the occasional thing comes around, pandoc is always there for me. Such a useful tool.
I use pandoc to convert GitHub style markdown to PDF/EPUB ebooks. The default output is good and there are plenty of customization options too. I didn't know LaTeX/CSS but stitched a few things together with help from stackexchange sites to customize the output produced. Later came to know there are third-party templates that I could've used/started with.
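A minimal sketch of that kind of conversion, assuming a single GitHub-flavored Markdown file and default templates (the function name and arguments are illustrative; the commenter's LaTeX/CSS customizations are not shown):

```shell
# $1: GFM source file, $2: output extension (epub or pdf), $3: book title.
gfm_to_ebook() {
    pandoc -f gfm --toc --metadata title="$3" -o "${1%.md}.$2" "$1"
}
```

For example, `gfm_to_ebook book.md epub "My Book"` writes `book.epub` with a table of contents; the `pdf` variant additionally needs a PDF engine installed.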
Since Pandoc has Lua inbuilt I wonder if it can also run LuaLatex in full? Because then it could support really all features of LaTeX and become a kind of SuperLaTeX.
[1]: Examples and animations: https://codebraid.org/presentations/scipy2022/. Installation for VS Code: https://marketplace.visualstudio.com/items?itemName=gpoore.c.... Installation for VSCodium: https://open-vsx.org/extension/gpoore/codebraid-preview.