Well, you could explain them in terms of category theory: a monad is an endofunctor T : C -> C together with two natural transformations, eta : 1 -> T and mu : T^2 -> T, that satisfy a few axioms. The Haskell definition encodes a special case, where the underlying "category" is "Hask", whose objects are Haskell types and whose morphisms are Haskell functions. Of course this is, strictly speaking, not a category, but it is close enough.
In mathematics, monads usually arise from adjunctions, i.e. pairs of adjoint functors. For example, beginning with a set of elements, you can consider the free monoid generated by it and then forget the monoid structure; this gives you a much larger set. If you did this operation on a set of characters, you would get the set of all strings over those characters. In this case eta would be the operation that, given a character in the character set, gives you the corresponding string of length one, and mu would take a string of strings and concatenate them into a single string.
Unfortunately, you've taken it two steps in the opposite direction from helpful. The point is that when your audience is neither Haskellers nor mathematicians (i.e. the vast majority of programmers on Earth), repeatedly explaining monads in Haskell and/or mathematics is not helpful.
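For reference, the axioms in question are usually stated as follows (standard notation, with T\mu and \mu T denoting the two ways of applying mu inside T^3):

```latex
\begin{align*}
\mu \circ T\mu &= \mu \circ \mu T
  && \text{(associativity, as maps } T^3 \to T\text{)} \\
\mu \circ T\eta &= \mu \circ \eta T = \mathrm{id}_T
  && \text{(left and right unit, as maps } T \to T\text{)}
\end{align*}
```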
Imagine if you wanted to learn about monads, but every article went "Monads are a simple and powerful idea that, interestingly enough, can be very, very well expressed in Latin. Therefore, I will switch to Latin for the remainder of this article. Oh... You haven't studied Latin? You really should! It's really very useful. Moving on... Cogitus sin extricatus..."
If you want your audience to understand, you need to explain it in C.
Well, if the question is 'what is a monad', then the most straightforward explanation is that it is an abstract mathematical concept in category theory and that there are certain laws that have to be satisfied. You can even draw really pretty pictures of those laws: http://math.ucr.edu/home/baez/week92.html#tale
The advantage of mathematics is that it cleanly separates the external view and the internal view of a concept. The axioms are easy to state abstractly, and they are the most important part. Haskell only lets you declare the types of the operations abstractly; it cannot enforce the laws. Rather, any instance of the typeclass is assumed to satisfy them.
Those laws also hold for certain constructions in functional programming languages, like lists (the list monad) and several others, not by coincidence, but because those languages have a close connection to cartesian closed categories.
I firmly believe it is not helpful to explain something by analogy, because an analogy only goes so far. The mathematical notion of a monad is not complicated at all and is only obscured by writing page after page about them in the syntax of some arbitrary programming language.
C is probably the worst language to implement monads in, because its type system lacks polymorphism, higher-order functions, and any sort of type-class overloading to clean up the syntax. I mean, you could hypothetically pass around a record full of void function pointers to get around all this, but it would be so ugly and unintuitive that it would be silly to try to explain monads this way.
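For the curious, a rough sketch of what such a record of void pointers might look like. Every name here (monad_ops, nullable_ops, run_twice, halve_if_even) is made up for illustration, and the "instance" is just a nullable pointer, assuming wrapped values are never NULL themselves:

```c
#include <assert.h>
#include <stddef.h>

/* A step takes a plain value and returns a wrapped one (a -> m b). */
typedef void *(*step_fn)(void *value);

/* Hypothetical "dictionary" of monad operations. Everything is void *,
   so the compiler cannot check that the pieces fit together. */
typedef struct monad_ops {
    void *(*unit)(void *value);        /* a -> m a */
    void *(*bind)(void *m, step_fn f); /* m a -> (a -> m b) -> m b */
} monad_ops;

/* One instance: NULL means "no value", anything else means "a value".
   Assumption: a real value is never NULL. */
void *nullable_unit(void *value) { return value; }
void *nullable_bind(void *m, step_fn f) { return m != NULL ? f(m) : NULL; }

const monad_ops nullable_ops = { nullable_unit, nullable_bind };

/* Generic code written only against the record. */
void *run_twice(const monad_ops *ops, void *start, step_fn step) {
    return ops->bind(ops->bind(ops->unit(start), step), step);
}

/* Example step: halve an int in place if it is even, otherwise fail. */
void *halve_if_even(void *value) {
    int *p = value; /* unchecked cast: nothing guarantees this is an int */
    if (*p % 2 != 0) return NULL;
    *p /= 2;
    return p;
}

/* Usage: 12 -> 6 -> 3 succeeds twice. */
void nullable_demo(void) {
    int x = 12;
    int *r = run_twice(&nullable_ops, &x, halve_if_even);
    assert(r != NULL && *r == 3);
}
```

Nothing stops a caller from handing halve_if_even a pointer to a string, which is exactly the ugliness being described.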
I think you have the same idea I do. In a statically compiled language, all of the higher-order polymorphism eventually compiles down to actual, concrete code that should be expressible in hand-specialized C structs and functions performing a single, specific task. Haskell makes that specialization process incredibly convenient and C makes it incredibly inconvenient. But the point is not to ship a bunch of production code quickly. The point is to provide a low-abstraction example to demonstrate concretely what the compiler is doing for you without prefacing the example with "Assuming of course that you have already studied Haskell..."
You'd end up building a small functional runtime. That's a fun project to understand compilers better. But given that Haskell types are erased at runtime, the very thing that makes monads monads wouldn't even be around anymore. All the values would be represented uniformly by some *StgClosure struct, and the whole program would just be a mess of casts and projections into these values.
So, we're talking about Haskell passing around void pointers and later doing cross-your-fingers typecasts on them? That doesn't sound very much like the Haskell I keep hearing about. I was expecting something more like an ungodly stack of C++ templates eventually building up a return type containing a record of all delayed side effects of the function. That C++ template could then be manually flattened to a C struct. It would be a gross amount of manual labor. But it would also be type-safe in plain C.
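To make that concrete, here is a minimal sketch of one such hand-specialization: a "maybe an int" type with its own unit and bind. All names (maybe_int, maybe_int_bind, pipeline, and so on) are invented for this example, and it is not a claim about what any Haskell compiler actually emits:

```c
#include <assert.h>
#include <stdbool.h>

/* Hand-specialized "Maybe int": an int that may be absent. */
typedef struct {
    bool has_value;
    int  value;
} maybe_int;

/* unit: wrap a plain int. */
maybe_int maybe_int_unit(int x) {
    maybe_int m = { true, x };
    return m;
}

/* The absent case. */
maybe_int maybe_int_none(void) {
    maybe_int m = { false, 0 };
    return m;
}

/* bind: run f only if a value is present, otherwise propagate absence. */
maybe_int maybe_int_bind(maybe_int m, maybe_int (*f)(int)) {
    return m.has_value ? f(m.value) : maybe_int_none();
}

/* Two example steps that can fail. */
maybe_int safe_div_100_by(int d) {
    return d == 0 ? maybe_int_none() : maybe_int_unit(100 / d);
}

maybe_int decrement_if_positive(int x) {
    return x > 0 ? maybe_int_unit(x - 1) : maybe_int_none();
}

/* Chain the steps with no explicit error check between them. */
maybe_int pipeline(int d) {
    return maybe_int_bind(maybe_int_bind(maybe_int_unit(d), safe_div_100_by),
                          decrement_if_positive);
}

/* Usage: 100 / 5 = 20, then 20 - 1 = 19; dividing by zero yields "absent". */
void maybe_int_demo(void) {
    maybe_int ok = pipeline(5);
    assert(ok.has_value && ok.value == 19);
    assert(!pipeline(0).has_value);
}
```

The cost is that a "maybe a double" needs its own struct and its own bind written out by hand, which is the inconvenience mentioned above.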
There is a translation from any typed language into an untyped language. Writing code in that untyped language is not going to be type safe, while the code generated (correctly) in that untyped language from the typed language is still guaranteed to be correct.
It is entirely possible that the only way to get anything safe out of some Haskell code is to rely on checks the Haskell compiler gives you at compile time, which the C compiler cannot give you.
That said, people often underestimate the kinds of guarantees you can bang out of a C compiler, at the cost of a bit of verbosity.
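One small illustration of the kind of guarantee I mean, using made-up types (meters, seconds, meters_add): wrapping a plain double in distinct single-member structs makes the C compiler reject accidental mixing.

```c
#include <assert.h>

/* Two structurally identical but distinct types (hypothetical names). */
typedef struct { double value; } meters;
typedef struct { double value; } seconds;

/* Only accepts meters. Passing a seconds value here fails to compile,
   because the two struct types are incompatible in C. */
meters meters_add(meters a, meters b) {
    meters sum = { a.value + b.value };
    return sum;
}

/* Usage: 1.5 m + 2.0 m. */
void meters_demo(void) {
    meters a = { 1.5 };
    meters b = { 2.0 };
    assert(meters_add(a, b).value == 3.5);
}
```

The verbosity is the price: every operation has to be spelled out per wrapper type.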
> So, we're talking about Haskell passing around void pointers and later doing cross-your-fingers typecasts on them?
Following on this thought: Every compiler is essentially an assembler programmer! And we all know how error prone it is to code in assembler. So how can the compiler ever produce error-free binaries?
Seriously, man! Anybody who would understand your explanation is the kind of person who already knows what monads are, and thus does not need your explanation in the first place.