As I understand it, the problem lies with making the different subsystems interact: Forking itself is reasonably easy using NtCreateProcess(), but the Win32 subsystem won't know how to deal with the forked process and stuff will break, including console output.
I don't see Microsoft adding forking support to the Win32 subsystem any time soon, so you'd end up rewriting Cygwin from scratch by reverse engineering Interix...