A real person wouldn't have to pay to read a random blog, Reddit comments, StackOverflow answers, or code on GitHub (though many open source licenses do not imply a license for training).
They might have to pay for books, or use a library.
Should these cases be treated differently? If so, it might lead to a more closed internet with even more paywalls.
I think those are less of an issue. They want to train on paywalled news articles, magazines, and books, in addition to other media that the average person would have to pay for or that would otherwise have limitations applied.
In my opinion, if any copyright-related rule is applied to books or other paywalled content, it should apply equally to any Joe Shmoe's blog or code on GitHub.