Meanwhile Elsevier, which is widely known for inhibiting scientific progress by setting incredibly high prices even for government-funded research papers, is making a move against SciHub [1] and LibGen [2] again [3].
(Note that the filesizes in the directory listing are all wrong -- that's the original index.html from loc.gov/cds/downloads/MDSConnect/)
This makes it a lot easier to use this dataset at e.g. hackathons, where a lot of people would simultaneously pester that LoC server, which already seemed pretty bandwidth-limited on its own when I downloaded the files.
Having never done this before, I've got to have a look at the internetarchive tool first, but yes, that would work (I'd hate to take your money and then not be able to deliver).
This amount of usage falls well within most VPS companies' free-trial or promo-code offerings and should cost you nothing. Use a throwaway email account and drop it after a month.
AWS will give you a whole year if you haven't tried them yet, and the other popular VPS companies (DO, Linode, etc.) will all give you at least $10 in startup credit. This is probably simpler and faster than figuring out how to receive <$4 from some random internet commenter.
What's the copyright? Would it be legal to unzip those and serve them directly, so archive.org or anyone else can make them more inviting for access?
I know you shouldn't look a gift horse in the mouth, but there's not even an index or a rough idea of what something like "Name Authorities" might mean. That's not what I call wide open doors; that seems more like doing some legally required minimum.
These are 25 million bibliographic index files produced by US federal employees, so yes, they're likely in the public domain.
It's basically an authentication provider maintained by the Library of Congress, which serves to define canonical identifiers for library-catalogued entities, like books and public figures.
I wonder how much more extensive the release could have been were copyright laws not in the way.
Then there's the old question of whether the works under copyright today will ever go into the public domain, or if their copyright will be extended forever by future changes in copyright law.
They didn't have to limit their release just to bibliographic index files. If they wanted to, they could have released manuscripts, letters, newsletters, videos, or any other media they have. But they may have felt inhibited by copyright laws.
So my question is, had copyright laws not been an issue, how much more would they have released?
There is also the larger question of whether the value of copyright law outweighs the value of not having it, so that everyone can benefit from this treasure trove of knowledge.
I don't think these records meet the standard definition of 'media' anyway. This is really just data that can be used for cataloguing purposes and other media custodian/librarian applications.
Given that the LoC has made it their goal to archive at least one copy of everything, I don't think they are quite the right people to put in your anti-copyright crosshairs. However, I do strongly agree with your overall premise.
[1] https://sci-hub.cc/
[2] http://libgen.io/
[3] https://torrentfreak.com/elsevier-wants-15-million-piracy-da...