So, what applications are iterative Krylov subspace methods good for? Can you give us a sense of where they're useful, and of what inverse methods in this space look like in general?
The sibling comments are great, so I'll comment in a more direct application-oriented way: Krylov methods provide a way to expose most of the machinery of linear algebra to you by means of only forward matrix multiplication with the relevant matrix.
A cool thing is that this means you can do linear algebra operations with a matrix (including solving linear systems) without ever needing to explicitly construct said matrix. You just need a recipe describing the action of multiplying by the matrix in order to do more advanced things with it. This is why they are so popular for sparse matrices, but their suitability goes beyond just that.
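To make that concrete, here's a minimal matrix-free sketch (Python/SciPy; the operator, sizes, and names are made up for illustration). The bidiagonal matrix A is never stored; we only supply a function computing A @ v, and GMRES solves Ax = b through it:

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    n = 1000

    def matvec(v):
        # Action of a lower-bidiagonal A: 2 on the diagonal, -1 below it.
        out = 2.0 * v
        out[1:] -= v[:-1]
        return out

    # A exists only as this action; no n x n array is ever formed.
    A = LinearOperator((n, n), matvec=matvec, dtype=np.float64)
    b = np.ones(n)
    x, info = gmres(A, b)
    print(info, np.linalg.norm(matvec(x) - b))  # info == 0 means converged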
Projecting problems into Krylov spaces also tends to have a noticeable regularizing effect.
Krylov subspace methods have been the top algorithms for years for solving linear systems Ax = b and eigenvalue problems Ax = λx. They basically find x as a polynomial in A: x = a_0 b + a_1 Ab + a_2 A^2 b + ... . Why a polynomial? Because it's easy to construct, and by the Cayley-Hamilton theorem A^{-1} can be represented as a polynomial in A with a finite number of terms.
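A quick numeric illustration of the polynomial claim (NumPy, with a made-up well-conditioned 100x100 matrix): build the Krylov vectors b, Ab, A^2 b, ..., fit the coefficients a_i by least squares, and the resulting x = a_0 b + a_1 Ab + ... nearly solves Ax = b:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    A = np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)  # eigenvalues near 1
    b = rng.standard_normal(n)

    V = [b]
    for _ in range(19):
        V.append(A @ V[-1])        # Krylov vectors b, Ab, A^2 b, ...
    V = np.column_stack(V)

    a, *_ = np.linalg.lstsq(A @ V, b, rcond=None)  # fit coefficients a_i
    x = V @ a                                      # x = a_0 b + a_1 Ab + ...
    print(np.linalg.norm(A @ x - b))               # small residual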
More generally: subspace methods find a solution x as a linear combination of a few column vectors that form a matrix V. So x = Vy, where y is a very small vector. Only if you're lucky will you find AVy = b; in general AVy ≠ b. So instead you require the residual to be perpendicular to V (or something like that): V'(AVy - b) = 0. In the end you solve a very small problem (V'AV)y = V'b for y.
GMRES, for instance, solves AVy = b in the least-squares sense: (AV)'AVy = (AV)'b.
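A toy sketch of both projections (NumPy; here V is just a random orthonormal basis rather than a genuine Krylov basis, to keep it short):

    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 200, 10
    A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
    b = rng.standard_normal(n)
    V, _ = np.linalg.qr(rng.standard_normal((n, m)))  # orthonormal n x m basis

    # Galerkin condition V'(AVy - b) = 0: a tiny m x m solve.
    y_g = np.linalg.solve(V.T @ A @ V, V.T @ b)

    # GMRES-style condition: minimize ||AVy - b||, i.e. (AV)'AVy = (AV)'b.
    y_ls, *_ = np.linalg.lstsq(A @ V, b, rcond=None)

    print(np.linalg.norm(A @ (V @ y_g) - b),
          np.linalg.norm(A @ (V @ y_ls) - b))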
(warning, I am not a numerical computation specialist)
Krylov subspaces are a (the?) common framework for computing eigenvalues and singular values, and they work for sparse matrices as well. IOW, they are extremely useful for most problems that can be expressed as a matrix factorization (e.g. recommendation using collaborative filtering, early PageRank-like algorithms).
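For instance (a minimal SciPy sketch with a made-up random sparse matrix): eigs and svds are Krylov (Arnoldi/Lanczos) methods under the hood and only touch A through matrix-vector products, so a large sparse matrix is fine:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import eigs, svds

    rng = np.random.default_rng(2)
    A = sp.random(5000, 5000, density=1e-3, random_state=rng, format="csr")

    vals, vecs = eigs(A, k=5)   # 5 eigenvalues of largest magnitude
    U, s, Vt = svds(A, k=5)     # 5 largest singular values/vectors
    print(np.abs(vals), s)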
I remember using Conjugate Gradient (https://en.wikipedia.org/wiki/Conjugate_gradient_method) for my 4th-year university project, which involved finding the steady state of large, complex systems (e.g. railway signalling systems) expressed as sparse matrices with millions of rows/columns.
It worked really well and executed faster than other iterative methods such as Jacobi/Gauss-Seidel, but the conjugate gradient method was a bit unstable and at times would not converge to a solution. I think it was due to the high number of multiplications involved, which may cause over-/underflow (but I can't remember).
The matrices themselves are a representation of a Markov chain.
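For anyone curious, a minimal CG sketch (SciPy), using a made-up tridiagonal SPD system rather than the railway/Markov model above. Worth noting that plain CG assumes a symmetric positive definite matrix, which a general Markov-chain matrix isn't, so GMRES or BiCGSTAB are the usual fallback there:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import cg

    n = 1_000_000  # millions of rows/columns is fine for a sparse solver
    main = 3.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    A = sp.diags([off, main, off], [-1, 0, 1], format="csr")  # SPD, tridiagonal
    b = np.ones(n)

    x, info = cg(A, b, maxiter=1000)
    print(info, np.linalg.norm(A @ x - b))  # info == 0 means it converged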