It might be nice if the author qualified "most of the freely available data on the internet" with "whether or not it was copyrighted" or something to acknowledge the widespread theft of the works of millions.
Theft is the wrong term; it implies that the original is no longer available. It's copyright infringement at most, and possibly fair use depending on jurisdiction. It wasn't theft when the RIAA went on a lawsuit spree against MP3 copying, and it isn't theft now.