http://www.commoncrawl.org/
Common Crawl Foundation is a California 501(c)3 non-profit founded by Gil Elbaz with the goal of democratizing access to web information by producing and maintaining an open repository of web crawl data that is universally accessible.