In a recent post to the MySQL Performance Blog, there's a pointer to a few resources you can use if you need some sample datasets to run your application against - everything from airline flight information to energy usage data.
Sometimes you just need some data to test and stress things. But randomly generated data is awful - it doesn't have realistic distributions, and it isn't easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you're looking for a couple of megabytes or many terabytes, the following sources of data might help you benchmark and test under more realistic conditions.
The sample data sets vary from fake movie information to sample site traffic data to the large data sets that Amazon provides (including the Human Genome and US Census data). Some of the comments also link to other sources.