Text Mining & Analysis @ Pitt

An introduction to text mining and analysis, with resources for finding text data, preparing it for analysis, and choosing methods and tools to analyze it, plus further readings on text mining and its various methods.

Sources of Text Data

Before you can start your text mining/analysis project, you'll need to gather text data and build a corpus (an organized collection of texts). Text data can come from many sources: interviews or surveys conducted for a research study, primary sources or journal articles, data sets or corpora created by others, or data extracted from the web. If you are looking for text data for your project, the pages linked below offer resources and tools for some major sources of text data: