Text Mining & Analysis @ Pitt

An introduction to text mining/analysis and resources for finding text data, preparing text data for analysis, methods and tools for analyzing text data, and further readings regarding text mining and its various methods.


Text mining, also referred to as text analysis, is the process of examining texts to discover new information or answer specific research questions, using algorithms that can quickly identify facts, patterns, and relationships in large collections of documents (e.g., emails, social media posts, blog posts, books, articles, and diary entries). This information can be converted into structured forms for visualization using charts, graphs, mind maps, word clouds, and more.
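As a rough illustration of turning unstructured text into a structured form, the sketch below counts word frequencies across a small collection using only the Python standard library. The sample documents, the stopword list, and the function names (tokenize, term_frequencies) are hypothetical and chosen for illustration; this is one simple technique among many, not a method prescribed by this guide.

```python
import re
from collections import Counter

# Hypothetical sample collection; real projects would load many more documents.
documents = [
    "The library offers text mining workshops.",
    "Text mining finds patterns in large text collections.",
    "Workshops cover text analysis tools.",
]

# A tiny, illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "in", "a", "of", "and"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def term_frequencies(docs):
    """Count how often each word appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


frequencies = term_frequencies(documents)

# The structured result: (word, count) pairs that could feed a chart or word cloud.
top_terms = frequencies.most_common(3)
print(top_terms)
```

A table of counts like this is the kind of structured output that can then be passed to a charting or word cloud tool.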

Text mining differs from manual text analysis in that the analytical processes are automated and applied to collections of texts that are usually too large for people to read and analyze by hand. The algorithms, math, and statistics used in text mining also enable more quantitative analysis and can uncover information that human readers easily miss. However, text mining is most useful when its results are combined with manual analysis and critical interpretation.


Text Mining vs Text Analysis: What's the Difference?


Text analysis is, more or less, a synonym for text mining:

“Text mining began with the computational and information management fields (e.g. database searching and information retrieval), whereas Text analysis began in the humanities with the manual analysis of text, (e.g Bible concordances and newspaper indexes). More recently, the two terms have become synonymous, and now generally refer to the use of computational methods to search, retrieve, and analyse text data.” 

Credit: Felicity Berends, “Library Guides: Text Mining & Text Analysis: Introduction.”

This Guide


This guide provides information and resources for each stage of text mining and analysis, from gathering text data to visualizing the results of your analysis:


