SOCI 5260/6500: Text Analysis
Summer II 2014
M,T,W,Th, 12-1:50pm, Wooten Hall 116, July 7-August 8, 2014
Professor Gabe Ignatow
gignatow@gmail.com
Although this course has a room assigned, it is online-only. We will communicate by email, supplemented by several in-person meetings.
Course description: Social media sites generate massive volumes of natural language data that are available for social science research, and social scientists have developed a number of new technologies for analyzing these data. Researchers are scaling up traditional research techniques to take advantage of new sources of textual data, as well as developing new methods along with new theoretical and metatheoretical frameworks and approaches to research ethics. This course provides a practical guide to contemporary text mining and analysis for the social sciences, covering both qualitative and quantitative text analytic research methods. Our focus in this course is mainly on sociological text analysis methods, including computer-assisted qualitative methods, semantic text analysis methods, and topic models.
Requirements:
1) Completion of weekly assignments (see below)
2) Completion of 10-page final paper
Final paper requirements:
The final paper can be a proposal for a text mining and analysis project, a completed text mining and analysis project, or somewhere in between. For all final papers, students must collect their own data and explain and justify their sampling strategy. For CAQDAS projects, students must develop a coding scheme and apply it to a sub-sample of the larger text sample. For projects using more highly automated methods, students must review relevant text analysis methods and propose a strategy that can yield results relevant to the research question.
10 pages inclusive of full references, 12-pt font, double-spaced
WEEK 1: INTRODUCTION AND TEXT MINING
Steve Stemler, An Overview of Content Analysis
Chris Bail, The Cultural Environment: Measuring Culture with Big Data
Carl W. Roberts, A Conceptual Framework for Quantitative Text Analysis
optional: Sociological Discourse Analysis
Assignments: send by email to gignatow@gmail.com by 12pm Friday July 11
1) Propose one or more research questions that could be approached with text analysis methods
2) Identify 3 or more possible data sources, including newspaper archives, historical archives, social media platforms, websites, or research databases.
(15 points)
WEEK 2: TEXT MINING AND CAQDAS
1. Text Mining
brief overview article
another overview article
Text mining packages (free) (check YouTube for tutorials)
NCapture from NVivo
Helium Scraper
Mozenda
Outwit Hub
Visual Web Ripper
WebHarvy
2. CAQDAS
Katie MacMillan More Than Just Coding
Reporting on the Strategic Use of CAQDAS
Illumination with a Dim Bulb?
Free trials of CAQDAS packages (check YouTube for tutorials)
NVivo
ATLAS.ti
MAXQDA
HyperRESEARCH from Researchware
Assignments: send by email to gignatow@gmail.com by 12pm Friday July 18
1) Scrape or otherwise create a text sample of at least 5000 words. Describe the sample and how you collected it.
2) Write a 1-2-page memo describing possible coding schemes you will use on your data.
(15 points)
WEEK 3: SEQUENCE ANALYSIS METHODS
Franzosi 1987 From Words to Numbers
Franzosi 1998 Narrative Analysis
Carl Roberts A Generic Semantic Grammar
Ignatow 2004 Speaking Together, Thinking Together?
Assignments: send by email to gignatow@gmail.com by 12pm Friday July 25
1) Write 1-2-page reviews of two of this week's articles
2) Write a 1-page update of your progress on your final paper
(10 points)
WEEK 4: SEMANTIC AND SENTIMENT ANALYSIS
Carley and Palmquist 1992 Extracting, Representing, and Analyzing Cultural Models
Ignatow 2007 Culture and Embodied Cognition
Bail 2012 The Fringe Effect
Assignments: send by email to gignatow@gmail.com by 12pm Friday Aug 1
1) Write 1-2-page reviews of two of this week's articles
2) Write a 1-page update of your progress on your final paper
(10 points)
WEEK 5: TOPIC MODELS
August 4 Mohr and Bogdanov 2013 Topic Models--What They Are and Why They Matter
August 5-6 Mohr, Wagner-Pacifici, Breiger and Bogdanov Graphing the Grammar of Motives in National Security Strategy
Assignments:
Email presentations to gignatow@gmail.com and ignatow@unt.edu by 5pm August 7 (10 points)
Final paper due by email by 12pm Friday August 8 (40 points)