SOCI 6203/5200 Text Mining
Semester year
Professor Gabe Ignatow
ignatow@unt.edu
Start date-end date
Overview:
This is a graduate seminar on contemporary text mining and text analysis methods for the social sciences. We will cover principles of research design and research ethics as they apply to text-based social science research, and will review the major methodologies within social science text mining, including topic models and opinion mining.
Course Objectives:
Our goals for the course are to survey major contemporary approaches to social science text mining and for students to develop a preliminary text mining research project of their own.
Prerequisites:
None. However, experience with social science research methods and research design is preferred.
Minimum Technology Requirements and Skills to Function in the Course:
Basic reading, writing and computer skills, including the ability to access and search research databases and to download and learn consumer software packages, are required to function in this course.
Assignments:
Assignments include reaction memos, presentations, discussion posts reacting to other students’ presentations, and a final paper for 6203 students. The final paper is a preliminary project in the form of a grant proposal, including a research question, literature review, research design, and designation of data and method of analysis. The paper is expected to include a preliminary analysis of data along the lines of a pilot study. Feedback on weekly memos and assignments is provided in Canvas and individual meetings can be scheduled with the instructor as needed.
Point Values for Assignments:
*The main difference between 6203 and 5200 is that a final paper is not required for 5200. For 5200 the proposal draft counts as the final paper.
Grade Scale:
A: 90-100
B: 80-89
C: 70-79
D: 60-69
F: <60
Learning Outcomes
1. Learn major approaches to text mining.
2. Develop an original text mining project.
3. Present original research in a professional manner.
Grading Rubric:
Late Work Policy:
Late work will be penalized at a rate of a 10% reduction in overall score per day late.
Computational Resources
This course does not require a programming background, but students interested in developing their course project in R may wish to use the following resources:
You may also want to check out some of the e-books published by Hadley Wickham or the blogs at https://www.r-bloggers.com/.
This is a graduate-level course on contemporary text mining and text analysis methods for the social sciences. Your goals for the course are to learn the major contemporary approaches to social science text mining and to develop your own text mining pilot research project.
Assignments include reaction memos, presentations, and a final paper. The final paper is a preliminary project in the approximate form of a grant proposal, including a research question, literature review, research design, and designation of data and method of analysis. The paper is expected to include a preliminary analysis of your data along the lines of a pilot study.
Assigned Readings
Ignatow and Mihalcea. 2018. An Introduction to Text Mining. Sage. ISBN 978-1506337005
Schedule Overview
Feedback on weekly memos and assignments by email
Individual Skype/FaceTime meetings as needed (email ignatow@unt.edu to schedule a meeting)
Memos
Four memos reacting to one or at most two of the week’s readings, due by 6pm Mondays.
Memo format:
1. Summary and critique: minimum 400 words for 6203 students, 300 words for 5200 students
2. Discussion of applicability of readings to your own project: minimum 400 words for 6203 students, 300 words for 5200 students
Presentations
5-slide presentations, to be uploaded to Canvas, for your pre-proposal and proposal
Part I. Foundations of Social Science Text Mining
WEEK 1: Introduction: History and Ethics of Text Mining
- ITM Chapters 1 and 3
Assignment: None
WEEK 2: The Philosophy and Logic of Text Mining
- ITM Chapter 4
- Ignatow, “Theoretical Foundations for Digital Text Analysis”
- Wagner-Pacifici, Mohr and Breiger “Ontologies, methodologies, and new uses of Big Data in the social and cultural sciences”
Assignment: Regular Memo
WEEK 3: Research Design Principles
- ITM Chapter 5
- Jorge Ruiz Sociological Discourse Analysis
- Bauer, Martin W. and Bicquelet, Aude and Suerdem, Ahmet K. 2014. Text Analysis: An Introductory Manifesto. In: Bauer, Martin W. and Bicquelet, Aude and Suerdem, Ahmet K. (eds.) Textual Analysis. SAGE Benchmarks in Social Research Methods, 1. Sage, London, UK.
- Carl W. Roberts, A Conceptual Framework for Quantitative Text Analysis
Assignment: Regular Memo
Part II. Acquiring Texts
WEEK 4: Scraping and Crawling
- ITM Chapter 2, Appendix A
- Read software surveys in Canvas
Assignment: data mining memo: Scrape or otherwise create a text sample of at least 5000 words. Write a 500-word memo describing the sample and how you collected it as well as possible coding schemes you could use on your data.
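For students who assemble their sample computationally, the 5000-word minimum can be checked with a few lines of Python. This is only a minimal sketch: the function names `count_words` and `meets_minimum` are hypothetical, the two example strings stand in for your own collected texts, and the definition of a "word" used here (a run of letters, digits, or apostrophes) is an assumption rather than a course requirement.

```python
import re


def count_words(text: str) -> int:
    """Count word tokens, defined here as runs of letters, digits, or apostrophes."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def meets_minimum(documents: list[str], minimum: int = 5000) -> bool:
    """Return True if the documents together contain at least `minimum` words."""
    total = sum(count_words(doc) for doc in documents)
    return total >= minimum


# Placeholder texts; replace with the documents you scraped or collected.
sample = [
    "Text mining turns documents into data.",
    "It's a growing field.",
]

total_words = sum(count_words(doc) for doc in sample)
print(f"Total words: {total_words}")
print(f"Meets 5000-word minimum: {meets_minimum(sample)}")
```

A word processor's word count works just as well; the sketch is mainly useful when the sample is spread across many files.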
WEEK 5: Pre-Proposals
Assignment: 4-page pre-proposal draft, 5-slide presentation, and discussion post on two students’ pre-proposal presentations
The 4-page pre-proposal draft must include: research question, inferential logic, type of data, possible sources of data, selection strategy, sampling strategy
Part III. Methods
WEEK 6: Thematic Analysis
- ITM Chapter 11
- Boyatzis, Richard E. Transforming Qualitative Information: Thematic Analysis and Code Development. Thousand Oaks, CA: Sage Publications, 1998.
- Braun, Virginia, and Victoria Clarke. 2006. "Using Thematic Analysis in Psychology." Qualitative Research in Psychology 3(2): 77-101.
- Jones, M.V., Y. Coviello and Y.K. Tang. 2011. International Entrepreneurship research (1989–2009): A domain ontology and thematic analysis. Journal of Business Venturing. 26(6): 632-649.
- Hannah Frith and Kate Gleeson. 2004. “Clothing and Embodiment.” Psychology of Men & Masculinity 5(1): 40-48.
Optional:
Assignment: Regular Memo
WEEK 7: Narrative Analysis
- ITM Chapter 10
- Carl W. Roberts. 2002. A Conceptual Framework for Quantitative Text Analysis. Quality & Quantity 34: 259-274.
- Ann Mische. 2014. Measuring Futures in Action
- Gabe Ignatow 2004 Speaking Together, Thinking Together?
Assignment: Regular Memo
WEEK 8: Metaphor Analysis
- ITM Chapter 12
- Lakoff, G. and Johnson, M. 1980. Metaphors We Live By. Chicago: University of Chicago Press.
- Rudolf Schmitt. 2000. "Notes towards the analysis of metaphor." Forum Qualitative Social Research 1(1).
- Naomi Quinn. 1982. “Commitment” in American marriage: a cultural analysis. American Ethnologist 9(4): 775-798.
- Rees, C.E., Knight, L.V. and C.E. Wilkinson. 2007. "Doctors being up there and we being down here: a metaphorical analysis of talk about student/doctor-patient relationships." Social Science and Medicine 65(4): 725-737.
- Schuster, J., Beune, E., and K. Stronks. 2011. "Metaphorical constructions of hypertension among three ethnic groups in the Netherlands." Ethnicity and Health 16(6): 583-600.
Assignment: Regular Memo
WEEK 9: Opinion Mining
- ITM Chapter 14
- Bail, C. (2012). “The Fringe Effect: Civil Society Organizations and the Evolution of Media Discourse about Islam since the September 11th Attacks.” American Sociological Review 77(6): 855-879.
- Eshbaugh-Soha, M. (2010). “The Tone of Local Presidential News Coverage.” Political Communication 27(2): 121-140.
- Ignatow, G., Zougris, K., and N. Evangelopoulos. 2015. “Sentiment Analysis of Polarizing Topics: Partisan News Site Readers’ Comments on the Trayvon Martin Controversy.” Emerald Studies in Media and Communications 11: 261-285.
Assignment: Regular Memo
WEEK 10: Topic Models
- ITM Chapter 16
- DiMaggio, Nag and Blei. 2013. Exploring affinities between topic modeling and the sociological perspective on culture.
- Papadouka, M. E., Evangelopoulos, N., and G. Ignatow. 2016. “Agenda Setting and Active Audiences in Online Coverage of Human Trafficking.” Information, Communication and Society 19(5): 655-672.
Assignment: Regular Memo
WEEK 11: Proposals
Assignments: 6-page proposal draft, proposal presentation, and discussion posts on two students’ proposal presentations
The 6-page proposal draft must include: research question, inferential logic, data, analysis method, repeated reading or pilot study
Part IV. Analysis
WEEK 12: Discourse Analysis
Readings:
- Edley, N. and Wetherell, M. (1997). “Jockeying for Position: The Construction of Masculine Identities.” Discourse & Society 8(2): 203-217.
- Edley, N. and Wetherell, M. (2001). “Jekyll and Hyde: Men’s Construction of Feminism and Feminists.” Feminism & Psychology 11(4): 439-457.
- Evison, J. (2013). “Turn Openings in Academic Talk: Where Goals and Roles Intersect.” Classroom Discourse 4(1): 3-26.
- Fairclough, N. (1992). “Intertextuality in Critical Discourse Analysis.” Linguistics and Education 4(3-4): 269-293.
- Krishnamurthy, R. (1996) “Ethnic, Racial and Tribal: The Language of Racism?” Pp. 129–49 in C.R. Caldas-Coulthard and M. Coulthard (eds.) Texts and Practices: Readings in Critical Discourse Analysis. London: Routledge.
Assignment: Regular Memo
WEEK 13
Individual Skype/FaceTime meetings with instructor
WEEK 14 PRESENTATIONS
Assignment: final paper for 6203 students, 10-page minimum excluding title page and reference page
10-page final paper requirements: research question, finalized research design, cleaned and organized data, pilot study. For all final papers students must:
- Produce a precisely articulated research question(s)
- Acquire a dataset
- Explain and justify their data selection and sampling strategies
- Select a text analysis method to apply and justify their selection
- Develop a coding scheme and apply it to a sub-sample of the larger text sample.
- Maximum of 12 pages inclusive of full references
- 12-pt font, double-spaced, references should be single-spaced
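Coding is often done by hand, but students who want a computational starting point can sketch the "apply a coding scheme to a sub-sample" step as below. Everything in this example is illustrative and hypothetical: the code names, the indicator terms, the three-document corpus, and the helper functions `draw_subsample` and `code_document` are assumptions for demonstration, not a prescribed method. The sketch assigns a code to a document whenever any of that code's indicator terms appears in it.

```python
import random
import re
from collections import Counter

# Hypothetical coding scheme: each code maps to a set of indicator terms.
CODING_SCHEME = {
    "economy": {"jobs", "wages", "taxes"},
    "health": {"hospital", "doctor", "insurance"},
}


def draw_subsample(documents, k, seed=0):
    """Draw k documents at random; a fixed seed makes the draw reproducible."""
    rng = random.Random(seed)
    return rng.sample(documents, k)


def code_document(text, scheme):
    """Return the sorted list of codes whose indicator terms occur in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(code for code, terms in scheme.items() if tokens & terms)


# Placeholder corpus; replace with your own text sample.
corpus = [
    "Local wages have not kept pace with taxes.",
    "The hospital hired a new doctor this spring.",
    "Insurance costs are rising while jobs disappear.",
]

subsample = draw_subsample(corpus, k=2)
code_counts = Counter(
    code for doc in subsample for code in code_document(doc, CODING_SCHEME)
)
print(code_counts)
```

Reading the coded sub-sample closely and revising the scheme before scaling up is the usual next step; a keyword match like this one will miss synonyms and context.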
MISCELLANEOUS INFORMATION
I encourage input both in and out of the classroom. I am available for consultation during my open office hours (or preferably by appointment) and welcome the opportunity to assist students. To arrange an appointment, or for any other purpose related to this course, please use the Canvas message function or email me at ignatow@unt.edu.
NETIQUETTE
Collaboration and civility are core values in the practice of behavior analysis. Completing courses is part of your graduate education. How you engage in those courses is also part of your graduate education, which is why we emphasize professional etiquette as part of your preparation as a behavior analyst.
Be kind, polite and respectful. Sometimes the impersonality of the computer makes it hard to remember that we are all humans trying to teach, learn, and make the world a better place. That is why we went into behavior analysis. Be patient with yourself, the process and us!
Be a problem solver and contributor to improvement of situations. Communicating online is not always easy because of time differences, technology challenges, and lack of context. Try to approach problems from a behavior analytic perspective and then work on solutions by changing the environment. For general “netiquette” rules, you can refer to sources such as the Core Rules of Netiquette.
Seek help when you are not able to resolve something on your own. Collaboration is an important skill in behavior analysis. Learn to know what you don't know and when you need to ask for help. Respond to feedback and suggestions in a professional manner.
Remember the big picture and let that help you behave civilly when you feel discouraged. You are doing this because you will learn skills to help people. That is a goal worth all the hard effort you are putting into it.
ACADEMIC DISHONESTY
Academic dishonesty (cheating and/or plagiarism) will not be tolerated at any time. Any case of suspected academic dishonesty will be handled in accordance with the policies and procedures set forth by the University of North Texas, the College of Public Affairs and Community Service and the Department of Sociology. You will find the complete provisions of the code in the student handbook. Please note that I take academic dishonesty very seriously and the consequences will be very harsh.
Plagiarism is defined as the act of taking another's ideas, words, writings, or research findings and not giving them proper credit through quotations or citations. Even when we are paraphrasing another's ideas, we must give them credit. To do otherwise is to allow the reader to think these ideas and words are your own when they are not. This act is considered theft of intellectual property. Plagiarism is considered one of the most serious transgressions that can be committed in the educational community.
In the case of plagiarism, there are several options available to an instructor, including verbal and/or written reprimand, assignment of a lower grade with an explanation from the instructor, expulsion from the course with the assignment of a passing grade (WP), expulsion from the course with the assignment of a failing grade (WF), and/or expulsion from the university.
Therefore, all written work should be properly cited when:
- Describing the ideas of another (even if it is not a direct quotation),
- Describing the research of another (even if it is not a direct quotation),
- Using the words, phrases, paragraphs, or pages of another, and/or
- Quoting the words of another.
RELIGIOUS OBLIGATIONS AND HOLIDAYS
If you intend to miss class sessions for religious reasons sometime during the semester, you must notify me in writing by no later than Friday at 5pm of the second week of classes.
ADD/DROP POLICY
Please refer to the UNT Faculty Handbook or your department regarding the Add/Drop Policy.
F-1 STUDENTS TAKING DISTANCE EDUCATION COURSES
U.S. Federal Regulation: For F–1 students enrolled in classes for credit or classroom hours, no more than the equivalent of one class or three credits per session, term, semester, trimester, or quarter may be counted toward the full course of study requirement if the class is taken on-line or through distance education and does not require the student's physical attendance for classes, examination or other purposes integral to completion of the class. An on-line or distance education course is a course that is offered principally through the use of television, audio, or computer transmission including open broadcast, closed circuit, cable, microwave, or satellite, audio conferencing, or computer conferencing. If the F–1 student's course of study is in a language study program, no on-line or distance education classes may be considered to count toward a student's full course of study requirement.
To read detailed Immigration and Customs Enforcement regulations for F-1 students taking online courses, please go to the Electronic Code of Federal Regulations website at:
The specific portion concerning distance education courses is located at “Title 8 CFR 214.2 Paragraph (f) (6) (i) (G)” and can be found within this document: http://www.gpo.gov/fdsys/pkg/CFR-2012-title8-vol1/xml/CFR-2012-title8-vol1-sec214-2.xml
UNIVERSITY OF NORTH TEXAS COMPLIANCE
To comply with immigration regulations, an F-1 visa holder within the United States may need to engage in an on-campus experiential component for this course. This component (which must be approved in advance by the instructor) can include activities such as taking an on-campus exam, participating in an on-campus lecture or lab activity, or other on-campus experience integral to the completion of this course.
If such an on-campus activity is required, it is the student’s responsibility to do the following:
- Submit a written request to the instructor for an on-campus experiential component within one week of the start of the course.
- Ensure that the activity on campus takes place and the instructor documents it in writing with a notice sent to the International Student and Scholar Services Office. ISSS has a form available that you may use for this purpose.
Because the decision may have serious immigration consequences, if an F-1 student is unsure about his or her need to participate in an on-campus experiential component for this course, s/he should contact the UNT International Student and Scholar Services Office (telephone 940-565-2195 or email internationaladvising@unt.edu) to get clarification before the one-week deadline.
DISABILITY ACCOMMODATIONS
The University of North Texas seeks to provide appropriate academic adjustments for all individuals with disabilities. This University will comply with all applicable federal, state, and local laws, regulations and guidelines, specifically Section 504 of the Rehabilitation Act of 1973, and the Americans with Disabilities Act (ADA), with respect to providing appropriate academic adjustments to afford equal educational opportunity.
However, it is the responsibility of the student to register with and provide medical verification and academic schedules to Disability Support Services (DSS) at the beginning of each semester and no later than the second week of school unless otherwise determined by the coordinator. The student also must contact the faculty member in a timely manner to arrange for appropriate academic adjustments.
Appropriate adjustments and auxiliary aids are available for persons with disabilities. Call 940-565-2456 (TDD access 1-800-735-2989).
The University of North Texas makes reasonable academic accommodation for students with disabilities. Students seeking accommodation must first register with the Office of Disability Accommodation (ODA) to verify their eligibility. If a disability is verified, the ODA will provide you with an accommodation letter to be delivered to faculty to begin a private discussion regarding your specific needs in a course. You may request accommodations at any time; however, ODA notices of accommodation should be provided as early as possible in the semester to avoid any delay in implementation. Note that students must obtain a new letter of accommodation for every semester and must meet with each faculty member prior to implementation in each class. Students are strongly encouraged to deliver letters of accommodation during faculty office hours or by appointment. Faculty members have the authority to ask students to discuss such letters during their designated office hours to protect the privacy of the student. For additional information see the Office of Disability Accommodation website at: http://disability.unt.edu/. You may also contact them by phone at 940.565.4323.
STUDENT ACADEMIC SUPPORT SERVICES
- Academic Resource Center: buy textbooks and supplies, access academic catalogs and programs, register for classes, and more.
- Center for Student Rights and Responsibilities: provides Code of Student Conduct along with other useful links.
- Office of Disability Accommodation: ODA exists to prevent discrimination on the basis of disability and to help students reach a higher level of independence.
- Counseling and Testing Services: CTS provides counseling services to the UNT community as well as testing services, such as admissions testing, computer-based testing, career testing and other tests.
- UNT Libraries: online library services
- Online Tutoring: chat in real time, mark up your paper using drawing tools and edit the text of your paper with the tutor’s help.
- The Learning Center Support Programs: various program links provided to enhance the student experience.
- Supplemental Instruction: program for every student, not just for students who are struggling.
- UNT Writing Lab: offers free writing tutoring to all UNT students, undergraduate and graduate.
- Math Tutor Lab: located in GAB, room 440.
- Succeed at UNT: information on how to be “a successful student.”
TECHNICAL REQUIREMENTS AND ASSISTANCE
The following information has been provided to assist you in preparation for the technological aspect of the course.
- UIT Help Desk: http://www.unt.edu/helpdesk/index.htm
STUDENT TECHNICAL SUPPORT
The University of North Texas UIT Student Helpdesk provides student technical support in the use of Canvas and supported resources. The student help desk may be reached at:
Email: helpdesk@unt.edu
Phone: 940-565-2324
In Person: Sage Hall, Room 130
Our hours are:
Monday-Thursday 8am-midnight
Friday 8am-8pm
Saturday 9am-5pm
Sunday 8am-midnight