Session Title: Text Analytics Overview
Importance of the Topic
Text analytics matters to humanities scholars who want to work more efficiently or to explore research questions that are difficult to pursue without technology. This session provides an overview of text analytics, including part-of-speech tagging and other natural language processing tasks. We will look at example applications of feature comparison (with Dunning log-likelihood), clustering or topic modeling, frequent pattern analysis, and entity extraction.
Focus of the Topic
Upon completion of this session, participants will understand:
- What Text Analytics can do
- Natural language processing capabilities
- Text mining algorithms (feature comparison, clustering or topic modeling, frequent patterns and entity extraction)
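To make the feature comparison item more concrete, here is a minimal Python sketch of a Dunning log-likelihood (G2) score for comparing word use across two collections. It is an illustration under stated assumptions, not the SEASR implementation: the function names `g2` and `compare_corpora` are hypothetical, and the inputs are assumed to be plain lists of already tokenized words.

```python
import math
from collections import Counter


def g2(count_a, total_a, count_b, total_b):
    """Dunning log-likelihood (G2) for one word in two corpora.

    count_a / count_b: occurrences of the word in corpus A / B.
    total_a / total_b: total number of tokens in corpus A / B.
    Uses the full 2x2 contingency table (word vs. all other words).
    """
    observed = [
        [count_a, count_b],                      # the word itself
        [total_a - count_a, total_b - count_b],  # every other token
    ]
    n = total_a + total_b
    row_totals = [sum(row) for row in observed]
    col_totals = [total_a, total_b]

    score = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                score += o * math.log(o / expected)
    return 2.0 * score


def compare_corpora(tokens_a, tokens_b):
    """Rank every word by how strongly its frequency differs between A and B."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = len(tokens_a), len(tokens_b)
    scores = {
        word: g2(counts_a[word], total_a, counts_b[word], total_b)
        for word in set(counts_a) | set(counts_b)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A word used at the same rate in both collections scores close to zero, while a word concentrated in one collection scores high. The score does not say which collection overuses the word, so a real analysis would also compare the two relative frequencies.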
Format of the Session
- Presentation
- Demonstration
- Learning Exercises
- Discussion Questions
- Summary and Review
Presentation
- Slides can be found at http://dev-tools.seasr.org/confluence/display/Outreach/Presentations
- Link to NLP POS tags at http://code.google.com/p/universal-pos-tags/source/browse/trunk/en-ptb.map?spec=svn2&r=2
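The linked file maps Penn Treebank part-of-speech tags to a smaller universal tag set. As a rough illustration of how such a mapping is applied, the Python sketch below uses a small hand-copied subset of tags; the dictionary is an assumption for demonstration and should be checked against the linked file, and the function name `to_universal` is hypothetical.

```python
# Illustrative subset only; the linked en-ptb.map file is the full reference.
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN", "NNPS": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
    "DT": "DET",
    "IN": "ADP",
    "CC": "CONJ",
    "CD": "NUM",
    "PRP": "PRON",
}


def to_universal(tagged_tokens):
    """Convert (word, Penn Treebank tag) pairs to (word, universal tag) pairs.

    Tags missing from the subset above fall back to "X". That fallback is a
    simplification made for this sketch, not a rule taken from the linked file.
    """
    return [(word, PTB_TO_UNIVERSAL.get(tag, "X")) for word, tag in tagged_tokens]
```

For example, `to_universal([("The", "DT"), ("whale", "NN"), ("swims", "VBZ")])` yields the pairs `("The", "DET")`, `("whale", "NOUN")`, and `("swims", "VERB")`. Collapsing fine-grained tags this way makes it easier to count broad categories such as nouns or verbs across texts.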
Demonstration
- SEASR Community Hub
- Natural language processing tasks
Learning Exercises
- Open browser to http://seasr.org
- Click on "Community Hub"
- Click on "Keyword Cloud"
- Click on "NLP" to explore the sample flows that have been constructed to demonstrate text processing capabilities
Attendee Project Plan
- Review and update project plan
- Identify data set to use
- Identify analysis to use
- Modify and develop the project plan over the week
Discussion Questions
- Identify and discuss three other text tools that could be useful in the humanities.
- What are the obstacles to using this technology for text analysis? What will your colleagues say?
Summary and Review