Title of Study/Project:
List of team members and their affiliations
- Matthew Brook O'Donnell - University of Michigan
- Clare Llewellyn - JSTOR
- Michael Krot - JSTOR
Procedural Outline of Study/Project
0. A newly developed API and user interface at JSTOR may help to better define development and deployment scenarios. This work is preliminary and is currently being evaluated.
a. Research Question/Purpose of Study
We have identified two areas that we are interested in:
1) Creation of predefined workflows that are useful to linguists
2) Creation of workflows that report information about a specific JSTOR article, for example its reading ease or grade level.
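As an illustration of the second area, the sketch below computes two standard readability measures, Flesch Reading Ease and Flesch-Kincaid Grade Level, from plain article text. It is only a sketch under assumptions: the function names are hypothetical, the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup, and nothing here is an existing SEASR or JSTOR component.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels,
    minus one for a silent final 'e' (but not for words ending in 'le')."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for text."""
    # Simple splitting rules; abbreviations such as "U." will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        "reading_ease": 206.835 - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        "grade_level": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word - 15.59,
    }
```

Calling `readability(article_text)` returns a dictionary with both scores. Higher reading ease means easier text, while the grade level approximates a U.S. school grade.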
b. Data Source
JSTOR articles
c. Analysis Tools
For the tools for linguists, we have divided the analysis tools into four sets. We may produce the data sets outside SEASR and feed them in from that point. The sets are:
1. Pick a data set or use predefined data set
2. Pick a workflow
- show me word counts
- show me word lists
- show me ngrams
- show me a snippet of context around a keyword (concordancer)
- show me all words in the paragraph surrounding a keyword
3. Perform a data transformation
- selection
- noun phrases
- data mining
- keyness extraction
- compare with another data set
4. Visualizations
- tag cloud
- other visualizations
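
The workflows in set 2 and the keyness extraction in set 3 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a SEASR flow: all function names are hypothetical, tokenization is a simple regular expression, and keyness is computed with the log-likelihood statistic commonly used in corpus linguistics, which is one possible choice rather than a method the team has settled on.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(tokens):
    """Word counts; sorted(word_counts(tokens)) gives a word list."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens, keyword, window=3):
    """Keyword-in-context: (left context, keyword, right context) per hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def paragraphs_containing(text, keyword):
    """Return the paragraphs (separated by blank lines) that use keyword."""
    return [p for p in re.split(r"\n\s*\n", text) if keyword in tokenize(p)]


def keyness(study, reference):
    """Log-likelihood score of each study word against a reference data set.

    study and reference are Counters of word frequencies.
    """
    c, d = sum(study.values()), sum(reference.values())
    scores = {}
    for word, a in study.items():
        b = reference.get(word, 0)
        expected_study = c * (a + b) / (c + d)
        expected_reference = d * (a + b) / (c + d)
        ll = a * math.log(a / expected_study)
        if b:  # a zero count contributes nothing to the sum
            ll += b * math.log(b / expected_reference)
        scores[word] = 2 * ll
    return scores
```

For example, `concordance(tokenize(text), "corpus", window=5)` lists each occurrence of "corpus" with five words of context on either side, and `keyness(word_counts(a), word_counts(b))` compares one data set with another. The log-likelihood score only measures how strongly the frequencies differ, so relative frequencies are still needed to tell overuse from underuse.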
Activity Timeline or Milestones
Test case and implementation of a web service deployment as a peer to the Java application services available in the new API developed at JSTOR. Michael is modifying a sample flow (engineered for the Zotero interface) to acquire data from internal JSTOR resources. He will also determine how SEASR and the API can interface to provide the JSTOR user front end with outputs directly or indirectly produced by SEASR data flows.
Report or Project Outcome(s)
- Presentation at the SEASR Follow-up Workshop at the University of Maryland, June 22, 2009.
Ideas on what your team needs from SEASR staff to help you achieve your goal.
We'd like further information on running a workflow without the workbench, and on how we can interact with a flow from a web page or by other means.