Tuesday, April 14, 2015

Text Analytics

DATES & TIME: August 3-6, 2015, 5:30-9:30 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: Robert Stine, University of Pennsylvania

DESCRIPTION: Statistical methods for the analysis of textual data have come of age. We can mine textual data for underlying sentiments, scan for hateful or discriminatory language, or create features that improve familiar predictive models. This workshop explores how various text analytics can be understood, used, and developed by non-specialists in the field. View the full course description here.

PREREQUISITES: This course is self-contained with no explicit prerequisite beyond familiarity with statistical methods at the level of multiple regression. That said, some familiarity with multivariate methods (particularly principal components) and exposure to probability models would be helpful. The course will predominantly use packages from R as the main software tool.

STANDARD FEE: Members = $1200; Non-members = $2400
SPECIAL FEE: Participants who attend either of the two 2015 regular sessions (First Session or Second Session) are eligible for a special discounted fee of $960 to attend this evening workshop. To receive this special discounted fee, please email the Summer Program at sumprog@icpsr.umich.edu.

Friday, April 10, 2015

Training for Policy Analysts and Program Evaluation

Many of our workshops cover material that is particularly relevant for those interested in examining important societal problems and conducting evaluations of policy impacts. A few 3- to 5-day workshops of interest include:

Regression Discontinuity Designs
DATES & TIME: June 15-17, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTORS: Matias Cattaneo, University of Michigan, and Rocio Titiunik, University of Michigan
DESCRIPTION: Regression discontinuity designs are some of the most useful and potent research designs for evaluating program impacts, policy issues, and societal problems. This workshop covers the basic principles, estimation, and interpretation of regression discontinuity designs, as well as their applicability across a broad array of substantive areas.
FEES: ICPSR Member fees = $1300; Non-member fees = $2600

Dynamic Models for Policy, Economics, and Society: Practical Time Series Methods
DATES & TIME: July 20-24, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: Harold D. Clarke, University of Texas at Dallas
DESCRIPTION: The course has an applied focus, and participants will learn how to specify, estimate, and evaluate multivariate time series models within their substantive fields of interest. Methods considered will be helpful to graduate students, faculty, and staff in the social sciences as well as policy analysts and other researchers working in the public and private sectors.
FEES: ICPSR Member fees = $1500; Non-member fees = $3000

Designing and Conducting Experiments in the Laboratory
DATES & TIME: June 22-26, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTORS: Rick Wilson, Rice University, and Catherine Eckel, Texas A&M University
DESCRIPTION: This workshop introduces participants to basic research design considerations, the problems of inference for incomplete designs (particularly field experiments), and canonical games common in experimental economics. To learn about experiments, there is no substitute for doing. So, participants will work in four- to six-person groups to design experiments that will be run during the workshop.
FEES: ICPSR Member fees = $1500; Non-member fees = $3000

Causal Inference in Cross-Sectional Data, Survival-Time Data, and Panel Data Using Stata
DATES & TIME: July 13-17, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: David Drukker, Stata Corporation
DESCRIPTION: This workshop provides an introduction to causal inference for cross-sectional data, survival-time data, and panel data. It uses a combination of intuition, mathematics, and computational examples to illustrate what causal inference parameters measure and how we estimate them using Stata.
FEES: ICPSR Member fees = $1500; Non-member fees = $3000

Text Analytics
DATES & TIME: August 3-6, 2015, 5:30-9:30 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: Robert Stine, University of Pennsylvania
DESCRIPTION: Statistical methods for the analysis of textual data have come of age. We can mine textual data for underlying sentiments, scan for hateful or discriminatory language, or create features that improve familiar predictive models. This workshop explores how various text analytics can be understood, used, and developed by non-specialists in the field. This workshop runs for 4 nights.
FEES: ICPSR Member fees = $1200; Non-member fees = $2400. (In addition, special fees apply for participants who attend either of the two 2015 regular 4-week sessions of the Summer Program.)

Network Analysis: An Introduction
DATES & TIME: June 1-5, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: Ann McCranie, Indiana University
DESCRIPTION: This one-week intensive workshop presents an introduction to various concepts, methods, and applications of social network analysis. The primary focus is on the analysis of relational data measured on groups of actors as they interact across geographic, economic, social, and political contexts.
FEES: ICPSR Member fees = $1500; Non-member fees = $3000

Time Series Analysis: An Introduction for Social Scientists 
DATES & TIME: July 13-17, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, MI
INSTRUCTOR: Mark Pickup, Simon Fraser University and the University of Oxford
DESCRIPTION: Statistical models can be applied to a broad array of time series data in order to examine the movement of variables over time (e.g., government policy, public opinion, administrative decisions, socioeconomic measures). This course introduces time series methods that allow analysts to estimate relationships between variables and test hypotheses using dynamic, realistic models of important processes.
FEES: ICPSR Member fees = $1500; Non-member fees = $3000

Thursday, April 9, 2015

Workshop on the National Survey of Early Care and Education (NSECE)

Digging into the NSECE: Exploiting the Potential of the Household and Provider Data from the National Survey of Early Care and Education (NSECE)

DATES & TIME: July 20-23, 2015, 9 a.m.-5 p.m.
LOCATION: Ann Arbor, Michigan
INSTRUCTORS: Johanna Bleckman, ICPSR, and Rupa Datta, NORC, University of Chicago
FEE: The workshop is free, but space is limited.
SPONSOR: Child Care & Early Education Research Connections data archive

The National Survey of Early Care and Education (NSECE) is the first study of its kind in over 20 years. The NSECE provides a national picture of families' use of nonparental care, as well as characteristics of both home-based and center-based providers for children from birth through age 13. The NSECE will help deepen the understanding of the extent to which families' needs and preferences coordinate with providers' offerings and constraints.

This workshop will provide an opportunity for researchers to explore advanced topics related to the National Survey of Early Care and Education (NSECE). Along with a discussion of the unique characteristics of the NSECE, the workshop will introduce potential data users to technical issues associated with using the NSECE for secondary analysis. In addition, participants will receive guidance on mastering the complex aspects of the study, including useful programming techniques and additional statistical resources. Advanced topics will include the use of children's age categories, classifying types of care across files, and determining cost of care to families or price of care charged by providers. The workshop will discuss how to conduct comparable analyses across data files.

Some NSECE public-use data files are already available for secondary analysis through Research Connections. More public-use and restricted-use data files will become available on a rolling release schedule. This workshop will focus on the NSECE public-use files; however, the information covered may also be relevant to the NSECE restricted-use files. (Actual linking of household and provider data files will not be addressed, as such linkages require restricted-use data.)

PREREQUISITES: Interest in using the NSECE to answer policy-relevant questions in early care and education. Participants must have programming experience in one or more of the following software packages: SAS, Stata, or SPSS. In addition, participants should have experience using large, complex survey data.

APPLICATION: All applications must include a curriculum vitae along with a cover letter describing:

  • research questions intended to be explored using the NSECE;
  • prior early care and education research experience; and
  • experience analyzing large, complex survey data (including specific data sets and types of analyses conducted).

You can upload your application materials through the ICPSR Summer Program's portal.

NOTE: Please also indicate the statistical package in which you intend to work. Participants will be expected to become familiar with NSECE documentation prior to attending the workshop.

DEADLINE: The application deadline is May 29, 2015.

STIPENDS: Admitted graduate students, post-doctoral scholars, and junior faculty/researchers will be considered for one of a limited number of stipends to help with travel and housing costs. To be considered for one of these awards, applicants must also submit a letter of support from a senior faculty member, mentor, or adviser.

Wednesday, April 8, 2015

Free Workshop: Exploratory Data Mining via SEARCH Strategies


DATES AND TIME: June 8-12, 2015, 9 a.m.-5 p.m.

LOCATION: Ann Arbor, Michigan

INSTRUCTOR: John J. McArdle, University of Southern California

DESCRIPTION: This workshop provides an overview of current techniques in exploratory data mining for quantitative research in the social and behavioral sciences. Exploratory data mining applies computational methods to large amounts of data in order to construct predictive models of behavior, in contrast to the hypothesis testing that underlies many standard statistical techniques. These data mining techniques can be used to model categorical choices, to classify groups, to discover patterns, and to model longitudinal data. Exploratory data mining techniques can be fruitful in most situations where categorical regression or many multivariate analytic techniques are used.

This workshop will explore key algorithms, including regression trees and SEM models (CART, SEMtrees, PARTY, etc.). This work was initiated by the SAS algorithm SEARCH (Morgan & Sonquist, 1963), and the workshop will begin here and then move to the free software modules in R that are currently used for exploratory data mining. The workshop will offer a mixture of mathematical statistics in the morning session and practical, hands-on work in the afternoon. Participants are encouraged to bring their own data, to which they can learn to fit an exploratory model and write up the activity.

FEES: There are no registration fees for accepted participants. The first ten (10) admitted workshop participants will receive a travel stipend of $500.

SPECIAL AWARDS: Those completing the course will be eligible to compete for two (2) $15,000 awards for innovative uses of SEARCH.

APPLICATION: Admission to the course is competitive and seating is limited. Apply through the ICPSR Summer Program Portal, where you will need to upload the following documents: 

For graduate students:
  • Letter describing reasons for attending course
  • Current CV
  • Unofficial transcript
  • Letter of recommendation from adviser
  • List of classes related to statistics and quantitative methods
For junior faculty:
  • Letter describing reasons for attending course and research interests
  • Current CV
  • List of courses taught
The application deadline is April 30, 2015.

Visit the course description page for more information.

Upcoming 2015 Summer Program Deadlines

Scholarships
April 15, 2015: Applications due for the Janet Box-Steffensmeier and John A. Garcia Scholarships and the Hanes Walton, Jr. Award for Quantitative Methods Training

April 30, 2015: Applications due for all ICPSR Scholarships (Clogg, Clubb, Miller, Education Research, Developmental Psychology, and Public Administration, Public Policy, and Public Affairs)


Discounts
April 30, 2015: Early payment discount for four-week registration fees expires at 11:59 p.m. EDT


Sponsored Workshops
April 30, 2015: Applications due for the free workshop "Exploratory Data Mining Via SEARCH Strategies"

May 1, 2015: Applications due for the free workshop "Immigration, Immigrants and Health Conditions, Health Status, and Policies"

May 3, 2015: Applications due for the free workshop "Secondary Data Analysis and the National Addiction & HIV Data Archive Program (NAHDAP)"

May 29, 2015: Applications due for the free workshop "Digging Into the NSECE: Exploiting the Potential of the Household and Provider Data From the National Survey of Early Care and Education (NSECE)"

Monday, April 6, 2015

New 3- to 5-Day Workshops in 2015

June 1-3
Handling Missing Data Using Multiple Imputation in Stata
Location: Ann Arbor, MI
Instructor: Yulia Marchenko, Stata Corporation
Description: Participants will learn how to use Stata to perform multiple-imputation analysis, a simulation-based technique for handling missing data. The workshop will discuss in detail the three stages of multiple imputation--imputation, complete-data analysis, and pooling--and provide accompanying Stata examples.
Fee: ICPSR members, $1,300; Non-members, $2,600

June 15-17
Regression Discontinuity Designs
Location: Ann Arbor, Michigan
Instructors: Matias Cattaneo, University of Michigan, and Rocio Titiunik, University of Michigan
Description: This workshop introduces regression discontinuity (RD) designs, focusing on both basic ideas and concepts as well as recent developments. RD designs are quasi-experimental techniques commonly used in social, behavioral, and related sciences as a way of estimating causal treatment effects in cases where treatment assignment is determined based on a threshold crossing rule using an observed variable (e.g., poverty index, population, vote share, age, etc.).
Fee: ICPSR members, $1,300; Non-members, $2,600

June 22-24
Survival Analysis, Event History Modeling, and Duration Analysis
Location: Berkeley, California
Instructor: Tenko Raykov, Michigan State University
Description: This workshop introduces an applied methodology for the analysis and modeling of time-to-event data, which ensures proper handling of censored observations (not having experienced the event of main interest by the end of the study, for various reasons). Such observations, along with complete observations, arise frequently in the medical, social, and behavioral sciences when time elapsed until event occurrence is to be modeled and explained in terms of independent variables and predictors.
Fee: ICPSR members, $1,300; Non-members, $2,600


June 22-26
Designing and Conducting Experiments in the Laboratory
Location: Ann Arbor, Michigan
Instructors: Rick Wilson, Rice University, and Catherine Eckel, Texas A&M University
Description: Participants will learn the basics of experimental design, explore design considerations in the laboratory versus in the field, and become familiar with canonical laboratory designs.
Fee: ICPSR members, $1,500; Non-members, $3,000

July 20-24
Advanced Data Analytics: Statistical Learning and Latent Variables
Location: Ann Arbor, Michigan
Instructor: Douglas Steinley, University of Missouri
Description: This course will combine the perspectives of latent variable modeling and statistical modeling to uncover relationships and processes within a given data set. Topics include principal component analysis and factor analysis and extensions (mixtures of factor analyzers, factor mixture models, etc.), latent trait models, latent class models, and mixture models.
Fee: ICPSR members, $1,500; Non-members, $3,000

August 3-5
Nonparametric and Semiparametric Methods and Applications
Location: Ann Arbor, Michigan
Instructors: Matias Cattaneo, University of Michigan, Sebastian Calonico, University of Miami, and Max H. Farrell, University of Chicago
Description: Nonparametric and semiparametric methods allow for greater flexibility in data analysis, essential for social science research in which simple/parametric models do not always adequately capture the underlying statistical features of interest. Course participants will learn how to leverage these techniques for their own research.
Fee: ICPSR members, $1,300; Non-members, $2,600

August 3-6
Text Analytics
Location: Ann Arbor, Michigan
Instructor: Robert Stine, University of Pennsylvania
Description: This workshop will explore the application of text analytics methods across a range of substantive topics and research areas. Techniques include using raw counts of words, forming principal components from these counts, and building regressors from counts of adjacent words. The workshop will also explore proposed hierarchical generating models often associated with nonparametric Bayesian analysis. Because regressors derived from text may be difficult to interpret, the workshop will also show how to develop interpretive hooks from quantitative features.
Fee: ICPSR members, $1,200; Non-members, $2,400

August 10-12
Qualitative Research Methods
Location: Chapel Hill, North Carolina
Instructor: Paul Mihas, University of North Carolina at Chapel Hill
Description: Participants will learn the rationale for using different qualitative research traditions (e.g., grounded theory and narrative analysis); approaches to analyzing qualitative data, including coding, memo writing, assessing co-occurrences, and theme-building; different coding approaches, including descriptive, interpretive, holistic, and micro-coding; different memo-writing approaches (e.g., reflective, document summary, key quotation, statement, and positionality); and basic skills in qualitative software (ATLAS.ti).
Fee: ICPSR members, $1,300; Non-members, $2,600

Tuesday, March 31, 2015

2015 Workshops: Longitudinal Data Analysis

In 2015, we are offering four courses on longitudinal data analysis. Longitudinal analysis is the study of short series of observations obtained from many respondents over time and is also referred to as panel analysis (of a cross-section of time series), or repeated measures, or growth curve analysis (polynomials in time), or multilevel analysis (where one level is a sequence of observations from respondents). Longitudinal analysis is used for panel surveys, experiments, and quasi-experiments in health and biomedicine, education and psychology, and the evaluation of prevention and treatment programs.


Second Four-Week Session
Longitudinal Analysis
Dates and Time: July 20-August 14, 10 a.m.-noon
Location: Ann Arbor, MI
Instructor: Michael Berbaum, University of Illinois at Chicago
Description: This course treats the statistical basis and practical application of linear models for longitudinal normal data and generalized linear models for longitudinal binary, count, and ordinal data. The approach involves inclusion of random effects in linear models to reflect within-person cross-time correlation. Techniques for irregularly observed (unequally spaced) data will be covered.
Program Scholar Fee: ICPSR members, $2,300; Non-members, $4,600


Three- to Five-Day Workshops
Analyzing Intensive Longitudinal Data: A Guide to Diary, Experience Sampling, and Ecological Momentary Assessment Methods
Dates and Time: June 9-12, 9 a.m.-5 p.m.
Location: Amherst, MA
Instructors: Niall Bolger, Columbia University, and Jean-Philippe Laurenceau, University of Delaware
Description: Intensive longitudinal methods, often called experience sampling, daily diary, or ecological momentary assessment methods, allow researchers to study people's thoughts, emotions, and behaviors in their natural contexts. Typically they involve self-reports from individuals, dyads, families or other small groups over the course of hours, days, and weeks. Such data can reveal life as it is actually lived and provide insights that are not possible using conventional experimental or survey research methods. Intensive longitudinal data, however, present data analytic challenges stemming from the multiple levels of analysis and temporal dependencies in the data. The multilevel or mixed-effects model for longitudinal data is a flexible analytic tool that can take account of these complexities, and the goal of the 4-day workshop is to provide training in its use.
Fee: ICPSR members, $1,400; Non-members, $2,800

Longitudinal Data Analysis, Including Categorical Outcomes
Dates and Time: June 22-26, 9 a.m.-5 p.m.
Location: Ann Arbor, Michigan
Instructor: Donald Hedeker, University of Chicago
Description: This workshop will focus on the analysis of longitudinal data using mixed models, beginning with models for continuous outcomes and including a description of the multilevel or hierarchical representation of the model. For dichotomous, ordinal and nominal outcomes, this workshop will focus next on the mixed logistic regression model and generalizations of it. Finally, the workshop will cover missing data issues. In all cases, methods will be illustrated using software, with SAS used for most examples, and augmented with use of SPSS for continuous outcomes and SuperMix for categorical outcomes.
Fee: ICPSR members, $1,500; Non-members, $3,000

Applied Multilevel Models for Longitudinal and Clustered Data 
Dates and Time: July 13-17, 9 a.m.-5 p.m.
Location: Boulder, CO
Instructor: Jonathan Templin, University of Kansas
Description: Multilevel models, also known as hierarchical linear models or general linear mixed models, provide quantification and prediction of random variance due to multiple sampling dimensions (across occasions, persons, or groups). Multilevel models offer many advantages for analyzing longitudinal data, such as flexible strategies for modeling change and individual differences in change, the examination of time-invariant or time-varying predictor effects, and the use of all available complete observations. Multilevel models are also useful in analyzing clustered data (e.g., persons nested in groups), in which one wishes to examine predictors pertaining to individuals or to groups. This workshop will serve as an applied introduction to multilevel models, beginning with longitudinal data and continuing onto clustered data.
Fee: ICPSR members, $1,500; Non-members, $3,000

You can register for all of these courses through our portal.