An Introduction to Entity Resolution with Rebecca Steorts

a Workshop

Wednesday, July 10, 2019 (archived event)

Location: 1430 ISR-Thompson

The PDHP workshop series resumes with the first of several workshops on record linkage topics and techniques in social research.

Please join Assistant Professor Rebecca C. Steorts, PhD, of Duke University’s Department of Statistical Science, as she presents An Introduction to Entity Resolution, a half-day workshop geared toward statisticians, data scientists, population researchers, and computational social scientists of all experience levels. This hands-on workshop will cover both the theory and practice of probabilistic entity resolution and will demonstrate state-of-the-art techniques using R and Apache Spark.

Topics include:

• Overview and introduction to entity resolution

• Entity resolution fundamentals (record linkage, de-duplication, blocking, and computational gains)

• Entity resolution evaluation metrics (including precision, reduction ratio, and robustness to tuning parameters)

• Bayesian entity resolution models (including both parametric and nonparametric Bayesian mixture models)

• Hands-on demonstration of state-of-the-art R packages (using blink) and computational gains (using Apache Spark)

