Northeastern University


Northeastern University College of Computer and Information Science

Events — Colloquia & Seminars

Inferring and Exploiting Relational Structure in Large Collections

Speaker: David Smith

Date: Monday, February 6, 2012

Talk: 12:00 PM, 366 WVH

Abstract

The digitization of knowledge and concerted retrospective scanning projects are making significant amounts of data -- historical data in particular -- increasingly available to readers and researchers in many disciplines. To make this data useful, our group is working on improving OCR, language modeling, multiple-version alignment, syntactic analysis, information extraction, and information retrieval. I will focus in particular on problems of inferring the relational structure latent in large collections of documents such as books, web pages, patent applications, grant proposals, and social media postings. Which books or passages quote, translate, paraphrase, and cite each other? This research requires improvements in modeling translation and other forms of similarity, as well as improvements in efficiently comparing large numbers of passages. Finally, I will discuss how similarity relations can be used to improve classification tasks.

Brief Biography

David Smith is a Research Assistant Professor in the Computer Science Department at the University of Massachusetts, Amherst, where he conducts research on natural language processing, computational linguistics, information retrieval, digital libraries, and machine translation. He holds a Ph.D. in Computer Science from the Johns Hopkins University. Before graduate school, he was the head programmer for the Perseus Digital Library Project and received an A.B. in Classics from Harvard.

Host: Alan Mislove
