Data science can help researchers tackle important societal challenges ranging from mobility and health to public safety and education.
But data science techniques and technologies also carry enormous potential for harm: they can reinforce inequity and leak private information. As a result, many sensitive datasets are restricted from research use, impeding progress in areas that impact society.
The University of Michigan, with a $2 million grant from the National Science Foundation, plans to establish a framework for a national institute that would enable research using sensitive data, while preventing misuse and misinterpretation.
“Data science has proven time and time again to be an invaluable resource when addressing emerging challenges and opportunities in areas of broad potential impact,” said H.V. Jagadish, director of the Michigan Institute for Data Science. “But having access to information comes with a great deal of responsibility, so our first priority is to ensure data science is not misused to disproportionately harm underrepresented groups.”
U-M researchers will partner with colleagues at New York University and the University of Washington over the next two years to deploy new techniques and technologies that enable responsible data science, while establishing an interdisciplinary community focused on the study, design, deployment and assessment of equitable data systems.
Equity is an important facet of data science that NSF aims to strengthen in the coming years, as the federal agency partners with universities such as U-M to enable new modes of data-driven discovery that will transform the frontiers of science and engineering.
The centerpiece of its ongoing effort, called Harnessing the Data Revolution at NSF, is the development of national institutes that address multidisciplinary problems in big data. U-M will help lay the groundwork for developing these institutes, which will eventually serve as a point of convergence for researchers from multiple disciplines to share expertise and address pressing challenges in data science.
“Information is being gathered about all of us, from our Google searches and online purchases to property tax records and social media activity,” said Margaret Levenstein, director of the Inter-university Consortium for Political and Social Research at U-M, which maintains the world’s oldest and largest archive of research and instructional data for the social and behavioral sciences.
“You would assume the usage of data to be accurate and fair, but that is not always the case. That is why building a framework is so important because, in order for us to harness the enormous potential of big data, we need to ensure equity and privacy.”
Jagadish is the principal investigator on this grant. Robert Hampshire, research associate professor at the U-M Transportation Research Institute and associate professor of industrial and operations engineering and of public policy; Levenstein; Bill Howe of the University of Washington; and Julia Stoyanovich of New York University are co-principal investigators.