Our Work

Latest Work / Tools

DataSynthesizer

Latest Work / Papers

Responsible Data Management

Latest Work / Courses

Responsible Data Science at NYU CDS

Ranking Facts

Ranking Facts is a standardized, human-interpretable summary of the ranking methodology and of its result.

FairPrep

FairPrep is a design and evaluation framework for fairness-enhancing interventions that treats data as a first-class citizen.

DataSynthesizer

DataSynthesizer generates synthetic data that simulates a given dataset.

Teaching Responsible Data Science: Charting New Pedagogical Territory

Armanda Lewis

Julia Stoyanovich

International Journal of Artificial Intelligence in Education

In this paper we recount a recent experience in developing and teaching a technical course focused on responsible data science, which tackles the issues of ethics in AI, legal compliance, data quality, algorithmic fairness and diversity, transparency of data and algorithms, privacy, and data protection.

Public Engagement Showreel Int 1894

Julia Stoyanovich

Steven Kuyan

Meghan McDermott

Maria Grillo

Mona Sloane

There is an urgent need to develop effective regulatory mechanisms for ADS. New York City has been at the forefront of this work with its Local Law 49 of 2018 in relation to automated decision systems used by agencies, and the establishment of the Automated Decision Systems Task Force.

Teaching responsible data science

Julia Stoyanovich

Armanda Lewis

International Journal of Artificial Intelligence in Education (IJAIED), 2021. Special Issue: The FATE of AI in Education: Fairness, Accountability, Transparency, and Ethics

Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the "real" material.

Lightweight inspection of data preprocessing in native machine learning pipelines

Stefan Grafberger

Julia Stoyanovich

Sebastian Schelter

CIDR 2021, 11th Conference on Innovative Data Systems Research, Online Proceedings

Machine Learning (ML) is increasingly used to automate impactful decisions, and the risks arising from this widespread use are garnering attention from policy makers, scientists, and the media. ML applications are often very brittle with respect to their input data, which leads to concerns about their reliability, accountability, and fairness.

Taming technical bias in machine learning pipelines

Sebastian Schelter

Julia Stoyanovich

IEEE Data Eng. Bull., vol. 43, 2020

Machine Learning (ML) is commonly used to automate decisions in domains as varied as credit and lending, medical diagnosis, and hiring. These decisions are consequential, imploring us to carefully balance the benefits of efficiency with the potential risks.

FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions

Sebastian Schelter

Yuxuan He

Jatin Khilnani

Julia Stoyanovich

EDBT 2020 (short paper), arXiv, November 2019, EDBT talk video

The importance of incorporating ethics and legal compliance into machine-assisted decision-making is broadly recognized.

Fairness-Aware Instrumentation of Preprocessing Pipelines for Machine Learning

Ke Yang

Biao Huang

Julia Stoyanovich

Sebastian Schelter

Proceedings of HILDA 2020 (an ACM SIGMOD workshop)

Surfacing and mitigating bias in ML pipelines is a complex topic, with a dire need to provide system-level support to data scientists.

Responsible Data Management

Julia Stoyanovich

Bill Howe

H. V. Jagadish

PVLDB 13(12): 3474-3489 (2020), invited paper accompanying VLDB 2020 keynote presentation

In this article, we argue that the data management community is uniquely positioned to lead the responsible design, development, use, and oversight of ADS.

The Imperative of Interpretable Machines

Julia Stoyanovich

Jay J. Van Bavel

Tessa V. West

Nature Machine Intelligence, April 2020

As artificial intelligence becomes prevalent in society, a framework is needed to connect interpretability and trust in algorithm-assisted decisions, for a range of stakeholders.

Causal Intersectionality for Fair Ranking

Ke Yang

Joshua R. Loftus

Julia Stoyanovich

arXiv, June 2020

In this paper we propose a causal modeling approach to intersectional fairness, and a flexible, task-specific method for computing intersectionally fair rankings.

Balanced Ranking with Diversity Constraints

Ke Yang

Vasilis Gkatzelis

Julia Stoyanovich

Proceedings of IJCAI 2019

Many set selection and ranking algorithms have recently been enhanced with diversity constraints that aim to explicitly increase representation of historically disadvantaged populations, or to improve the overall representativeness of the selected set.

Designing Fair Ranking Schemes

Abolfazl Asudeh

H. V. Jagadish

Julia Stoyanovich

Gautam Das

Proceedings of ACM SIGMOD, 2019

Items from a database are often ranked based on a combination of criteria. The weight given to each criterion in the combination can greatly affect the fairness of the produced ranking, for example, preferring men over women.

MithraRanking: A System for Responsible Ranking Design

Yifan Guan

Abolfazl Asudeh

Pranav Mayuram

Hosagrahar V. Jagadish

Julia Stoyanovich

Gerome Miklau

Gautam Das

Proceedings of ACM SIGMOD, 2019

Items from a database are often ranked based on a combination of criteria. The weight given to each criterion in the combination can greatly affect the ranking produced.

Transparency, Fairness, Data Protection, Neutrality: Data Management Challenges in the Face of New Regulation

Serge Abiteboul

Julia Stoyanovich

ACM Journal of Data and Information Quality, 2019

The data revolution continues to transform every sector of science, industry, and government. Due to the incredible impact of data-driven technology on society, we are becoming increasingly aware of the imperative to use data and algorithms responsibly—in accordance with laws and ethical norms.

Nutritional Labels for Data and Models

Julia Stoyanovich

Bill Howe

IEEE Data Engineering Bulletin 42(3): 13-23 (2019)

An essential ingredient of successful machine-assisted decision-making, particularly in high-stakes decisions, is interpretability: allowing humans to understand, trust and, if necessary, contest the computational process and its outcomes.

Towards Responsible Data-driven Decision Making in Score-Based Systems

Abolfazl Asudeh

H. V. Jagadish

Julia Stoyanovich

IEEE Data Engineering Bulletin 42(3): 76-87 (2019)

Human decision makers often receive assistance from data-driven algorithmic systems that provide a score for evaluating the quality of items such as products, services, or individuals.

TransFAT: Translating Fairness, Accountability and Transparency into Data Science Practice

Julia Stoyanovich

International Workshop on Processing Information Ethically (PIE@CAiSE) (2019)

Data science holds incredible promise for improving people's lives, accelerating scientific discovery and innovation, and bringing about positive societal change.

Undergraduate and Graduate Responsible Data Science Courses at NYU CDS

AI Ethics: Global Perspectives

The Data, Responsibly Comic Series
