dsc261-wi24

Image generated using DALL·E

DSC 261: Responsible Data Science

Description:

This course delves into Responsible Data Science, emphasizing the importance of conscientious practices in data analysis and application. It commences with Causal Inference, enabling students to distinguish between correlation and causation for accurate data interpretation. Subsequent modules cover Algorithmic Fairness to promote unbiased AI development, and Explainable AI, which aims to enhance the transparency and reliability of AI outputs. The curriculum concludes with Data Cleaning, Profiling, and Debiasing, where students learn to refine data quality and mitigate inherent biases. The course is tailored for those seeking to apply data science principles responsibly in practical scenarios.

Instructional team:

Instructor:

Babak Salimi, bsalimi@ucsd.edu

Course Assistants:

Baharan Khatami, skhatami@ucsd.edu

Jiongli Zhu, jiz143@ucsd.edu

Parjanya Prashant, pprashant@ucsd.edu

Lectures:

Tuesdays and Thursdays, 5:00pm-6:20pm

Office Hours:

Baharan Khatami: Mon 15:00-16:00. Zoom link

Jiongli Zhu: Thu 10:00-11:00am. Zoom link

Parjanya Prashant: Wed 13:00-14:00. Zoom link

Note: Office hours will be held via Zoom (link can be found on the Canvas calendar).

Course Workload

In this course, students will engage in a project-based learning experience, emphasizing teamwork, practical application of data science techniques, and effective communication. Groups of 2-3 students will collaborate to develop projects on topics relevant to the class. These projects can range from open-ended research that could potentially lead to a paper to the implementation and analysis of algorithms, building tools with GUIs for applications in responsible data science, or analyzing real and synthetic datasets. The focus is on applying responsible data science methodologies to real-world problems and data scenarios, rather than just reading papers or writing surveys.

The course will have three checkpoints: an initial proposal, a midterm update, and a final report. Students are encouraged to meet with the instructor regularly for guidance. The project will culminate in a short in-class presentation. Evaluation will consider the team’s effort, the results, the quality of the related work survey, and the effectiveness of the presentation and final report. Students will also participate in peer reviews, offering and receiving constructive feedback. To enhance the learning experience, students will create visual posters and dynamic presentations, aiming to foster a deep understanding of responsible data science. For detailed project ideas, refer to the provided Google Docs.

Project Timeline

Week 1: Project Kickoff

Week 2: Team Formation and Topic Selection

Week 3: Project Proposal Development

Week 4: Proposal Submission and Feedback

Week 5-8: Project Execution Begins

Week 9-10: Project Presentation

Grading Criteria

Course Calendar:

(subject to change)

| Week | Date | Lecture Topics | Slides | Readings |
|------|------|----------------|--------|----------|
| 1 | Jan 9 | Course Overview, Introduction to Causal Inference | Slides, Slides | Understanding Simpson’s Paradox |
| 2 | Jan 15 | Potential Outcome, Structural Causal Models | Slides, Slides | d-Separation Without Tears |
| 3 | Jan 22 | Project Brainstorming | | |
| 4 | Jan 29 | Structural Causal Models, Identification | Slides, Recording | Structural Causal Models — A Quick Introduction |
| 5 | Feb 5 | Causal Estimation (Guest Lecture), Fair Ranking (Guest Lecture) | Recording | Matching Methods for Causal Inference |
| 6 | Feb 12 | Algorithmic Fairness | Slides | A Survey on Bias and Fairness in Machine Learning |
| 7 | Feb 19 | Fairness and Privacy (Guest Lecture), Algorithmic Fairness | Recording, Slides | Direct and Indirect Effects, Interventional Fairness |
| 8 | Feb 26 | Data Profiling, Data Profiling (Guest Lecture) | Slides, Slides | |
| 9 | Mar 4 | Data Cleaning | Slides | |
| 10 | Mar 11 | Final Project Presentations | | |

Note: The readings and slides are placeholders and should be replaced with actual links to resources.

Textbooks

Causal Inference in Statistics: A Primer

Trustworthy Machine Learning

Fairness and Machine Learning: Limitations and Opportunities

Interpretable Machine Learning: A Guide for Making Black Box Models Explainable