

DSC 100: Introduction to Data Management

Summer Session I 2025 — Fully Remote & Asynchronous (all lectures pre‑recorded)


Description

Databases are at the heart of modern commercial application development. Their use extends beyond this to many other environments and domains where large amounts of data must be stored for efficient update, retrieval, and analysis. This course provides a comprehensive introduction to data management, covering topics such as the relational data model, SQL, formal query languages, query evaluation, data storage and indexes, NoSQL databases, conceptual design, integrity constraints, design theory, normal forms, and data quality. Through pre‑recorded video lectures, demos, and hands‑on exercises, students will gain a strong foundation in database‑management concepts and practices, with a focus on the use of SQL and the relational model.

The course begins with an overview of course organization and an introduction to the relational data model, followed by lessons on SQL basics and different types of joins in SQL. In the following weeks, students will learn advanced SQL queries, formal query languages, and query evaluation. The course also covers topics related to data storage and indexing, including hard‑disk and file organization, index organization, and creating indexes in SQL. In the latter half of the course, students will study NoSQL databases, conceptual design, integrity constraints, and design theory, culminating in a discussion of normal forms. The course includes a discussion of data quality, including techniques for dealing with missing data and entity deduplication. Upon completing the course, students will be equipped with the skills and knowledge necessary for designing, querying, and maintaining a relational database.

Instructional Team


Delivery & Communication


Course Workload (subject to change)

Homework (50%): written/programming assignments (45%) plus web quizzes (5%). Teams of up to 2 are allowed. Three 24‑hour late‑day tokens are available (at most 1 per assignment).

Midterm: no formal exam; practice questions will be provided for self-assessment.

Final Exam (50%): comprehensive; held remotely on Sat 02 Aug 2025, 15:00 – 17:59 PT.
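As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is first expressed as a percentage from 0 to 100; the names `WEIGHTS` and `total_score` are hypothetical and not part of any course tooling, and the actual computation on Canvas may differ.

```python
# Component weights from the workload list: 45% + 5% + 50% = 100%.
WEIGHTS = {"assignments": 0.45, "web_quizzes": 0.05, "final_exam": 0.50}


def total_score(component_pcts):
    """Return the weighted total (0-100) from per-component percentages.

    component_pcts maps each key of WEIGHTS to a percentage in [0, 100].
    This is an assumed formula, not the official grade calculation.
    """
    missing = set(WEIGHTS) - set(component_pcts)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_pcts[name] for name in WEIGHTS)


# Hypothetical student: 90% on assignments, 100% on quizzes, 80% on the final.
example = {"assignments": 90.0, "web_quizzes": 100.0, "final_exam": 80.0}
print(round(total_score(example), 2))
```

For the hypothetical student in the example, the arithmetic is 0.45 × 90 + 0.05 × 100 + 0.50 × 80 = 85.5.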


Important Summer‑I Dates

(All assignment due dates appear on the Canvas calendar.)


Full Summer Calendar & Resources (subject to change)

Video Link: placeholder to be updated once recordings are uploaded.

Week | Description | Discussions | Assignments / Remarks | Lectures | Optional Reading
1 | Intro, Data models, SQL | - | - | Slides and recordings | Sec. 2.1, 2.2, 2.3
1 | Join and Aggregates in SQL | SQL practice: Recording | - | Slides and recordings | Sec. 6.1, 6.2
2, 3 | Advanced SQL | - | WQ1 due: Introduction, Data Models, and Simple SQL; HW1 due: SQLite | Slides and recordings | -
2, 3 | Relational Algebra | Relational Algebra: Source, Recording | HW2 due: Basic SQL; WQ2 due: SQL Aggregates | Slides and recordings | -
4 | Midterm: practice questions only; no formal exam will be administered | Exam practice: Questions, Recording | - | - | -
4 | Query Evaluation, Basics of Data Storage and Indexes | Midterm solutions: Recording1, Recording2 | - | Slides and recordings | -
5 | Cost Estimation; NoSQL Databases | Indexes and Cost Estimation: Source | - | Slides and recordings | -
5 | Conceptual Design | Homework Solutions | - | Slides and recordings | Sec. 4.1, 4.6
5 | BCNF | Conceptual Design practice: Recording | HW3 due: Advanced SQL; WQ4 due: RA and Conceptual Design; HW4 due: Conceptual Design | Slides and recordings | -

Grading Scheme

The grading scheme is a hybrid of absolute and relative grading. The absolute cutoffs are based on your absolute total score. The relative bins are based on your position in the total score distribution of the class. The better grade among the two (absolute-based and relative-based) will be your final grade.

Grade Absolute Cutoff (>=) Relative Bin (Use strictest)
A+ 97 Highest 5%
A 94 Next 10% (5-15)
A- 91 Next 15% (15-30)
B+ 85 Next 15% (30-45)
B 80 Next 15% (45-60)
B- 70 Next 15% (60-75)
C+ 65 Next 5% (75-80)
C 60 Next 5% (80-85)
C- 55 Next 5% (85-90)
D 50 Next 5% (90-95)
F < 50 Lowest 5%
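The hybrid rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the official grade calculation: a student's position in the class is taken to be the percentage of students with a strictly higher total score, ties share a bin, and a bin such as "5-15" is read as including 5 and excluding 15. The function names are hypothetical.

```python
# Grades ordered from best to worst, with the cutoffs from the table above.
GRADES = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"]
ABSOLUTE_CUTOFFS = [97, 94, 91, 85, 80, 70, 65, 60, 55, 50]  # below 50 -> F
RELATIVE_UPPER_BOUNDS = [5, 15, 30, 45, 60, 75, 80, 85, 90, 95]  # rest -> F


def absolute_grade(score):
    """Grade from the absolute cutoffs (score >= cutoff)."""
    for grade, cutoff in zip(GRADES, ABSOLUTE_CUTOFFS):
        if score >= cutoff:
            return grade
    return "F"


def relative_grade(score, all_scores):
    """Grade from the student's position in the class distribution.

    Assumption: position = percent of students with a strictly higher score.
    """
    higher = sum(1 for s in all_scores if s > score)
    top_pct = 100.0 * higher / len(all_scores)
    for grade, bound in zip(GRADES, RELATIVE_UPPER_BOUNDS):
        if top_pct < bound:
            return grade
    return "F"


def final_grade(score, all_scores):
    """Return the better of the absolute-based and relative-based grades."""
    candidates = [absolute_grade(score), relative_grade(score, all_scores)]
    return min(candidates, key=GRADES.index)  # lower index = better grade


# Hypothetical class of 20 students (each student is 5% of the class).
scores = [98, 92, 88, 86, 84, 82, 79, 78, 76, 74,
          72, 71, 69, 66, 63, 61, 58, 54, 51, 40]
print(absolute_grade(92), relative_grade(92, scores), final_grade(92, scores))
```

In this made-up class, a total score of 92 falls under the A- absolute cutoff but in the relative A bin, so the sketch returns the better of the two. Conversely, in a very strong class a student with 96 would keep the A from the absolute cutoff even if their relative position were lower.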

Textbook

No textbook is required for this course; the following book is optional but recommended. The lecture slides and recorded videos are sufficient for this class.

Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. 2nd Edition. Prentice Hall. 2008.

Canvas

All weekly homework assignments and web quizzes should be turned in via Canvas.

Communication

All important announcements will be sent through Canvas.

Piazza

All questions that may be of general interest to the class should be directed to Piazza. You will get your questions answered faster on Piazza than via personal emails to the instructional team, because Piazza is monitored closely by everybody in the class, not just the course staff. You are highly encouraged to answer each other’s questions on Piazza (extra credit is awarded based on the number of good answers you post!), and the instructional team will endorse or add to those answers.

Note: Some slides are adapted from the UW database group.