Messages

  • Feb. 10, 2015: webpage draft created.
  • Sep. 6, 2015: assignment 1 posted, due date Sep. 13, 2015, 9pm EST.
  • Sep. 13, 2015: assignment 2 posted, due date Sep. 21, 2015, 9pm EST.
  • Sep. 25, 2015: assignment 3 posted, due date Oct. 4, 2015, 9pm EST.
  • Oct. 4, 2015: assignment 4a posted, due date Oct. 11, 2015, 9pm EST.
  • Oct. 20, 2015, mid-term, in class.
  • Oct. 25, 2015, 11:59pm EST, project final report due.
  • Oct. 25, 2015: assignment 4b posted, due date Nov. 1, 2015, 9pm EST.
  • Nov. 5, 2015: assignment 5 posted, due date Nov. 15, 2015, 9pm EST.
  • Nov. 22, 2015, 11:59pm EST, project final report due.
  • Nov. 22, 2015: assignment 6 posted, due date Dec. 3, 2015, 9pm EST.
  • Dec. 10, 2015, final, in class.

Course summary

In this course we will discuss some advanced database concepts and techniques, including:
1. SQL, Relational Algebra/Calculus, Index, Views, Constraints
2. Data Models, including NoSQL, I/O Model, Streaming Model, MapReduce, ActiveDHT
3. Query Optimization, including Optimal Join Algorithm
4. Transactions
5. Data Privacy, including Differential Privacy
A detailed list of topics is available in the course schedule below.

Participants are expected to have a background in algorithms and data structures, and some basic knowledge of databases.

The evaluation will be based on a set of exercises and exams, and a course project.

Lecturer

Qin Zhang
Office hour: Tuesday 3:00pm-4:00pm (LH430A)
    or by email appointment (I usually do not accept same-day appointment requests, so please contact me one or two days ahead)
Email: qzhangcs@indiana.edu

Associate Instructors
  • Ali Varamesh (coordinator)
    Office hour: Wednesdays 1:00pm-2:00pm (LH120)
    Email: alivara@indiana.edu

  • Yuan Xie
    Office hour: Thursday 4:00pm-5:00pm (LH406)
    Email: xieyuanthu@gmail.com

  • Chao Tao
    Office hour: Friday 10:30am-11:30am (LH406)
    Email: sdutaochao@gmail.com


    Time and place

    Session 1: Tuesday/Thursday, 11:15am-12:30pm, Woodburn Hall Room 004
    Session 2: Tuesday/Thursday, 1:00pm-2:15pm, Glenn A. Black Laboratory (9th and Fess) 101

    Textbooks

    Main reference book (you may wish to purchase a hard copy, though we do not strictly follow it and often go beyond it):
  • [CB] Database Systems: The Complete Book, by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom, 2nd Edition, Prentice Hall

    Below are a few database textbooks that you may or may not want to read.
  • Readings in Database Systems, Hellerstein and Stonebraker, eds., 4th Edition
  • Database Management Systems, by Ramakrishnan and Gehrke, 3rd Edition, McGraw-Hill.
  • Database System Concepts, by Silberschatz, Korth and Sudarshan, 6th Edition, McGraw-Hill.
  • Foundations of Databases: The Logical Level, by Abiteboul, Hull, Vianu, 1st Edition, Addison-Wesley.

    Related courses

    Course schedule

    (subject to adjustments as we go along).
    Some slides are posted here, and all slides are available on Oncourse.

    Week | Date | Section | Content | Slides | Comments
    1 | Aug. 25 | 0. Introduction | | slides |
    1 | Aug. 27 | 0. Background Survey | | |
    2 | Sep. 1 | 1. Basics | SQL | oncourse | CB Ch. 6 (book chapters that are relevant; note that we do not strictly follow them)
    2 | Sep. 3 | | SQL (cont.) | oncourse | CB Ch. 6
    3 | Sep. 8 | | Relational Algebra/Calculus, Datalog | oncourse | CB Ch. 5
    3 | Sep. 10 | | Relational Algebra/Calculus, Datalog (cont.) | oncourse | CB Ch. 5
    4 | Sep. 15 | | Relational Calculus (cont.); View, Index, Constraints | oncourse | CB Ch. 7, 8, 14
    4 | Sep. 17 | | View, Index, Constraints (cont.) | oncourse | CB Ch. 7, 8, 14
    5 | Sep. 22 | | View, Index, Constraints (cont.) | oncourse | CB Ch. 7, 8, 14
    5 | Sep. 24 | 2. Data Models | Old Models, NoSQL; ER&XML | slides; oncourse | XML in CB Ch. 11
    6 | Sep. 29 | | NoSQL | slides |
    6 | Oct. 1 | | I/O Model | slides | CB Ch. 13
    7 | Oct. 6 | | Streaming Model | slides | CB Ch. 23.4, 23.5
    7 | Oct. 8 | | Streaming Model (cont.) | slides | Sections 1, 2, 4 of Chakrabarti's notes
    8 | Oct. 13 | | MapReduce, ActiveDHT | slides | Chapters 2 and 5 of this book
    8 | Oct. 15 | Mid-term Review | Solutions for HW 1, 2, 3, 4a by Yuan Xie | | Mid-term preparation: SQL, RC, Datalog; View, Constraint, Indexing; (Old) data models, NoSQL; I/O model, streaming model
    9 | Oct. 20 | Mid-term | | |
    9 | Oct. 22 | 3. Optimization | Query Processing | oncourse | CB Ch. 15
    10 | Oct. 27 | | Query Processing (cont.) | oncourse | CB Ch. 15
    10 | Oct. 29 | | Query Optimization | oncourse | CB Ch. 16
    11 | Nov. 3 | | Query Optimization (cont.) | oncourse | CB Ch. 16
    11 | Nov. 5 | | Optimal Join Algorithm | Ré's slides |
    12 | Nov. 10 | 4. Transactions | Recovery | oncourse | CB Ch. 17
    12 | Nov. 12 | | Concurrency Control | oncourse | CB Ch. 18
    13 | Nov. 17 | | Concurrency Control (cont.) | oncourse | CB Ch. 18
    13 | Nov. 19 | | Concurrency Control (cont.) | oncourse | CB Ch. 18
    14 | Nov. 24 | | Thanksgiving Break | |
    14 | Nov. 26 | | Thanksgiving Break | |
    15 | Dec. 1 | 5. Data Privacy | Introduction | oncourse |
    15 | Dec. 3 | | Differential Privacy | oncourse |
    16 | Dec. 8 | Final Review | Solutions for HW 4b, 5 & 6 by Yuan | | Final preparation: topics in mid-term; MapReduce algorithm; RA, laws of RA; logical/physical query plan, cost estimation; join ordering selection; UNDO/REDO logging; (conflict, view) serializability; recoverability; (strict) 2PL; lock modes (S, X, U)
    16 | Dec. 10 | Final | | |

    Grading

    • Assignments 20%

      Six written assignments (2-4% each). Assignments will be posted during the course.
      The answers should be typeset in LaTeX; here is a template to start with.

    • Projects 20%

      Write a report on a specific topic. Work in groups of 4. Details can be found in these instructions.

    • Exams 60%

      (1) Mid-term 30%
      (2) Final 30%

    More reading topics

    Course policies

    For assignments, students may discuss answers with anyone, including problem approaches and proofs. However, all students must write their own proofs and write-ups. The names of all the people you have talked to should be listed at the beginning of the first page. If a solution comes from existing papers, websites, or books, the source must be properly cited, and you must write the solution in a way that demonstrates your understanding (simply copying the solution will be considered plagiarism). All deadlines are firm. No late assignments will be accepted unless there are legitimate circumstances.
    For more details, see the Indiana University Code of Student Rights, Responsibilities, and Conduct.

    Prerequisites

      Participants are expected to have a background in algorithms and data structures (having taken C241 "Discrete Structures for Computer Science", C343 "Data Structures", B403 "Introduction to Algorithm Design and Analysis", or equivalent courses), and some basic knowledge of databases.