Messages

  • Jan. 3, 2014: webpage draft created.
  • May. 8, 2014: webpage draft completed.
  • Sep. 18, 2014: first assignment posted, due date Sep. 25, 2014, 9pm EST.
  • Oct. 2, 2014: second assignment posted, due date Oct. 11, 2014, 9pm EST.
  • Oct. 14, 2014: mid-term review, in class.
  • Oct. 16, 2014, mid-term, in class.
  • Oct. 21, 2014, 9pm EST, reading assignment proposal due. You are allowed to update it until Oct. 28, 2014, 9pm EST.
  • Nov. 16, 2014, 9pm EST, reading assignment final report due.
  • Nov. 13, 2014: third assignment posted, due date Nov. 25, 2014, 9pm EST.
  • Dec. 11, 2014, final, in class.

Course summary

In this course we will discuss some advanced database concepts and techniques (besides a recap of basics like SQL and relational algebra), including:
1. Storage Models, Indexing, Views
2. Query Optimization (include Optimal Join Algorithm)
3. Transactions
4. Data Privacy (e.g., Differential Privacy)
5. I/O-Efficient Algorithms (e.g., List Ranking)
6. Streaming Algorithms (e.g., Count-Min Sketch)
7. MapReduce Algorithms (e.g., Matrix Computation)
Detailed list of topics is available in the course schedule below.

Participants are expected to have a background in algorithms and data structures, and some basic knowledge in databases.

The evaluation will be based on a set of exercises and exams.

Lecturer

Qin Zhang
Email: qzhangcs@indiana.edu
Office hour: Tuesday 2:45pm-3:45pm (LH430A)

Associate Instructors
  • Erfan Sadeqi Azer. Office hour: Friday 10:00am-11:00am (LH406)
  • Le Lei. Office hour: Wednesday 1:30pm-2:30pm (LH Abyss, in front of LH215 the admin office)
  • Yifan Pan (coordinator): Monday 10:30am-11:30am (LH401A)
  • Ali Varamesh. Office hour: Friday 10:00am-11:00am (LH406)
  • Prasanth Velamala: Office hour: Friday 10:00am-11:00am (LH112)

  • Time and place

    Tuesday/Thursday, 11:15am-12:30pm (session 1), 1:00pm-2:15pm (session 2)
    INFO CONNECTOR Room 150

    Texbooks

    Main reference book (we go beyond this):
  • Database Systems: The Complete Book, by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom, 2nd Edition, Prentice Hall

  • Below are a few database texbooks that you may or may not want to read.
  • Readings in Database Systems, Hellerstein and Stonebraker, eds., 4th Edition (will be our readings)
  • Database Management Systems, by Ramakrishnan and Guhrke, 3rd Edition, McGraw-Hill.
  • Database System Concepts, by UllSilberschatz, Korth and Sudarshan, 6th Edition, McGraw-Hill.
  • Foundations of Databases: The Logical Level, by Abiteboul, Hull, Vianu, 1st Edition, Addison-Wesley.

    Related courses

    Course schedule

    (subject to adjustments as we go along).

     Week   Date   Section   Content   Slides   Comments 
      1   Aug. 26     Touch-base Exam      
      1   Aug. 28     0. Introduction     slides
     models
     
      2   Sep. 2     1. Basics     SQL     slides     Main Ch. 6 (book chapters all approximate)
      2   Sep. 4       SQL     same    
      3   Sep. 9       Relational Algebra / Datalog     same   Main Ch. 5
      3   Sep. 11       Relational Calculus     same  
      4   Sep. 16     2. Models and Indexing   Data Models     slides    
      4   Sep. 18       Views, Constraints  
      Indexing 
      same     Main Ch. 7, 8, 14  
      5   Sep. 23     3. Optimization     Query Processing     slides     Main Ch. 15  
      5   Sep. 25       Query Processing (cont.)     same    
      6   Sep. 30       Query Optimization     same     Main Ch. 16
      6   Oct. 2       Optimal Join Algo     Ré's slides    
      7   Oct. 7     4. Trasactions   Recovery     slides     Main Ch. 17  
      7   Oct. 9       Recovery (cont.)     same    
      8   Oct. 14     Mid-term Review   Solution for Ass. 1 & 2  
      by Yifan Pan  
          Mid-term preparation:  
      Ass. 1 & 2; RA, RC, SQL,  
      B/B+-tree (Ch. 15.6), query rewrite (Ch. 16.2),  
      cost estimation / DP (Ch. 16.4, 16.6),  
      query plan, index (Ch. 15.6),  
      Joins (Ch. 15.3, 15.4, 15.5)  
      8   Oct. 16     Mid-term            
      9   Oct. 21     4. Trasactions (cont.)   Concurrency Control     same     Main Ch. 18  
      9   Oct. 23       Concurrency Control (cont.)     same    
      10   Oct. 28   5. I/O-Efficient Algorithms     Sorting   slides     Sitchianva's note  
      CACM article
      Sec. 5 of Vitter's book 
      10   Oct. 30     List Ranking     same   Sitchianva's note  
      Sec. 2 of Zeh's survey
      11   Nov. 4   6. Streaming Algorithms     Sampling   slides     Basic probability slides and proofs
      11   Nov. 6     Distint Elements     same   Section 2 of Chakrabarti's notes  
      12   Nov. 11     Heavy Hitters     same   Section 1, 4 of Chakrabarti's notes  
      For space-saving see this paper 
      12   Nov. 13   7. Data Privacy     Introduction     slides    
      13   Nov. 18     k-Anonymity, l-Diversity     same   Also see Cormode's slides
      13   Nov. 20     Differential Privacy       Final Review
      14   Nov. 25         Thanksgiving Break
      14   Nov. 27         Thanksgiving Break
      15   Dec. 2   9. Student Presentations     Achanta,  
      Garimella+Adusumilli,  
      Brundavanam+Krukau  
      Bellinger,  
      Kumar+Chandrashekharachar,  
      Gaikwad+Shelke  
     
      15   Dec. 4     Chen,  
      Supe+Jagasia,  
      Pathak+Jazayeri  
      Gupta,  
      Loya+Ramakrishna,  
      Luo and Peng  
     
      16   Dec. 9     Sureshkumar+Rangarajan,  
      Shivaramakrishnan,  
      Singh,  
      Pillai+Modi,  
      Mohsen,  
      Vijaya+Sasikumar  
     
      16   Dec. 11   Final       Final preparation:  
      Transactions,  
      I/O algorithms (time forward processing),  
      Streaming algorithms  

    Grading

    • Assignments 50%

      (1) Three written assignments (10% each). Assignments be posted in the middle of the course. The answers should be typeset in LaTeX; here is a template to start with.
      (2) One reading assignment (20%). One or a group of two read some (1 to +∞) papers/surveys and write a proposal (1 page, including which topic you choose and which papers/articles you plan to read) and a report (4 pages for one, and 8 pages for a group of two) on what you *think* of the articles you read (not just a repeat of what they have said). Topics can be found in this redbook, and more topics below (google the papers yourself; contact AI if you cannot find it). Selected students/groups will give 25mins talks (20mins presentation +5mins Q&A) in class. The best (at least) 1/3 individuals/groups will get a bonus in their final grades. Details will be announced in class.

    • Exams 50%

      (1) Mid-term 20%
      (2) Final 30%

    More reading topics

    Course policies

    For assignments, students may discuss answers with anyone, including problem approaches and proofs. But all students must write their own proofs, and write-ups. The names of all people that you have talked to should be listed at the beginning of the first page. If a solution comes from existing papers/web/books, they must be properly cited, and you must write the solution in a way that demonstrates your understanding (simply copying the solution will be considered as plagiarism). All deadlines are firm. No late assignments will be accepted unless there are legitimate circumstances.
    For more details, see Indiana University Code of Student Rights, Responsibilities, and Conduct.

    Prerequisites

      Participants are expected to have a background in algorithms and data structures (have taken C241 "Discrete Structures for Computer Science", C343 "Data Structures", B403 "Introduction to Algorithm Design and Analysis" or equivalent courses), and some basic knowledge in databases.