This course is a graduate level introduction to multicore computing with a focus on software challenges. We will cover multicore architectures, memory models, transactional memory, locking schemes, concurrent data structures, multicore programming languages, debugging including race detection and model checking, and performance analysis and optimizations.

Why should you take this course?

Sequential software scales linearly with speed of a single core. It is as if hardware designers are limiting software speed. But due to heat generated by fast cores, chip designers have now hit a wall trying to speed up cores. The only way forward for them is to make the speed a little slow and increase the number of cores. There is no heat wall here. They can design as many cores as they want with corresponding reduction in speed. But now the limiting factor is software and the pressure is on software designers to parallelize their implementations so that hardware designers can introduce more cores. This has made it inevitable for software designers to leave the comfortable and simple to understand world of sequential software and understand the complexities of multicore systems where architectural details don’t remain hidden (e.g. memory models and cache coherence), where bugs often don’t reproduce with the same inputs, and where performance sometimes decreases on multiple cores and sometimes increase beyond what is mathematically possible (super-linear speedup). You should take this course if you want to pursue a career in software design and development or if you want to research in architecture, operating systems, software reliability, program analysis, or programming languages.

Why should you not take this course?

  1. Attendance is mandatory (see policy below). Do not take this course if you cannot be in class on time.

  2. This course involves reading and discussing research papers. Research papers are not written like a book. You need to skim them before class, discuss them, be attentive in the lectures, and then again read them thoroughly after class to grasp them. If you cannot give time approximately to one paper per week, do not take this course.

  3. This course involves programming assignments. Do not take this course if you don’t like programming. A litmus test to see for yourself if you have enough programming expertise is to ask yourself if you can implement a binary search tree in the language of your choice right now without help and without Internet.

Can you find the 18 cores in this Xeon chip?
Office Hours:

TTh 10:45AM–11:30AM

Course Information

Class meetings

Tue/Thu 9:30AM–10:45AM in TBD


Graduate standing in CS/EE or CS 202 for Juniors/Seniors

Required text



60%Exams (4 of 15% each, some/all could be open book)
30%Assignments (3 of 10% each, see late submission and academic submission policies below)
10%Attedance (See policy below)


Academic dishonesty

You must not turn in work that is not yours. You must not enable someone else to turn in work that is not his or hers. Do not share your work with anyone else and adequately protect all your files. You must not allow someone to openly violate this policy because it diminishes your effort as well as that of your honest classmates.

Changing your exam answers after they have been graded, copying answers during exams, or plagiarizing the work of others will be considered academic dishonesty and will be subject to disciplinary penalties, including the possibility of failure in the course and/or dismissal from the University. Plagiarism detection software will be used on the programs submitted in this class.

Attendance & class decorum

Attendance is mandatory. You lose two absolute percentage points from the 10 attendance marks for every class missed except for the classes on 1st and 3rd September (Add period) and except for OSA approved absentees. Class is considered missed if you are not there at start of class. If you miss more than five classes, you will get an F in the course regardless of performance in assignments and exams.

Your behavior should not be disruptive during class and should not hinder in other students’ learning. In particular do not chat with your neighbors and keep your cell phones turned off in class.

Late submissions & missed exams

All work must be turned in by electronic submission before the deadline (no e-mailed submissions). Do not submit at the last moment. If you submit your assignment late even by a second, it will not be considered.

For OSA approved cases that span more than half the duration from assignment release to assignment submission, you may get extra days but no extra days will be awarded after the deadline has already passed.

OSA approved petitions for missed mid-term and final examinations will most likely receive an average score after a deduction according to their semester performance.

Tentative Schedule


Introduction and Administrivia

Single core out-of-order, speculative, and multiple issue advancements but heat and speed of light limitations, Inevitibility of multicore programming, Cache hierarchy

Software and the Concurrency Revolution, ACM Queue, Volume 3, issue 7, October 2005

Cache coherence, Programmatically finding cache and cache line sizes, Spatial and Temporal locality, Matrix Vector multiplication and performance analysis

A Primer on Memory Consistency and Cache Coherence, Chapter 2

More on cache performance analysis, Introduction to Mutual exclusion problem,

Memory Consistency problem and how it differs from Mutual exclusion and Cache Coherence, seqeuntail consistency, total store order, relaxed memory model, fence/barrier instructions, atomic instructions, and bus locking

A Primer on Memory Consistency and Cache Coherence, Chapter 3 and 4

Peterson’s Algorithm and Multiprocessor issues, Volatile keyword, Spinlocks, Busy waiting, Higher level locks with sleep queues

The Art of Multiprocessor Programming, Chapter 2

Thread level parallelism, Context switching, Java and C++ semantics to create threads with and without lambda expressions

Eid ul Adha

Mutual Exclsuon, Progress, Bounded Waiting, Synchronized construct in Java, Sleeping Lock, Deadlocks

Exam 1 discussion

Producer Consumer problem, Unbounded Queue implementation, Condition variables, wait, notify, Spurious wakeup

Last day to drop this course on 9 Oct.

Reader Writer Locks, Starvation, Aging, Lost wakeup, Multiple condition variables (Bounded queue), Monitor, Monitor Composition and Nested monitor calls, Deadlock

Safety and Liveness properties, Linked List, Linearization points, Sentinal nodes, Coarse and fine grained, hand-over-hand

The Art of Multiprocessor Programming, Chapter 9

Exceptions with explicit locks, fine grained, Optimistic, Lazy

The Art of Multiprocessor Programming, Chapter 9

Last day to withdraw from this course.

Spinlocks, contention

Concurrent structures

More about concurrent structures


Work stealing, and load balancing

Transactional Memory

Lockset race detection

Happens-before race detection

Model Checking

Partial order reduction

Performance Analysis

Closing notes