Syllabus

This course aims to teach you how to write fast code for parallel computers and specialized hardware accelerators. It is a spiritual follow-on to Software Performance Engineering (SPE), in which we graduate from mainstream CPUs to things like GPUs. As a prerequisite, we expect you to have taken SPE or to have comparable experience with low-level programming, performance analysis, computer architecture, and performance optimization.

Key things we expect you to learn include:

Topics:

Grading

The overall course grade comprises:

Late submissions

We will generally also allow a short grace period after the deadline to accommodate minor technical hiccups, at our discretion. You don’t need to ask for accommodation if a bug delayed your submission by 5 minutes.

Late days are intended to cover everyday disruptions like a mild illness, interview travel, or a confluence of deadlines with other classes. Additional extensions over and above this generous baseline will only be considered in exceptional circumstances with S^3 support.

Attendance

You may miss up to 2 live labs, no questions asked. As with late days for lab submissions, this is intended to cover common minor disruptions. Beyond this, excuses for exceptional circumstances will be considered with S^3 support.

Collaboration Policy

You are welcome and encouraged to discuss the coursework and learn from each other, but what you ultimately submit should represent your own individual final implementation.