MIDS Elective Suggestions

This page includes some guidance on how to think about choosing electives as well as a summary of electives that have been taken by MIDS students in the past.

To be clear, the summary of electives below is NOT the full list of all electives available at Duke or even the full list of electives MIDS students have taken in the past — MIDS students have taken dozens of different courses as electives over the years! This is only meant to provide you a sense of some of the most popular courses and the areas in which you may wish to investigate electives!

It’s also worth emphasizing that the best resource on electives is older MIDS students: they’ve taken many of these courses, and so can speak to things like instructor quality and class workload!

Data science is a quickly changing field, so courses are constantly being added, removed, and changed! Please speak to your peers and faculty for their advice on electives and always check course syllabi since they often change from year to year.

How To Think About Electives

As you are deciding what electives to take, it can be helpful to start by thinking about your goals. To illustrate, here are a few common goals:

  • Improving some foundational skills or knowledge: Feel like you want to improve your foundational programming, math or statistics skills? Electives are a great opportunity for that, especially because many people like to learn this kind of material in a university class rather than trying to learn the material on their own later.

  • Get a pre-requisite done: If there are specific classes you really want to take, check to see if there are classes you’re required to take to enroll in those classes. Many statistics courses at Duke, for example, require you to take an introduction to Bayesian statistics, so even if you don’t want to take that particular class, you may find it useful so you can take other stats department classes.

  • See if you like a substantive domain/type of data science: Data science is an extremely diverse field, and electives are an opportunity to try on different data science hats to see which fits best. Curious about biomedical applications, or finance, or public policy? Try a data science class in that specific area and see if it resonates with you!

Electives by Topic Area

Below is a list of electives by topic area. Again, this is not an exhaustive list of electives you can take, or even an exhaustive list of electives MIDS students have taken; these are just some classes and topic areas to get you thinking!

Bayesian Statistics

  • POLSCI 643S: Applied Bayesian Modeling

    Course Description: This course covers the theoretical and applied foundations of Bayesian statistical analysis. It introduces the logic of Bayesian inference, the idea of regularization, the role of subjective priors, the likelihood, and the posterior distribution. We will discuss model checking and model comparison. Applied Bayesian models include hierarchical models, factor analysis and item response theory models, treatment effect models, and generalized additive models. Throughout the course, we will focus on the flexible modeling of data arising in social/political science, as well as in public health. We will also pay close attention to the presentation and interpretation of substantive results.

  • STA 601L / STA 602L: Bayesian Statistical Modeling and Data Analysis

    Course Description: Principles of data analysis and modern statistical modeling. Exploratory data analysis. Introduction to Bayesian inference, prior and posterior distributions, predictive distributions, hierarchical models, model checking and selection, missing data, introduction to stochastic simulation by Markov chain Monte Carlo using a higher level statistical language such as R or Matlab. Applications drawn from various disciplines.

  • BIOSTAT 724: Introduction to Applied Bayesian Analysis

    Course Description: This is a first course in Bayesian statistical analysis for graduate students in biostatistics. The fundamentals of Bayesian inference are introduced, including Bayes’ Theorem and prior and posterior distributions. Bayesian inference is compared and contrasted with frequentist methods through application to common problems in biostatistics. Inference based on conjugate families, as well as a computation-based introduction to Markov chain Monte Carlo methods is presented. Bayesian regression models are introduced, including model checking and selection, followed by an introduction to Bayesian hierarchical regression models. The course format emphasizes applied data analysis and is more heavily weighted toward heuristics and computation-based exploration of Bayesian methods rather than an intense mathematical treatment. Students should have a working knowledge of probability theory, likelihood, and applied frequentist data analysis including linear and logistic regression, and an understanding of how calculus is used in biostatistical applications. Prerequisite: None. Credits: 3

Computer Vision

  • BME 548L: Machine Learning and Imaging

    Course Description: Welcome to Duke University’s Machine Learning and Imaging (BME 548) class! This class aims to teach you how to improve the performance of your deep learning algorithms by jointly optimizing the hardware that acquired your data. It primarily focuses on imaging data, for example from cameras, microscopes, MRI, CT, and ultrasound systems. It begins with an overview of machine learning and imaging science, and then focuses on the intersection of the two fields. This class is for you if 1) you work with imaging systems and would like to learn more about machine learning, 2) you are familiar with machine learning and would like to know more about how your data is gathered, 3) you work with both imaging systems and machine learning and would like to hear a new perspective on the topic, or 4) you work with neither imaging systems nor machine learning but have a strong mathematical background and are motivated to learn about both.

  • COMPSCI 527: Computer Vision

    Course Description: Image formation and analysis; feature computation and tracking; image, object, and activity recognition and retrieval; 3D reconstruction from images. Prerequisites: Mathematics 221, 218 or 216; Mathematics 212; Mathematics 230 or Statistical Science 230; Computer Science 101; Computer Science 230.

Entrepreneurship & Business

  • I&E 748: New Ventures: Discover

    Course Description: This course is designed to lead you to a eureka moment by teaching you how to explore the world around you for problems worth solving. Instead of jumping directly into problem solving and solution development (which can often be wasteful without a clear understanding of a given market and customer need), this course focuses on research, exploration, and discovery. It asks students to set aside pre-conceived notions, avoiding some of their own blind spots, in order to do the necessary work of collecting data about the market and learning to assess it as objectively as possible. This course is ideal for anyone who wants to excel at finding white space for new innovation and entrepreneurial action.

  • I&E 748: New Ventures: Deliver

    Course Description: Did your idea pass muster in New Ventures Develop? Do you have early revenue or evidence of product-market fit and want to continue to refine your go-to-market strategy? New Ventures Deliver is the ideal course for serious entrepreneurs ready to push themselves to take the leap. In this course you will continue to test core hypotheses while you develop a milestone-driven plan for go-to-market, sales, staffing, and fundraising.

  • I&E 800: Business Fundamentals

    Course Description: Using entrepreneurship as a backdrop, this course provides a broad overview of business, including practical business fundamentals and theoretical frameworks for critical thinking. Students will experience the early stages of a typical startup, examine the theoretical basis for startup success, understand managing and operating within an organization, and conduct a business analysis of competing companies.

Ethics

  • BIOETHICS 676: Ethical Technology Practicum

    Course Description: Interdisciplinary practicum aiming to provide foundational knowledge in legal, ethical and policy frameworks for developing safe and ethical approaches to use of technological developments together with a practical opportunity to use this knowledge and principles of ‘ethics by design’ to create ethical policies and uses of technology or design of the products or platform itself. In addition to developing substantive knowledge around ethical tech, the students are expected to develop practical skills around collaboration, analysis, research, drafting, and written and oral communication.

Finance

For Quantitative Finance Electives, please see this page.

Geospatial (GIS)

  • ENVIRON 559: Fundamentals of GIS and Geospatial Analysis

    Course Description: Fundamental aspects of geographic information systems and satellite remote sensing for environmental applications. Covers concepts of geographic data development, cartography, image processing, and spatial analysis. Gateway into more advanced training in geospatial analysis curriculum. Consent of instructor required.

  • ENVIRON 558: Satellite Remote Sensing for Environmental Analysis

    Course Description: Environmental analysis using satellite remote sensing. Theoretical and technical underpinnings of remote sensing (corrections/pre-processing, image enhancement, analysis) with practical applications (land cover mapping, change detection such as deforestation mapping, forest health monitoring). Strong emphasis on hands-on processing and analysis. Will include a variety of image types: multi-spectral, hyper-spectral, radar and others. Recommended prerequisite: familiarity with GIS.

  • ENVIRON 859: Geospatial Data Analytics

    Course Description: Provides training in more advanced skills such as GIS database programming, modeling applications, spatial decision support systems and Internet map server technologies. The course requires a fundamental knowledge of geospatial analysis theory, analysis tools, and applications. Consent of instructor required. Prerequisite: Environment 559 and Environment 761, 765, or 789.

  • One may also pursue a full Nicholas School GIS Certificate

Math, Probability, and Statistics

  • MATH 641: Probability

    Course Description: Designed to be a sequel to Statistical Science 711. The basic five topics are: martingales, Markov chains from an advanced viewpoint, ergodic theory, Brownian motion and its applications to random walks, Donsker’s theorem and the law of the iterated logarithm, and multidimensional Brownian motion, connection to PDEs. For those who have not had 711, we will prove the law of large numbers using martingales and obtain versions of the central limit theorem from Donsker’s theorem. Course requires a knowledge of measure theory. Prerequisite: Statistical Science 711 or Mathematics 631.

  • MATH 712: Multivariate Calculus

    Course Description: Partial differentiation, multiple integrals, and topics in differential and integral vector calculus, including Green’s theorem, the divergence theorem, and Stokes’s theorem. An assignment will ask the student to relate this course to their research.

  • MATH 718: Matrices and Vector Spaces

    Course Description: Solving systems of linear equations, matrix factorizations and fundamental vector subspaces, orthogonality, least squares problems, eigenvalues and eigenvectors, the singular value decomposition and principal component analysis, applications to data-driven problems. An assignment will ask the student to relate this course to their research.

  • MATH 730: Probability

    Course Description: Probability models, random variables with discrete and continuous distributions. Independence, joint distributions, conditional distributions. Expectations, functions of random variables, central limit theorem. An assignment will ask the student to relate this course to their research.

  • MATH 780: Calculus and Probability

    Course Description: Introduction to calculus of real-valued functions with an emphasis on applications to probability. Topics include an introduction to elementary functions, differentiation and applications, integration, and continuous probability distributions. Intended for graduate students in social and applied sciences.

  • STA 611: Introduction to Mathematical Statistics

    Course Description: Formal introduction to basic theory and methods of probability and statistics: probability and sample spaces, independence, conditional probability and Bayes’ theorem; random variables, distributions, moments and transformations. Parametric families of distributions and central limit theorem. Sampling distributions, traditional methods of estimation and hypothesis testing. Elements of likelihood and Bayesian inference. Basic discrete and continuous statistical models.

Machine Learning

Note: Courses in this area are constantly changing, so be sure to keep an eye out for new courses!

  • COMPSCI 675D: Introduction to Deep Learning

    Course Description: Provides an introduction to the machine learning technique called deep learning or deep neural networks. A focus will be the mathematical formulations of deep networks and an explanation of how these networks can be structured and ‘learned’ from big data. Discussion section covers practical applications, programming, and modern implementation practices. Example code and assignments will be given in Python with heavy utilization of the PyTorch (or TensorFlow) package. The course and a project will cover various applications including image classification, text analysis, object detection, etc. Prerequisite: ECE 580, ECE 681, ECE 682D, Statistical Science 561D, or Computer Science 571D.

  • ECE 661: Computer Engineering Machine Learning and Deep Neural Nets

    Course Description: This course examines various computer engineering methods commonly performed in developing machine learning and deep neural network models. The focus of the course is on how to improve the training and inference performance in terms of model accuracy, size, runtime, etc. Techniques that are widely investigated and adopted in industrial companies and academic communities will be discussed and practiced. Programming practices on these techniques are designed with heavy utilization of the PyTorch package. Prerequisites: Computer Science 201 or ECE 551D or ECE 751D. Instructors: Y. Chen or H. Li

  • ECE 685D: Intro to Deep Learning

    Course Description: Provides an introduction to the machine learning technique called deep learning or deep neural networks. A focus will be the mathematical formulations of deep networks and an explanation of how these networks can be structured and ‘learned’ from big data. Discussion section covers practical applications, programming, and modern implementation practices. Example code and assignments will be given in Python with heavy utilization of the PyTorch (or TensorFlow) package. The course and a project will cover various applications including image classification, text analysis, object detection, etc. Prerequisite: ECE 580, ECE 681, ECE 682D, Statistical Science 561D, or Computer Science 571D. Instructor: Tarokh

  • ECE 689: Advanced Topics in Deep Learning

    Course Description: Focus on advanced topics in deep learning, particularly methodological methods. This includes discriminative models (e.g., infinite/infinitesimal/physics-informed neural networks), generative models (normalizing flows, graphical models, Bayesian neural networks, non-parametric approaches), and topics on inference (e.g., exact and approximate inference methods). Assignments will provide an opportunity to implement techniques. Instructor: Tarokh

Programming

  • IDS 721: Data Analysis at Scale in Cloud

    Course Description: Data Analysis at Scale in the Cloud is a project-based course with extensive hands-on assignments. This course is designed to give students a comprehensive view of cloud computing including Big Data and Machine Learning. A variety of learning resources will be used including interactive labs on Cloud Platforms (Google, AWS, Azure).

  • BIOSTAT 821: Software Tools for Data Science

    Course Description: A data scientist needs to master several different tools to obtain, process, analyze, visualize and interpret large biomedical data sets such as electronic health records, medical images, and genomic sequences. It is also critical that the data scientist masters the best practices associated with using these tools, so the results are robust and reproducible. The course covers foundational tools that will allow students to assemble a data science toolkit, including the Unix shell, text editors, regular expressions, relational and NoSQL databases, and the Python programming language for data munging, visualization and machine learning. Best practices that students will learn include the Findable, Accessible, Interoperable and Reusable (FAIR) practices for data stewardship, as well as reproducible analysis with literate programming, version control, and containerization. Credits: 3

  • MATH 560: Theory and Practice of Algorithms

    Course Description: The mathematical theory of algorithms and graphs and their practical implementations. Examines the foundational mathematical structures for the behavior and analysis of algorithms from a variety of domains, with a particular emphasis on graphs. Students tie theory to practice by writing code to implement algorithms, and compare experimentally observed run-times to those predicted by the mathematical theory. Recommended prerequisite: Computer Science 201; or recommended corequisite: ECE 551; or equivalent.

  • MATH 561: Numerical Linear Algebra, Optimization and Monte Carlo Simulation

    Course Description: Singular Value Decomposition, Principal Component Analysis, QR Factorization, Least Squares Problems, Conditioning and Stability, Direct Method for Linear Systems – Gaussian Elimination, Cholesky Factorization, Iterative Methods for Linear Systems – Conjugate Gradients, GMRES, Preconditioning, Eigenvalue Problem – Power Method, Rayleigh Quotient, Inverse Iteration, QR Algorithms, Newton Method for Nonlinear Equation, Multigrid Method and Fast Fourier Transform.

  • STA 663L: Statistical Computing and Computation

    Course Description: Statistical modeling and machine learning involving large data sets and challenging computation. Data pipelines and databases, big data tools, sequential algorithms and subsampling methods for massive data sets, efficient programming for multi-core and cluster machines, including topics drawn from GPU programming, cloud computing, Map/Reduce and general tools of distributed computing environments. Intense use of statistical and data manipulation software will be required. Data from areas such as astronomy, genomics, finance, social media, networks, neuroscience. Instructor consent required. Prerequisite: Statistics 521L, 523L; Statistics 532 (or co-registration).

  • ECE 551D: Programming, Data Structures, and Algorithms in C++. Editorial Comment: An extremely difficult course. Do not enroll lightly or concurrently with other difficult courses.

    Course Description: Students learn to program in C and C++ with coverage of data structures (linked lists, binary trees, hash tables, graphs), Abstract Data Types (Stacks, Queues, Maps, Sets), and algorithms (sorting, graph search, minimal spanning tree). Efficiency of these structures and algorithms is compared via Big-O analysis. Brief coverage of concurrent (multi-threaded) programming. Emphasis is placed on defensive coding, and use of standard UNIX development tools in preparation for students’ entry into real world software development jobs. Not open to undergraduates. Instructors: Hilton, Lipp, Pastorino, or Younes

  • ECE 590-1: Theory and Practice of Algorithms

    Course Description: This course ties the mathematical theory of algorithms and graphs to their practical implementations. Students will learn about the mathematical structures that form the foundations for the behavior and analysis of algorithms from a variety of domains, with a particular emphasis on graphs. Students will also tie that theory to practice by writing code to implement those algorithms, and comparing experimentally observed runtimes to those projected by the mathematical theory.

  • ECE 651: Software Engineering

    Course Description: Teaches students about all steps of the software development lifecycle: requirements definition, design, development, testing, and maintenance. The course assumes students are skilled object-oriented programmers from prior courses, but will include a rapid introduction to Java. Students complete a team-based, semester-long software project which will progress through all phases of the software lifecycle. Prerequisite: Electrical and Computer Engineering 551D or 751D. Instructors: Derby, Hilton, Noyce, Pastorino, or Rahbar

  • BIOSTAT 823: Statistical Programming for Big Data

    Course Description: This course will extend the foundation laid in Software Tools for Data Science to allow for efficient computing involving very large data sets. This course will explore the use of appropriate algorithms and data structures for intensive computations, improving computational performance by use of native code compilation, use of parallel computing to accelerate intensive computations, use of appropriate algorithms and data structures for massive data sets, and use of distributed computing to process massive data sets. Prerequisite: BIOSTAT 821 or permission of the Director of Graduate Studies. Credits: 2

Time Series

  • ENVIRON 797: Time Series Analysis for Energy and Environment Applications

    Course Description: This course focuses on time series analysis, modeling, and forecasting, specifically within the context of energy and the environment. Lectures will include theory and applications using the R programming language. Datasets from organizations like the US Energy Information Administration (EIA), National Oceanic and Atmospheric Administration (NOAA), National Renewable Energy Laboratory (NREL) and US Geological Survey (USGS) will be used. Upon completion of the course, students will be able to use R to carry out basic statistical modeling and analysis as well as fitting models to data. The primary objective of the course is to empower students to extract meaningful predictions and insights from data.

  • STA 542: Intro to Time Series

Social Networks

  • SOCIOL 728: Introduction to Social Networks

    Course Description: Introduction to social network analysis (SNA). History of SNA; social-theoretical foundations of modern network analysis; data collection; data management; analysis and visualization tools. Survey of current applications of SNA within the social sciences.