Data science (2019 - 2020)
Data Science for Everyone
DS-UA 111 Prerequisite: high school algebra or permission of the program. Lecture and laboratory. Offered every semester. 4 points.
Prepares students to participate in the data-driven world that we are all experiencing. Students engage with core principles of data analysis and programming, and gain practical experience with real-world datasets from the humanities, social sciences, and natural sciences. Students are also introduced to ethical, legal, and privacy issues. Aims to transform students from passive consumers of conclusions about data that other people have made to informed, empowered, and critical readers and producers of data-driven insights. Also prepares students for further study in data science. Open to students from any discipline with any level of experience in computer science and/or statistics (including no experience at all).
Introduction to Data Science
DS-UA 112 Prerequisite: Data Science for Everyone (DS-UA 111), or Introduction to Computer Programming (No Prior Experience) (CSCI-UA 2), or Introduction to Computer Programming (Limited Prior Experience) (CSCI-UA 3), or Introduction to Computer Science (CSCI-UA 101), or permission of the program. Lecture and laboratory. Offered every semester. 4 points.
Offers the fundamental principles and techniques of the field. Students develop a toolkit to examine real-world examples and cases to place data science techniques in context, to develop data-analytic and inferential thinking, and to illustrate that proper application is as much an art as it is a science. In addition, students gain hands-on experience with the Python programming language and its associated data analysis libraries. Examines ethical implications surrounding privacy and data sharing, as well as algorithmic decision making for a given data science solution. Suitable for those with prior programming experience who seek a more rigorous overview of the field of data science.
Causal Inference
DS-UA 201 Prerequisite: Introduction to Data Science (DS-UA 112) and Probability and Statistics (MATH-UA 235), or permission of the program. Lecture and laboratory. Offered every semester. 4 points.
Causal Inference provides students with the tools for understanding causation, i.e., the relationship between cause and effect. We start with the situation in which you are able to design and implement the data gathering process yourself, called the experiment. We then define causation, identify the preconditions required for A to cause B, show how to design perfect experiments, and discuss how to understand threats to the validity of less-than-perfect experiments. After covering experimental design, the course turns to careful approaches for non-experimental data, including quasi-experiments, regression discontinuities, differences in differences, and contemporary advanced approaches.
Responsible Data Science
DS-UA 202 Prerequisites: Introduction to Data Science (DS-UA 112) and Probability and Statistics (MATH-UA 235). Lecture and laboratory. Offered every semester. 4 points.
The first wave of data science focused on accuracy and efficiency: on what we can do with data. The second wave is about responsibility: what we should and should not do. Accordingly, this technical course tackles the issues of ethics and responsibility in data science, including legal compliance, data quality, algorithmic fairness and diversity, transparency of data and algorithms, privacy, and data protection. An important feature of this course is its holistic treatment of the data science lifecycle, beginning with data discovery and acquisition, through data cleaning, integration, querying, analysis, and result interpretation.
Machine Learning for Language Understanding
DS-UA 203 Identical to LING-UA 52. Prerequisites: at least one course with a substantial Python programming component (i.e., CSCI-UA 2, CSCI-UA 3, or an advanced CSCI-UA or other programming course); basic experience with calculus (i.e., MATH-UA 121, 122, or 123 or credit for testing out of one or more of these courses), and probability theory (e.g., MATH-UA 233), or permission of the instructor. Offered every spring semester. 4 points.
This course covers widely used machine learning methods for language understanding, with a special focus on methods based on artificial neural networks, and culminates in a substantial final project in which students write an original research paper in AI or computational linguistics. If you take this class, you’ll be exposed to only a fraction of the many approaches that researchers have used to teach language to computers. However, you’ll get training and practice with all the research skills that you’ll need to explore the field further on your own. This includes the skills not only to design and build computational models, but also to design experiments to test those models, to write and present your results, and to read and evaluate results from the scientific literature.
Advanced Topics in Data Science
DS-UA 301 Prerequisite: Introduction to Data Science (DS-UA 112) and Probability and Statistics (MATH-UA 235), or permission of the program. Lecture and laboratory. Offered every semester. 4 points.
Advanced Topics in Data Science exposes students to two specialized topics within data science. Examples of topics include time series, deep learning, and other advanced machine learning topics. Students learn the theoretical underpinnings of advanced data science techniques and engage in hands-on activities to build a practical toolkit.