AMAT 585: Practical Methods in Topological Data Analysis

Fall 2023, Class #7604

Monday, Wednesday 11:40-1:00, Humanities 115

Instructor: Michael Lesnick
mlesnick [at] albany [dot] [the usual thing]
Office Hours: Monday, Wednesday 4:30-5:30, and by appointment.

About this Course:
This is the third course in a three-semester sequence on Topological Data Analysis (TDA), aimed primarily at students in Albany's Data Science MS program. It is a project-based course whose goal is to give students hands-on experience with TDA, and with data analysis more broadly.

The course centers on a single semester-long project.

This is a fully in-person course, and students are required to attend. I do understand that situations may occasionally arise that make attendance impossible, and I intend to be flexible.

Tentative Course Plan:

Project Details:
The project will be chosen by the student, with my input and consent. Most projects will involve the application of topological and geometric tools to real-world data. Projects focused either on supervised learning or on exploratory data analysis are acceptable. A project focused on algorithms or on theoretical questions is also an option.

Students will be permitted to work on the final projects in groups if they choose, but I expect that the ground covered by a project will be proportional to the number of people involved, and that responsibilities will be clearly delineated and evenly divided.

Expectations:
This course will require substantial time, independence, and academic maturity. Students should be aware that data analysis can be time-consuming: in addition to the interesting stuff, one has to spend considerable time doing boring things like installing software on one's computer, cleaning data, debugging, troubleshooting, and waiting for long computations to finish. Moreover, I expect that most students will work with data from real-world applications for their projects; substantial time will be required to study the relevant literature and learn the application.

With that in mind, students should expect to devote substantial time to this course. Also, while I am generally available to help with mathematics and data science questions, be warned that my availability to help with technical computing issues (e.g., trouble installing software or bugs in your code) is very limited, and you should expect to handle such issues with little or no help from me. Moreover, depending on the application area you choose to study, I may not be of much help with questions specific to that area.

Prerequisites:
Students are formally required either to have taken TDA I and II (AMAT 583/584) or to have permission of the instructor. In addition, you are expected to have basic competence in programming and in using computers for data analysis, at the level of AMAT 502.

Course Materials:
There will be no course textbook or other formal set of course materials, but for the lecture portion of the course, I will make my (handwritten) lecture notes available.

Software:
Much of the TDA software we will use in this course can be found on GitHub, under the tag "topological-data-analysis". There is quite a lot there, and I will make more specific suggestions about what software to use as the course progresses.

You might also find scikit-learn to be useful for clustering and dimensionality reduction.
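To give a rough idea of what this looks like in practice, here is a minimal sketch that reduces a point cloud to two dimensions with PCA and then clusters it with k-means. The synthetic dataset, the choice of PCA and k-means, and all parameter values are assumptions made purely for illustration; they are not a recommended pipeline for your project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic point cloud (an assumption for illustration only):
# 300 points in 10 dimensions, drawn from 3 Gaussian blobs.
X, _ = make_blobs(n_samples=300, n_features=10, centers=3, random_state=0)

# Dimensionality reduction: project onto the top two principal components.
X_2d = PCA(n_components=2).fit_transform(X)

# Clustering: k-means with k = 3 on the reduced data.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print(X_2d.shape)              # shape of the reduced data
print(np.unique(labels).size)  # number of distinct cluster labels
```

With real data, you would replace the synthetic X with your own array of shape (number of points, number of features), and the number of clusters would typically not be known in advance.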

Recommended reading (a very incomplete list):

Grading:
The class will use the university's A-E grading scheme.

50%: Final Project (including both oral and written components)
30%: Other Presentations
20%: Attendance/Participation/Engagement

Academic Regulations:
Naturally, the University's Standards of Academic Integrity apply to this course, and students are expected to be familiar with these. Pay particular attention to the sections on plagiarism.