AMAT 585: Practical Methods in Topological Data Analysis

Spring 2022, Class #9126

Tuesday, Thursday 10:30-11:50, Massry 231

Instructor: Michael Lesnick
mlesnick [at] albany [dot] [the usual thing]
Office Hours: Tuesday, Thursday 3:00-4:00, and by appointment.

About this Course:
This is the third course in a three-semester sequence on Topological Data Analysis (TDA), aimed primarily at students in Albany's Data Science MS program. This is a project-based course whose goal is to give students hands-on experience with TDA, and with data analysis more broadly.

The course centers on a single semester-long project.

This is a fully in-person course, and students are required to attend. I do understand that situations may occasionally arise that make it impossible for students to attend, particularly in the midst of a pandemic. I intend to be flexible.

Tentative Course Plan:


Project Details:
The project will be chosen by the student, with my input and consent. Most projects will involve the application of topological and geometric tools to real-world data. Projects focused on either supervised learning or exploratory data analysis are acceptable. A project focused on algorithms or on theoretical questions is also an option.

Students will be permitted to work on the final projects in groups of up to three people if they choose, but I expect that the ground covered by a project will be proportional to the number of people involved, and that responsibilities will be clearly and evenly delineated.

Expectations:
This course will require substantial time, independence, and academic maturity, much more so than my TDA I and TDA II courses.

Data analysis can be time-consuming: in addition to the interesting stuff, one has to spend considerable time doing boring things like installing software on one's computer, cleaning data, debugging, troubleshooting, and waiting for long computations to finish. In addition, I expect that most students will work with data from real-world applications for their projects; substantial time will be required to study the relevant literature and learn the application.

With that in mind, students should expect to devote substantial time to this course every week, at least 8-10 hours in most cases. Also, while I am generally available to help with mathematics and data science questions, be warned that my availability to help with technical computing issues (e.g., trouble installing software or bugs in your code) is very limited, and you should expect to handle such issues with little or no help from me. Moreover, depending on the application area you choose to study, I may not be of much help with questions specific to that area.

Prerequisites:
Students are formally required either to have taken TDA I and II (AMAT 583/584) or to have permission of the instructor. In addition, you are expected to have basic competence in programming and in using computers for data analysis, at the level of AMAT 502.

Course Materials:
There will be no course textbook or other formal set of course materials, but for the lecture portion of the course, I will make my (handwritten) lecture notes available.

Software:
Much of the TDA software we will use in this course can be found on GitHub, under the tag "topological-data-analysis". There is quite a lot there, and I will make more specific suggestions about what software to use as the course progresses.

You might also find scikit-learn to be useful for clustering and dimensionality reduction.

Recommended reading (a very incomplete list):
Grading:
The class will use the university's A-E grading scheme.

50%: Final Project (including both oral and written components)
30%: Other Presentations
20%: Attendance/Participation/Engagement

Pandemic-Related Challenges:
The pandemic creates a complex set of potential difficulties for students. I intend to hold this class to a high standard of effort, but at the same time, I am mindful of the unique challenges our situation presents, and I intend to conduct the class accordingly. If you are dealing with issues created or exacerbated by the pandemic that risk getting in the way of your being a focused, active participant in this class, please let me know.

Academic Regulations:
Naturally, the University's Standards of Academic Integrity apply to this course, and students are expected to be familiar with them. Pay particular attention to the sections on plagiarism.