Dealing with Big Data – The Summer Bioinformatics Program by The RNA Institute

Summer Bioinformatics Program via Zoom
By Sangeetha Selvam, The RNA Institute

In the current era of genetics, high-throughput experiments generate an immense amount of data. To store, process, and analyze these data and extract the genetic information they contain, we rely on bioinformatics. This summer, The RNA Institute at the University at Albany, SUNY held a bioinformatics program in which trainees analyzed biological data generated by the institute’s research programs. This year the program welcomed 34 trainees, including high school students, undergraduates, graduate students, postdoctoral fellows, and research technicians. Each trainee worked closely with RNA Institute faculty and analyzed data generated in their labs over the course of the program.

“This program is an essential part of the Institute’s efforts to train the next generation of RNA researchers because having the computational skills to mine genetic data sets is going to be an important skill going forward for scientists,” says Prof. Andy Berglund, Director of The RNA Institute.

The bioinformatics program was directed by Dr. Hannah Shorrock, a Postdoctoral Associate in the Berglund lab of The RNA Institute, who had taken the same program last year. Asked about the program and its objective, Dr. Shorrock says, “The idea behind the summer bioinformatics program is to provide students with necessary skills to be able to perform independent research using sequencing data. As part of this, we try to expose the students to multiple RNA sequencing technologies including single nucleus RNA Seq, nanopore sequencing and sequencing based approaches to study RNA structure.”

One of the group leaders from last year’s bioinformatics program, Ms. Emily Davey, continued to pursue her interest in bioinformatics and now works as the bioinformatics technician in the Berglund lab. This year Ms. Davey served as both an instructor and a group leader. The students were split into small groups, and each group was assigned a group leader experienced in data analysis to assist the trainees closely. Ms. Davey believes that it is crucial for trainees to learn not only to produce data but also to process them. Through this course, the trainees are taught the basics of bioinformatics and the skills needed to get started with genetic data processing. Ms. Davey adds, “We were also particularly interested in exposing underrepresented minorities to this field, as the field of bioinformatics is still largely homogeneous. We hope that by equipping these students with bioinformatics skills, we can make way for a new generation of diverse researchers who are skilled in the dry lab as well as the wet lab.”

The program ran for 10 weeks during the summer and covered the basics of the R programming language and RStudio, along with an introduction to the command-line interface and the terminal. This year’s program also included several research seminars (on sequencing COVID variants, single-cell RNA-seq, cellular differentiation, and using bioinformatics to inform computational modelling) and multiple professional development workshops covering topics such as rigor and reproducibility in research, CV clinics, and the transition to graduate school. At the end of the program, each trainee gave a 12-minute talk to the RNA Institute faculty, staff, and students on their analysis of the dataset from their research group.

Testimonials from a few of our trainees:

“…Going into the program, I didn’t really know what Bioinformatics was, and was curious to see how Biology and Informatics could connect and help answer a wide variety of research questions. Overall, I had an amazing experience in the program learning tools used to analyze RNA sequencing data and applying Bioinformatics to Myotonic Dystrophy and Cancer…” – Labika Baral, Undergraduate student, University at Albany SUNY

“…I applied for this program because I wanted to learn more about how data is analyzed in published papers. Overall, the program was great. It gave me an excellent introduction to bioinformatics led by experts who cared about bioinformatics…” – Paul Avik, Undergraduate student, University at Albany SUNY

“…As an eager scientist, I felt called to learn a new set of skills that would enhance my independence and research capabilities. This was easily one of my greatest decisions I have made in my academic career as I was able to earn money, network, learn, and be engaged in a healthy expansive environment that nourished my mind and prepared me to become an effective environmental health epidemiologist…” – Gabrielle Roosevelt, Undergraduate student, University at Albany SUNY

“…We not only learned bioinformatics skills through the lectures but also experienced many other interesting talks and professional development seminars…” – Forrest Gao, High School Student

“…I had a wonderful experience during this entire training program, which taught me the basics of the UNIX operating system and the R programming language. We had a superb team of faculty, coordinators and participants who guided & supported this program. It helped me grow my bioinformatic skills and to approach biological research data analysis. This program also helped us in preparing for Graduate school and a career in research through various professional development seminars…” – Chetna Mathur, Graduate Student, Begley Lab, University at Albany, SUNY

“…My research project involves bioinformatic analysis of long-read DNA sequencing data. As a trained molecular biologist, I only have limited knowledge of bioinformatics and took this course to get a better understanding of the bioinformatic tools used in my analysis and to get more familiar with handling large DNA sequencing datasets. The course is ideal for beginner and intermediate level. Lessons learned from the first part of the course can easily be applied in individual projects during the second part. The small group setting allows for consistent exchange and troubleshooting with experienced group leaders to ensure steady progress…” – Christina Heil, Postdoctoral fellow, Thornton and Lueck lab at the University of Rochester Medical Center