Power in Numbers

September 23, 2022

Data sciences may well be more collaborative than any other discipline. A diverse field by its nature (involving machine learning, artificial intelligence, statistical sciences, digital communication and more), data sciences is a natural fit to advance research not only within its own core divisions, but across all disciplines.

Across campus, faculty in areas ranging from computer science to education to statistics utilize data science methods, and collaborate with others to solve problems—and to do so within an ethical framework that makes the world better.

Mary Lauren Benton, Ph.D., serves as assistant professor of computer science. A Baylor graduate, she returned to her alma mater in 2020 after earning a Ph.D. in bioinformatics from Vanderbilt. Her research focuses on the application of computer science to interpret how DNA sequences alter genome function and impact disease risk, and to advance understanding of gene regulation. Her work has appeared in Nature Reviews Genetics.

Henry Han, Ph.D., came to Baylor from Fordham University in 2021 to serve as the inaugural McCollum Family Chair in Data Sciences. An internationally recognized researcher, his work impacts fields as varied as genetics, health, finance, stock trading and more. He developed a novel approach to sort through human genetics data and pinpoint three signals that could impact disease, and his leadership in data includes editorship of Recent Advances in Data Science.

Grant Morgan, Ph.D., is a data scientist who applies his work to education. He serves as associate dean for research and professor of educational psychology in Baylor’s School of Education. There, he focuses on measuring educational data, including student testing. His work has been funded by the National Science Foundation (NSF) and National Institutes of Health, and he serves as a reviewer for NSF’s Directorate for Education & Human Resources and Directorate for Social, Behavioral, & Economic Sciences.

They share their insights on data sciences at Baylor—from growing an interdisciplinary field of research to ensuring ethics and a focus on people to make the world a better place.

How would you describe your main research focus?

Grant Morgan: I am formally trained as a psychometrician, and that means that I study the way we measure and explore relationships between variables that we believe exist but can't directly measure. Education, of course, is no stranger to testing. Anytime we think about tests that are used in education to make decisions about students, we're making a decision about something we can't directly see. We study tests and make sure that tests provide the kind of information that is high quality so that we have confidence in the decisions we're making about those students.

Mary Lauren Benton: I am in bioinformatics. The biological questions that drive my research come from the field of gene regulation. Amazingly, every cell in your body has roughly the same DNA sequence, yet those cells form many different types and perform a great diversity of functions. To understand how the diversity of tissues and functions comes from a single sequence, as well as to understand how and why we get sick (what leads to disease from a genetic point of view), we have to be able to interpret the information encoded in that DNA sequence.

Henry Han: My research is in data science and AI, and focuses on how to dig different knowledge from all kinds of data currents. I focus on FinTech data, bioinformatics, and biomedical data. For example, I'm interested in finding trading markets for cryptocurrency, how to build efficient trading machines, and how to use data to figure out future stock financial information. At the same time, I'm also interested in new AI methods to handle other types of data. For example, in some bioinformatic or biomedical data, data is noisy. It can be mislabeled. How can we invent a new AI method to handle this more efficiently?

Data sciences are, by nature, a collaborative discipline. People from other fields need your skills to apply to their own work. What does that demand of a skilled data scientist?

Henry Han: As data scientists, we need to listen to data. Let data talk. There are challenges in data science because there are tons of machine learning and AI tools out there, but it doesn't mean they can apply to all kinds of data. We need to modify, we need to change, we need to invent more customized AI models. That's the part I enjoy so much. For example, when I work with people in engineering or astrophysics, their data is very similar to financial data -- but it's also quite different. From my viewpoint, data science should have different subfields, and those different subfields share the core knowledge of data science methods and models. Each subfield should have its own data-oriented AI or machine learning tools for problem solving.

Mary Lauren Benton: The technical skills are very important, but actually a small part of what I spend my time doing when I'm working on a project in data science. I think that can be often overlooked in a way that I hope is exciting and encouraging for folks who are excited about questions, applications, and finding insight. Yes, you need technical skills in data science and understanding how to do the statistics or how to write the program is really important, but a lot of what you do to be successful here is understanding data, understanding what you want to gain from it, and how you want to communicate that to your colleagues and peers and communities.

Grant Morgan: Sometimes, there is a misconception that data science equals artificial intelligence, or data science equals machine learning. That's not the case. Those are certainly subsumed by data science, but data science is more than that. We've talked a lot about applications in the bench sciences, but there are also applications in the humanities and the social sciences. It makes a lot of sense to me that we would have data science as one of our five pillars because it has such wide application in all the domains that we have represented here.

Ethics in data sciences feels core to Baylor’s mission. What does ethics in data science mean to you?

Grant Morgan: One thing that my colleagues here have mentioned is a great starting place: using methods that are true to the questions that are being asked. I don't teach my students to go out and harvest every data point they can find everywhere and put it into the equivalent of a meat grinder and see what comes out. I teach students to be very principled in how they design studies that adhere to the ethical standards of the field. Also, think about this nuanced difference between data and evidence, because there may be some differences worth considering. Think about those underlying relationships between the data we're collecting and how they align with the kinds of questions we're trying to answer.

Mary Lauren Benton: Data are very powerful and we can learn a lot from them, but data exist in a human context. Even the data we might think of as being impartial have to come from somewhere. Genomics is a good example. The insights we derive from big genetic studies are biased by historic (and current) biases in data collection, such as who participates in research studies or which human populations are the most represented in our models and databases, and are influenced by the kinds of questions the researchers are interested in, how information is communicated to the scientific community and the public… we should consider how all of these things influence the articles we read and the kinds of claims that are made based on the data.

It might be easy for a lot of us to imagine you sitting behind computers all day, looking at numbers. But, as data scientists, how much are you in the ‘people business’?

Mary Lauren Benton: Most of my days are spent thinking about the kinds of scientific questions I can answer, the data that need to be collected and how best to collect them, and how to communicate our findings. All of those steps involve people and collaborations. I think I started in this field to stay out of the people business, but I definitely ended up in a highly communicative and collaborative field, and that is one of its strengths.

Grant Morgan: One of the things that I really try to tap into is why any of us do what we do. I think when you ask the question long enough and in the right ways, we all want the world to be better. Underneath it all, we all want life to be better. Yeah, we are intellectually curious and we like it and those kinds of things, but if you look at downstream...right now, I've got two computers running cluster algorithms on millions of data sets. I'm doing it so that I can write a paper to show how this might be used to design better educational interventions for kids who have certain background experiences. Is what I'm doing right now people-oriented? Yes, because it’s to make life better, hopefully, for kids that I've never met and will never meet. We all love to be alone in our offices with our computers. But why we do it is very, very much governed by our value system that respects people and wants life to be better.

At Baylor, the heart of research is grounded in our mission, compassion for others, and a call to solve our world’s greatest challenges. Discover other conversations with faculty in cancer research, environmental health, human flourishing, and materials science and engineering.