Attending college can feel isolating, whether due to the stresses of managing a work-life balance or being in an entirely new environment.
Universities and colleges are trying to do something about this. They are investing in physical health and counseling services as well as financial aid services.
Marilyn Escobar, a data analyst at Florida International University’s Center of Analysis and Information Management, explains the complexities of data analysis in post-secondary education as well as the seemingly unexpected data hidden in surveys of freshmen and high school seniors who are about to graduate.
Kai: Hi, everyone! My name is Kai Wilson. Today, we are meeting Marilyn Escobar. Marilyn works as a Data Analyst here at FIU, in the Office of Analysis and Information. This interview will focus on all things related to data analysis. So, let’s waste no more time at all and learn all about the work that Marilyn does.
Kai: So, my first question is: can you please provide a general overview of your job and role as a data analyst?
Marilyn: Yes, absolutely. So I initially started as a research assistant within this office and eventually had the opportunity to apply for a data analyst role.
Sometimes, I get data requests directly from different offices across the university, either because I have worked with them before or because, similar to you, they look up my office and send me an email. Other times, we complete reports or projects based on themes and trends that have come up among executives within the ‘compass workshop’.
A lot of my job is looking at a group of students and finding as much information as I can about them. Besides demographics, a lot of it is GPAs across different years, their retention, and whether they graduated. The group of students we are analyzing determines which factors and data variables we use.
Kai: Would you say your work pertains to data mining or data profiling? A mixture of the two? Or somewhere beyond that?
Marilyn: Usually, like I said, if we receive a data request, it’s just a long list of student Panther IDs. We are then tasked with gathering the data points we need on those students, conducting the analysis, and seeing if we can find any correlations or trends. For example, I do a semesterly report for a request from the ombudsperson office. I don’t know if you’ve ever heard of it, but they’re a great office for students as well as employees. Any student who is struggling or has a problem with anything, from courses to something in their personal life, can always go to the ombudsperson office. There, they usually get help for whatever problem they have, and the office is able to direct them to someone who can help. If, let’s say, they’re struggling with paying their tuition, the office can send them to the office of scholarships or any of the other programs or offices we have that help with student tuition.
So, I receive a data request every semester with that long list of Panther IDs. What they want to see is whether the students who were helped at the ombudsperson office are retained, as in whether they end up staying at FIU and eventually go on to graduate. We look at different data points besides retention and graduation. We also look at, of course, the GPA, and we compare these students to other people within their cohort.
Receiving a data request is a different process than completing a project within our own office. As I mentioned, for our own projects we usually talk about popular topics within the university system and amongst students, and from there we decide what to do and create our outline. So it depends on where the data is coming from. If it’s coming from our dashboards, the data itself is already clean.
So, the only thing that we essentially have to do is I guess what is officially called “data mining”, which is where we just identify patterns and correlations within the data set. However, if it’s data that we’re receiving from someone else, then of course, we have to go through it. First, we have to assess what we’re given. We have to clean up the data where necessary. Then, and only then, can we go ahead and begin to identify any patterns or correlations.
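Editor’s sketch: the order Marilyn describes for outside data (assess what was given, clean it, and only then look for patterns) can be illustrated in a few lines of Python. This is a minimal sketch and not the office’s actual workflow. The field names (`student_id`, `first_year_gpa`, `credits_earned`), the helper functions, and every value below are invented for illustration.

```python
from math import sqrt

# Hypothetical records; every ID and value here is invented for illustration.
raw_rows = [
    {"student_id": "A1", "first_year_gpa": "3.6", "credits_earned": "30"},
    {"student_id": "A2", "first_year_gpa": "2.4", "credits_earned": "21"},
    {"student_id": "A2", "first_year_gpa": "2.4", "credits_earned": "21"},  # duplicate row
    {"student_id": "A3", "first_year_gpa": "", "credits_earned": "27"},     # missing GPA
    {"student_id": "A4", "first_year_gpa": "3.0", "credits_earned": "24"},
    {"student_id": "A5", "first_year_gpa": "3.9", "credits_earned": "33"},
]


def assess(rows):
    """Step 1: count blank values per column to see what needs cleaning."""
    missing = {}
    for row in rows:
        for key, value in row.items():
            if value.strip() == "":
                missing[key] = missing.get(key, 0) + 1
    return missing


def clean(rows):
    """Step 2: drop repeated IDs and rows with blanks; convert text to numbers."""
    seen, cleaned = set(), []
    for row in rows:
        has_blank = any(value.strip() == "" for value in row.values())
        if row["student_id"] in seen or has_blank:
            continue
        seen.add(row["student_id"])
        cleaned.append({
            "student_id": row["student_id"],
            "first_year_gpa": float(row["first_year_gpa"]),
            "credits_earned": int(row["credits_earned"]),
        })
    return cleaned


def pearson(xs, ys):
    """Step 3: Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


missing_report = assess(raw_rows)
students = clean(raw_rows)
r = pearson(
    [s["first_year_gpa"] for s in students],
    [s["credits_earned"] for s in students],
)
print("Blank values per column:", missing_report)
print("Rows kept after cleaning:", len(students))
print("GPA vs. credits correlation:", round(r, 3))
```

Data that arrives already clean, such as the dashboard data mentioned above, would skip straight to the third step.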
Kai: How do the types of data stories like mode, median, and mean come across in your work?
Marilyn: That’s definitely one of the first steps that Andrew [Andrew Laginess is a Coordinator of Statistical Research at FIU] does. He runs the basic analysis, just to make sure that the data he has is clear and makes sense statistically. He looks at averages, minimums, and maximums, because we don’t have every single data point for every student that we’re looking at. Sometimes, you can see that a student has a zero for their first-year GPA or cumulative GPA. But that doesn’t look right to us. So, then we look further and ask: Was the student retained? Did they maybe drop out before the first year ended, which would explain why there’s a zero there? So, a necessary part of analyzing data is cleaning it up where needed, but also going through it, seeing what data points you have, and making sure that everything is cohesive and makes sense.
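Editor’s sketch: the sanity check described above (run basic statistics, notice a zero GPA, then ask whether retention explains it) might look like the following. This is a minimal sketch under stated assumptions, not the office’s actual procedure. The record layout, the helper names `summarize` and `review_zero_gpas`, and all values are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical cohort; IDs, GPAs, and retention flags are invented for illustration.
cohort = [
    {"student_id": "B1", "first_year_gpa": 3.5, "retained": True},
    {"student_id": "B2", "first_year_gpa": 0.0, "retained": False},
    {"student_id": "B3", "first_year_gpa": 3.0, "retained": True},
    {"student_id": "B4", "first_year_gpa": 0.0, "retained": True},
    {"student_id": "B5", "first_year_gpa": 3.0, "retained": True},
    {"student_id": "B6", "first_year_gpa": 3.0, "retained": True},
]


def summarize(values):
    """Basic descriptive statistics used as a first look at the data."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "min": min(values),
        "max": max(values),
    }


def review_zero_gpas(rows):
    """Split zero-GPA students into two lists of IDs.

    explained:   not retained, so a zero is plausible (left before grades posted)
    unexplained: retained but still zero, so the record needs a closer look
    """
    explained, unexplained = [], []
    for row in rows:
        if row["first_year_gpa"] == 0.0:
            if row["retained"]:
                unexplained.append(row["student_id"])
            else:
                explained.append(row["student_id"])
    return explained, unexplained


stats = summarize([row["first_year_gpa"] for row in cohort])
explained, unexplained = review_zero_gpas(cohort)
print("Summary:", stats)
print("Zero GPA explained by non-retention:", explained)
print("Zero GPA needing follow-up:", unexplained)
```

A minimum of zero in the summary is the cue to run the second check; the retained student with a zero is the kind of record that would be looked into further.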
Kai: How is student data encoded into your visualizations in terms of position, color, length, and shapes?
Marilyn: Typically, for our reports, we can have Excel workbooks on our end if we’re looking at a major group. Take, for instance, the report we’re working on now, which I can’t go into detail about because it’s still an ongoing project. The data has essentially been cleaned, so all of it is cohesive and precise. It’s looking at a very large group of students over a lot of years, maybe 10, 11, or 12 years. Because of that alone, the workbook is about 82,000 rows, and each row represents a single student. In terms of the number of columns, I don’t even know, because it is a lot.
Essentially, we have all this data. Once we report on the demographics, we also want to present our analysis, meaning any correlations or trends that we may have discovered. We want what we are presenting to be clear to those reading our reports: the teams, the Provost, the President, and any of the offices that we work with. A lot of our visualizations are pretty simple. We usually stick to tables and charts. They are 2D charts; we don’t really have anything 3D. We deal with a lot of clustered columns, or 100% stacked columns if we’re showing parts of a whole. Sometimes, we have scatter plots, for when we’re looking at, say, GPAs over a certain time within a certain group.
Besides the quantitative analysis that we’ve discussed so far in this interview, we also conduct qualitative work. We are in charge of a lot of the surveys that are handed out to students. For example, we have what’s called the ‘freshman survey’. That is a survey sent to students through alerts to their email or any of the other sites, and it asks them questions, essentially, all about their first year at FIU. That’s just an example of the qualitative work that we do.
When it comes to that, if we’re creating a report for, let’s say, a qualitative survey, and we’re showing the feedback that we’ve received from students on certain questions, we’ll use a word cloud to show some survey responses. In terms of color, we typically stick to school spirit colors: a lot of blues, grays, and yellows. Overall, because the process of analyzing the data and what we are presenting can be complex, we try to make the visualizations as simple as possible. That way, anyone reading the reports, whether or not they have background knowledge on the topic, can understand what it is that we’re presenting.
*This interview has been lightly edited for length and readability.