Wabash Improves Student Matriculation with Machine Learning

Trek10 has crafted a unique modular data ecosystem that makes it quicker and easier for clients to implement and maintain their machine learning models.

As one of the top liberal arts colleges in the United States, Wabash College not only provides excellent career outcomes and personal growth but also fosters a community-focused campus where faculty get to know students by name. The Midwestern campus prides itself on immersive education and the academic opportunities it offers students. A history class might spend a week in Rome, for instance, all expenses paid.

Wabash works hard to ensure anyone who wants to attend has the means to do so: the college is generous with scholarship packages, offering financial aid to 100 percent of admitted students.

“Students aren’t students forever,” says Chip Timmons, Dean for Enrollment at Wabash College. “They graduate and become spouses, parents, coaches, voters, consumers. At Wabash, we prepare them for a rich and full life in all those roles.”

With another admissions cycle approaching, the admissions staff at Wabash asked how they could improve their recruiting and better spread the word about the college to prospective students. They wanted to give recruiters richer data and more accurate predictions: a way to understand which students are already very interested and which need another email, phone call, or invitation to an on-campus event.

They were already using software that attempted to give them a “level of interest” score for every prospective student, but it wasn’t giving them the necessary depth of insight; the score was a black box with no way to see or adjust which factors were being included.

Wabash wanted more control over how the interest score was calculated, with no gray areas. They wanted to choose which factors went into the model and how much weight each factor received, so that it incorporated metrics Wabash enrollment leaders knew to be correlated with success. So they decided to build their own.

Partnering with Trek10 for Machine Learning

“From our first phone call, Trek10 showed us how knowledgeable, friendly, and open they were,” said Timmons. “They were very good at explaining the very technical details of machine learning models in ways that made it feel easy.”

Trek10 is an AWS Premier Partner wholly focused on AWS with an emphasis on the serverless ecosystem. They help companies execute cloud migrations, increase automation with robust DevOps, and push their business forward with scalable machine learning models.

To help clients excel, Trek10 has crafted a unique modular data ecosystem that makes it quicker and easier for companies to implement and maintain their machine learning models. Trek10’s Data Platform gives companies a secure place to store data, set up data pipelines, and support data warehouses, effectively reducing cost-to-insight. Their Machine Learning Operations (MLOps) Framework gives clients end-to-end lifecycle management for their machine learning models and enforces architecture best practices for machine learning pipelines.

All of Trek10’s offerings can be supported by Trek10’s experts and 24/7 Monitoring solutions to ensure continued success.

Framing the Machine Learning Problem

Because Wabash has an excellent athletics program, a high percentage of the prospective students who apply are athletes. Wabash wanted to recruit more non-athletes, as well as expand their reach to prospective students outside of Indiana. This meant needing better information on where to focus time and marketing dollars.

One of Wabash’s strategies for recruiting new students involved access to college entrance exam lists. When students took the ACT, PSAT, SAT, or AP exams, they could opt in to share that information with colleges and universities and receive communications in return. While these lists were helpful for college recruiters, the data provided was inconsistent. Some student entries were robust; others included much less information.

“The first step in our process is to identify the business problem and frame it as a machine learning problem,” Brenden Judson, Cloud Architect at Trek10, explained. “Then we can move on to getting the data and cleaning the model.”

For Wabash, the business problem was simple. How could they take these limited-information lists of prospective students and use them to perform meaningful outreach, focus their marketing spend, and recruit students who are excited to take advantage of everything their campus offers?

To pull it off, Trek10 faced two big challenges. The first was developing a robust predictive model with sometimes sparse data. The second was making sure everything worked seamlessly with Wabash’s recruiting and admissions software, Slate.

Designing the Machine Learning Model

Because complete data was scarce and only a small number of applicants actually become Wabash students, Trek10 decided to use a technique called oversampling, specifically the Synthetic Minority Oversampling Technique (SMOTE). SMOTE takes the small group of applicants who went on to enroll and synthetically creates additional applicants by interpolating between the records of similar enrolled students, so each synthetic record is a slight variation on real ones. This increases the amount of relevant data available to train the algorithm.
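The core idea of SMOTE can be sketched in a few lines of Python. This is a minimal illustration, not Trek10’s implementation: the feature names (campus visits, emails opened), the sample values, and the function name smote are all assumptions made for the example, and a production system would more likely rely on a maintained library.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority-class rows by interpolating toward near neighbors.

    `minority` is a list of numeric feature rows for the rare class
    (here, hypothetical students who enrolled).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority rows")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Find the k nearest *other* minority rows by Euclidean distance.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: math.dist(base, row))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical features per enrolled student: [campus_visits, emails_opened]
enrolled = [[2.0, 5.0], [3.0, 4.0], [1.0, 6.0], [2.0, 7.0]]
new_rows = smote(enrolled, n_synthetic=6)
```

Because every synthetic row lies between two real enrolled students, the new rows stay within the range of values actually observed rather than introducing arbitrary noise.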

The two teams worked together to better understand the factors that played a role in how excited a student was to apply or to accept an offer of admission. Had they visited campus or responded to an email? Were they an athlete? Did they apply early decision?

Once the training data was created, Trek10 designed an infrastructure to support the full lifecycle of machine learning models. With this infrastructure, data is automatically pulled from Slate into Amazon S3. From there, AWS Glue crawlers and ETL jobs clean and prepare the data until it is ready for Amazon SageMaker to train and host the machine learning models. As the model makes predictions, the results are automatically pushed back to Slate, and each prospective student receives a score from 1 to 5 based on how likely the model predicts they are to attend Wabash.
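The final step, turning a model prediction into a 1-to-5 score, might look something like the sketch below. The article does not say how the score bands are defined, so the evenly spaced cutoffs, the function name interest_score, and the prospect IDs are assumptions for illustration only.

```python
def interest_score(probability, cutoffs=(0.2, 0.4, 0.6, 0.8)):
    """Map a predicted enrollment probability to a score from 1 to 5.

    The cutoffs are hypothetical; each cutoff the probability meets or
    exceeds raises the score by one.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return 1 + sum(probability >= c for c in cutoffs)


# Hypothetical model output keyed by prospect ID.
predictions = {"prospect-001": 0.05, "prospect-002": 0.5, "prospect-003": 0.95}
scores = {pid: interest_score(p) for pid, p in predictions.items()}
```

Keeping the cutoffs as a parameter lets the bands be adjusted between recruiting cycles without retraining the model.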

Currently, the platform is set up to generate new insights every month. Recruiters use this information as part of their outreach strategy as they determine which prospective students might need a higher-touch approach (for example, a nudge to schedule a campus visit or a phone call). Throughout the first recruiting cycle, recruiters identified factors that initially seemed important but didn’t correlate strongly with matriculation, and over a few iterations, the model was refined.

“Everything went smoothly from the beginning,” said Wabash’s Timmons. “There was a lot of complicated math involved, but Trek10 made sure everyone in the office understood what was happening. They produced the data needed to make decisions and really excelled at their jobs.”

Automating DevOps with the MLOps Platform

While the focus of the project was designing the machine learning algorithm itself, any machine learning solution needs a robust data pipeline to succeed. All data needs to be ingested, cleaned, hosted, and tiered so the model can continually learn and increase performance over time.

To assist, Trek10 has built a data-source-agnostic MLOps platform that any client can use to handle the DevOps for the data and models in their machine learning workflow (sometimes referred to as DataOps and MLOps). The MLOps platform ingests and cleans data, and trains and hosts models. It can tie into any data source a client needs, from mainstream SaaS applications such as Salesforce to more niche applications like Slate. The platform also makes sure the model meets a required performance threshold before it is deployed.
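A pre-deployment performance check of this kind could be sketched as follows. The article does not name the metric or the threshold Trek10 uses, so ROC AUC, the 0.75 minimum, and the function names roc_auc and passes_gate are assumptions chosen for the example.

```python
def roc_auc(labels, scores):
    """Compute ROC AUC as the share of (enrolled, not enrolled) pairs
    in which the enrolled student received the higher score.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def passes_gate(labels, scores, min_auc=0.75):
    """Return True only if the candidate model meets the minimum AUC."""
    return roc_auc(labels, scores) >= min_auc


# Hypothetical holdout set: 1 = enrolled, 0 = did not enroll.
holdout_labels = [1, 0, 1, 0]
candidate_scores = [0.9, 0.2, 0.6, 0.7]
deploy = passes_gate(holdout_labels, candidate_scores)
```

A gate like this would run after each retraining job, so a model that performs worse than the agreed minimum never replaces the one already serving predictions.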

“We didn’t enter into this project thinking let’s fix things for the next recruiting cycle,” said Timmons. “This is a solution that will continue to grow and learn with us for the next 10-15 years.”

As such, a robust MLOps environment for data management and operations is key to the solution’s long-term success at Wabash College.

Focusing Resources, Increasing Success

There’s a cycle to enrollment work, as Timmons explains. “Every year there are new students, new families, and new stories. For a student to put your college in the running, outreach has to start at least six months in advance and often will begin much earlier, in a student’s sophomore year of high school,” he said. “In that way, there’s a bit of delayed gratification to the work. You put a lot of effort into recruiting in May, the whole time not knowing where prospective students are going to apply come November. Even if they do apply to your school, you don’t know whether they’ll accept your offer to attend.”

In the model’s first admissions cycle, Wabash recruiters were able to identify more than 300 high-likelihood students, a number that represents almost one-third of the existing student body.

For Timmons, one of the most important goals is making sure that anyone who wants to come to Wabash sees everything the College has to offer. He wants to provide hundreds of students with life-changing experiences and give them access to the tight-knit community Wabash College provides.

“When you see a kid blossom here, it’s so gratifying. I get teary-eyed watching them walk across the stage four years later to receive a diploma,” said Timmons.

With the new machine learning tools Wabash and Trek10 have created, Timmons is confident Wabash will recruit more effectively for years to come. “This recruitment cycle, we’ve already seen that what we’ve built with Trek10 is a much better recruitment tool than what we had previously,” he said. “We’re even more excited to see what the next enrollment cycle brings.”