AI MLB Run Projection Model: The Ultimate Guide to Predicting Team Runs
Predicting MLB run totals isn't just about guessing which team is going to have a good night at the plate. It is a full-blown exercise in pattern-finding. If you are like me and you spend your time building AI models to sharpen your edge, you know that the secret sauce is in the blend. You have to mix Statcast quality, the weird quirks of specific parks, the impact of the weather, pitcher-hitter splits, and the ever-changing news from the clubhouse. When you get it right, you aren't just betting on a game; you are forecasting scoring with a level of confidence that most people don't have. I want to walk you through my exact workflow. We will cover the practical tools I use, the checks that keep my numbers actionable, and how to make sure your model actually holds up when money is on the line. At its core, this process effectively mirrors a professional sports market trading strategy, where identifying mispriced lines is more important than simply picking winners.
Problem framing and data sources
When I set out to build this, I realized that I couldn't just throw everything at a model and hope for the best. You have to be super careful with your target. I always model each team's runs separately rather than just the combined total for the game. It makes a huge difference because asymmetry is the name of the game in baseball. You might have one lineup facing a total disaster of a starter while the other lineup is stuck against an ace. If you model it per team, you capture that nuance. My target is always the integer count of runs for a specific team in a specific game, and I pull that from official box scores. I usually run my primary forecast in the morning, then refresh it once the official lineups are posted and again about an hour before the first pitch.
If you are looking for where to get your data, start with the official MLB data streams for schedules and lineups. For the historical backbone, Retrosheet is a lifesaver. You need Statcast data from Baseball Savant to understand the quality of contact. I look at exit velocity, launch angles, and barrel rates to get a sense of how a team is actually hitting versus how they are performing on paper. Then you have to layer on the pitcher splits. I look at how a starter performs against lefties versus righties, their pitch mix, and what their bullpen looks like behind them. Do not overlook the park factors. I use multi-year data to see how a specific stadium affects different types of hits. Weather is the other massive factor. If it is eighty-five degrees in Chicago with the wind blowing out, that is a completely different ballgame than a chilly night at Oracle Park.
Hygiene is the most boring but most important part of this whole thing. You cannot leak future information into your training data. When you build the features for a given game, you absolutely cannot use stats that were generated after its first pitch. I create snapshot tables that represent exactly what I knew at 1 pm ET or 7 pm ET. If a starting pitcher gets scratched, I have an automated job that triggers a re-run of the model for that game. If you don't keep your data clean and time-aware, your backtesting results are going to lie to you, and you will find out the hard way when you start putting real money on the table. This attention to detail is how you arrive at AI baseball over/under predictions that actually stand the test of time rather than just chasing noise.
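To make the snapshot idea concrete, here is a minimal sketch of a point-in-time filter. The record layout, the `known_at` field, and the `snapshot` helper are my own illustrative assumptions, not a fixed schema; the only real rule is that nothing stamped after the cutoff gets through.

```python
from datetime import datetime, timezone

def snapshot(records, cutoff):
    """Keep only records whose known_at timestamp is at or before the cutoff."""
    return [r for r in records if r["known_at"] <= cutoff]

# Hypothetical stat rows; `known_at` is when the value became available to us.
records = [
    {"player": "A", "stat": "xwoba_14d", "value": 0.345,
     "known_at": datetime(2024, 6, 1, 4, 0, tzinfo=timezone.utc)},
    {"player": "A", "stat": "xwoba_14d", "value": 0.360,
     "known_at": datetime(2024, 6, 2, 4, 0, tzinfo=timezone.utc)},
]

# 1 pm ET on June 1 is 17:00 UTC while daylight saving time is in effect.
cutoff = datetime(2024, 6, 1, 17, 0, tzinfo=timezone.utc)
visible = snapshot(records, cutoff)  # only the June 1 row survives
```

The same filter should be applied when building historical training rows, so the backtest sees exactly what the live run would have seen.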
Feature engineering and modeling
My approach to features is all about rolling windows. I look at 7, 14, and 30-day windows for everything. I want to know if a hitter is currently scorching hot or if they are in a slump that the season-long stats are hiding. For pitchers, I look at the same rolling windows but focus on their strikeout-to-walk ratios and their tendency to induce ground balls or fly balls. The goal here is to capture form without overfitting to a tiny sample size. I use an exponential decay to weight the most recent games a little more heavily than games from three weeks ago.
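Here is a rough sketch of how a decay-weighted rolling feature could be computed. The 10-day half-life, the feature names, and the sample values are assumptions for illustration only; tune the half-life against your own backtest.

```python
def decay_weighted_mean(values_by_age, half_life_days=10.0):
    """Average (age_in_days, value) pairs, halving the weight every half_life_days."""
    num = 0.0
    den = 0.0
    for age_days, value in values_by_age:
        weight = 0.5 ** (age_days / half_life_days)
        num += weight * value
        den += weight
    return num / den if den else float("nan")

def window(values_by_age, max_age_days):
    """Keep only observations newer than max_age_days."""
    return [(age, v) for age, v in values_by_age if age < max_age_days]

# Made-up per-game xwOBA values for one hitter: (days ago, value).
games = [(1, 0.420), (3, 0.390), (9, 0.310), (20, 0.300), (28, 0.290)]

features = {
    f"xwoba_{days}d": decay_weighted_mean(window(games, days))
    for days in (7, 14, 30)
}
```

With this hitter's recent surge, the 7-day feature comes out higher than the 30-day one, which is exactly the "hot right now" signal the season-long number hides.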
You also have to get smart about platoon splits. It is not enough to just know if a guy is a lefty or a righty. You need to look at pitch-type exposure. If a team has a bunch of guys who crush fastballs but struggle against sliders, and they are facing a guy who throws seventy percent sliders, your model should know that. I also think travel and rest are the hidden edges. If a team just flew across the country and is playing a day game after a night game, they are probably going to be sluggish. I include a little coefficient for that in my model.
When it comes to the actual model, don't overcomplicate it right out of the gate. I started with a simple Poisson regression because it is great for count data. It is a solid, interpretable baseline. Once you have that, you can move up to a Negative Binomial model to handle overdispersion, which is a common issue with run scoring. Then, you can bring in something like XGBoost to capture all those complex, non-linear interactions like how a specific umpire interacts with a pitcher's zone or how the weather shifts in a specific park.
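Before moving from Poisson to Negative Binomial, it helps to check whether your run data is actually overdispersed. The sketch below compares variance to mean and backs out a dispersion parameter by method of moments; that estimator is my simplification (a fitted GLM would estimate it properly), and the run counts are made up.

```python
from statistics import mean, pvariance

def dispersion_check(runs):
    """Return (mean, variance, variance/mean ratio, NB dispersion r or None)."""
    mu = mean(runs)
    var = pvariance(runs)
    ratio = var / mu
    # For a Negative Binomial, var = mu + mu^2 / r, so r = mu^2 / (var - mu).
    r = mu * mu / (var - mu) if var > mu else None
    return mu, var, ratio, r

# Made-up team run totals from twelve games.
runs = [2, 5, 0, 7, 3, 4, 1, 11, 6, 3, 2, 8]
mu, var, ratio, r = dispersion_check(runs)
```

A ratio well above 1 says a plain Poisson will understate the tails, which is the case the Negative Binomial is meant to handle.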
I am a big believer in not just spitting out a single mean. I use quantile regression or a Negative Binomial distribution to give me a full range of outcomes. I want to see the 10th percentile and the 90th percentile. This tells me if a game is likely to be a low-scoring grinder or if it has the potential to blow up into a double-digit shootout. It helps you understand the uncertainty. If the model gives me a wide band of potential outcomes, I know to be more careful with my bet size.
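As one way to get those percentiles, here is a small sketch that walks the Negative Binomial PMF until the cumulative probability crosses the quantile. The mean of 4.5 and dispersion of 4.0 are placeholder values, and the mean/dispersion parameterization is an assumption about how you fitted the model.

```python
import math

def nb_pmf(k, mu, r):
    """Negative Binomial PMF with mean mu and dispersion r (var = mu + mu^2/r)."""
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)

def quantile(mu, r, q, max_runs=60):
    """Smallest run count whose cumulative probability reaches q."""
    cum = 0.0
    for k in range(max_runs + 1):
        cum += nb_pmf(k, mu, r)
        if cum >= q:
            return k
    return max_runs

mu, r = 4.5, 4.0  # placeholder projection and dispersion
p10, p50, p90 = (quantile(mu, r, q) for q in (0.10, 0.50, 0.90))
band_width = p90 - p10  # wider band, smaller bet
```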
Training evaluation and backtesting
Backtesting is where you prove to yourself that your model isn't just lucky. I use a walk-forward approach where I train the model on, say, the first two months of the season, and then I validate it on the next month. Then I shift everything forward. This simulates the real world better than a random split. I always compare my results against a simple league-average baseline and a park-adjusted baseline. If my model isn't beating those, then all the work I put into my features is a waste of time.
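The walk-forward scheme can be sketched as a small generator. The month labels and the two-train, one-validate sizes mirror the example above; the function name is just illustrative.

```python
def walk_forward_splits(periods, train_size=2, test_size=1):
    """Yield (train, test) lists that slide forward one test block at a time."""
    start = 0
    while start + train_size + test_size <= len(periods):
        train = periods[start:start + train_size]
        test = periods[start + train_size:start + train_size + test_size]
        yield train, test
        start += test_size

months = ["Apr", "May", "Jun", "Jul", "Aug", "Sep"]
splits = list(walk_forward_splits(months))
for train, test in splits:
    print("train:", train, "validate:", test)
```

Every validation block sits strictly after its training block, which is the property a random split throws away.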
I look at metrics like Poisson deviance and the Continuous Ranked Probability Score, or CRPS. These are way more useful than just plain old accuracy because they tell you how well your predicted distribution matches reality. I also look at calibration plots. If my model says there is a 70 percent chance a team scores fewer than four runs, I want to make sure that they actually score fewer than four runs roughly 70 percent of the time over the long haul.
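Here is a minimal pure-Python sketch of two of those checks: mean Poisson deviance against a flat baseline, and a binned calibration table. The sample numbers and the 4.4-run league average are made up, and the five-bin layout is an arbitrary choice.

```python
import math

def mean_poisson_deviance(actual, predicted):
    """Average of 2 * (y * log(y / mu) - (y - mu)), with 0 * log(0) taken as 0."""
    total = 0.0
    for y, mu in zip(actual, predicted):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total / len(actual)

def calibration_table(probs, outcomes, n_bins=5):
    """Rows of (mean predicted prob, observed hit rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, hit in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, hit))
    return [
        (sum(p for p, _ in b) / len(b), sum(h for _, h in b) / len(b), len(b))
        for b in bins if b
    ]

actual = [3, 5, 0, 7]            # made-up runs scored
model = [3.5, 4.8, 2.0, 6.0]     # made-up model means
baseline = [4.4] * len(actual)   # assumed league-average baseline

dev_model = mean_poisson_deviance(actual, model)
dev_base = mean_poisson_deviance(actual, baseline)

# Predicted P(team scores fewer than four runs) and whether it happened.
rows = calibration_table([0.70, 0.72, 0.68, 0.30, 0.25], [1, 1, 0, 0, 1])
```

Lower deviance is better, and in a well-calibrated model the first two columns of each calibration row should sit close together once the counts are large enough.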
Finally, I do a lot of stress testing. I look at how the model performs during periods of heavy roster turnover, like the trade deadline. I also check its performance during extreme weather conditions. If the model is totally guessing when the weather hits extremes, I know I need to tighten up those specific features. You have to be your own toughest critic.
Pipeline, deployment and operations
You can build the best model in the world, but if it breaks on game day, it is useless. I keep my environment totally reproducible using Poetry for dependency management. I use DVC for data versioning so I can always go back and see exactly what data was used for a specific prediction. For orchestration, I use something like Prefect to handle my daily data pulls and retraining jobs. I have automated alerts that let me know if a data source is down or if the predictions look weird for a particular game.
I also run canary deployments. If I make a change to the model architecture, I will run it alongside the old one for a few days to see if the new version is actually performing better before I fully switch over. You have to monitor for drift, too. Sometimes the game changes. Players get better at stealing bases, or the league makes the balls slightly different. You have to keep an eye on your feature distributions to make sure they aren't drifting away from what the model was trained on.
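One common way to put a number on feature drift is the population stability index (PSI). To be clear, PSI is my suggestion here rather than a required part of the workflow, and the bin edges and exit-velocity samples below are invented for the sketch.

```python
import math

def psi(train_values, live_values, edges):
    """Population stability index between two samples over fixed bin edges."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v >= e for e in edges)] += 1
        # Floor each share so an empty bin does not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = shares(train_values), shares(live_values)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Invented average exit velocities (mph): training period vs. last two weeks.
train = [86, 88, 89, 90, 91, 92, 87, 89]
live = [90, 92, 93, 94, 91, 95, 92, 93]
edges = [88, 90, 92]

drift_score = psi(train, live, edges)
```

Identical distributions score zero and the score grows as the live data moves away from training; where you set the alert threshold is a judgment call you should validate against your own history.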
Step-by-step build checklist
- Start by defining your target variable and setting up a clean schema for your team-game records.
- Get your data pipeline running. You need schedules, rosters, and historical box scores.
- Start engineering your rolling features for hitters, pitchers, and teams.
- Bring in the weather and umpire data. These are your multipliers for run-scoring potential.
- Build your starter-versus-bullpen logic. This is critical for getting the team-level projection right.
- Train your baseline Poisson and Negative Binomial models.
- Layer on the tree-based models like XGBoost to get the complexity right.
- Calibrate your uncertainty intervals so you know how much to trust your mean.
- Set up your leakage prevention. Double-check your timestamps.
- Run a rigorous backtest to see how you would have done in previous seasons.
- Build out your simulation engine to convert the PMF into betting edges.
- Operationalize everything into an automated flow.
- Keep your documentation updated. You will be glad you did when you have to debug a failure in the middle of a playoff race.
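For the checklist step that converts the PMF into betting edges, a bare-bones version might look like the sketch below. It uses a plain Poisson PMF for brevity (swap in whatever distribution your model produces), and the 4.6-run projection and -120 price are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of exactly k runs given mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def prob_over(line, mu, max_runs=40):
    """P(runs > line) for a half-point line such as 3.5."""
    first_winning = math.floor(line) + 1
    return sum(poisson_pmf(k, mu) for k in range(first_winning, max_runs + 1))

def implied_prob(american_odds):
    """Break-even probability implied by American odds (vig still included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

mu = 4.6                       # hypothetical team run projection
p_model = prob_over(3.5, mu)   # model probability of the team over 3.5
p_book = implied_prob(-120)    # hypothetical price on that over
edge = p_model - p_book        # positive means the model likes the over
```

Because the implied probability still contains the book's margin, a small positive edge is not automatically a bet; it is just the starting point for the sizing decision.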
Practical betting applications with ATSwins context
The whole point of doing all this math is to find an edge that the market has missed. At ATSwins, we use this engine to power our daily decisions. When you have the distribution for both teams, you can do so much more than just look at the total. You can price those alternate lines, like taking the over 3.5 or the under 5.5 for a specific team. It also helps with props. If I project a team for six runs, I am going to be much more confident in an over on their top-of-the-order hitters to score runs or get RBIs. Understanding expected value in this context allows me to decide not just whether I should bet, but exactly how much of my bankroll to commit to a specific play.
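To show how expected value feeds into sizing, here is a small sketch using the Kelly criterion scaled down to a quarter. Kelly is one standard sizing rule that I am plugging in for illustration; it is not necessarily the rule used at ATSwins, and the quarter multiplier is an assumption.

```python
def decimal_odds(american_odds):
    """Convert American odds to decimal odds (total return per unit staked)."""
    if american_odds < 0:
        return 1 + 100 / -american_odds
    return 1 + american_odds / 100

def expected_value(p_win, american_odds, stake=1.0):
    """Expected profit on a stake, given the model's win probability."""
    net_payout = decimal_odds(american_odds) - 1
    return stake * (p_win * net_payout - (1 - p_win))

def kelly_fraction(p_win, american_odds, multiplier=0.25):
    """Fraction of bankroll to stake; zero when the bet has no positive edge."""
    b = decimal_odds(american_odds) - 1
    full_kelly = (p_win * b - (1 - p_win)) / b
    return max(0.0, full_kelly * multiplier)

# Hypothetical spot: the model gives 58% on a -110 line.
ev_per_unit = expected_value(0.58, -110)
stake_share = kelly_fraction(0.58, -110)
```

Scaling Kelly down is a hedge against the model's probabilities being less accurate than they look, which ties back to the wide-band caution from the uncertainty section.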
It is all about the daily rhythm. I follow a weekly plan to keep my model fresh. I do my big retraining on the off-days and keep my daily runs focused on the latest news, like lineup changes and late-breaking weather reports. We talk a lot about this at ATSwins, where we try to keep the betting process grounded in data rather than "gut feeling." If you want to see how this looks in practice, we have guides on our site about how to run a seven-day strategy. It is basically about making sure you are not just reacting to the noise but actually trusting your process.
Templates, tools, and practical tips
If you are just getting started, don't try to build the perfect system on day one. Start with a simple cookiecutter template for your repository. It will keep your files organized so you aren't digging through folders to find your feature scripts. For storage, keeping things in Parquet files partitioned by date is usually the easiest way to handle years of baseball data without it becoming a mess.
One of the best tips I can give you is to always have a fallback. Sometimes the weather API goes down or a site changes its data format. My system is set up to use the last known good weather forecast if the new one fails. You also have to track your model performance manually. Every night, look at the games that you got really wrong and try to figure out why. Was it a freak injury? A bullpen implosion? An ump with a massive strike zone? Keeping a log of these "oops" moments will teach you more than any book ever could.
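The last-known-good fallback can be as simple as the sketch below. The fetch functions are stand-ins for a real weather client, and the cache file name is hypothetical; the point is only that a failed pull returns the cached forecast and flags it as stale.

```python
import json
import tempfile
from pathlib import Path

def get_weather(fetch, cache_path):
    """Return (forecast, is_fresh); fall back to the cached forecast on failure."""
    try:
        forecast = fetch()
        cache_path.write_text(json.dumps(forecast))  # remember the good pull
        return forecast, True
    except Exception:
        if cache_path.exists():
            return json.loads(cache_path.read_text()), False
        raise  # nothing cached yet, so surface the failure

def good_fetch():
    """Stand-in for a successful weather API call."""
    return {"temp_f": 85, "wind_mph": 12, "wind_dir": "out"}

def broken_fetch():
    """Stand-in for an outage."""
    raise ConnectionError("weather API unavailable")

cache = Path(tempfile.mkdtemp()) / "weather_last_good.json"
first, fresh_first = get_weather(good_fetch, cache)
second, fresh_second = get_weather(broken_fetch, cache)
```

Passing the `is_fresh` flag downstream lets you widen the uncertainty or lower the confidence score whenever the forecast is stale.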
Feature detail: turning batted-ball quality into runs
The way I think about batted-ball quality is through the lens of expected wOBA. Every time a ball is put in play, we know the exit velocity and the launch angle. That tells us, historically, how often that ball turns into a hit or a home run. I map those probabilities into a run expectancy matrix. If you don't want to go that deep, you can just aggregate the expected bases from the whole lineup and use a simple multiplier to convert that into runs. The key is that you are focusing on the quality of the contact, not just whether or not they got a hit in the last game.
A note on joint modeling of both teams
You can model the home and away teams separately, and for most bets, that is fine. But baseball games are correlated events. If the wind is blowing in at Wrigley, it affects both the Cubs and the visiting team. If the bullpen for one team is already exhausted, it might mean the other team gets more runs late in the game than they otherwise would. If you want to get fancy, you can add a correlation factor to your model when you sum the two distributions to get the total game runs. It is a bit more work, but it can make your totals pricing a lot more accurate.
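One way to add that correlation is a common-shock (bivariate Poisson) setup, where both teams share a small Poisson component that stands in for wind, umpire, and park effects. This is my choice of technique for the sketch, not the only option, and the 4.8 / 4.2 projections and the 0.5 shared component are invented.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of exactly k events; handles mu == 0."""
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def total_runs_pmf(mu_home, mu_away, shared, max_total=40):
    """PMF of home + away runs when both teams share a Poisson(shared) part.

    Home = A + C and Away = B + C with independent Poisson A, B, C, so the
    total is (A + B) + 2C and the covariance between teams equals `shared`.
    """
    own = (mu_home - shared) + (mu_away - shared)  # mean of A + B
    return [
        sum(poisson_pmf(c, shared) * poisson_pmf(t - 2 * c, own)
            for c in range(t // 2 + 1))
        for t in range(max_total + 1)
    ]

def mean_and_var(pmf):
    m = sum(t * p for t, p in enumerate(pmf))
    return m, sum((t - m) ** 2 * p for t, p in enumerate(pmf))

independent = total_runs_pmf(4.8, 4.2, shared=0.0)
correlated = total_runs_pmf(4.8, 4.2, shared=0.5)
p_over_ind = sum(independent[9:])   # P(total > 8.5), no correlation
p_over_cor = sum(correlated[9:])    # P(total > 8.5), with correlation
```

The projected total stays the same, but the correlated version spreads probability toward both tails, which is what moves the price on totals that sit away from the mean.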
Handling live in-game adjustments (optional)
If you really want to get into the weeds, you can build a model that updates in real-time. This is much harder because you have to account for the current base-out state and the remaining bullpen arms. I use a simple Markov chain for this. You just update the projected run scoring based on the inning, the score, and how many outs are left. It is great for betting the live total, especially if you see a manager making a bullpen move that the market hasn't fully accounted for yet.
Common pitfalls and how to avoid them
The biggest trap is leakage. Everyone does it. You accidentally include a stat that includes today's result in your training set. You have to be paranoid about your timestamps. Another common mistake is overreacting to small samples. Just because a guy had a great week doesn't mean he is suddenly a Hall of Famer. Always shrink your estimates toward the player’s long-term average. And be careful with park factors. It is easy to look at a place like Coors Field and just add three runs to everything, but you have to keep your park adjustments grounded in reality so you don't get blown out on a day when the ball just isn't flying.
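Shrinking toward the long-term average can be done with a simple weighted blend. In this sketch the prior weight of 200 plate appearances is an assumption, not a recommended constant; the right value depends on how quickly the stat stabilizes.

```python
def shrink(observed, n_obs, prior, prior_weight):
    """Blend a small-sample rate with a long-term rate.

    prior_weight acts like a number of "phantom" observations at the prior.
    """
    return (observed * n_obs + prior * prior_weight) / (n_obs + prior_weight)

# Hypothetical hitter: .450 wOBA over 25 PA this week, .330 career.
hot_week = shrink(observed=0.450, n_obs=25, prior=0.330, prior_weight=200)
```

A 25-plate-appearance heater barely moves the estimate off the career number, while a full season of data would dominate the prior.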
How to present model outputs to bettors
If you are building an interface for this, keep it simple. Show the mean, but also show the range. People like to see the probability of scoring over or under a certain amount. I like to add a "confidence score" that looks at things like how certain we are about the starting lineups and the weather. It helps the user understand why the model is saying what it is saying. If I can explain that "this team is projected for 4.5 runs because the temperature is high and the opposing pitcher is a ground-ball guy," that gives the user a lot more trust than just seeing a number on a screen.
Backtesting workflow that sticks
I always look at three years of data for my backtesting. I want to see if the model works across different types of seasons. I report the Poisson deviance compared to a simple baseline, and I track my calibration stats month by month. If the model starts losing its calibration in September, I know I need to account for late-season roster call-ups. You have to constantly be iterating. A model that was great in April might need a few tweaks to work in August.
Maintenance through the MLB calendar
The baseball season is a marathon. In April, the weather is erratic, and the players are shaking off the rust. You have to widen your error bars. By mid-season, things settle down. After the trade deadline, the whole league changes, and you have to be ready to adjust your priors. In September, you get a lot of rookies, and you have to treat them differently than veterans. If you don't account for these phases of the season, your model will lose its edge.
Quick start: minimal viable model in one week
If you want to build this, block out a week. Days 1 and 2 are for your data. Get that team-game table set up. Day 3 is for your rolling features. Day 4 is for your basic regression models. Day 5 is for the XGBoost and the uncertainty intervals. Day 6 is for your simulation engine and checking your work against old lines. Day 7 is for documenting your process. It is a sprint, but it is totally doable. The most important thing is to just start.
Resources
I lean heavily on the official MLB API for the latest game info. For deep dives into player stats, Baseball Savant is the industry standard. If you want the deep history, Retrosheet is the place. For the actual modeling, scikit-learn is my go-to for the GLMs, and I use XGBoost for the more complex stuff. All of these tools have great documentation, so you don't need a Ph.D. in stats to get started.
Conclusion
Building an MLB run projection model is a journey, not a destination. You start with the basics, you layer on the complexity, and you constantly iterate based on what the data tells you. The big takeaways are simple: keep your data clean, respect the time-sensitivity of your features, and always validate your work with a rigorous backtest. Once you have the engine running, the real fun begins: finding those small edges that add up to real profit over the long haul. At ATSwins, we believe that an AI-powered sports prediction platform can change how you look at the game. Whether you are using our data-driven picks or building your own models, the key is to stay consistent, stay curious, and keep grinding. If you want to dive deeper into how we handle things, check out our resources at ATSwins. We provide the tools, the insights, and the tracking so you can focus on making smarter decisions.
Frequently Asked Questions (FAQs)
What is an MLB run projection model and how does it work?
An MLB run projection model is basically a way to guess how many runs a team is going to score in a game based on a mountain of data. I take things like hitter splits, the starting pitcher's recent form, the weather, and even the park, and I feed that into a model to get an expected number of runs. But it is not just a single number; it is a full distribution of outcomes. I use count models like Poisson or Negative Binomial to figure out the most likely results and give myself an idea of how much uncertainty there is in that projection.
Which inputs matter most in an MLB run projection model?
The biggest things are the starting pitcher and the quality of the lineup, especially their platoon splits. But really, it is the combination of everything. The park factors and the weather, especially the wind, are huge. I also keep a close eye on the bullpen. If a team has a weak bullpen, that is going to change my projection for the late innings. I also track rolling 7, 14, and 30-day form to see who is hot and who is cold. Everything gets factored in, but you have to be careful not to weight any single input too heavily.
How do I bet totals with an MLB run projection model without overthinking it?
The simplest way is to take your team run projections, add them together, and compare that total to what the sportsbook is offering. If my model says 8.5 runs and the book says 7.5, that is a potential spot. But look at the distribution. If there is a high probability of a blowout, you might want to look at alternative lines instead of the main total. Don't go crazy with your unit sizes; just look for consistent, small edges.
How reliable is an MLB run projection model when weather or parks change?
It is as reliable as the data you give it. If I have good weather info, I can make very accurate adjustments. Wind blowing out in a high-altitude park like Coors is going to change things, and the model will pick that up. The trick is to keep your park factors and weather sensitivity bounded so you don't overreact. The model is good, but you have to be the one to sanity-check the results before you pull the trigger.
How does ATSwins.ai use an MLB run projection model, and what do I get as a bettor?
At ATSwins.ai, we use our models as part of a whole system. You get access to the projected runs, the confidence intervals, and the notes on why the model is leaning a certain way—like weather or bullpen issues. ATSwins.ai is an AI-powered platform, and we want to make sure you have the context you need to make a smart bet. We offer different plans so you can test it out, track your progress, and get a better understanding of how the math translates into real-world results.