
The role of advanced analytics in sports is a contentious subject. To its defenders, data-driven pragmatism is a natural evolutionary step in the way we play and watch games. For detractors, the approach prioritizes results above all else and drains the soul from a pursuit that should be spontaneous and joyful.
As someone who is neither pragmatic nor spontaneous, I don’t qualify for either camp, though I find the very notion of applying this kind of research to soccer fascinating and even admirable. The game is resistant to orderly examination by design. Like preparing a tax return for a housecat, it takes a stupendous amount of ingenuity just to figure out which questions to ask, to say nothing of finding the answers.
While baseball can be a spreadsheet task, soccer matches amount to meandering free-verse written in 90-minute chunks. Luke Bornn is a data scientist who specializes in movement studies. Thanks to his background analyzing complex bodies in motion, he realized he was uniquely suited to explore the nature of such an evasive game. While at Los Alamos National Laboratory, Bornn worked on ways to detect how much damage helicopter blades can sustain before it compromises the chopper’s ability to stay airborne. He has mapped climate data to predict crop yield and studied how herds of massive land mammals move about the fruited plain. The ebb and flow of a soccer match, while mysterious, were not altogether unfamiliar, and he has pioneered ways to quantify some of the game’s amorphous spirit.
Along with frequent collaborator Javier Fernández, Bornn has published academic papers with titles like “Wide Open Spaces: A Statistical Technique for Measuring Space Creation in Professional Soccer.” In this study, the data scientists examine the ways players without the ball can manipulate opponents’ positioning on the pitch. Like the stylus of a Magna Doodle dragging metallic particles about the toy’s surface, seemingly uninvolved parties can contort the very geography of their foes to open new avenues of attack.
Thanks to player tracking technology, this is now a quantifiable skill, and, like so many things, Lionel Messi is great at it. Through their research, Bornn and Fernández found that Messi is perhaps one of the best walkers in all of soccer. The Argentine legend is prone to lollygagging, and common conjecture has been that he’s either conserving energy or just can’t be bothered. While this may be part of it, their study demonstrates that Messi’s slow saunters about the pitch short-circuit defenses in unique ways. “That walking behavior is not a detachment from the match but a conscious action to move through empty spaces of value and claim the control of valuable space,” they write. “Messi does this very effectively, placing him near the top of players in terms of space gained during the whole match, despite the lack of active gain.”
In other words, Messi can achieve more on a stroll than most players do with an all-out sprint.
Ask the people who work deep inside soccer’s analytical engine rooms about how their work affects the way they view the game, and you’ll get some illuminating responses. “I watch in a strange way,” Bornn says. “I tend to watch with an eye toward what the tactical system could be, or whether the data that’s being collected is miscapturing what’s going on, or that the data might capture the core components but our models will miss what’s going on. It has kind of ruined sports for me.”
Sarah Rudd tends to agree. “It’s a little exhausting watching every game so analytically,” she says. “It’s hard to turn off that part of your brain, but you still want to be a fan and you want to enjoy.” Rudd got into soccer analytics so early, she essentially had to invent it from scratch. After graduating from Columbia University, she spent a few years living in Chile, where she fell further in love with her favorite sport. She fondly recalls squinting at her small, standard-definition television set to watch broadcasts of matches from Argentina. “You had to really know the teams,” she says. “If you weren’t really familiar with the teams, you couldn’t figure out who players were. It’s hard to read the numbers, and you couldn’t really see their faces.”
Rudd and her boyfriend at the time invented a game based on this challenge. “We would turn on the TV, and if Boca [Juniors] was playing, it was how quickly can you spot Carlos Tevez. Not because of his face but because he had this really weird running style. It was like, ‘Ope! There he is.’” Built like a fire hydrant, the stout, pugnacious Tevez was a rabid delivery robot programmed to kill on the pitch. Just thinking about it makes Rudd wistful: “What a player.”
Of her time in South America, Rudd recalls, “It made me want to work in football even more.” She took a gig doing data mining and machine learning for Microsoft in Seattle but continued to search for entry points into the sports industry. “A friend of mine suggested that I do an MBA program and then see if I could get a job at Nike or Adidas in their football business unit.” In 2011 she caught wind of a contest being held by sports analytics company StatDNA. “They were doing a research competition where they gave you a dataset,” she says, noting that, until that point “there was practically nothing” of the sort that had been collected for soccer.
Using a spreadsheet of rudimentary player-location data, Rudd set out to devise a method for analyzing an individual’s performance in more complex ways than simple goals and assists. “There wasn’t a ton of direction,” she recalls. “I think just from watching the game I was interested in evaluating how much value are people adding with every action that they do. Not necessarily trying to evaluate alternatives but being able to somewhat quantify, like, that was a dangerous giveaway, or it’s stupid to take a shot from there, that sort of thing.” To accomplish this she used Markov chains, a statistical tool that helps determine the likelihood of something happening within a system based on its current state.
First introduced in 1906, Markov chains represent a departure from the principle of absolute independence, a core tenet of probability theory seen in things like roulette wheels where each spin offers a fresh experiment with repeated odds. The chains are a way to examine ongoing scenarios where each starting point presents a different opportunity for the future. In the magazine American Scientist, Brian Hayes uses the board game Monopoly as an example:
The chains were invented by and named for Andrey Markov, an ornery Russian mathematician who, according to Hayes’ reporting, stopped attending meetings at the Academy of Sciences in Saint Petersburg late in his career because he claimed he didn’t have proper shoes. When the school sent him a pair of new boots, he said they were “stupidly stitched,” thus proving that his current state (pissed off) contributed to the likelihood of his return (zero).
The roots of Markov’s discovery sprouted from a dispute over the law of large numbers and free will. He long believed the universe was a series of events whose interconnectedness can be understood through mathematics. He refined this idea by condensing the text of the Alexander Pushkin novel in verse Eugene Onegin into one long sequence of letters suitable for mathematical analysis. In doing this, he discovered that stable patterns of double vowels and double consonants appeared throughout the work. Taking a large sample from the beginning of the text, he was able to determine that letter distribution didn’t adhere to the principle of independence, demonstrating that even something as beautiful and fluid as poetry was prisoner to the cold deductive properties of mathematics. He published his first paper on the subject in 1906 and formally presented his findings in 1913, one year after his request to be excommunicated from the Russian Orthodox Church.
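For readers who want to see the shape of that test, here is a minimal sketch in Python. It is not Markov's procedure: he worked by hand on Russian text, and this version assumes English letters, treats only a, e, i, o, and u as vowels, and uses a hypothetical helper named `vowel_stats`. It compares how often a vowel turns up overall with how often one turns up immediately after another vowel. If letters were independent, the two figures would be roughly equal.

```python
from collections import Counter

# Assumption: English text, with only these five letters counted as vowels.
VOWELS = set("aeiou")

def vowel_stats(text):
    """Return (P(vowel), P(vowel | previous letter was a vowel))."""
    # Reduce the text to one long run of letters, as Markov did with Onegin.
    letters = [c for c in text.lower() if c.isascii() and c.isalpha()]
    kinds = ["V" if c in VOWELS else "C" for c in letters]

    # Count every adjacent pair: vowel-vowel, vowel-consonant, and so on.
    pairs = Counter(zip(kinds, kinds[1:]))

    p_vowel = kinds.count("V") / len(kinds)
    after_vowel = pairs[("V", "V")] + pairs[("V", "C")]
    p_vowel_after_vowel = pairs[("V", "V")] / after_vowel
    return p_vowel, p_vowel_after_vowel

sample = "The game is resistant to orderly examination by design"
p_v, p_vv = vowel_stats(sample)
print(f"P(vowel) = {p_v:.2f}")
print(f"P(vowel | previous vowel) = {p_vv:.2f}")
```

On most English prose the second number comes out noticeably lower than the first, because vowels tend to be followed by consonants. A gap of that kind is what independence would rule out.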
“Any attempt to simulate probable events based on vast amounts of data—the weather, a Google search, the behavior of liquids—relies on Markov’s idea,” states an article in the Harvard Gazette. Sarah Rudd, who studied computer and environmental science as an undergrad at Columbia University and worked on Microsoft’s Bing search engine, added soccer to this list. Her paper “A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains” placed players into one of 39 “states,” depending on things like location and ball possession to calculate the likelihood of what would happen next.
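To make the idea concrete, here is a toy sketch in Python. It is not Rudd's model: her paper used 39 states, while this one uses three, and every state name and probability below is invented for illustration. Each state lists where possession is likely to go next, and the function estimates the chance that a possession starting there eventually ends in a goal rather than a turnover.

```python
# Hypothetical transition probabilities; every number here is made up.
# Each row sums to 1.0 and says where play goes next from that state.
TRANSITIONS = {
    "own_half":    {"own_half": 0.50, "midfield": 0.35, "turnover": 0.15},
    "midfield":    {"own_half": 0.20, "midfield": 0.35,
                    "final_third": 0.25, "turnover": 0.20},
    "final_third": {"midfield": 0.25, "final_third": 0.30,
                    "goal": 0.10, "turnover": 0.35},
}

# Possession ends in one of these states; a goal is worth 1, a turnover 0.
ABSORBING = {"goal": 1.0, "turnover": 0.0}

def goal_probability(transitions, absorbing, sweeps=500):
    """Estimate, for each state, the chance the possession ends in a goal."""
    value = {state: 0.0 for state in transitions}
    value.update(absorbing)
    # Repeatedly replace each state's value with the weighted average of
    # the states it can move to, until the numbers settle.
    for _ in range(sweeps):
        for state, moves in transitions.items():
            value[state] = sum(p * value[nxt] for nxt, p in moves.items())
    return value

values = goal_probability(TRANSITIONS, ABSORBING)
for state in TRANSITIONS:
    print(f"{state}: {values[state]:.3f}")

# One way to value an action: the change in goal probability it produces.
pass_value = values["final_third"] - values["midfield"]
print(f"midfield -> final_third pass adds {pass_value:.3f}")
```

Scoring an action by the difference between the state it started in and the state it ended in is one way to put a number on "that was a dangerous giveaway": a move into a worse state simply comes out negative.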
Rudd’s work was impressive enough to win her both the competition and a job with StatDNA. When the company was acquired by Premier League giant Arsenal the following year, Rudd suddenly found herself in London working for her favorite team and introducing the backroom staff to her advanced research. She spent nearly a decade at the club and became the head of analytics before leaving in 2021 to start her own firm with her husband.
“One of our jobs is to be the calm voice of reason,” Rudd says. “This is one of the things I like about consulting versus working for a club. You can be a little bit emotionally detached. You can be a little bit calmer. Because when you’re at the training ground every day, emotions are high. It’s a really stressful environment. There’s a lot at stake.”
In an interview with The Athletic, Rudd says she started her own firm, in part, “to figure out football.” I ask her what this would look like, and she concedes that “it’s really hard,” almost to the point of being self-defeating. “One of the difficult things about analytics in football is that there are so many different ways to win. There are so many trade-offs. I think somebody described it as trying to cover yourself with a blanket that’s too short. If you press really high, that’s going to come at the expense of something else. There are a few things we know that really help you win, but there’s still a whole lot where you could be just as effective doing something else.”
No matter how much research is done, soccer maintains its severe allergy to simple answers. Even something as fundamental as whether you want your team to have the ball or not is up for debate at the highest levels. As Dutch legend Johan Cruyff argued, a “footballer has to have the ball at his feet.”
Diametrically opposed to this philosophy is José Mourinho, one of the most successful managers of the 21st century. The rakish Portuguese gadfly opined that “whoever has the ball has fear,” preferring that his teams lie in wait and capitalize on opponents’ mistakes like the humans in The War of the Worlds who hunkered down until the Martians caught a sniffle and died.
Where else can such drastically conflicting worldviews have equal footing but in the poorly designed experiment that is soccer? “For so long it was, like, if only we have really wide-scale access to tracking data, that will solve all of our problems,” Rudd tells me. “And then we got it and, nope, we still have lots of problems.”
Reflecting on her 2011 paper using Markov chains, Sarah Rudd can’t help but poke holes in the research that made her a pioneer of the movement. “At the time I wrote that paper, I wasn’t looking at it nearly as analytically as I do now,” she says. “I think there were definitely a lot of decisions that I would have done differently, particularly, how you break down the field.” Rudd had carved the pitch into equal boxes, turning the expanse of open grass into a grid of easy-to-track cells. It was order from chaos manufactured out of misguided desperation. “Now we know that how the pitch operates isn’t necessarily linear or in neat little squares,” Rudd says. “There are certain zones where things happen for a number of reasons that don’t quite align with those pitch markings.” These areas of congestion are nebulous and reactive to tactical trends, such as defenses funneling play out wide or pressing high when out of possession, strategies informed by the work of people like Bornn and Rudd, analysts who are pulling at the proverbial blanket in offices unseen from public view.
“I’m not a huge fan of jumping straight to pragmatism if that’s not what’s required,” Rudd tells me. “We have to remember that we’re in the entertainment industry. It’s got to be fun.”
Excerpted from How to Watch Soccer Like a Genius: What Architects, Stuntwomen, Paleoanthropologists, and Computer Scientists Reveal About the World’s Game. Copyright © 2026 by Nick Greene. Used with permission of the publisher, Abrams Books. All rights reserved.