Bracketology and Basketball: Data Analysis for March Madness


It’s the most wonderful time of the year, at least if you’re a college basketball fan. It’s March Madness! For the next several weeks, all you’ll hear in your office is smack talk about whose alma mater is superior, whose brackets are busted and who must have paid off the NCAA to get that undeserved bid to the Big Dance. And expect a dramatic drop-off in productivity on Thursday and Friday, at least for the next two weeks, because basketball games will be on all day. Seriously, it’s 10 straight hours of basketball for the first two days.

For the uninitiated, I’m talking about the national championship for college basketball teams, which differs from most other popular tournaments in two ways. First, it’s one and done: one loss and you’re out. Second, the system gives automatic bids to conference championship winners, even from small conferences, which lets smaller schools compete against heavyweights. This all translates to a whole lot of sports drama.

And by the way, for the purpose of this blog, I’m talking about men’s college basketball, for one main reason: the winner of the women’s basketball championship tournament is mathematically likely (I’d say certain, but we data analysis people don’t like to use that term) to be the UConn Huskies. They are on an unprecedented 91-game winning streak. The last time they lost was in November of 2014 (to rival Stanford University). They have won the last four national championships, and 10 since the year 2000. The UConn Huskies women’s basketball program is an unrivaled sports dynasty, and a blog for another day. If they don’t win the national championship this year, it will be an upset of epic proportions.

Anyway, I myself don’t have much skin in the game — my hometown university, Syracuse, is a big basketball school, but didn’t do that great this year and received a consolation bid to the NIT. My alma mater, Binghamton University, has only made it to the men’s basketball NCAA tournament once ever, in 2009, and they lost in the first round (um, go Bearcats?).

But whether you’re a fan of a specific team that made it to the tournament this year or not, the science of filling out brackets (bracketology, if you will) is still great fun for us math nerds. Because make no mistake — all sports are really a game of data and statistics — basically predictive analytics — and you can use that to increase your odds of winning any pool you’re playing in (purely for bragging rights, yes?). If you’re not a believer, please go read and/or watch Moneyball.

One of my favorite sites for statistical data analysis of, well, anything is Nate Silver’s 538, which has lots of useful and interesting information about bracketology. If you’re still making picks, you’ll find this page useful: it shows the odds that each team will make it to a particular round, based purely on data analysis from various sources. It updates as the tournament goes on. Fans of statistical analysis also shouldn’t miss Davidson College’s March Mathness site.

You may also feel better about your bracket being busted in the first round if you read this article, which shows the odds of you, or anyone, creating a perfect bracket. It’s essentially impossible, like winning the lottery (except someone actually does win the lottery). DePaul University professor Jeffrey Bergen has calculated the chances that someone will pick a perfect bracket as 1 in 9.2 quintillion (to be fair, that’s using a coin flip to determine winners, not using any actual information). According to 538, in the ESPN bracket challenge in 2015, 11.5 million people filled out brackets and after the first two days, only 273 people still had a perfect bracket. That’s after just the first round. This is probably why Warren Buffett felt comfortable in 2014 offering a $1 billion prize to anyone who picked a perfect bracket. He also still offers $1 million per year, for life, to any employee of his company, Berkshire Hathaway, who can make it to the Sweet 16 intact. He’s never had to pay up.
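For the math nerds, here is where that 9.2 quintillion comes from. This is a minimal sketch, assuming a 64-team bracket (so the play-in games are ignored) and a pure coin flip for every game. The function name is my own and is just for illustration.

```python
# Assumption: a 64-team single-elimination bracket (play-in games ignored).
# Every game eliminates exactly one team, so crowning a champion takes 63 games.
TEAMS = 64
GAMES = TEAMS - 1


def coin_flip_bracket_outcomes(games: int = GAMES) -> int:
    """Count the distinct ways to fill out a bracket when each game has two possible winners."""
    return 2 ** games


outcomes = coin_flip_bracket_outcomes()
print(f"1 in {outcomes:,}")                         # 1 in 9,223,372,036,854,775,808
print(f"about {outcomes / 1e18:.1f} quintillion")   # about 9.2 quintillion
```

Each of the 63 games doubles the number of possible brackets, which is why the total gets out of hand so quickly.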

Your odds of a perfect bracket would increase if you just picked the higher seed to win every game, to 1 in 17,000. But that has also never happened in the history of the NCAA tournament. And after all, the fun of making a bracket is predicting upsets. They are inevitable, but in which games will they come? Everyone wants to figure out who Cinderella might be this year: a lower-seeded team that makes a run to a late round, beating basketball juggernauts along the way (I’m looking at you, VCU, George Mason and Butler).

One choice that has always been safe to make is to pick a No. 1 seed to beat a No. 16 seed in the first round, because a No. 16 seed has never beaten a No. 1 seed. Although, of course, 538 says it’s a statistical inevitability that this will happen eventually.
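To see why "eventually" is a safe bet, here is a rough sketch of the arithmetic. The 1 percent upset chance per game is a made-up number for illustration, not 538’s estimate, and the sketch assumes every game is independent. The only hard fact in it is that each tournament has four No. 1 vs. No. 16 games.

```python
def prob_at_least_one_upset(p_upset: float, games: int) -> float:
    """Chance of at least one upset across independent games with the same upset probability."""
    return 1.0 - (1.0 - p_upset) ** games


GAMES_PER_YEAR = 4   # one No. 1 vs. No. 16 game in each of the four regions
ASSUMED_P = 0.01     # hypothetical per-game upset chance, chosen only for illustration

for years in (10, 25, 50, 100):
    chance = prob_at_least_one_upset(ASSUMED_P, GAMES_PER_YEAR * years)
    print(f"{years:>3} years: {chance:.0%} chance of at least one 16-over-1 upset")
```

Even a tiny per-game chance adds up once you play enough games, so the longer the tournament runs, the closer that number creeps toward 100 percent.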

So what’s a fan to do? You could spend hours weighing the various points of data, or checking out the brackets of basketball experts or famous people. Donald Trump isn’t filling out a bracket this year, but basketball fan Barack Obama has picked archenemies Duke and UNC to face off in the title game, a result that might cause my current state of North Carolina, where basketball is religion and Duke and UNC are only 7 miles away from each other, to implode. You can also check out what Vegas is putting its money on.

But filling out brackets is supposed to be just pure entertainment. So use the data analysis available, but trust your gut, choose a few early upsets and have fun. And remember, despite what the drunk guys at the sports bar say, there really is no such thing as a hot streak. It doesn’t exist. But you can still wear your lucky socks.