Monday, August 15, 2011

US Open Draws Not Random

by Savannah

Taking a break from my mini vacay from blogging to post this article. I'm posting it in its entirety so I can't be accused of selective quoting. Excuse me while I go do the "I told you so" dance in the corner.

An "Outside the Lines" analysis of 10 years of men's and women's Grand Slam draws shows the top two men's and women's seeds in the U.S. Open -- on average -- faced easier opponents in the first round than is statistically probable if the draws were truly random.

Not only do both the men's and women's first-round U.S. Open matchups deviate significantly from true randomness, but this skewed pattern was not found at the Australian Open and Wimbledon, which use a similar draw system. At the French Open, the opponents for the top two women's players during that time period were significantly more difficult than a random draw should produce, but the men's draws were in line.


USTA Pro Circuit Director Brian Earley, who has been the U.S. Open tournament referee since 1992 and presides over the draw, said he stands by his system. However, he said he was concerned about the questions the analysis raises about the random nature of the draw.

"I have such faith in the folks within my work that if there was something unfair about it, I think it probably would have been proven to me and to the tournament before this," he said. "But we are always interested in hearing input."

"Outside the Lines" analyzed the average difficulty -- determined by the players' ATP or WTA rankings before the draws -- of those who played the top two seeds in all Grand Slams over 10 years. That was compared to 1,000 random simulations of 10 years of Grand Slam draws -- or the equivalent of producing 10,000 random draws taken 10 years at a time.

Only three of OTL's 1,000 simulations produced first-round opponents as easy as those the top two men's seeds have actually faced on average over 10 years in the U.S. Open. In none of the 1,000 simulations did OTL get the extreme results found in 10 years of actual opening matchups for the top two women's Open seeds.
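The simulation approach described above can be sketched in a few lines of Python. This is a rough illustration, not OTL's actual code. It assumes difficulty is measured by tournament rank (33 through 128 for unseeded players, the scale the article uses further down), that each of the top two seeds draws a distinct unseeded opponent uniformly at random, and it borrows the men's 10-year average of 97.2 quoted later in the article as the observed value. The function names are hypothetical.

```python
import random

UNSEEDED_RANKS = range(33, 129)  # tournament ranks 33..128 (assumed difficulty scale)
YEARS = 10                       # one 10-year span of draws


def simulate_ten_year_average(rng):
    """Average opponent rank for the top two seeds over one simulated 10-year span."""
    ranks = []
    for _ in range(YEARS):
        # Two distinct unseeded opponents per year, one for each top-two seed.
        ranks.extend(rng.sample(UNSEEDED_RANKS, 2))
    return sum(ranks) / len(ranks)


def share_at_least_as_easy(observed_avg, n_sims=1000, seed=0):
    """Fraction of simulated spans whose average rank is >= observed_avg."""
    rng = random.Random(seed)
    hits = sum(simulate_ten_year_average(rng) >= observed_avg for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    # 97.2 is the men's 10-year average reported in the article.
    print(share_at_least_as_easy(97.2))
```

A higher rank number means a weaker opponent, so "as easy as the actual draws" is read here as "average rank at least as high as observed." With only 1,000 simulations the count of hits is tiny, so the result will jump around from seed to seed.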

Dr. Andrew Swift, past chairman of the American Statistical Association's Section on Statistics in Sports and an assistant mathematics professor at the University of Nebraska at Omaha, said the analysis and its methodology were sound.

"Any way you want to look at these, there is significant evidence here that these did not come from a random draw," he said.

That finding didn't surprise Scoville Jenkins, who in 2004 was ranked 1,433rd in the ATP singles rankings when he scored a wild-card entry into the U.S. Open. That made him the lowest-ranked player among the 128 entries in the men's tennis tournament. His opponent in the draw? No. 2 seed and defending champion Andy Roddick.

"At the time you think, 'Wow, this is unlucky,'" he said. "There's so many players in the draw I could have played."

A truly random draw for the unseeded players -- which is promised by USTA officials -- should have given Jenkins a two-thirds chance of playing another unseeded player, and a roughly 31 percent chance of playing a seeded opponent outside the top two seeds. He had a 2.08 percent chance of facing a top-two seed.
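The percentages in that paragraph follow from simple counting, sketched below in Python. The sketch assumes a 128-player draw with 32 seeds, that seeds cannot meet each other in the first round, and that an unseeded player is equally likely to land in any unseeded slot. The variable names are illustrative only.

```python
from fractions import Fraction

DRAW_SIZE = 128
SEEDS = 32
unseeded = DRAW_SIZE - SEEDS          # 96 unseeded players

# Each seed opens against an unseeded player, so 32 of the 96 unseeded
# slots sit opposite a seed and the other 64 sit opposite another unseeded player.
p_vs_unseeded = Fraction(unseeded - SEEDS, unseeded)
p_vs_other_seed = Fraction(SEEDS - 2, unseeded)   # seeds 3 through 32
p_vs_top_two = Fraction(2, unseeded)

for label, p in [
    ("another unseeded player", p_vs_unseeded),
    ("a seed outside the top two", p_vs_other_seed),
    ("a top-two seed", p_vs_top_two),
]:
    print(f"{label}: {p} = {float(p):.2%}")
```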

After facing Roddick in the first round in 2004, Jenkins drew No. 1 seed Roger Federer in the first round in 2007, when Jenkins was the 125th-best player in the tournament, according to the OTL analysis. He lost both times.

"Sometimes I think they put the player against who they would like to play," said Jenkins, who has since retired from professional tennis and is now an assistant tennis coach at Kennesaw State University near Atlanta. "If somebody came out tomorrow and said, 'This whole time we weren't doing it random and we were picking by whatever system,' it would not surprise me."

After being presented with the "Outside the Lines" analysis, Swift conducted his own study of the opponents of the top two seeds and found that only four times in 1 million simulations did he come up with an average ranking equal to or easier than what was actually observed in the men's and women's draws over the last 10 years.

"By itself, the U.S. [Open] numbers are weird," he said. "And then they're also weird in comparison to the other three Grand Slams. So you've got a double argument of weirdness here. Something weird is going on."

Trying to determine what has happened

But what exactly could that be?

"If there were anyone trying to fix this draw, which is comical to me … it would not be within my group and not within the USTA and not within the U.S. Open group," Earley said.

TOP SEEDS' FIRST-ROUND DRAWS

How frequently ESPN's simulated draws came up with average difficulty scores that were at least as low as scores for the actual Grand Slam draws. Percentages closer to 0 indicate a lower likelihood that the actual results are strictly due to random chance.

Percent of draw simulations as easy as actual draws

Grand Slam      Men      Women
Australian      71.2%    94.7%
French          69.5%    99.2%
Wimbledon       37.0%    30.7%
U.S. Open        0.3%     0.0%

He said the computerized random draw is done as a small ceremony in a room in which representatives of the USTA and the men's and women's tennis associations and the chief of the Grand Slam supervisors are usually present. One of them gets to push the button on the computer that generates the draw of the unseeded players, which is displayed on a screen and printed out right away.

"So … you would have to say, 'Oh, yeah, well it's some programmer somewhere trying to decide, trying to fix this, or someone hacking into your system.' I don't see that happening, either," Earley said.

Earley said he would consult with representatives of Information & Display Systems, the company that provides the software that generates the random draw. IDS has been providing the random draw software for the U.S. Open for more than 10 years, and does the draw for the Australian Open.

"I don't know how to explain it," Earley said. "And maybe we'll talk again, after I speak to someone who can give me a little bit better analysis of this, of how these could have happened, how this could have happened. Or if indeed it is as much of an anomaly as these would seem to indicate."

Leo Levin, IDS director of product development, said there is no problem with the program. A week after "Outside the Lines" presented its findings to the USTA, the organization forwarded an email from IDS president Rallis Pappas, in which he said the company simulated 200 draws. The 10-year averages in their sample were indeed random, but neither IDS nor the USTA offered an explanation for the skewed actual draws over the last 10 years, other than to say it had to have happened by random chance.

Chris Widmaier, managing director of communication for the USTA, said the organization stands by IDS and believes that it produces an automated random draw.

"If we were to put on 10,000 U.S. Opens, we'd probably see whatever the statistical average is, but we only put on one U.S. Open a year and for the last 10 years, the numbers have been the numbers," Widmaier said. "I don't know what else we can say." He added there were no plans to investigate further and no changes will be made for this year's draw, scheduled for Aug. 24-25.

"I believe the answer is the age old saying, the luck of the draw," wrote Levin in an email, prior to conferring with the USTA. "There is nothing that happens at the U.S. Open draw that isn't done at other tournaments. Therefore, since the process is the same, the answer must be that's how they came out."

But Swift seemingly doesn't agree with that answer.

"There's always the chance that, yes, freak occurrences happen. But you're telling me a freak occurrence has happened with the men and the women?" he asked. "Double freak occurrences?"

Tennis fans have questioned the men's draw before, including two men Earley said approached him at a match last year with what they said was evidence the draw wasn't random. But he said he didn't agree with their conclusions. The men also took their data to ESPN, which televises the U.S. Open draw ceremony and tournament. ESPN examined their data, but its own statistician did an expanded analysis of both the men's and women's draws at all Grand Slam tournaments and came up with these findings.

Earley said the "Outside the Lines" analysis was the first time anyone had questioned the women's draw.

"What's really making the case are the players ranked in the 90s and 100, right?" he asked. "I mean, those are the ones that really skew the test?"

The top two seeds in each draw could have a first-round matchup with any unseeded player whose tournament rank is 33 through 128. Over the last 10 years, the average rank of opponents in the women's draw has been 98.5, and 97.2 for the men. A random draw should produce an average closer to 80.5.
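The 80.5 benchmark above is simply the midpoint of 33 and 128. How unusual averages of 97.2 and 98.5 are can be roughly checked with a normal approximation, sketched below in Python. This is a back-of-the-envelope illustration, not the method used by OTL or Swift. It assumes opponents are drawn uniformly from tournament ranks 33 through 128 and treats the 20 draws (two seeds over 10 years) as independent.

```python
import math

LOW, HIGH = 33, 128            # tournament ranks of unseeded players
N_DRAWS = 20                   # two top seeds x 10 years (treated as independent)

n_values = HIGH - LOW + 1                      # 96 possible ranks
expected = (LOW + HIGH) / 2                    # mean of a uniform draw
sd_single = math.sqrt((n_values**2 - 1) / 12)  # sd of a discrete uniform draw
std_error = sd_single / math.sqrt(N_DRAWS)     # sd of the 20-draw average


def upper_tail(observed_avg):
    """Approximate chance of an average at least this high under a random draw."""
    z = (observed_avg - expected) / std_error
    return z, 0.5 * math.erfc(z / math.sqrt(2))


for label, avg in [("men", 97.2), ("women", 98.5)]:
    z, p = upper_tail(avg)
    print(f"{label}: z = {z:.2f}, approximate tail probability = {p:.4f}")
```

Under these assumptions both averages sit well over two standard errors above 80.5, and the tail probabilities come out at a fraction of one percent.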

"To get something as far away from 80 as 100 is extremely unlikely," Swift said. "If you looked at the other three Grand Slams over the same time period, the average rank of the opponents of the top two seeds in both the men's and women's sides was close to 80. It was close enough that it wasn't statistically significant."

Players: Two thoughts on playing top seeds

One of the players possibly affected by the statistical oddity was CoCo Vandeweghe. Vandeweghe's world ranking was 518 before she earned wild-card entry into the 2008 U.S. Open, which made her tournament rank 126 out of 128.

The then-16-year-old was coming out of practice one day when a friend asked if she had seen the draw.

"[My friend] said, 'You won't believe who you play,'" Vandeweghe recounted. "I said, 'Who is it?'" Her friend responded, "Jelena Jankovic," the No. 2 seed in the Open. "I thought they were kidding," Vandeweghe said.

She and Jankovic ended up playing a night match on the first day of the Open, which Vandeweghe said was "pretty crazy." She lost 3-6, 1-6.

"I was definitely happy to be playing a player of that caliber and kind of just see where I was at that point with my own game with a player like that," she said. "I took it all in as an experience. At the time, I thought I definitely could give her a run for her money."

When told about the U.S. Open draw analysis and the possibility that it was not random, Vandeweghe said it made her think about her experience being a No. 1 seed in other tournaments.

"If the No. 1 seed gets a slightly weaker draw, that's good news for them," she said. "It's kind of a funny stat to hear, especially when I've been a part of that stat."

Devin Britton was the lowest-ranked player when he entered the 2009 U.S. Open and drew Federer in the first round. Then 18, he had won only one set in his short professional career.

"I was kind of hoping to play another wild card to get a good shot at winning a match, but it was definitely a blessing to play Roger," he said. "You don't get that kind of atmosphere anywhere else."

Despite first-set nerves that Britton said caused him to trip over his feet, the newcomer broke Federer's serve and took a 3-1 lead in the second set.

"The crowd cheered a little bit and I thought, 'Oh, gosh, there's a lot of people out there,'" Britton said.

If, for some reason, the draw for opponents for the top two seeds was skewed toward picking players at the very bottom of the rankings, such as Jenkins, Britton and Vandeweghe, those players said they aren't upset about the matchups -- even though they would have made more money in the tournament had they advanced to the second round.

Britton said he could see how that would be bad for people going into the Open with a lower ranking, but he said he still valued his experience and learned a lot from playing Federer.

"I could see where a lot of people would be upset about that," he said. "There's definitely two sides to it."

Jenkins had a similar attitude.

"Of course you wish you could have got somebody easier in the first round, second round. Next you realize that you're in a sport where you want to be the best, you've got to beat the best. He's the guy you have to get past," Jenkins said. " … That's the way it happens sometimes. It's not fair. It's a part of life, a part of sport. It happens."

As for the No. 1 and 2 seeds, representatives for Federer and Rafael Nadal declined a request for an interview. Representatives for recent top female seeds Jankovic, Serena Williams, Caroline Wozniacki, and Kim Clijsters also declined comment.

Top two seeds rarely lose Grand Slam openers

Top players boycotted the U.S. Open draw in 1996, forcing the USTA to remake the draw after allegations it could have been rigged to favor certain American players. The players were upset the then-16 seeded players were chosen after the first part of the draw. The association ended up redoing the draw with the seeds in place. And in 2001, the U.S. Open changed to seeding 32 players, which meant none of them would face a first- or second-round opponent whose rank in the tournament was better than 33. Wimbledon also changed to 32 seeds that year, with the French and Australian opens following the next year.

Even though the No. 1 and 2 players in the U.S. Open drew easier opponents on average among the pool of unseeded players, it did not seem to have any impact on whether they won or lost in the first two rounds, based on a comparison with how the top two seeds progressed in the other three Grand Slams. It's unclear what impact, if any, the skewed first-round matches had on the rest of the tournament.

"Being nice to the star players? Getting an easier pair? They're getting an easier pair [already]," Swift said.

The highest-ranked unseeded player that any of the top two men's seeds have played in the first round since 2001 was 41. It was 37 for the women. The top seeds won both times. In fact, since 2001, no No. 1 or 2 seed has lost a first-round match in the U.S. Open or the French Open and only twice each at the Australian Open and Wimbledon.

"What would the U.S. Open gain by fixing the draw in this way? I believe the U.S. Open would gain nothing," Earley said. "I think that that would be a risk that the U.S. Open would never take. Never."

Paula Lavigne is a reporter in ESPN's Enterprise Unit. Her work appears on "Outside the Lines." She can be reached at paula.lavigne@espn.com. Alok Pattani is an Analytics Specialist with the ESPN Stats & Information Group.

SOURCE

4 comments:

Yolita said...

I think this is absolutely shocking. And embarrassing for the USTA or whoever organises the US Open draws.
Maybe Wimbledon fares worse because they have a seeding of their own. My suggestion: get the Australians to do the US Open draws!

Zafar said...

Yikes. Only just got round to reading this in full.

Didn't think it was quite as bad as this.

However much they may crow about "anomalies" and/or "luck of the draw", the fact remains that a simulated draw only matched the ACTUAL USO draw 0.3% of the times for men and 0.0% for the women.

You simply can't hide your head in the sand from a stat like that.

What's worse is, as the article mentions, the disparity replicates itself across both sexes, further doing away with any notion of "anomalies".

I'm not sure Wimbledon fares much better either - a figure as low as 30% would be considered scandalous in any other situation. The only reason we're not talking about them is that the USO figures were so unspeakably bad.

We should also take a moment to note, however, that the Aussie Open and RG emerge unscathed from this mess. That's nothing to be proud about - we should expect nothing else. OTOH, it's necessary to make the observation to put a stop to the irresponsible conspiracy theorists who would have us believe the problem is more endemic than it is.

Overhead Spin said...

I understand why people are in high dudgeon about this, but honestly, I just cannot for the life of me see what is so bad about this situation. In every single Grand Slam that Serena Williams has played, she has not been a top seed in every one of them and I am sure that she has drawn a top seed in her early round matches and she has never lost in the first round.

If the draws were being rigged to protect the top seeds, one would think that the USTA would rig the draws to ensure that their WCs and young up and comers would not draw these top seeds time and again. I just do not get what all the commotion is about. FWIW the top seeds are the top seeds for a reason. They are/were better than the rest and most of the time they did what they had to do to win the tournie.

Coffee Lover said...

Great analysis/post.

The draws are clearly rigged. Fed vs Novak semis again at US open, should Roger make it...