Is there a way to score a tennis player out of 100 based on data taken from tournament play? The Game Insight Group certainly thinks so, and in partnership with Tennis Australia they've come up with a new Player DNA dataset that looks at what they believe are the four pillars of the perfect player.
I first noticed the data last week when @rfswissmaestro tweeted me: Roger had replied to the Aus Open on Instagram after he scored poorly on foot speed 😀. We'll be seeing more of this new data on our screens during the Australian Open, so I thought I'd take a closer look. It's all very interesting, but is it quantifiable and does it really tell you anything? Or is it just another case of lies, damned lies, and statistics?
What is GIG Player DNA?
GIG’s Player DNA is a combination of four key areas: Technical, Tactical, Physical, and Mental. The Game Insight Group was formed by Tennis Australia and they break down the major findings within each metric to illustrate the strength of a player’s attributes relative to their competitors. Ratings are given up to a maximum score of 100.
What Data Are The Numbers Calculated From?
GIG has pulled together multiple data sources to try and get a reliable view of player performance. The data includes point-level data describing the outcomes of every point in Grand Slam matches and tracking data describing ball and player movement throughout points in matches at the Australian Open.
For all datasets, they focus on the period from 2016 to 2018. They don't actually give any specifics of where the data is sourced from, which is a shame, but you have to assume Hawk-Eye is involved here somewhere.
How Is Each Area Scored?
Each DNA score rates how much better or worse a player performs in that specific area or skill relative to an average Grand Slam player.
I rarely like these statistics-based approaches as tennis is not played in a vacuum (one of the reasons Elo is completely flawed for tennis rankings).
However, GIG has accounted for this. Recognising that the situation and the strength of the opponent both matter, they adjust for situational factors and the difficulty of the opponent. In other words, they use statistical models to compare players on a like-for-like basis.
Obviously it's not great for players who lost early and only played 1 or 2 matches in Australia as opposed to Federer who played 14, so to account for differences in sample size, each player effect was measured with ‘shrinkage’: shrinking values towards an effect size of zero in proportion to GIG's uncertainty about a player’s performance.
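To make the 'shrinkage' idea a bit more concrete, here's a minimal Python sketch. GIG doesn't publish its formula, so the `n / (n + k)` weighting and the `prior_strength` constant below are my own assumptions, purely for illustration.

```python
def shrink_effect(raw_effect: float, n_matches: int, prior_strength: float = 5.0) -> float:
    """Pull a raw player effect towards zero; fewer matches means more shrinkage.

    `prior_strength` is a made-up tuning constant, not a GIG parameter.
    """
    weight = n_matches / (n_matches + prior_strength)
    return weight * raw_effect


# Same raw effect, very different sample sizes.
early_loser = shrink_effect(0.10, n_matches=2)
deep_runner = shrink_effect(0.10, n_matches=14)
print(f"2 matches: {early_loser:.3f}, 14 matches: {deep_runner:.3f}")
```

The player with only 2 matches keeps much less of their raw effect than the one with 14, which is the behaviour GIG describes: the less they know about you, the closer you sit to average.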
How Does GIG Overcome Biases?
GIG states they faced three major challenges when rating player skill in each area of the Player DNA:
- Sample size
- Playing style
- Opponent effect
For this they say:
Our main strategy for dealing with these issues was to set up a statistical model for each measure that would help us to estimate how much better or worse a player’s expected performance is compared with an average player, controlling for contextual factors and opponent strength.
To give a concrete example of the modeling approach, consider the Court Control measure of the Tactical DNA.
To assess each player’s Court Control, we took all instances in our AO tracking data when the impact player had a spatial advantage, which we defined as playing a rally shot from a central location while their opponent was out wide.
The outcome of interest was a player’s ability to win the point within two shots from this situation controlling for each of the following factors: exact player positions, incoming shot characteristics, opponent’s ranking group, and opponent’s general rally ability. Using random effects for the player, the model yields a shrinkage estimate of how much better or worse a player’s Court Control is than the average player, accounting for differences in sample size.
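Here's a rough Python sketch of just the first step of that Court Control example: picking out the 'space advantage' shots and computing a raw win rate. The record fields, the distance cut-offs and the function names are all hypothetical (GIG doesn't publish them), and this skips the model that controls for opponent strength and shot characteristics.

```python
from dataclasses import dataclass
from typing import List, Optional

CENTRAL_MAX_M = 1.5  # assumed: within 1.5 m of the centre line counts as "central"
WIDE_MIN_M = 3.0     # assumed: 3 m or more from the centre line counts as "out wide"


@dataclass
class RallyShot:
    player_x: float       # hitter's lateral distance from the centre line (m)
    opponent_x: float     # opponent's lateral distance from the centre line (m)
    shots_until_end: int  # hitter's shots, including this one, before the point ended
    hitter_won: bool      # did the hitter win the point?


def has_space_advantage(shot: RallyShot) -> bool:
    """Hitter is central while the opponent is out wide."""
    return abs(shot.player_x) <= CENTRAL_MAX_M and abs(shot.opponent_x) >= WIDE_MIN_M


def raw_court_control(shots: List[RallyShot]) -> Optional[float]:
    """Share of space-advantage shots where the hitter won within two shots."""
    chances = [s for s in shots if has_space_advantage(s)]
    if not chances:
        return None
    wins = sum(1 for s in chances if s.hitter_won and s.shots_until_end <= 2)
    return wins / len(chances)


# Made-up shots for one player.
shots = [
    RallyShot(0.5, 3.6, 1, True),    # advantage, won within two shots
    RallyShot(-1.0, -4.0, 4, True),  # advantage, won but needed four shots
    RallyShot(0.2, 3.2, 2, False),   # advantage, lost the point
    RallyShot(2.5, 3.5, 1, True),    # hitter not central, so not counted
]
rate = raw_court_control(shots)
print(rate)
```

A raw rate like this is only the starting point. The shrinkage and opponent adjustments GIG describes would sit on top of it.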
What Does Each Pillar Look At?
The Technical DNA rates four strokes:
- Serve (First and Second)
- Return
- Forehand
- Backhand
Each stroke is broken down into subcomponents:
- Speed
- Potency
- Accuracy, Placement and/or Reliability.
First and Second Serve
- Speed: rates a player’s average serve speed.
- Placement: rates how close to the lines a player hits their serve.
- Reliability: rates how often a player gets their serve in-play.
- Potency: rates how often a player is able to use their serve to win quick points.
Return
- Speed: rates a player’s average return speed.
- Reliability: rates how often a player gets their return in-play given the quality of the incoming serve.
- Potency: rates how often a player is able to use their return to win points.
Forehand
- Speed: rates a player’s top forehand speed.
- Potency: rates how often a player is able to win points with their forehand.
- Accuracy: rates how often a player uses placement rather than speed to win points with their forehand.
Backhand
- Speed: rates a player’s top backhand speed.
- Potency: rates how often a player is able to win points with their backhand.
- Accuracy: rates how often a player uses placement rather than speed to win points with their backhand.
GIG takes into account five components to rate how good each player is tactically: Rallying Craft, Attacking Balance, Court Control, Time Control and Wide Defence.
1. Rallying Craft
This measures how successful a player is at rally exchanges of 4 or more shots.
2. Attacking Balance
This measures how well a player balances risk and reward when looking to attack. A good balance would result in more winners than unforced errors.
3. Court Control
This measures how successful a player is when they have the space advantage. A player has the space advantage when they can play their shot from a central location and their opponent is out wide.
4. Time Control
This measures how successful a player is when they have the time advantage. A player has the time advantage when they have more time to play their shot than their opponent just had. This is more time for decision-making, positioning and shot execution.
5. Wide Defence
This measures how good a player is at defending from a wide position (‘end range’) when their opponent is central. The best players are able to overturn their opponent’s space advantage and win the point.
GIG look at five stats to rate a player’s Physical DNA: Foot Speed, Power, Repeat Sprints, Agility and Match Endurance.
1. Foot Speed
This stat looks at players who are able to hit the highest speeds in a point and still have a successful outcome.
2. Power
This stat looks at a player’s explosive acceleration power when in a winning position.
3. Repeat Sprints
The Repeat Sprints stat measures how well a player can perform multiple running actions and still have the advantage in the point.
4. Agility
This measure assesses how well a player is able to quickly change direction during points and still be successful. A ‘quick change’ is a high-intensity change of direction.
5. Match Endurance
A player’s Match Endurance is measured by their win rate in Grand Slam matches 3 hours in length or more for men, and 2 hours in length or more for women.
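Match Endurance is the easiest of these to reproduce yourself. Below is a small Python sketch of the raw win rate using the 3-hour and 2-hour thresholds quoted above. The function name and the sample matches are made up, and how GIG turns the rate into a score out of 100 isn't described, so I've left that part out.

```python
from typing import Iterable, Optional, Tuple

# Thresholds quoted above: 3 hours for men, 2 hours for women.
LONG_MATCH_MINUTES = {"men": 180, "women": 120}


def long_match_win_rate(matches: Iterable[Tuple[int, bool]], tour: str) -> Optional[float]:
    """Win rate in matches at or above the tour's duration threshold.

    `matches` holds (duration_minutes, won) pairs for one player.
    Returns None when the player has no long matches on record.
    """
    threshold = LONG_MATCH_MINUTES[tour]
    results = [won for minutes, won in matches if minutes >= threshold]
    if not results:
        return None
    return sum(results) / len(results)


# Hypothetical player: four Grand Slam matches, three of them 3 hours or longer.
sample = [(95, True), (185, True), (240, False), (200, True)]
rate = long_match_win_rate(sample, "men")
print(rate)
```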
Winning the mental game is all about handling pressure. We break down a player’s ability to handle pressure into four components: Killer Instinct, Grit, Clutch and Winning Edge.
1. Killer Instinct
This measure gets at a player’s ability to be clinical when they are in control of the match. The specific stat looks at how well a player is able to close out matches with minimal pressure faced during Grand Slams.
2. Grit
The Grit measure of mental performance focuses on a player’s mental doggedness. To evaluate player Grit we look at Grand Slam matches when a player’s back was to the wall and see how well they were able to raise the pressure of the match, keeping the match close even if it was ultimately a loss.
3. Clutch
A player who can raise their level in key moments is considered ‘Clutch’: they bring their best game when it matters most. To evaluate clutch we look at a player’s pressure win rate (PWR) on serve and return and compare these rates to their overall win rate on serve and return. The higher the differential on serve and return, the more ‘Clutch’ a player is.
4. Winning Edge
Most matches are won by the player who wins more of the key points than their opponent. Being able to maintain a high edge in big moments over opponents takes more than talent, it takes mental strength. The Winning Edge gets at this ability by looking at a player’s PWR on serve relative to the opponents they have faced at Grand Slams.
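The Clutch differential described above is simple enough to sketch in Python. GIG doesn't say which points count as 'pressure' points or how the serve and return differentials are combined, so the code below just computes each one separately from hypothetical point counts.

```python
def win_rate(won: int, played: int) -> float:
    """Fraction of points won."""
    return won / played


def clutch_differential(pressure_won: int, pressure_played: int,
                        all_won: int, all_played: int) -> float:
    """Pressure win rate (PWR) minus overall win rate.

    A positive value means the player wins a bigger share of pressure points
    than of points overall.
    """
    return win_rate(pressure_won, pressure_played) - win_rate(all_won, all_played)


# Hypothetical counts for one player (not real data).
serve_diff = clutch_differential(70, 100, 660, 1000)
return_diff = clutch_differential(42, 100, 380, 1000)
print(f"serve: {serve_diff:+.3f}, return: {return_diff:+.3f}")
```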
We think the rating method we have used to create Player DNA scores has a lot going for it. It looks at a number of dimensions of performance we rarely see analysed in tennis. And for each measure, we have attempted to make a statistical comparison that is robust and doesn’t cherry pick to favour popular players just because they are popular. Still, our approach isn’t without limitations. We hope that by sharing our method with readers, we can get more of the tennis community thinking about how we can improve the number and usefulness of advanced stats in our sport.
The Latest Player DNA Data
The Player DNA scores for 56 male players. These scores are based on point-level data from Grand Slam matches and tracking data from the Australian Open matches played from 2016 to 2018.
|PLAYER||SERVE||RETURN||FOREHAND||BACKHAND||TECHNICAL DNA|
|Diego Sebastian Schwartzman||53.9||94.3||68.8||88.5||89.6|
|Roberto Bautista Agut||61.8||61.2||74.7||62.4||78.6|
|Juan Martin Del Potro||84.4||53.2||86.7||22.9||73.9|
|Guillermo Garcia Lopez||10.7||48||89.7||27.7||37.5|
|Alex De Minaur||9||34.1||6.9||75.3||17.3|
|PLAYER||RALLY CRAFT||ATTACKING BALANCE||COURT CONTROL||TIME CONTROL||WIDE DEFENCE||TACTICAL DNA|
|Roberto Bautista Agut||85.1||84.3||63.9||52.1||89.2||88.3|
|Juan Martin Del Potro||44.1||68.3||75.8||74.1||29.1||67.8|
|Guillermo Garcia Lopez||72.9||40.3||75.2||77.9||19.2||65.6|
|Alex De Minaur||23.6||37.7||39.4||40.6||93.6||44.5|
|Diego Sebastian Schwartzman||67.3||16.1||68.3||31.4||13.9||29.7|
|PLAYER||FOOT SPEED||ACCELERATION||REPEAT SPRINTS||AGILITY||MATCH ENDURANCE||PHYSICAL DNA|
|Albert Ramos Vinolas||53.8||53.3||79.5||73.4||93||84.9|
|Diego Sebastian Schwartzman||91.1||91.5||45.2||33.1||69.5||80.5|
|Juan Martin Del Potro||77.9||75.9||52.6||30.1||92.6||80.2|
|Alex De Minaur||89.3||74.8||30.8||17.3||79.3||69.8|
|Pablo Carreno Busta||52.3||43.7||64.6||18.3||73.1||55.5|
|Roberto Bautista Agut||49||30.4||83.9||41.1||39.2||52.2|
Mental DNA: [table not included]
So all very interesting and it's obviously very clever how it's all been calculated to produce a final number. Looking through the chart, some of the scores are what I'd have expected them to be, others not.
For example, we all know Gasquet's forehand is an absolute joke compared to many of the top guys and he scores accordingly. But why is Wawrinka's backhand so low? Is it because he lost early to Sandgren last year?
Like I said above, tennis is not played in a vacuum so it's very hard to just throw numbers like this around. How do you weight each particular area? From my understanding, all four pillars are weighted equally, but this will always vary from match to match. Some days are mental battles, others physical.
Another thought I had was the numbers show Roger's foot speed is below average. But can you measure his anticipation and understanding of the court geometry? He often has a very good idea of where the ball is going before it even leaves the strings. Without eye-tracking data, probably not. So lacking in one area can be made up for in another that essentially helps produce the same end result – getting to the ball in time.
And does something like this take into account what one player knows about the other from past matches and practice so they will vary their game style accordingly?
Overall I find it interesting, but not that useful. I feel like you can determine a lot of this stuff just by watching without the need to quantify it. I know De Minaur is fast and I know Nadal saves a lot of break points with his mindset. Does a score out of 100 help? I'm not sure.
I just like the numbers that can be measured in a straight line and are very easy to quantify. Distance run/covered in the tournament, break points saved, break points converted etc.
Perhaps something like this works better on a tournament by tournament basis, so you can see who is performing well in certain areas during that week. The scores could then be combined at the year-end to see who comes out on top over 12 months and how that relates to their results and titles, rather than this data, which is from just one tournament over 3 years.
What do you guys think of the Player DNA? I know some of you guys are very stats-minded, so is this any good? Or is it flawed? Let me know your thoughts in the comments.