Expected rally length in tennis

Discussion in 'Physics & Math' started by Jennifer Murphy, Jun 9, 2019.

  1. Jennifer Murphy Registered Senior Member

    Messages:
    239
    I have been trying to come up with a formula for the average number of balls returned in a tennis rally given the probability of a successful return on each stroke. I am only interested in the groundstrokes, so I am ignoring the serve and return of serve. I am also assuming that both players have the exact same probability of making a return.

    This is equivalent to calculating the average number of red balls drawn, with replacement, from a bag of red and green balls before the first green ball appears, given the number of red and green balls. If the tennis players have a return probability of 60%, it would be the equivalent of having 60 red and 40 green balls in the bag.

    I started by assuming that the expected average would be the sum, over every possible rally length, of that length times its probability. That would be this infinite series, given p = the probability of any single return & q = 1-p:

    Expected Average Rally Length = 0*p^0*q + 1*p^1*q + 2*p^2*q + 3*p^3*q + ...

    That is, the probability of a 4-ball rally is p^4*q: the probability of 4 successful returns followed by 1 unsuccessful return.

    Multiplying that by 4 would give us the contribution to the average of rallies of that length.

    There must be something wrong with this, because the numbers do not agree with a simulation I wrote.

    Can anyone help me with the correct formula?

    Thanks
     
  3. przyk squishy Valued Senior Member

    Messages:
    3,203
    As far as I understand it, your question is equivalent to: Suppose I have a coin that, when tossed, returns heads with probability \(p\). I toss the coin until it returns tails. How many times will it return heads on average?

    If that's the case your formula looks correct to me. The average is

    \(E = 0 \cdot p^{0} q + 1 \cdot p^{1} q + 2 \cdot p^{2} q + 3 \cdot p^{3} q + \dotsb\)

    like you've written. That can by the way be simplified to

    \(E = p/q\)

    because

    \(\begin{eqnarray} E &=& pq \bigl( 1 + 2 p + 3 p^{2} + \dotsb \bigr) \\ &=& pq \, \frac{\mathrm{d}}{\mathrm{d}p} \bigl( 1 + p + p^{2} + p^{3} + \dotsb \bigr) \\ &=& pq \, \frac{\mathrm{d}}{\mathrm{d}p} \biggl( \frac{1}{1 - p} \biggr) \\ &=& pq \, \frac{1}{(1 - p)^{2}} \\ &=& p/q \,. \end{eqnarray}\)
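
    As a quick numerical sanity check (a sketch, not part of the derivation), the partial sums of the series can be compared with p/q. The function name `partial_sum` and the cutoff of 200 terms are arbitrary choices made for this illustration:

```python
def partial_sum(p, n_terms=200):
    """Sum k * p**k * q for k = 0 .. n_terms - 1, with q = 1 - p."""
    q = 1 - p
    return sum(k * p**k * q for k in range(n_terms))


# Compare the truncated series with the closed form p/q for a few values of p.
for p in (0.5, 0.6, 0.8):
    closed_form = p / (1 - p)
    print(f"p={p}: partial sum={partial_sum(p):.6f}, p/q={closed_form:.6f}")
```

    For a return probability of 60%, both values come out at 1.5 returns per rally.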
     
    Jennifer Murphy likes this.
  5. Jennifer Murphy Registered Senior Member

    Messages:
    239
    After I posted this question, I did a little more work on it and came up with the same result as you did, but by a completely different path. Sorry for my notation, but I find the equation notation confusing and tedious.

    Starting with

    L = 0*p^0*q + 1*p^1*q + 2*p^2*q + 3*p^3*q + ...

    I replaced q with 1-p

    L = 0*p^0*(1-p) + 1*p^1*(1-p) + 2*p^2*(1-p) + 3*p^3*(1-p) + ...

    I then expanded the series

    L = 0 - 0p + 1p - 1p^2 + 2p^2 - 2p^3 + 3p^3 - 3p^4 + ...

    Combining like terms I got

    L = p + p^2 + p^3 + p^4 + ...

    That's the geometric series but missing the first term (1), so

    L = (1/(1-p))-1 = (1 - 1 + p)/(1 - p) = p /(1 - p) = p/q

    And I fixed my simulation so that it agrees with this. (Yea)
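
    The simulation itself is not shown in this thread. A minimal sketch of what such a check might look like, assuming every return succeeds independently with the same probability p, is below. The function names, the number of rallies, and the seed are all invented for this illustration:

```python
import random


def simulate_rally(p, rng):
    """Count successful returns before the first miss."""
    returns = 0
    while rng.random() < p:  # each stroke succeeds with probability p
        returns += 1
    return returns


def average_rally_length(p, n_rallies=200_000, seed=1):
    """Average number of returns over many simulated rallies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(simulate_rally(p, rng) for _ in range(n_rallies))
    return total / n_rallies


def expected_rally_length(p):
    """Closed form derived above: p/q with q = 1 - p."""
    return p / (1 - p)


p = 0.6
print(f"simulated: {average_rally_length(p):.3f}")
print(f"formula:   {expected_rally_length(p):.3f}")
```

    With enough rallies the simulated average should land close to p/q, though the exact figure depends on the seed.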


    But I don't understand your method. In step 2, are you taking the derivative? If so, why?

    And then how are you getting rid of the d/dp in step 4?

    Thanks
     
  7. iceaura Valued Senior Member

    Messages:
    30,994
    He takes the derivative to get from step 3 to step 4. Going from 2 to 3 was algebra: the infinite sum made tractable by replacing it with its closed form, similar to what you did ("geometric series", your label).

    Take the derivative indicated in step 3 - actually do it, as he did - and look at what you have. (Recall the quotient rule for the derivative of a ratio of functions, usually written f/g in textbooks. Here "f" is 1, a constant, and "g" is (1-p).)

    Then 4 to 5 is algebra, remembering that q = 1-p.

    Sorry to butt in - refreshing change of pace - - - -
     
    Jennifer Murphy likes this.
  8. Jennifer Murphy Registered Senior Member

    Messages:
    239
    OK, I understand step 2-3: just replace the series with its equivalent (standard identity).

    I understand step 3-4, now that you remind me how to take the derivative of a quotient.

    I understand step 4-5.

    And I understand step 1.

    I just don't understand step 1-2. Why take the derivative? And shouldn't the left side be E'? Any enlightenment there?

    And I don't understand your use of the term "tractable". Do you just mean "manageable" or something else?

    Thanks
     
  9. iceaura Valued Senior Member

    Messages:
    30,994
    Why take the derivative? Because it made the explanation simple and easy and clear - for a math guy, who does stuff like that almost by reflex. It's like if you see a small even number and dividing by two would help - your brain just does it.
    "Tractable": just that, "manageable". Sorry - that kind of stuff is my reflexive flaw.
    - - -
    btw: You've launched an interesting investigation, if you care about tennis. If it is really the case, for example, that in a match the odds of a given return from a given player are independent of the length of the rally or the position of that return in the sequence - if selection with replacement from one bag of identical balls does in fact model actual rallies - that has significant implications for the players in real matches.

    It's hard to believe nobody has done this already, in the world of tennis (and its high class university environment), but that would be fun to check out as well.
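
    One way such a check could be set up, sketched here under the bag-of-balls assumption: from a list of rally lengths, estimate the success rate of the return at each stroke position. If returns really are independent of position, every rate should sit near the same p. The rally lengths below are simulated, not real match data, and the function names are invented for this illustration:

```python
import random
from collections import Counter


def continuation_rates(rally_lengths, max_position=5):
    """Estimate P(return succeeds) at each stroke position.

    A rally of length n means n successful returns followed by one miss, so
    every rally of length >= k attempted stroke k + 1, and the rallies of
    length exactly k are the ones that missed it.
    """
    counts = Counter(rally_lengths)
    reached = len(rally_lengths)  # rallies that attempted the current stroke
    rates = []
    for position in range(max_position):
        if reached == 0:
            break
        missed = counts[position]
        rates.append((reached - missed) / reached)
        reached -= missed
    return rates


def simulated_lengths(p, n_rallies, seed=1):
    """Stand-in data: rally lengths drawn from the bag-of-balls model."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_rallies):
        n = 0
        while rng.random() < p:
            n += 1
        lengths.append(n)
    return lengths


rates = continuation_rates(simulated_lengths(0.6, 200_000))
print([round(r, 3) for r in rates])
```

    With real rally data in place of `simulated_lengths`, rates that drift up or down with position would be evidence against the one-bag model.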
     
    Last edited: Jun 11, 2019
    Jennifer Murphy likes this.
  10. przyk squishy Valued Senior Member

    Messages:
    3,203
    I didn't take the derivative of \(E\). I substituted \(1 + 2 p + 3 p^{2} + \dotsb = \frac{\mathrm{d}}{\mathrm{d}p} \bigl( 1 + p + p^{2} + p^{3} + \dotsb \bigr)\) into the first line. The second line is equal to the first line.
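
    Written out term by term, that substitution is the power rule \(\frac{\mathrm{d}}{\mathrm{d}p} p^{n} = n p^{n-1}\) applied to each term of the geometric series (differentiating a power series term by term is valid for \(|p| < 1\), which holds for any return probability below 100%):

    \(\frac{\mathrm{d}}{\mathrm{d}p} \bigl( 1 + p + p^{2} + p^{3} + \dotsb \bigr) = 0 + 1 + 2 p + 3 p^{2} + \dotsb\)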

    This seems moot though. Your method in post #3 works and doesn't require any calculus and is simpler. I just hadn't thought of doing it that way.
     
    Jennifer Murphy likes this.
  11. Jennifer Murphy Registered Senior Member

    Messages:
    239
    I do not believe that selecting from a bag of balls is a perfect simulation of a real tennis match, especially at the pro level or even the higher amateur levels. But at the intermediate level (3.0 to 4.5 on the USTA rating scale), I believe that it is a reasonable and useful approximation.

    I started this analysis over 30 years ago when I was running a tennis ladder. We had 50 or so players that would challenge each other on the ladder. The winner would send me the results and I would publish a new ladder about once a month. At some point, I started including a breakdown of the game, set, and match totals for each player. After a few such reports, I noticed something interesting.

    Here is a sample of some matches played by 5 imaginary players:
    Code:
    Winner  Loser  Set1  Set2  Set3
      B      D     6-1   7-5  
      A      D     6-4   6-3  
      D      E     6-4   8-6  
      E      D     7-6   6-7    7-6
      B      D     7-6   6-7    7-6
      A      C     6-3   4-6    6-4
      D      E     7-6   3-6    7-6
      A      B     6-1   6-4  
      B      C    10-8   6-4  
      C      E     4-6   7-5    7-5
      B      C     6-4   6-7    7-6
      C      E     6-3   6-1  
      C      D     6-4   6-7    7-6
      B      E     6-4   6-2  
      A      B     6-4   5-7    6-0
    
    And here's the report for those results:
    Code:
                      Games            Sets            Matches
    Player  Won  Lost     %  Won  Lost     %  Won  Lost     %
      A      57    36   61%    8     2   80%    4     0  100%
      B      96    89   52%   11     6   65%    5     2   71%
      C      91    88   51%    8     8   50%    3     3   50%
      D      99   112   47%    7    11   39%    2     5   29%
      E      74    92   45%    4    11   27%    1     5   17%
    Sum     417   417         38    38         15    15
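
    The totals in this report can be reproduced from the match list with a short script. This is a sketch, not the program originally used for the ladder. The names `MATCHES`, `tally`, and `print_report` are invented for this illustration, and each set score is assumed to be written from the match winner's point of view (so 6-7 is a set the match winner lost):

```python
from collections import defaultdict

# (winner, loser, set scores) copied from the sample match table above.
MATCHES = [
    ("B", "D", "6-1 7-5"),
    ("A", "D", "6-4 6-3"),
    ("D", "E", "6-4 8-6"),
    ("E", "D", "7-6 6-7 7-6"),
    ("B", "D", "7-6 6-7 7-6"),
    ("A", "C", "6-3 4-6 6-4"),
    ("D", "E", "7-6 3-6 7-6"),
    ("A", "B", "6-1 6-4"),
    ("B", "C", "10-8 6-4"),
    ("C", "E", "4-6 7-5 7-5"),
    ("B", "C", "6-4 6-7 7-6"),
    ("C", "E", "6-3 6-1"),
    ("C", "D", "6-4 6-7 7-6"),
    ("B", "E", "6-4 6-2"),
    ("A", "B", "6-4 5-7 6-0"),
]


def tally(matches):
    """Return {player: {"games": [won, lost], "sets": [...], "matches": [...]}}."""
    stats = defaultdict(
        lambda: {"games": [0, 0], "sets": [0, 0], "matches": [0, 0]}
    )
    for winner, loser, score in matches:
        stats[winner]["matches"][0] += 1
        stats[loser]["matches"][1] += 1
        for set_score in score.split():
            w, l = (int(x) for x in set_score.split("-"))
            stats[winner]["games"][0] += w
            stats[winner]["games"][1] += l
            stats[loser]["games"][0] += l
            stats[loser]["games"][1] += w
            # The set goes to whoever won more games in it.
            set_winner, set_loser = (winner, loser) if w > l else (loser, winner)
            stats[set_winner]["sets"][0] += 1
            stats[set_loser]["sets"][1] += 1
    return stats


def print_report(stats):
    """Print won, lost, and winning percentage for games, sets, and matches."""
    for player in sorted(stats):
        cells = []
        for key in ("games", "sets", "matches"):
            won, lost = stats[player][key]
            cells.append(f"{won:>4}{lost:>5}{won / (won + lost):>6.0%}")
        print(player, *cells)


stats = tally(MATCHES)
print_report(stats)
```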
    
    Does anyone notice anything interesting about this report?
     
