We are going to use the Python implementation of OpenSkill to make the predictions, but the math works for any TrueSkill implementation as well. The dataset we are using is exactly the same as the one used by the JavaScript implementation of OpenSkill. Finally, it is required that you understand at least the general aspects of the Weng-Lin paper.

From the JavaScript version's author:

For team games, a dataset of 61867 matches from OverTrack.gg was loaded into memory. Most matches represent a team of 5 vs. 5, with no possibility for ties. Since tracking matches is voluntary, a not insignificant percentage of matches (7.82%) are played against a team of unrated/unknown players. The average number of matches per unique player is about 2.338; however, a decent number of players were tracked for over a hundred matches (Busby, 2020).

**Note:** Some people have pointed out that Overwatch matches are not 5 vs 5. The actual benchmarks are on matches of 6 vs 6. This is a typo in the original source.

The original author used a simple comparison of the \(\mu\) values: whichever team had the higher value was predicted to win. We are instead going to use this equation (Ibstedt, 2019) to create a general prediction function:

\[ P(p_1 - p_2 \geq \varepsilon) = \Phi\left(\frac{\mu_1 - \mu_2 - \varepsilon}{\sqrt{2\beta^2 + \sigma_1^2 + \sigma_2^2}}\right) \] (1)

Where \(\varepsilon\) is the draw margin, \(\Phi\) is the cumulative distribution function of the model's distribution, and \(\beta^2\) is the performance variance. This equation can only be used to predict 1v1 matches.
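As a sketch, equation (1) translates directly into Python for the 1v1 case. The function and constant names here are illustrative rather than the library's API, and \(\beta = 25/6\) is only the common TrueSkill-style default:

```python
import math

BETA = 25 / 6  # performance deviation; common TrueSkill-style default


def phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def win_probability(mu_1, sigma_1, mu_2, sigma_2, epsilon=0.0):
    """Probability that player 1 outperforms player 2 by at least
    the draw margin epsilon, per equation (1)."""
    denominator = math.sqrt(2 * BETA**2 + sigma_1**2 + sigma_2**2)
    return phi((mu_1 - mu_2 - epsilon) / denominator)
```

With a draw margin of zero the two players' probabilities are complementary, which is the property the normalization later on relies on.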

Let's generalize this function to be a bit more useful. Before doing that, we need a way to get the rating of a team. The paper OpenSkill is based on (Weng & Lin, 2011) denotes this as:

\[\theta_i = \sum_{j=1}^{n_i} \theta_{ij}\]

Where \(\theta_i\) is the strength of team \(i\) and \(\theta_{ij}\) is the strength of player \(j\) in team \(i\).
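Assuming independent Gaussian player skills, the team mean is the sum of the player means and the team variance is the sum of the player variances. A minimal sketch of that aggregation (the `(mu, sigma)` tuple representation is illustrative, not the library's API):

```python
import math


def team_rating(players):
    """Aggregate (mu, sigma) pairs into a single team rating by
    summing the means and the variances of independent Gaussians."""
    mu = sum(p[0] for p in players)
    sigma = math.sqrt(sum(p[1] ** 2 for p in players))
    return mu, sigma
```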

The next step is to construct the set of all ordered pairwise win probabilities given by (1). We will denote this set as \(S\); with \(k\) teams, \(|S| = k(k-1)\). If \(k \leq 2\) we simply return \(S\), else we define a new variable \(h = \frac{|S|}{k}\), the number of pairwise probabilities belonging to each team.

Given \(h\), we use (2) to compute the complete set of probabilities:

**Algorithm 6**: Prediction function where \(k>2\)

6.1 Let \(P\) be an empty list of probabilities.

6.2 \(\text{while $|S| > 0$}\)

\(Q = \) the next \(h\) elements removed from the front of \(S\)

\(s = \sum_{q \in Q} q\)

\(m = \frac{k(k-1)}{2}\)

\(P.append\left(\frac{s}{m}\right)\)

(2)

The sum of \(P\) will always be very close to 1.

This is what the equivalent Python code looks like:

```
from collections import deque

if k > 2:
    S = deque(S_list)  # pairwise win probabilities, k - 1 per team
    P = []
    h = len(S) // k  # number of pairwise probabilities per team
    m = k * (k - 1) / 2  # total of all pairwise probabilities
    while len(S) > 0:
        s = sum(S.popleft() for _ in range(h))
        P.append(s / m)
    return P
```
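The fragment above depends on surrounding context (`k`, `S_list`, and an enclosing function). As a self-contained sketch under the same assumptions, the pairwise probabilities from equation (1) can be combined with the normalizer \(\frac{k(k-1)}{2}\), the total of all pairwise probabilities, so that the results sum to exactly 1 when there is no draw margin. The names and the \(\beta\) default are illustrative, not the library's API:

```python
import itertools
import math

BETA = 25 / 6  # assumed performance deviation


def _phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def predict(teams):
    """teams: list of (mu, sigma) team ratings.
    Returns one win probability per team."""
    k = len(teams)
    s = [0.0] * k  # sum of pairwise win probabilities per team
    for i, j in itertools.permutations(range(k), 2):
        mu_i, sigma_i = teams[i]
        mu_j, sigma_j = teams[j]
        denom = math.sqrt(2 * BETA**2 + sigma_i**2 + sigma_j**2)
        s[i] += _phi((mu_i - mu_j) / denom)
    m = k * (k - 1) / 2  # total of all pairwise probabilities
    return [x / m for x in s]
```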

Here is how the actual implementation behaves:

```
>>> from openskill import Rating, predict_win
>>> a1 = Rating()
>>> a2 = Rating(mu=33.564, sigma=1.123)
>>> predictions = predict_win(teams=[[a1], [a2]])
>>> predictions
[0.45110901512761536, 0.5488909848723846]
>>> sum(predictions)
1.0
```

The actual openskill.py benchmarks can be found here. Hopefully both TrueSkill and all OpenSkill implementations benefit from this function.

Busby, P. (2020, June 3). Openskill vs TrueSkill implementations. Philihp Busby Blog Posts. Retrieved February 12, 2022, from www.philihp.com/2020/openskill.html

Ibstedt, J., Rådahl, E., Turesson, E., & Voorde, M. V. (2019). Application and Further Development of TrueSkill Ranking in Sports.

Weng, R. C., & Lin, C.-J. (2011). A Bayesian Approximation Method for Online Ranking. Journal of Machine Learning Research, 12, 267-300.

Let us refresh our memory as to what the argument actually is:

(1) The presence of pain (and harm more generally) is bad; and the presence of pleasure (and benefit more generally) is good; no such symmetry applies to the absence of pain and pleasure. Thus:

(2) The absence of pain (or harm more generally) is good even if that good is not enjoyed by anyone; and

(3) The absence of pleasure (or benefit more generally) is not bad unless there is somebody for whom this absence is a deprivation. (Benatar, 2018)

*Figure 1: The Axiological Asymmetry*

The conclusion from these axioms, of course, is that bringing into existence a thing that is subject to the experience of pain (harm) or pleasure (benefit) is not acceptable. Those experiences are respectively bad and good. But what if one is a utilitarian? Benatar claims that his argument fits utilitarian views just fine. Let us analyze a hypothetical to see whether this actually holds up.

Suppose we had a time machine that could rewind time itself and reset all human beings and creatures in the universe so that they eventually become sterile. In other words, consider a universe that did not conspire to keep conscious, reproducing creatures in existence indefinitely: its creatures would reproduce for nearly fifty thousand years, and then one day some law of nature would make reproduction physically impossible. By Benatar's position, it would be bad that there are conscious creatures exposed to suffering, but good that the suffering would end with the cessation of progeny. In other words, it is a better world than our world.

But what of our world? Presumably there are creatures on other planets (and our own) in the galaxy that are being brought into suffering by the laws of nature. Nature prefers suffering. This is where consequentialist views like utilitarianism come into play. There is no doubt that, in order to alleviate pain (suffering) for the many, it would be enough to agree to cease to exist. But what of those who cannot do this thinking? What of creatures that cannot consent, but nonetheless experience? Consider the scenario where no living creature chooses to expose itself to suffering once it becomes capable of thinking and consenting. In this universe the aggregate suffering will still remain positive. Why want this universe when it is possible to have net positive pleasure (albeit with some suffering initially) simply by being the species that takes up the service of suffering to alleviate the pain of the many?

This is not a zero sum game. It is possible to achieve goodness for everyone by being the one to take the bullet. This is the flaw in Benatar's argument. If we only look at ourselves as individuals, as opposed to what we actually are (a society), it will lead to a very selfish kind of life: one with no empathy for the future generations that are beyond our reach, the ones that will have to learn for themselves what suffering is. In light of this hypothetical, it is clear that we have a refutation of the claim that the axiological asymmetry, if true, would apply to utilitarians.

- Benatar, D. (2018). Not "Not 'Better Never to Have Been'": A Reply to Christine Overall. Philosophia, 47(2), 353-367. doi:10.1007/s11406-018-9972-y