The new ELO-based ranking system

  • '15 '14

    @axis-dominion said in Proposal for a new, ELO-based, ranking system:

    BTW I am 2-0 against the 2nd ranked player. Adam, however, has always been better than me I will admit that. I think he has beaten me at least 2 out of every 3 games, maybe more idk I’d have to look back.

    Well, I made the mistake of not playing you earlier, when it was easy to beat you because you regularly blundered ;-)

    Then, you stopped blundering and became super strong, and this is when I started playing you. I am still traumatized by our play-off match in which a surprise Neutral Crush entirely wrecked my (I believe at this point decent) position within a single turn :-(

    So yes, getting your scalp at least once is a reason for me to return to the league^^

    I am btw 2:1 vs Adam :D (Probably one win not recorded because it may have taken place in a Tournament and may not have counted for the league)

  • 2025 2024 '23 '22 '15 '11 '10 Official Q&A Moderator

    Ummmmm…
    It’s just a fact that if @JDOW and @axis-dominion returned, the league would be at full strength on top maybe like never before (according to the life-time ELOs)


  • just to get this straight.

    Are we just continuing from the old results, or is everyone starting at 1500 ELO on Jan 1, 2024?
    My understanding was that everyone would start at 1500.

  • 2024 '23 '22 '21

    My understanding is that the ELO system is a lifelong rating and should take into account all games ever played. It will continue based on the hundreds of game results that were just input.

  • '19 '18

    @Martin said in Proposal for a new, ELO-based, ranking system:

    My understanding is that the ELO system is a lifelong rating and should take into account all games ever played. It will continue based on the hundreds of game results that were just input.

    This.

    The system works best when you have around 15-20 games or more. It would only weaken the accuracy if we started on 1-1-2024 instead of all games ever.

    As of today, 4519 games have been counted - a little bit more than “hundreds” ;-)


  • OK

    But honestly, this has not been communicated clearly. You are now making scores based on revised conditions that were not known when the games were started. It is a fundamental change, and if I played someone’s “m*m” 10 years ago it should not count now.

    I understand the desire to make it accurate, but if fundamental changes are made, date zero must be clearly communicated. It should be 01.01.2024, not a random date in 2014 or whatever.

  • 2024 '23 '22 '21

    “now making scores based on revised conditions that were not known when the games were started.” --> What difference would it make? No one would have played differently - everyone plays to win.

  • '19 '18

    @oysteilo said in Proposal for a new, ELO-based, ranking system:

    if I played someone’s “m*m” 10 years ago it should not count now.

    That game 10 years ago has next to no impact on your current rating, unless of course you have only played 3-4 games since then.

    I understand your argument and our change to the ELO system wouldn’t hold up in court. If we want to go 100% by “the” rules (we make them ourselves…) it would be cleaner to set a certain start date.

    However, we are not a large organisation like FIFA or the IOC where millions and billions of $ are at stake. We can make decisions far more flexibly, in ways that serve our community better, without worrying that someone takes us to the CAS (Court of Arbitration for Sport) in Lausanne just because we didn’t close every legal loophole.

    On a side note: It doesn’t start at a random date in 2014, it starts with the very first game played when this league was established. As far as I know every (or at least most) big league has some kind of lifetime ranking, for example the Premier League:
    28b83736-4d55-4435-879a-a279d3745819-image.png

    Our all-time ELO has little to no impact on everyday matches. You can treat it as a funny little statistics page, similar to the all-time tables of other sports.

    It only comes into play for playoff seedings. And if you worry about that: It would be a lot less accurate if we started 1-1-2024. It would only lead to 1-2 years of inaccurate rankings before it would be reliable.

    I’m interested in what exactly bothers you about counting all games. What is the downside?


  • @MrRoboto

    1. I don’t understand how much a game I lost/won 5 years ago counts. I don’t understand the importance of the number of games I play every year.
    2. If I played 10 games last year, or 3, or 35, how does that count?
    3. Compare: in the last 3 years I played 4 games a year and won all of them against a 1500 opponent. How does this compare to 12 wins this year, all against 1500 opponents?

  • I feel you fail to explain this clearly. Maybe it is obvious to you; it’s not to me.


  • @oysteilo If you are consistently defeating 1500-ELO opponents, you will move up to the 1800s. Impossible to move above that range unless you are consistently beating better players. Doesn’t matter how many games you played or which year, you have to beat good players to get a great ranking. Just beating up mediocre players time after time won’t get you much above where you are now.

    To get up to the 1900+ range, you need to be beating players like AndrewAAGamer. Beat him a few times in a row and your ELO will jump up quickly, regardless of year. The bigger the upset, the more your score will jump. Lose to him a few times, and there is very little penalty as he is currently 300 points above you and you are expected to lose.
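    The asymmetry described above (a big jump for an upset, a small penalty for an expected loss) can be sketched with the textbook Elo formula. This is only an illustration: the 400-point scale is the standard one, but the sensitivity `k=32` and the names `expected_score` and `rating_change` are assumptions, not the league spreadsheet's actual settings.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score (roughly the win probability), 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def rating_change(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Points gained (positive) or lost (negative) for a single game.

    k=32 is an assumed sensitivity, not the league's real value.
    """
    actual = 1.0 if won else 0.0
    return k * (actual - expected_score(rating, opponent))


# A player facing an opponent rated 300 points higher.
underdog, favourite = 1600.0, 1900.0
upset_gain = rating_change(underdog, favourite, won=True)
expected_loss = rating_change(underdog, favourite, won=False)
print(f"win vs +300: {upset_gain:+.1f}   loss vs +300: {expected_loss:+.1f}")
```

    Under these assumptions the upset win is worth several times more points than the expected loss costs, which is the effect described in the post.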


  • Really appreciate the questions/concerns

    This is not all aimed at you, oysteilo, but putting a summary out there for everyone wondering about the big change.


    The league has had annual playoffs for many years. The tradition has always been to record all regular season results by year. The year used to end on October 31, not that many years ago. We had a 14 month year once to get it to December 31.

    Annual playoffs are a very high priority for the league, to name a league champion every year.

    Reliable ratings for players are a very high priority for the league - so that you know what skill level you’re getting when you challenge someone to a game that takes many hours over many months of time.

    We’ve always had a start-over of records each league year. It was November 1 and is now January 1.
    It was just logical when doing a simple standings spreadsheet. In 2012 and before it was just a list of how many wins and losses each player had, ordered by Winning %

    The ELO system spreadsheet that MrRoboto has created makes it feasible to accumulate data over the years and, since there is a database, the data can be presented in different ways.

    The # of games played within a (calendar or league) year is still very important because it continues to be a league rule that a minimum is required to participate in the playoffs. This is already tracked in the new spreadsheets.

    The sensitivity for a player’s first three to six games has been set high enough that a new player to the league will have a fair shot at a fair spot in the playoffs based on 1 year’s performance. The minimum # of games of 3-6 helps ensure this. Obviously with only 3 games played, the player’s skill may not be accurately assessed, but this has always been true.
    It’s the nature of our game. And we like to include newcomers in the playoff action without too much obstacle, but also not too easily.
    Daaras is our latest example. 3 games PTV finished just in December, all 3 against a single player, and he gets to participate in the playoffs. Nothing new here. Would be the same if he did it a year from now.


    To @oysteilo your questions

    1. If you’ve played like 20-30 games since 5 years ago, the games you won/lost 5 years ago have very little bearing.
    2. This depends a LOT on how many you played THIS year. The more you played this year, the less previous years will make a difference
    3. You will have the same ELO number at the end, either way. So your playoff spot would be the same (if you ignore the timing differences for your opponents, and their ELO’s would be different in your scenario) at the end of the current year.
      However, you would have been able to participate in year 1 as a player who’d won 4 times against a 1500, and in year 2 as a player who’d won 8 times against a 1500 - that is the difference.
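    Answer 3 can be illustrated with a small replay. This is a sketch under simplifying assumptions: the textbook Elo update, an assumed `k=32`, and an opponent whose rating stays fixed at 1500 (in the real league the opponents' ratings would move too). The helper names are hypothetical.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def replay(start: float, games: list[tuple[float, float]], k: float = 32.0) -> float:
    """Apply a list of (opponent_rating, score) results in order; score is 1 for a win, 0 for a loss."""
    rating = start
    for opponent, score in games:
        rating += k * (score - expected_score(rating, opponent))
    return rating


wins = [(1500.0, 1.0)] * 12  # twelve wins against a 1500-rated opponent

# Scenario A: four wins per year over three years (the rating carries over).
after_year_1 = replay(1500.0, wins[:4])
after_year_2 = replay(after_year_1, wins[4:8])
spread_over_three_years = replay(after_year_2, wins[8:])

# Scenario B: all twelve wins in a single year.
all_in_one_year = replay(1500.0, wins)

print(after_year_1, after_year_2, spread_over_three_years, all_in_one_year)
```

    The update has no notion of calendar time, only of game order, so both scenarios end on the same number. The difference is the rating the player carries into the playoffs at the end of years 1 and 2.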

    Running out of time to write. With the database, year-by-year data could be reported (wins/losses by player, sides taken).
    The year by year cutoff concept doesn’t have to completely die. We can talk about what we want to see.

    I strongly favor playoff seeding that looks back further than January 1 of the current year, especially for players who just met or barely exceeded the minimum # of games.
    In other words, I’d like to just go by current ELO (would be lifetime) at 12/31, and check minimum # of games have been played. The last 3, 6, 15 whatever games you have played have a lot more weight on your ELO. The one 80 games ago in 2014 is a nice statistic in your win-loss record and percent, but almost nothing, maybe 1 point, in your current ELO.
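    One way to get a feel for how quickly an old result fades is to replay the same later games twice, once after winning and once after losing the old game, and compare the two ratings. The sketch below assumes the textbook Elo update, an assumed `k=32`, and a made-up schedule of alternating wins and losses against a fixed 1500 opponent. Real numbers depend on the league's actual sensitivity and on who was played.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def gap_after(later_games: int, k: float = 32.0) -> float:
    """Rating difference caused by one old game, measured after `later_games` identical results."""
    opponent = 1500.0
    # Two copies of the same 1500 player: one won the old game, one lost it.
    won_old = 1500.0 + k * (1.0 - expected_score(1500.0, opponent))
    lost_old = 1500.0 + k * (0.0 - expected_score(1500.0, opponent))
    for i in range(later_games):
        score = 1.0 if i % 2 == 0 else 0.0  # hypothetical schedule: win, loss, win, loss, ...
        won_old += k * (score - expected_score(won_old, opponent))
        lost_old += k * (score - expected_score(lost_old, opponent))
    return won_old - lost_old


for n in (0, 5, 15, 30, 80):
    print(f"after {n:2d} later games the old result is still worth {gap_after(n):.1f} points")
```

    In this toy setup the gap shrinks with every later game, because the higher-rated copy is always expected to score a little more and is therefore rewarded a little less for the same results.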

    ELOs are still tracked by version, plus overall, so potentially 4 different ratings per player.

  • 2025 2024 '23 '22 '15 '11 '10 Official Q&A Moderator

    I need to say -
    I know it is 1/2/2024 and the 2024 league rules were just posted a couple days ago.

    No one should feel ambushed by changes - the 2024 playoff rules can be changed. I just rolled the rules out on time this year, and it wasn’t long ago I entered all the data back to 2012, so there hasn’t been time for the protests/questions.

    The 2024 playoffs being set by life-time ELO at 12/31/24 is open for discussion and I can always edit the rule. (It’s not set in stone)
    Those playoffs are a year away.

    Oysteilo’s questions are so helpful - you can’t really debate the change knowledgeably until you have some examples… until you see what the numbers do and what impact it would have on potential playoff participants. And this is what he’s getting at, and what we need to show you in order for you to be satisfied that you understand how it works.


  • Maybe we make our first game back against each other, both of us rusty and all? haha

    @JDOW said in Proposal for a new, ELO-based, ranking system:

    @axis-dominion said in Proposal for a new, ELO-based, ranking system:

    BTW I am 2-0 against the 2nd ranked player. Adam, however, has always been better than me I will admit that. I think he has beaten me at least 2 out of every 3 games, maybe more idk I’d have to look back.

    Well, I made the mistake of not playing you earlier, when it was easy to beat you because you regularly blundered ;-)

    Then, you stopped blundering and became super strong, and this is when I started playing you. I am still traumatized by our play-off match in which a surprise Neutral Crush entirely wrecked my (I believe at this point decent) position within a single turn :-(

    So yes, getting your scalp at least once is a reason for me to return to the league^^

    I am btw 2:1 vs Adam :D (Probably one win not recorded because it may have taken place in a Tournament and may not have counted for the league)


  • I just want to reword my question. But first of all, it’s fine to leave things the way they are.

    I think what I would like to understand better is how much individual games count. You say games I played 15 games ago don’t really count anymore. But do they count 5%, 10% or 0.001%? For the skilled people here it should be possible to put a number on this. If at all possible, give a number for each of the last 10 to 15 games. Say my blunder 10 games ago cost me 100 rating points: how much do I still suffer from it after 10 games, compared to not having played the blunder game? These are the kinds of questions I have. Give numbers on this or explain how I can figure it out myself!

  • '19 '18

    It’s impossible to put a number on it like you ask for.

    Maybe I’ll try to explain the general function of the ELO system and this could help you understand how it works.

    The ELO system tries to measure your skill. For example on a 0 - 10 scale, you could be a 5.84 and the system would try to find out that number.
    Now we all start at 1500, which on a 0-10 scale would be a 5. Everyone starts at 5.
    Why do I not use a 0-10 scale? Because we don’t have a fixed limit on how good or how bad someone can be, so we need an open-ended scale. And in many many other cases where ELO is used, 1500 has been established as some kind of standard starting point.

    Now let’s say your skill is around 1650. The system doesn’t know it yet, of course. But that would mean you should win most games against players below 1650 and lose most games against players above 1650. That’s why your ELO will gravitate towards that number.

    Every unexpected result (wins against players above 1650 or losses against players below 1650) will deflect you from that path. You might be above that 1650 because you had some lucky unexpected wins, so you might be at 1700 now. But now the system expects you to win against 1680 players most of the time so whenever you do lose against a 1680, it will bring you down a bit more than if you were at your “ACTUAL” skill level of 1650. Of course the opposite is true as well.

    You might have had a loss streak and now stand at 1500. The system would expect you to lose against 1600 players. But since you are ACTUALLY better than 1600, you DO win against them (which the system didn’t expect) so you will gain more points than usual.

    In the end you will gravitate towards your skill level and your ELO will oscillate around that number.
    The only way to raise the center of that gravitation is to actually improve your skill.
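    The gravitation described above can be sketched by feeding the update the score a player would earn on average, given a hidden true skill. Everything here is illustrative: the textbook Elo formula, an assumed `k=32`, a fixed 1600-rated opponent, and the hypothetical helper `drift_to_skill`. Real results are wins and losses, so a real rating oscillates around the value this smooth version settles on.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def drift_to_skill(start: float, true_skill: float, opponent: float = 1600.0,
                   games: int = 200, k: float = 32.0) -> float:
    """Update with the *average* score implied by the true skill instead of random results."""
    rating = start
    for _ in range(games):
        average_score = expected_score(true_skill, opponent)  # what the player really scores on average
        rating += k * (average_score - expected_score(rating, opponent))
    return rating


# An underrated and an overrated player, both with a true skill of 1650.
print(round(drift_to_skill(1500.0, 1650.0)))
print(round(drift_to_skill(1700.0, 1650.0)))
```

    Both starting points end up at the true skill: as long as the rating is below it, the player scores more than the system expects and gains points, and the reverse holds when the rating is above it.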


    There is one caveat though, a small weakness in the system.

    If you only choose opponents who are better than you and thus lose most of those games, the system can’t find your actual skill level. Let’s say you only play the top 1-5 players - each loss will only cost 1-2 ELO points and thus the rating will only fall EXTREMELY slowly, but there are not enough wins to tell the system where you belong. In the end, if you lose to a 2000 player that could mean everything: You could be 600, 1500 or 1900.

    Same goes for the opposite.

    If you only choose weaker players and play the bottom 1-5 all the time, these wins are almost worthless (giving you 1-2 points). It’s not practical to climb the rankings that way, because you can’t play the bottom player 500 times in a row. But again, if there are no losses the system cannot know how good you really are. Defeating a 700 ELO player can mean everything as well…
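    To put rough numbers on this caveat, the sketch below computes what a single lopsided game is worth and how many straight losses to a 2000-rated player it would take for a 1500-rated player to drop 50 points. It assumes the textbook Elo formula and an assumed `k=32`; the league's real sensitivity may differ, and the helper names are hypothetical.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score on the 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def loss_cost(rating: float, opponent: float, k: float = 32.0) -> float:
    """Points lost for one defeat."""
    return k * expected_score(rating, opponent)


def win_gain(rating: float, opponent: float, k: float = 32.0) -> float:
    """Points gained for one victory."""
    return k * (1.0 - expected_score(rating, opponent))


def losses_needed_to_drop(rating: float, opponent: float, points: float, k: float = 32.0) -> int:
    """Count consecutive losses until the rating has fallen by at least `points`."""
    target = rating - points
    games = 0
    while rating > target:
        rating -= loss_cost(rating, opponent, k)
        games += 1
    return games


print(f"1500 loses to 2000: -{loss_cost(1500.0, 2000.0):.2f}")
print(f"1500 beats 700:     +{win_gain(1500.0, 700.0):.2f}")
print("straight losses to a 2000 needed to drop 50 points:",
      losses_needed_to_drop(1500.0, 2000.0, 50.0))
```

    With these assumed settings one such loss costs under two points and one such win earns under one, so a long run of lopsided games tells the system very little about where the player really belongs.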

  • '19 '18

    And to give some perspective, here is your personal BM4 graph, @oysteilo with all your 74 completed BM4-results.

    63a0e550-03b5-437b-b516-84d4d3164527-image.png

    As you can see, there is not that much movement, so let’s zoom in a little bit:

    04abbca3-3beb-4d5f-85d1-22b1d203a320-image.png

    The orange line is the trend line.
    From looking at this, I’d say your skill level is between 1550 and 1600.
    Between Dec 2019 and May 2020 you had a couple of wins in a row which led to your all-time high of 1684, but that was followed by a 9-game loss streak.
    On 8 Nov 2021 it seems you mass forfeited 6 games, which of course brought you down a lot.
    However, with 4 wins out of your last 5 you almost got back to where you were before those forfeits. (1573 on 8 Nov 2021, 1538 now).

    As you can see, a single result 30 games ago has basically no impact on your current rating.
    A blunder against a much better opponent with an expected loss? No impact. A blunder against a weaker opponent that cost you the expected win? That hurts and you need at least a game, maybe 2 to offset that unexpected ELO drop.

  • '19 '18

    My own BM4-Graph looks like this btw:

    6112b39d-d8c8-4d62-a750-2fd7ea30be31-image.png

    When I first played in 2016, it looks like I was a 1480 player. These were some of my very first A&A games.
    However, 6 wins out of the last 7 games rewarded me with my then highest rating of 1563.

    I came back playing 2 years later and it seems like I improved to be something like a 1530 player.
    2 forfeited games (because I went AWOL) brought me down from 1536 to 1466 though (70 points in 2 games!)

    In my current stretch of games, it looks like I have improved once more to be around 1570. My current rating is 1591 but I have some losses incoming, so that sounds about right.


  • All right, thanks for the explanation. I guess you just have to focus on winning games and not accept more games than you can handle. It will be fun to see how this turns out!

  • '19 '18

    And because it’s so much fun, some more data :-)

    Who has played the most BM4 games? That would be our beloved @simon33 who unfortunately went AWOL a couple of weeks ago.

    78a05bc7-7dac-4ddf-92b9-6ffcfc131977-image.png

    You can see clearly how he improved a lot, from his lowest point in March 2017 (1077 ELO) to his highest Rating in May 2019 (1693 ELO). Which is to be expected!

    Now from then on his rating gradually declined - slowly but steadily. Did he become worse? Maybe.

    But remember, we are not measuring an absolute value. Your ELO rating is actually only a number that gives you a relation to the other players. So it’s not absolute, it’s relative to every other player.
    I’d interpret this graph differently: I think over the past ~4 years his skill level plateaued or stagnated but A) more and more other people did improve while he did not and B) more and more new people joined the league who are on average better than him.


    And now a completely different graph. The player with the 3rd most completed BM4-games, after @simon33 and @Giallo ?
    Our very own @axis-dominion

    Here is your graph, Mister:

    2db50aee-fe7c-41c0-ad67-f390dc53f18a-image.png

    A steady climb at the beginning where he chose mostly average players and won consistently almost all of those games.
    The opponents were weaker than him, but still reasonably close so the games were still worth something. The climb could have been a lot faster had he won against the top players from the get go - but he didn’t! The matches against already strong Adam514 back then were almost all losses.

    An unbelievable streak of 30 wins in 31 games between July 2018 and March 2019 (the only loss was against, of course, Adam514) rewarded him with his all-time high of 2101.

    3 losses on 16 Apr 2019 (probably mass forfeit?) saw him drop a whopping 158 points before his break.
    A steady climb after his return brought him back to the top, although his rating seems to have plateaued around 2030.
