We played a couple, then moved on to a week-long game of Global. I like “Revised” quite a bit, but my usual opponent is none too happy with the ability of transports to shoot back.
; )
Predicting Victory or Defeat - How do you know you are ahead or behind?
-
- Win / Lose is not a simple binary measure. You need to include the offset caused by a bid. I can guarantee that the Axis will always win with a 100 IPC bid. I can also guarantee that the Axis will lose if the Allies get a 100 IPC bid.
Interesting. I’m thinking of using a standardized 9 VCs or concession for victory. Hopefully the League and Tournaments will provide a good set of people playing standardized games.
Bid is a variable. So far most of my data set has very similar bids (one 5 and a bunch of 7s), so it isn’t a measurable factor in those cases.
- Not all victory cities are created equal, nor are all victory conditions. In an 8 VC game, Leningrad and Calcutta dominate the Axis strategies. 9 VC adds Moscow or occasionally London to the list for the Axis. Allied strategies often start with “Stop Axis” before they move on to identifying VC targets for 8 or 9 or more VC. Strategies will change based on the required VC count and on any time limits imposed.
Yes.
- IPC production rates have a long term effect. You can expect that there is a lagging phase relationship between IPC totals and winning. Another way to think about it is: with everything else being equal, larger IPC production will win out over time.
You’re on to something… So far (in very early results) the most powerful factor is the difference in Axis vs. Allies unit value, because it is the best measure of the impact of the player’s strategy and of how much territory they’ve controlled in the past and present. By contrast, the current IPC value of territory isn’t so important.
- Unit type and location is a short term effect. Since victory conditions are tied to physical control of certain territories, measuring unit counts is not sufficient. They should be measured in terms of combat value and distance from victory cities.
Which is why my model will probably never predict more than 70-80% of the outcome. Counting the balance of units around Russia and Germany, so as to determine whether either capital is about to fall, would be helpful but darn hard.
I am interested in your results. Currently, I watch VC count and IPC count as rough indicators. I also look at unit combat totals and time/distances from VCs that are in contention; that is much more subjective at this point. Schemes for valuation of units become technically complex, mathematically messy and time consuming, so I have not pursued them.
By this - Were you thinking of giving units different valuations from their IPC cost? Ex. if Japan has a stack of armor next to Russia, and a lack of infantry, they aren’t worth 5 each (more like 4 to 4.5). Too complex to do a good job of, though.
If you do become successful at identifying what leads toward victory, it will be a valuable input towards building a smarter AI for the game.
I like this. Maybe I should work on the AI. I’ve always wondered why AIs were so stupid. I used to play a lot of Civilization 2 and 3, and that AI was very stupid (the Civ 4 one is much better). I programmed an AI for Connect 4 once (using Turbo Pascal 7).
-
So, as you can probably guess, I pretty much know by round 5 who will win by how large a margin. Strategy, by that point, plays no role at all in the game, it’s now a game of chance. (Because everyone seems to use the same strategy with the only variations being related directly to what they have left after the last round.)
http://www.axisandallies.org/forums/index.php?topic=9006.msg178572#msg178572
Let’s hear your prediction.
-
…
By this - Were you thinking of giving units different valuations from their IPC cost? Ex. if Japan has a stack of armor next to Russia, and a lack of infantry, they aren’t worth 5 each (more like 4 to 4.5). Too complex to do a good job of, though.
Definitely different values than their IPC cost. Transports and submarines are very different yet cost the same.
I contemplate a value system that looks at offensive punch and IPC value of enemy territories within range of a combat move.
How about adding defensive punch and IPC value of own territory and adjacent territories that the enemy is not adjacent to?
With unit value tied to what is threatened and defended, you now have a metric for giving real value differences between an Inf on the front line and one sitting out the war in Guam.
With TRANs and ACs improving the range of other units, you now have a value added system for those units. You might even give them some fraction of the value of the units they extend the range of.
With each space having its own intrinsic IPC value and having a relative value due to its proximity to other valuable spaces, the map starts to make more sense.
It is not obvious that this would provide a real accounting of the 3 TRANs in the Baltic, for example, but it is a start.
If you do become successful at identifying what leads toward victory, it will be a valuable input towards building a smarter AI for the game.
I like this. Maybe I should work on the AI. I’ve always wondered why AIs were so stupid. I used to play a lot of Civilization 2 and 3, and that AI was very stupid (the Civ 4 one is much better). I programmed an AI for Connect 4 once (using Turbo Pascal 7).
I would focus first on being able to predict the winner. Any decent AI in a strategy game will need a way to value board position. Being able to predict a winner based on a current board position allows the AI to assign values to reaching certain goals on the board.
Oh and I almost forgot. Don’t forget to factor in a fudge for dice rolls…
-
As an example of early results, I first excluded the round 1 data (I’m taking data after the Russian turn) because there is a series of slaughter moves that typically happen in the first round. Eg. the UK battleship in the Mediterranean is worth only a fraction of its 24 IPCs. Results are improved by excluding round 1 data.
My main variable is the total value of Axis units in IPC minus that of the Allied units.
Just to show that you can find out stuff with very little data, with a lousy 30 data points, this one variable explains 43% of the outcomes.
Adj R^2 = 0.433
Constant=.793
AxTotDif (the difference in IPCs)
B= 0.0034
T=4.89 (the significance level is better than 0.001, i.e. there is a greater than 99.9% chance that this variable is statistically significant)
Thus, if you have zero IPC difference, the Axis has a 79.3% chance of winning (this makes a lot of sense; the Axis has better supply lines). For each IPC of difference, the probability of winning increases by 0.34%. Note: these numbers are going to change a LOT once I add some more data points.
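To make the arithmetic concrete, here is a tiny sketch in Python (the function name is mine; the constant and coefficient are the values quoted above):

```python
# Predicted Axis win chance from the fitted linear model above.
# 0.793 is the reported constant, 0.0034 the reported coefficient
# on AxTotDif (Axis minus Allied total unit IPC value).
def axis_win_prob(ax_tot_dif):
    return 0.793 + 0.0034 * ax_tot_dif

print(round(axis_win_prob(0), 3))    # 0.793 (even unit IPCs)
print(round(axis_win_prob(50), 3))   # 0.963 (Axis up 50 IPCs)
```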
In other news, this is pretty crazy: the difference in IPC value for land units has no significant impact; all of the difference comes from the naval IPC difference. That’s what happens when you only use 30 data points =)
-
Post P values instead of T values and I’ll probably understand better ;) (still got to nail these boring stats exams in December….)
-
WHY do you want to do a scientific analysis?
I noticed my question hasn’t been answered.
Could this be part of the Secret Liberal Conspiracy?
-
WHY do you want to do a scientific analysis?
I noticed my question hasn’t been answered.
Could this be part of the Secret Liberal Conspiracy?
Of course!
Once they have reduced the odds of winning down to mere numbers in a table, they will analyze the Axis of Evil and decide we should surrender immediately.
-
Is anyone an expert on different types of regressions? I’m wondering how much of a problem using a linear regression is for a variable that only has a 1 or 0 outcome?
The problem is that the difference between winning by a slim margin and totally devastating someone can be big. For instance, you can win a narrow victory with the Axis and Allies unit IPCs being equal, or have a big victory with a 200+ IPC difference. Ideally you’d have a win that was a “1” and a larger win that was a “1.5” or “2”. Any idea of how to measure this based on an Axis and Allies board? You could use victory cities, but I tend to think that they are a joke.
Is there any way to parse a map file? I’d like to convert it into an array of number of units per country, so I could write a computer program to generate a data file for analysis.
With the latest model, 1) AXIS IPC territory held (J+G territory) and 2) total unit IPC value difference are the two significant factors (p=0.001).
-
Instead of looking for a discrete outcome (0 or 1), why not look for a probability of an Axis win (from 0% to 100%)? Then you could say Game X is a 40% probability and Game Y is an 85% probability.
Not sure how to set up the math, but if you could it would probably be more useful.
-
To do a regression you need an outcome – or a “Y”.
The basic model is the rather simple
y = mx + b (standard linear equation)
Except that it is more like
y = m1x1 + m2x2 + … + b
(a linear equation with several factors)
So my data needs a Y, which is currently 1 if the Axis wins and 0 if the Allies win. You cannot use a “probability of winning” unless there is a way of scientifically measuring it.
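For anyone curious what that fit looks like outside SPSS, here is a minimal sketch in Python/NumPy. The factor choices and every number below are invented, purely for illustration:

```python
import numpy as np

# Fit y = m1*x1 + m2*x2 + b by ordinary least squares, where
# y = 1 for an Axis win and 0 for an Allied win.
# x1 = unit IPC difference, x2 = Axis territory IPC (invented data).
x = np.array([[ 30.0, 80.0],
              [-10.0, 62.0],
              [ 55.0, 90.0],
              [-40.0, 55.0],
              [ 20.0, 75.0],
              [-25.0, 60.0]])
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])

A = np.column_stack([x, np.ones(len(x))])  # append intercept column
(m1, m2, b), *_ = np.linalg.lstsq(A, y, rcond=None)
```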
-
But does Y have to be discrete? That is, does it have to be an integer (only ‘0’ or ‘1’), or could it be any decimal between 0 and 1? Ie, the equation is evaluated for Game A and Y=0.4; for Game B, Y=0.85.
See what I mean?
-
Let’s say you discover 3 factors that you have identified as having a linear relationship with winning (say IPC value, Victory Cities, and Naval strength). Assign a weight (m1, m2, m3) to each factor based on its relative importance. The sum of the weights should equal 1. Eg. if Victory Cities is found to be the most important, it gets m2=0.5, while IPC gets m1=0.3 and Navy gets m3=0.2.
This defines your model as:
Y = 0.3*x1 + 0.5*x2 + 0.2*x3
Now for each factor, based on your data, you identify the range of values from the games you analyzed. If there was never an allied win after the axis had 97 IPC or more, then 97 IPC is assigned a value of 1. If there was never an axis win when the allies had 99 IPC or more (axis had 67 IPC), then assign that a value of 0. The range in between (from 67 to 97 axis IPC) gets assigned values between 0 and 1 depending on what percentage of games were won by each side; it should be the best linear fit. Do the same thing for the other 2 factors (Victory Cities and Naval Strength).
Now for any given game, just plug in the data from that game (ie. axis IPC is 73, so x1 would equal 0.2, let’s say). Just for argument’s sake let’s say x2 was 0.25 and x3 was 0.8. Plug it into the formula and you get:
Y = (0.3)(0.2) + (0.5)(0.25) + (0.2)*(0.8)
= 0.06 + 0.125 + 0.16
= 0.345
Therefore there would be a 34.5% chance of the axis winning this particular game.
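The worked example above can be checked in a few lines of Python (the 67-97 IPC range, the weights, and the x values come from the example; the `scale` helper is my own naming):

```python
def scale(value, lo, hi):
    """Map a raw factor onto 0..1 within the observed range, clamped."""
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

weights = (0.3, 0.5, 0.2)        # IPC, Victory Cities, Navy
x1 = scale(73, 67, 97)           # axis IPC of 73 -> 0.2
x2, x3 = 0.25, 0.80              # as given in the example
y = weights[0] * x1 + weights[1] * x2 + weights[2] * x3
print(round(y, 3))               # 0.345
```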
-
So, as you can probably guess, I pretty much know by round 5 who will win by how large a margin. Strategy, by that point, plays no role at all in the game, it’s now a game of chance. (Because everyone seems to use the same strategy with the only variations being related directly to what they have left after the last round.)
http://www.axisandallies.org/forums/index.php?topic=9006.msg178572#msg178572
Let’s hear your prediction.
Axis submission in round… 9
I can give a thought process, but I would rather wait until the game is over.
-
Rclayton - interesting idea, but wouldn’t you run into a problem because the value that you’d have for Y would have to be created by the exact same factors that you had as x1, x2, and x3?
For instance, you wouldn’t want to use the IPC income as Y (because my regression analysis has shown that it is a good measure of the “outcome”, but that there are other factors that affect it as well - such as total unit value).
If you are regressing IPC income on IPC income, of course your model will be very good at predicting because they are the same thing! A similar problem exists if instead of IPC income you define victory as a continuum from 0 to 1 based on multiple factors. Your dependent variable (your ‘y’) needs to be different from your independent variables - and theoretically caused by them.
…
My latest finding is that the model gets better at predicting if I remove the early rounds. Thus the R^2 (percent of variance predicted) can increase from 45% to as much as 85% (or possibly even more, though I don’t have enough last-round data) if I only look at the later rounds. This makes a lot of sense, because in the early part of the game you don’t know who is going to win (for that matter, the 45% of variance predicted was probably coming from the later-round records and very little to none of it from the first couple rounds).
To increase prediction power in the earlier rounds where the game is nearly even, you’d have to be able to predict luck (impossible) or measure skill (possibly using league rankings).
-
I think you misunderstand me. Y is not IPC. Y is probability of Axis victory.
m1, m2, m3… is the weight of each independent variable (ie factor), and they all add up to 1.0 (this is important to make sure that your Y value is a scale of 0 to 1). So if you found that IPC (m1) was a better predictor than any other variable, your m1 value for IPC would be higher than your other m values (ie. m1>m2; m1>m3).
x1, x2, x3…is the actual value each independent variable takes on for the specific game you are analyzing, and they must all be between 0 and 1 (again to make sure your Y is between 0 and 1). So again for your IPC factor, if the current game has the axis doing very well in IPC (compared to your data set of games you previously analyzed) then your x1 value would be close to 1.0 and if the allies are doing well, the x1 value would be close to 0.0
What I’m describing is basically a weighted average based on linear relationships that you will determine using your data set.
Hope that helps.
-
Hmm, I think what you are describing is covered by the linear regression process. I’m using OLS (ordinary least squares) regression and SPSS (software which does most of the work for me). Are you familiar with OLS?
I’m fuzzy on some of the exact details as to how OLS works because it’s been 5+ years since I was doing major statistics work.
Here is the wikipedia entry:
http://en.wikipedia.org/wiki/Least_squares
-
I guess I’m suggesting something like weighted least squares
http://en.wikipedia.org/wiki/Weighted_least_squares
My university math is pretty fuzzy, and as I recall we only touched on regression. I didn’t take too many stats courses.
But yes, what I am describing is covered by linear regression. What I was attempting to do was demonstrate that Y need not be a dependent variable with only possible outcomes of 0 or 1, in response to:
Is anyone an expert on different types of regressions? I’m wondering how much of a problem using a linear regression is for a variable that only has a 1 or 0 outcome?
The problem is that the difference between winning by a slim margin and totally devastating someone can be big. For instance, you can win a narrow victory with the Axis and Allies unit IPCs being equal, or have a big victory with a 200+ IPC difference. Ideally you’d have a win that was a “1” and a larger win that was a “1.5” or “2”. Any idea of how to measure this based on an Axis and Allies board? You could use victory cities, but I tend to think that they are a joke.
Is there any way to parse a map file? I’d like to convert it into an array of number of units per country, so I could write a computer program to generate a data file for analysis.
With the latest model, 1) AXIS IPC territory held (J+G territory) and 2) total unit IPC value difference are the two significant factors (p=0.001).
If you allow Y to be a real number between 0 and 1, then I think it makes your linear regression model work better with it, and also solves your concern of a narrow victory versus a landslide win.
Also, parsing a map file would depend on the format used. TripleA map files would be pretty simple to parse, since the project is open source you should be able to look at the code and determine the format of the file. Mapview is not open source, but Motdc is actively developing for it and he might be open to helping parse a mapview map file. ABattlemap would be near impossible as far as I can tell, since I don’t believe anyone is actively developing this application anymore. Unfortunately I’d say 80% of the map files on this board are ABattlemap, so that may put a big kink in those plans.
-
Hopefully we won’t bore everyone else on the thread (people, I still need data files - send me your aBattlemaps!!!).
Maybe it would be helpful to clarify that there are two Ys: the observed outcome and the predicted outcome. The predicted outcome will vary a lot (mostly in the 0 to 1 range, but it could go as far as -1 or +2). The observed outcome is currently 0 or 1.
Weighting records might be a good idea. I suspect excluding them might work even better. Eg. if I can collect enough data for the last 1-3 rounds of a game, that would be the best.
Weighting - I tried it out, using the round as the weight, and it boosted R^2 from 0.35 to 0.55. However, I can get better results by excluding the early rounds.
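For reference, round-weighted fitting can be done with plain least squares via the square-root-of-weights trick (SPSS does the equivalent internally). A sketch in Python/NumPy with invented data:

```python
import numpy as np

# Weighted least squares: scaling each row of X and y by sqrt(weight)
# makes ordinary lstsq minimize the weighted sum of squared residuals.
# Here the game round is the weight; all data points are invented.
rounds   = np.array([2.0, 4.0, 6.0, 8.0, 10.0])        # weights
ipc_diff = np.array([-15.0, 5.0, 20.0, 35.0, 60.0])    # Axis - Allies
won      = np.array([0.0, 0.0, 1.0, 1.0, 1.0])         # Axis win?

A = np.column_stack([ipc_diff, np.ones_like(ipc_diff)])
sw = np.sqrt(rounds)
coef, *_ = np.linalg.lstsq(A * sw[:, None], won * sw, rcond=None)
slope, intercept = coef   # late-round (heavily weighted) games count more
```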
–
Hmm, logistic regression is meant to deal with 1/0 outcomes. However I don’t see how to do it with my SPSS version (11), so I’m going to try and get a new version.
BTW - do you have any aBattlemaps you can send my way? (Ideally with bid data.)
-
Sorry, I don’t have any maps available. I am actually in the midst of playing my first ever revised game as we speak.
I would think that you might want to try to avoid excluding earlier rounds. Ideally you want to be able to take any game, plug in the critical dimensions into the formula, and spit out some sort of expected outcome. Just because a game was in the early rounds, doesn’t mean you shouldn’t try to take a crack at predicting the outcome, does it?
I wonder if you could also try to calculate a confidence level? Eg. Game 1 was in round 30, and based on the independent variables the axis should win with a confidence level of 90%. Game 2 was only in round 6, and it was calculated that the axis should win but with a confidence level only of 55%.
Or something along those lines.
Not sure what the calculations would look like though…
-
I wonder if you could also try to calculate a confidence level? Eg. Game 1 was in round 30, and based on the independent variables the axis should win with a confidence level of 90%. Game 2 was only in round 6, and it was calculated that the axis should win but with a confidence level only of 55%.
The model will give you a predicted outcome and a standard deviation for it (for instance, it might give you a 0.9 with a 0.2 standard deviation), so you could get a confidence level from that. Maybe a logit model will do a better job of this, as it will tell you the chance of getting exactly 0 or 1, whereas the linear regression can predict 0.9, an outcome (a near win) that doesn’t exist since it represents an uncompleted game. I’ll see if I can get SPSS to upgrade.
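As a sketch of why a logit model helps here: it squeezes every prediction into (0, 1), so the output always reads as a probability. The coefficients below are invented; b0 is just picked so the zero-difference probability roughly matches the 79.3% figure from the linear fit earlier in the thread:

```python
import math

# Logistic (logit) model: predictions are forced into (0, 1), unlike
# the linear model, which can predict impossible values like 1.2 or -0.1.
# b0 and b1 are invented, illustrative coefficients, not fitted values.
def logit_axis_win_prob(ax_tot_dif, b0=1.35, b1=0.02):
    z = b0 + b1 * ax_tot_dif
    return 1.0 / (1.0 + math.exp(-z))

print(round(logit_axis_win_prob(0), 2))   # 0.79, close to the linear intercept
```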