ramo.nash package

Submodules

ramo.nash.IBR module

ramo.nash.IBR.iterated_best_response(monfg, u_tpl, epsilon=0.0, max_iter=1000, init_joint_strategy=None, variant='alternating', global_opt=False, verify=True, seed=None)

Execute the iterated best response algorithm on a given MONFG and utility functions.

There are two variants of the iterated best response algorithm implemented, a simultaneous and alternating variant. These are not equivalent in general. In the simultaneous variant, all players calculate their best-response strategy simultaneously. The alternating variant does it by alternating.

Note

At this point in time, the algorithm does not find cycles and will continue to execute until the maximum number of iterations is reached.

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
max_iter (int, optional) – The maximum amount of iterations to run IBR for. (Default value = 1000)
init_joint_strategy (List[ndarray], optional) – Initial guess for the joint strategy. (Default value = None)
variant (str, optional) – The variant to use, which is either simultaneous or alternating. (Default value = ‘alternating’)
global_opt (bool, optional) – Whether to use a global optimiser or a local one. (Default value = False)
verify (bool, optional) – Verify if a converged joint strategy is a Nash equilibrium. When set to true, this uses a global optimiser and might be computationally expensive. (Default value = True)
seed (int, optional) – The initial seed for the random number generator. (Default value = None)

Returns:

Whether or not we reached a Nash equilibrium and the final joint strategy.

Return type:

Tuple[bool, List[ndarray]]

ramo.nash.Player module

class ramo.nash.Player.FPPlayer(pid, u, player_actions, payoff_matrix, init_strategy=None, rng=None)

Bases: Player

A player that learns a strategy using the fictitious play algorithm.

calc_joint_strategy()

Calculates the empirical joint strategy.

Returns:: The joint strategy.
Return type:: List[ndarray]

select_action()

Select an action using the current strategy.

Returns:: The selected action.
Return type:: int

update_empirical_strategy(player, action)

Update the empirical strategy of a player.

Parameters:

player (int) – The player to update for.
action (int) – Their last action.

update_strategy(epsilon=0, global_opt=False)

Updates the strategy of the player by calculating a best response to the empirical joint strategy.

Parameters:

epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
global_opt (bool, optional) – Whether to use a global optimiser or a local one. (Default value = False)

Returns:

Whether the strategy has converged and the best response strategy.

Return type:

Tuple[bool, ndarray]

class ramo.nash.Player.IBRPlayer(pid, u, num_actions, payoff_matrix, init_strategy=None, rng=None)

Bases: Player

A player that learns a strategy using best-response iteration.

update_strategy(joint_strat, epsilon=0, global_opt=False)

Update the strategy by using the super class implementation.

Parameters:

joint_strat (List[ndarray]) – A list of each player’s individual strategy.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
global_opt (bool, optional) – Whether to use a global optimiser or a local one. (Default value = False)

Returns:

Whether the strategy has converged and the best response strategy.

Return type:

Tuple[bool, ndarray]

class ramo.nash.Player.Player(pid, u, num_actions, payoff_matrix, init_strategy=None, rng=None)

Bases: object

A best-response player

check_converged(new_strat, joint_strat, epsilon=0)

Check whether a given player has converged in their best-response dynamics.

This works by comparing the performance of the old strategy and the new strategy to the current opponent strategies. If the old strategy performed as good (or better) in response, we don’t have to change the strategy and this player has (temporarily) converged.

Parameters:

new_strat – The new best-response strategy.
joint_strat – The old joint strategy.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)

Returns:

Whether the player’s strategy has converged.

Return type:

bool

update(joint_strategy, epsilon=0, global_opt=False)

Update the strategy by calculating a best response to the other players’ strategies.

Parameters:

joint_strategy (List[ndarray]) – A list of each player’s individual strategy.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
global_opt (bool, optional) – Whether to use a global optimiser or a local one. (Default value = False)

Returns:

Whether the strategy has converged and the best response strategy.

Return type:

Tuple[bool, ndarray]

ramo.nash.fictitious_play module

ramo.nash.fictitious_play.alternating_variant(players, epsilon=0, global_opt=False)

Execute one iteration of the alternating fictitious play variant.

Parameters:

players (List[FPPlayer]) – A list of fictitious play players.
epsilon (float, optional) – The tolerance in best response optimisation.
global_opt (bool, optional) – Whether to find a globally optimal best response or only a locally optimal.

Returns:

Whether the policies have converged and the new joint strategy.

Return type:

Tuple[bool, List[ndarray]]

ramo.nash.fictitious_play.fictitious_play(monfg, u_tpl, epsilon=0, max_iter=1000, init_joint_strategy=None, variant='alternating', global_opt=False, verify=True, early_stop=None, seed=None)

Execute the fictitious play algorithm on a given MONFG and utility functions.

There are two variants of the fictitious play algorithm implemented, simultaneous and alternating fictitious play. These variants are not equivalent in general. In the simultaneous variant, all players calculate their best-response strategy simultaneously. The alternating variant does it by alternating.

Note

At this point in time, the algorithm does not find cycles and will continue to execute until the maximum number of iterations is reached.

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
max_iter (int, optional) – The maximum amount of iterations to run IBR for. (Default value = 1000)
init_joint_strategy (List[ndarray], optional) – Initial guess for the joint strategy. (Default value = None)
variant (str, optional) – The variant to use, which is either simultaneous or alternating. (Default value = ‘alternating’)
global_opt (bool, optional) – Whether to use a global optimiser or a local one. (Default value = False)
verify (bool, optional) – Verify if a converged joint strategy is a Nash equilibrium. When set to true, this uses a global optimiser and might be computationally expensive. (Default value = True)
early_stop (int, optional) – The number of iterations the joint strategy has to be the same to allow an early stop. (Default value = None)
seed (int, optional) – The initial seed for the random number generator. (Default value = None)

Returns:

Whether or not we reached a Nash equilibrium and the final joint strategy.

Return type:

Tuple[bool, List[ndarray]]

ramo.nash.fictitious_play.simultaneous_variant(players, epsilon=0, global_opt=False)

Execute one iteration of the simultaneous fictitious play variant.

Parameters:

players (List[FPPlayer]) – A list of fictitious play players.
epsilon (float, optional) – The tolerance in best response optimisation.
global_opt (bool, optional) – Whether to find a globally optimal best response or only a locally optimal.

Returns:

Whether the policies have converged and the new joint strategy.

Return type:

Tuple[bool, List[ndarray]]

ramo.nash.moqups module

ramo.nash.moqups.calc_nfg_psne(nfg, player_actions)

Calculate the PSNE for an NFG using the “underlining” method.

Parameters:

nfg (List[ndarray]) – An NFG as a list of payoff matrices.
player_actions (Tuple[int]) – A tuple of the amount of actions per player.

Returns:

The pure joint-strategies that are a PSNE.

Return type:

ndarray

ramo.nash.moqups.moqups(monfg, u_tpl)

Compute all Pure Strategy Nash Equilibria (PSNE) for a given MONFG with quasiconvex utility functions [1].

Note

MOQUPS, Multi-Objective and Quasiconvex Utilities for Pure Strategies, is only guaranteed to be correct when using quasiconvex utility functions.

References

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.

Returns:

A list of pure joint strategies that are Nash equilibria.

Return type:

List[List[ndarray]]

ramo.nash.moqups.psne_to_strats(psne_lst, player_actions)

Convert the pure strategy Nash equilibria as action profiles to joint strategies.

Parameters:

psne_lst (List[ndarray]) – A list of pure strategy Nash equilibria as action profiles.
player_actions (Tuple[int]) – The number of actions per player.

Returns:

A list of joint strategies.

Return type:

List[List[ndarray]]

ramo.nash.moqups.reduce_monfg(monfg, u_tpl)

Reduce an MONFG to an NFG by scalarisation. This is also known as the trade-off game.

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.

Returns:

The scalarised MONFG.

Return type:

List[ndarray]

ramo.nash.mose module

ramo.nash.mose.mose(monfg, u_tpl)

Compute all Pure Strategy Nash Equilibria (PSNE) for a given MONFG.

Note

MOSE, Multi-Objective Strategy Enumeration, is slow but guaranteed to be correct if using a global optimiser that has sufficient theoretical convergence guarantees for the given utility functions.

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.

Returns:

A list of pure joint strategies that are Nash equilibria.

Return type:

List[List[ndarray]]

ramo.nash.verify module

ramo.nash.verify.verify_all_nash(monfg, u_tpl, joint_strats, epsilon=0, tol=1e-12)

Globally verify if each joint strategy in a list is a Nash equilibrium.

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A tuple of utility functions.
joint_strats (List[ndarray]) – A list of joint strategies to check.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
tol (float, optional) – The tolerance in the utility calculation. The default is set to the shgo default from
SciPy. (Default value = 1e-12) –

Returns:

Whether the joint_strategies in the list are actually Nash equilibria.

Return type:

bool

ramo.nash.verify.verify_nash(monfg, u_tpl, joint_strat, epsilon=0, tol=1e-12, strict=False)

Verify whether the joint strategy is a Nash equilibrium

Parameters:

monfg (MONFG) – An MONFG object.
u_tpl (Tuple[callable]) – A utility function per player.
joint_strat (List[ndarray]) – The joint strategy to verify.
epsilon (float, optional) – An optional parameter to allow for approximate Nash equilibria. (Default value = 0)
tol (float, optional) – The tolerance in the utility calculation. The default is set to the shgo default from
SciPy. (Default value = 1e-12) –
strict (bool, optional) – Whether to count unsuccessful optimisations as unverified and thus returning False.
False) ((Default value =) –

Note

A Nash equilibrium occurs whenever all strategies are best-responses to each other. We specifically use a global optimiser in this function to ensure all strategies are really best-responses and not local optima. Be aware that finding a global optimum for a function is computationally expensive, so this function might take longer than expected.

Returns:: Whether the given joint strategy is a Nash equilibrium.
Return type:: bool

ramo.nash package

Submodules

ramo.nash.IBR module

ramo.nash.Player module

ramo.nash.fictitious_play module

ramo.nash.moqups module

ramo.nash.mose module

ramo.nash.verify module

Module contents