Quick Post Mortem on improving basic bots with ML



Although I won't open source my bot quite yet (I rushed a lot of features and definitely need to go back and make the code style not terrible so I won't feel ashamed posting it :sweat:) this is essentially the basis of what I did to improve the basic bot:

  1. Start from the Starter ML bot. This is a pretty solid bot and gives you a nice idea of the strengths and shortcomings of this approach. It also comes with a greedy ship assignment function, so you don't have to write that logic yourself.

  2. Improve the Navigation. The starter bot is written in Python, which means it may time out; to prevent this, it just thrusts at an angle if the time left is less than x seconds. I opted for a Cython approach, as this sped it up quite a bit. I would also recommend recognizing when a collision is possible (my solution was a simple bounding circle) and steering away from it: when ships are close to each other, instead of complex navigation that avoids collisions and may time out, it may be better just not to move towards other objects!

  3. Change the ML layers. I technically did this before step 2, but didn't actually train the model until afterwards. I found that a fairly large first layer and a smaller second layer work very well; my first implementation added more layers and used a GDN, but this didn't work out nearly as well as a 2-layer model. The first layer does pretty well with 50-75 units and the second with 18-25; it varies depending on who exactly you're trying to copy.

  4. Train your models. ALSO: don't train on the replays it gets from today! The dataset is quite small and the top players tend to implement strategies that are difficult for an AI to pick up. I trained mine on every tsadmiral game between November 11 and December 11, primarily because tsadmiral seems to use a very concise strategy that isn't difficult to learn, and also has thousands of games of data on just one bot revision. My laptop's 16GB RAM + 6GB 1060X can support ~500 data samples at large batch sizes without a MemoryError (given every other application is closed). It IS extremely slow, which is why I modified bot.py. By loading games from a preprocessed NumPy file you can speed up loading those 500 games (if processing times out, you can dump them in batches into multiple NumPy files and then combine them!) and then train any model with that data in very little time. My saves total 336MB, which isn't bad for a full dataset, although this could be improved. Here are some of my modifications (manipulating NumPy data isn't difficult at all, but I haven't included that as it is essentially trivial):

import numpy as np

def main():
    ...
    data_input, data_output = parse(raw_data, args.bot_to_imitate, args.dump_features_location)
    data_size = len(data_input)
    # First 75% of the samples for training, the remaining 25% for validation
    split = int(0.75 * data_size)
    training_input, training_output = np.array(data_input[:split]), np.array(data_output[:split])
    validation_input, validation_output = np.array(data_input[split:]), np.array(data_output[split:])

    # First run only: cache the parsed arrays on disk
    np.save('train_in_1.npy', training_input)
    np.save('train_out_1.npy', training_output)
    np.save('test_in_1.npy', validation_input)
    np.save('test_out_1.npy', validation_output)
    # After the first run, comment out the above and do this instead:
    training_input = np.load('./train_in_1.npy')
    training_output = np.load('./train_out_1.npy')
    validation_input = np.load('./test_in_1.npy')
    validation_output = np.load('./test_out_1.npy')
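
The "combine them" part of step 4 is not shown in the post. A minimal sketch of one way to do it, assuming each batch was saved as its own `.npy` file with the same feature shape, is below. The helper name `combine_chunks` and the file pattern are hypothetical, not part of the starter bot.

```python
import glob
import os
import tempfile

import numpy as np


def combine_chunks(pattern):
    """Load every .npy file matching `pattern` and stack them along axis 0.

    Files are visited in sorted (lexicographic) name order, so zero-pad the
    chunk numbers if the order of the samples matters to you.
    """
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError("no files match " + pattern)
    return np.concatenate([np.load(path) for path in paths], axis=0)


# Demo with throwaway data: two chunks of 3 and 2 samples, 4 features each.
with tempfile.TemporaryDirectory() as tmp:
    np.save(os.path.join(tmp, "train_in_1.npy"), np.zeros((3, 4)))
    np.save(os.path.join(tmp, "train_in_2.npy"), np.ones((2, 4)))
    combined = combine_chunks(os.path.join(tmp, "train_in_*.npy"))

print(combined.shape)
```

Inputs and outputs need to be combined in the same file order so that sample i of the input still lines up with sample i of the output.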

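For the bounding-circle check mentioned in step 2, the exact implementation is not given in the post. The sketch below is one plain-Python way to express the idea, assuming every object is a circle with a known centre and radius. The function names and the `margin` parameter are illustrative only.

```python
import math


def circles_overlap(x1, y1, r1, x2, y2, r2, margin=0.0):
    """Return True if the two circles touch or overlap.

    `margin` pads the combined radius so that near misses also count,
    which is useful when both objects may move this turn.
    """
    return math.hypot(x2 - x1, y2 - y1) <= r1 + r2 + margin


def heading_away(x, y, other_x, other_y):
    """Angle in whole degrees (0-359) pointing from the other object towards (x, y),
    i.e. the direction that moves directly away from the other object."""
    return round(math.degrees(math.atan2(y - other_y, x - other_x))) % 360


# Two small circles 5 units apart: safe without a margin, flagged with margin 4.
print(circles_overlap(0, 0, 0.5, 3, 4, 0.5))
print(circles_overlap(0, 0, 0.5, 3, 4, 0.5, margin=4.0))
print(heading_away(0, 0, 1, 0))
```

A check like this is cheap enough to run for every nearby pair before falling back to any slower pathfinding.
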
You've improved the basic ML. Now what? Strategies! ML will tell you where to go, now you need to handle some special cases. This is a list of really easy (total < 200 lines) strategies to implement.

(I'm going to refer to everything with Star Wars metaphors as that's just how I named these in my code :rocket:)

  1. Jump to Hyperspace/Run and Hide. (This will raise your rank/rating more than anything else.) Running away when you're about to lose increases the chance of placing 2nd, which is good! The easiest way to do this is to detect that the Empire (number of enemy-controlled planets / total planets) is greater than 80%, that it is a 4-player game (len(game_map._players) at the beginning is > 2), and that the Rebel Alliance (number of your planets / total planets) is less than 20%. Then, move your fleet to the nearest corner. The first improvement you should make is fleeing earlier (although I have had some problems with this, and my current leaderboard bot doesn't use it as a result). The next improvement would be undocking any docked ships and instructing them to move away from the corner you're moving towards. This will trigger the enemy fleet and you'll last longer. Lastly, go in a circle around the board to avoid being caught. Some users have run to the nearest edge, but the fault I find with this is that your final planet is likely near an edge, so you will be quite close to the enemy fleet and may trigger a response.

  2. BLOW UP THE DEATH STAR! If you're in dire straits like the Rebel Alliance, nominate a fleet of rebel ships to become kamikazes. Send your ships towards a densely populated planet and BLOW IT UP! Usually this does more damage to the enemy than it does to you. Unfortunately, you don't have a Luke Skywalker, so none of these ships will survive :frowning:

  3. Retreat to Hoth. If your planet is obviously about to be taken over, run to the nearest planet and form a stronger base. Then, when the enemy fleet of AT-ATs moves towards you, you can wipe out as much of it as possible and increase the chance of making a comeback.

  4. Jedi Temple. If one of your Jedi is getting way too old (i.e. a ship is running SUPER low on HP, e.g. < 70), swap it out with a nearby undocked ship and direct the old ship to a planet. This builds a "Jedi temple" of ships that are too old to fight, which serve as producers, and puts the young padawans near the exterior of the planet to defend it. Preserve the order!

  5. Shoot first/Han Solo strat. Attack the weakest nearby clone/enemy ship if you're within x units (I find x=10 is nice as ships at high speed will probably come into contact the next turn if distance < 10). By initiating the contact you'll give the enemy less time to create a formation, and by disabling the weakest ship you decrease their firepower immediately. Also, some fleets follow the ship currently in battle, which means they will waste turns moving towards this weaker ship while you disable it as it is unlikely to be at the forefront of the assault.

  6. Rogue Squadron (similar to Rabbit). Find Yavin/biggest enemy and send it a fake ship. This ship is meant to draw attention from the fleet and can distract it; also, if undetected, it can launch an attack on the enemy base (Yavin!) or even become an early-on defector and hide in the corner.

  7. Squad Commander/Targeting. Ships going to the same planet should follow one leader who has the best calculated path. Do not use this for fighting or you'll get a terrible fleet formation that follows rabbits.

  8. Star Destroyer/Swarming. Use this for fighting. Create a swarm of x ships to fight, and have them move close together (within 3 units). This creates a type of Star Destroyer ship that is actually comprised of multiple ships but moves and acts as one. If you limit the number of the swarm, you can avoid losing to a rabbit. This is also more effective in increasing firepower, and also for defense.

  9. Kessel Run/Assassin strat. Early game, if you're close to the enemy fleet, send your ships as a star destroyer/swarm to kill them off. This is great as early game the enemy ships aren't likely to fight, and some may even start docking which means they will be defenseless! Of course this is a bit of a Kessel Run and runs EXTREMELY high risk.

  10. Sith Rule of Two. No ship can be given two commands! As a result, structure your code so the most important command (e.g. attack!) is given last (e.g. after move to planet). Store the commands in a dict keyed by ship ID rather than the default list: writing to command_queue[ship.id] overwrites any earlier entry, so the final command you give a ship is the one that ends up in the command queue. You can then turn this back into a list using the one-liner return [command for ship_id, command in command_queue.items()].
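
The dict-based command queue from item 10 can be sketched on its own, without the game library. The command strings below are placeholders standing in for whatever the thrust and dock calls return, and `build_command_queue` is a hypothetical helper name.

```python
def build_command_queue(orders):
    """Collapse (ship_id, command) pairs into one command per ship.

    `orders` should be listed from least to most important: a later
    command for the same ship overwrites the earlier one.
    """
    command_queue = {}
    for ship_id, command in orders:
        command_queue[ship_id] = command
    return list(command_queue.values())


orders = [
    (1, "t 1 7 90"),   # ship 1: move to planet (placeholder string)
    (2, "d 2 5"),      # ship 2: dock (placeholder string)
    (1, "t 1 7 180"),  # ship 1: attack instead, issued last so it wins
]
print(build_command_queue(orders))  # ['t 1 7 180', 'd 2 5']
```

`list(command_queue.values())` gives the same result as the comprehension in item 10. Since Python 3.7 a dict keeps the position of a key's first insertion, so ship 1 stays first even after being overwritten.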

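The Run and Hide trigger from item 1 can be written as a small predicate plus a corner picker. This is only a sketch using the 80% / 20% thresholds from the post. The function names are hypothetical, the planet counts are assumed to be computed elsewhere, and the corners are taken as the exact map corners (a real bot would probably aim slightly inside them).

```python
import math


def should_flee(my_planets, enemy_planets, total_planets, num_players,
                empire_threshold=0.8, rebel_threshold=0.2):
    """True when the enemy owns most of the map in a game that started with more than 2 players."""
    if num_players <= 2 or total_planets == 0:
        return False
    empire = enemy_planets / total_planets
    rebels = my_planets / total_planets
    return empire > empire_threshold and rebels < rebel_threshold


def nearest_corner(x, y, width, height):
    """Return the (x, y) of the map corner closest to the given position."""
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    return min(corners, key=lambda c: math.hypot(c[0] - x, c[1] - y))


print(should_flee(my_planets=1, enemy_planets=9, total_planets=10, num_players=4))
print(nearest_corner(10, 150, width=240, height=160))
```

Fleeing earlier, as suggested in item 1, would amount to lowering `empire_threshold` or raising `rebel_threshold`.
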
My bot uses a few more finicky strats, but those don't offer that much improvement. I'm only rank 102/rank 5 ML, so it is possible I've missed a lot of strats. Feel free to add to the list!

Also, check out @arjunvis's thread on reinforcement learning with policy gradients for heavier AI improvements!