Artificial Intelligence (AI) — Part two

Graphs showing some ML data

Spoiler alert: I won’t use Unity’s machine learning toolkit (ML-Agents, also called mlagents) to implement the artificial intelligence for my bots. If you want to know why, read on.

At first it was hard to use (see my previous post), but then Unity helped me by giving me access to their alpha mlagents-cloud. That fixed my previous problem, which was mostly a hardware problem.

It went from hard to use to easy to iterate on, and that’s exactly what I needed to find out whether it was a good approach for my idea of having bots use “real” AI.

Setup

When you try to train a model you have to give it three main kinds of information:

1 – Observations: what it (it’s called an agent) can see from its environment
2 – Actions: what the agent can do
3 – Rewards: information on how it performs

So you have to think quite hard about it, but since you know your environment, in the end you can find some good inputs for each of those (or so you think).
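To make the three inputs concrete, here is a minimal sketch in Python. It is not the Unity ML-Agents API and not the game's code: the `ToyAgent` class, its one-dimensional world and the reward formula are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    # Actions: what the agent can do (here, only step left or right).
    LEFT = -1
    RIGHT = 1


@dataclass
class ToyAgent:
    position: float
    target: float

    def observe(self) -> List[float]:
        # Observations: what the agent can see from its environment.
        return [self.position, self.target]

    def act(self, action: Action) -> float:
        # Apply the action, then return the reward for this step.
        before = abs(self.target - self.position)
        self.position += action.value
        after = abs(self.target - self.position)
        # Rewards: positive when the step brought the agent closer.
        return before - after


agent = ToyAgent(position=0.0, target=3.0)
reward_right = agent.act(Action.RIGHT)  # one step toward the target
reward_left = agent.act(Action.LEFT)    # one step away from it
```

A training library would call something like `observe` and `act` thousands of times and adjust the model so that the rewards add up to as much as possible.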

In the very beginning I tried to train a model that would move and shoot the target.

Let’s dive into some details

I had 12 observations, 10 actions and plenty of reward points here and there. But no matter what I tried, my model could not work out how to fire: it moved quite well but never fired.

I decided to split the model in two: one for moving, and one for aiming and firing. I found out online that most people do it this way when the problem is too hard for the agent. It’s a first trade-off, but I thought it was acceptable.

Now I have two experiments, one to learn to move and the other to aim and fire.

Moving

The agent has to go to the target, so the reward is calculated from how close it is to the target. It can go left, go right, jump or double jump. The map can be pretty hard to navigate, sometimes even impossible (something that machine learning does not like).

After 7 iterations, in which I changed the reward values, added or removed some observations, made the map easier to navigate and so on, this is what I got:

The agent mostly succeeds, but sometimes it goes in the wrong direction, it is always jumping like crazy, it does not handle the double jump when needed, and it does not look natural at all.

I gave it 8 more iterations, for example adding a negative reward to the jump so it would stop doing it so much, but I did not get anything better.
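For illustration, a reward of this kind could look like the following Python sketch. The function name, the `jump_penalty` value and the exact formula are assumptions, not the values used in the experiments.

```python
def move_reward(prev_distance: float, new_distance: float, jumped: bool,
                jump_penalty: float = 0.05) -> float:
    """Reward progress toward the target, minus a small cost for each jump."""
    progress = prev_distance - new_distance
    return progress - (jump_penalty if jumped else 0.0)


walked_closer = move_reward(10.0, 9.0, jumped=False)
jumped_closer = move_reward(10.0, 9.0, jumped=True)
jumped_in_place = move_reward(10.0, 10.0, jumped=True)
```

With a penalty like this, a jump that makes no progress scores below zero, while a jump that gets the agent closer is still worth it. Tuning that balance is the kind of change each iteration involved.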

Note that even if Unity mlagents-cloud allows me to iterate quickly, it still needs a couple of hours between each model change.

Firing

The agent has to hit the target with the bazooka, so the reward is calculated from how much damage it deals and, when it misses, how close it gets. It can aim up, aim down, load the shot and release to shoot. This time the map was made easy from the beginning.
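As a rough sketch of such a reward in Python (not the project's code): it assumes that "how close" means the distance between where the shot lands and the target, and the function name, `max_distance` and the scaling are hypothetical.

```python
def fire_reward(damage: float, miss_distance: float,
                max_distance: float = 50.0) -> float:
    """Reward damage on a hit; on a miss, give partial credit for closeness."""
    if damage > 0:
        return damage
    # Closeness goes from 1.0 (landed on the target) down to 0.0 (far away).
    return max(0.0, 1.0 - miss_distance / max_distance)


hit = fire_reward(damage=30.0, miss_distance=0.0)
near_miss = fire_reward(damage=0.0, miss_distance=25.0)
far_miss = fire_reward(damage=0.0, miss_distance=100.0)
```

The partial credit on a miss is there so the agent gets some signal before it ever manages a real hit.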

But after 5 iterations I found out that this was already too complex for the model. It did not manage to hit the target, only itself. From what I understood, the load-and-release action to fire is too complex.

Conclusion

The problem is that machine learning is hard, and I’m not an expert in it.

It took me a full week of working like crazy to conclude that I’m not enough of an expert to know what the limits of this are and how to get around them. Of course, I could spend more time on this, but it seems that no matter what, the outcome will not be as good as I first imagined.

By working on machine learning in this scope (training an agent to be a bot in a game), I also realized that as a developer you lose all control over your bot. I’m quite sure that when the AI is well trained the result is nice for the player, but as a game designer you cannot force how your bot behaves (except by making a new model each time).

This adds up to my final conclusion: machine learning is not what I need, so I’ll have to write the AI by hand for Artillery Royale, and this will be hard.

Funny note

When working with machine learning, you can come across some funny (but logical) behaviors. For example, in my first “fire” experiment the agent learned that not firing at all was the best way to go, because if it failed and hit itself it was punished. So I had to give it a positive reward for firing and a smaller penalty for hitting itself (this is an example of what you have to do between experiments to get a better outcome).
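A bit of arithmetic shows why "never fire" can be the logical choice. The Python sketch below uses made-up probabilities and reward values (none of them come from the real experiments) and assumes that doing nothing scores 0.

```python
def expected_fire_reward(p_hit_target: float, p_hit_self: float,
                         hit_reward: float, self_hit_penalty: float,
                         fire_bonus: float = 0.0) -> float:
    """Average reward for choosing to fire; not firing is assumed to score 0."""
    return fire_bonus + p_hit_target * hit_reward - p_hit_self * self_hit_penalty


# Hypothetical early-training odds: the agent rarely hits the target
# and often hits itself.
before = expected_fire_reward(0.05, 0.40, hit_reward=1.0, self_hit_penalty=1.0)

# Same odds, but with a small bonus for firing and a softer self-hit penalty.
after = expected_fire_reward(0.05, 0.40, hit_reward=1.0, self_hit_penalty=0.3,
                             fire_bonus=0.2)
```

With the first set of values, firing is worth less than zero on average, so staying idle wins. With the second set it becomes positive, so the agent has a reason to keep trying.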

Network implementation

Naive start

My idea was to implement the network part from day one because that way I could play with beta testers right away.

First, I started with my own implementation using Firebase and found out that it was pretty hard to get everything working. Then I benchmarked a bunch of solutions and settled on Photon Unity Networking (PUN).

It was great: the code was not that hard and it seemed to work. That is, until I had the chance to test an early version with a friend in real conditions (meaning over the internet and not on a local machine). The result was too laggy for me. I’m pretty sure I could have improved some details, but I didn’t want to fight against the code.

I decided to stop developing the network part right away, but thanks to this first step I’m very aware of how to structure the code.

Custom solution

Later on, I made a prototype of a new solution of my own, tailored for this particular game. Since it is a turn-based game, I will go for a “turn replay” mechanism: the idea is to record the turn of the player and broadcast it to the other player in near real time. This will also allow keeping a record of any game for later replays.

You can now see how important it is to have deterministic physics: with it, I don’t need to record every movement in the replay stream.

Let’s dive into some details

The “Stream Play” code (that’s what I call it internally) is split into two main components: the Recorder, which is in charge of (hum) recording events, and the Player, which replays those events. Of course, in between there is a websocket connection to transfer recorded events from player A to player B (it goes through a server for extra control).

The recorder does not save everything that happens; it only saves important information called snapshots. Those are the positions of the characters, the state of the map (holes and other changes like that), and the positions of the bonus boxes and mines. That way, at the end of the turn we are sure that both players are in sync.
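As an illustration only, a snapshot could be modelled like this in Python. The field names, the `[x, y]` position format and the JSON encoding are assumptions; the real game is written for Unity and its format is not described here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

Position = List[float]  # [x, y], kept as a list so it survives a JSON round trip


@dataclass
class Snapshot:
    time: float  # seconds since the start of the turn
    characters: List[Position] = field(default_factory=list)
    holes: List[Position] = field(default_factory=list)  # map damage
    bonus_boxes: List[Position] = field(default_factory=list)
    mines: List[Position] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @staticmethod
    def from_json(raw: str) -> "Snapshot":
        return Snapshot(**json.loads(raw))


sent = Snapshot(time=12.5, characters=[[1.0, 2.0], [8.0, 2.5]], mines=[[4.0, 0.0]])
received = Snapshot.from_json(sent.to_json())
```

Because a snapshot only lists final positions and map changes, it stays small even when the turn itself was busy.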

The recorder also sends the active player’s inputs, this time in real time, and those inputs are played right away on the other side. But because the output could slightly diverge, the source of truth at the end of the turn will be the snapshots.

The player, on the other end, buffers a few seconds of data; because Artillery Royale is turn-based and not real time, the delay does not matter much. It then runs the inputs and applies the snapshots. Both are time-based, so the player can follow the right timeline.
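One possible way to sketch this buffered, time-based playback in Python is shown below. The `StreamPlayer` class, its method names and the 2-second delay are hypothetical; the sketch only shows the idea of replaying events a little behind the live stream, in timestamp order.

```python
import heapq
from typing import Any, List, Tuple


class StreamPlayer:
    def __init__(self, buffer_delay: float = 2.0):
        self.buffer_delay = buffer_delay
        self._queue: List[Tuple[float, int, str, Any]] = []
        self._counter = 0  # tie-breaker: keeps arrival order for equal times

    def receive(self, time: float, kind: str, payload: Any) -> None:
        # Events may arrive out of order; the heap sorts them by timestamp.
        heapq.heappush(self._queue, (time, self._counter, kind, payload))
        self._counter += 1

    def due(self, now: float) -> List[Tuple[str, Any]]:
        # Play everything stamped up to "now minus the buffer delay".
        playhead = now - self.buffer_delay
        ready = []
        while self._queue and self._queue[0][0] <= playhead:
            _, _, kind, payload = heapq.heappop(self._queue)
            ready.append((kind, payload))
        return ready


player = StreamPlayer(buffer_delay=2.0)
player.receive(0.5, "input", "jump")
player.receive(0.1, "input", "left")
player.receive(3.0, "snapshot", {"characters": [[4.0, 1.0]]})
first_batch = player.due(now=2.6)   # only the two inputs are old enough
second_batch = player.due(now=5.0)  # now the snapshot is due as well
```

In a real client, an "input" event would be fed to the physics and a "snapshot" event would overwrite the local state.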

In the middle there is a NodeJS server. It does not do much: it mostly sends data from player A to player B, using a game id that is shared by both clients. This server (I mean this whole network thing) is still an early prototype. But so far I have some good results!
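The routing idea can be sketched without any networking at all. The real server is NodeJS over websockets; the Python below only models the "forward to the other client of the same game id" logic, and the `Relay` class and its methods are made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class Relay:
    def __init__(self) -> None:
        # game id -> ids of the clients connected to that game
        self._games: Dict[str, Set[str]] = defaultdict(set)

    def join(self, game_id: str, client_id: str) -> None:
        self._games[game_id].add(client_id)

    def route(self, game_id: str, sender: str, message: str) -> List[Tuple[str, str]]:
        # Forward the message to every other client in the same game.
        clients = self._games.get(game_id, set())
        return [(client, message) for client in sorted(clients) if client != sender]


relay = Relay()
relay.join("game-42", "player-a")
relay.join("game-42", "player-b")
relay.join("game-77", "player-c")
deliveries = relay.route("game-42", "player-a", "input:jump")
```

A real server would replace the returned list with actual socket writes, and would be the natural place to add the extra checks mentioned above.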

@koalefant asked on the Discord server: “I am curious why did you end up using both snapshots and input simulation for networking? Would not snapshots be sufficient?”

The answer is: basically I use custom physics for movements (characters and ammo), but I still use Unity colliders and I’m worried that collisions would drift at some point (I mean, not worried: they will at some point). That’s why I’m using both inputs and snapshots.

Conclusion

We can see that I chose a deterministic approach: sending inputs and letting the physics play out on both sides. Because the physics in Artillery Royale is (mostly) deterministic, it works. But I’m extra careful and send snapshots just in case!

The data that flows between the two players is very light. Even the real-time inputs do not amount to much information. This approach will also allow saving replays in a very compact format.

Chess Battle is now Artillery Royale!

Chess Battle renamed into Artillery Royale

Finding a name is a very hard task, and even harder for a non-native speaker.
I thought I had something with Chess Battle: it’s distinct, has some meaning and is straightforward to spell. Not bad.

Unfortunately, in some very early blind playtests people got confused by the name.
Some warned me that they were not good at chess and were hoping that the game did not need much chess skill.

RED ALERT!

Even if the chess theme sounded great to me at the start, I had not thought about that.

A game of chess is hard and, for most people, not even fun to play. In that way Chess Battle sends the wrong signal about the game. And to be honest, it’s not that chess-related.

So here we are: Chess Battle is now called Artillery Royale!

Six minutes of gameplay

I wanted to give you some news: everything is going well and the development is progressing as expected!

You’ll get more real-time news on the Artillery Royale Discord, but I also want to maintain this blog, so you can choose your favorite medium 🙂

Six minutes of Artillery Royale gameplay

It’s still an early alpha here; the final game will feature multiple character classes, more weapons, more map types, etc.

Artificial Intelligence (AI) — Part one

Graph showing Cumulative Reward

A demo for Artillery Royale is planned for the end of September.

At first, this demo was going to be two-player only, but I quickly realized that this wouldn’t make much sense for most players because right now there is no network support, nor enough players.

So if you wanted to play the demo, there would have to be two of you in the same room, playing turn by turn (something that is intended once the game is done, but probably not ideal for a demo where I want quick iteration and a fast feedback loop).

On the other hand, I always thought basic AI would be a pain to code and not fun to play against (because it’s based on a set of rules, the player can understand and predict it quickly).

So what was the solution?

Fortunately, we are at a time when you can use “real” AI in your games. Real like the one in self-driving cars, or the one that won at Go. That kind of real. The one that you can train yourself by giving rewards, getting a neural network in return. The one that is mostly unpredictable, creative, and fun to interact with.

I mean, that’s the theory.

That being said, it’s still a hard topic. AI development is fun to play with but hard to get right. And to be honest, I’m a total noob in this area. I understand the basics and how it works as a whole but implementing it is something else.

Fortunately, Unity has some good pieces in place to help you start: they provide a good toolkit (API) and some tutorials too. I thought it would be easy to apply their example to my specific problem, but OMG that was way harder than I thought.

My first approach was very naive: I used what I had just learned in a good tutorial, thinking that it would be quite easy to apply to my own problem. Not true. I had to refactor all the code first to make it work with multiple environments (but that’s a detail). I also had to think hard to design a good reward system and implement it, and to find how to express the AI’s objectives and translate that into code.

When this was done, I had some AI training going on, and to be honest it looked like it could yield some results. But I found another problem: computational power. My old MacBook Pro is, well, old and does not have the CPU power needed to train an AI model in a reasonable time. After a lot of internet searching, I found out that my objective is too complex for my AI model anyway (no matter the power of my computer).

Note on the hardware: at some point, I was looking to pimp my MacBook with an external GPU, thinking that it would help. I discovered that it won’t. TensorFlow, the software used behind the scenes, only uses Nvidia GPUs (and falls back to the CPU otherwise). Unfortunately, Nvidia and Apple are at war (and thus not compatible). So it won’t help.

Getting ready for part two.

Today I designed a new way of getting an AI to work. After my search, I found out that people split their objectives into small chunks and train multiple brains. Then they add some code to switch between those brains during gameplay. I’m hoping it will work that way.
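To give an idea of what such switching code might look like, here is a small Python sketch. The two "brains" are plain functions standing in for trained models, and the `firing_range` rule is an invented example of a hand-written switch; none of this is the game's actual code.

```python
from typing import Callable, Dict, List

Brain = Callable[[List[float]], str]


def move_brain(observations: List[float]) -> str:
    # Stand-in for a trained "move" model: walk toward the target.
    return "right" if observations[0] > 0 else "left"


def fire_brain(observations: List[float]) -> str:
    # Stand-in for a trained "aim and fire" model.
    return "fire"


BRAINS: Dict[str, Brain] = {"move": move_brain, "fire": fire_brain}


def choose_action(observations: List[float], firing_range: float = 10.0) -> str:
    """Hand-written switch: pick a brain first, then let it pick the action."""
    offset_to_target = observations[0]  # signed horizontal distance to the target
    brain_name = "fire" if abs(offset_to_target) <= firing_range else "move"
    return BRAINS[brain_name](observations)


far_right = choose_action([25.0])
far_left = choose_action([-25.0])
in_range = choose_action([4.0])
```

Each brain only has to solve one small problem, and the glue code in between stays fully under the developer's control.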

Also, Unity contacted me because they have some AI Training Cloud in their roadmap, and they may provide that service in the future. Hopefully, they will let me try it soon.

Doubts

When I spend too many hours on Artillery Royale trying to solve a problem or even iterating, sometimes I fall into periods of doubt.

I play my own game and I don’t see what I’m looking for. It’s not fun, not pretty, not well balanced. Full of bugs. In a word: it’s not good. From here it often escalates to the point where I realize the amount of work left. I feel overwhelmed. I don’t know where I’m heading to.

Mentally it’s hard to handle. The only solution I found for those moments is to step back for a few hours, or even days. And hopefully, when I’m back at work it’s gone and I’m full of faith again (so far it has worked like that).

I guess looking back at the accomplished work helps too, hence my previous post about current progress.

I think it’s important to share those bad feelings too, because as indie developers we face many challenges, and self-doubt seems to be one of the worst (at least for me).

Do not hesitate to take a break!

On that note, it’s holiday time for me, see you in a couple of weeks.

🏖