--- Note: A great deal of technical development between the Technical_Notes-v1.1 and v2.0 is not documented. This document starts at the beginning of the addition of networking to Maelstrom, and finishes with the final networking algorithm used. (See Networking.Paper for a more organized description) --- The hardest thing about this networking project is synchronization. My initial feeling was to go with TCP for sending synchronization packets, but running a real-time game at 30 frames per second, I can't have the slow-start algorithm after a packet loss and the other overhead of TCP would be wasteful. My thought was to take advantage of the guaranteed delivery that TCP offers, since on a 640x480 screen, even one missed frame could have significant reprocussions in terms of game state. I also need the speed of UDP. Ideally, this game would be able to run over 14,400 bps modem speeds. Using PPP, this averages between 600 and 900 cps, so including IP headers, we would want no more than 600 bytes of traffic every second. Using 1 byte synchronization packets, 30 frames per second would require in (1+U+20)*30 bytes throughput every second. (Where U = #bytes in a UDP header) That's at least 630 bytes per second, just going one way. Guaranteed delivery of synchronization packets requires at least an ack, nearly doubling the necessary throughput. The only reason we want such tight synchronization is so that keypresses arrive on the frame that they need to, and that frame is the same on every machine in the game. Since key-presses average about 10 per second, we can probably get away with only 10 sync packets every second (1 every 3 frames). We still want a granularity of a keyboard poll every frame, but we can now do lazy acks and do guaranteed delivery on 1 sync per second and each keystroke. Well, the problem is not if a keystroke is sent to a machine that is slow, but if a keystroke is sent to a machine that is fast. If a fast machine zips through the frames, then a keystroke at a particular frame will get missed or late, both of which resulting in corrupt games. Imagine pressing the turn key on your machine, pivoting round, and pressing the fire key. On your machine you hit the asteroid next to you, and on another fast machine, you crashed before you were able to hit the fire key. Instability: You are dead on one machine, and live on another. :-) We need synchronization on each frame (actually on each key poll,) to prevent keystrokes coming in a frame or two too late. Keystrokes can ride piggyback on sync frames, so there is little overhead for that... The alternative is to poll the keyboard every other frame, and only send sync signals every other frame as well. ... That doesn't work very well... you need the every frame key poll to respond quickly while firing. Stop and wait? -- not exactly. :) The conclusion? Networked Maelstrom won't run over the modem (200 ms ping times are too slow) and we will use UDP with stop-go protocol for real-time speed. The next phase is designing a good handshake protocol. The plan: When a new game starts... The checksum server (player 1) will send a checksum to all other players. Each other player will, when it receives it, send back the same message. How do we guarantee delivery? We don't, but if the reply doesn't make it, the first player will keep sending the NEW_GAME packet to that player, and that player will respond, even though it's waiting for the first frame packet. Another possible dropped packet condition is as follows: Player 1 sends a frame packet to itself and Player 2. Player 2 sends a frame packet to itself and Player 1. Player 2 receives the frame packet from Player 1, and itself. Player 1 only receives the packet from itself. Player 2 continues to the next frame Player 1 retransmits the packet from its current frame. Player 2 is now deadlocked waiting for the packet for frame +1 while Player 1 is still asking for the packet for frame +0. The solution is for each player to keep the it's last frame packet cached, so that if it receives a request for the last frame, it can resend the previous frame's packet, and the game can continue. The players can only be one frame off, because each player does not continue to the next frame until all players have sent the frame packet for the current frame. If one player is one frame behind, all other players will wait for it. If packets arrive out of order, then an older retransmission can arrive after a frame has been successfully negotiated. In this case, the receiver will send a response if the packet is from the previous frame, and it will be ignored by the sender, who is a frame ahead. Thus consistency is preserved without initiating packet storms of retransmissions and acks. Note that this configuration really is peer-to-peer, and the only time one player is a "server" is when the first player sends the initial level, number of lives, and random seed to all other players. I had a great problem with lock-stepping using the above method. Imagine both players furiously sending frame packets at each other, and advancing to the next frame. Then: Player 1 (frame 10) Player 2 (frame 10) 10 --> <-- 10 dropped packet Player 1 (frame 10) Player 2 (frame 11) wait <-- 11 get 11, timeout Player 1 (frame 10) Player 2 (frame 11) 10 --> <-- 10 (cached) Player 1 (frame 11) Player 2 (frame 11) 11 --> wait Player 1 (frame 11) Player 2 (frame 12) wait <-- 12 get 12, timeout This goes on, repeating, as the two players advance frames in lock-step, one frame per timeout. The solution is for Player 1 to immediately resend the sync packet for it's frame if it receives a packet for the next frame. That way it doesn't have to time out. If it caches the packet for the next frame, the next time around, it just sends the packet for the next frame, and advances immediately. The two players are now back in sync. This assumes that few packets will be dropped or otherwise lost. This seems to be a valid assumption. Even over a 32.6K link to Australia, very few packets were lost and though the game was unplayably slow, dropped, lost, and timed out packets were few and far between. The new senario: Player 1 (frame 10) Player 2 (frame 10) 10 --> <-- 10 dropped packet Player 1 (frame 10) Player 2 (frame 11) get 11, cache, 10 --> <-- 11 Player 1 (frame 10) Player 2 (frame 11) get 10 <-- 10 (cached) Player 1 (frame 11) Player 2 (frame 11) 11 --> get 11 Player 1 (frame 12) Player 2 (frame 12) 12 --> <-- 12 The players are now synchronized again.