(翻譯 gafferongames) Client Server Connection 客戶端服務器連接
https://gafferongames.com/post/client_server_connection/
So far in this article series we’ve discussed how games read and write packets, how to unify packet read and write into a single function, how to fragment and re-assemble packets, and how to send large blocks of data over UDP.
Now in this article we’re going to bring everything together and build a client/server connection on top of UDP.
到目前為止,在本系列文章中我們已經討論了游戲是如何讀取和寫入數據包的,如何將數據包的讀取和寫入統一為一個函數,如何對數據包進行分片與重組,以及如何通過 UDP 發送大塊數據。
現在,在這篇文章中,我們將把這些內容整合起來,構建一個基于 UDP 的客戶端/服務器連接。
Background
Developers from a web background often wonder why games go to such effort to build a client/server connection on top of UDP, when for many applications, TCP is good enough. *
The reason is that games send time critical data.
Why don’t games use TCP for time critical data? The answer is that TCP delivers data reliably and in-order, and to do this on top of IP (which is unreliable, unordered) it holds more recent packets hostage in a queue while older packets are resent over the network.
This is known as head of line blocking and it’s a huuuuuge problem for games. To understand why, consider a game server broadcasting the state of the world to clients 10 times per-second. Each client advances time forward and wants to display the most recent state it receives from the server.
But if the packet containing state for time t = 10.0 is lost, under TCP we must wait for it to be resent before we can access t = 10.1 and 10.2, even though those packets have already arrived and contain the state the client wants to display.
有 Web 開發背景的開發者常常會疑惑,為什么游戲要費這么大力氣在 UDP 之上構建一個客戶端/服務器連接?對于很多應用來說,TCP 已經足夠用了。*
原因是:游戲需要發送的是對時間敏感的數據。
那為什么游戲不使用 TCP 來發送這些時間敏感的數據呢?答案是:TCP 為了實現可靠、有序的數據傳輸,會在 IP(本身不可靠、無序)之上引入機制,將較新的數據包“扣押”在隊列中,直到較老的丟包被重新傳輸并成功收到為止。
這就是所謂的 “隊頭阻塞(Head-of-Line Blocking)”,而對游戲來說,這是一個巨大的問題。
要理解為什么這很嚴重,我們可以想象一個游戲服務器每秒向客戶端廣播世界狀態 10 次。客戶端不斷推進游戲時間,并希望展示自己收到的最新服務器狀態。
但是如果包含時間 t = 10.0 狀態的數據包丟失了,在 TCP 下,客戶端必須等待該數據包被重傳并成功接收,然后才能訪問時間 t = 10.1 和 10.2 的數據,即使這些數據包已經到達,并且它們包含了客戶端想要展示的游戲狀態。
這就是 TCP 的“有序可靠性”機制帶來的副作用:**即使你已經收到了更新的數據,也不能用,必須等舊的先補上。**而對于實時游戲來說,這種延遲是不可接受的。

Worse still, by the time the resent packet arrives, it’s far too late for the client to actually do anything useful with it. The client has already advanced past 10.0 and wants to display something around 10.3 or 10.4!
So why resend dropped packets at all? BINGO! What we’d really like is an option to tell TCP: “Hey, I don’t care about old packets being resent, by the time they arrive I can’t use them anyway, so just let me skip over them and access the most recent data”.
Unfortunately, TCP simply does not give us this option :(
更糟糕的是,當那個被重傳的數據包最終抵達時,客戶端早就已經過了 10.0,現在正想顯示 10.3 或 10.4 附近的狀態了! 這個時候再拿到 10.0 的狀態數據,根本沒什么用了。
所以我們不禁要問:為什么還要重傳丟失的數據包呢?
BINGO! 我們真正想要的是這樣一種能力,能告訴 TCP:“嘿,我不關心舊的數據包是否重傳,它們到的時候我已經用不上了,干脆跳過它們,直接給我最新的數據吧!”
不幸的是,TCP 并不提供這種選擇 :(
All data must be delivered reliably and in-order.
This creates terrible problems for time critical data where packet loss and latency exist. Situations like, you know, The Internet, where people play FPS games.
Large hitches corresponding to multiples of round trip time are added to the stream of data as TCP waits for dropped packets to be resent, which means additional buffering to smooth out these hitches, or long pauses where the game freezes and is non-responsive.
Neither option is acceptable for first person shooters, which is why virtually all first person shooters are networked using UDP. UDP doesn’t provide any reliability or ordering, so protocols built on top of it can access the most recent data without waiting for lost packets to be resent, implementing whatever reliability they need in radically different ways to TCP.
所有數據必須被可靠且按順序地傳輸。
這對存在丟包和延遲的時間敏感數據造成了嚴重問題。比如,你懂的,互聯網——人們在上面玩 FPS 游戲的地方。
由于 TCP 等待丟失的數據包重傳,數據流中會出現相當于多個往返時延的大幅卡頓,
這意味著要么增加額外緩沖來平滑這些卡頓,要么出現游戲卡住、無響應的長時間暫停。
對于第一人稱射擊游戲來說,這兩種方案都是不可接受的,這就是為什么幾乎所有的第一人稱射擊游戲都使用 UDP 進行聯網的原因。
UDP 不提供任何可靠性和順序保證,因此在其上構建的協議可以無需等待丟失的數據包被重傳,就能訪問最新的數據,并能用與 TCP 完全不同的方式,實現自己所需的可靠性。
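To make the "access the most recent data" idea concrete, here is a minimal sketch of a receiver that keeps only the newest world state and drops anything older, instead of waiting for it. The `StatePacket` and `LatestStateReceiver` names and the per-packet sequence number are assumptions made for illustration; they are not from the article, and sequence wrap-around is ignored to keep the sketch short.

```cpp
#include <cstdint>

// Hypothetical snapshot packet. Field names are illustrative only.
struct StatePacket
{
    uint32_t sequence;   // assumed to increase by one per snapshot sent by the server
    double   time;       // simulation time this snapshot describes
};

// Keeps only the newest snapshot seen so far. Lost or late packets are never waited for.
class LatestStateReceiver
{
public:
    // Returns true if the packet was newer than anything seen so far and was stored.
    bool OnPacketReceived( const StatePacket & packet )
    {
        if ( m_hasState && packet.sequence <= m_latest.sequence )
            return false;                 // stale or duplicate: drop it
        m_latest = packet;
        m_hasState = true;
        return true;
    }

    bool HasState() const { return m_hasState; }

    const StatePacket & GetLatest() const { return m_latest; }

private:
    bool m_hasState = false;
    StatePacket m_latest = { 0, 0.0 };
};
```

With this receiver, if the snapshot for t = 10.0 is lost, the snapshots for 10.1 and 10.2 are usable the moment they arrive, and a late copy of 10.0 is simply discarded.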
But, using UDP comes at a cost:
UDP doesn’t provide any concept of connection.
We have to build that ourselves. This is a lot of work! So strap in, get ready, because we’re going to build it all up from scratch using the same basic techniques first person shooters use when creating their protocols over UDP. You can use this client/server protocol for games or non-gaming applications and, provided the data you send is time critical, I promise you, it’s well worth the effort.
* These days even web servers are transitioning to UDP via Google’s QUIC. If you still think TCP is good enough for time critical data in 2016, I encourage you to put that in your pipe and smoke it :)
但是,使用 UDP 是有代價的:
UDP 不提供任何“連接”的概念。
我們必須自己構建這一部分。這是一項繁重的工作!所以,系好安全帶,準備好了,因為我們將從零開始構建,使用第一人稱射擊游戲在構建基于 UDP 協議時所采用的基礎技術。
你可以將這個客戶端/服務器協議用于游戲或非游戲應用,只要你發送的數據是時間敏感的,我向你保證,這一切努力都非常值得。
* 如今,甚至 Web 服務器也通過 Google 的 QUIC 協議轉向使用 UDP。如果你仍然認為 TCP 足以勝任 2016 年的時間敏感型數據傳輸,我建議你把這話寫在煙斗上,點著抽了吧 :)
Client/Server Abstraction 客戶端/服務器抽象層
The goal is to create an abstraction on top of a UDP socket where our server presents a number of virtual slots for clients to connect to:
目標是在 UDP 套接字之上創建一個抽象層,使得服務器可以提供多個虛擬連接槽位,供客戶端連接:

When a client requests a connection, it gets assigned to one of these slots:
當客戶端請求連接時,它將被分配到其中一個虛擬槽位:

If a client requests connection, but no slots are available, the server is full and the connection request is denied:
如果客戶端請求連接,但沒有可用的槽位,則服務器已滿,連接請求將被拒絕:

Once a client is connected, packets are exchanged in both directions. These packets form the basis for the custom protocol between the client and server which is game specific.
一旦客戶端連接,數據包將在雙向交換。這些數據包構成了客戶端和服務器之間的自定義協議的基礎,該協議是特定于游戲的。

In a first person shooter, packets are sent continuously in both directions. Clients send input to the server as quickly as possible, often 30 or 60 times per-second, and the server broadcasts the state of the world to clients 10, 20 or even 60 times per-second.
Because of this steady flow of packets in both directions there is no need for keep-alive packets. If at any point packets stop being received from the other side, the connection simply times out. No packets for 5 seconds is a good timeout value in my opinion, but you can be more aggressive if you want.
When a client slot times out on the server, it becomes available for other clients to connect. When the client times out, it transitions to an error state.
在第一人稱射擊游戲中,數據包持續在雙向傳輸。客戶端盡可能快速地將輸入發送到服務器,通常每秒發送30次或60次,而服務器則以每秒10次、20次甚至60次的頻率廣播世界狀態給客戶端。
由于數據包在雙向的穩定流動,因此不需要保活(keep-alive)數據包。如果在任何時候不再從另一端接收到數據包,連接就會超時。在我看來,5秒沒有收到數據包是一個不錯的超時值,但如果你愿意,也可以設置得更激進一些。
當服務器上的客戶端插槽超時時,它將變為可供其他客戶端連接的狀態。當客戶端超時后,它會轉變為錯誤狀態。
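The timeout rule described above can be sketched on the server side roughly as follows. `SlotTimeouts` is a hypothetical helper introduced only for this example, and time is assumed to be a monotonically increasing value in seconds supplied by the caller.

```cpp
const int MaxClients = 64;
const double ConnectionTimeout = 5.0;   // seconds without packets before a slot times out

// Hypothetical helper that tracks when each connected slot last received a packet.
class SlotTimeouts
{
public:
    void Connect( int clientIndex, double time )
    {
        m_connected[clientIndex] = true;
        m_lastPacketReceiveTime[clientIndex] = time;
    }

    // Any packet from a connected client counts as proof of life, so no keep-alive is needed.
    void OnPacketReceived( int clientIndex, double time )
    {
        if ( m_connected[clientIndex] )
            m_lastPacketReceiveTime[clientIndex] = time;
    }

    // Call once per server tick. Frees timed out slots and returns how many were freed.
    int Update( double time )
    {
        int numTimedOut = 0;
        for ( int i = 0; i < MaxClients; ++i )
        {
            if ( m_connected[i] && time - m_lastPacketReceiveTime[i] > ConnectionTimeout )
            {
                m_connected[i] = false;   // slot becomes available for other clients
                ++numTimedOut;
            }
        }
        return numTimedOut;
    }

    bool IsConnected( int clientIndex ) const { return m_connected[clientIndex]; }

private:
    bool m_connected[MaxClients] = {};
    double m_lastPacketReceiveTime[MaxClients] = {};
};
```

A more aggressive timeout only requires lowering `ConnectionTimeout`; the trade-off is that brief network stalls are more likely to drop a real player.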
Simple Connection Protocol
Let’s get started with the implementation of a simple protocol. It’s a bit basic and more than a bit naive, but it’s a good starting point and we’ll build on it during the rest of this article, and the next few articles in this series.
First up we have the client state machine.
The client is in one of three states:
- Disconnected
- Connecting
- Connected
Initially the client starts in disconnected.
簡單連接協議
讓我們開始實現一個簡單的協議。它有點基礎,也有點天真,但它是一個很好的起點,在接下來的這篇文章以及本系列接下來的幾篇文章中,我們將基于它進行構建。
首先,我們來看客戶端的狀態機。
客戶端處于三種狀態之一:
- 斷開連接 (Disconnected)
- 正在連接 (Connecting)
- 已連接 (Connected)
客戶端初始時處于“斷開連接”狀態。
When a client connects to a server, it transitions to the connecting state and sends connection request packets to the server:
當客戶端連接到服務器時,它會切換到“正在連接”狀態,并向服務器發送連接請求包:

The CRC32 and implicit protocol id in the packet header allow the server to trivially reject UDP packets not belonging to this protocol or from a different version of it.
Since connection request packets are sent over UDP, they may be lost, received out of order or in duplicate.
Because of this we do two things: 1) we keep sending packets for the client state until we get a response from the server or the client times out, and 2) on both client and server we ignore any packets that don’t correspond to what we are expecting, since a lot of redundant packets are flying over the network.
包頭中的CRC32和隱式協議ID允許服務器輕松地拒絕不屬于該協議或來自不同版本的UDP數據包。
由于連接請求包是通過UDP發送的,它們可能會丟失、亂序或重復接收。
因此,我們采取了兩個措施:1) 我們會繼續發送客戶端狀態包,直到收到來自服務器的響應或客戶端超時;2) 在客戶端和服務器端,我們會忽略任何不符合預期的數據包,因為網絡上會有很多冗余的數據包。
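One way to get a CRC32 with an implicit protocol id is to run the checksum over the protocol id followed by the packet data, while only the packet data and the CRC go on the wire. A receiver with a different protocol id then computes a different checksum and rejects the packet. The sketch below assumes a 64 bit protocol id with a made-up value and hypothetical function names; the CRC itself is the standard reflected CRC-32.

```cpp
#include <cstddef>
#include <cstdint>

// Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit for brevity.
// Pass a previous result as 'crc' to continue the checksum over additional bytes.
uint32_t CalculateCRC32( const uint8_t * data, size_t bytes, uint32_t crc = 0 )
{
    crc = ~crc;
    for ( size_t i = 0; i < bytes; ++i )
    {
        crc ^= data[i];
        for ( int bit = 0; bit < 8; ++bit )
            crc = ( crc >> 1 ) ^ ( 0xEDB88320u & ( 0u - ( crc & 1u ) ) );
    }
    return ~crc;
}

// Hypothetical protocol id. In practice this would change with every protocol version.
const uint64_t ProtocolId = 0x1122334455667788ULL;

// CRC over [protocol id][packet data]. The protocol id itself is never sent.
uint32_t CalculatePacketCRC32( const uint8_t * packetData, size_t packetBytes, uint64_t protocolId = ProtocolId )
{
    uint8_t idBytes[8];
    for ( int i = 0; i < 8; ++i )
        idBytes[i] = uint8_t( protocolId >> ( i * 8 ) );   // fixed byte order, independent of host
    const uint32_t crc = CalculateCRC32( idBytes, 8 );
    return CalculateCRC32( packetData, packetBytes, crc );
}

// Receiver side: ignore the packet unless the CRC matches our own protocol id.
bool IsPacketValid( uint32_t receivedCRC, const uint8_t * packetData, size_t packetBytes )
{
    return receivedCRC == CalculatePacketCRC32( packetData, packetBytes );
}
```

A table-driven CRC would be faster; the bitwise loop is used here only to keep the example short.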
On the server, we have the following data structure:
const int MaxClients = 64;

class Server
{
    int m_maxClients;
    int m_numConnectedClients;
    bool m_clientConnected[MaxClients];
    Address m_clientAddress[MaxClients];
};
Which lets the server look up a free slot for a client to join (if any are free):
int Server::FindFreeClientIndex() const
{
    for ( int i = 0; i < m_maxClients; ++i )
    {
        if ( !m_clientConnected[i] )
            return i;
    }
    return -1;
}
Find the client index corresponding to an IP address and port: 根據 IP 地址和端口查找對應的客戶端索引:
int Server::FindExistingClientIndex( const Address & address ) const
{
    for ( int i = 0; i < m_maxClients; ++i )
    {
        if ( m_clientConnected[i] && m_clientAddress[i] == address )
            return i;
    }
    return -1;
}
Check if a client is connected to a given slot: 檢查某個客戶端是否已連接到指定的槽位:
bool Server::IsClientConnected( int clientIndex ) const
{
    return m_clientConnected[clientIndex];
}
… and retrieve a client’s IP address and port by client index: 并通過客戶端索引獲取該客戶端的 IP 地址和端口:
const Address & Server::GetClientAddress( int clientIndex ) const
{
    return m_clientAddress[clientIndex];
}
Using these queries we implement the following logic when the server processes a connection request packet:
- If the server is full, reply with connection denied.
- If the connection request is from a new client and we have a slot free, assign the client to a free slot and respond with connection accepted.
- If the sender corresponds to the address of a client that is already connected, also reply with connection accepted. This is necessary because the first response packet may not have gotten through due to packet loss. If we don’t resend this response, the client gets stuck in the connecting state until it times out.
當服務器處理連接請求數據包時,使用這些查詢我們可以實現以下邏輯:
- 如果服務器已滿,回復連接被拒絕。
- 如果連接請求來自一個新客戶端,且我們有空閑的插槽,則將該客戶端分配到一個空閑插槽,并回復“連接已接受”。
- 如果發送者對應的地址是一個已經連接的客戶端,也回復“連接已接受”。這是必要的,因為第一個響應數據包可能因為丟包沒有到達客戶端。如果我們不重新發送該響應,客戶端就會卡在“連接中”狀態,直到超時為止。
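The three rules above could be put together along these lines. This is a sketch, not the article's implementation: the `Address` struct, the `ConnectionResponse` enum and `ProcessConnectionRequest` are stand-ins invented for the example, and actually sending the reply is left to the caller. Note one assumption about ordering: the already-connected check runs first, so a connected client that repeats its request is still accepted when the server is full.

```cpp
#include <cstdint>

const int MaxClients = 64;

// Minimal stand-in for the article's Address type (IPv4 only, for illustration).
struct Address
{
    uint32_t ip;
    uint16_t port;
    bool operator==( const Address & other ) const { return ip == other.ip && port == other.port; }
};

enum ConnectionResponse { CONNECTION_ACCEPTED, CONNECTION_DENIED };

class Server
{
public:
    explicit Server( int maxClients )
        : m_maxClients( maxClients > MaxClients ? MaxClients : maxClients ), m_numConnectedClients( 0 ) {}

    // Decides which reply to send for a connection request from 'from'.
    // On CONNECTION_ACCEPTED, 'clientIndex' is the slot assigned to that client.
    ConnectionResponse ProcessConnectionRequest( const Address & from, int & clientIndex )
    {
        clientIndex = FindExistingClientIndex( from );
        if ( clientIndex != -1 )
            return CONNECTION_ACCEPTED;          // earlier reply may have been lost: resend it

        clientIndex = FindFreeClientIndex();
        if ( clientIndex == -1 )
            return CONNECTION_DENIED;            // server is full

        m_clientConnected[clientIndex] = true;   // new client: assign the free slot
        m_clientAddress[clientIndex] = from;
        ++m_numConnectedClients;
        return CONNECTION_ACCEPTED;
    }

    int GetNumConnectedClients() const { return m_numConnectedClients; }

private:
    int FindFreeClientIndex() const
    {
        for ( int i = 0; i < m_maxClients; ++i )
            if ( !m_clientConnected[i] )
                return i;
        return -1;
    }

    int FindExistingClientIndex( const Address & address ) const
    {
        for ( int i = 0; i < m_maxClients; ++i )
            if ( m_clientConnected[i] && m_clientAddress[i] == address )
                return i;
        return -1;
    }

    int m_maxClients;
    int m_numConnectedClients;
    bool m_clientConnected[MaxClients] = {};
    Address m_clientAddress[MaxClients] = {};
};
```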
The connection accepted packet tells the client which client index it was assigned, which the client needs in order to know which player it is in the game:

Once the server sends a connection accepted packet, from its point of view it considers that client connected. As the server ticks forward, it watches connected client slots, and if no packets have been received from a client for 5 seconds, the slot times out and is reset, ready for another client to connect.
Back to the client. While the client is in the connecting state the client listens for connection denied and connection accepted packets from the server. Any other packets are ignored.
If the client receives connection accepted, it transitions to connected. If it receives connection denied, or after 5 seconds hasn’t received any response from the server, it transitions to disconnected.
Once the client hits connected it starts sending connection payload packets to the server. If no packets are received from the server in 5 seconds, the client times out and transitions to disconnected.
一旦服務器發送了“連接接受”數據包,從服務器的角度來看,它就認為該客戶端已經連接成功。隨著服務器持續運行,它會持續監視已連接的客戶端槽位,如果某個客戶端在 5 秒內沒有發送任何數據包,則該槽位會超時并被重置,準備接受另一個客戶端的連接。
回到客戶端這邊。當客戶端處于“連接中”狀態時,它會監聽來自服務器的“連接拒絕”和“連接接受”數據包,其他類型的數據包則會被忽略。
如果客戶端收到了“連接接受”數據包,它就會轉入“已連接”狀態。如果收到“連接拒絕”,或者在 5 秒內沒有收到服務器的任何回應,它就會轉入“斷開連接”狀態。
一旦客戶端進入“已連接”狀態,它就開始向服務器發送連接負載數據包。如果在之后的 5 秒內仍然沒有收到服務器的數據包,客戶端也會超時并轉為“斷開連接”狀態。
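The client side of this simple protocol fits in a small state machine. The sketch below uses hypothetical enum and method names, and assumes the caller supplies the current time in seconds and performs the actual socket send. It also assumes that, once connected, only payload packets refresh the timeout.

```cpp
enum ClientState { CLIENT_DISCONNECTED, CLIENT_CONNECTING, CLIENT_CONNECTED };

enum PacketType
{
    PACKET_CONNECTION_REQUEST,
    PACKET_CONNECTION_ACCEPTED,
    PACKET_CONNECTION_DENIED,
    PACKET_CONNECTION_PAYLOAD
};

const double ClientTimeout = 5.0;   // seconds

class Client
{
public:
    void Connect( double time )
    {
        m_state = CLIENT_CONNECTING;
        m_clientIndex = -1;
        m_lastPacketReceiveTime = time;
    }

    // Call every tick. Returns true and sets 'packetToSend' if a packet should be sent now.
    // Requests are repeated every tick while connecting, because any of them may be lost.
    bool Update( double time, PacketType & packetToSend )
    {
        if ( m_state == CLIENT_DISCONNECTED )
            return false;
        if ( time - m_lastPacketReceiveTime > ClientTimeout )
        {
            m_state = CLIENT_DISCONNECTED;   // no response, or server went silent
            return false;
        }
        packetToSend = ( m_state == CLIENT_CONNECTING ) ? PACKET_CONNECTION_REQUEST
                                                        : PACKET_CONNECTION_PAYLOAD;
        return true;
    }

    // 'clientIndex' is only meaningful for connection accepted packets.
    void OnPacketReceived( PacketType type, double time, int clientIndex = -1 )
    {
        if ( m_state == CLIENT_CONNECTING )
        {
            if ( type == PACKET_CONNECTION_ACCEPTED )
            {
                m_state = CLIENT_CONNECTED;
                m_clientIndex = clientIndex;
                m_lastPacketReceiveTime = time;
            }
            else if ( type == PACKET_CONNECTION_DENIED )
            {
                m_state = CLIENT_DISCONNECTED;
            }
            // anything else is ignored while connecting
        }
        else if ( m_state == CLIENT_CONNECTED && type == PACKET_CONNECTION_PAYLOAD )
        {
            m_lastPacketReceiveTime = time;
        }
    }

    ClientState GetState() const { return m_state; }
    int GetClientIndex() const { return m_clientIndex; }

private:
    ClientState m_state = CLIENT_DISCONNECTED;
    int m_clientIndex = -1;
    double m_lastPacketReceiveTime = 0.0;
};
```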
Naive Protocol is Naive
While this protocol is easy to implement, we can’t use a protocol like this in production. It’s way too naive. It simply has too many weaknesses to be taken seriously:
雖然這個協議實現起來很簡單,但我們不能在生產環境中使用這樣的協議。它實在是太天真(naive)了。它有太多的弱點,無法被認真對待:
- Spoofed packet source addresses can be used to redirect connection accepted responses to a target (victim) address. If the connection accepted packet is larger than the connection request packet, attackers can use this protocol as part of a DDoS amplification attack.
  偽造的數據包源地址可以用來將連接接受響應重定向到目標(受害者)地址。如果連接接受包比連接請求包大,攻擊者可以利用這個協議進行DDoS放大攻擊。
- Spoofed packet source addresses can be used to trivially fill all client slots on a server by sending connection request packets from n different IP addresses, where n is the number of clients allowed per-server. This is a real problem for dedicated servers. Obviously you want to make sure that only real clients are filling slots on servers you are paying for.
  偽造的數據包源地址還可以通過從n個不同的IP地址發送連接請求包來輕松填滿服務器上的所有客戶端插槽,其中n是每個服務器允許的客戶端數量。這對專用服務器來說是一個實際問題。顯然,你需要確保只有真實的客戶端才能占用你所支付的服務器上的插槽。
- An attacker can trivially fill all slots on a server by varying the client UDP port number on each client connection. This is because clients are considered unique on an address + port basis. This isn’t easy to fix because due to NAT (network address translation), different players behind the same router collapse to the same IP address with only the port being different, so we can’t just consider clients to be unique at the IP address level sans port.
  攻擊者可以通過在每個客戶端連接中更改客戶端的UDP端口號,輕松填滿服務器上的所有插槽。這是因為客戶端是基于地址+端口來唯一標識的。這不容易修復,因為由于NAT(網絡地址轉換),同一個路由器后面的不同玩家會共享相同的IP地址,只有端口不同,因此我們不能僅僅將客戶端視為基于IP地址唯一的。
- Traffic between the client and server can be read and modified in transit by a third party. While the CRC32 protects against packet corruption, an attacker would simply recalculate the CRC32 to match the modified packet.
  客戶端和服務器之間的流量可以被第三方讀取和修改。雖然CRC32可以防止數據包損壞,攻擊者仍然可以重新計算CRC32以匹配修改后的數據包。
- If an attacker knows the client and server IP addresses and ports, they can impersonate the client or server. This gives an attacker the power to completely hijack a client’s connection and perform actions on their behalf.
  如果攻擊者知道客戶端和服務器的IP地址和端口,他們就可以偽裝成客戶端或服務器。這使得攻擊者能夠完全劫持客戶端的連接,并代表客戶端執行操作。
- Once a client is connected to a server there is no way for them to disconnect cleanly, they can only time out. This creates a delay before the server realizes a client has disconnected, or before a client realizes the server has shut down. It would be nice if both the client and server could indicate a clean disconnect, so the other side doesn’t need to wait for timeout in the common case.
  一旦客戶端連接到服務器,就沒有辦法干凈地斷開連接,它們只能超時。這會導致延遲,直到服務器意識到客戶端已斷開連接,或者直到客戶端意識到服務器已經關閉。如果客戶端和服務器都能指示一個干凈的斷開連接,那將會很好,這樣在常見情況下另一方就不需要等待超時了。
- Clean disconnection is usually implemented with a disconnect packet, however because an attacker can impersonate the client and server with spoofed packets, doing so would give the attacker the ability to disconnect a client from the server whenever they like, provided they know the client and server IP addresses and the structure of the disconnect packet.
  干凈的斷開連接通常通過斷開連接數據包來實現,然而,由于攻擊者可以通過偽造數據包來冒充客戶端和服務器,進行這種操作將使得攻擊者能夠在任何時候斷開客戶端與服務器的連接,只要他們知道客戶端和服務器的IP地址以及斷開連接數據包的結構。
- If a client disconnects dirty and attempts to reconnect before their slot times out on the server, the server still thinks that client is connected and replies with connection accepted to handle packet loss. The client processes this response and thinks it’s connected to the server, but it’s actually in an undefined state.
  如果客戶端斷開連接時沒有正確斷開,并嘗試在其插槽超時之前重新連接,服務器仍然認為該客戶端已連接,并回復連接接受,以處理數據包丟失。客戶端處理此響應并認為自己已連接到服務器,但實際上它處于未定義的狀態。
While some of these problems require authentication and encryption before they can be fully solved, we can make some small steps forward to improve the protocol before we get to that. These changes are instructive.
雖然其中一些問題需要認證和加密才能完全解決,但在我們達到這一點之前,我們可以采取一些小步驟來改進協議。這些改進是具有指導意義的。
Improving The Connection Protocol
The first thing we want to do is only allow clients to connect if they can prove they are actually at the IP address and port they say they are.
To do this, we no longer accept client connections immediately on connection request, instead we send back a challenge packet, and only complete connection when a client replies with information that can only be obtained by receiving the challenge packet.
我們想要做的第一件事是,只允許客戶端在能夠證明自己確實位于所聲明的 IP 地址和端口時才允許連接。
為此,我們不再在連接請求時立即接受客戶端連接,而是先發送一個挑戰包,只有當客戶端回復一個只有通過接收挑戰包才能獲得的信息時,我們才完成連接。
The sequence of operations in a typical connect now looks like this:

To implement this we need an additional data structure on the server. Somewhere to store the challenge data for pending connections, so when a challenge response comes in from a client we can check against the corresponding entry in the data structure and make sure it’s a valid response to the challenge sent to that address.
While the pending connect data structure can be made larger than the maximum number of connected clients, it’s still ultimately finite and is therefore subject to attack. We’ll cover some defenses against this in the next article. But for the moment, be happy at least that attackers can’t progress to the connected state with spoofed packet source addresses.
為了實現這一點,我們需要在服務器上增加一個額外的數據結構,用于存儲待處理連接的挑戰數據。這樣,當來自客戶端的挑戰響應到達時,我們可以與數據結構中的相應條目進行比對,確保它是對該地址發送的挑戰的有效響應。
雖然待處理連接數據結構可以比最大連接數更大,但它畢竟是有限的,因此仍然容易受到攻擊。我們將在下一篇文章中討論一些防御方法。但目前,至少可以確保攻擊者無法通過偽造包源地址進入連接狀態。
Next, to guard against our protocol being used in a DDoS amplification attack, we’ll inflate client to server packets so they’re large relative to the response packet sent from the server. This means we add padding to both connection request and challenge response packets and enforce this padding on the server, ignoring any packets without it. Now our protocol effectively has DDoS minification for requests -> responses, making it highly unattractive for anyone thinking of launching this kind of attack.
Finally, we’ll do one last small thing to improve the robustness and security of the protocol. It’s not perfect, we need authentication and encryption for that, but at least it ups the ante, requiring attackers to actually sniff traffic in order to impersonate the client or server. We’ll add some unique random identifiers, or ‘salts’, to make each client connection unique from previous ones coming from the same IP address and port.
接下來,為了防止我們的協議被用于DDoS放大攻擊,我們將使客戶端到服務器的數據包變得相對于服務器發送的響應包較大。為此,我們在連接請求和挑戰響應數據包中添加填充,并在服務器端強制執行這一填充,忽略任何沒有填充的包。這樣,我們的協議實際上實現了DDoS請求->響應的最小化,從而使其對任何想要發動這種攻擊的人變得非常不具吸引力。
最后,我們再做一件小事來提高協議的健壯性和安全性。雖然這并不完美,仍然需要認證和加密來確保安全,但至少它增加了攻擊者的難度,要求他們必須嗅探流量才能偽裝成客戶端或服務器。我們將添加一些獨特的隨機標識符,或稱為“鹽”,使每個客戶端連接與來自相同IP地址和端口的先前連接不同。
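As a rough illustration of the padding rule, the client can pad every connection request and challenge response to a fixed size, and the server can ignore any handshake packet that is not exactly that size. The sizes and function names below are assumptions chosen for the example, not values from the article.

```cpp
#include <cstdint>
#include <cstring>

const int PaddedPacketBytes = 256;     // hypothetical size of connection request / challenge response
const int ChallengePacketBytes = 32;   // hypothetical size of the server's challenge packet

static_assert( PaddedPacketBytes > ChallengePacketBytes,
               "client to server handshake packets must be larger than the server's response" );

// Client side: copies the packet body into 'buffer' and zero pads it to the fixed size.
// Returns the number of bytes to send, or 0 if the body or the buffer has the wrong size.
int WritePaddedPacket( const uint8_t * body, int bodyBytes, uint8_t * buffer, int bufferBytes )
{
    if ( bodyBytes <= 0 || bodyBytes > PaddedPacketBytes || bufferBytes < PaddedPacketBytes )
        return 0;
    std::memset( buffer, 0, PaddedPacketBytes );
    std::memcpy( buffer, body, bodyBytes );
    return PaddedPacketBytes;
}

// Server side: handshake packets that were not padded to the expected size are ignored.
bool HasRequiredPadding( int packetBytes )
{
    return packetBytes == PaddedPacketBytes;
}
```

Because the server only ever answers a 256 byte packet with a 32 byte one in this sketch, a spoofed request costs the attacker more bandwidth than it inflicts on the victim.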
The connection request packet now looks like this:

The client salt in the packet is a random 64 bit integer rolled each time the client starts a new connect. Connection requests are now uniquely identified by the IP address and port combined with this client salt value. This distinguishes packets from the current connection from any packets belonging to a previous connection, which makes connection and reconnection to the server much more robust.
Now when a connection request arrives and a pending connection entry can’t be found in the data structure (according to IP, port and client salt) the server rolls a server salt and stores it with the rest of the data for the pending connection before sending a challenge packet back to the client. If a pending connection is found, the salt value stored in the data structure is used for the challenge. This way there is always a consistent pair of client and server salt values corresponding to each client session.
數據包中的客戶端鹽是每次客戶端啟動新連接時生成的一個隨機64位整數。連接請求現在通過IP地址和端口與該客戶端鹽值唯一標識。這將當前連接的數據包與任何屬于先前連接的數據包區分開來,從而使連接和重新連接到服務器變得更加穩健。
現在,當一個連接請求到達并且在數據結構中找不到對應的待處理連接條目(根據IP、端口和客戶端鹽值)時,服務器會生成一個服務器鹽,并將其與待處理連接的其余數據一起存儲,然后發送一個挑戰數據包回客戶端。如果找到了待處理連接,數據結構中存儲的鹽值將用于挑戰。通過這種方式,總是會有一對一致的客戶端和服務器鹽值與每個客戶端會話對應。
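The find-or-create behaviour for pending connections might look something like this. Everything here is a sketch under assumptions: the `PendingConnections` class, its fixed capacity and the `Address` stand-in are invented for the example, entries are never expired, and `std::mt19937_64` is used only for brevity (it is not a cryptographically secure generator, which a real server would want for salts).

```cpp
#include <cstdint>
#include <random>

// Minimal stand-in for the article's Address type (IPv4 only, for illustration).
struct Address
{
    uint32_t ip;
    uint16_t port;
    bool operator==( const Address & other ) const { return ip == other.ip && port == other.port; }
};

struct PendingConnection
{
    bool     active;
    Address  address;
    uint64_t clientSalt;
    uint64_t serverSalt;
};

const int MaxPendingConnections = 256;   // hypothetical capacity

class PendingConnections
{
public:
    // Looks up the entry for (address, client salt), creating it with a fresh server salt
    // if it does not exist yet. Returns false if the table is full.
    // On success 'serverSalt' is the value to put in the challenge packet.
    bool FindOrAdd( const Address & address, uint64_t clientSalt, uint64_t & serverSalt )
    {
        int freeIndex = -1;
        for ( int i = 0; i < MaxPendingConnections; ++i )
        {
            PendingConnection & entry = m_entries[i];
            if ( !entry.active )
            {
                if ( freeIndex == -1 )
                    freeIndex = i;
                continue;
            }
            if ( entry.address == address && entry.clientSalt == clientSalt )
            {
                serverSalt = entry.serverSalt;   // repeat request: reuse the stored salt
                return true;
            }
        }
        if ( freeIndex == -1 )
            return false;

        PendingConnection & entry = m_entries[freeIndex];
        entry.active = true;
        entry.address = address;
        entry.clientSalt = clientSalt;
        entry.serverSalt = m_random();           // roll a new server salt
        serverSalt = entry.serverSalt;
        ++m_numPending;
        return true;
    }

    int GetNumPending() const { return m_numPending; }

private:
    std::mt19937_64 m_random{ std::random_device{}() };   // NOT cryptographically secure
    PendingConnection m_entries[MaxPendingConnections] = {};
    int m_numPending = 0;
};
```

Repeated requests with the same address and client salt always receive the same challenge, while a reconnect from the same address with a new client salt gets its own entry.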

The client state machine has been expanded so connecting is replaced with two new states: sending connection request and sending challenge response, but it’s the same idea as before. Client states repeatedly send the packet corresponding to that state to the server while listening for the response that moves it forward to the next state, or back to an error state. If no response is received, the client times out and transitions to disconnected.
客戶端狀態機已經擴展,連接狀態被替換為兩個新狀態:發送連接請求和發送挑戰響應,但它與之前的概念相同。客戶端狀態會重復發送對應狀態的包到服務器,同時監聽響應,響應會將客戶端狀態推進到下一個狀態,或者返回到錯誤狀態。如果沒有收到響應,客戶端會超時并轉到斷開連接狀態。
The challenge response sent from the client to the server looks like this:

The utility of this is that once the client and server have established connection, we prefix all payload packets with the xor of the client and server salt values and discard any packets with the incorrect salt values. This neatly filters out packets from previous sessions and requires an attacker to sniff packets in order to impersonate a client or server.
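A minimal sketch of that salt prefix: the sender writes the xor of the two salts ahead of the payload, and the receiver drops anything whose prefix does not match the value agreed for this session. The function names and the 8 byte little-endian layout are assumptions for illustration; a real packet would also carry the CRC32 and packet type.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const size_t SaltBytes = 8;

// Value both sides can compute once the challenge / response exchange has completed.
uint64_t CombineSalts( uint64_t clientSalt, uint64_t serverSalt )
{
    return clientSalt ^ serverSalt;
}

// Writes [salt][payload] into 'buffer'. Returns total bytes written, or 0 if it does not fit.
size_t WritePayloadPacket( uint64_t connectionSalt, const uint8_t * payload, size_t payloadBytes,
                           uint8_t * buffer, size_t bufferBytes )
{
    if ( bufferBytes < SaltBytes + payloadBytes )
        return 0;
    for ( size_t i = 0; i < SaltBytes; ++i )
        buffer[i] = uint8_t( connectionSalt >> ( i * 8 ) );   // little-endian on the wire
    std::memcpy( buffer + SaltBytes, payload, payloadBytes );
    return SaltBytes + payloadBytes;
}

// Returns a pointer to the payload if the salt prefix matches, otherwise nullptr (packet discarded).
const uint8_t * ReadPayloadPacket( uint64_t expectedSalt, const uint8_t * packet, size_t packetBytes,
                                   size_t & payloadBytes )
{
    if ( packetBytes < SaltBytes )
        return nullptr;
    uint64_t salt = 0;
    for ( size_t i = 0; i < SaltBytes; ++i )
        salt |= uint64_t( packet[i] ) << ( i * 8 );
    if ( salt != expectedSalt )
        return nullptr;                                       // stale session or spoofed packet
    payloadBytes = packetBytes - SaltBytes;
    return packet + SaltBytes;
}
```

An attacker who has not seen both salts has to guess a 64 bit value, which is why impersonation now requires sniffing the traffic.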