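TCP is designed to transfer data reliably between two endpoints. Earlier we covered flow control: when the receiver cannot keep up with the data being sent to it, it shrinks the sender's window through the window size field of the segments it returns. Note that flow control differs from congestion control in that it is concerned with the two ends of the connection, not the path between them. Congestion control, introduced next, is concerned with the network between the two ends: the sender avoids injecting so much data that the network becomes congested.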

### Detection of Congestion in TCP

TCP interacts only with the peer host, so traditionally it has no way of knowing the state of the routers along the path.

In the Internet, this can be quite challenging, as there has traditionally been no explicit way for a sending TCP to learn about the state of the intermediate routers.

Congestion detection is usually accomplished by detecting that one or more packets have been lost.

### Slowing Down a TCP Sender

We can go a step further and arrange for the sender to slow down if either the receiver is too slow or the network is too slow. The sender's actual window W is the smaller of the congestion window (cwnd) and the receiver's advertised window (awnd):

$$W = \min(cwnd, awnd)$$

The total amount of data a sender has introduced into the network for which it has not yet received an acknowledgment is sometimes called the flight size.
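As a minimal sketch of how these quantities fit together, the function below computes how many more bytes a sender could inject right now. The function name and the idea of subtracting the flight size from W are illustrative assumptions, not taken from any particular TCP stack.

```python
def usable_window(cwnd: int, awnd: int, flight_size: int) -> int:
    """Bytes the sender may still inject (hypothetical helper).

    W = min(cwnd, awnd) bounds the data in flight, so the remaining
    room is W minus what is already outstanding, never below zero.
    """
    w = min(cwnd, awnd)
    return max(0, w - flight_size)
```

For example, with cwnd = 8000, awnd = 16000 and 3000 bytes in flight, the network-side limit applies and 5000 bytes may still be sent.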

### The Classic Algorithms

#### Slow Start

The purpose of slow start is to help TCP find a value for cwnd before probing for more available bandwidth using congestion avoidance and to establish the ACK clock.

TCP starts by sending a certain number of segments, called the initial window (IW). IW is typically one SMSS (sender maximum segment size; treating it as the MSS is fine here). RFC 5681 allows IW to start somewhat larger.

Thus, after one segment is ACKed, the cwnd value is ordinarily increased to 2, and two segments are sent. If each of those causes new good ACKs to be returned, 2 increases to 4, 4 to 8, and so on.
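The doubling described above can be sketched with a tiny simulation. It assumes an idealized path: no loss, no delayed ACKs, and exactly one good ACK per segment sent in each round trip. The function name is hypothetical.

```python
from typing import List


def slow_start_rounds(iw_segments: int, rounds: int) -> List[int]:
    """Return cwnd (in segments) at the start of each round trip."""
    cwnd = iw_segments
    history = [cwnd]
    for _ in range(rounds):
        acks = cwnd      # idealized: every segment sent is ACKed
        cwnd += acks     # each good ACK grows cwnd by one segment
        history.append(cwnd)
    return history


print(slow_start_rounds(1, 4))  # [1, 2, 4, 8, 16]
```

With delayed ACKs the receiver returns fewer ACKs per round, so real growth is slower than this pure doubling.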

#### Congestion Avoidance

Once slow start has found a working value for cwnd, there is always the possibility that more network capacity may become available for a connection.

Congestion avoidance seeks additional capacity by increasing cwnd by approximately one segment for each window’s worth of data that is moved from sender to receiver successfully.

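For each good ACK received, cwnd is updated as follows:

$$cwnd_{t+1}=cwnd_{t}+SMSS\times \frac{SMSS}{cwnd_{t}}$$

If $cwnd_{0}=k\times SMSS$ bytes were sent as $k$ segments, the first ACK therefore gives:

$$cwnd_{1}=cwnd_{0} + \frac{1}{k}\times SMSS$$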
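To see that this per-ACK rule adds up to roughly one segment per round trip, here is a small sketch. It assumes one ACK per full-sized segment and no loss; both function names are made up for illustration.

```python
def congestion_avoidance_ack(cwnd: float, smss: int) -> float:
    """Apply the per-ACK update: cwnd += SMSS * SMSS / cwnd."""
    return cwnd + smss * smss / cwnd


def congestion_avoidance_rtt(cwnd: float, smss: int) -> float:
    """Apply the update once for every segment in the current window."""
    acks = int(cwnd // smss)
    for _ in range(acks):
        cwnd = congestion_avoidance_ack(cwnd, smss)
    return cwnd
```

Starting from cwnd = 10 × SMSS with SMSS = 1000, the first ACK raises cwnd by exactly SMSS/10 = 100 bytes, and a full window of ACKs raises it by slightly less than one SMSS because each later increment divides by a slightly larger cwnd.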

The assumption of the algorithm is that packet loss caused by bit errors is very small (much less than 1%), and therefore the loss of a packet signals congestion somewhere in the network between the source and destination. If this assumption is false, which it sometimes is for wireless networks, TCP slows down even when no congestion is present

#### Selecting between Slow Start and Congestion Avoidance

When cwnd < ssthresh, slow start is used, and when cwnd > ssthresh, congestion avoidance is used. When they are equal, either can be used.

$$ssthresh=\max(\text{flight size}/2,\ 2\times SMSS)$$

Note: as mentioned earlier, flight size is the data that has been sent but not yet acknowledged.
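The selection rule and the ssthresh update can be sketched as two small functions. The names are hypothetical, and because either algorithm may be used when cwnd equals ssthresh, this sketch arbitrarily picks congestion avoidance in that case.

```python
def ssthresh_on_loss(flight_size: int, smss: int) -> int:
    """ssthresh = max(flight size / 2, 2 * SMSS), in bytes."""
    return max(flight_size // 2, 2 * smss)


def select_algorithm(cwnd: int, ssthresh: int) -> str:
    """Pick the growth algorithm for the current window."""
    if cwnd < ssthresh:
        return "slow_start"
    # cwnd == ssthresh: either is allowed; this sketch chooses
    # congestion avoidance (an assumption, not a requirement).
    return "congestion_avoidance"
```

The 2 × SMSS floor matters for small windows: with 3000 bytes in flight and SMSS = 1000, half the flight size would be 1500, so ssthresh is set to 2000 instead.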

#### Tahoe, Reno, and Fast Recovery

Tahoe was implemented by simply reducing cwnd to its starting value (1 SMSS at that time) upon any loss, forcing the connection to slow start until cwnd grew to the value ssthresh.

Any nonduplicate (“good”) ACK causes TCP to exit recovery and reduces the congestion window back to its pre-inflated value.

#### Standard TCP

RFC 5681 gives the basic TCP congestion control algorithms.

TCP begins a connection in slow start (cwnd = IW, the initial window) with a large value of ssthresh, generally at least the value of awnd.

1. When loss is detected, ssthresh is updated to no more than max(flight size/2, 2*SMSS); put more simply, it is set to about half of the previous cwnd.

Reducing the estimate of the optimal window size is accompanied by altering ssthresh to be about half of what the current window size is (but not ever below twice the SMSS).

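1. When fast retransmit is performed, cwnd = ssthresh + 3*SMSS.
2. cwnd is then increased by one SMSS for each additional duplicate ACK received.
3. When a good ACK arrives (a new ACK, not a duplicate), cwnd is reset to ssthresh, so that growth from then on is linear (congestion avoidance).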
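The fast retransmit and fast recovery steps above can be sketched as a small state holder. This is a simplified illustration, not a full sender: the class and function names are hypothetical, window growth outside recovery is left out, and the duplicate ACK threshold of three is assumed.

```python
from dataclasses import dataclass


@dataclass
class RenoState:
    cwnd: int
    ssthresh: int
    smss: int
    dupacks: int = 0
    in_recovery: bool = False


def on_dup_ack(s: RenoState, flight_size: int) -> bool:
    """Process one duplicate ACK; return True if it triggers fast retransmit."""
    s.dupacks += 1
    if s.in_recovery:
        s.cwnd += s.smss          # inflate the window per extra dupack
        return False
    if s.dupacks == 3:
        s.ssthresh = max(flight_size // 2, 2 * s.smss)
        s.cwnd = s.ssthresh + 3 * s.smss
        s.in_recovery = True
        return True               # caller retransmits the missing segment
    return False


def on_good_ack(s: RenoState) -> None:
    """Process a new (non-duplicate) ACK: deflate and leave recovery."""
    s.dupacks = 0
    if s.in_recovery:
        s.cwnd = s.ssthresh
        s.in_recovery = False
```

For example, with cwnd = 10000, SMSS = 1000 and 10000 bytes in flight, the third duplicate ACK sets ssthresh to 5000 and cwnd to 8000; a fourth raises cwnd to 9000; the next good ACK drops cwnd back to 5000.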

Slow start is always used in two cases: when a new connection is started, and when a retransmission timeout occurs.

### Evolution of the Standard Algorithms

Reno TCP still has room for improvement; in other words, it still has some problems.

#### New Reno

This number is larger than the previous highest ACK value seen (23801), but not enough to meet or exceed the recovery point (44801).

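Once one packet is recovered (i.e., successfully delivered and ACKed), a good ACK can be received at the sender that causes the temporary window inflation in fast recovery to be erased before all the packets that were lost have been retransmitted.

When Reno TCP encounters partial ACKs, it can shrink its window so far that TCP sits idle until the retransmission timer expires. To see why, recall that non-SACK TCP relies on duplicate ACKs. If the window becomes very small and no new data can be sent, the peer returns no further ACKs, the duplicate ACK threshold is never reached, and the sender can only wait for a retransmission timeout.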

This happens when, for each segment loss, Reno enters fast recovery, reduces its cwnd and aborts fast recovery on the receipt of a partial ACK. (A partial ACK is one which acknowledges some but not all of the outstanding segments.) After multiple such reductions, cwnd becomes so small that there will not be enough dupacks for fast recovery to occur and a timeout will be the only option left.

In New-Reno, partial ACKs do not take TCP out of Fast Recovery. Instead, partial ACKs received during Fast Recovery are treated as an indication that the packet immediately following the acknowledged packet in the sequence space has been lost, and should be retransmitted

This allows a TCP to continue sending one segment for each ACK it receives while recovering and reduces the occurrence of retransmission timeouts, especially when multiple packets are dropped in a single window of data
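A minimal sketch of how a NewReno sender might classify an ACK that arrives during fast recovery. The function name and the string labels are hypothetical; the recovery point is the highest sequence number that must be ACKed before recovery is considered complete, as in the 23801/44801 example above.

```python
def classify_ack(ack: int, highest_ack: int, recovery_point: int) -> str:
    """Classify an ACK number seen while in fast recovery."""
    if ack <= highest_ack:
        return "duplicate"   # acknowledges nothing new
    if ack < recovery_point:
        return "partial"     # new data ACKed, but recovery not complete:
                             # retransmit the next segment, stay in recovery
    return "full"            # meets or exceeds the recovery point: exit
```

With a previous highest ACK of 23801 and a recovery point of 44801, any ACK number strictly between the two is partial, so the sender stays in fast recovery and retransmits the segment that starts at that ACK number.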

NewReno is a popular variant of modern TCPs—it does not suffer from the problems of the original fast recovery and is significantly less complicated to implement than SACKs.

#### TCP Congestion Control with SACK

Because the window is inflated for each arriving ACK during fast recovery, with larger windows TCP typically is able to send some additional data after performing its retransmission.

#### Limited Transmit

When the window is small, there may not be enough packets in the network to trigger the fast retransmit/recovery algorithms when loss occurs, as these algorithms typically require three duplicate ACKs to be observed prior to initiation.

With limited transmit (RFC 3042), a TCP that has unsent data may send a new segment on each of the first two duplicate ACKs it receives. Doing this helps to keep at least a minimal number of packets in the network: enough so that fast retransmit can be triggered upon packet loss.
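As a rough sketch, the check below follows the limited transmit rule from RFC 3042: on each of the first two duplicate ACKs, new data may be sent if the receiver's window allows it and the outstanding data stays within cwnd plus two segments. The function and parameter names are assumptions for illustration, and cwnd itself is not changed.

```python
def may_send_limited_transmit(
    dupacks: int,
    flight_size: int,
    cwnd: int,
    smss: int,
    awnd_allows: bool,
    has_new_data: bool,
) -> bool:
    """Return True if one new segment may be sent on this duplicate ACK."""
    if dupacks not in (1, 2):
        return False  # the third dupack triggers fast retransmit instead
    if not (has_new_data and awnd_allows):
        return False
    # Outstanding data after sending must not exceed cwnd + 2 * SMSS.
    return flight_size + smss <= cwnd + 2 * smss
```

With a full window of four 1000-byte segments (cwnd = 4000, flight size = 4000), the first duplicate ACK lets a fifth segment go out, which in turn produces another ACK toward the threshold of three.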

#### Congestion Window Validation (CWV)

If all goes well, a sender never pauses, and it continues sending data and receiving ACKs from its peer. This continuous feedback enables it to keep a reasonably current (within one RTT) estimate of what cwnd and ssthresh should be

Furthermore, if the pause is sufficiently long, its last cwnd value may no longer be appropriate for the path and congestion state

RFC 2861 proposes an experimental mechanism called Congestion Window Validation (CWV): if the sender has not sent anything for some time, cwnd gradually decays, while ssthresh records what cwnd used to be.

Essentially, the sender’s current value of cwnd decays over a period of nonuse, and ssthresh maintains the “memory” of it prior to the initiation of the decay.

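CWV works roughly as follows. When a new segment is about to be sent, the sender measures how much time has passed since the last send. If that idle time exceeds one RTO, it does the following:

• ssthresh is set to max(ssthresh, (3/4)cwnd)
• cwnd is halved for each RTT of idle time, but never falls below one SMSS.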
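The idle-sender steps above can be sketched as follows. This follows the description in these notes rather than the exact text of RFC 2861; the function name is hypothetical, and times are in seconds.

```python
from typing import Tuple


def cwv_after_idle(
    cwnd: int, ssthresh: int, idle_time: float, rtt: float, rto: float, smss: int
) -> Tuple[int, int]:
    """Return (cwnd, ssthresh) after an idle period, per the notes above."""
    if idle_time <= rto:
        return cwnd, ssthresh            # not idle long enough: no change
    ssthresh = max(ssthresh, 3 * cwnd // 4)  # remember the old window
    halvings = int(idle_time // rtt)
    for _ in range(halvings):
        cwnd = max(cwnd // 2, smss)      # halve per idle RTT, floor at 1 SMSS
    return cwnd, ssthresh
```

For instance, a sender with cwnd = 16000 and ssthresh = 8000 that idles for three RTTs (longer than its RTO) resumes with cwnd = 2000 and ssthresh = 12000, so slow start can bring it back toward its old operating point quickly.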

The application-limited sender does have more data to send but has been unable to for some reason. This could be because the sending computer is busy doing other tasks, or because some mechanism or protocol layer below TCP is preventing data from being sent.

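In the application-limited case, CWV does the following:

• Let W_used be the amount of the window actually used
• ssthresh is set to max(ssthresh, (3/4)cwnd)
• cwnd is set to the average of cwnd and W_used

Linux uses CWV by default.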
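The application-limited adjustment is a short calculation, sketched here with a hypothetical function name and integer byte counts.

```python
from typing import Tuple


def cwv_app_limited(cwnd: int, ssthresh: int, w_used: int) -> Tuple[int, int]:
    """Return (cwnd, ssthresh) after an application-limited period."""
    ssthresh = max(ssthresh, 3 * cwnd // 4)  # keep a memory of the old cwnd
    cwnd = (cwnd + w_used) // 2              # average of cwnd and W_used
    return cwnd, ssthresh
```

With cwnd = 16000, ssthresh = 8000 and only 4000 bytes of the window actually used, the result is cwnd = 10000 and ssthresh = 12000.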

### Handling Spurious RTOs: the Eifel Response Algorithm

As we saw in Chapter 15, when TCP encounters a large delay spike, it can experience a retransmission timeout even if no packet has been lost.

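Once a timeout has been judged spurious, the Eifel response algorithm proceeds as follows:

• If the received ACK carries an ECN indication (ECN is covered later), stop here
• cwnd = flight size + min(bytes_acked, IW)
• ssthresh = pipe_prev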
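These steps can be sketched as one function. It assumes pipe_prev was saved when the timeout fired (RFC 4015 records it as max(flight size, ssthresh) at that moment); the function name and the boolean flag are illustrative.

```python
from typing import Optional, Tuple


def eifel_response(
    flight_size: int,
    bytes_acked: int,
    iw: int,
    pipe_prev: int,
    ack_has_ecn_echo: bool,
) -> Optional[Tuple[int, int]]:
    """Return the restored (cwnd, ssthresh), or None if the response is skipped."""
    if ack_has_ecn_echo:
        return None                      # real congestion signal: do not undo
    cwnd = flight_size + min(bytes_acked, iw)
    ssthresh = pipe_prev                 # value saved when the RTO fired
    return cwnd, ssthresh
```

For example, with 6000 bytes in flight, 2000 bytes newly ACKed, IW = 3000 and pipe_prev = 12000, the sender resumes with cwnd = 8000 and ssthresh = 12000 instead of restarting from a one-segment window.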

### Sharing Congestion State

The state that can be shared between connections to the same host includes RTT measurements (srtt and rttvar), an estimate of reordering, and the congestion control variables cwnd and ssthresh.

### Delay-Based Congestion Control

One clue that congestion may be forming is an increase in measured RTT as the sender injects more packets into the network.
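A toy illustration of that clue: treat the smallest RTT seen so far as the uncongested baseline and flag congestion when the latest sample exceeds it by more than some threshold. This is a hypothetical heuristic for illustration only, not any specific delay-based algorithm, and the threshold is an arbitrary assumption.

```python
from typing import Sequence


def queueing_delay(rtt_samples_ms: Sequence[int]) -> int:
    """Latest RTT minus the minimum RTT observed (both in milliseconds)."""
    base_rtt = min(rtt_samples_ms)
    return rtt_samples_ms[-1] - base_rtt


def congestion_suspected(rtt_samples_ms: Sequence[int], threshold_ms: int) -> bool:
    """True when the extra delay over the baseline exceeds the threshold."""
    return queueing_delay(rtt_samples_ms) > threshold_ms
```

With samples of 100, 102 and 150 ms, the estimated extra delay is 50 ms, which a 30 ms threshold would flag as a possible queue building up.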

### Active Queue Management and ECN

ECN signals congestion explicitly, giving routers a way to report that a link is congested:

These RFCs describe Explicit Congestion Notification (ECN), which is a way for routers to mark packets (by ensuring both of the ECN bits in the IP header are set) to indicate the onset of congestion.
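The marking can be sketched with simple bit operations. The ECN field occupies the two low-order bits of the former IPv4 TOS byte (the IPv6 Traffic Class is laid out the same way), with the codepoints defined in RFC 3168. The helper names are hypothetical, and the sketch ignores the header checksum update a real router would perform.

```python
ECN_MASK = 0b11
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_codepoint(tos_byte: int) -> str:
    """Name the ECN codepoint carried in a TOS / Traffic Class byte."""
    return ECN_NAMES[tos_byte & ECN_MASK]


def mark_congestion(tos_byte: int) -> int:
    """Set both ECN bits (CE) if the packet is ECN-capable."""
    if tos_byte & ECN_MASK == 0b00:
        return tos_byte      # Not-ECT: cannot be marked (a router drops instead)
    return tos_byte | ECN_MASK
```

An ECT(0) packet (low bits 10) becomes CE (low bits 11) after marking, while the upper six DSCP bits are left untouched.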
