From 9ca8dbcc65cfc63d6f5ef3312a33184e1d726e00 Mon Sep 17 00:00:00 2001 From: Yunhong Jiang Date: Tue, 4 Aug 2015 12:17:53 -0700 Subject: Add the rt linux 4.1.3-rt3 as base Import the rt linux 4.1.3-rt3 as OPNFV kvm base. It's from git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-4.1.y-rt and the base is: commit 0917f823c59692d751951bf5ea699a2d1e2f26a2 Author: Sebastian Andrzej Siewior Date: Sat Jul 25 12:13:34 2015 +0200 Prepare v4.1.3-rt3 Signed-off-by: Sebastian Andrzej Siewior We lose all the git history this way and it's not good. We should apply another opnfv project repo in future. Change-Id: I87543d81c9df70d99c5001fbdf646b202c19f423 Signed-off-by: Yunhong Jiang --- kernel/Documentation/networking/tcp.txt | 106 ++++++++++++++++++++++++++++++++ 1 file changed, 106 insertions(+) create mode 100644 kernel/Documentation/networking/tcp.txt (limited to 'kernel/Documentation/networking/tcp.txt') diff --git a/kernel/Documentation/networking/tcp.txt b/kernel/Documentation/networking/tcp.txt new file mode 100644 index 000000000..bdc4c0db5 --- /dev/null +++ b/kernel/Documentation/networking/tcp.txt @@ -0,0 +1,106 @@ +TCP protocol +============ + +Last updated: 9 February 2008 + +Contents +======== + +- Congestion control +- How the new TCP output machine [nyi] works + +Congestion control +================== + +The following variables are used in the tcp_sock for congestion control: +snd_cwnd The size of the congestion window +snd_ssthresh Slow start threshold. We are in slow start if + snd_cwnd is less than this. +snd_cwnd_cnt A counter used to slow down the rate of increase + once we exceed slow start threshold. +snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to. +snd_cwnd_stamp Timestamp for when congestion window last validated. +snd_cwnd_used Used as a highwater mark for how much of the + congestion window is in use. It is used to adjust + snd_cwnd down when the link is limited by the + application rather than the network. + +As of 2.6.13, Linux supports pluggable congestion control algorithms. +A congestion control mechanism can be registered through functions in +tcp_cong.c. The functions used by the congestion control mechanism are +registered via passing a tcp_congestion_ops struct to +tcp_register_congestion_control. As a minimum name, ssthresh, +cong_avoid must be valid. + +Private data for a congestion control mechanism is stored in tp->ca_priv. +tcp_ca(tp) returns a pointer to this space. This is preallocated space - it +is important to check the size of your private data will fit this space, or +alternatively space could be allocated elsewhere and a pointer to it could +be stored here. + +There are three kinds of congestion control algorithms currently: The +simplest ones are derived from TCP reno (highspeed, scalable) and just +provide an alternative the congestion window calculation. More complex +ones like BIC try to look at other events to provide better +heuristics. There are also round trip time based algorithms like +Vegas and Westwood+. + +Good TCP congestion control is a complex problem because the algorithm +needs to maintain fairness and performance. Please review current +research and RFC's before developing new modules. + +The method that is used to determine which congestion control mechanism is +determined by the setting of the sysctl net.ipv4.tcp_congestion_control. +The default congestion control will be the last one registered (LIFO); +so if you built everything as modules, the default will be reno. If you +build with the defaults from Kconfig, then CUBIC will be builtin (not a +module) and it will end up the default. + +If you really want a particular default value then you will need +to set it with the sysctl. If you use a sysctl, the module will be autoloaded +if needed and you will get the expected protocol. If you ask for an +unknown congestion method, then the sysctl attempt will fail. + +If you remove a tcp congestion control module, then you will get the next +available one. Since reno cannot be built as a module, and cannot be +deleted, it will always be available. + +How the new TCP output machine [nyi] works. +=========================================== + +Data is kept on a single queue. The skb->users flag tells us if the frame is +one that has been queued already. To add a frame we throw it on the end. Ack +walks down the list from the start. + +We keep a set of control flags + + + sk->tcp_pend_event + + TCP_PEND_ACK Ack needed + TCP_ACK_NOW Needed now + TCP_WINDOW Window update check + TCP_WINZERO Zero probing + + + sk->transmit_queue The transmission frame begin + sk->transmit_new First new frame pointer + sk->transmit_end Where to add frames + + sk->tcp_last_tx_ack Last ack seen + sk->tcp_dup_ack Dup ack count for fast retransmit + + +Frames are queued for output by tcp_write. We do our best to send the frames +off immediately if possible, but otherwise queue and compute the body +checksum in the copy. + +When a write is done we try to clear any pending events and piggy back them. +If the window is full we queue full sized frames. On the first timeout in +zero window we split this. + +On a timer we walk the retransmit list to send any retransmits, update the +backoff timers etc. A change of route table stamp causes a change of header +and recompute. We add any new tcp level headers and refinish the checksum +before sending. + -- cgit 1.2.3-korg