Frequently Asked Questions (FAQ) for Performance Enhancing Proxies (PEPs)
Hints on How to Configure PEPs
June 27, 2005
Background on PEPs
How PEPs Improve Performance
Placement of PEPs within a Network
Network Configuration #1 – Reserved Capacity Guaranteed Between PEPs
Network Configuration #2 – Unreserved Capacity Between PEPs
Interaction with Network Encryptors
Interaction with GRE and other Tunneling Techniques
Interaction with End Systems
Interaction with Microsoft’s Implementation of TCP
Interaction with TDMA, DAMA and Other Dynamic Bandwidth Environments
Interaction with Striping Data across Multiple Subnetworks
Interaction with Multiple Secure Enclaves
Transport Layer - Send and Receive Window
Transport Layer - Data Rate (Rate control)
Transport Layer – SNACK Option
Transport Layer – Retransmission TimeOut (RTO) Parameters
Transport Layer – Changing Delayed Ack Frequency of the Receiver
Transport Layer – Time Stamps Option
Transport Layer – Compression Option
Techniques to Augment PEP Services
Documentation of TCP Performance Issues and PEPs within the IETF
1 Approved for Public Release; Distribution Unlimited
2 © 2005 The MITRE Corporation. All Rights Reserved
3 The research described in this paper was carried out for the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.
Background on PEPs
Back in the late 1990s there was considerable discussion of the performance issues associated with TCP-based applications traversing satellite environments. One popular mechanism to address these performance concerns is called a Performance Enhancing Proxy (PEP), also known as a transport layer proxy or gateway. This technique splits the TCP connection into three separate connections: 1) a standard TCP connection between the local application and the local PEP; 2) an advanced protocol to transmit data across the satellite or otherwise challenged link; 3) a standard TCP connection between the peer PEP and the peer application. This inter-PEP protocol is tuned and optimized to the characteristics of the satellite link. The advantage of this technique is that the end systems and applications are not modified in any way but still receive the performance-enhancing benefits of the PEP. Figure 1 illustrates the operation of a PEP.
Figure 1 Sample PEP Scenario
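The connection splitting shown in Figure 1 can be sketched as a minimal TCP relay. This is an illustrative stand-in only: a real PEP would run an optimized protocol on the WAN leg rather than plain TCP, and the function names here are invented for the sketch.

```python
# Minimal sketch of the connection-splitting idea behind a PEP: accept the
# application's TCP connection on the LAN side, open a separate connection
# toward the far side, and relay the byte stream in both directions.
import socket
import threading


def relay(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes one way until EOF, then half-close the destination."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    dst.shutdown(socket.SHUT_WR)


def splice(lan_conn: socket.socket, wan_addr) -> None:
    """Terminate the LAN connection locally and bridge it to a WAN leg."""
    wan_conn = socket.create_connection(wan_addr)
    # Reverse direction runs in a helper thread; forward direction here.
    t = threading.Thread(target=relay, args=(wan_conn, lan_conn))
    t.start()
    relay(lan_conn, wan_conn)
    t.join()
```

Because the LAN connection is terminated at the proxy, the end system completes its TCP handshake locally and never experiences the long round-trip time of the satellite hop directly.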
How PEPs Improve Performance
PEPs, operating at the transport layer, increase the performance of TCP-based applications whose native performance suffers due to characteristics of a link or subnetwork in the path. These PEPs do not modify the application protocol, so applications operate end-to-end; the technique is completely transparent to the applications. Technically, these gateways perform a technique called spoofing, in which they intercept a TCP connection in the middle and terminate it as if the gateway were the intended destination. PEPs, typically bracketing the satellite network, split a single TCP connection into three separate connections. The gateways communicate via standard TCP when talking to each of the end systems, while a third connection using an optimized rate-based protocol is used to transfer the data between them. This technique isolates the properties of satellite networks that degrade performance, preventing them from manifesting themselves to TCP. For example, corruption loss on the satellite network does not cause the transmission rate to be cut in half, and congestion loss on the terrestrial network does not cause data to be retransmitted across the satellite, consuming bandwidth that may be a scarce resource. It also allows the deployment of protocols that can be tuned to match the characteristics of the satellite link without affecting the end systems, while the default TCP parameters on those end systems remain appropriate for the terrestrial environment in which they operate.
Placement of PEPs within a Network
In general, PEPs should be placed as close to the satellite or otherwise challenged link as possible. As indicated above, PEPs operate best when they are the only entry point into a reserved capacity; in the best case, the PEP processes all traffic traversing the satellite link and only that traffic. If the PEP must be placed further from the satellite resource but still processes all data traversing it, the PEP needs the ability to classify which traffic will traverse the constrained resource.
Network Configuration #1 – Reserved Capacity Guaranteed Between PEPs
A rate-based emission strategy can be used by the PEP because it is typically in control of the inter-gateway network and is provided a guaranteed bandwidth capacity. Since the PEPs are the only entry point into the reserved capacity, they can ensure that network congestion is not possible. Because congestion is not possible for inter-gateway traffic, only flow control and a retransmission strategy need to be deployed. This is the preferred topology for PEPs.
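The rate-based emission strategy can be sketched as strict packet pacing against the reserved rate; the packet sizes and rate in the example are illustrative, not tied to any particular link.

```python
# Sketch of rate-based emission: with a reserved channel of known capacity,
# the PEP paces packet departures so the aggregate rate never exceeds the
# reservation, making congestion on the inter-gateway leg impossible by
# construction.
def emission_times(packet_sizes_bytes, rate_bps, start=0.0):
    """Return the scheduled departure time (seconds) of each packet."""
    times = []
    t = start
    for size in packet_sizes_bytes:
        times.append(t)
        t += (size * 8) / rate_bps  # serialization time at the reserved rate
    return times
```

For example, 1500-byte packets paced at 1 Mbps depart 12 ms apart, independent of any congestion signal.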
Network Configuration #2 – Unreserved Capacity Between PEPs
When the bandwidth between the PEPs is unreserved, unknown, or uncontrolled, network congestion is always possible. Therefore, the PEP should run some form of congestion-controlled packet emission policy. Whether this policy is the standard Van Jacobson (VJ) congestion control algorithm or the Vegas congestion control algorithm depends on the variance in the latency of the path between the two PEPs. Vegas should typically not be used if there is any real variance in round-trip times, since Vegas assumes that variation in round-trip times is due to queueing delay and will adjust its rate accordingly.
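The Vegas behavior referenced above can be sketched as a window-adjustment rule. This is a simplified illustration, not the SCPS implementation; the alpha/beta thresholds are the classic Vegas parameters with illustrative values.

```python
# Rough sketch of the Vegas idea: compare expected throughput (cwnd/base_rtt)
# with actual throughput (cwnd/current_rtt) and adjust the window before any
# loss occurs.
def vegas_adjust(cwnd, base_rtt, current_rtt, alpha=2.0, beta=4.0):
    """Return the new congestion window (in segments)."""
    expected = cwnd / base_rtt
    actual = cwnd / current_rtt
    diff = (expected - actual) * base_rtt  # estimated segments queued in path
    if diff < alpha:
        return cwnd + 1   # path underused: probe for more bandwidth
    if diff > beta:
        return cwnd - 1   # queue apparently building: back off proactively
    return cwnd
```

Note how a pure latency increase (e.g., a link-scheduling delay) inflates current_rtt and shrinks the window even though no queueing occurred, which is exactly the hazard with RTT variance that the paragraph above warns about.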
Interaction with Network Encryptors
Because the PEPs need to terminate TCP connections to/from the source/destination, they must
be placed in the network path between the end hosts and any network encryptor device (if
present). If the PEP is placed in the network so that it sees traffic before network layer
encryptors (e.g., IPSEC, TACLanes, etc.), it has the ability to terminate the TCP connections
and improve performance. However, the PEP needs to be aware of the encryptor in the network, in particular any overhead the encryptor adds. For example, some encryptors operate by encrypting the entire input IP packet and encapsulating it inside another packet. The PEP must be aware of this for the following reasons.
• Fragmentation Avoidance: IP fragmentation occurs when an IP packet is too large for the Maximum Transmission Unit (MTU) of the link to be traversed. When this occurs, the IP packet is divided into smaller packets to traverse the link, and these fragments are then reassembled at the end point, which in this case is the peer PEP. Fragmentation is generally undesirable for three reasons. First, the overhead of the fragmented packets increases the number of octets traversing the link. Second, if any fragment is lost (whether to congestion or corruption), the entire packet must be resent, not just the lost fragment. Third, reassembly requires additional queueing and timers at the receiver. Therefore fragmentation should be avoided; PEPs have the ability to decrease the Maximum Segment Size (MSS) to avoid any downstream fragmentation.
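The MSS clamp described in this bullet is a simple subtraction of every layer of downstream overhead from the path MTU. The overhead figures below are example values, not those of any specific encryptor.

```python
# Sketch of MSS clamping: subtract downstream encapsulation overhead from
# the path MTU so that no packet the PEP emits is ever fragmented.
def clamped_mss(path_mtu, ip_header=20, tcp_header=20, tunnel_overhead=0):
    """Largest TCP payload that fits in one unfragmented packet."""
    return path_mtu - tunnel_overhead - ip_header - tcp_header


# Example: a 1500-byte MTU with a hypothetical encryptor that adds a fresh
# 20-byte outer IP header plus 16 bytes of cipher overhead.
mss = clamped_mss(1500, tunnel_overhead=20 + 16)
```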
• Overhead Accountability: One common performance enhancement technique PEPs use is to disable congestion control and to emit data at or just below the line rate of the challenged resource. When downstream devices add overhead, the PEP may overdrive the network and cause self-congestion. With encryptors, the additional overhead of the IP encapsulation plus any encryption overhead needs to be accounted for. Additionally, some encryptors may prevent traffic analysis by padding all packets to their MTU size. The PEP must be made aware of, and must account for, the additional overhead incurred by the encryptor.
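The overhead accounting can be illustrated with a small helper: scale the payload emission rate down so the on-the-wire rate, after encapsulation and padding, still fits the link. The packet sizes and link rate are illustrative assumptions.

```python
# Sketch of overhead accountability: if a downstream encryptor encapsulates
# and pads each packet, the PEP's payload rate must be reduced so the
# resulting wire rate does not exceed the link rate.
def payload_rate(link_bps, payload_bytes, wire_bytes):
    """Payload emission rate (bps) that keeps the wire rate at link_bps."""
    return link_bps * payload_bytes / wire_bytes


# Example: 1400-byte payload packets padded/encapsulated to a full
# 1500-byte MTU on a 2 Mbps link.
safe_rate = payload_rate(2_000_000, 1400, 1500)
```

Emitting at the naive 2 Mbps here would put roughly 7% more traffic on the wire than the link can carry, producing exactly the self-congestion the bullet describes.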
[Figure legend: E = End System, P = Proxy, X = Encryptor, S = Satellite]
Sample Topology for Placement of Network Encryptors
Interaction with GRE and other Tunneling Techniques
As with network encryption devices, PEPs need to be aware of any network layer overhead that can augment the size of the packets they transmit; see 'Interaction with Network Encryptors' for the problems associated with fragmentation avoidance and overhead accountability. This also includes interaction with an MPLS cloud, since the MPLS ingress point adds an additional 4 octets of overhead. Moreover, PEPs and most other middleware boxes typically cannot be inserted into an MPLS cloud, because the Ethernet type is not that of IP but of MPLS, which may confuse these devices. When designing network architectures, PEPs should not be placed within a tunnel. From a theoretical perspective, it would be possible (assuming encryption is not present) for a PEP to walk down the levels of encapsulation to reach the transport layer header; when the PEP transmits packets, it would have to be aware of the tunnel and recreate the tunneling structure. Most PEPs that I am aware of do not support this feature.
Interaction with End Systems
The SCPS PEP uses a superset of the TCP protocol and thus uses TCP options, properly registered with IANA, to negotiate the SCPS features. The TCP standard states that if an option is well formed but not implemented, that option should be ignored. Therefore, if a TCP implementation receives a TCP SYN with the SCPS option enabled but does not understand the SCPS options, it must ignore the option when responding with a SYN-ACK, and the SCPS features will simply not be used. Unfortunately, some operating systems, Microsoft Windows variants in particular, improperly handle TCP options they do not understand: these implementations will not establish a connection if the TCP SYN contains the SCPS option. To work around this bug, PEPs are typically configured not to offer the SCPS option on the LAN side. If one PEP goes down, however, the peer PEP will be talking directly to the end system. When this occurs, the PEP should try a few times to establish the connection with the SCPS option set on the SYN and, if that fails, try to establish a connection without the SCPS option present. The PEP should also cache this information locally, so that the next time it attempts to establish a connection it will not use the SCPS options if they were previously not accepted. For a period of time, then, all connections to that destination will omit the SCPS option, speeding up connection establishment.
Interaction with Microsoft’s Implementation of TCP
To date, most if not all Microsoft implementations of TCP have a software bug in the processing of TCP options that can prevent some interactions with an SCPS-based PEP.
Most implementations of TCP are developed to be flexible as TCP matures. In particular, when new TCP options are developed (e.g., window scaling, SACK, etc.), existing implementations must be able to handle them; by this I mean properly negotiate the use or non-use of that option. According to RFC 1122, entitled "Requirements for Internet Hosts -- Communication Layers":
4.2.2.5 TCP Options: RFC-793 Section 3.1
A TCP MUST be able to receive a TCP option in any segment. A TCP
MUST ignore without error any TCP option it does not implement, assuming
that the option has a length field (all TCP options defined in the future will
have length fields). TCP MUST be prepared to handle an illegal option
length (e.g., zero) without crashing; a suggested procedure is to reset the
connection and log the reason.
Therefore if a TCP implementation receives a properly formatted TCP option that it does not
implement it MUST simply ignore that option.
It appears that for options that Microsoft’s implementation of TCP understands, it will properly negotiate the use or non-use of that option; for example, it properly negotiates the use of timestamps and window scaling as defined in RFC 1323. However, it appears that most Microsoft implementations of TCP will NOT accept TCP connections if they contain TCP options that the OS does not understand. This violates the 'Requirements for Internet Hosts -- Communication Layers' requirement quoted above. Not only are the SCPS options well formed, but they are also registered with IANA. See http://www.iana.org/assignments/tcp-parameters - KIND
To overcome this problem in Microsoft’s TCP implementation, SCPS-based PEPs need to be configurable to omit the SCPS TCP option when communicating with systems on the LAN side.
Interaction with TDMA, DAMA and Other Dynamic Bandwidth Environments
In some environments, in particular DAMA- or TDMA-based environments, the bandwidth of a network is dynamically shared among the active nodes based upon various parameters (e.g., queue size). In addition to sharing the link, the mechanisms required to schedule and fill 'slots' will also affect the overall bandwidth presented to each node. This environment breaks the typical deployment of PEPs, where the bandwidth of the system from the perspective of the PEPs is assumed to be static and known a priori. A couple of techniques could be used to address this problem.
1. If the bandwidth of the system does not change frequently and a feedback loop from
the modem to the PEP exists, then the PEP could be notified about the changes in
capacity. These changes could either be in the form of queue availability at the
constrained resource (flow control between the modem and the PEP) or current
bandwidth of the constrained resource. It should be noted that if encryption devices
are present between the PEP and the challenged resource, then this may not be possible.
2. Congestion control can be used by the PEP: either TCP’s Van Jacobson congestion control algorithm, which is reactive, or SCPS Vegas’s congestion control algorithm, which is proactive.
3. A combination of both a rate based system and a congestion control algorithm could be
deployed. Rate control would provide a minimum guarantee and a maximum possible
value and a congestion control algorithm such as Vegas could be used to find an
appropriate operating point between these two values.
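Option 3 above can be sketched as a simple clamp: a congestion control loop proposes a rate, and the channel's guaranteed minimum and maximum possible bandwidth bound it. The rate figures in the example are illustrative.

```python
# Sketch of combining rate control with congestion control: the congestion
# controller (e.g., Vegas) picks an operating point, and the rate bounds
# describe the dynamic channel's envelope.
def clamp_rate(cc_rate_bps, min_rate_bps, max_rate_bps):
    """Bound the congestion-controlled rate by the channel's envelope."""
    return max(min_rate_bps, min(cc_rate_bps, max_rate_bps))
```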
Interaction with Striping Data across Multiple Subnetworks
The concept of striping involves sending data across multiple subnetworks to increase the overall throughput of a network. When striping occurs, typically a node on each side of the multiple subnetworks ensures that all data from a single connection traverses only a single subnetwork. This decreases the probability of extreme packet misordering, which has a tremendous effect on overall throughput. In some environments, however, data from a single TCP connection may be striped over multiple subnetworks (e.g., multiple low-bandwidth RF links) to improve application performance. Striping a single connection, depending on the characteristics of the individual links, will result in data arriving (sometimes massively) out of order. This may confuse the retransmission policy of the PEP. If this occurs, additional logic needs to be added to the PEP to hold back on retransmitting packets that may simply be arriving late. This additional logic may operate at the receiver requesting a retransmission or at the transmitter retransmitting data to fill the hole.
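The hold-back logic can be sketched as a simple timer check: a hole is only treated as a loss once it has outlived the worst-case reordering skew between the striped subnetworks. The skew value is an assumed deployment parameter, not a standard constant.

```python
# Sketch of reordering-tolerant retransmission for striped links: before
# retransmitting a "missing" segment, wait out the maximum plausible delay
# difference between subnetworks, since the segment may simply be arriving
# late on a slower link.
def should_retransmit(now, first_reported_missing, reorder_skew=0.25):
    """Retransmit only once the hole has outlived plausible reordering."""
    return (now - first_reported_missing) > reorder_skew
```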
[Figure legend: E = End System, P = Proxy, S = Subnetwork]
Sample Topology for Striping Data across Multiple Subnetworks
Interaction with Multiple Secure Enclaves
A single secure site is able to use an insecure commercial satellite resource for its secure data communication needs and is still able to use a gateway for performance improvements. A following step might allow multiple sites with different security classifications to share the common satellite resource; each secure site would then have a pair of gateways and encryptors for secure data transmission. However, since PEPs typically rely strictly on rate control as the means to emit data through the network, there is a possibility for congestion-based loss to occur across the satellite network. Therefore, the following options exist:
1. Each pair of gateways is configured to rate-control traffic onto the channel at the full bandwidth of the satellite link. Network congestion is then possible whenever more than one pair of gateways is active.
2. Each pair of gateways is configured a priori to rate-control traffic onto the channel at a prescribed percentage of the satellite’s bandwidth. Network congestion will not occur. However, when only a single pair of gateways is actively communicating over the link, performance is non-optimal, since the idle capacity cannot be used.
Another option would allow each gateway to provide some indirect feedback to the other gateways, allowing some form of coordination. This approach augments the pure rate control mechanism with the Vegas congestion control algorithm: the gateways use the Vegas algorithm for emitting data on the satellite network, but a rate control mechanism is also used as a stop-gap to ensure a single set of gateways will not cause congestion loss.
[Figure legend: E = End System, P = Proxy, X = Encryptor]
Sample Topology for Multiple Secure Enclaves
Transport Layer - Send and Receive Window
Flow control is used to ensure that the sender does not overwhelm the receiver’s buffers. In actuality, both the TCP sender and receiver have buffers. The sender uses its buffer as a retransmission buffer (i.e., to store data for possible retransmission), while the receiver uses its buffer as an out-of-sequence queue (i.e., to save packets received from the sender that are not in sequence, keeping the sender from retransmitting them unnecessarily). Therefore, if the buffer sizes differ, only the minimum of the two is used for flow control.

When looking at flow control within TCP, it is quite useful to examine the Bandwidth-Delay Product (BDP). The BDP relates the bandwidth of a network and the round-trip delay through the network to establish the minimum TCP window size required under ideal conditions (i.e., no segment loss or network queuing causing the round-trip time to increase). If the bandwidth of the network (i.e., the slowest link in the path) is 1 Mbps and the round-trip time through the entire network is 600 milliseconds (typical for satellites operating in geosynchronous orbit), the minimum TCP buffer size needs to be 75,000 bytes.

In general, the transport layer window sizes for the WAN-side connection should be set somewhere between one and two times the bandwidth-delay product. As a rule of thumb, absent other information, set the send and receive windows to twice the bandwidth-delay product. If the window size is greater than 65,536 bytes, window scaling must be enabled. The default transport layer window sizes for most operating systems range from 8K to 32K, so the windows for the LAN-side connection should be set at 32K.
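The window-sizing arithmetic above can be captured in a small helper; the 1 Mbps / 600 ms example from the text yields 75,000 bytes, and the rule of thumb doubles it.

```python
# Bandwidth-delay product and the window-sizing rule of thumb from the text.
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes."""
    return int(bandwidth_bps * rtt_seconds / 8)


def recommended_window(bandwidth_bps, rtt_seconds, factor=2):
    """Window size per the 2x-BDP rule, plus whether scaling is required."""
    window = factor * bdp_bytes(bandwidth_bps, rtt_seconds)
    needs_scaling = window > 65536  # window scaling needed beyond 64K
    return window, needs_scaling
```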
Transport Layer - Data Rate (Rate control)
Rate control bounds the maximum rate at which data is emitted from the PEP. This allows the PEP to emit data at or just below the line rate of the challenged resource. In general, the value should be set to the minimum of the maximum rates on the WAN side. Note that other factors may further limit how quickly data is clocked out:

• Flow control via the transport layer send and receive windows may further limit the traffic.
• A congestion control technique (e.g., VJ or Vegas), if enabled, may further limit how data is emitted.
It is important to note that, as discussed under 'Overhead Accountability' above, downstream devices that increase packet size (IP encapsulation, encryption overhead, padding to the MTU) can cause the PEP to overdrive the network and self-congest; the configured rate must account for that overhead.
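How these limits combine can be illustrated with a small helper: the effective emission rate is the minimum of the configured rate cap, what flow control permits (window divided by RTT), and what congestion control currently allows. The parameter names are invented for the sketch.

```python
# Sketch of how the rate cap, flow control, and congestion control combine:
# the slowest of the three governs the actual emission rate.
def effective_rate(rate_cap_bps, window_bytes, rtt_s, cc_rate_bps=float("inf")):
    """Effective send rate in bits per second."""
    flow_limit_bps = window_bytes * 8 / rtt_s  # window drained once per RTT
    return min(rate_cap_bps, flow_limit_bps, cc_rate_bps)
```

For example, with a 2 Mbps rate cap but only a 75,000-byte window over a 600 ms path, flow control (about 1 Mbps) is the binding limit.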
Transport Layer – SNACK Option
The Selective Negative Acknowledgment (SNACK) option allows the receiver to inform the SCPS-TP sender about one or more holes in the receiver’s out-of-sequence queue. Without a selective acknowledgment, TCP can use the ACK number to identify at most a single hole in the receiver’s buffer. Using its simple cumulative acknowledgment and the Fast Retransmit algorithm, TCP can recover efficiently from a single loss per window. However, because new data must be received for the receiver to advance the ACK number, TCP requires a minimum of one RTT to signal each additional hole in the out-of-sequence queue. The SNACK option, which is carried on an acknowledgment segment, identifies multiple holes in the sequence space buffered by the receiver. By providing more information about lost segments more quickly, the SNACK option can hasten recovery and prevent the sender from becoming window-limited, rather than letting the pipe drain while waiting to learn about lost segments. The ability to transmit continuously in the presence of packet loss is especially important when loss is caused by corruption rather than congestion. In such a case, when it has been deemed appropriate to disable congestion control as the response to loss, SNACK is of particular benefit in keeping the pipe full and allowing transmission to continue at full throttle while recovering from losses. In general, this option should be enabled unless special network characteristics warrant its non-use.
The following describes the differences between SNACK and SACK. SACK is used by the receiver to tell the sender about its out-of-sequence queue (i.e., missing packets); a SACK block indicates which packets have been received properly. Although the original RFC explicitly states that the sender may not use this information to conclude which packets have not been received (rather, the sender may use it to determine which packets have left the network and adjust the congestion window appropriately), most modern implementations make that assumption anyway. Given this newer interpretation of SACK, SACK and SNACK actually perform similar functions. SNACK, however, is more bit-efficient and can more easily indicate multiple holes in the out-of-sequence queue.
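The relationship between the two options can be made concrete: given a cumulative ACK and the received SACK blocks, a sender can derive exactly the holes that a SNACK would have reported directly. The sequence numbers in the test are illustrative.

```python
# Sketch of the SACK/SNACK duality: SACK reports the received ranges, from
# which the gaps (what SNACK reports) follow immediately.
def holes_from_sack(ack, sack_blocks):
    """Return [(start, end)] gaps implied by the cumulative ACK and
    a list of (start, end) SACK blocks."""
    holes = []
    left = ack
    for start, end in sorted(sack_blocks):
        if start > left:
            holes.append((left, start))
        left = max(left, end)
    return holes
```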
Transport Layer – Retransmission TimeOut (RTO) Parameters
TCP provides a reliable transport layer. One of the ways it provides reliability is for each end to acknowledge the data it receives from the other end. But data segments and acknowledgments can get lost. TCP handles this by setting a timeout when it sends data; if the data is not acknowledged when the timeout expires, it retransmits the data. A critical element of any implementation is the timeout and retransmission strategy: how is the timeout interval determined, and how frequently does a retransmission occur? Three RTO parameters control how frequently retransmission may occur. First, there is the Initial Retransmission TimeOut (IRTO). Next, there is the Minimum Retransmission TimeOut (MIN_RTO), which provides a floor on how soon the retransmission timeout may occur. Finally, there is the Maximum Retransmission TimeOut (MAX_RTO), which provides a ceiling on the interval between successive timeouts.
These values may be changed based on knowledge of the challenged link. For example, due to certain characteristics of the challenged link (such as scheduling of the resource), the minimum round-trip time (e.g., as measured by ping) may be 5 seconds. In that case you may want to set the Initial RTO to 10 seconds, and the Minimum RTO to 6 seconds, since the round-trip time must be greater than 5 seconds. The Maximum RTO may also be changed based on knowledge of the resource.
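The three parameters and the 5-second-RTT example can be sketched as follows; the specific values mirror the example above and are deployment assumptions, not defaults of any PEP product.

```python
# Sketch of the three RTO parameters: IRTO seeds the timeout before any RTT
# sample exists, and every computed RTO is clamped into [MIN_RTO, MAX_RTO].
IRTO = 10.0     # initial RTO (s): twice the ~5 s minimum RTT of the link
MIN_RTO = 6.0   # floor (s): RTT can never be below ~5 s on this link
MAX_RTO = 60.0  # ceiling (s) between successive timeouts


def next_rto(computed_rto=None):
    """Return the RTO to use: IRTO before any sample, otherwise clamped."""
    if computed_rto is None:
        return IRTO
    return min(MAX_RTO, max(MIN_RTO, computed_rto))
```

Raising the floor this way prevents spurious retransmissions that would waste scarce link capacity when the estimator briefly underestimates the RTT.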