ProxySG Performance Webcast
WEBCAST
PAUL KAO
Director Product Management
[email protected]
December 16, 2014
AGENDA
ProxySG Overview
Architecture (SGOS, CW, SW, Policy checkpoints)
System resources/metrics
Performance Model
Factors Impacting Performance
Authentication, ICAP, Policy, SSL, misc.
PROXYSG OVERVIEW
Copyright 2014
2013 Blue Coat Systems Inc. All Rights Reserved.
SGOS OVERVIEW
SGOS ARCHITECTURE
POLICY CHECKPOINTS
server_url.domain=
client.address=
http.response.apparent_data_type=
set(response.header.Set-Cookie, "x")
PROXYSG APPLIANCE
PHYSICAL RESOURCES
Core appliance resources are:
CPU, Memory, Disk, Network Interface
CPU
No CPU throttling: the appliance continues to handle more load until it
reaches its CPU limit (assuming other resources are available). At that
point, requests take longer to process and transaction times increase.
Memory
Threshold Monitor (TM) engages at 80% memory pressure and goes into
regulation, which limits HTTP acceptance to reduce the rate at which new
incoming connections are processed.
Disk
At high disk utilization, back-off mechanisms engage to maintain
throughput at the expense of cache efficiency (disk reads/writes)
Network Interface
Triggers an event log entry if the network interface is saturated (TCP livelock)
Blue Coat Confidential
Licensed Client IP
A soft limit on HW appliances
A hard limit on Virtual appliances
Appliance performance is constrained by the number of available
HTTP/TCP-Tunnel Client Workers (CW) for processing
Each appliance model has its own CW limit
PERFORMANCE MODEL
1. Client
2. Network deployment
3. Authentication mode
4. DNS, content filtering
5. ICAP REQMOD (DLP)
6. ICAP RESPMOD (CAS/AV)
7. System services
8. Policy
9. SSL
PERFORMANCE FACTORS
1. CLIENT
1. CLIENT SIDE
Client to SG connection (client side)
Limited by HTTP/TCP-Tunnel CW
User (client IP) is not an enforced metric; it is a model used for sizing
CW limit does not include other TCP sessions (auth, ICAP, bypass, ...)
Don't confuse the TCP-Tunnel proxy CW limit with the TCP connection limit!
S-Series hardware
S-Series sizing models 5 connections per user (user = unique client IP)
Model      Users    Max CW
S200-10      400     2,000
S200-20    1,200     6,000
S200-30    2,600    13,000
S200-40    5,000    25,000
S400-20    6,000    30,000
S400-30, S400-40, S500-10, S500-20: 14,000, 25,000, 30,000, 50,000 (row not identifiable in the source)
Examples:
Financial trader, 50 conns per user
Kiosk, 1 connection per user
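The sizing rule on this slide (connections per user multiplied by users, compared against the CW limit) can be sketched as a quick calculation. This is a minimal illustration, not a Blue Coat sizing tool: the function names are hypothetical, and the only figures taken from the slide are the default of 5 connections per user, the 400-user / 2,000-CW pairing, and the trader and kiosk examples.

```python
def required_client_workers(users: int, conns_per_user: int = 5) -> int:
    """Estimate concurrent client-side connections (CW) for a user population."""
    if users < 0 or conns_per_user < 0:
        raise ValueError("users and conns_per_user must be non-negative")
    return users * conns_per_user


def supported_users(max_cw: int, conns_per_user: int = 5) -> int:
    """Estimate how many users fit under a CW limit at a given per-user rate."""
    if conns_per_user <= 0:
        raise ValueError("conns_per_user must be positive")
    return max_cw // conns_per_user


# 400 users at the default 5 connections per user
default_cw = required_client_workers(400)

# The same 2,000-CW limit with the two example user profiles from the slide
trader_users = supported_users(2000, conns_per_user=50)
kiosk_users = supported_users(2000, conns_per_user=1)
```

Under these assumptions, a 2,000-CW limit covers 400 default users, 40 financial traders, or 2,000 kiosks, which is why the user count is a sizing model rather than an enforced metric.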
PERFORMANCE FACTORS
2. NETWORK DEPLOYMENT
Network 101
Link/duplex settings
WCCP
GRE vs L2
Set MTU appropriately to avoid fragmentation with GRE
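To see why MTU matters with GRE, the encapsulation overhead can be subtracted from the link MTU. The sketch below assumes plain GRE over IPv4 with no optional fields (20-byte outer IPv4 header plus 4-byte GRE header). Any additional bytes added by WCCP redirection are deliberately left as a parameter, `extra_header_bytes`, rather than asserted here. All function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20      # outer IPv4 header without options
GRE_BASE_HEADER_BYTES = 4   # GRE header without key/checksum/sequence fields


def inner_mtu(link_mtu: int, extra_header_bytes: int = 0) -> int:
    """Largest inner packet that fits in one outer packet after encapsulation."""
    overhead = IPV4_HEADER_BYTES + GRE_BASE_HEADER_BYTES + extra_header_bytes
    if link_mtu <= overhead:
        raise ValueError("link MTU is smaller than the encapsulation overhead")
    return link_mtu - overhead


def needs_fragmentation(packet_bytes: int, link_mtu: int,
                        extra_header_bytes: int = 0) -> bool:
    """True if an inner packet of this size no longer fits once encapsulated."""
    return packet_bytes > inner_mtu(link_mtu, extra_header_bytes)


# A full-size 1500-byte packet on a 1500-byte link does not fit after GRE
full_size_fragments = needs_fragmentation(1500, 1500)
```

With these assumptions, a 1500-byte link leaves 1476 bytes for the inner packet, so full-size packets are fragmented unless the MTU is adjusted on one side.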
PERFORMANCE FACTORS
3. AUTHENTICATION MODE
3. AUTHENTICATION
Evaluated at Client In (CI)
Choice of Authentication mode can impact performance
Explicit proxy with NTLM: SG issues a 407 challenge for each connection
IP Surrogate: after initial authentication, the authentication cache is used
Kerberos: credentials validated without needing to contact the DC
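The difference between challenging every connection and using an IP surrogate can be illustrated with a toy count. This is a simplified model, not ProxySG behavior in detail: it assumes a surrogate is valid for a fixed time after the challenge that created it, and the 900-second value and function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Connection = Tuple[float, str]  # (arrival time in seconds, client IP)


def challenges_per_connection(connections: Iterable[Connection]) -> int:
    """Every new connection is challenged."""
    return sum(1 for _ in connections)


def challenges_with_ip_surrogate(connections: Iterable[Connection],
                                 ttl_seconds: float) -> int:
    """A client IP is challenged only when it has no unexpired surrogate.

    Connections must be supplied in arrival order.
    """
    expires_at: Dict[str, float] = {}
    challenges = 0
    for arrival, client_ip in connections:
        if arrival >= expires_at.get(client_ip, float("-inf")):
            challenges += 1
            expires_at[client_ip] = arrival + ttl_seconds
    return challenges


# Illustrative traffic: two clients, five connections
sample = [(0, "10.0.0.1"), (1, "10.0.0.1"), (2, "10.0.0.2"),
          (900, "10.0.0.1"), (901, "10.0.0.1")]
per_connection = challenges_per_connection(sample)
with_surrogate = challenges_with_ip_surrogate(sample, ttl_seconds=900)
```

In this sample, per-connection authentication challenges all five connections, while the surrogate model challenges three. The gap widens as the number of connections per client grows.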
PERFORMANCE FACTORS
4. DNS, CONTENT FILTERING
DNS
Not a high consumer of CPU, but can be a cause of latency
If external DNS servers are slow/overloaded, the proxy will amplify the
problem
Use caution with policies/logging that trigger reverse DNS (RDNS) lookups
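Because DNS shows up as latency rather than CPU, it helps to measure lookup time directly. The sketch below times a single lookup from a test host using Python's standard `socket` module. It measures the test host's resolver path, not the ProxySG's, and the function name `timed_lookup` is hypothetical.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def timed_lookup(name: str,
                 resolver: Callable[[str], object] = socket.gethostbyname
                 ) -> Tuple[Optional[object], float]:
    """Run one lookup and return (result, elapsed milliseconds).

    The result is None if the resolver raises OSError, which covers
    socket.gaierror and socket.herror.
    """
    start = time.perf_counter()
    try:
        result = resolver(name)
    except OSError:
        result = None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

For a reverse lookup, `socket.gethostbyaddr` can be passed as the resolver, for example `timed_lookup("192.0.2.10", resolver=socket.gethostbyaddr)`. Repeating the measurement during the busy hour gives a rough view of how much latency the DNS servers contribute.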
PERFORMANCE FACTORS
5. ICAP REQMOD (DLP)
PERFORMANCE FACTORS
6. ICAP RESPMOD (CAS/AV)
6. ICAP RESPMOD (CONTENT ANALYSIS)
Evaluated at Server In (SI)
Higher cost due to volume of incoming request data
For ICAP RESPMOD, cache to disk for performance (no need to return the
payload when the response is 204 No Modification)
Infinite Streams
ICAP deferred connections
ICAP mirroring (SG6.5)
Secure ICAP
SSL cost is in the initial connection setup
SSL overhead of bulk encryption is low
PERFORMANCE FACTORS
7. SYSTEM SERVICES
Access logging
Log entry written when connection is complete
A few percent overhead when enabled
More overhead if multiple log facilities are in use
Health Checks
SNMP
Attack Detection
Failover, SGRP (VRRP)
Connection Forwarding
Scripts, polling of local policy
Snapshots, Debug logs
PERFORMANCE FACTORS
8. POLICY
A point of reference
Policy used for SWG/ICAP/SSL consumes about 15% of total CPU
Scale appropriately for higher/lower policy usage
Variation across platforms
Only use as a rule of thumb
Not guaranteed to be exact
May change in the future
PERFORMANCE FACTORS
9. SSL
9. SSL INTERCEPT
2,264
SPS52
2,250
SPS53
1,078
SPS54
1,390
SPS55
SPS56
SPS57
Total emulated server certificates removed from cache due to signature mismatch
312
SPS58
Total emulated server certificates removed from cache due to config changes
SPS59
874
SPS61
42,109
SPS62
SPS63
$(x-rs-certificate-valid-from)
$(x-rs-certificate-valid-to)
*.linkedin.com, 020000000001456FAAB168CFFE4A Apr 17 12:30:30 2014 GMT Apr 17 12:30:30 2015 GMT,
beis.cc.iup.edu,,
www.syncaccess.net,,
*.widget.custhelp.com,062306473BAC372720E3496C661336F0 Feb 28 00:00:00 2014 GMT Mar 30 23:59:59 2015 GMT,
ads.dotomi.com,02F7CA Sep 3 03:33:55 2014 GMT Nov 5 14:50:00 2015 GMT,
*.wer.microsoft.com,28DB34EB000100005898 Apr 4 17:56:38 2013 GMT Apr 4 17:56:38 2015 GMT,
*.ebay.com,,
*.googleusercontent.com,,
*.reson8.com,D3C03378DC74A2ABF36132E69E273C45 Jun 2 00:00:00 2014 GMT Jul 21 23:59:59 2015 GMT,
stage.tracker.springserve.com,,
services.addons.mozilla.org,,
*.tapad.com,024906 Jun 2 08:10:18 2013 GMT Sep 3 03:30:13 2016 GMT,
*.dropbox.com,,
CPU Utilization
Memory Pressure
Network Throughput
Client side HTTP connections (CWs)
Response time through ProxySG (and DNS response time)
Beware of trend averages over long time intervals, which flatten peaks
Identify true peak CPU utilization in busy hour
Peak CPU typically correlates with memory and connections
Baseline CPU distribution across components with CPU monitor
SNMP MIBs
See BLUECOAT-SG-PROXY-MIB.txt for resource monitoring
BLUECOAT-SG-ICAP-MIB.txt was also added in SG6.5
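The warning about long-interval averages flattening peaks can be shown with a small calculation. The readings below are synthetic, invented purely for illustration, and `window_averages` is a hypothetical helper rather than anything provided by the appliance or its MIBs.

```python
from typing import List, Sequence


def window_averages(samples: Sequence[float], window: int) -> List[float]:
    """Average consecutive, non-overlapping groups of `window` samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Synthetic data: twelve 5-minute CPU readings (%) covering one hour,
# with a short burst in the last ten minutes.
five_minute_cpu = [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 95, 97]

hourly_average = window_averages(five_minute_cpu, 12)[0]
busy_peak = max(five_minute_cpu)
```

Here the hourly average is 41% while the true 5-minute peak is 97%, so a baseline built only from hourly averages would miss the burst entirely.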
TROUBLESHOOTING PERFORMANCE
Tools
CPU Monitor
Sysinfo snapshots
Policy trace
TROUBLESHOOTING PERFORMANCE
HIGH CPU
External Network Factors
Typically not the cause of high CPU on the SG
Dependent Factors
Problem with the authentication server or auth configuration (e.g., Kerberos
falling back to NTLM)
Internal factors
Data collection
Enable CPU monitor
Create and enable 5-minute snapshots
Don't change the existing daily or hourly snapshot values
TROUBLESHOOTING PERFORMANCE
HIGH CPU EXAMPLE 1
Example-1
CPU Monitor
CPU 0: 97% (81%, 5%, Object Store 5%, Access Logging 2%, Miscellaneous 1%)
CPU 1: 94% (75%, TCPIP 11%, 5%, DNS service 1%)
TROUBLESHOOTING PERFORMANCE
HIGH CPU EXAMPLE 2
Example-2
CPU Monitor
CPU 0
100%
Object Store
ce_admin
Access Logging
CPU 1
98%
97%
1%
19%
TCPIP
8%
tcpip
7%
6%
http
1%
kernel
1%
3%
1%
TROUBLESHOOTING PERFORMANCE
HIGH CPU EXAMPLE 3
Example-3
CPU Monitor:
Configured interval duration:
5 seconds
2 seconds
CPU 0: 77% (TCPIP 31%, 17%, Object Store 13%, 7%, DNS service 1%, Access Logging 1%, Miscellaneous 1%)
TROUBLESHOOTING PERFORMANCE
HIGH CPU EXAMPLE 4
Example-4
5 seconds
0 seconds
CPU 0: 35% (Object Store 14%, 13%, 3%, Miscellaneous 2%)
CPU 1: 100% (TCPIP 5%, 1%, DNS service 1%, 90%)
TROUBLESHOOTING PERFORMANCE
SLOWNESS
Can be difficult to troubleshoot, especially if intermittent
External Network Factors
Audit change requests to the (upstream) network over the last week
E.g., a new firewall installed last weekend
Network: Packet loss, retransmissions, asymmetric routing
Dependent Factors
DNS, Authentication, 3rd party ICAP servers
Internal factors
Audit config changes to the SG, starting with the most recent (work
backwards over the last 2-3 days if the problem is intermittent)
Data collection
May require multiple rounds of troubleshooting (PCAP & Sysinfo snapshots)
Easiest to target specific client or server to test
May need to test with different configurations and capture with different
filters to narrow down the issue
SUMMARY
ProxySG Architecture
Appliance resources, CW limit
Performance Model
Factors Impacting Performance
ICAP (built into sizing model/guide)
Policy (sky is the limit)
SSL (SSL traffic mix, amount of SSL decryption)
Troubleshooting
Importance of establishing a performance baseline
Tools to troubleshoot performance
QUICK SURVEY
Questions?