Implementing SSH Port Forwarding With Data Guard
1.2
Although not shown in the examples, a configuration that uses cascading standby databases (that is, a standby database that receives its redo from another standby database rather than from the original primary database) requires an SSH tunnel in both directions between the two standby systems:

- From standby1 to standby2 for RFS
- From standby2 to standby1 for FAL requests
The sshd daemon configuration on each host must allow TCP port forwarding; see the sshd_config(5) man page for details of the AllowTcpForwarding configuration parameter. Each ssh command may require its own window with the process running in the foreground, because ssh cannot be put in the background if a password or passphrase is required. If ssh is configured for access to the remote systems without prompting, the ssh commands can be run in the background and the -n option should be used; see the ssh(1) man page for further details. If OpenSSH is being used, the -N command line option is useful because it does not execute a remote command; see the OpenSSH ssh(1) man page for details.

The examples shown below have been tested using SSH version 1.2.31 on Sun Solaris 8 with Oracle9i Release 2 (9.2). Although it has not been verified, SSH tunneling of Data Guard network traffic should work on any platform that supports SSH, and with any version of SSH that supports tunneling. Tunneling Data Guard network traffic through SSH is transparent to Oracle, so there is no Oracle version dependency; the following procedure can be used with Oracle9i and later releases.
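For reference, the relevant sshd_config entry looks like the following. AllowTcpForwarding is a standard sshd_config keyword; its default in OpenSSH is yes, but hardened systems often disable it, which would prevent the tunnels described in this paper from working:

```
# /etc/ssh/sshd_config (restart the sshd daemon after changing)
AllowTcpForwarding yes
```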
1.3
To verify that SSH is set up correctly to allow port forwarding, run the following test to forward telnet requests (default port 23) from the production system to the standby system (in the examples below, the production system is hasun23 and the standby system is hasun25).

On the production system (hasun23) as the oracle user:

$ ssh -L 9000:hasun25:23 oracle@hasun25   # set up the tunnel
$ telnet localhost 9000                   # this should forward the telnet
                                          # request to the telnet daemon
                                          # on hasun25
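As a non-interactive alternative to telnet, a short script can confirm that the forwarded local port is accepting connections. The sketch below is a generic TCP reachability check, not part of SSH or Oracle; the port number 9000 simply matches the example above:

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# With the tunnel from the example running, the forwarded port
# should accept connections on the loopback interface:
# port_is_open("localhost", 9000)
```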
If the simple port forward of telnet requests does not work, the SSH configuration must be fixed before attempting to use it with Data Guard. SSH setup and configuration information must be obtained from your SSH documentation or your SSH vendor.
1.4
In the following example, hasun23 is the primary system and hasun25 is the standby system. The Oracle Net listener is listening on port 1525 on both the production and standby systems. Ports 9023 and 9025, as specified below in the forwarding syntax and the Oracle Net connect descriptor definitions, are simply unused, non-privileged ports.

The Data Guard and Oracle Net setup prior to setting up SSH tunneling will usually be similar to the following.

On hasun23:

dg=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun25))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales))
  )

On hasun25:

dg=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun23))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales))
  )

Database parameters utilizing the above Oracle Net service names would appear as:

log_archive_dest_2='service=dg'
fal_client=dg
fal_server=dg

Modifying the above configuration so that SSH forwards the Data Guard Oracle Net traffic consists of the following three steps:
1. Set up SSH to allow port forwarding.

On hasun23 as the oracle user:

$ ssh -C -L 9025:hasun25:1525 oracle@hasun25

On hasun25 as the oracle user:

$ ssh -C -L 9023:hasun23:1525 oracle@hasun23

2. The new Oracle Net service name is copied from the original and modified as shown below.

On hasun23:

dg_ssh=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=9025)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun25))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales)(SERVER=dedicated))
  )

On hasun25:

dg_ssh=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=9023)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun23))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales)(SERVER=dedicated))
  )

3. Change the following database parameters on all systems to use the new Oracle Net service name definition.

log_archive_dest_2='service=dg_ssh'
fal_client=dg_ssh
fal_server=dg_ssh

The database parameters can be changed dynamically using the ALTER SYSTEM statement. Changes to LOG_ARCHIVE_DEST_2 take effect at the next log switch; changes to FAL_CLIENT and FAL_SERVER take effect the next time FAL is initiated to resolve an archive gap.

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='service=dg_ssh';
SQL> ALTER SYSTEM SET FAL_CLIENT='dg_ssh';
SQL> ALTER SYSTEM SET FAL_SERVER='dg_ssh';
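Note that the order of addresses in the dg_ssh descriptor matters: the tunnel endpoint on localhost is listed first, and the direct address second, so Oracle Net falls back to the direct route only if the tunnel endpoint does not accept the connection. The sketch below illustrates this try-in-order behavior in generic terms; connect_first_available is a hypothetical helper, not an Oracle Net API:

```python
import socket

def connect_first_available(addresses, timeout=3.0):
    """Try each (host, port) in order, the way addresses in an
    ADDRESS_LIST are tried, and return a socket for the first
    address that accepts the connection."""
    for host, port in addresses:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # fall through to the next ADDRESS in the list
    raise ConnectionError("no address in the list accepted a connection")

# Mirrors dg_ssh on hasun23: tunnel endpoint first, direct route second.
dg_ssh_addresses = [("localhost", 9025), ("hasun25", 1525)]
```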
1.5
In the following example, hasun23 and hasun24 comprise the primary system, while hasun25 and hasun26 make up the standby system. The Oracle Net listener is listening on port 1525 on both the production and standby systems. Ports 9023, 9024, 9025, and 9026, as specified below in the forwarding syntax and the Oracle Net connect descriptor definitions, are simply unused, non-privileged ports.

A Data Guard and Oracle Net configuration that is set up as described in the Maximum Availability Architecture paper and does not utilize SSH port forwarding will usually be similar to the following.

On hasun23 and hasun24:

dg=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun25))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun26))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales))
  )

On hasun25 and hasun26:

dg=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun23))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun24))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales))
  )

Database parameters utilizing the above Oracle Net service names would appear as follows on all systems:

log_archive_dest_2='service=dg'
fal_client=dg
fal_server=dg

Setting up SSH to forward the Data Guard Oracle Net traffic consists of three steps:

1. Set up SSH to allow port forwarding.

On hasun23 as the oracle user:

$ ssh -C -L 9025:hasun25:1525 oracle@hasun25
$ ssh -C -L 9026:hasun26:1525 oracle@hasun26

On hasun24 as the oracle user:

$ ssh -C -L 9025:hasun25:1525 oracle@hasun25
$ ssh -C -L 9026:hasun26:1525 oracle@hasun26

On hasun25 as the oracle user:

$ ssh -C -L 9023:hasun23:1525 oracle@hasun23
$ ssh -C -L 9024:hasun24:1525 oracle@hasun24

On hasun26 as the oracle user:

$ ssh -C -L 9023:hasun23:1525 oracle@hasun23
$ ssh -C -L 9024:hasun24:1525 oracle@hasun24

2. The new Oracle Net service name is copied from the original and modified as shown below.

On hasun23 and hasun24:

dg_ssh=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=9025)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun25))
      (ADDRESS=(PROTOCOL=tcp)(PORT=9026)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun26))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales)(SERVER=dedicated))
  )

On hasun25 and hasun26:

dg_ssh=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=9023)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun23))
      (ADDRESS=(PROTOCOL=tcp)(PORT=9024)(HOST=localhost))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1525)(HOST=hasun24))
    )
    (CONNECT_DATA=(SERVICE_NAME=sales)(SERVER=dedicated))
  )
3. Change the following database parameters on all systems to use the new Oracle Net service name definition.

log_archive_dest_2='service=dg_ssh'
fal_client=dg_ssh
fal_server=dg_ssh

The database parameters can be changed dynamically using the ALTER SYSTEM statement. Changes to LOG_ARCHIVE_DEST_2 take effect at the next log switch; changes to FAL_CLIENT and FAL_SERVER take effect the next time FAL is initiated to resolve an archive gap.

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='service=dg_ssh';
SQL> ALTER SYSTEM SET FAL_CLIENT='dg_ssh';
SQL> ALTER SYSTEM SET FAL_SERVER='dg_ssh';
1.6 Testing Results
The results below are expressed as a comparison between a baseline, in which the primary database archives remotely using the ARCH process with no SSH port forwarding, and the same configuration with SSH port forwarding (compression enabled).

1.6.1 ARCH:
Remotely archiving with the ARCH process in conjunction with SSH port forwarding showed the following characteristics when compared to the baseline:

- Significant reduction in network traffic
- No change in primary database throughput
- Minimal increase in CPU usage
When remotely archiving using the ARCH attribute, redo logs are transmitted to the destination during an archival operation. The background archiver processes (ARCn) or a foreground archival operation serves as the redo log transport service. Using ARCH to remotely archive does not impact primary database throughput as long as enough redo log groups exist that the most recently used group can be archived before it must be reopened. Using SSH port forwarding in conjunction with remote archiving via the ARCH process likewise did not negatively impact the throughput of the primary database. With compression disabled, SSH port forwarding had minimal CPU impact; with compression enabled, it also had minimal CPU impact while achieving a significant reduction in network traffic.

1.6.2 LGWR ASYNC:
Asynchronously archiving with the LGWR process in conjunction with SSH port forwarding showed the following characteristics when compared to the baseline:

- Significant reduction in network traffic
- Slight increase in primary database throughput
- Minimal increase in CPU usage
When using LGWR to remotely archive in ASYNC mode, the LGWR process does not wait for each network I/O to complete before proceeding. This behavior is made possible by an intermediate process, known as the LGWR network server process (LNS), that performs the actual network I/O and waits for each network I/O to complete. Each LNS has a user-configurable buffer that is used to accept outbound redo data from the LGWR. It is configured by specifying the size, in 512-byte blocks, on the ASYNC attribute of the archive log destination parameter; for example, ASYNC=2048 indicates a 1 MB buffer (2048 × 512 bytes). As long as the LNS process can empty this buffer faster than the LGWR can fill it, the LGWR will never stall. If the LNS cannot keep up, the buffer becomes full and the LGWR stalls until either sufficient buffer space is freed by a successful network transmission or a timeout occurs.
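The buffer arithmetic above can be sketched directly. The helper name below is illustrative only; the 512-byte block size comes from the ASYNC attribute definition in the text:

```python
REDO_BLOCK_SIZE = 512  # ASYNC is specified in 512-byte blocks

def async_buffer_bytes(async_blocks):
    """Size in bytes of the LNS buffer for a given ASYNC=n setting."""
    return async_blocks * REDO_BLOCK_SIZE

# ASYNC=2048 corresponds to a 1 MB buffer:
print(async_buffer_bytes(2048))  # 1048576 bytes = 1 MB
```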
Reducing network traffic in a network with high round trip times (RTT) reduces network server timeouts due to buffer-full conditions, thus reducing the impact on primary database throughput. ASYNC can improve primary database throughput because compressing the redo traffic makes each transfer (in 1 MB chunks) quicker, so the ASYNC buffer reaches full capacity less often, avoiding the stall that occurs when the buffer is full.

1.6.3 LGWR SYNC:
Synchronously archiving with the LGWR process in conjunction with SSH port forwarding showed the following characteristics when compared to the baseline:

- Significant reduction in network traffic
- Decrease in primary database throughput
- Minimal increase in CPU usage
The SYNC attribute with the LGWR process specifies that network I/O is to be performed synchronously for the destination: once the I/O is initiated, the archiving process waits for it to complete before continuing. If you specify the SYNC attribute, all network I/O operations are performed synchronously, in conjunction with each write operation to the online redo log, and a transaction is not committed on the primary database until the redo data necessary to recover that transaction has been received by the destination. Because these network I/Os are small, testing showed that more time was spent compressing the transactions than was gained from more efficient network transmission. The result is decreased primary database throughput. For this reason, it is not recommended to utilize SSH port forwarding with LGWR SYNC.
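The traffic reduction reported throughout these tests comes from the ssh -C option, which applies zlib-style compression to the tunneled stream. As a rough illustration of why large ASYNC transfers benefit while small SYNC writes do not, the generic Python sketch below (not a measurement of SSH itself) compresses a large, repetitive redo-like buffer and a tiny payload:

```python
import zlib

def compressed_size(payload: bytes) -> int:
    """Bytes after zlib compression (the scheme used by ssh -C)."""
    return len(zlib.compress(payload))

# A large, repetitive buffer (stand-in for a 1 MB ASYNC chunk):
large = b"UPDATE sales SET qty = qty + 1 WHERE id = 42;" * 20000
# A tiny payload (stand-in for a single small SYNC network I/O):
small = b"COMMIT;"

print(len(large), compressed_size(large))  # shrinks dramatically
print(len(small), compressed_size(small))  # may even grow, due to
                                           # fixed zlib header overhead
```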