A Parametric Reliability Prediction Tool For Space Applications
Nkiru U. Ogamba, The Aerospace Corporation
Significant: This category applies to a hardware or software event that results in loss of hardware usefulness because of unreliable, inconsistent, or degraded performance, such that switching to redundant hardware is required. Major: This category applies to a hardware or software event that results in the implementation of redundant hardware because of unreliable, inconsistent, or degraded performance of the hardware currently in use. Noteworthy: This classification covers all failures categorized as Catastrophic, Significant, and/or Major.

3.2 Root Cause Categories
Once the failures are classified, the root cause of each failure can be categorized. A typical set of root cause categories is defined in Table 1, along with a typical example of each.

Table 1: Root Cause Categories

  Cause Category   Example of Failure
  design           Cable harness - faulty cable or connector design
  environment      Atomic oxygen (AO) - AO in the atmosphere
  operational      Command error - incorrect command(s) sent
  other            Other - repeated or already-known anomaly
  part             Part - broken weld inside the part
  software         Software - incorrect database values causing a malfunction
  test             Operator error - untrained operator caused the failure
  unknown          Unknown - control signal jitter, source unknown
  workmanship      Workmanship - inaccurate alignment
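Where such a classification feeds a reliability database, it maps naturally onto a small data model. A minimal Python sketch of that idea follows; the names, and the extra MINOR class, are assumptions for illustration, not the actual schema of the tool.

from dataclasses import dataclass
from enum import Enum

# Hypothetical data model for the classification scheme above; class
# and field names are illustrative, not the tool's actual schema.
class Severity(Enum):
    CATASTROPHIC = "catastrophic"
    SIGNIFICANT = "significant"
    MAJOR = "major"
    MINOR = "minor"   # assumed lower class, outside the noteworthy set

class CauseCategory(Enum):
    DESIGN = "design"
    ENVIRONMENT = "environment"
    OPERATIONAL = "operational"
    OTHER = "other"
    PART = "part"
    SOFTWARE = "software"
    TEST = "test"
    UNKNOWN = "unknown"
    WORKMANSHIP = "workmanship"

@dataclass
class AnomalyRecord:
    description: str
    severity: Severity
    cause: CauseCategory

    @property
    def noteworthy(self) -> bool:
        # Noteworthy covers Catastrophic, Significant, and Major failures.
        return self.severity in (Severity.CATASTROPHIC,
                                 Severity.SIGNIFICANT,
                                 Severity.MAJOR)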
3.3 Satellite Subsystems
To permit uniform application of the reliability engineering database, it must also capture a standardized set of spacecraft and launch vehicle subsystems. Table 2 lists industry-wide subsystem assignments that have generally been incorporated into launch vehicle systems.

Table 2: Subsystems

  Nomenclature   Subsystem
  DMS            Data Management Subsystem
  EPDS           Electrical Power and Distribution Subsystem
  FTS            Flight Termination Subsystem
  GN&C           Guidance, Navigation and Control Subsystem
  Hydraulics     Hydraulic Subsystem
  Payload        Payload Subsystem
  Pneumatic      Pneumatic Subsystem
  Propulsion     Propulsion Subsystem
  S&MS           Structure and Mechanisms Subsystem
  TT&C           Telemetry, Tracking and Command Subsystem
  Thermal        Thermal Subsystem
4 PARAMETRIC PREDICTION TOOLS PROCESS
The methodology described above was developed and exercised using data and research information gathered from the engineering reliability anomaly database for beta program DMSP X. The process involved the following tasks:
- Gathering ground and on-orbit anomaly data, from satellite launch date to end-of-life, for space vehicles S6 through S14.
- Developing and populating a configuration and timeline template for the reliability database.
- Verifying and classifying subsystem failure causes and failure modes.
- Identifying failure drivers by subsystem and by line replaceable unit (LRU).
- Identifying failure drivers across the population of DMSP X S6-S14 vehicles.
- Classifying the LRU failure causes.
- Calculating actual unit failure rates from compiled operating hours and the number of line replaceable units in the system.
- Determining the complexity of failure drivers based on their failure rates.
- Analyzing failure data on the identified parameters, using Bayesian and Weibull techniques.

4.1 Evaluating Parameters and Identifying Noteworthy Drivers
Noteworthy subsystem failures impacting the mission were selected, and on-orbit and pre-launch failures were analyzed separately. Figure 1 shows that sixty-five percent of all failures occurred in the payload subsystem. The top drivers, including their failure causes, were selected for further analysis. Figure 2 shows payload module noteworthy failures: sixty-four percent of the failures occurred within the operational line scanner (OLS). Figure 3 shows payload noteworthy failure causes: workmanship is the leading cause, with ten failures attributed to it.
[Figure 1: DMSP X Noteworthy Subsystem Failures - failures by subsystem; the payload subsystem is the largest contributor]
[Figure 2: Payload Module Noteworthy Failures - failures by module (OLS, SSM/T)]
[Figure 3: Payload Noteworthy Failure Causes - failures by cause; workmanship highest, then unknown]

5 RESULTS
Further top-down analysis of the failure drivers within the OLS showed that seventy-one percent of those failures were caused by the primary data recorder, as shown in Figure 5, and that the driving failure cause within the primary data recorder is workmanship, as shown in Figure 4.

[Figure 4: Primary Data Recorder Noteworthy Failure Causes - failures by cause (workmanship, unknown, design; 0-6 failures scale)]
[Figure 5: OLS Noteworthy Assembly Failures - failures by assembly; the primary recorder accounts for 10 failures, all other assemblies for the remainder]

5.1 Bayesian Reliability Analysis of Primary Data Recorder
The mean time between failures (MTBF) of a component is calculated by dividing its total operating hours by its number of noteworthy failures. Table 3 shows the failure rate calculation for the primary data recorder on DMSP X satellites S6-S14, covering all thirty-six recorders (four per satellite), alongside the failure rate predicted by the manufacturer. As an example, the Table 3 primary recorder (PR) failure rate calculation for satellite S7 proceeds as follows:
- The mission time for each PR with zero failures is the time from launch date to end-of-life (EOL): 42,312 hours.
- Combined mission time for PR1, PR2, and PR4, which ran failure-free to EOL: 3 x 42,312 = 126,936 hours.
- Mission time for PR3, which failed: 18,024 hours at failure.
- Total hours: 126,936 + 18,024 = 144,960 hours, so the failure rate is 1 / 144,960 hours = 6.898 failures per million hours, as sketched in code below.
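A minimal Python sketch of this calculation, using the S7 numbers above (variable names are illustrative):

# Minimal sketch of the S7 primary recorder failure rate calculation;
# all numbers come from the worked example above.
eol_hours = 42312              # launch to end-of-life (EOL), hours
failure_free_recorders = 3     # PR1, PR2, PR4 ran failure-free to EOL
pr3_hours_at_failure = 18024   # PR3 failed at 18,024 hours
failures = 1

total_hours = failure_free_recorders * eol_hours + pr3_hours_at_failure
failure_rate_per_1e6 = failures / total_hours * 1e6
mtbf_hours = total_hours / failures

print(f"total hours : {total_hours}")                             # 144960
print(f"failure rate: {failure_rate_per_1e6:.3f} per 1e6 hours")  # 6.898
print(f"MTBF        : {mtbf_hours:.0f} hours")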
Table 3: DMSP X Primary Data Recorder Failure Rate Calculation (Actual)

  Satellite   PR1 fail   PR2 fail   PR3 fail   PR4 fail   Total hours   Fail/1e6 hrs
  S6                 0          0          0          0       652,704          0.000
  S7                 0          0     18,024          0       144,960          6.898
  S8                 0     53,232          0     47,184       277,512          7.207
  S9            44,184     46,800          0     62,760       295,944         10.137
  S10           15,360          0          0          0       242,880          4.117
  S11                0          0          0          0       129,744          0.000
  S12                0     13,512     15,240     16,152       223,008         13.452
  S13                0          0          0          0       236,832          0.000
  S14                0          0          0          0       151,776          0.000
  Total: 10 failures in 2,355,360 hours = 4.246 fail/1e6 hrs; MTBF = 235,536 hours

(PR1-PR4 entries are operating hours at failure; 0 means no noteworthy failure. EOL = end of life.)

The manufacturer's predicted failure rate for the primary data recorder (Table 3, prediction) is the duty-factor-weighted sum of the predicted rates:

  Predicted rate (fail/1e6 hrs)   Duty factor   Contribution (fail/1e6 hrs)
   9.258                          0.323          2.989
  13.554                          0.061          0.823
   4.066                          0.616          2.506
  Expected (predicted) failure rate:             6.318

Figure 6 shows the primary recorder failure rate obtained by Bayesian analysis, with (a) prior (the estimated failure rate distribution before actual on-orbit data became available) = 6.318 failures per million hours, (b) likelihood (the probability of the actual on-orbit outcome data) = 4.3 failures per million hours, and (c) posterior (the probability revised in light of the actual on-orbit data) = 5.7 failures per million hours. Note that the actual failure rate of the primary recorder was lower than the predicted (estimated) failure rate of 6.318 failures per million hours. For future missions, the recommended failure rate for the primary recorder is the posterior value of 5.7 failures per million hours.
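The paper does not state the distributional form behind Figure 6. One common conjugate model for failure-rate data is a gamma prior on the rate combined with Poisson failure counts; the sketch below illustrates the update under that assumption. The prior strength a0 is a hypothetical value, chosen here so the posterior mean lands near the reported 5.7 failures per million hours.

# Gamma-Poisson conjugate update of the recorder failure rate.
# The paper does not specify its Bayesian model, so this form is assumed;
# a0 (prior strength, in pseudo-failures) is a hypothetical tuning value.
prior_mean = 6.318e-6      # manufacturer's predicted rate, failures per hour
a0 = 40.0                  # assumed prior strength
b0 = a0 / prior_mean       # pseudo-hours so the prior mean equals 6.318e-6

failures = 10              # observed noteworthy failures (Table 3)
hours = 2_355_360          # pooled recorder operating hours (Table 3)

a_post = a0 + failures     # gamma posterior shape
b_post = b0 + hours        # gamma posterior rate (in hours)
posterior_mean = a_post / b_post

print(f"likelihood rate: {failures / hours * 1e6:.3f} per 1e6 hours")  # 4.246
print(f"posterior mean : {posterior_mean * 1e6:.3f} per 1e6 hours")    # ~5.8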
5.2 Weibull Reliability Analysis of Primary Data Recorder
Figure 7, a reliability versus time plot, shows the Weibull analysis for the primary data recorder. The historical reliability data are ranked by cumulative probability of failure and plotted on Weibull probability paper; the ordinate (y) shows the probability of failure and the abscissa (x) represents the life value. The analysis yielded a Weibull shape parameter β of 1.99, indicating a wearout failure mode. This contradicts the manufacturer's prediction and implies that the failures can be attributed to a design deficiency. Further investigation revealed that the primary recorders appear to come from two distinct batches. This analysis was critical in drawing attention to the primary recorder failures so that mitigating tasks could be put in place to prevent recurrence. The Weibull results provide additional information that further defines and illustrates the severity of the problem. Identifying and eliminating these types of problems through the kind of regular analysis described in this paper can lead to marked improvements in efficiency and cost performance.

[Figure 7: DMSP X Primary Data Recorder Reliability vs. Time Plot - Weibull fit of reliability R(t) = 1 - F(t) against time t, with β = 1.99, η = 36,494.10 hours, ρ = 0.89]
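For a two-parameter Weibull model, the fitted values from Figure 7 give the reliability directly via R(t) = exp(-(t/η)^β). A minimal sketch follows; the evaluation times are arbitrary:

import math

# Two-parameter Weibull reliability using the Figure 7 fit.
# beta > 1 means the hazard rate (beta/eta)*(t/eta)**(beta - 1)
# increases with operating time, i.e. wearout.
beta = 1.99
eta = 36494.10  # characteristic life, hours

def reliability(t_hours: float) -> float:
    """R(t) = exp(-(t/eta)**beta)."""
    return math.exp(-((t_hours / eta) ** beta))

for t in (10000, 20000, 36494, 42312):  # evaluation times chosen arbitrarily
    print(f"R({t:5d} h) = {reliability(t):.3f}")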
6 CONCLUSION
PRPT provides a disciplined and adequate technique for predicting on-orbit reliability from historical ground and on-orbit anomaly data, and for identifying the failure drivers against which mitigating action can be taken.

7 APPENDIX I: SPACE/LAUNCH VEHICLE PERFORMANCE PARAMETERS
Key space vehicle performance parameters that have been determined to be relevant to a reliability database, and that can be used to correlate failure data with overall system reliability, are:
- Anomaly Name
- Anomaly Probable Location
- Anomaly Type/Description
- Corrective Action Taken
- Date and Time of Anomaly
- Item Manufacturer
- Key Reliability Factors
- Operational Workarounds
- Probable Cause
- Program
- Symptom Category
- Type Orbit

REFERENCES
1. J. P. Rooney, "Aging in Electronic Systems," Proc. Ann. Reliability & Maintainability Symp., 1999, pp. 293-299.
2. P. I. Hsieh, "A Framework of Integrated Reliability Demonstration in System Development," Proc. Ann. Reliability & Maintainability Symp., 1999, pp. 258-264.
3. A. Gelman, J. Carlin, H. Stern, and D. Rubin, Bayesian Data Analysis, Chapman & Hall, New York, 1995.
4. R. Abernethy, The New Weibull Handbook, Gulf Publishing Company, 1993.
5. A. H. Quintero, Space Vehicle Anomaly Reporting System (SVARS) Electronic Data Interchange (EDI) Template, September 1996.
6. R. Matusheski, Meridium Reliability Guidelines, Meridium Enterprise Reliability Management System, 1999.

BIOGRAPHIES
Nkiru U. Ogamba
The Aerospace Corporation
2350 E. El Segundo Blvd. M4/994
El Segundo, CA 90245-4691
e-mail: [email protected]

Nkiru Ogamba is an Engineering Specialist at The Aerospace Corporation in the Electronics Engineering Subdivision, Space Electronics Vulnerability Office. Prior to joining The Aerospace Corporation, she was employed by Litton Industries as a Senior Engineer in the Reliability/Maintainability Group. N. Ogamba has BS and MS degrees in Electrical Engineering, with over twenty-five years of experience in Reliability, Quality, Test Engineering Applications, and Design.