Estimation of Mean Time Between Failures (MTBF) of Electrical Feeders and Related Components
In the New York City power grid, electricity is transmitted via primary distribution feeders between the high-voltage transmission system and the household-voltage secondary system. These feeders are susceptible to several kinds of failures, such as emergency isolation by automatic substation relays (Open Autos), failing on test, problems noticed by maintenance crews, and scheduled work on different sections of the feeder. Over the past few years, researchers at CCLS have collaborated with the Consolidated Edison Company of New York to develop systems that rank feeders and their components according to their susceptibility to failure. The Ranker for Open-Auto Maintenance Scheduling (ROAMS) was the first such system, built using Martingale Ranking. The system was subsequently improved to boost ranking performance using an ensemble of ranking experts, which, however, came at a cost in interpretability of the machine learning models and made the system more complicated. A comparison of three different techniques for generating ranked lists of electrical feeders (Martingale Ranking, RankBoost, and an SVM score ranker) can be found in [Phil_08a].
More recently, we have begun to focus on measures such as Time Between Failures (TBF) and Time To Failure (TTF) of feeders, which results in regression problems rather than ranking problems. Several challenges exist in generating good regression models: (1) Few components actually failed during the period for which we have data. Many components never failed, or failed only once, during that period, and for these cases we need to learn estimates from “censored” data, i.e., time intervals where we only know that the TBF exceeds (a) the period for which we have data, (b) the time from the last failure before data collection began until the first observed failure, or (c) the time from the last failure until the present. In some cases there are two or more failures of the same feeder during the collection period, yielding more precise data to train on. (2) There are several failure modes (such as Open Autos, Failed on Test, and Out on Emergency), so the task is highly non-linear. Key failure causes for feeders include aging; power quality events (e.g., spikes); overloads (which have seasonal variation, with summer heat waves especially problematic); known weak components (e.g., PILC cable and the joints connecting PILC to other sections); at-risk topologies (where cascading failures could occur); workmanship problems; and the stress of HiPot testing and of deenergizing and reenergizing feeders, which can result in “infant mortality.” (3) Some feeders last a very long time whereas others are relatively short-lived, and failures that occur close together (say, within two months of each other) behave differently from long-lived intervals. In addition, seasonal effects (such as high summer temperatures) influence failure rates. This introduces considerable imbalance in the training data and makes generalization difficult.
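Challenge (1) is the classical setting of survival analysis, where censored intervals still carry information about how long components survive. As a minimal illustration of how such data can be used, the sketch below computes a Kaplan-Meier survival curve from (duration, observed) pairs; the helper function and all numbers are illustrative assumptions, not the deployed system or Con Edison data.

```python
def kaplan_meier(samples):
    """Return [(time, survival_probability)] from (duration, observed) pairs.

    observed=False marks a censored interval: we only know TBF exceeds
    the recorded duration, so it reduces the at-risk count without
    contributing a failure event.
    """
    events = sorted(samples)
    n_at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all observations tied at time t; by convention, censored
        # observations at t are still counted as at risk at t.
        while i < len(events) and events[i][0] == t:
            if events[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve

# Illustrative data: durations in days; three observed failures, two censored.
data = [(120, True), (200, False), (200, True), (340, True), (400, False)]
curve = kaplan_meier(data)
```

Note how the censored interval at day 200 shrinks the at-risk set but does not drive the survival estimate down by itself; this is exactly how cases (a)–(c) above can contribute to a TBF model without ever being observed to fail.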
We are applying and comparing Support Vector Machines (SVM), Classification and Regression Trees (CART), and ensemble techniques such as Random Forests, along with statistical methods such as Cox Proportional Hazards, to these tasks. Throughout, we follow empirical machine learning practice: training our models on one subset of the data, validating on another subset, and then verifying by blind testing on previously unseen data. In the future, we expect to study root causes of failures and generate rules that help explain why some feeders last longer than others, what causes them to suffer from infant mortality, and so on.
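The train/validate/blind-test protocol can be sketched as a chronological partition, so that the blind set contains only records from after the model-building period and is never inspected until final verification. The record layout, field order, and cutoff values below are hypothetical, chosen only to make the split concrete.

```python
def chronological_split(records, train_end, validation_end):
    """Partition (timestamp, features, target) records into three sets.

    Records with timestamp <= train_end go to training, those up to
    validation_end go to validation, and everything later is held out
    as the blind test set.
    """
    train, validation, blind = [], [], []
    for rec in sorted(records, key=lambda r: r[0]):
        if rec[0] <= train_end:
            train.append(rec)
        elif rec[0] <= validation_end:
            validation.append(rec)
        else:
            blind.append(rec)  # untouched until final blind testing
    return train, validation, blind

# Illustrative records: (day index, feature vector, observed TBF in days).
records = [(d, [d % 7], 100 + d) for d in range(10)]
train, validation, blind = chronological_split(records, train_end=5, validation_end=7)
```

Splitting by time rather than at random matters here: seasonal effects and slow feeder aging mean a random split would leak future conditions into training and overstate how well the regression models generalize.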