PREDICTIVE MAINTENANCE OF SUBSURFACE EQUIPMENT (ESPs) USING SENSOR DATA: A MACHINE LEARNING APPROACH
Abstract
This thesis presents a machine learning-based predictive maintenance (PdM) framework for Electric Submersible Pumps (ESPs) in petroleum production, with the aim of reducing unplanned downtime and improving maintenance scheduling. ESPs are critical for artificial lift in oil wells but are prone to mechanical, electrical, and operational failures that disrupt production and incur high intervention costs. Traditional maintenance strategies, whether reactive or calendar-based, make little use of real-time sensor data. This study addresses that gap by developing data-driven models to forecast ESP failures from historical and real-time telemetry. The research begins with a detailed review of ESP systems, their components, and common failure modes, emphasizing the role of downhole sensors (pressure, temperature, vibration, motor current) in monitoring pump health. A year-long dataset from an onshore oilfield, comprising hourly measurements of motor current, intake and discharge pressures, temperatures, vibration frequency, production rate, and choke settings, is preprocessed. Key steps include data cleaning, imputation of missing values, outlier handling, and feature engineering. Temporal features such as rolling statistics (3-hour mean and standard deviation) and lagged variables (1- and 2-hour delays) are engineered to capture dynamic system behavior. Three ensemble machine learning models (Random Forest, Gradient Boosting, and XGBoost) are benchmarked against a linear regression baseline to predict motor housing temperature one hour ahead. The models are evaluated using time-series cross-validation to prevent data leakage. Results show that Gradient Boosting achieves the lowest mean squared error (MSE: 460.26 °F²) and the highest R² (0.926), narrowly outperforming the linear baseline (MSE: 477.03 °F², R²: 0.923). Feature importance analysis reveals that lagged intake pressure and motor temperature dominate predictions, accounting for over 80% of explanatory power.
However, the models slightly under-predict during rapid temperature spikes, indicating a need for further refinement in extreme-event forecasting. Unsupervised clustering identifies five operational modes (startup, steady-state, high-temperature stress, low-pressure events, and current spikes), providing context for variations in model performance during regime transitions. The study highlights the potential of hybrid approaches that combine regime classification with mode-specific regression to improve accuracy. The thesis concludes with actionable recommendations: deploying the optimized model in real-time monitoring systems with adaptive alert thresholds, integrating additional sensor modalities (e.g., vibration, acoustics), and establishing feedback loops for continuous model retraining. Economically, industry benchmarks suggest that the proposed PdM framework could reduce unplanned downtime by 30–50% and lower maintenance costs by 20–40%. Future work includes exploring deep learning architectures (e.g., LSTMs) to capture longer-term dependencies and extending the framework to other artificial-lift systems. This research contributes a scalable, interpretable, and empirically validated PdM pipeline, advancing the transition from reactive to proactive maintenance in petroleum production. By harnessing sensor data and machine learning, operators can anticipate failures, optimize interventions, and maximize ESP run life, translating into significant cost savings and production efficiency gains.
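The feature engineering and evaluation steps described in the abstract (1- and 2-hour lags, 3-hour rolling mean and standard deviation, a one-hour-ahead target, and time-series cross-validation) might be sketched roughly as follows. This is a minimal illustration, not the thesis code: the data are synthetic, the column names (`intake_pressure`, `motor_current`, `motor_temp`) and the helpers `build_features` and `evaluate` are hypothetical, rows are assumed to be hourly samples, and the models use scikit-learn defaults rather than any tuned settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit


def build_features(df, cols, target="motor_temp", horizon=1):
    """Hypothetical helper: lagged and rolling features plus a shifted target.

    Assumes one row per hour, so a 3-row window is a 3-hour window.
    """
    out = pd.DataFrame(index=df.index)
    for col in cols:
        out[col] = df[col]
        out[f"{col}_lag1"] = df[col].shift(1)   # value 1 hour earlier
        out[f"{col}_lag2"] = df[col].shift(2)   # value 2 hours earlier
        roll = df[col].rolling(window=3)        # current and 2 previous hours
        out[f"{col}_roll3_mean"] = roll.mean()
        out[f"{col}_roll3_std"] = roll.std()
    # Target: the temperature `horizon` hours after the feature row.
    out["target"] = df[target].shift(-horizon)
    return out.dropna()


def evaluate(model, X, y, n_splits=5):
    """Mean MSE over expanding-window splits; each test fold follows its training fold."""
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        pred = model.predict(X.iloc[test_idx])
        scores.append(mean_squared_error(y.iloc[test_idx], pred))
    return float(np.mean(scores))


# Synthetic hourly telemetry (illustrative only, not the thesis dataset).
rng = np.random.default_rng(0)
n = 500
t = np.arange(n)
intake_pressure = 1500 + 50 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 5, n)
motor_current = 40 + 2 * np.cos(2 * np.pi * t / 24) + rng.normal(0, 0.5, n)
motor_temp = (250 - 0.05 * (intake_pressure - 1500)
              + 1.5 * (motor_current - 40) + rng.normal(0, 1, n))
df = pd.DataFrame({
    "intake_pressure": intake_pressure,
    "motor_current": motor_current,
    "motor_temp": motor_temp,
})

feats = build_features(df, cols=["intake_pressure", "motor_current", "motor_temp"])
X, y = feats.drop(columns="target"), feats["target"]

results = {
    "linear_baseline": evaluate(LinearRegression(), X, y),
    "gradient_boosting": evaluate(GradientBoostingRegressor(random_state=0), X, y),
}
print(results)
```

Because `TimeSeriesSplit` always trains on earlier rows and tests on later ones, and every feature uses only current or past values, no information from the prediction hour leaks into training. The scores printed for the synthetic data say nothing about the figures reported in the abstract.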


