SUNDAY AGBONS IGBINERE

PREDICTIVE MAINTENANCE OF SUBSURFACE EQUIPMENT (ESPs) USING SENSOR DATA; A MACHINE LEARNING APPROACH

Author(s)
Year of Publication
Publication Type
Abstract
This thesis presents a comprehensive machine learning-based predictive maintenance (PdM) framework for Electric Submersible Pumps (ESPs) in petroleum production, aiming to mitigate unplanned downtime and optimize maintenance scheduling. ESPs are critical for artificial lift in oil wells but are prone to mechanical, electrical, and operational failures that disrupt production and incur high intervention costs. Traditional maintenance strategies, whether reactive or calendar-based, fail to leverage real-time sensor data effectively. This study addresses that gap by developing data-driven models to forecast ESP failures using historical and real-time telemetry.
The research begins with a detailed exploration of ESP systems, their components, and common failure modes, emphasizing the role of downhole sensors (pressure, temperature, vibration, motor current) in monitoring pump health. A year-long dataset from an onshore oilfield, comprising hourly measurements of motor current, intake/discharge pressures, temperatures, vibration frequency, production rate, and choke settings, is rigorously preprocessed. Key steps include data cleaning, imputation of missing values, outlier handling, and feature engineering. Temporal features such as rolling statistics (3-hour mean and standard deviation) and lagged variables (1- and 2-hour delays) are engineered to capture dynamic system behavior.
Three ensemble machine learning models (Random Forest, Gradient Boosting, and XGBoost) are benchmarked against a linear regression baseline to predict motor housing temperature one hour ahead. The models are evaluated using time-series cross-validation to prevent data leakage. Results show that Gradient Boosting achieves the lowest mean squared error (MSE: 460.26 °F²) and highest R² (0.926), outperforming the linear baseline (MSE: 477.03 °F², R²: 0.923). Feature importance analysis reveals that lagged intake pressure and motor temperature dominate predictions, accounting for over 80% of explanatory power.
However, the models exhibit slight under-prediction during rapid temperature spikes, highlighting a need for further refinement in extreme-event forecasting. Unsupervised clustering identifies five operational modes (startup, steady-state, high-temperature stress, low-pressure events, and current spikes), providing context for model performance variations during regime transitions. The study underscores the potential of hybrid approaches, combining regime classification with mode-specific regression, to enhance accuracy.
The thesis concludes with actionable recommendations: deploying the optimized model in real-time monitoring systems with adaptive alert thresholds, integrating additional sensor modalities (e.g., vibration, acoustics), and establishing feedback loops for continuous model retraining. Economically, based on industry benchmarks, the proposed PdM framework could reduce unplanned downtime by 30–50% and lower maintenance costs by 20–40%. Future work includes exploring deep learning architectures (e.g., LSTMs) for longer-term dependencies and extending the framework to other artificial-lift systems.
This research contributes a scalable, interpretable, and empirically validated PdM pipeline, advancing the transition from reactive to proactive maintenance in petroleum production. By harnessing sensor data and machine learning, operators can anticipate failures, optimize interventions, and maximize ESP run life, translating into significant cost savings and production efficiency gains.
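The temporal feature engineering and time-ordered validation described in this abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names (`add_temporal_features`, `expanding_window_splits`), the fold layout, and the temperature readings are hypothetical and chosen only for illustration. Each row uses readings up to the current hour (3-hour rolling mean and standard deviation, 1- and 2-hour lags) and takes the next hour's reading as the target, and every validation fold lies strictly after its training data so that no future information leaks into training.

```python
from statistics import mean, stdev


def add_temporal_features(series, window=3, lags=(1, 2)):
    """Build one feature row per hour from a single hourly sensor series.

    Row i only uses readings up to hour i; the target is the reading at hour i + 1.
    """
    rows = []
    start = max(window - 1, max(lags))  # first hour with a full window and all lags
    for i in range(start, len(series) - 1):
        recent = series[i - window + 1 : i + 1]  # last `window` readings, including hour i
        row = {
            "value": series[i],
            "roll_mean": mean(recent),
            "roll_std": stdev(recent),
        }
        for lag in lags:
            row[f"lag_{lag}"] = series[i - lag]
        row["target_next_hour"] = series[i + 1]
        rows.append(row)
    return rows


def expanding_window_splits(n_samples, n_splits=3):
    """Yield (train_indices, test_indices) with every test fold after its training data."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any leftover samples.
        test_end = train_end + fold if k < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


# Illustrative (made-up) hourly motor housing temperatures in °F.
motor_temp_f = [200.0, 202.0, 201.0, 205.0, 210.0, 208.0, 207.0, 211.0, 215.0, 214.0]
rows = add_temporal_features(motor_temp_f)
splits = list(expanding_window_splits(len(rows), n_splits=3))

for train_idx, test_idx in splits:
    print(f"train on rows 0..{train_idx[-1]}, validate on rows {test_idx[0]}..{test_idx[-1]}")
```

In practice the same features are usually built with pandas `rolling` and `shift`, and the splits with scikit-learn's `TimeSeriesSplit`; the pure-Python version is used here only to keep the sketch dependency-free.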
Supervisor(s)
Co-supervisor

DATA-DRIVEN PREDICTION AND EARLY DETECTION OF FLOW ASSURANCE CHALLENGES IN OIL AND GAS PIPELINES USING ENSEMBLE MACHINE LEARNING MODELS

Year of Publication
Publication Type
Abstract
Flow assurance remains a critical challenge in the oil and gas industry, where complex interactions among temperature, pressure, corrosion, and flow dynamics can lead to operational inefficiencies, production losses, or complete pipeline blockage. Traditional rule-based and thermodynamic models often fall short in capturing the nonlinear, multi-parameter dependencies underlying these challenges. In this study, a data-driven framework was developed for the prediction and early detection of flow assurance challenges in oil and gas pipelines using ensemble machine learning models.
A dataset comprising twenty-four operational and material parameters, including temperature, pressure, pipe size, flow rate, corrosion impact, and energy consumption, was analyzed to classify pipeline states into normal, moderate, and critical risk categories. Three algorithms were implemented: Random Forest, Support Vector Machine (SVM), and Gradient Boosting. Each was optimized through grid-based hyperparameter tuning and evaluated using cross-validation and standard performance metrics such as accuracy, precision, recall, F1-score, and confusion matrices.
The results indicated that all three models successfully identified key operational relationships influencing flow assurance risks. The Random Forest model achieved a high training accuracy of 98.71% but showed overfitting with a test accuracy of 40.33%. The SVM model achieved a test accuracy of 45.00% with a recall of 70.6% for critical conditions. The Gradient Boosting model outperformed both, achieving a cross-validation score of 47.71%, test accuracy of 49.33%, and recall of 97.2% for critical flow states, with minimal overfitting (accuracy gap of 4.10%).
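The metrics quoted above (overall accuracy, per-class recall, and the train/test accuracy gap used as an overfitting indicator) can be computed as in the following minimal sketch. The helper names and the toy labels are hypothetical and are not the study's data; only the Random Forest accuracies (98.71% training, 40.33% test) are taken from the text. The toy example also shows how a model can reach high recall on the critical class while overall accuracy stays modest, which is the pattern reported for Gradient Boosting.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred, classes):
    """For each class, the fraction of its true samples that were predicted correctly."""
    result = {}
    for c in classes:
        preds_for_class = [p for t, p in zip(y_true, y_pred) if t == c]
        if preds_for_class:
            result[c] = sum(p == c for p in preds_for_class) / len(preds_for_class)
        else:
            result[c] = float("nan")  # class absent from y_true
    return result


def accuracy_gap(train_accuracy, test_accuracy):
    """Train minus test accuracy, in percentage points; a large gap suggests overfitting."""
    return train_accuracy - test_accuracy


# Toy labels (made up for illustration): the model leans toward predicting "critical".
y_true = ["normal", "moderate", "critical", "critical", "normal", "moderate", "critical", "normal"]
y_pred = ["critical", "moderate", "critical", "critical", "normal", "critical", "critical", "moderate"]

overall = accuracy(y_true, y_pred)
recalls = recall_per_class(y_true, y_pred, ["normal", "moderate", "critical"])

# Random Forest accuracies as reported in the abstract.
rf_gap = accuracy_gap(98.71, 40.33)

print(f"overall accuracy: {overall:.3f}")
print(f"critical recall:  {recalls['critical']:.3f}")
print(f"Random Forest train/test gap: {rf_gap:.2f} percentage points")
```

Because a high critical-class recall can coexist with many false alarms on the other classes, recall is best read alongside precision and the confusion matrix, as the evaluation described above does.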
The study concludes that ensemble machine learning methods, particularly Gradient Boosting, offer a reliable and interpretable approach for predicting and classifying flow assurance challenges. By enabling early detection and proactive intervention, the proposed framework can support predictive maintenance, reduce pipeline downtime, and enhance operational safety and efficiency in oil and gas transportation systems.
Supervisor(s)
Co-supervisor