Estimating oil and gas recovery factors via machine learning: Database-dependent accuracy and reliability

Alireza Roustazadeh1, Behzad Ghanbarian1*, Mohammad B. Shadmand2, Vahid Taslimitehrani3, and Larry W. Lake4

1 Porous Media Research Lab, Department of Geology, Kansas State University, Manhattan 66506 KS, United States

2 Department of Electrical and Computer Engineering, College of Engineering, University of Illinois at Chicago, Chicago 60607 IL, United States

3 Staff Machine Learning Scientist at Realtor.com, San Francisco CA, United States

4 Hildebrand Department of Petroleum and Geosystems Engineering, University of Texas at Austin, Austin 78712 TX, United States

* Corresponding author’s email address: ghanbarian@ksu.edu

Abstract

With recent advances in artificial intelligence, machine learning (ML) approaches have become an attractive tool in petroleum engineering, particularly for reservoir characterization. A key reservoir property is the hydrocarbon recovery factor (RF), whose accurate estimation would provide decisive insights into drilling and production strategies. Therefore, this study aims to estimate the hydrocarbon RF for exploration from various reservoir characteristics, such as porosity, permeability, pressure, and water saturation, via ML. We applied three regression-based models, namely extreme gradient boosting (XGBoost), support vector machine (SVM), and stepwise multiple linear regression (MLR), and various combinations of three databases to construct ML models and estimate the oil and/or gas RF. Using two databases and the cross-validation method, we evaluated the performance of the ML models. In each iteration, 90% and 10% of the data were used to train and test the models, respectively. The third, independent database was then used to further assess the constructed models. For both oil and gas RFs, we found that the XGBoost model estimated the RF for the train and test datasets more accurately than the SVM and MLR models. However, the performance of all the models was unsatisfactory for the independent databases. The results demonstrated that the ML algorithms were highly dependent on and sensitive to the databases on which they were trained. Statistical tests revealed that such unsatisfactory performance arose because the distributions of input features and target variables in the train datasets were significantly different from those in the independent databases (p-value < 0.05).

Keywords: Hydrocarbon, Machine learning, Recovery factor, Regression, XGBoost

1. Introduction

Successful exploration and production of a hydrocarbon reservoir have always been challenging in petroleum engineering [1]. Because of high fluctuations in oil and gas prices, it is necessary to determine whether investing in a field is economically feasible. A conventional reservoir goes through different stages during its lifetime, namely exploration, appraisal, development, production, and lastly abandonment. Once a reservoir is discovered (exploration stage), the main challenge is to determine whether it has high potential to produce a substantial amount of hydrocarbon and subsequently yield enough profit. Even for a currently producing reservoir, one still needs to analyze its performance and production rate to ensure profits above economic margins.

The performance of a reservoir influences its economic feasibility and is a function of reservoir quality. Reservoir performance may be quantified by the initial production rate of a reservoir and the decline in its production rate. Another indicator is the fraction of the initial volume of hydrocarbon in a reservoir that is recovered [2], also known as the recovery factor (RF). The RF, ranging between 0 and 1, is a quantity used to evaluate the fate of a reservoir. It is a function of displacement mechanisms, such as water drive, gas cap, rock compaction drive, solution gas, and gravity drainage [3]. Its value may be determined at different stages of a reservoir's lifetime. In this study, however, RF refers to the ultimate recovery factor.
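As a compact restatement of the definition above, the ultimate recovery factor can be written as a ratio. The symbols are illustrative assumptions rather than notation taken from this study: $N_p^{\mathrm{ult}}$ denotes the cumulative volume of hydrocarbon recovered by abandonment and $N$ the initial volume of hydrocarbon in place, both expressed in the same units.

```latex
\mathrm{RF} = \frac{N_p^{\mathrm{ult}}}{N}, \qquad 0 \le \mathrm{RF} \le 1
```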

Different methods exist to determine the RF during the lifetime of a reservoir. When a field is in its appraisal phase and no production data are available, analogs and empirical formulas, e.g., history matching and volumetric reserve estimation, are commonly used. Such methods, however, generally come with substantial uncertainties due to the lack of adequate analogs [4]. During the production phase, dynamic properties of a field vary with production and time and, thus, the RF determination is influenced by enhanced and improved recovery methods [5] as well as by more data being collected from a field [6]. As a field approaches the abandonment phase, the emphasis should be more on the economic margin and how close the field production is to such a margin. At the plateau production phase, the RF is typically determined by simulations and dynamic reservoir modeling. However, at late stages of production, when the plateau production is no longer in place, decline curve analysis should be applied. Various methods used at different field life stages may lead to different RF determinations and uncertainties [4]. In addition to decline curve analysis, one may apply the material balance method to determine the reserve and, accordingly, the recovery factor [4,6,7]. Both the material balance and production decline curve analysis methods calculate reserves and recovery factor from the performance of a field [8,9].
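To illustrate how a decline-curve calculation leads to a recovery factor, the following minimal sketch assumes the exponential form of the Arps decline model, in which the rate declines as q(t) = q_i exp(-D t) and the cumulative production down to an economic-limit rate q_limit is (q_i - q_limit) / D. The choice of the exponential form and all numerical values (initial rate, decline constant, economic limit, and original oil in place) are hypothetical and are not taken from this study.

```python
import math


def exponential_decline_rate(q_i, d, t):
    """Production rate at time t under exponential decline: q_i * exp(-d * t)."""
    return q_i * math.exp(-d * t)


def ultimate_recovery(q_i, d, q_limit):
    """Cumulative production until the rate falls to the economic limit q_limit."""
    return (q_i - q_limit) / d


# Hypothetical inputs, for illustration only.
q_i = 1000.0      # initial rate, bbl/day
d = 0.001         # nominal decline constant, 1/day
q_limit = 50.0    # economic-limit rate, bbl/day
ooip = 5.0e6      # original oil in place, bbl

eur = ultimate_recovery(q_i, d, q_limit)   # estimated ultimate recovery, bbl
rf = eur / ooip                            # ultimate recovery factor, about 0.19
t_limit = math.log(q_i / q_limit) / d      # time at which the economic limit is reached, days
```

With these assumed inputs the recovery factor is about 0.19. In practice, the decline constant would be fitted to observed production data rather than assumed.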

The aforementioned methods are time-consuming, and their results may have a wide range of uncertainties [10–12]. Moreover, the RF is a function of multiple variables, e.g., permeability, temperature, and gas-oil ratio (GOR), which may not necessarily be incorporated in such methods. In addition, those methods are not cost-efficient. With recent advances in artificial intelligence and data analytics, one may construct a machine learning-based model to estimate the RF from other available reservoir characteristics [13]. Such models, if implemented appropriately, may provide a cost-efficient and accurate platform as well as better insights into reservoir characterization and hydrocarbon production.

Machine learning (ML) techniques have been previously used to estimate the RF [14–17]. For a recent review, see Tahmasebi et al. [18]. For example, Lee and Lake [19] applied various methods including multiple linear regression (MLR), MLR with sequential feature selection, artificial neural networks (ANN), and a Bayesian network to estimate both oil and gas RFs. In their study, the ANN and MLR with sequential feature selection showed better performance than the other two methods. Aliyuda and Howell [4] applied the MLR and support vector machine (SVM) with a Gaussian kernel to 93 reservoirs: 75 from the Norwegian Continental Shelf (the Norwegian Sea, the Norwegian North Sea, and the Barents Sea) and the remaining 18 from the Viking Graben in the UK sector of the North Sea. Aliyuda and Howell [4] found that the SVM model outperformed the MLR model.

In another study, Chen et al. [16] applied the ANN approach to develop predictive oil RF models with different sets of input data using the TORIS database and identified 19 principal features (out of 70) in the construction of their ML model. Tewari et al. [20] used six different approaches, i.e., random tree, random forest, SVM, bagging, radial basis function, and multilayer perceptron, and compared them with their proposed ensemble estimator (E2) model. Ensemble methods are learning algorithms that construct multiple classifiers and make inferences using a weighted vote of their individual estimations [21]. Tewari et al. [20] showed that their E2-based model outperformed the other six methods.

In the literature, most ML studies divide a database into train and test splits. Although the performance of ML models is evaluated using the test dataset, an independent database with no data overlap is typically not used to further assess the accuracy and reliability of the constructed ML models. Furthermore, to the best of the authors’ knowledge, the extreme gradient boosting algorithm has neither been applied to estimate the hydrocarbon recovery factor at the reservoir scale, nor has it been compared with support vector machine and multiple linear regression. Therefore, the main objectives of this study are to: (1) apply the extreme gradient boosting (XGBoost) approach to develop an ML-based model, (2) compare its performance with that of multiple linear regression (MLR) and support vector machine (SVM) in the estimation of oil and gas RFs using reservoirs from around the world, and (3) address the database dependence of accuracy and uncertainty in ML-based models.
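The distinction between a within-database test split and a truly independent database can be sketched in a few lines of Python. The example below builds ten cross-validation folds (so that each iteration trains on roughly 90% and tests on roughly 10% of the data) and then checks whether a feature is distributed differently in an independent database. The two-sample Kolmogorov-Smirnov test is used here only as an assumed stand-in, since the specific statistical test is not named in this part of the paper. The helper names and the synthetic porosity values are hypothetical.

```python
import math
import random
from bisect import bisect_right


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists; with 10 folds each test fold holds about 10% of the data."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for k in range(n_folds):
        test = idx[k::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def distributions_differ(a, b, c_alpha=1.36):
    """Large-sample KS decision at roughly the 5% significance level (c_alpha = 1.36)."""
    n, m = len(a), len(b)
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))


# Synthetic porosity values, for illustration only (not data from this study).
rng = random.Random(42)
porosity_train_db = [rng.uniform(0.05, 0.25) for _ in range(200)]
porosity_indep_db = [rng.uniform(0.20, 0.35) for _ in range(100)]  # deliberately shifted

folds = list(kfold_indices(len(porosity_train_db)))
shifted = distributions_differ(porosity_train_db, porosity_indep_db)
```

Because the test folds are drawn from the same database as the training folds, they share its distribution by construction. A shift such as the one flagged by `shifted` can only be detected by comparing against data that were collected separately.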

2. Materials and Methods

2.1. Databases

- Commercial database

The commercial database analyzed in this study is the same one that Lee and Lake [19] used. This database contains more than 1200 samples, including conventional reservoirs with different geologic formations from around the world. The oil RF values were reported for more than 600
