This paper uses multivariate regression to create a mathematical
model for iron skarn exploration in the Sarvian area, central Iran, as a
method of mineral prospectivity mapping (MPM). The main
aim of this paper is to apply multivariate regression analysis (as an MPM method) to the mapped iron outcrops in the northeastern part of the study area
in order to discover new iron deposits in other parts of the study area. Two
types of multivariate regression models using two linear equations were
employed to discover new mineral deposits. This method is one of the reliable
methods for processing satellite images. ASTER satellite images (14 bands)
were used as unique independent variables (UIVs), and iron outcrops were
mapped as dependent variables for MPM. According to the results of the probability value (

The remote-sensing layer is one of the significant data layers and is applicable at different levels of mineral exploration, especially the reconnaissance level. This data layer is processed with the most common techniques for the identification of minerals. Mineral exploration is a complex process (Gupta, 2003). Its complexities can be addressed by using remote-sensing techniques in the early stages of exploration to identify target areas for further exploratory operations. One of the most recognizable uses of remote sensing is mineral exploration and the identification of various geological structures, faults and lineaments, geological units, alterations, and indicator and tracer minerals (Melesse et al., 2007; Carranza, 2008; Abedi et al., 2013; Golshadi et al., 2016; Feizi and Mansouri, 2012). These factors play important roles in recognizing mineralization in the region of interest, so identifying them saves time and cost and gives a more precise result (Xiong and Zuo, 2017).

There are various remote-sensing techniques for recognizing minerals. In all of them, satellite images are processed with specific mathematical algorithms in order to generate useful information. This information can be integrated with other information and layers for the evaluation and interpretation of exploratory results (Li et al., 2015; Abedi et al., 2012; Bonham-Carter and Agterberg, 1990; Carranza, 2009; Carranza and Sadeghi, 2010; Ford and Blenkinsop, 2008; Lindsay et al., 2014; Lisitsin et al., 2013; Pan and Harris, 2000; Porwal et al., 2010; Feizi and Mansouri, 2013a).

One of the important factors in remote-sensing processes is the use of an appropriate algorithm and method. New image processing methods and algorithms continue to be developed and improved. Among these methods, regression analysis is significant because of its strong mathematical basis and its compatibility with geological data.

Multiple regression analyses have been used to identify stream sediment anomalies (e.g. Carranza, 2010a, b). Likewise, multivariate regression has been effectively utilized by Granian et al. (2015) to reveal subsurface mineralization from lithogeochemical information. Granian et al. (2015) used four types of multivariate regression models to depict significant surface geochemical anomalies indicating subsurface gold mineralization, with borehole data as dependent variables and surface lithogeochemical data as independent variables.

Based on previous work such as Allbed et al. (2014), modelling and mapping mineral potential from satellite image data using remote sensing and regression analysis is a promising approach: it facilitates timely, low-cost detection and allows decision makers to decide what action should be taken as the first step in mineral prospectivity mapping (MPM).

There are multiple types of regression analysis. Among these, multivariate regression analysis is used in this paper. In multivariate regression analysis, the relationships between independent and dependent variables are estimated in order to analyse the effects of the independent variables on the dependent variables. In remote sensing, this method can be used directly to model the mineralization outcrop points for further exploration and to find new prospective zones. One advantage of this method is that minerals can be identified directly and quickly without the need for other exploration layers.

The aim of this paper is to process satellite images using regression analysis and to apply the results to remote sensing and geological units. A further purpose is to recognize new mineralization in the region of interest by modelling mines and known deposits. These aims are reached by identifying geological dependent variables and finding relationships among them for the exploration of new deposits with acceptable accuracy in the study area.

The Sarvian iron ore deposit, with a reserve of 8 million tons, is a calcic iron skarn deposit. Because intrusive rocks and carbonate rocks occur in many parts of the study area, new iron skarn mineralization may be present. In this paper we used the regression method to identify new iron mineralization in other parts of the study area.

The main condition for applying regression analysis is the existence of a dependent variable. In this study, Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) satellite image pixels located in the northeastern part of the study area were considered to be dependent variables. ASTER satellite image pixels of other parts of the study area were considered to be independent variables.

Two types of multivariate regression models were used to find new mineral deposits: the 14 bands of ASTER satellite images were set as unique independent variables (UIVs), while iron outcrop area data (digitized from a 1 : 5000 geological map of the study area and from field data) were set as dependent variables.

The Sarvian area is in the Orūmīyeh-Dokhtar magmatic arc in central Iran (Fig. 1a). This magmatic arc is the most important metallogenic area in the district; it hosts large metal deposits such as lead, zinc, copper and iron. A set of crystallized limestone and dolomite units, dating to the Permian and Triassic, are the oldest geological units in the study area. Sedimentation of limestone and marl in the Qom formation occurred concurrently with continental sedimentation in the Oligocene. Most tectonic activity in the study area took the form of vertical movements, which caused instability in the basin and changed the depth of the sea. Vertical movements at the beginning of the Miocene caused considerable volcanic activity in the study area. Important magmatism occurred in the late Miocene and caused skarn mineralization where the intrusions were in contact with carbonate units of the Qom formation. The main fault of the study area is the Bidehend fault, a strike-slip fault with a length of 43 km that lies 10 km from the study area. The effect of this major fault on the study area is limited to the creation of parallel faults and fractures with the same direction as the Bidehend fault. There is no relationship between the skarn mineralization and faults in the Sarvian area because no mineralization has been reported in faults and fractures (Feizi et al., 2016, 2017).

Location of the Sarvian iron mine in the study area.

The study area is dominated by Eocene intrusive rocks and carbonates of the
Qom formation. Several types of metal and non-metal mineral ore deposits
have, up to now, been reported in the study area. According to the
1 : 100 000 geological map of Kahak, the lithology of this area includes
cream limestone with intercalations of marls (Qom formation), dark green,
andesitic–basaltic lava, volcanic breccia, hyaloclastic limestone, green
megaporphyritic andesitic–basaltic lava, rhyodacitic domes,
tonalite–quartz-diorite, microquartz-diorite–microquartz-monzodiorite,
granite–granodiorite, alternations between light green and grey
tuff, tuffaceous sandstone and shale with
the intercalation of nummulitic sandy limestone and andesitic lava, and

These relationships are demonstrated by the calcic iron skarn ore (Sarvian mine) in the northeast of the study area (Feizi et al., 2017) (Fig. 2). Skarn-type Fe mineralization and alteration are localized along the contact zone between intrusive rocks and carbonate sequences (Zuo et al., 2014).

Regression analysis, an appropriate statistical tool for uncovering
relationships between independent and dependent variables, was introduced into
geoscience by Granian et al. (2015). If dependent variables
are called (

Based on Granian et al. (2015), the following criteria were utilized for the examination of the regression analysis:

The variance and the mean of the random error should be a constant value and zero, respectively.

The coefficient of determination value which is called (

In Eq. (7), the mean of the variable is called (

Given the fact that adding independent variables to the model will increase
the

In Eq. (8),

In regression analyses, the

Wavelength ranges and spatial resolutions of ASTER bands (Abrams, 2000).

There are several iron ore bodies and one iron mine in the northeastern Sarvian study area. The regional geological conditions of the area suggest that the Sarvian iron mine is a good model for exploring the surrounding area. In this paper, a geology map of the mine is used as a training area for satellite imagery. In the training area, this method can model the iron outcrops (a dependent variable) based on ASTER satellite image bands (independent variables) (Fig. 3).
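The training step described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of a single linear equation with a constant term; the function names, the synthetic pixel values and the coefficients are hypothetical and are not the Sarvian data or the coefficients reported in this paper.

```python
import numpy as np

def fit_linear_model(bands, response):
    """Least-squares fit of response = b0 + b1*band1 + ... + bk*bandk.

    bands    : array of shape (n_pixels, n_bands), e.g. 14 ASTER bands
    response : array of shape (n_pixels,), the dependent variable per pixel
               (for example 1 for a mapped iron outcrop and 0 otherwise;
               this coding is an assumption of the sketch)
    Returns the coefficient vector [b0, b1, ..., bk].
    """
    design = np.column_stack([np.ones(len(bands)), bands])  # constant term first
    coeffs, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coeffs

def predict(bands, coeffs):
    """Apply the fitted linear equation to every pixel."""
    return coeffs[0] + bands @ coeffs[1:]

# Synthetic stand-in for the training pixels (NOT real ASTER data).
rng = np.random.default_rng(0)
n_pixels, n_bands = 500, 14
bands = rng.uniform(0.0, 1.0, size=(n_pixels, n_bands))
true_coeffs = rng.normal(size=n_bands + 1)
# Noise-free linear response, so the fit can be checked against true_coeffs.
response = true_coeffs[0] + bands @ true_coeffs[1:]

coeffs = fit_linear_model(bands, response)
scores = predict(bands, coeffs)
```

Once fitted on the training area, the same `predict` call could be applied to the band values of pixels elsewhere in the scene to produce a score for each pixel.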

The ASTER sensor was launched in December 1999 on board the US Terra satellite of the Earth Observing System (EOS). ASTER provides high-resolution images of the land surface, water, ice, and clouds using three separate sensor subsystems covering 14 multi-spectral bands from visible to thermal infrared (Table 1). Resolutions are 15, 30 and 90 m in the visible and near infrared (VNIR), shortwave infrared (SWIR) and thermal infrared (TIR), respectively. For more information see Feizi and Mansouri (2013b) and Mansouri and Feizi (2016).

In this study, after corrections, the SWIR and TIR bands were resampled to a pixel size of 15 m based on the VNIR3 band (panchromatic band). The layer stacking function was then used to build a new multiband file from georeferenced images of various pixel sizes, extents and projections (Mansouri et al., 2015). The images were acquired on 11 June 2002.
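The resampling and layer-stacking step can be sketched as follows. The sketch assumes nearest-neighbour resampling on already co-registered grids, which is only one possible choice (the resampling method is not stated here), and it uses tiny synthetic arrays rather than real ASTER bands. The factors 2 and 6 follow from the 30 m and 90 m pixel sizes relative to 15 m.

```python
import numpy as np

def upsample_nearest(band, factor):
    """Nearest-neighbour upsampling: each pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Toy scene covering the same 90 m x 90 m footprint at three pixel sizes.
vnir = np.arange(36, dtype=float).reshape(6, 6)  # 15 m pixels
swir = np.arange(9, dtype=float).reshape(3, 3)   # 30 m pixels
tir = np.array([[7.0]])                          # one 90 m pixel

# Bring every band to the 15 m grid, then stack into one multiband array.
stack = np.stack([
    vnir,
    upsample_nearest(swir, 2),  # 30 m -> 15 m
    upsample_nearest(tir, 6),   # 90 m -> 15 m
])

# Flatten to a (n_pixels, n_bands) table, one row of band values per pixel.
pixels = stack.reshape(stack.shape[0], -1).T
```

A table in this shape is a convenient input for a regression fit, with one column per band.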

Formula of regression models used for ASTER satellite image bands.

There are several iron veins and outcrops around the iron ore skarn mine in the northeastern part of the Sarvian area. Iron outcrops in the training area were mapped using a geological map on a scale of 1 : 1000 of the iron ore deposit. The map was then field checked. The shape file layer of iron outcrops was converted to a raster file with a pixel size of 15 m.

Multiple, factorial, polynomial and response surface regressions have been
utilized in many fields including the geosciences (e.g. Granian et al.,
2015). In this study, Model 1 (

Regression analyses were performed to assess the models in Table 2, and the
critical criteria mentioned above were examined. The

Table 4 presents the calculated coefficients of independent variables in regression models. Excluded independent variables are not mentioned in Table 4. Excluded variables were those that had no effect on iron mineralization and the mapped distribution of iron outcrops.
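The criteria used to examine the regression models (random-error mean, coefficient of determination and its adjusted form) can be illustrated with a short sketch. It uses the standard textbook definitions, which are assumed here and may differ in notation from the equations of this paper; the function name and the sample values are hypothetical.

```python
from statistics import fmean

def regression_diagnostics(observed, predicted, n_predictors):
    """Return residual mean, R^2 and adjusted R^2 (textbook definitions)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = fmean(observed)
    ss_res = sum(r * r for r in residuals)               # unexplained variation
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of independent variables in the model.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"residual_mean": fmean(residuals), "r2": r2, "adjusted_r2": adj_r2}

# Illustrative values only (not results from the Sarvian models).
observed = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
predicted = [0.1, -0.1, 0.9, 1.1, 0.0, 1.0]
report = regression_diagnostics(observed, predicted, n_predictors=2)
```

Because the adjustment grows with the number of predictors, the adjusted value is the more suitable of the two for comparing models that use different numbers of bands.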

We used several criteria to review the differences between the two models.
Firstly, the variance and the mean of the random error were acceptable for
both models. Secondly, based on Table 4, the

The value of

The

Because adding independent variables to the model will increase the

Thus, according to the results of the

A large part of the study area is composed of carbonate units of the Qom formation and intrusive rocks such as diorite, granodiorite and gabbro. These rock units increase the probability of skarn mineralization in the study area. The Sarvian iron ore, whose outcrop pixels are used as dependent variables in this paper, is also of skarn type. Field observations and the geological map of the area show contact between the intrusive units (diorite, granodiorite) and the host rocks (limestone and siltstone of the Qom formation). In this contact zone, the skarn unit was observed as a narrow strip. The main economic mineral of the skarn iron ores in this region is magnetite.

The calculated coefficients of regression models 1 and 2. CST indicates the constant.

To assess the accuracy of the selected model, the created prospectivity map was checked against the iron outcrops map in the northeastern part of the study area (Fig. 5). The locations of iron outcrops are in close agreement with predictions from the mineral prospectivity map. In addition, three target areas with very high potential were checked for iron outcrops, and the prospectivity map was confirmed by geological observations (Fig. 6). Based on field observation, iron mineralization occurs at contacts between limestone and intrusive rocks (skarn type). Iron mineralization consists dominantly of magnetite (Fig. 6). Therefore, the accuracy of the mineral prospectivity map is confirmed in the Sarvian area.
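The comparison between the prospectivity map and the mapped outcrops can be expressed as a simple agreement measure. The sketch below computes the fraction of outcrop pixels that fall in the high-prospectivity class; the function name, the threshold and the small grids are hypothetical and do not reproduce the values behind Figs. 5 and 6.

```python
def outcrop_hit_rate(prospectivity, outcrop_mask, threshold):
    """Fraction of mapped outcrop pixels whose score is >= threshold."""
    hits = total = 0
    for score_row, mask_row in zip(prospectivity, outcrop_mask):
        for score, is_outcrop in zip(score_row, mask_row):
            if is_outcrop:
                total += 1
                hits += score >= threshold
    return hits / total if total else float("nan")

# Illustrative 3 x 3 grids: model scores and a 0/1 raster of mapped outcrops.
prospectivity = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.1, 0.4, 0.6],
]
outcrop_mask = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
rate = outcrop_hit_rate(prospectivity, outcrop_mask, threshold=0.65)
```

A hit rate of this kind measures only how many known outcrops are recovered; it says nothing about false positives, which is why field checking of the target areas remains necessary.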

Mineral prospectivity map of the Sarvian area.

Mineral prospectivity map of the Sarvian area, which is confirmed by iron outcrops.

Satellite images consist of various bands, and each pixel has a specific value in each band. This quantitative information must be processed to reach the goal of interest. In remote sensing, selecting the appropriate method and algorithm is significant for obtaining the best results.

Mineral prospectivity map of the Sarvian area, which was confirmed by a field sample of the three target areas.

Remote-sensing methods are mostly either spectral based or pixel based. Within this categorization, various statistical and spectral methods are available. One of the methods that can be used in remote sensing is analytical regression. This method is a statistical process for estimating the relationships between variables.

The application of a multivariate regression method in remote sensing is a supervised method. The support vector machine (SVM) technique is a supervised approach which can be compared to multivariate regression analysis because both methods are supervised and based on regression functions.

The theory of SVM covers both classification and regression. It is a relatively recent approach that has performed well in recent years. In its basic form, SVM classification is linear: an appropriate separating line (more generally, a hyperplane) must be selected, and the method makes use of the empty space between the data. The SVM uses kernel functions to separate and classify classes. The further apart the kernel places the classes, the more accurate the classification. This distance is the maximum distance between the separating hyperplane and the closest samples of each class (Forkuor et al., 2017; Cheng and Bao, 2014).

The most important advantages of SVM are that it has good application in various fields and produces an optimal response. The most important disadvantages of SVM are mentioned below:

In the SVM method, an appropriate kernel must be selected. Selecting an inappropriate kernel causes errors in calculations and conclusions.

The accuracy is sensitive to the SVM parameters.

SVM has limitations in speed and time because an optimization problem needs to be solved.

This method may not provide good results for all data.

The similarity between SVM and the method used in this paper is that both are based on regression mathematics theory and functions, but there are also some differences between them. The SVM classifies and separates the categories, but the analytic regression method uses existing relationships and correlations between the data for introducing the best possible model for predicting the result.

The most important advantages of the analytical regression method are
mentioned below:

Almost all data can be used in this method.

The analytical regression method does not depend on a particular parameter, and it does not have any special restrictions like the SVM.

The analytical regression method does not have any limitations in speed and time.

Due to the fact that the model predicts the results according to the data as well as the relationships between the data, the results are closer to reality.

Forkuor et al. (2017) used four methods, i.e. multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM) and stochastic gradient boosting (SGB), to study soil properties in southwestern Burkina Faso. They reported results for all four methods and stated that, according to the model performance statistics, the other methods are preferable to methods based on regression. This statement does not hold for iron ore exploration in the Sarvian area: the regression analyses showed that all of the areas predicted by the appropriate regression model contain iron ore minerals.

Allbed et al. (2014) also used regression analyses to identify soil properties from satellite imagery and achieved good results. The difference between this paper and similar papers is the use of regression analyses in mineral exploration and in the generation of a mineral potential map.

To review the results of multivariate regression and to compare this method with other existing methods used in mineral exploration, our previous work, Feizi and Mansouri (2013b), is referenced for two reasons. Firstly, the northern part of the study area in this paper is similar to some areas of the southern part of the study area in Feizi and Mansouri (2013b); secondly, like this paper, Feizi and Mansouri (2013b) aimed at iron ore exploration. Feizi and Mansouri (2013b) used methods such as the Spectral Angle Mapper (SAM), principal component analysis (PCA), least-squares fit linear band prediction (LS-Fit), Minimum Noise Fraction Transform (MNF) and band ratio for iron ore exploration. According to the results obtained by these methods, the identified regions are iron oxide zones containing magnetite, hematite, goethite, limonite and jarosite minerals. In the Sarvian area, magnetite ore is more economical than the other minerals, and active mines richer in magnetite are more economical than those dominated by hematite, goethite, limonite and jarosite. The methods used by Feizi and Mansouri (2013b) therefore highlighted, along with magnetite, iron oxide alterations and a variety of iron minerals such as hematite, goethite, limonite and jarosite, which are not significant or economically valuable. Identifying the areas with the most magnetite from those results would require a highly accurate field study, whereas the results of multivariate regression in this paper recognized magnetite areas accurately, which prevents wasting time and money. In this paper, the pixels of the magnetite veins of the Sarvian iron ore mine are taken as base pixels; therefore, the results obtained from this method delineate the magnetite anomalies in the study area.
Iron oxide alterations and a variety of uneconomical iron minerals such as hematite, goethite, limonite and jarosite were not observed in the study area in the results of multivariate regression. Thus, the multivariate regression method performs more accurately than the other methods mentioned, which saves time and money, particularly in the field study. It should be noted that the use of analytical regression in remote sensing is recent and needs further study, especially for different types of deposits in mineral exploration.

The novelty of this paper is the use of regression analyses in mineral exploration as well as the generation of a mineral potential map. For mineral exploration, various geo-data layers, such as geochemical, geophysical, remote sensing and geological geo-data layers, should be integrated into GIS, but the most important achievement of this method is that it can be used as a direct method for mineral exploration with the least requirement of other exploration layers. The direct detection of minerals such as copper, lead, zinc and some economically valuable minerals is difficult using remote sensing, but due to the accuracy of this method, these elements can be explored more easily than before. Selecting pixels as dependent variables has a direct effect on the results and is very important in regression analyses; therefore, the higher the resolution of the images, the more accurate the results will be.

The conclusions of this paper are as follows.

Regression analysis is an appropriate and direct method for mineral prospectivity mapping (MPM) with satellite image data. In this paper, the output of processed satellite images using regression analysis indicates the iron potential zones accurately.

The application of multivariate regression analysis (as an MPM method) was confirmed in the Sarvian area. This paper used multivariate regression to create a mathematical model (with reasonable accuracy) for iron mineral exploration in the region of interest.

Two types of multivariate regression models, in the form of two linear equations,
were employed to discover new mineral deposits. According to the results of
the

The accuracy of the model was confirmed by iron outcrop mapping and geological observations. Based on field observation, iron mineralization occurs at contacts between limestone and intrusive rocks (skarn type).

The results demonstrate that modelling and mapping satellite image data based on regression analysis and remote-sensing data is an efficient approach as it facilitates timely detection with a low-cost procedure and allows decision makers to decide what necessary action should be taken as the first step in an MPM field.

Because of the procedure described, the regression analysis method is a form of supervised classification. In this method, target spectra of the training area are used for modelling and MPM.

ASTER satellite image data used in this paper are available
at

The authors declare that they have no conflict of interest.

The authors would like to thank Amirabbas Karbalaei Ramezanali for his helpful suggestions.

Edited by: Marc Oliva
Reviewed by: Colin Pain and one anonymous referee