Hierarchical Cluster Analysis (HCA) of Water Sources: Case Study of Some Water Sources in Bui Division, Northwest Region of Cameroon
Fondzenyuy Vitalis Fonfo1* and Fondzenyuy Lionel Nyanchi2
1University of Dschang, Faculty of Science, Department of Earth Sciences, Cameroon
2University of Bamenda, College of Technology, Cameroon
*Corresponding author: Fondzenyuy Vitalis Fonfo, University of Dschang, Faculty of Science, Department of Earth Sciences, Cameroon.
Complex data with multiple variables can be simplified to reveal trends and relationships between the variables using Principal Component Analysis (PCA), whose significant variables can then be used to establish a Hierarchical Cluster Analysis (HCA). Sample bottles were cleaned and rinsed with water from the sampling sites and filled to the brim to avoid atmospheric oxidation. Five water catchments, 29 boreholes, 16 open wells, 16 rivers and 16 streams were sampled in this work, and the samples were taken immediately to the laboratory for physico-chemical analysis. The physico-chemical data were processed with the Statistical Package for the Social Sciences (SPSS), which converted the multiple variables into interpretable data in a correlation matrix. The Pearson correlation matrix used in this work reduced the data to correlation circles, geometrical projections whose values, the correlation coefficients, range from -1 to +1. Variables with correlation coefficients greater than 0.5 or less than -0.5, and closer to +1 or -1, are considered significant and are used in the Hierarchical Cluster Analysis. The HCA takes the form of a tree diagram called a dendrogram that arranges the variables in increasing order of significance, with similarly correlated variables placed in clusters. HCA is innovative and complements the existing PCA, focusing on the ions expected to have the greatest effect on water quality given their significance in the water sources. The Hierarchical Cluster Analysis is derived from, and viewed as an extension of, the Principal Component Analysis.
Keywords: Correlation Matrix, Hierarchical Components, Principal Components, Variable, Dendrogram
The evaluation of community water sources using modern analytical methods has become increasingly important, in Bui Division of the North West Region of Cameroon as elsewhere. Population growth, economic activity and the urbanization of most settlements have an impact on water sources. Many water-related studies have examined how human activity influences the physico-chemical variables that can compromise water quality [1,2]. Water quality and its suitability for diverse purposes remain a worldwide concern [2-10]. Water analysis with an innovative component such as Hierarchical Cluster Analysis (HCA) merits particular attention and was carried out in this work for some water sources in Bui Division, Northwest Region of Cameroon.
The groundwater chemistry of most water sources is the result of chemical interactions such as soil-water interactions, dissolution of primary minerals and rock-water interactions, together with anthropogenic inputs [11-13]. The importance of groundwater has prompted detailed studies of its geochemical evolution [14-17]. Evaluating and establishing correlations between the variables in water sources is important for understanding water quality.
The objective of this study was to establish a Hierarchical Cluster Analysis (HCA) for Bui water sources. Specifically, the physico-chemical results were used in a Principal Component Analysis (PCA) to obtain a Pearson correlation matrix with the correlation coefficients between the variables. The variables with significant correlation coefficients were used in the HCA to construct dendrograms for the studied water sources. The variables required and obtained from the sampled sites were temperature (°C), pH, EC, Na+, K+, Ca2+, Mg2+, HCO3-, NO3-, Cl-, SO42- and SiO2.
Location of the Study Zone
Bui Division is bounded by Donga Mantung to the north, Ngohketunjia to the south, Boyo to the west and Noun Division to the east. The locations of some sampling points are presented in Figure 1.
Figure 1: Location (a) Republic of Cameroon (b) Bui in the NWR (c) Sampling points
The Kilum Mountain range, commonly referred to as Mount Oku, is an important watershed in the division. It has a dendritic hydrographic network draining to the basins surrounding the massif. These basins include Ndop in Ngohketunjia Division and Mbam Oku in Bui. Another important watershed is the Belem highlands, from which water drains dendritically to the Kumbo Central, Nkum (Tatum) and Mbven (Mbiame) subdivisions. The Bui Plateau of the Bamenda Highlands is divided into six drainage basins that provide a major watershed for the Niger and Sanaga river systems. The streams in Bui Division are mainly of first and second order, as shown in Figure 2.
Figure 2: Hydrologic map of Bui Division 
The Bamenda Highlands, of which Bui Division is part, have a granitic basement covered by basalts and trachytes from Tertiary volcanism along the Cameroon Volcanic Line. The division is part of the Bamenda Highlands, a northward extension of the Bambouto Mountain, part of the continental Cameroon Volcanic Line (CVL). The dominant geologic formations of Bui are basalts and trachytes similar to those of the Bamenda Mountain, as reported by Kamgang. A synthetic review of the geological setting of the Bambouto Mountain by Kagou et al. indicated that the first stage, 21 million years ago, corresponded to the building of an initial basaltic shield volcano. The second stage, from 18.5 to 15.3 million years ago, was marked by the collapse of the caldera linked to the outpouring of ignimbrites, rhyolites and trachytes. The third stage, from 15 to 4.5 million years ago, saw renewed basaltic effusive activity together with post-caldera extrusions of trachytes and phonolites. In these works, the Bambouto Mountain is reported to be of volcanic origin, like its northern extension, the Bamenda Highlands, in which the study area lies. These results agree with field observations indicating that trachytes and basalts are the dominant rock types in Bui. Phonolites, rhyolites and ignimbrites are often associated with these rocks. A summary of the minerals in the volcanic rocks of Bui Division is presented in Table 1.
Table 1: Summary of the minerals in the volcanic rocks of Bui Division
|Rock|Minerals|
|Basalt|Plagioclase, olivine, pyroxene|
|Trachyte|Plagioclase, alkaline feldspar, pyroxene, amphibole, biotite|
|Rhyolite|Plagioclase, alkaline feldspar, quartz, amphibole, biotite|
|Ignimbrite|Plagioclase, quartz, biotite, alkaline feldspar, pyroxene, rocky inclusions|
Parent rock, climate, vegetation and relief are the main influences on the formation and nature of soils. The soils of the area result from the weathering of basalts and trachytes. Basalts weather to produce soils of reddish to brown colour, as seen around water sources in basaltic environments.
Materials and Methods
Sample bottles were cleaned and rinsed with the water sample from the collection site and finally filled to the brim to avoid any atmospheric oxidation. The samples were collected by means of a water pumping mechanism into an insulating bucket from which the sample bottles were filled for each borehole. Groundwater from active wells was collected using a rope tied to a bucket. Surface water (streams and rivers) was sampled as far as possible from the edges of the water bodies and as deep as possible along the flow path. The samples were stored in a special Thermos vaccine carrier that prevented possible modifications. The samples were taken immediately to the Laboratory for physico-chemical analysis.
The physico-chemical data were processed in the SPSS software package, in which the Principal Component Analysis (PCA) was run to obtain the Pearson correlation matrix. From the input data, the software produced a correlation matrix giving the correlation coefficient between each pair of variables, ranging from -1 to +1.
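As an illustration of what such a correlation matrix contains, the following minimal Python sketch computes Pearson coefficients for every pair of variables. It is not the SPSS procedure used in this work; the function names `pearson` and `correlation_matrix` and the sample values are hypothetical and chosen only for demonstration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(data):
    """Return {(var_i, var_j): r} for every ordered pair of variables."""
    names = list(data)
    return {(a, b): pearson(data[a], data[b]) for a in names for b in names}

# Hypothetical measurements for five samples (illustrative values only).
samples = {
    "EC":   [120.0, 150.0, 180.0, 210.0, 260.0],
    "Ca2+": [10.0, 13.0, 15.0, 19.0, 24.0],
    "pH":   [7.4, 7.1, 6.9, 6.8, 6.3],
}

matrix = correlation_matrix(samples)
for (a, b), r in matrix.items():
    print(f"{a:>5} vs {b:<5} r = {r:+.2f}")
```

In this invented data set EC and Ca2+ rise together (r close to +1), while pH falls as EC rises (r close to -1); each variable correlates perfectly with itself, so the diagonal of the matrix is 1.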
The Principal Components Analysis (PCA)
The analysed parameters from the water sources were correlated to determine how closely or distantly they are related in the water. The correlation coefficient is one of the indices that evaluate the strength of the relation between variables (Kim et al. and Bulut et al. [22,23]). A high positive correlation between elements generally indicates the possibility of a common source.
Tables 2 and 3 present the significant variables from the Pearson correlation matrix for water catchments (springs) and boreholes respectively. Tables 4 and 5 present the significant variables from the Pearson correlation matrix for open wells and rivers/streams respectively. These variables were used in the Hierarchical Cluster Analysis of the water samples.
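The selection of significant variables can be sketched as a simple filter on the correlation matrix, using the cut-off of 0.5 in absolute value stated in the abstract. The coefficients below are hypothetical, and the helper names `significant_pairs` and `significant_variables` are introduced here only for illustration.

```python
# Hypothetical correlation coefficients (illustrative, not the study's data).
correlations = {
    ("Ca2+", "Mg2+"): 0.82,
    ("Ca2+", "HCO3-"): 0.74,
    ("Na+", "K+"): 0.31,
    ("pH", "EC"): -0.63,
    ("NO3-", "SiO2"): -0.12,
}

THRESHOLD = 0.5  # significance cut-off on |r| quoted in the abstract

def significant_pairs(corr, threshold=THRESHOLD):
    """Keep pairs whose |r| exceeds the threshold, strongest first."""
    kept = {pair: r for pair, r in corr.items() if abs(r) > threshold}
    return sorted(kept.items(), key=lambda item: abs(item[1]), reverse=True)

def significant_variables(corr, threshold=THRESHOLD):
    """Variables that take part in at least one significant pair."""
    pairs = significant_pairs(corr, threshold)
    return sorted({name for pair, _ in pairs for name in pair})

for (a, b), r in significant_pairs(correlations):
    print(f"{a} / {b}: r = {r:+.2f}")
print(significant_variables(correlations))
```

With these invented values, Na+/K+ and NO3-/SiO2 fall below the cut-off and are dropped, while the strongly negative pH/EC pair is retained because the test is applied to the absolute value of r.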
The Hierarchical Cluster Analysis (HCA)
A Hierarchical Cluster Analysis of the variables was obtained in the form of a tree diagram, a dendrogram, that arranged the variables in increasing order of significance, with similarly correlated variables placed in clusters [25,26]. To construct a dendrogram, we made some assumptions and took the following steps.
- The variables were taken from the analysed water samples from the alternative water sources studied in Bui Division, North West Region of Cameroon.
- The variables used all had significant correlation strengths.
- The variables were linked with a bottom-to-top approach to construct dendrograms that related the variables in clusters. The height of each vertical line corresponded to the Pearson correlation coefficient value, while the horizontal lines represented the distance between the clusters, that is, their dissimilarity.
- The pairing of the variables was based on equal or close strengths (two variables with identical correlation coefficients were paired up, or an average was used for very close values). Next, cations of the same charge were paired (Na+/K+, Ca2+/Mg2+), as were the anions (HCO3-/Cl-/NO3-), followed by the physical parameters (pH/temperature, EC/TDS). Parameters such as NO3-, K+ and SO42-, which unlike the other parameters have an anthropogenic source, were identified and handled independently in the construction of the dendrograms.
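The bottom-to-top linking described in the steps above resembles standard agglomerative clustering. The sketch below is an assumption-laden illustration rather than the authors' manual procedure: it uses average linkage with the distance 1 - r, so strongly positively correlated variables merge first, and it omits the chemistry-based pairing rules and the separate handling of anthropogenic ions. The function name `average_linkage` and all coefficients are hypothetical.

```python
from itertools import combinations

def average_linkage(corr, names):
    """Bottom-up clustering of variables.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    in the order in which the merges happen (closest pair first).
    `corr` must hold a coefficient for every pair of names.
    """
    def r(a, b):
        return corr.get((a, b), corr.get((b, a)))

    def dist(c1, c2):
        # Average of 1 - r over all cross-cluster variable pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1.0 - r(a, b) for a, b in pairs) / len(pairs)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, round(dist(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical correlation coefficients (illustrative, not the study's data).
corr = {
    ("Ca2+", "Mg2+"): 0.9, ("Na+", "K+"): 0.8,
    ("Ca2+", "Na+"): 0.3, ("Ca2+", "K+"): 0.2,
    ("Mg2+", "Na+"): 0.4, ("Mg2+", "K+"): 0.1,
}

merges = average_linkage(corr, ["Ca2+", "Mg2+", "Na+", "K+"])
for left, right, height in merges:
    print(f"{'/'.join(left)} + {'/'.join(right)} joined at distance {height}")
```

Each merge corresponds to one junction of a dendrogram, and the recorded distance is the height at which the two branches meet. With 1 - r as the distance, negatively correlated variables lie more than one unit apart and are the last to be joined.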
All the sampled water sources had at least 11 variables, with 4 factors for springs, rivers and streams and 5 factors for boreholes and open wells. The significant variables were placed in clusters or, in some cases, treated as independent variables. The variables are expressed in dendrograms that correlate them and reveal their interrelations in Figures 3, 4, 5 and 6 for springs, boreholes, open wells, and rivers and streams respectively. Tables 2, 3, 4 and 5 contain the data for dendrogram construction for springs (catchments), boreholes, open wells, and rivers and streams respectively.
Table 2: Data for construction of dendrograms for springs (catchments)
Table 3: Data for the construction of Dendrograms for Boreholes
Three or four factors were used for each water source, giving three or four clusters each. The variables are correlated within each cluster, and the clusters are then linked from bottom to top to reveal the branching in the relationship. The number of variables in each cluster can be read directly from the dendrogram and varies with the strength (percentage contribution) of the factors concerned. Figures 3 and 4 present the dendrograms for springs (catchments) and boreholes respectively. Factors F1/F2, F1/F3 and F1/F4 were used to construct the dendrograms for springs (catchments). Temperature and NO3- exist as independent variables in Figure 3, in clusters 1 and 3. Temperature is an environmental factor that does not depend on other variables, while NO3- originates from fertilizer application and has neither dependence on nor correlation with other variables.
F1/F2 = 51.98 % (11 variables); F1/F3 = 42.79 % (9 variables); F1/F4 = 39.57 % (9 variables)
Figure 3: Dendrograms for Springs (Catchments)
F1/F2 = 35.56 % (8 variables); F1/F3 = 32.57 % (7 variables); F1/F4 = 30.30 % (5 variables); F1/F5 = 29.05 % (7 variables)
Figure 4: Dendrograms for Boreholes
The coefficients for the construction of the dendrograms for open wells and for rivers and streams in Figures 5 and 6 are given in Tables 4 and 5 respectively.
Table 4: Data for the construction of Dendrograms for open wells
Open wells had variables in 4 clusters defined by the corresponding factors, as shown in Figure 5, with K+ and Cl- as independent variables in clusters 3 and 4. Unlike other variables, K+ could originate from the application of potassium fertilizers in the vicinity of the open wells, and Cl- from the chlorination of some wells. This differs from the origin of most ions, which come from the weathering of rock minerals.
Table 5: Data for the construction of Dendrograms for Rivers and Streams
Rivers and streams are also presented in dendrograms, with 3 clusters and their corresponding factors, as shown in Figure 6.
F1/F2 = 46.93 % (8 variables); F1/F3 = 39.87 % (7 variables); F1/F4 = 37.14 % (6 variables); F2/F3 = 35.66 % (7 variables)
Figure 5: Dendrograms for Open wells
F1/F2 = 41.45 % (11 variables); F1/F3 = 36.83 % (9 variables); F1/F4 = 34.56 % (9 variables)
Figure 6: Dendrograms for Rivers (RW) and Streams (ST)
Overall, each cluster had a corresponding number of correlated variables. These variables and the factors responsible are given in tables for each water source, from which a dendrogram is constructed as a branching diagram that represents the relationships or similarity among the variables.
A hierarchical clustering of the sampled water bodies was carried out on the basis of the significant correlation coefficients obtained at each site, and these values provided the background for establishing the relationships between the variables expressed in the different dendrograms. Unfortunately, because of the heterogeneity of the different sites, the correlation coefficients obtained when all the water samples are analysed jointly are too low to give a meaningful picture of clustering for the entire study area. An analysis using variables with weak and insignificant correlation coefficients will yield weak and insignificant outcomes. Consequently, the PCA and HCA in this work are limited to the studied water sources and do not extend to the entire Bui Division.
It was clear that the greatest percentage contribution of the variables to the definition of the factors came from factors 1 and 2, as opposed to the other factors with lower percentage contributions. Factors F1 and F2 consequently dominated the data analysis. Overall, positively correlated variables are grouped together and evolve concomitantly, while negatively correlated variables are not grouped and evolve antagonistically. The Hierarchical Cluster Analysis uses data from the PCA; hence, the HCA is viewed as an extension of the Principal Component Analysis.
The vulnerability of the parameters that determine water quality calls for regular sampling and analysis to monitor their evolution and to suggest to the local authorities strategies for mitigating any disastrous outcome linked to water quality.
This article is part of the PhD research of the corresponding author in the Department of Earth Sciences of the Faculty of Science, University of Dschang. In Bui Division, NASCENT solution, in collaboration with CARITAS in the Diocese of Kumbo, has constructed boreholes in 29 primary schools under the food for education foundation programme of Aligning Literacy with Good Nutrition (ALIGN). This was in a bid to improve the quality of drinking water. These boreholes and other water sources were used for analysis in this work. Laboratory analysis was done at St Anne’s Physicochemical and Biomedical Laboratory in New Bell, Douala. The HI 83200 Multiparameter Bench Photometer, manufactured by Hanna Instruments Inc. in the Highland Industrial Park of the United States of America, was used for the physico-chemical parameters; it was innovative and efficient and kept our data analysis within very small error margins. The application of the SPSS software enhanced the correlation of the variables to establish a Hierarchical Cluster Analysis and construct dendrograms, attaining the objectives of this research.
Conflict of Interest
In the course of this work no conflict of interest was encountered. The authors collaborated fully, with the support of the Department of Earth Sciences of the University of Dschang, Cameroon.
The data from this work are available in the article and include physico-chemical data from the analysis of water samples from water catchments, boreholes, open wells, rivers and streams.
- Dechao C, Acef E, Hualian X, Xinliang X, Zhi O (2020) A Study in the Relationship between Land Use Change and Water Quality of the Mitidja watershed in Algeria Based on GIS and RS. Sustainable Water Quality Management in the Changing Environment 12: 3510.
- Mohammed H, Yangun S, Mohamed M, Weijin J, Majid K, et al. (2020) Assessment of Groundwater Resources in Coastal Areas of Pakistan for Sustainable Water Quality Management Using Joint Geophysical and Geotechnical Approach: A Case Study. Sustainable Water Quality Management in the Changing Environment 12: 9730.
- Sajil KPJ, Elango L, James EJ (2014) Assessment of Hydrochemistry and water quality in south Chennai coastal area. Arabian Journal of Geosciences 7: 2641-2653.
- Tyagi S, Singh P, Sharma B, Singh B (2014) Assessment of Water Quality for Drinking Purposes in District Pauri of Uttarakhand, India. Applied Ecology and Environmental Sciences 2: 94-99.
- Njoyim EBT, Mofor NA, Niba MLF, Sunjo J (2016) Physicochemical and bacteriological quality assessment of the Bambui Community drinking water in the North West Region of Cameroon. African Journal of Environmental Science and Technology 10: 181-191.
- Inam E, Inoh GG, Offiong NO, Etim BB (2017) Physico-chemical Characteristics and Health Risk Assessment of Drinking Water Sources in Okoroette Community, Eastern Coast of Nigeria. American Journal of Water Resources 5: 13-23.
- Germain KNG, Jules MOM, Narcisse KA, Aristide GD, Lanciné DG (2019) Caractérisation hydrogéochimique des eaux souterraines du bassin versant de la Baya, Est Côte d’Ivoire. International Journal of Biological and Chemical Sciences 13.
- WHO (2017) Guidelines for Drinking-Water Quality. 4th Edition, Incorporating the First Addendum. Geneva.
- WHO (2018) Developing drinking-water quality regulations and standards. Water sanitation hygiene. Publication on water sanitation and health.
- WHO (2019) UN-Water Global Analyses and Assessment of Sanitation and Drinking-Water (GLASS) 2019 Report.
- Ramesh K, Elango L (2012) Groundwater quality and its suitability for domestic and agricultural use in Tondiar river basin, Tamil Nadu, India. Environ. Monit. Assess 184: 3887-3899.
- Akoachere RA, Eyong TA, Egbe SE, Wotany RE, Nwude MO, et al. (2019) Geogenic Imprint on Groundwater and Its Quality in Parts of the Mamfe Basin, Manyu Division Cameroon. Journal of Geoscience and Environmental Protection 7: 184-211.
- Frommen T, Groeschke M, Noelscher M, Schneider M (2019) Anthropogenic and geogenic impacts on peri-urban aquifers in India - Insight from a Case Study in the Northeast of Jaipur (in prep.)
- Edmunds WM, Smedley PL (1996) Ground water geochemistry and health with reference to developing countries. Geological Society Special Publication No. 113: 91-105.
- Tanyileke GZ, Kusakabe, Evans WC (1996) Chemical and isotopic characteristics of fluids along the Cameroon Volcanic Line 433-441.
- Deutsch WJ (1997) Groundwater geochemistry. Fundamentals and applications to contamination. CRC Press. Florida 221.
- Appelo CAJ, Postma D (2005) Geochemistry, groundwater and pollution (2nd edition). Balkema, Amsterdam 635.
- Fondzenyuy VF, Kengni L (2021) Hydrogeochemistry of alternative water sources in Bui division, North west region of Cameroon: Implications on drinking, domestic and agricultural uses. PhD Thesis, University of Dschang, Faculty of sciences, department of Earth sciences.
- Suiven JPT (2020) Standardised Precipitation Valuation of Water Resources Vulnerability to Climate variability on the Bui Plateau, Northwest Cameroon. Environment and Ecology Research of (2): 83-92.
- Ngako V, Affaton P, Njonfang E (2008) Pan-African tectonic in North-Western Cameroon implication for history of Western Gondwana. Gondwana Res 14: 509-522.
- Kagou D, Nkoathio D, Pouclet A, Bardintzeff JM, Wandji K, et al. (2010) The discovery of late Quaternary basalts on Mount Bambouto: Implications for recent widespread volcanic activity in the Southern Cameroon line. Journal of the African Earth Science 57: 96-108.
- Kim JG, Ko KS, Kim TH, Lee GH, Song Y, et al. (2007) Effect of mining and geology on the chemistry of stream water and sediment in a small watershed. Geoscience Journal 11: 175-183.
- Bulut VN, Bayram A, Gundogdu A, Soylak M, Turfekei M (2009) Assessment of water quality parameters in the stream Galyan, Trabzon, Turkey. Environmental Monitoring Assessment 165: 1-13.
- Kaiser HF (1958) The varimax criterion for factor analytic rotation in factor analysis. Educational and Psychological Measurement 23: 770-773.
- Johnbosco E (2020) Ground water quality assessment using pollution index (PIG), Ecological Risk Index (ERI) and Hierarchical Cluster Analysis (HCA): A case study. Groundwater for Sustainable Development 10: 1000292.
- Zolfaghari F, Khosravi H, Shariyari A, Jabbari M, Abolhasani A (2020) Hierarchical Cluster Analysis to identify the homogenous desertification management units. PLoS ONE 10: 20226355.