Metagenomics Reveals Seasonality of Human Pathogenic Bacteria from Hand-Dug Well Water in the Cuvelai Etosha Basin of Namibia

Abbreviations: CEB: Cuvelai Etosha Basin; SASSCAL: Southern African Science Service Centre for Climate Change and Adaptive Land Management; WHO: World Health Organisation; pH: potential of hydrogen; OTU: operational taxonomic unit; H': Shannon-Wiener diversity; D: Simpson diversity; R: richness; E: evenness; NMS: Nonmetric Multidimensional Scaling; UTI: urinary tract infection; PCR: polymerase chain reaction.


INTRODUCTION
Namibia is a large country in southern Africa covering 823,680 km². 1 The country has a population slightly above 2 million, of which 73% live in rural areas and the rest in urban settings. Namibia is one of the driest countries in southern Africa, with unpredictable rainfall that falls only between October and May 2 . Owing to high temperatures, evaporation rates are high, leaving only a small fraction (2%) of rainfall available as surface water. 2 Amakali and Swatuk 3 reported that about 83% of rainfall evaporates almost immediately, 14% is lost through evapotranspiration, and only 1% recharges groundwater. Mendelson et al. 1 noted that Namibia has had an arid climate for millions of years, leading to water scarcity in most parts of the country. In addition, most rivers in Namibia are dry for much of the year and flow only after heavy rains. The transboundary rivers, namely the Kunene, Zambezi, Kavango and Orange, are perennial but lie far from the main centres of demand. Communities therefore rely on groundwater as a viable source of domestic water.
The Cuvelai Etosha Basin (CEB) is a densely populated area located in central northern Namibia, and consists of four regions namely Oshana, Oshikoto, Ohangwena and Omusati 4 .
This Basin harbours about half of the Namibian population, roughly 1 million people 4 . Given the water scarcity in the country, the Basin is not spared the hardships of inadequate water supply. Rural communities in the CEB therefore depend mostly on groundwater, which they access by constructing hand-dug wells. However, groundwater presents another problem: it is saline in most parts of the basin, and the situation is worsened by the lack of perennial rivers within the regions 5,6 . The CEB has three distinct forms of hand-dug well (Figs. 1, 2 and 3) that differ in structure. These structural differences determine whether or not animals have access. Only one report, by McBenedict et al. 7 , has described the water in these hand-dug wells as unsafe for human consumption. However, McBenedict et al. 7 used culture-dependent methods, which cannot reveal the total bacterial species in these hand-dug wells, unlike the metagenomics technique used in this follow-up study.
Metagenomic approaches circumvent the limitation of culture media, which do not adequately represent in situ conditions: only a small fraction (<1%) of bacteria can be cultured, which leads to a poor understanding of bacterial communities. Metagenomics is a technique that employs the sequencing and analysis of entire microbial community genomes to define and understand the genetic content of the environment in question. It is also termed environmental genomics owing to its ability to capture and analyse the total DNA of an environment 8 . As a sequence-based and functional analysis of the collective microbial genomes in a habitat, it can provide an inclusive measure of the genetic diversity, species composition, evolution and ecological functions of the species in microbial communities 9 .
It suffices to state that the WHO list of documented waterborne pathogens is not comprehensive, owing to the lack of widespread research on water pathogens using highly specific and effective techniques such as metagenomics. Hence, there is a need for more microbial research on water in order to reveal the vast range of microbial life forms and their interactions. This study comprehensively disclosed the human pathogenic bacteria inhabiting the hand-dug wells in the CEB, adds to the known pathogens for which water is a mode of transmission, and is the first to report the use of metagenomics in assessing the microbial quality of water in Namibia. The main aim of this study was therefore to investigate the influence of season on human pathogenic bacterial species richness, diversity and distribution in hand-dug wells of the CEB of Namibia.

Study sites and sample collections
The study targeted the three different structural types of hand-dug wells (Figs. 1-3). Although the CEB in Namibia comprises the Oshikoto, Omusati, Ohangwena and Oshana regions, water samples were collected from the Omusati and Ohangwena regions because they possess all three hand-dug well types. Convenience sampling was employed, targeting the areas in which hand-dug wells have been monitored for chemical water quality since 2014 under Southern African Science Service Centre for Climate Change and Adaptive Land Management (SASSCAL) task 007: Improving knowledge and understanding of groundwater flow, water quality and quantity variations, improve methodology of groundwater availability study in the Cuvelai. Water samples were collected following the standard water sample collection guidelines stipulated by the WHO, the University of Namibia and the University of Zambia. A total of 40 water samples were collected in sterile 200 ml bottles from the hand-dug wells: 20 in the wet season and 20 in the dry season. Water samples were collected from the same hand-dug wells in both seasons. The sterile bottles were tied to a rope and lowered into the hand-dug wells to collect water. The samples were then transported on ice to the University of Namibia for analysis. Prior to transportation, the physical parameters of the water were measured: temperature, electrical conductivity, redox potential and potential of hydrogen (pH).

DNA extraction and 16S rRNA gene amplification
Each 200 ml water sample was centrifuged at 100148 xg for 30 minutes to concentrate the microorganisms. The volume was then reduced to 40 ml by discarding a portion of the supernatant. DNA was extracted with the SEEPREP 12 TM kit (Seegene, Rockville, USA), after which the 16S rRNA gene was amplified by PCR using the universal primer set 27F (5' AGAGTTTGATCMTGGCTCAG 3') and 1492R (5' TACGGYTACCTTGTTACGACTT 3'). A Bio-Rad thermocycler (Hercules, CA) was used with the reaction conditions described by McBenedict et al. 7 . The amplicons were then prepared for a next-generation sequencing diversity assay using Illumina 16S sequencing at the MR. DNA next-generation sequencing provider in Texas, United States of America.
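Both primers contain IUPAC degenerate bases (M in 27F, Y in 1492R), so each primer is in practice a small mixture of oligonucleotides. The following Python sketch is offered only as an illustration and is not part of the workflow reported here; the function name `expand_primer` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in (("27F", PRIMER_27F), ("1492R", PRIMER_1492R)):
        print(name, expand_primer(seq))
```

Each primer has a single two-fold degenerate position, so each expands to two concrete sequences.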

PCR product preparation and sequencing
The amplicons from the step above were prepared for sequencing as described in the Illumina TruSeq DNA library preparation protocol. Illumina 16S sequencing (20k 2x300bp) was performed on a MiSeq at MR. DNA (www.mrdnalab.com, Shallowater, TX, USA) according to the manufacturer's guidelines, and sequence data were processed using a proprietary analysis pipeline from MR. DNA.

16S rRNA metagenomics data collection and analysis
The metagenomics sequence data obtained from the MR. DNA next-generation sequencing provider (Texas, United States of America) were processed and edited using a proprietary analysis pipeline (www.mrdnalab.com, MR. DNA, Shallowater, TX). The Q25 sequence data derived from the sequencing process were depleted of barcodes and primers, and sequences shorter than 150 bp were removed. In addition, sequences with ambiguous base calls and sequences with homopolymer runs exceeding 6 bp were removed. The sequences were then denoised and chimeras were removed. Operational taxonomic units (OTUs) were defined after removal of singleton sequences by clustering at 3% divergence (97% similarity), following other works 10 . OTUs were then taxonomically classified using BLASTn against curated GreenGenes, RDPII (http://rdp.cme.msu.edu) and NCBI (www.ncbi.nlm.nih.gov) databases and compiled into each taxonomic level 11 . The data were arranged into counts and percentages based on the number of sequences in each sample.
Journal of Pure and Applied Microbiology
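The read-level filters described above (minimum length, no ambiguous base calls, no long homopolymer runs) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the proprietary MR. DNA pipeline: the function names are hypothetical, and barcode and primer trimming, denoising and chimera removal are not covered.

```python
import re

MIN_LENGTH = 150      # sequences shorter than this are discarded
MAX_HOMOPOLYMER = 6   # runs longer than this many identical bases are discarded

# Matches one base followed by at least MAX_HOMOPOLYMER copies of itself,
# i.e. a homopolymer run of MAX_HOMOPOLYMER + 1 bases or more.
_LONG_RUN = re.compile(r"([ACGT])\1{%d,}" % MAX_HOMOPOLYMER)


def passes_filters(seq):
    """Return True if a read survives the three filters named in the text."""
    seq = seq.upper()
    if len(seq) < MIN_LENGTH:
        return False          # too short
    if set(seq) - set("ACGT"):
        return False          # ambiguous base call (e.g. N)
    if _LONG_RUN.search(seq):
        return False          # homopolymer run exceeding 6 bp
    return True


def filter_reads(reads):
    """Keep only the reads that pass every filter, preserving order."""
    return [r for r in reads if passes_filters(r)]
```

Applying `filter_reads` to a list of sequence strings returns the subset that would proceed to denoising and OTU clustering in a pipeline of this kind.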

Construction of a phylogenetic tree
From the total identified OTUs, cleaned sequences of bacteria known to cause disease in humans were selected. A phylogenetic tree was constructed from these sequences using the Maximum Likelihood method 12 . Only the trees with the highest log likelihood were chosen, and the percentages of trees in which the associated taxa clustered together are shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. Bootstrapping was performed, and the consensus trees inferred from 1000 replicates were taken to represent the evolutionary history of the taxa analysed 13 . Branches corresponding to partitions reproduced in fewer than 70% of bootstrap replicates were collapsed, and the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is indicated next to the branches. The evolutionary analyses were conducted in MEGA 7 13 . A V. cholerae strain with accession number KJ725364.1 was retrieved from the NCBI website and used as the outgroup to root the phylogenetic tree.

Influence of season on the abundance of human bacterial pathogens
The human bacterial pathogens and their respective counts were identified and subjected to analysis. To evaluate the distribution of these pathogens between the wet and dry seasons, the data were entered into SPSS version 24 and assessed for normality using the Shapiro-Wilk and Kolmogorov-Smirnov tests, visual inspection of histograms and Normal Q-Q plots, and Z scores calculated from skewness and kurtosis. An independent-samples Mann-Whitney U test was then used to investigate the influence of season on the abundance of the human bacterial pathogens.
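For readers unfamiliar with the Mann-Whitney U test, the sketch below shows how the statistic is obtained from pooled ranks. It is an illustration only: the analysis in this study was run in SPSS, the function names are hypothetical, the example counts are invented, and the p-value uses the large-sample normal approximation without tie or continuity correction, so it will not match SPSS output exactly.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


if __name__ == "__main__":
    # Invented read counts for one pathogen, for illustration only.
    wet = [120, 95, 210, 180, 160]
    dry = [40, 55, 38, 90, 61]
    print(mann_whitney_u(wet, dry))
```

In practice a statistics package (for example `scipy.stats.mannwhitneyu`) would be preferable, since it handles ties and small samples more carefully.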

Bacterial species diversity, evenness and richness
Species diversity reflects both the number of species and the abundance of each species present in a particular area, while species richness is the number of species present in a particular area 14 . Species evenness is a measure of how similar the relative abundances of the different species in a particular area are 14 . Species diversity, evenness and richness are fundamental in determining ecosystem health, and in the present study they gave an indication of contamination levels. In each hand-dug well, bacterial species richness was counted, and Shannon-Wiener diversity indices, Simpson's diversity indices and species evenness were calculated using the formulas described in the literature 14,15 .
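The four indices can be computed from a vector of per-species counts. The sketch below assumes common textbook forms that may differ from the exact formulas in the cited references: Shannon-Wiener H' = -Σ p_i ln p_i, Simpson diversity in its Gini-Simpson form D = 1 - Σ p_i², and Pielou evenness E = H' / ln R. The function name and the example counts are hypothetical.

```python
import math


def diversity_indices(counts):
    """Return richness, Shannon-Wiener H', Simpson D and evenness E.

    Assumed forms: H' = -sum(p * ln p), D = 1 - sum(p^2), E = H' / ln(R).
    """
    counts = [c for c in counts if c > 0]   # absent species do not count
    total = sum(counts)
    richness = len(counts)
    props = [c / total for c in counts]
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1 - sum(p * p for p in props)
    # Evenness is undefined for fewer than two species; report 0.0 there.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {"R": richness, "H": shannon, "D": simpson, "E": evenness}


if __name__ == "__main__":
    # Invented counts for one well, for illustration only.
    print(diversity_indices([50, 30, 15, 5, 0]))
```

With four equally abundant species the sketch gives H' = ln 4, D = 0.75 and E = 1, which is a quick sanity check on the formulas.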
Possible differences in Shannon-Wiener diversity indices and species richness of human pathogenic bacteria between the wet and dry seasons were tested using a Paired sample t-test while differences in Simpson diversity indices and species evenness of human pathogenic bacteria between the wet and dry seasons were tested using the Mann-Whitney U test.
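Because the same wells were sampled in both seasons, the paired t-test works on the per-well differences. A minimal sketch of the test statistic follows; the function name and the example values are hypothetical, and the p-value is left to a t-distribution table or a statistics package (for example `scipy.stats.ttest_rel`).

```python
import math
from statistics import mean, stdev


def paired_t(wet, dry):
    """Return (t statistic, degrees of freedom) for paired observations."""
    if len(wet) != len(dry):
        raise ValueError("paired samples must have the same length")
    diffs = [w - d for w, d in zip(wet, dry)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses n - 1
    return mean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    # Invented Shannon-Wiener indices for four wells, for illustration only.
    wet = [2.0, 3.0, 4.0, 5.0]
    dry = [1.0, 1.0, 3.0, 3.0]
    print(paired_t(wet, dry))
```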

Physical parameters analysis
The physical parameters (pH, EC, temperature, redox potential) were entered into PC-ORD version 7 to investigate their influence on the abundance of human pathogenic bacterial species using Nonmetric Multidimensional Scaling (NMS) multivariate analysis. The NMS ordination was performed using the Sørensen (Bray-Curtis) distance measure, with 200 runs on the real data.
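The Sørensen (Bray-Curtis) distance that underlies the NMS ordination compares two samples by their species abundances. The sketch below illustrates the distance itself and a pairwise distance matrix; it does not reproduce the PC-ORD ordination, and the function names and example counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis distance: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    # Two empty samples are treated as identical.
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis distances."""
    return [[bray_curtis(s, t) for t in samples] for s in samples]


if __name__ == "__main__":
    # Invented abundances of three species in three wells.
    wells = [[10, 0, 5], [5, 5, 5], [0, 8, 2]]
    for row in distance_matrix(wells):
        print([round(d, 3) for d in row])
```

The distance is 0 for identical samples and 1 for samples that share no species, which is why it is widely used for community data containing many zeros.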

Phylogenetic tree
A total of 181 human pathogens were identified and used to generate a phylogenetic tree showing the relationships between the detected human pathogens (Fig. 4). [...] 4) did not form any clusters. The V. cholerae strain with accession number KJ725364.1 was retrieved from the NCBI website and used as the outgroup to root the human pathogen tree.

Bacterial species diversity, evenness and richness
The Kolmogorov-Smirnov test revealed that the Shannon-Wiener diversity indices and species richness data were normally distributed (P > 0.05). A paired-sample t-test revealed no significant difference in the Shannon-Wiener diversity (H') indices or species richness of human pathogenic bacteria between the wet and dry seasons (Table 2). Simpson diversity indices and species evenness data were not normally distributed (P < 0.05). According to the Wilcoxon rank test, there was no significant difference in Simpson diversity (D) across the wet and dry seasons. However, species evenness differed significantly between the wet and dry seasons.

Physical parameters
The pH values ranged from 7.18 to 8.31 in the wet season and from 5.68 to 8.34 in the dry season, and temperature ranged from 13.2 °C to 26.3 °C in the wet season and from 20.5 °C to 34.6 °C in the dry season. The electrical conductivity ranged from 54.5 µS/cm to 7240 µS/cm in the wet season and from 62.4 µS/cm to 14420 µS/cm in the dry season, and the redox potential ranged from -380.1 mV to 160.9 mV in the wet season. The NMS analysis ran for a total of 500 iterations, after which a final solution was reached with a stress of 11.807 and a final instability of 0.000001. The NMS results revealed that the main physical factors influencing the distribution of human pathogenic bacteria in hand-dug wells were temperature (r = -0.628, tau = -0.365) and pH (r = 0.645, tau = 0.401). pH was positively correlated with the wet season, while temperature was correlated with the dry season (Fig. 5).

DISCUSSION
Most clusters of the phylogenetic tree were formed by species belonging to the same genus, indicating their close relation. The detection of multiple closely related species in each genus confirmed intra-genus versatility. Human pathogenic bacterial species diversity and richness did not differ significantly between the wet and dry seasons, indicating that bacterial diversity and richness in hand-dug wells are independent of season. The sustained species diversity and richness are attributed to the poor structure of the wells, the lack of protection from animal access and the lack of a covered top throughout the year, which allow bacteria to be deposited into these wells. The easy access of livestock, other domestic and wild animals, and birds to the water exposes these hand-dug wells to different human pathogenic bacteria regardless of season.
The diversity and richness of human pathogenic bacteria were similar in the hand-dug wells in both seasons owing to continuous contact between soil and water in the wells. Since bacteria belonging to many different phyla have mostly been reported from soil 16 , it is plausible that most of the bacteria detected originated from the soil. Whether they were active or dormant cannot be determined, which is an inherent limitation of metagenomics. Moreover, water availability and nutrients are known to be among the main limiting factors influencing bacterial growth and survival 17 . The water-soil interface allows these bacteria to survive, thereby maintaining the diversity within the hand-dug wells. This agrees with Bull 18 , who reported that bacteria can grow and survive across various ranges of physicochemical parameters, with growth occurring at a slow rate when the environment is not optimal. The findings of this study are consistent with those of Sun et al. 19 , which showed a high diversity and richness of bacteria in a river in both the wet and dry seasons with no significant difference between the two seasons (P > 0.05). Sun et al. 19 recorded Shannon index values of 9.53 and 9.42 for the dry and wet seasons, respectively. The present study recorded lower Shannon index values than those reported by Sun et al. 19 but still displayed high diversity and richness in both seasons. Furthermore, the high diversity of human pathogenic bacteria in both seasons supports the view that bacteria have pronounced ecophysiological plasticity that permits them to adapt to various freshwater ecosystems and dynamic seasonal changes 19,20 .
The hand-dug wells contained animal droppings and visible floating debris, which provided a carbon-rich environment favourable for microbial growth. This carbon-rich environment, coupled with the soil-water interface, can enhance the ability of the microorganisms to cope with varying or fluctuating environmental conditions through the transfer and exchange of genes during microbial interactions. For example, Escherichia species were once described as unable to survive for lengthy periods outside the intestines of warm-blooded animals, but other works 21,22 have indicated that E. coli strains survive in soil and water that is not known to be faecally contaminated. This situation calls for the development of a more suitable indicator of recent faecal contamination and for further research to explore these emerging patterns.
The present study indicated a significant difference in the abundance and evenness of human pathogenic bacterial species between the dry and wet seasons. The wet season had higher abundances of human pathogenic bacterial species than the dry season, and human pathogenic bacterial species were more evenly distributed in the dry season than in the wet season. The variation in abundance is attributed to surface runoff and the downward transport of bacteria by water through the permeable soil layers in the wet season. Surface runoff carries various bacteria into the hand-dug wells, thereby elevating the abundance of some human pathogenic bacteria and leading to an uneven distribution of bacterial abundance in the wet season. Most of the detected pathogens were soil and gut bacteria, which is consistent with Thomas 23 , who reported that Namibia has one of the highest rates of open defecation among southern African countries. The poor architecture of these hand-dug wells and the livestock and human faeces found at sites near them promote the entry of faecal matter, especially in the wet season. However, it is worth noting that surface runoff mainly affects abundance and evenness rather than diversity and richness, particularly because the water is mostly in contact with soil. This is not to rule out that new species are introduced into these hand-dug wells, but to emphasize that their contribution is either negligible with respect to diversity or sustained throughout the year.
Abundance increases readily because water transports these contaminants, especially in the rainy season, when water penetrates the permeable soil layers and reaches the underlying aquifers shared by hand-dug wells in the same vicinity 24 . This eliminates potential differences in contamination levels that would otherwise exist between the different types of hand-dug wells. Furthermore, temperature was positively correlated with the dry season and negatively correlated with the wet season, while pH was positively correlated with the wet season and negatively correlated with the dry season. It is established that pH and temperature influence the growth and survival of bacteria. Optimal growth occurs at low, moderate or high pH depending on the species. Similarly, different bacterial species have different temperature ranges for optimal growth 25 . This agrees with the literature 16,18 , which identifies pH, salinity, water content, temperature, pressure and radiation as the major physicochemical parameters that regulate bacterial growth and survival.
H. parainfluenzae, L. lytica, L. pneumophila, L. sainthelensi, P. mendocina, P. oryzihabitans, P. putida, P. stutzeri and S. sonnei showed a significant difference in abundance between the wet and dry seasons. L. sainthelensi, P. oryzihabitans, P. putida, P. stutzeri and S. sonnei were more abundant in the wet season than in the dry season, demonstrating that the communities of the CEB are exposed to these pathogens to a greater extent in the wet season. The presence of Shigella species in these hand-dug wells is of grave concern because they are highly infectious: counts of 100 bacteria or fewer are adequate to induce shigellosis in humans 31 . H. parainfluenzae, L. lytica, L. pneumophila and P. mendocina were more abundant in the dry season than in the wet season, indicating that diseases caused by these species may be expected to surge in the dry season. However, there was no significant difference in the abundance of Citrobacter spp., L. jordanis and V. cholerae between the wet and dry seasons, indicating that the CEB communities are exposed to these pathogens continuously. This may explain the non-seasonal sporadic cholera outbreaks that occur in these communities and highlights the necessity of adhering to hygiene practices and implementing routine bacteriological analysis of hand-dug well water. The rest of the detected human pathogens are reported to mostly cause endocarditis, meningitis and bacteraemia.
Metagenomics provided extensive information on the microbial communities and the safety of the hand-dug wells in the CEB for household consumption by bypassing the limitations of culture-based methods, which cannot quantify the total natural diversity within a given habitat 32 . Since metagenomics is a PCR-based analysis of microbial diversity, it carries biases inherent to PCR applications that have been described by other studies 33,34,35,36 . It should also be noted that, because metagenomics, as opposed to metatranscriptomics, is a DNA-based technique, the microbial communities detected potentially included DNA from dead bacteria, thereby over-representing bacterial communities, or omitted some bacteria owing to DNA extraction difficulties, as described by Filippidou et al. 34 . This might have led to low coverage of less abundant taxa, known as "depth bias", and under-representation of certain taxa. The findings of this study lead to the conclusion that seasonality influences the evenness and overall abundance of human bacterial pathogens in hand-dug wells but not their species diversity and richness. Overall, this study showed that the water in hand-dug wells of the CEB is not safe for consumption and domestic use unless sanitized.

AVAILABILITY OF MATERIALS
The 16S rRNA sequence data generated in the current study are available in GenBank under the following accession numbers: