Current release info#

(PyPI version, conda-forge version, supported Python versions, downloads, platforms, and Gitter badges)

statista - statistics package#

statista is a Python statistics package. It provides statistical distributions (such as the generalized extreme value and Gumbel distributions) together with sensitivity-analysis tools.

Main Features#

  • Statistical distributions (generalized extreme value and Gumbel) with their probability density and cumulative distribution functions.

  • One-At-a-Time (OAT) sensitivity analysis.

Installation#

Stable release#

Please install statista in a virtual environment so that its requirements don’t tamper with your system’s Python.

conda#

The easiest way to install statista is using the conda package manager. statista is available in the conda-forge channel. To install it, you can use the following command:

  • conda install -c conda-forge statista

If this works, it will install statista with all dependencies, including Python and gdal, and you can skip the rest of the installation instructions.

Installing Python and gdal dependencies#

The main dependencies for statista are an installation of Python 3.9+ and gdal.

Installing Python#

For Python we recommend using the Anaconda Distribution for Python 3, which is available for download from https://www.anaconda.com/download/. The installer gives the option to add python to your PATH environment variable. We will assume in the instructions below that it is available in the path, such that python, pip, and conda are all available from the command line.

Note that there is no hard requirement specifically for Anaconda’s Python, but often it makes installation of required dependencies easier using the conda package manager.

Install as a conda environment#

The easiest and most robust way to install statista is by installing it in a separate conda environment. In the root repository directory there is an environment.yml file that lists all dependencies. Either use the environment.yml file from the master branch (please note that the master branch can change rapidly and break functionality without warning), or from one of the releases {release}.

Run this command to start installing all statista dependencies:

  • conda env create -f environment.yml

This creates a new environment with the name statista. To activate this environment in a session, run:

  • conda activate statista

For the installation of statista there are two options: from the Python Package Index (PyPI) or from GitHub. To install a release of statista from PyPI (available from release 2018.1):

  • pip install statista=={release}

From sources#

The sources for statista can be downloaded from the GitHub repo.

You can either clone the public repository:

$ git clone https://github.com/MAfarrag/statista.git

Or download the tarball:

$ curl -OJL https://github.com/MAfarrag/statista/tarball/main

Once you have a copy of the source, you can install it with:

$ python setup.py install

To install directly from GitHub (from the HEAD of the master branch):

  • pip install git+https://github.com/MAfarrag/statista.git

or from GitHub at a specific release:

  • pip install git+https://github.com/MAfarrag/statista.git@{release}

Now you should be able to start this environment’s Python with python; try import statista to check whether the package is installed.
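As a quick sanity check, a minimal sketch (the __version__ attribute is assumed here and not confirmed by this documentation):

import statista  # should import without errors if the installation succeeded

print(statista.__version__)  # assumes the package defines __version__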

More details on how to work with conda environments can be found here: https://conda.io/docs/user-guide/tasks/manage-environments.html

If you are planning to make changes and contribute to the development of statista, it is best to make a git clone of the repository and do an editable install in the location of your clone. This will not move a copy to your Python installation directory, but instead create a link in your Python installation pointing to the folder you installed it from, such that any changes you make there are directly reflected in your install.

  • git clone https://github.com/MAfarrag/statista.git

  • cd statista

  • conda activate statista

  • pip install -e .

Alternatively, if you want to avoid using git and simply want to test the latest version from the master branch, you can replace the first step by downloading a zip archive of the repository from GitHub (MAfarrag/statista).

Install using pip#

Besides the recommended conda environment setup described above, you can also install statista with pip. For the Python dependencies that are more difficult to install, it is best to use the conda package manager:

  • conda install numpy scipy gdal netcdf4 pyproj

You can check libraries.io to see the versions of these libraries.

Then install a release {release} of statista (available from release 2018.1) with pip:

  • pip install statista=={release}

Check if the installation is successful#

To check whether the installation was successful, go to the examples directory and run the following command:

  • python -m statista.*******

This should run without errors.

Note

This documentation was generated on Feb 17, 2023

Documentation for the development version: https://statista.readthedocs.org/en/latest/

Documentation for the stable version: https://statista.readthedocs.org/en/stable/

Distributions#

Generalized extreme value distribution (GEV)#

  • The generalized extreme value (GEV) distribution characterises the behaviour of ‘block maxima’.

Probability density function (pdf)#

f(x) = \frac{1}{\sigma}\ast{Q(x)}^{\xi+1}\ast e^{-Q(x)}

  • where Q(x) = \left[ 1 + \xi \left( \frac{x-\mu}{\sigma} \right) \right]^{-\frac{1}{\xi}} for \xi \neq 0 (and Q(x) = e^{-\frac{x-\mu}{\sigma}} for \xi = 0), and

    • \sigma is the scale parameter

    • \mu is the location parameter

    • \xi is the shape parameter

Cumulative distribution function (cdf)#

F(x)=e^{-Q(x)}
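As a quick numerical illustration of these formulas, the following sketch evaluates the pdf and cdf above directly and compares them with scipy.stats.genextreme. This uses SciPy rather than statista’s own API, and note that SciPy’s shape parameter c corresponds to -\xi in the notation used here:

import numpy as np
from scipy import stats

mu, sigma, xi = 0.0, 1.0, 0.2   # location, scale, shape (illustrative values)
x = np.linspace(-1.0, 5.0, 200)

# Q(x) = [1 + xi * (x - mu) / sigma] ** (-1 / xi) for xi != 0
q = (1 + xi * (x - mu) / sigma) ** (-1 / xi)
pdf = (1 / sigma) * q ** (xi + 1) * np.exp(-q)
cdf = np.exp(-q)

# SciPy's genextreme uses the opposite sign convention for the shape parameter (c = -xi)
print(np.allclose(pdf, stats.genextreme.pdf(x, -xi, loc=mu, scale=sigma)))   # True
print(np.allclose(cdf, stats.genextreme.cdf(x, -xi, loc=mu, scale=sigma)))   # True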

Gumbel Distribution#

Probability density function (pdf)#

f(x) = \frac{1}{\sigma} \ast {e}^{ -\left(\frac{x-\mu}{\sigma}\right) - {e}^{- \left(\frac{x-\mu}{\sigma}\right)} }

  • where \mu is the location parameter and \sigma is the scale parameter

Cumulative distribution function (cdf)#

F(x) = {e}^{- {e}^{- \left(\frac{x-\mu}{\sigma}\right)} }
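Similarly, a minimal check of the Gumbel formulas against scipy.stats.gumbel_r (again using SciPy, not necessarily statista’s own API):

import numpy as np
from scipy import stats

mu, sigma = 0.0, 1.0             # location and scale (illustrative values)
x = np.linspace(-3.0, 8.0, 200)

z = (x - mu) / sigma
pdf = (1 / sigma) * np.exp(-z - np.exp(-z))
cdf = np.exp(-np.exp(-z))

print(np.allclose(pdf, stats.gumbel_r.pdf(x, loc=mu, scale=sigma)))   # True
print(np.allclose(cdf, stats.gumbel_r.cdf(x, loc=mu, scale=sigma)))   # True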

Sensitivity Analysis (OAT)#

One of the simplest and most common approaches to sensitivity analysis is changing one factor at a time (OAT) and observing what effect this produces on the output.

OAT customarily involves:
  • moving one parameter while keeping the others at their baseline (nominal) values, then

  • returning the parameter to its nominal value and repeating the same procedure for each of the other parameters.

Sensitivity may then be measured by monitoring changes in the output. This appears to be a logical approach, as any change observed in the output will unambiguously be due to the single parameter changed. Furthermore, by changing one parameter at a time, one can keep all other parameters fixed to their central or baseline values. This increases the comparability of the results (all ‘effects’ are computed with reference to the same central point in space).
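To make the procedure concrete, here is a minimal, generic OAT sketch in plain NumPy for a toy model (this only illustrates the concept; it is not the SensitivityAnalysis class used later in this section):

import numpy as np

def model(params):
    """Toy model: a simple function of three parameters."""
    a, b, c = params
    return a ** 2 + 2 * b + np.sin(c)

baseline = np.array([1.0, 0.5, 0.3])   # nominal parameter values
lower = np.array([0.0, 0.0, 0.0])      # lower bounds
upper = np.array([2.0, 1.0, 1.0])      # upper bounds
n_values = 5                           # samples per parameter

for i, name in enumerate(["a", "b", "c"]):
    outputs = []
    for value in np.linspace(lower[i], upper[i], n_values):
        params = baseline.copy()
        params[i] = value              # vary one parameter only,
        outputs.append(model(params))  # all others stay at baseline
    print(name, "output range:", min(outputs), "-", max(outputs))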

If we want to check how sensitive the performance of the HBV hydrological model in predicting streamflow is to each parameter, One-At-a-Time sensitivity analysis is a useful method for this: OAT fixes the values of all parameters and, each time, changes the value of one parameter within its boundaries to check the result of the given function for different values of that input.

First of all, we run the HBV lumped model whose performance (based on the RMSE error) we want to test over a defined range for each parameter.

Steps:

Run the model#

import pandas as pd

import Hapi.rrm.hbv_bergestrom92 as HBVLumped
from Hapi.run import Run
from Hapi.catchment import Catchment
from Hapi.rrm.routing import Routing
import Hapi.statistics.performancecriteria as PC
from Hapi.statistics.sensitivityanalysis import SensitivityAnalysis as SA

Parameterpath = "/data/Lumped/Coello_Lumped2021-03-08_muskingum.txt"
Path = "/data/Lumped/"

### meteorological data
start = "2009-01-01"
end = "2011-12-31"
name = "Coello"
Coello = Catchment(name, start, end)
Coello.ReadLumpedInputs(Path + "meteo_data-MSWEP.csv")

### Basic_inputs
# catchment area
CatArea = 1530
# initial conditions: [Snow pack, Soil moisture, Upper zone, Lower Zone, Water content]
InitialCond = [0, 10, 10, 10, 0]

Coello.ReadLumpedModel(HBVLumped, CatArea, InitialCond)

### parameters
# no snow subroutine
Snow = 0
# if routing using Maxbas True, if Muskingum False
Maxbas = False
Coello.ReadParameters(Parameterpath, Snow, Maxbas=Maxbas)

parameters = pd.read_csv(Parameterpath, index_col=0, header=None)
parameters.rename(columns={1: "value"}, inplace=True)

# parameter bounds
UB = pd.read_csv(Path + "/UB-1-Muskinguk.txt", index_col=0, header=None)
parnames = UB.index
UB = UB[1].tolist()
LB = pd.read_csv(Path + "/LB-1-Muskinguk.txt", index_col=0, header=None)
LB = LB[1].tolist()
Coello.ReadParametersBounds(UB, LB, Snow)

# observed flow
Coello.ReadDischargeGauges(Path + "Qout_c.csv", fmt="%Y-%m-%d")

### Routing
Route = 1
# RoutingFn = Routing.TriangularRouting2
RoutingFn = Routing.Muskingum

### run the model
Run.RunLumped(Coello, Route, RoutingFn)

  • Measure the performance of the baseline parameters

Metrics = dict()
Qobs = Coello.QGauges[Coello.QGauges.columns[0]]

Metrics['RMSE'] = PC.RMSE(Qobs, Coello.Qsim['q'])
Metrics['NSE'] = PC.NSE(Qobs, Coello.Qsim['q'])
Metrics['NSEhf'] = PC.NSEHF(Qobs, Coello.Qsim['q'])
Metrics['KGE'] = PC.KGE(Qobs, Coello.Qsim['q'])
Metrics['WB'] = PC.WB(Qobs, Coello.Qsim['q'])

print("RMSE= " + str(round(Metrics['RMSE'],2)))
print("NSE= " + str(round(Metrics['NSE'],2)))
print("NSEhf= " + str(round(Metrics['NSEhf'],2)))
print("KGE= " + str(round(Metrics['KGE'],2)))
print("WB= " + str(round(Metrics['WB'],2)))

Define wrapper function and type#

Define the wrapper function for the OAT method: put the parameters argument in the first position, and then list all the other arguments required by your function.

The wrapper function defined below calls two inner functions: the first calculates the discharge of the lumped HBV model, and the second calculates the RMSE of the calculated discharge.

The first function, Run.RunLumped, takes some arguments that we need to pass through the OAT method [ConceptualModel, data, p2, init_st, snow, Routing, RoutingFn], in the same order as in the defined “wrapper” function.

The second function, RMSE, takes the calculated discharge from the first function and the measured discharge array.

To define the arguments of the “wrapper” function:

 1. the random parameters variable of the first function should be the first argument, “wrapper(Randpar)”;

 2. then list the first function’s arguments in the same order (except that the parameter argument is taken out and placed in the first position, as in step 1);

 3. then list the arguments of the second function, in the same order that the second function takes them.

There are two types of wrappers. The first type returns one value (the performance metric):

# For Type 1
def WrapperType1(Randpar, Route, RoutingFn, Qobs):
    Coello.Parameters = Randpar

    Run.RunLumped(Coello, Route, RoutingFn)
    rmse = PC.RMSE(Qobs, Coello.Qsim['q'])
    return rmse

Instantiate the SensitivityAnalysis object#

# Type 1: the wrapper returns only the performance metric (use Type = 2 with WrapperType2 below)
Type = 1
fn = WrapperType1

# positions of the parameters to analyse (a single parameter here)
Positions = [10]

Sen = SA(parameters, Coello.LB, Coello.UB, fn, Positions, 5, Type=Type)

Run the OAT method#
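The source page leaves this step empty. Assuming the SensitivityAnalysis object exposes an OAT method that forwards the extra arguments to the wrapper function (an assumption; the exact call is not shown in this documentation), the run would look roughly like this sketch:

# hypothetical sketch: assumes Sen.OAT passes these arguments on to the wrapper
Sen.OAT(Route, RoutingFn, Qobs)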

Display the result with the SOBOL plot#

From = ''
To = ''

fig, ax1 = Sen.Sobol(RealValues=False, Title="Sensitivity Analysis of the RMSE to models parameters",
                     xlabel="Maxbas Values", ylabel="RMSE", From=From, To=To, xlabel2='Time',
                     ylabel2='Discharge m3/s', spaces=[None, None, None, None, None, None])

  • Type 1 with one parameter

_images/sensitivityAnalysis1.png
  • Type 1 with all parameters

_images/sensitivityAnalysis3.png

The second type#

  • The second wrapper returns two values (the performance metric and the calculated output from the model)

# For Type 2
def WrapperType2(Randpar, Route, RoutingFn, Qobs):
    Coello.Parameters = Randpar

    Run.RunLumped(Coello, Route, RoutingFn)
    rmse = PC.RMSE(Qobs, Coello.Qsim['q'])
    return rmse, Coello.Qsim['q']


fig, (ax1, ax2) = Sen.Sobol(RealValues=False, Title="Sensitivity Analysis of the RMSE to models parameters",
                            xlabel="Maxbas Values", ylabel="RMSE", From=From, To=To, xlabel2='Time',
                            ylabel2='Discharge m3/s', spaces=[None, None, None, None, None, None])
From = 0
To = len(Qobs.values)
ax2.plot(Qobs.values[From:To], label='Observed', color='red')

  • Type 2

_images/sensitivityAnalysis2.png