---
name: v.mrmr.py
description: Perform Minimum Redundancy Maximum Relevance Feature Selection on a GRASS Attribute Table
keywords: [  ]
---

# v.mrmr.py

Perform Minimum Redundancy Maximum Relevance Feature Selection on a GRASS Attribute Table

=== "Command line"

    **v.mrmr.py**
    **table**=*name*
    **layer**=*string*
    [**threshold**=*float*]
    **nfeatures**=*integer*
    **nsamples**=*integer*
    **maxvar**=*integer*
    **method**=*string*
    [**--verbose**]
    [**--quiet**]
    [**--qq**]
    [**--ui**]

    Example:

    ```sh
    v.mrmr.py table=name layer=1 nfeatures=50 nsamples=1000 maxvar=10000 method=MID
    ```

=== "Python (grass.script)"

    *grass.script.run_command*("***v.mrmr.py***",
        **table**,
        **layer**=*"1"*,
        **threshold**=*1.0*,
        **nfeatures**=*50*,
        **nsamples**=*1000*,
        **maxvar**=*10000*,
        **method**=*"MID"*,
        **verbose**=*None*,
        **quiet**=*None*,
        **superquiet**=*None*)

    Example:

    ```python
    gs.run_command("v.mrmr.py", table="name", layer="1", nfeatures=50, nsamples=1000, maxvar=10000, method="MID")
    ```

=== "Python (grass.tools)"

    *grass.tools.Tools.v_mrmr_py*(**table**,
        **layer**=*"1"*,
        **threshold**=*1.0*,
        **nfeatures**=*50*,
        **nsamples**=*1000*,
        **maxvar**=*10000*,
        **method**=*"MID"*,
        **verbose**=*None*,
        **quiet**=*None*,
        **superquiet**=*None*)

    Example:

    ```python
    tools = Tools()
    tools.v_mrmr_py(table="name", layer="1", nfeatures=50, nsamples=1000, maxvar=10000, method="MID")
    ```

    This grass.tools API is experimental in version 8.5 and expected to be stable in version 8.6.

## Parameters

=== "Command line"

    **table**=*name* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Name of input vector map  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features  
    **layer**=*string* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Layer number or name  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features can have category values in different layers. This number determines which layer to use. When used with direct OGR access this is the layer name.  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1*  
    **threshold**=*float*  
    &nbsp;&nbsp;&nbsp;&nbsp;Discretization threshold  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1.0*  
    **nfeatures**=*integer* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Number of features (attributes)  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *50*  
    **nsamples**=*integer* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of samples  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1000*  
    **maxvar**=*integer* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of variables/attributes  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *10000*  
    **method**=*string* **[required]**  
    &nbsp;&nbsp;&nbsp;&nbsp;Feature selection method  
    &nbsp;&nbsp;&nbsp;&nbsp;Allowed values: *MID, MIQ*  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *MID*  
    **--help**  
    &nbsp;&nbsp;&nbsp;&nbsp;Print usage summary  
    **--verbose**  
    &nbsp;&nbsp;&nbsp;&nbsp;Verbose module output  
    **--quiet**  
    &nbsp;&nbsp;&nbsp;&nbsp;Quiet module output  
    **--qq**  
    &nbsp;&nbsp;&nbsp;&nbsp;Very quiet module output  
    **--ui**  
    &nbsp;&nbsp;&nbsp;&nbsp;Force launching GUI dialog

=== "Python (grass.script)"

    **table** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Name of input vector map  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features  
    &nbsp;&nbsp;&nbsp;&nbsp;Used as: input, vector, *name*  
    **layer** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Layer number or name  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features can have category values in different layers. This number determines which layer to use. When used with direct OGR access this is the layer name.  
    &nbsp;&nbsp;&nbsp;&nbsp;Used as: input, layer  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1*  
    **threshold** : float, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Discretization threshold  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1.0*  
    **nfeatures** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Number of features (attributes)  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *50*  
    **nsamples** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of samples  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1000*  
    **maxvar** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of variables/attributes  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *10000*  
    **method** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Feature selection method  
    &nbsp;&nbsp;&nbsp;&nbsp;Allowed values: *MID, MIQ*  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *MID*  
    **verbose** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Verbose module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  
    **quiet** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Quiet module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  
    **superquiet** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Very quiet module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  

=== "Python (grass.tools)"

    **table** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Name of input vector map  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features  
    &nbsp;&nbsp;&nbsp;&nbsp;Used as: input, vector, *name*  
    **layer** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Layer number or name  
    &nbsp;&nbsp;&nbsp;&nbsp;Vector features can have category values in different layers. This number determines which layer to use. When used with direct OGR access this is the layer name.  
    &nbsp;&nbsp;&nbsp;&nbsp;Used as: input, layer  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1*  
    **threshold** : float, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Discretization threshold  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1.0*  
    **nfeatures** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Number of features (attributes)  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *50*  
    **nsamples** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of samples  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *1000*  
    **maxvar** : int, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Maximum number of variables/attributes  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *10000*  
    **method** : str, *required*  
    &nbsp;&nbsp;&nbsp;&nbsp;Feature selection method  
    &nbsp;&nbsp;&nbsp;&nbsp;Allowed values: *MID, MIQ*  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *MID*  
    **verbose** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Verbose module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  
    **quiet** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Quiet module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  
    **superquiet** : bool, *optional*  
    &nbsp;&nbsp;&nbsp;&nbsp;Very quiet module output  
    &nbsp;&nbsp;&nbsp;&nbsp;Default: *None*  

    Returns:

    **result** : grass.tools.support.ToolResult | None  
    If the tool produces text as standard output, a *ToolResult* object will be returned. Otherwise, `None` will be returned.

    Raises:

    *grass.tools.ToolError*: When the tool ended with an error.

## DESCRIPTION

***v.mrmr*** is a simple GUI for exporting data to the Minimum
Redundancy Maximum Relevance (mRMR) feature selection command line tool
(Peng et al., 2005). mRMR is designed to select features that have the
maximal statistical "dependency" on the classification variable, while
simultaneously minimizing the redundancy among the selected features.

## NOTES

The mRMR command line tool must be installed separately in a location
that the system recognizes, such as a directory on the PATH. Binaries
are available for Windows; on Linux and OS X the tool needs to be
compiled. Installation instructions are provided on [Peng's
Website](https://home.penglab.com/proj/mRMR).

The module requires the data in the vector attribute table to be arranged
in a specific order. The classification variable (i.e., the class labels)
needs to be in the first column, not counting the cat attribute, which is
not exported. The class labels also need to be numeric, i.e.,
1, 2, 3, ... rather than 'forest' or 'urban'. In addition, the attribute
table should not contain any missing values because they cause an
erroneous mRMR result.
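
The requirements above can be checked before running the module. The following is a minimal sketch, not part of v.mrmr: it assumes the attribute rows have already been read into plain Python lists with the cat column dropped, and the helper names `check_rows` and `encode_labels` are hypothetical.

```python
def check_rows(rows):
    """Return a list of problems found in the attribute rows.

    Assumption: each row is a sequence whose first item is the class
    label and whose remaining items are the feature values (the cat
    column has already been removed).
    """
    problems = []
    for i, row in enumerate(rows):
        # Missing values (NULL or empty string) are reported first.
        if any(value is None or value == "" for value in row):
            problems.append(f"row {i}: missing value")
            continue
        # The class label must be convertible to an integer.
        try:
            int(row[0])
        except (TypeError, ValueError):
            problems.append(f"row {i}: class label {row[0]!r} is not an integer")
    return problems


def encode_labels(labels):
    """Map text labels to integer codes 1, 2, 3, ... in sorted order."""
    codes = {name: code for code, name in enumerate(sorted(set(labels)), start=1)}
    return [codes[name] for name in labels], codes


rows = [[1, 0.5], ["forest", 0.2], [2, None]]
for problem in check_rows(rows):
    print(problem)

encoded, mapping = encode_labels(["urban", "forest", "urban"])
print(encoded, mapping)
```

Text labels such as 'forest' or 'urban' would need to be replaced by their integer codes in the attribute table itself before the table is passed to the module.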

The algorithm outputs a tab-separated list of attributes, ranked with the
most important feature first. The *method* parameter selects the feature
evaluation criterion: Mutual Information Difference (MID), which scores a
feature by its relevance minus its redundancy, or Mutual Information
Quotient (MIQ), which scores it by its relevance divided by its
redundancy. The algorithm also shows the ranking of the features based on
the conventional maximum relevance method. Additional options include
*nfeatures*, which specifies the number of features to select;
*nsamples*, which limits the number of samples used for the feature
selection; and *maxvar*, which limits the number of attributes. The
latter two can reduce the computation for very large datasets.
*threshold* is the discretization threshold applied to continuous
variables, i.e., mean +/- threshold \* standard deviation. *layer* is the
attribute layer to be used in the feature selection process.
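
To make the *threshold* and *method* options concrete, here is a minimal sketch of the two ideas, not the code used by the mRMR tool. It assumes a three-state coding (-1, 0, 1) around mean +/- threshold \* standard deviation and uses the population standard deviation; the tool's exact coding and estimator may differ. The function names `discretize` and `mrmr_score` are hypothetical.

```python
from statistics import mean, pstdev


def discretize(values, threshold=1.0):
    """Code each value as -1, 0 or 1 relative to mean +/- threshold * std.

    Assumption: population standard deviation and a three-state coding.
    """
    mu = mean(values)
    sigma = pstdev(values)
    lower = mu - threshold * sigma
    upper = mu + threshold * sigma
    return [-1 if v < lower else 1 if v > upper else 0 for v in values]


def mrmr_score(relevance, redundancy, method="MID"):
    """Combine relevance and redundancy into a single ranking score.

    MID is the difference and MIQ is the quotient of the two terms.
    MIQ is undefined when redundancy is zero and raises ZeroDivisionError.
    """
    if method == "MID":
        return relevance - redundancy
    if method == "MIQ":
        return relevance / redundancy
    raise ValueError(f"unknown method: {method!r}")


values = [1, 2, 3, 4, 100]
print(discretize(values, threshold=1.0))
print(discretize(values, threshold=0.5))
print(mrmr_score(0.8, 0.2, method="MID"))
print(mrmr_score(0.8, 0.2, method="MIQ"))
```

A smaller *threshold* narrows the middle band, so more values fall into the low and high states.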

## EXAMPLE

```sh
v.mrmr.py table=vector_layer layer=1 threshold=1.0 nfeatures=50 \
      nsamples=10000 maxvar=10000 method=MID
```

## REFERENCES

Peng, H.; Long, F.; Ding, C., "Feature selection based on mutual
information criteria of max-dependency, max-relevance, and
min-redundancy," IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 27, no. 8, pp. 1226-1238, Aug. 2005.

## AUTHOR

Steven Pawley

## SOURCE CODE

Available at: [v.mrmr source code](https://github.com/OSGeo/grass-addons/tree/grass8/src/vector/v.mrmr)
([history](https://github.com/OSGeo/grass-addons/commits/grass8/src/vector/v.mrmr))  
Latest change: Thursday Feb 20 13:02:26 2025 in commit [53de819](https://github.com/OSGeo/grass-addons/commit/53de8196a10ba5a8a9121898ce87861d227137e3)
