NumPy cov()

The numpy.cov() method estimates the covariance matrix, given data and weights.

cov() Syntax

The syntax of the numpy.cov() method is:

numpy.cov(array, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, dtype=None)

cov() Arguments

The numpy.cov() method takes the following arguments:

  • array - array containing numbers whose covariance is desired (can be array_like)
  • y (optional) - an additional set of variables and observations (array_like)
  • rowvar (optional) - If True, each row represents a variable, otherwise, each column represents a variable
  • bias (optional) - if True, normalizes by N instead of the default N - 1 (bool)
  • ddof (optional) - delta degrees of freedom; if given, overrides the normalization implied by bias (int)
  • fweights (optional) - integer frequency weights; the number of times each observation vector is repeated (array of int)
  • aweights (optional) - observation vector weights; larger weights mark observations as more important (array of float)
  • dtype (optional) - data type of the result

cov() Return Value

The numpy.cov() method returns a covariance matrix.


Covariance

Covariance is a statistical measure that describes the relationship between two random variables. It measures how changes in one variable are associated with changes in another variable.

Positive covariance means the variables tend to increase or decrease together, while negative covariance means they move in opposite directions.

A covariance of zero implies no linear relationship.


Example 1: Find the Covariance of an ndArray
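The original snippet is not shown here, so the following sketch uses two assumed 2×3 arrays, chosen so that the rows of the first move together and the rows of the second move in opposite directions, reproducing the output below.

```python
import numpy as np

# two variables (rows) that increase together
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# two variables (rows) that move in opposite directions
array2 = np.array([[1, 2, 3],
                   [6, 5, 4]])

# each row is treated as a variable (rowvar defaults to True)
print(np.cov(array1))
print()
print(np.cov(array2))
```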

Output

[[1. 1.]
 [1. 1.]]

[[ 1. -1.]
 [-1.  1.]]

Here, the two rows of array1 are perfectly correlated in the same direction, while the two rows of array2 are perfectly correlated in opposite directions.


Example 2: Specifying the Data Type of the Covariance Matrix

The dtype parameter can be used to control the data type of the covariance matrix.
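Since the original snippet is missing, here is one reconstruction that matches the output below; the 3×3 dataset is an assumed input (the same one reproduces the outputs of the later examples as well).

```python
import numpy as np

array1 = np.array([[1, 4, 8],
                   [2, 5, 7],
                   [3, 6, 9]])

# default result dtype (at least float64 precision)
print(np.cov(array1))
print()

# lower-precision result
print(np.cov(array1, dtype=np.float16))
```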

Output

[[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

[[12.336  8.664 10.5  ]
 [ 8.664  6.332  7.5  ]
 [10.5    7.5    9.   ]]

Note: Using a lower precision dtype, such as float16, can lead to a loss of accuracy.


Example 3: Using Optional rowvar Argument

If rowvar is set to True (default), each row represents a variable, with observations in the columns.

If rowvar is set to False, the relationship is transposed: each column represents a variable, while the rows contain observations.
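A sketch of both settings, assuming the same 3×3 dataset as in the previous example:

```python
import numpy as np

array1 = np.array([[1, 4, 8],
                   [2, 5, 7],
                   [3, 6, 9]])

# rows are variables, columns are observations (the default)
print("With rows as variables\n", np.cov(array1, rowvar=True), "\n")

# columns are variables, rows are observations
print("With columns as variables\n", np.cov(array1, rowvar=False))
```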

Output

With rows as variables
 [[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

With columns as variables
 [[1.  1.  0.5]
 [1.  1.  0.5]
 [0.5 0.5 1. ]]

Example 4: Create a Normalized Covariance Matrix

The optional argument bias specifies whether to normalize the covariance matrix and the argument ddof specifies the delta degrees of freedom.
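The code for this example is not shown; the sketch below, assuming the same 3×3 dataset as before, reproduces the output.

```python
import numpy as np

array1 = np.array([[1, 4, 8],
                   [2, 5, 7],
                   [3, 6, 9]])

# default: divide by N - 1
print("Unnormalized Covariance Matrix\n", np.cov(array1), "\n")

# bias=True: divide by N
print("Normalized Covariance Matrix\n", np.cov(array1, bias=True), "\n")

# ddof=2: divide by N - 2 (overrides bias)
print("Normalized Covariance Matrix With ddof = 2\n", np.cov(array1, ddof=2))
```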

Output

Unnormalized Covariance Matrix
[[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

Normalized Covariance Matrix
 [[8.22222222 5.77777778 7.        ]
 [5.77777778 4.22222222 5.        ]
 [7.         5.         6.        ]] 

Normalized Covariance Matrix With ddof = 2
[[24.66666667 17.33333333 21.        ]
 [17.33333333 12.66666667 15.        ]
 [21.         15.         18.        ]] 

Note: By default (ddof = None), the divisor is N - 1 when bias is False and N when bias is True. An explicit ddof overrides bias; for example, ddof = 1 reproduces the unnormalized matrix.


Example 5: Using Weights

The aweights and fweights parameters allow us to specify weights for the covariance estimate.
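Since the original snippet is missing, the dataset and weight vectors below are assumed choices that reproduce the output shown.

```python
import numpy as np

array1 = np.array([[1, 4, 8],
                   [2, 5, 7],
                   [3, 6, 9]])

fweights = np.array([2, 1, 3])   # how many times each observation is repeated
aweights = np.array([3, 1, 2])   # relative importance of each observation

print("Unweighted Covariance Matrix\n", np.cov(array1), "\n")

print("Covariance Matrix with Observation Vector weight\n",
      np.cov(array1, aweights=aweights), "\n")

print("Covariance Matrix with frequency weight\n",
      np.cov(array1, fweights=fweights), "\n")

print("Covariance Matrix with both weights\n",
      np.cov(array1, fweights=fweights, aweights=aweights))
```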

Output

Unweighted Covariance Matrix
[[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

Covariance Matrix with Observation Vector weight
 [[16.04545455 11.5        13.77272727]
 [11.5         8.40909091  9.95454545]
 [13.77272727  9.95454545 11.86363636]] 

Covariance Matrix with frequency weight
 [[12.   8.4 10.2]
 [ 8.4  6.   7.2]
 [10.2  7.2  8.7]] 

Covariance Matrix with both weights
 [[13.86956522  9.86956522 11.86956522]
 [ 9.86956522  7.08695652  8.47826087]
 [11.86956522  8.47826087 10.17391304]] 

Here,

  • aweights represents the observation vector weights, i.e. it quantifies the importance of each observation in the covariance estimate.
  • fweights represents the frequency weights, i.e. the number of times each observation vector is repeated.