LinAlg module¶

The linalg module offers the following routines to perform linear algebra operations:

Function	Description
`cond_num()`	Computes the condition number of a collection.
`matrix_rank()`	Computes the matrix rank of a collection.
`imd()`	Computes the interpolative matrix decomposition of a collection.

araucaria.linalg.la.cond_num(collection, taglist=['all'], region='xanes', range=[- inf, inf], ord=None, kweight=2)[source]¶

Computes the condition number of a collection.

Parameters

collection (Collection) – Collection with the groups to compute the condition number.
taglist (List[str]) – List with keys to filter groups based on their tags attributes in the Collection. The default is [‘all’].
region (str) – XAFS region to compute the condition number. Accepted values are ‘dxanes’, ‘xanes’, or ‘exafs’. The default is ‘xanes’.
range (list) – Domain range in absolute values. Energy units are expected for ‘dxanes’ or ‘xanes’, while wavenumber (k) units are expected for ‘exafs’. The default is [-inf, inf].
ord (Union[int, str, None]) – Order of the norm. See cond() for details. The default is None, which equals to 2-norm.
kweight (int) – Exponent for weighting chi(k) by k^kweight. Only valid for region='exafs'. The default is 2.

Return type

Dataset

Returns

Dataset with the following arguments:

cn : condition number.
groupnames : list with names of groups.
energy : array with energy values. Returned only if region='xanes or region=dxanes.
k : array with wavenumber values. Returned only if region='exafs'.
matrix : array with observed values for groups in range.
cond_pars : dictionary with parameters of calculation.

Notes

The condition number measures the amplification of the output as a function of a small change in the input parameters. For a linear system \(Ax=b\), the condition number \(\kappa(A)\) is formally defined as:

\[\kappa(A) = ||A^{-1}|| \text{ } ||A||\]

The interpretation of the condition number can be established with the following inequality 1:

\[\frac{|| \delta x ||}{||x||} \le \kappa(A) \frac{|| \delta b||}{||b||}\]

Therefore, if \(\kappa(A) = 10^k\) then one expects to lose \(k\) digits of precision in \(x\) when solving the linear system. Linear systems with large \(\kappa(A)\) are said to be ill-conditioned, and the obtained results can exhibit large errors.

References

1: Cheney, W. and Kincaid, D. (2008) “Numerical Mathematics and Computing”, 6th Edition, pp. 321.

Examples

>>> from araucaria.testdata import get_testpath
>>> from araucaria import Dataset
>>> from araucaria.xas import pre_edge, autobk
>>> from araucaria.linalg import cond_num
>>> from araucaria.io import read_collection_hdf5
>>> from araucaria.utils import check_objattrs
>>> fpath      = get_testpath('Fe_database.h5')
>>> collection = read_collection_hdf5(fpath)
>>> collection.apply(pre_edge)
>>> out        = cond_num(collection, region='xanes')
>>> attrs      = ['groupnames', 'energy', 'matrix', 'cond', 'cond_pars']
>>> check_objattrs(out, Dataset, attrs)
[True, True, True, True, True]
>>> print('%1.3f' % out.cond)
241.808

>>> # condition number of exafs
>>> collection.apply(autobk)
>>> out   = cond_num(collection, region='exafs', range=[0,10])
>>> attrs = ['groupnames', 'k', 'matrix', 'cond', 'cond_pars']
>>> check_objattrs(out, Dataset, attrs)
[True, True, True, True, True]
>>> print('%1.3f' % out.cond)
7.019

araucaria.linalg.la.imd(collection, taglist=['all'], region='xanes', range=[- inf, inf], eps=0.0001, kweight=2)[source]¶

Computes the interpolative matrix decomposition for a collection.

Parameters

collection (Collection) – Collection with the groups to compute the rank.
taglist (List[str]) – List with keys to filter groups based on their tags attributes in the Collection. The default is [‘all’].
region (str) – XAFS region to compute the rank. Accepted values are ‘dxanes’, ‘xanes’, or ‘exafs’. The default is ‘xanes’.
range (list) – Domain range in absolute values. Energy units are expected for ‘dxanes’ or ‘xanes’, while wavenumber (k) units are expected for ‘exafs’. The default is [-inf, inf].
eps (float) – Relative error of approximation. The default is 1e-4.
kweight (int) – Exponent for weighting chi(k) by k^kweight. Only valid for region='exafs'. The default is 2.

Return type

Dataset

Returns

Dataset with the following arguments:

rank : matrix rank.
idx : column index array.
proj : interpolation coefficients.
groupnames : list with names of groups.
energy : array with energy values. Returned only if region='xanes or region=dxanes.
k : array with wavenumber values. Returned only if region='exafs'.
matrix : array with observed values for groups in range.
id_pars : dictionary with parameters of calculation.

Notes

The interpolative decomposition (ID) of a \(m\) by \(n\) matrix \(A\) of rank \(k < min\{m,n\}\) is a factorization of the following form,

\[A\Pi = [A\Pi_1 \quad A\Pi_2 ] = A \Pi_1 [I \quad T]\]

where \(\Pi = [\Pi_1 \quad \Pi_2]\) is a permutation matrix, with \(\Pi_1\) being a \(k\) by \(n\) matrix.

The latter can be equivalently written as \(A = BP\), where \(B\) is known as the skeleton matrix, while \(P\) is known as the interpolation matrix.

Therefore, the original matrix can be factorized into an skeleton matrix that preserves the original matrix rank, while the rest of the original matrix can be reconstructed through the interpolation matrix.

Computation of the interpolative decompsition method is performed with the interp_decomp() function of scipy, which is based on the ID software package 2. Therefore, the skeleton matrix can be computed as follows:

B = A[:,idx[:rank]]

while the interpolation matrix can be computed as follows:

P = numpy.hstack([numpy.eye(rank), proj])[:, np.argsort(idx)]

References

2: P.G. Martinsson, V. Rokhlin, Y. Shkolnisky, M. Tygert. “ID: a software package for low-rank approximation of matrices via interpolative decompositions, version 0.2.” http://tygert.com/id_doc.4.pdf.

Example

>>> from araucaria.testdata import get_testpath
>>> from araucaria import Dataset
>>> from araucaria.xas import pre_edge, autobk
>>> from araucaria.linalg import imd
>>> from araucaria.io import read_collection_hdf5
>>> from araucaria.utils import check_objattrs
>>> fpath      = get_testpath('Fe_database.h5')
>>> collection = read_collection_hdf5(fpath)
>>> collection.apply(pre_edge)
>>> out        = imd(collection, region='xanes')
>>> attrs      = ['groupnames', 'energy', 'matrix', 'rank', 'idx', 'proj', 'id_pars']
>>> check_objattrs(out, Dataset, attrs)
[True, True, True, True, True, True, True]
>>> print(out.idx)
[0 1 3 2]

araucaria.linalg.la.matrix_rank(collection, taglist=['all'], region='xanes', range=[- inf, inf], eps=0.0001, kweight=2)[source]¶

Computes the matrix rank of a collection.

Parameters

collection (Collection) – Collection with the groups to compute the rank.
taglist (List[str]) – List with keys to filter groups based on their tags attributes in the Collection. The default is [‘all’].
region (str) – XAFS region to compute the rank. Accepted values are ‘dxanes’, ‘xanes’, or ‘exafs’. The default is ‘xanes’.
range (list) – Domain range in absolute values. Energy units are expected for ‘dxanes’ or ‘xanes’, while wavenumber (k) units are expected for ‘exafs’. The default is [-inf, inf].
eps (float) – Relative error for computation of rank. The default is 1e-4.
kweight (int) – Exponent for weighting chi(k) by k^kweight. Only valid for region='exafs'. The default is 2.

Return type

Dataset

Returns

Dataset with the following arguments:

rank : matrix rank.
groupnames : list with names of groups.
energy : array with energy values. Returned only if region='xanes or region=dxanes.
k : array with wavenumber values. Returned only if region='exafs'.
matrix : array with observed values for groups in range.
rank_pars : dictionary with parameters of calculation.

Notes

Computation of matrix rank is based on the interpolative decompsition method. See estimate_rank() for further details.

Examples

>>> from araucaria.testdata import get_testpath
>>> from araucaria import Dataset
>>> from araucaria.xas import pre_edge, autobk
>>> from araucaria.linalg import matrix_rank
>>> from araucaria.io import read_collection_hdf5
>>> from araucaria.utils import check_objattrs
>>> fpath      = get_testpath('Fe_database.h5')
>>> collection = read_collection_hdf5(fpath)
>>> collection.apply(pre_edge)
>>> out        = matrix_rank(collection, region='xanes')
>>> attrs      = ['groupnames', 'energy', 'matrix', 'rank', 'rank_pars']
>>> check_objattrs(out, Dataset, attrs)
[True, True, True, True, True]
>>> print(out.rank)
4

>>> # rank of exafs matrix
>>> collection.apply(autobk)
>>> out   = matrix_rank(collection, region='exafs', range=[0,10])
>>> attrs = ['groupnames', 'k', 'matrix', 'rank', 'rank_pars']
>>> check_objattrs(out, Dataset, attrs)
[True, True, True, True, True]
>>> print(out.rank)
4