Collection class
- class araucaria.main.collection.Collection(name=None)
Collection storage class.
This class stores a collection of Group objects.
- Parameters
name (str) – Name for the collection. The default is None.
Notes
Each group is stored as an attribute of the collection. The tags attribute classifies group names based on a tag key, which is useful for joint manipulation of groups.
The following methods are currently implemented:
Method          Description
add_group       Adds a group to the collection.
apply           Applies a function to groups in the collection.
copy            Returns a copy of the collection.
del_group       Deletes a group from the collection.
get_group       Returns a group in the collection.
get_mcer        Returns the minimum common energy range for the collection.
get_names       Returns group names in the collection.
get_tag         Returns the tag of a group in the collection.
rename_group    Renames a group in the collection.
retag           Modifies the tag of a group in the collection.
summary         Returns a summary report of the collection.
Warning
Each group can only have a single tag key.
Example
>>> from araucaria import Collection
>>> collection = Collection()
>>> type(collection)
<class 'araucaria.main.collection.Collection'>
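The storage scheme described in the Notes (groups kept as attributes, names classified by a single tag key) can be illustrated with a minimal stand-in. This is a sketch only: MiniCollection and MiniGroup are hypothetical names introduced here and do not reproduce araucaria's implementation.

```python
# Hypothetical stand-ins for illustration; not araucaria's classes.
class MiniGroup:
    def __init__(self, name):
        self.name = name


class MiniCollection:
    """Stores groups as attributes and tracks their names by tag."""

    def __init__(self, name=None):
        self.name = name
        self.tags = {}  # maps a tag key to a list of group names

    def add_group(self, group, tag='scan'):
        if hasattr(self, group.name):
            raise ValueError(f'{group.name} is already in the collection.')
        setattr(self, group.name, group)                  # group becomes an attribute
        self.tags.setdefault(tag, []).append(group.name)  # one tag key per group


collection = MiniCollection()
for gname in ('group1', 'group2'):
    collection.add_group(MiniGroup(gname))
collection.add_group(MiniGroup('group3'), tag='ref')

print(collection.tags)         # {'scan': ['group1', 'group2'], 'ref': ['group3']}
print(collection.group3.name)  # group3
```

Because each name is appended to exactly one list, a group never appears under two tag keys, which mirrors the warning above.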
- add_group(group, tag='scan')
Adds a group dataset to the collection.
- Parameters
- Return type
- Returns
- Raises
TypeError – If group is not a valid Group instance.
ValueError – If group.name is already in the collection.
Example
>>> from araucaria import Collection, Group
>>> from araucaria.utils import check_objattrs
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> for group in (g1, g2):
...     collection.add_group(group)
>>> check_objattrs(collection, Collection, attrlist=['group1','group2'])
[True, True]

>>> # using tags
>>> g3 = Group(**{'name': 'group3'})
>>> collection.add_group(g3, tag='ref')
>>> for key, value in collection.tags.items():
...     print(key, value, type(value))
scan ['group1', 'group2'] <class 'list'>
ref ['group3'] <class 'list'>
- apply(func, taglist=['all'], **kwargs)
Applies a function to groups in a collection.
- Parameters
- Return type
- Returns
- Raises
ValueError – If any item in taglist is not a key of the tags attribute.
Example
>>> from araucaria.testdata import get_testpath
>>> from araucaria.io import read_collection_hdf5
>>> from araucaria.xas import pre_edge
>>> fpath = get_testpath('Fe_database.h5')
>>> collection = read_collection_hdf5(fpath)
>>> collection.apply(pre_edge)
>>> report = collection.summary(optional=['e0'])
>>> report.show()
===============================================
id  dataset           tag   mode    n  e0
===============================================
1   FeIISO4_20K       scan  mu      5  7124.7
2   Fe_Foil           scan  mu_ref  5  7112
3   Ferrihydrite_20K  scan  mu      5  7127.4
4   Goethite_20K      scan  mu      5  7127.3
===============================================
- copy()
Returns a deep copy of the collection.
- Parameters
None –
- Return type
- Returns
Copy of the collection.
Example
>>> from numpy import allclose
>>> from araucaria import Group, Collection
>>> collection1 = Collection()
>>> content = {'name': 'group', 'energy': [1,2,3,4,5,6]}
>>> group = Group(**content)
>>> collection1.add_group(group)
>>> collection2 = collection1.copy()
>>> energy1 = collection1.get_group('group').energy
>>> energy2 = collection2.get_group('group').energy
>>> allclose(energy1, energy2)
True
- del_group(name)
Removes a group dataset from the collection.
- Parameters
name – Name of group to remove.
- Return type
- Returns
- Raises
TypeError – If name is not a group in the collection.
Example
>>> from araucaria import Collection, Group
>>> from araucaria.utils import check_objattrs
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> for group in (g1, g2):
...     collection.add_group(group)
>>> check_objattrs(collection, Collection, attrlist=['group1','group2'])
[True, True]
>>> collection.del_group('group2')
>>> check_objattrs(collection, Collection, attrlist=['group1','group2'])
[True, False]

>>> # verifying that the deleted group has no tag
>>> for key, value in collection.tags.items():
...     print(key, value)
scan ['group1']
- get_group(name)
Returns a group dataset from the collection.
- Parameters
name – Name of group to retrieve.
- Return type
- Returns
Requested group.
- Raises
TypeError – If name is not a group in the collection.
Important
Changes made to the group will be propagated to the collection. If you need a copy of the group, use the copy() method.
Example
>>> from araucaria import Collection, Group
>>> from araucaria.utils import check_objattrs
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> collection.add_group(g1)
>>> gcopy = collection.get_group('group1')
>>> check_objattrs(gcopy, Group)
True
>>> print(gcopy.name)
group1
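The difference between a returned reference and an independent copy can be shown with plain Python objects. This is a sketch under assumptions: MiniGroup and the store dictionary are hypothetical stand-ins, and the standard library's copy.deepcopy is used in place of the library's copy() method.

```python
from copy import deepcopy


class MiniGroup:
    """Hypothetical stand-in for a group with a mutable attribute."""

    def __init__(self, name, energy):
        self.name = name
        self.energy = energy


# A plain dict plays the role of the collection in this sketch.
store = {'group1': MiniGroup('group1', [1, 2, 3])}

same = store['group1']                   # a reference to the stored object
same.energy.append(4)                    # the change is visible in the store

independent = deepcopy(store['group1'])  # an independent copy
independent.energy.append(5)             # the store is not affected

print(store['group1'].energy)  # [1, 2, 3, 4]
print(independent.energy)      # [1, 2, 3, 4, 5]
```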
- get_mcer(num=None, taglist=['all'])
Returns the minimum common energy range for the collection.
- Parameters
- Return type
- Returns
Array containing the minimum common energy range.
- Raises
AttributeError – If energy is not an attribute of the requested groups.
ValueError – If any item in taglist is not a key of the tags attribute.
Notes
By default the returned array contains the lowest number of points available in the minimum common energy range of the groups.
Providing a value for num returns the desired number of equally spaced points over the minimum common energy range.
Examples
>>> from numpy import linspace
>>> from araucaria import Collection, Group
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1', 'energy': linspace(1000, 2000, 6)})
>>> g2 = Group(**{'name': 'group2', 'energy': linspace(1500, 2500, 11)})
>>> tags = ('scan', 'ref')
>>> for i, group in enumerate([g1, g2]):
...     collection.add_group(group, tag=tags[i])
>>> # mcer for tag 'scan'
>>> print(collection.get_mcer(taglist=['scan']))
[1000. 1200. 1400. 1600. 1800. 2000.]
>>> # mcer for tag 'ref'
>>> print(collection.get_mcer(taglist=['ref']))
[1500. 1600. 1700. 1800. 1900. 2000. 2100. 2200. 2300. 2400. 2500.]

>>> # mcer for 'all' groups
>>> print(collection.get_mcer())
[1600. 1800. 2000.]
>>> # mcer for 'all' groups explicitly
>>> print(collection.get_mcer(taglist=['scan', 'ref']))
[1600. 1800. 2000.]

>>> # mcer with given number of points
>>> print(collection.get_mcer(num=11))
[1500. 1550. 1600. 1650. 1700. 1750. 1800. 1850. 1900. 1950. 2000.]
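One way to read the Notes above is sketched here in plain Python. The functions linspace and min_common_energy_range are hypothetical helpers written for this illustration, not araucaria's code. The sketch assumes the common range runs from the largest minimum to the smallest maximum, and that the default result is the set of points from the group with the fewest points inside that range.

```python
def linspace(start, stop, num):
    """Plain-Python helper: num equally spaced points (assumes num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def min_common_energy_range(energies, num=None):
    """Sketch of a minimum common energy range over several energy arrays."""
    low = max(min(e) for e in energies)   # largest lower bound
    high = min(max(e) for e in energies)  # smallest upper bound
    if low > high:
        raise ValueError('the arrays share no common energy range.')
    if num is not None:
        return linspace(low, high, num)   # equally spaced points on request
    # default: points of the array with the fewest points inside the range
    inside = [[x for x in e if low <= x <= high] for e in energies]
    return min(inside, key=len)


e1 = linspace(1000, 2000, 6)   # same grids as in the example above
e2 = linspace(1500, 2500, 11)

print(min_common_energy_range([e1, e2]))  # [1600.0, 1800.0, 2000.0]
print(min_common_energy_range([e1, e2], num=3))  # [1500.0, 1750.0, 2000.0]
```

With these inputs the sketch reproduces the values printed by the doctest for both the default call and the call with num.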
- get_names(taglist=['all'])
Returns group names in the collection.
- Parameters
taglist (List[str]) – List with keys to filter groups in the collection based on the tags attribute. The default is ['all'].
- Return type
- Returns
List with group names in the collection.
- Raises
ValueError – If any item in taglist is not a key of the tags attribute.
Example
>>> from araucaria import Collection, Group
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> g3 = Group(**{'name': 'group3'})
>>> g4 = Group(**{'name': 'group4'})
>>> tags = ('scan', 'ref', 'ref', 'scan')
>>> for i, group in enumerate([g1, g2, g3, g4]):
...     collection.add_group(group, tag=tags[i])
>>> collection.get_names()
['group1', 'group2', 'group3', 'group4']
>>> collection.get_names(taglist=['scan'])
['group1', 'group4']
>>> collection.get_names(taglist=['ref'])
['group2', 'group3']
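The taglist filtering used by several methods can be sketched against a plain dictionary shaped like the tags attribute. The function names_for_tags is a hypothetical helper, not part of araucaria, and sorting the result is an assumption chosen to match the ordering shown in the example above.

```python
def names_for_tags(tags, taglist=('all',)):
    """Returns group names whose tag key is listed in taglist."""
    if list(taglist) == ['all']:
        keys = list(tags)              # every tag key is selected
    else:
        for key in taglist:
            if key not in tags:
                raise ValueError(f'{key} is not a key of the tags attribute.')
        keys = list(taglist)
    names = [name for key in keys for name in tags[key]]
    return sorted(names)               # assumption: names are returned sorted


tags = {'scan': ['group1', 'group4'], 'ref': ['group2', 'group3']}

print(names_for_tags(tags))           # ['group1', 'group2', 'group3', 'group4']
print(names_for_tags(tags, ['ref']))  # ['group2', 'group3']
```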
- get_tag(name)
Returns tag of a group in the collection.
- Parameters
name – Name of the group whose tag is retrieved.
- Return type
- Returns
Tag of the group.
- Raises
AttributeError – If name is not a group in the collection.
Example
>>> from araucaria import Collection, Group
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> tags = ('scan', 'ref')
>>> for i, group in enumerate([g1, g2]):
...     collection.add_group(group, tag=tags[i])
>>> print(collection.get_tag('group1'))
scan
>>> print(collection.get_tag('group2'))
ref
- rename_group(name, newname)
Renames a group in the collection.
- Parameters
- Return type
- Returns
- Raises
AttributeError – If name is not a group in the collection.
TypeError – If newname is not a string.
Example
>>> from araucaria import Collection, Group
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> for i, group in enumerate([g1, g2]):
...     collection.add_group(group)
>>> collection.rename_group('group1', 'group3')
>>> print(collection.get_names())
['group2', 'group3']
>>> print(collection.group3.name)
group3
- retag(name, tag)
Modifies tag of a group in the collection.
- Parameters
- Return type
- Returns
- Raises
AttributeError – If name is not a group in the collection.
Example
>>> from araucaria import Collection, Group
>>> collection = Collection()
>>> g1 = Group(**{'name': 'group1'})
>>> g2 = Group(**{'name': 'group2'})
>>> tags = ('scan', 'ref')
>>> for i, group in enumerate([g1, g2]):
...     collection.add_group(group, tag=tags[i])
>>> collection.retag('group1', 'ref')
>>> for key, value in collection.tags.items():
...     print(key, value)
ref ['group1', 'group2']
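The bookkeeping implied by the example above can be sketched on a plain dictionary shaped like the tags attribute. The function retag_name is a hypothetical helper, not araucaria's code. Dropping a tag key once its list is empty and keeping each list sorted are assumptions inferred from the printed output.

```python
def retag_name(tags, name, tag):
    """Moves name to the list stored under tag, in place."""
    for key in list(tags):
        if name in tags[key]:
            tags[key].remove(name)
            if not tags[key]:        # assumption: empty tag keys are dropped
                del tags[key]
            break
    else:
        raise AttributeError(f'collection has no {name} group.')
    tags.setdefault(tag, []).append(name)
    tags[tag].sort()                 # assumption: names are kept sorted


tags = {'scan': ['group1'], 'ref': ['group2']}
retag_name(tags, 'group1', 'ref')
print(tags)  # {'ref': ['group1', 'group2']}
```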
- summary(taglist=['all'], regex=None, optional=None)
Returns a summary report of groups in a collection.
- Parameters
taglist (List[str]) – List with keys to filter groups in the collection based on the tags attribute. The default is ['all'].
regex (Optional[str]) – Search string to filter results by group name. See Notes for details. The default is None.
optional (Optional[list]) – List with optional parameters. See Notes for details. The default is None.
- Return type
- Returns
Report for the groups in the collection.
- Raises
ValueError – If any item in taglist is not a key of the tags attribute.
Notes
Summary data includes the following:
Group index.
Group name.
Group tag.
Measurement mode.
Number of scans.
Merged scans, if optional=['merged_scans'].
Optional parameters if they exist as attributes in the group.
A regex value can be used to filter group names based on a regular expression (regex). For valid regex syntax, please check the documentation of the re module.
The number of scans and the names of merged files are retrieved from the merged_scans attribute of collection.
Optional parameters are retrieved from the groups as attributes. Currently only str, float or int values are retrieved. Otherwise an empty character is printed in the report.
See also
Examples
>>> from araucaria.testdata import get_testpath
>>> from araucaria.io import read_collection_hdf5
>>> fpath = get_testpath('Fe_database.h5')
>>> collection = read_collection_hdf5(fpath)
>>> # printing default summary
>>> report = collection.summary()
>>> report.show()
=======================================
id  dataset           tag   mode    n
=======================================
1   FeIISO4_20K       scan  mu      5
2   Fe_Foil           scan  mu_ref  5
3   Ferrihydrite_20K  scan  mu      5
4   Goethite_20K      scan  mu      5
=======================================

>>> # printing summary of dnd file with merged scans
>>> report = collection.summary(regex='Goe', optional=['merged_scans'])
>>> report.show()
=============================================================
id  dataset       tag   mode  n  merged_scans
=============================================================
1   Goethite_20K  scan  mu    5  20K_GOE_Fe_K_240.00000.xdi
                                 20K_GOE_Fe_K_240.00001.xdi
                                 20K_GOE_Fe_K_240.00002.xdi
                                 20K_GOE_Fe_K_240.00003.xdi
                                 20K_GOE_Fe_K_240.00004.xdi
=============================================================

>>> # printing custom summary
>>> from araucaria.testdata import get_testpath
>>> from araucaria import Collection
>>> from araucaria.io import read_xmu
>>> fpath = get_testpath('xmu_testfile.xmu')
>>> # extracting mu and mu_ref scans
>>> group_mu = read_xmu(fpath, scan='mu')
>>> # adding additional attributes
>>> group_mu.symbol = 'Zn'
>>> group_mu.temp = 25.0
>>> # saving in a collection
>>> collection = Collection()
>>> collection.add_group(group_mu)
>>> report = collection.summary(optional=['symbol','temp'])
>>> report.show()
===================================================
id  dataset           tag   mode  n  symbol  temp
===================================================
1   xmu_testfile.xmu  scan  mu    1  Zn      25
===================================================
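The two filters described in the Notes can be sketched with the standard library alone. Both filter_names and optional_cell are hypothetical helpers written for this illustration. Using re.search for matching is an assumption, as is returning an empty string for unsupported values. Number formatting in the real report is not reproduced.

```python
import re
from types import SimpleNamespace


def filter_names(names, regex=None):
    """Keeps the names matched by regex anywhere in the string."""
    if regex is None:
        return list(names)
    return [name for name in names if re.search(regex, name)]


def optional_cell(group, attr):
    """Returns a str, float or int attribute, or '' for anything else."""
    value = getattr(group, attr, None)
    if isinstance(value, (str, float, int)):
        return value
    return ''


names = ['FeIISO4_20K', 'Fe_Foil', 'Ferrihydrite_20K', 'Goethite_20K']
print(filter_names(names, regex='Goe'))  # ['Goethite_20K']

# SimpleNamespace stands in for a group carrying extra attributes
group = SimpleNamespace(name='xmu_testfile.xmu', symbol='Zn', temp=25.0, energy=[1, 2])
print([optional_cell(group, key) for key in ('symbol', 'temp', 'energy', 'missing')])
# ['Zn', 25.0, '', '']
```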