weaver module

class hidef.weaver.Weaver[source]

Class for constructing a hierarchical representation of a graph based on (a list of) input partitions.

property assignment

assignment matrix

delete_nodes(nodes, relabel=False)[source]

Deletes some nodes from the hierarchy. This can be used to delete clusters with low persistence and rebuild a simpler hierarchy.

Parameters
  • nodes (list of str) – names of the clusters to delete.

  • relabel (bool (default = False)) – if True, rename nodes. Leaving this as False may make it easier to track cluster identities.

depth_cluster(depth, flat=True)[source]

Recovers the partition at the specified depth.

Returns

H

Return type

a NumPy array of labels for all the terminal nodes.

level_cluster(level, flat=True)[source]

Recovers the partition specified by the level index in the hierarchy.

Returns

H

Return type

a NumPy array of labels for all the terminal nodes.

node_cluster(node, out=None)[source]

Recovers the cluster represented by a node in the hierarchy.

Returns

H

Return type

a NumPy array of labels for all the terminal nodes.

pick(top)[source]

Picks the top x percent of edges. Alternative edges are ranked by the number of terminal nodes shared between the child and the parent. This is the second step of weave(). Subclasses can override this function to achieve different results.

Parameters

top (int or float (0 ~ 100, default=100)) – the top x percent of (alternative) edges to keep in the hierarchy. This parameter controls the number of parents each node has, based on a global ranking of the edges. Note that if top=0, each node will have exactly one parent (except for the root, which has none).

Returns

Return type

networkx.DiGraph
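
The ranking-and-thresholding idea described for pick() can be sketched in plain Python. This is only an illustration under stated assumptions, not hidef's implementation: the helper name `pick_top_edges`, the tuple-based edge representation, and the split into "primary" edges (one per child, always kept) and "alternative" edges are hypothetical choices made for this sketch.

```python
import math


def pick_top_edges(primary, alternatives, top=100):
    """Keep every primary edge plus the top `top` percent of alternative edges.

    primary:      list of (parent, child) edges, one per child (always kept).
    alternatives: list of (parent, child, overlap) candidate extra edges, where
                  `overlap` is the number of shared terminal nodes.
    NOTE: hypothetical helper for illustration, not part of hidef.
    """
    if not 0 <= top <= 100:
        raise ValueError("top must be between 0 and 100")
    # Rank the alternative edges globally, largest overlap first.
    ranked = sorted(alternatives, key=lambda edge: edge[2], reverse=True)
    n_keep = math.floor(len(ranked) * top / 100)
    kept = [(parent, child) for parent, child, _ in ranked[:n_keep]]
    return list(primary) + kept


primary = [("root", "A"), ("root", "B"), ("A", "C"), ("B", "D")]
alternatives = [("B", "C", 3), ("A", "D", 5), ("root", "C", 1), ("root", "D", 2)]

edges_top0 = pick_top_edges(primary, alternatives, top=0)    # single parent per node
edges_top50 = pick_top_edges(primary, alternatives, top=50)  # two best alternatives kept
```

With top=0 only the primary edges survive, so each non-root node keeps exactly one parent, which matches the note in the parameter description. Larger values admit more alternative edges in rank order.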

show(**kwargs)[source]

Visualize the hierarchy using networkx/graphviz hierarchical layouts.

See also

show_hierarchy

property terminals

terminal nodes

weave(partitions, terminals=None, boolean=False, levels=False, **kwargs)[source]

Finds a directed acyclic graph that represents a hierarchy recovered from partitions.

Parameters
  • partitions (positional argument) – a list of different partitions of the graph. Each item in the list should be an array (NumPy array or list) of partition labels for the nodes. A root partition (where all nodes belong to one cluster) and a terminal partition (where each node belongs to its own cluster) will be added automatically later.

  • terminals (keyword argument, optional (default=None)) – a list of names for the graph nodes. If none is provided, an integer will be assigned to each node.

  • levels (keyword argument, optional (default=False)) – whether to assume the partitions are provided in some order. If set to True, the algorithm will only look for a node's parents in upper levels. The levels are assumed to be arranged from highest to lowest, e.g. partitions[0] is the highest level and partitions[-1] the lowest.

  • boolean (keyword argument, optional (default=False)) – whether the partition labels should be considered as boolean. If set to True, only the clusters labelled as True will be considered as a parent in the hierarchy.

  • merge (keyword argument, optional (default=False)) – whether to merge similar clusters. If one cluster is contained in another cluster (as determined by the “cutoff” parameter) and vice versa, the two clusters are deemed to be very similar. If set to True, such groups of clusters will be merged into one (by taking their union).

  • top (keyword argument (0 ~ 100, default=100)) – the top x percent of (alternative) edges to keep in the hierarchy. This parameter controls the number of parents each node has, based on a global ranking of the edges. Note that if top=0, each node will have exactly one parent (except for the root, which has none).

  • cutoff (keyword argument (0.5 ~ 1.0, default=0.75)) – containment index cutoff for claiming parenthood.

Returns

T

Return type

networkx.DiGraph
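
The roles of the partitions input, the cutoff parameter, and the merge option can be illustrated with a small stand-alone sketch. It assumes that the containment index of a child in a parent is the fraction of the child's members that also belong to the parent; that definition, the helper names, and the size comparison used to orient each edge are assumptions made here for illustration and may differ from what hidef does internally.

```python
def clusters_from_labels(labels):
    """Group node indices by partition label: {label: set of node indices}."""
    clusters = {}
    for node, label in enumerate(labels):
        clusters.setdefault(label, set()).add(node)
    return clusters


def containment_index(child, parent):
    """Fraction of the child's members that also belong to the parent (assumed definition)."""
    return len(child & parent) / len(child)


# Two partitions of six nodes, each given as a list of labels (coarse first).
partitions = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 2],
]
coarse = clusters_from_labels(partitions[0])
fine = clusters_from_labels(partitions[1])

cutoff = 0.75
# A coarse cluster is a candidate parent of a fine cluster if it is larger
# and contains at least `cutoff` of the fine cluster's members.
parent_edges = [
    (p_label, c_label)
    for c_label, c in fine.items()
    for p_label, p in coarse.items()
    if len(p) > len(c) and containment_index(c, p) >= cutoff
]

# Merge idea: two clusters that contain each other above the cutoff are
# treated as very similar and replaced by their union.
a, b = {0, 1, 2, 3}, {0, 1, 2, 3, 4}
mutually_contained = (
    containment_index(a, b) >= cutoff and containment_index(b, a) >= cutoff
)
merged = a | b if mutually_contained else None
```

In this toy input, fine cluster 1 (nodes 2, 3, 4) overlaps coarse cluster 0 with a containment index of only 2/3, so it gets no parent from the coarse partition at cutoff=0.75. A cluster like that can still attach to the root partition, which the parameter description says is added automatically.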

write(filename, format='ddot')[source]

Writes the hierarchy to a text file.

Parameters
  • filename (str) – the path and name of the output file.

  • format (str) – output format. The available option is “ddot”.

hidef.weaver.prune(T)[source]

Removes the nodes with only one child and the nodes that have no terminal nodes (e.g. genes) as descendants. (This basically removes identical clusters.)

Parameters

T (a weaver object) –
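
The two pruning rules can be sketched on a toy hierarchy stored as a plain dictionary. This is a hedged illustration rather than the library's code: `prune_sketch`, the dict-of-sets representation, and the choice to keep the child when splicing out a single-child node are all assumptions made for this example.

```python
def prune_sketch(children, terminals):
    """Return a pruned copy of a hierarchy.

    children:  dict mapping each internal node to the set of its child nodes.
    terminals: set of terminal (leaf) node names.
    NOTE: hypothetical helper for illustration, not part of hidef.
    """
    children = {node: set(kids) for node, kids in children.items()}

    def has_terminal(node):
        """True if the node is a terminal or has a terminal descendant."""
        if node in terminals:
            return True
        return any(has_terminal(kid) for kid in children.get(node, ()))

    # Rule 1: drop internal nodes that have no terminal descendants.
    dead = {node for node in children if not has_terminal(node)}
    for node in dead:
        del children[node]
    for kids in children.values():
        kids -= dead

    # Rule 2: splice out nodes with exactly one child, reconnecting
    # that child to the removed node's parents.
    changed = True
    while changed:
        changed = False
        for node, kids in list(children.items()):
            if len(kids) == 1:
                (only_child,) = kids
                for parent_kids in children.values():
                    if node in parent_kids:
                        parent_kids.discard(node)
                        parent_kids.add(only_child)
                del children[node]
                changed = True
                break  # restart the scan, since the dict was modified
    return children


hierarchy = {
    "root": {"A", "C", "E"},
    "A": {"B"},            # A and B cover the same terminals
    "B": {"g1", "g2"},
    "C": {"g3", "g4"},
    "E": set(),            # no terminal descendants
}
pruned = prune_sketch(hierarchy, terminals={"g1", "g2", "g3", "g4"})
```

Here "E" is removed because it reaches no terminal, and "A" is removed because its only child "B" represents the same set of terminals, leaving "root" with children "B" and "C".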

hidef.weaver.show_hierarchy(T, **kwargs)[source]

Visualizes the hierarchy in a notebook.