HeavyEdge-Distance documentation#
A heavyedge plugin for computing shape distances between edge profiles.
Usage#
HeavyEdge-Distance is designed to be used either as a command line program or as a Python module.
To compute a shape distance matrix, first convert your profile data to pre-shapes, then compute the distance matrix from them. For example, the following commands convert profile data to area-scaled pre-shapes and compute the Wasserstein distance matrix.
heavyedge scale --type=area <profile> -o <scaled_profile> # command from HeavyEdge package
heavyedge dist-wasserstein --grid-num=100 <scaled_profile> -o <distance-matrix>
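The same two-step pipeline can be scripted from Python. The sketch below only assembles the two command lines shown above as argument lists; the helpers `build_pipeline` and `run_pipeline` and all file names are hypothetical placeholders, not part of either package.

```python
import subprocess


def build_pipeline(profile, scaled, distmat, grid_num=100):
    """Return the two heavyedge command lines as argument lists.

    Hypothetical helper: it only mirrors the commands shown above.
    """
    scale_cmd = ["heavyedge", "scale", "--type=area", profile, "-o", scaled]
    dist_cmd = [
        "heavyedge", "dist-wasserstein", f"--grid-num={grid_num}",
        scaled, "-o", distmat,
    ]
    return [scale_cmd, dist_cmd]


def run_pipeline(commands):
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Placeholder file names, chosen for illustration only.
commands = build_pipeline("profiles.h5", "scaled.h5", "distance-matrix")
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_pipeline(commands)` would execute both steps, provided the `heavyedge` executable is installed and the input file exists.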
Command line#
Commands are provided as plugins of heavyedge and can be invoked by:
heavyedge <command>
Refer to the help message of heavyedge for the list of commands and their arguments.
Python module#
The Python module heavyedge_distance provides functions to compute distance matrices at Python runtime.
Refer to the Runtime API section for the high-level interface.
Module reference#
This section provides the reference for the heavyedge_distance Python module.
Runtime API#
High-level Python runtime interface.
- heavyedge_distance.api.distmat_euclidean(f1, f2=None, batch_size=None, logger=<function <lambda>>)
L2 distance matrix between profiles.
- Parameters:
- f1 : heavyedge.ProfileData
Open h5 file.
- f2 : heavyedge.ProfileData, optional
Open h5 file. If not passed, it is set to f1.
- batch_size : int, optional
Batch size to load data. If not passed, all data are loaded at once.
- logger : callable, optional
Logger function which accepts a progress message string.
- Returns:
- (N1, N2) array
Euclidean distance matrix.
Notes
distmat_euclidean(f1) is faster than distmat_euclidean(f1, f1).
Examples
>>> from heavyedge import ProfileData
>>> from heavyedge_distance import get_sample_path
>>> from heavyedge_distance.api import distmat_euclidean
>>> with ProfileData(get_sample_path("MeanProfiles-AreaScaled.h5")) as data:
...     D = distmat_euclidean(data)
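To illustrate what the batch_size and logger parameters do, here is a minimal NumPy sketch that computes a pairwise Euclidean distance matrix in row batches on in-memory arrays. The function `batched_euclidean` is a hypothetical stand-in, not the library's implementation, and it uses the plain vector norm of the sampled heights; whether the library weights the distance by grid spacing is not stated here.

```python
import numpy as np


def batched_euclidean(F1, F2=None, batch_size=None, logger=lambda msg: None):
    """Pairwise Euclidean distances between rows of F1 and F2, in batches.

    Illustrative stand-in operating on in-memory arrays (assumption),
    not the library's implementation.
    """
    F1 = np.asarray(F1, dtype=float)
    F2 = F1 if F2 is None else np.asarray(F2, dtype=float)
    n1 = len(F1)
    # Without a batch size, process every row in a single batch.
    step = max(n1, 1) if batch_size is None else batch_size
    D = np.empty((n1, len(F2)))
    for start in range(0, n1, step):
        stop = min(start + step, n1)
        # Broadcast (b, 1, M) - (1, N2, M) -> (b, N2, M), then reduce over M.
        diff = F1[start:stop, None, :] - F2[None, :, :]
        D[start:stop] = np.sqrt((diff ** 2).sum(axis=-1))
        logger(f"rows {stop}/{n1}")
    return D


rng = np.random.default_rng(0)
F = rng.random((5, 20))  # five synthetic "profiles" on a 20-point grid

messages = []
D_all = batched_euclidean(F)
D_batched = batched_euclidean(F, batch_size=2, logger=messages.append)
```

Batching changes only how much data is held in memory at once, so `D_all` and `D_batched` agree. Any callable that accepts a string, such as `print` or `list.append`, can serve as the logger.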
- heavyedge_distance.api.distmat_wasserstein(t, f1, f2=None, batch_size=None, logger=<function <lambda>>)
Wasserstein distance matrix between area-scaled profiles.
Warning
This function assumes that the profiles in f1 and f2 are area-scaled and heights outside the support are zero.
- Parameters:
- t : (M,) ndarray
Grid coordinates at which the quantile functions are evaluated. Must be strictly increasing from 0 to 1.
- f1 : heavyedge.ProfileData
Open h5 file of area-scaled profiles.
- f2 : heavyedge.ProfileData, optional
Open h5 file of area-scaled profiles. If not passed, it is set to f1.
- batch_size : int, optional
Batch size to load data. If not passed, all data are loaded at once.
- logger : callable, optional
Logger function which accepts a progress message string.
- Returns:
- (N1, N2) array
Wasserstein distance matrix.
Notes
distmat_wasserstein(t, f1) is faster than distmat_wasserstein(t, f1, f1).
Examples
>>> import numpy as np
>>> from heavyedge import ProfileData
>>> from heavyedge_distance import get_sample_path
>>> from heavyedge_distance.api import distmat_wasserstein
>>> with ProfileData(get_sample_path("MeanProfiles-AreaScaled.h5")) as data:
...     D = distmat_wasserstein(np.linspace(0, 1, 100), data)
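To make the role of the grid t concrete, the sketch below treats a unit-area profile as a probability density, accumulates it into a cumulative distribution function, and inverts that function on a grid in [0, 1]. It is a NumPy-only illustration: the helper `quantile_from_density` is hypothetical and is not the quantile routine used by heavyedge, and reading "area-scaled" as "unit area under the profile" is an assumption.

```python
import numpy as np


def quantile_from_density(x, f, t):
    """Approximate quantile function of a density f sampled on grid x.

    Hypothetical helper for illustration only.
    """
    # Cumulative trapezoidal integral gives the CDF at every x.
    increments = 0.5 * (f[1:] + f[:-1]) * np.diff(x)
    cdf = np.concatenate([[0.0], np.cumsum(increments)])
    cdf /= cdf[-1]  # absorb small discretization error in the total area
    # Invert the CDF by interpolating x as a function of the CDF value.
    # (Flat CDF segments, where f is zero, make this inverse non-unique.)
    return np.interp(t, cdf, x)


x = np.linspace(0.0, 2.0, 201)
f = np.full_like(x, 0.5)     # uniform density on [0, 2], unit area
t = np.linspace(0, 1, 100)   # strictly increasing from 0 to 1

# The grid satisfies the documented requirement on t.
assert t[0] == 0 and t[-1] == 1 and np.all(np.diff(t) > 0)

Q = quantile_from_density(x, f, t)  # for this density, Q(t) = 2 * t
```

A grid built with `np.linspace(0, 1, M)` for M >= 2 always meets the "strictly increasing from 0 to 1" requirement, which is why the examples in this section use it.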
- heavyedge_distance.api.distmat_frechet(f1, f2=None, batch_size=None, n_jobs=None, logger=<function <lambda>>)
1-D discrete Fréchet distance matrix between profiles.
- Parameters:
- f1 : heavyedge.ProfileData
Open h5 file.
- f2 : heavyedge.ProfileData, optional
Open h5 file. If not passed, it is set to f1.
- batch_size : int, optional
Batch size to load data. If not passed, all data are loaded at once.
- n_jobs : int, optional
Number of parallel workers. If not passed, the HEAVYEDGE_MAX_WORKERS environment variable is used. If the environment variable is invalid, n_jobs is set to 1.
- logger : callable, optional
Logger function which accepts a progress message string.
- Returns:
- (N1, N2) array
Discrete Fréchet distance matrix.
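The fallback rule described for n_jobs can be sketched as follows. The helper `resolve_n_jobs` is hypothetical, and treating an unset variable, a non-integer value, or a value below 1 as "invalid" is an assumption; the library may define validity differently.

```python
import os


def resolve_n_jobs(n_jobs=None, environ=None):
    """Pick a worker count following the rule described for n_jobs.

    Hypothetical helper; what counts as an "invalid" value is assumed.
    """
    if n_jobs is not None:
        return n_jobs  # an explicit argument takes precedence
    environ = os.environ if environ is None else environ
    try:
        workers = int(environ.get("HEAVYEDGE_MAX_WORKERS", ""))
    except ValueError:
        return 1  # unset or non-integer value
    return workers if workers >= 1 else 1


print(resolve_n_jobs(4))                                          # 4
print(resolve_n_jobs(environ={"HEAVYEDGE_MAX_WORKERS": "8"}))     # 8
print(resolve_n_jobs(environ={"HEAVYEDGE_MAX_WORKERS": "many"}))  # 1
```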
Notes
distmat_frechet(f1) is faster than distmat_frechet(f1, f1).
Examples
>>> from heavyedge import ProfileData
>>> from heavyedge_distance import get_sample_path
>>> from heavyedge_distance.api import distmat_frechet
>>> with ProfileData(get_sample_path("MeanProfiles-PlateauScaled.h5")) as data:
...     D = distmat_frechet(data)
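For readers unfamiliar with the discrete Fréchet distance, the following dependency-free sketch implements the standard dynamic-programming recurrence on sequences of scalar heights. Treating each profile as a sequence of scalars with pointwise distance |p_i - q_j| is an assumption about what "1-D" means here, and `discrete_frechet` is a hypothetical name, not the library's function.

```python
def discrete_frechet(p, q):
    """Discrete Fréchet distance between two non-empty scalar sequences."""
    n, m = len(p), len(q)
    # ca[i][j]: coupling distance between prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both; keep the best.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


# Toy sequences of different lengths, chosen for illustration only.
profiles = [[0, 1, 2, 3], [0, 3], [1, 1, 1, 1]]
D = [[discrete_frechet(p, q) for q in profiles] for p in profiles]
```

Each entry costs O(len(p) * len(q)) operations, which suggests why a parallel n_jobs option is useful when the full matrix is computed for many profiles.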
Low-level API#
Wasserstein distance#
Wasserstein-related functions.
- heavyedge_distance.wasserstein.wdist(t, Qs1, Qs2)
Wasserstein distance matrix of 1D probability distributions.
\[d_W(f_1, f_2)^2 = \int_0^1 (Q_1(t) - Q_2(t))^2 \, dt\]
where \(Q_i\) is the quantile function of \(f_i\).
- Parameters:
- t : (M,) ndarray
Points over which Qs1 and Qs2 are measured. Must be strictly increasing from 0 to 1.
- Qs1 : (N1, M) ndarray
Quantile functions of the first set of probability distributions.
- Qs2 : (N2, M) ndarray or None
Quantile functions of the second set of probability distributions. If None is passed, it is set to Qs1.
- Returns:
- (N1, N2) array
Wasserstein distance matrix.
Examples
>>> import numpy as np
>>> from heavyedge import ProfileData
>>> from heavyedge.wasserstein import quantile
>>> from heavyedge_distance import get_sample_path
>>> from heavyedge_distance.wasserstein import wdist
>>> with ProfileData(get_sample_path("MeanProfiles-AreaScaled.h5")) as data:
...     x = data.x()
...     fs, Ls, _ = data[:]
>>> t = np.linspace(0, 1, 100)
>>> Qs = quantile(x, fs, Ls, t)
>>> D1 = wdist(t, Qs, None)
>>> D2 = wdist(t, Qs, Qs)
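The integral in the definition of \(d_W\) can be approximated numerically once the quantile functions are sampled on t. The sketch below does this with the trapezoidal rule in plain NumPy; `wdist_sketch` is a hypothetical re-implementation for illustration, and the quadrature rule actually used by wdist is not stated here, so the trapezoidal rule is an assumption.

```python
import numpy as np


def wdist_sketch(t, Qs1, Qs2=None):
    """Wasserstein distance matrix from quantile functions sampled on t.

    Illustrative version using the trapezoidal rule (assumption);
    not the library's code.
    """
    t = np.asarray(t, dtype=float)
    Qs1 = np.asarray(Qs1, dtype=float)
    Qs2 = Qs1 if Qs2 is None else np.asarray(Qs2, dtype=float)
    # (N1, 1, M) - (1, N2, M) -> squared differences of shape (N1, N2, M).
    sq = (Qs1[:, None, :] - Qs2[None, :, :]) ** 2
    # Trapezoidal rule along the last axis approximates the integral over t.
    integral = 0.5 * ((sq[..., 1:] + sq[..., :-1]) * np.diff(t)).sum(axis=-1)
    return np.sqrt(integral)


t = np.linspace(0, 1, 100)
Q_base = 2.0 * t                        # quantile function of uniform on [0, 2]
Qs = np.stack([Q_base, Q_base + 0.5])   # second distribution shifted by 0.5
D = wdist_sketch(t, Qs)
```

Shifting a distribution by a constant shifts its quantile function by the same constant, so the off-diagonal entries of `D` equal the shift, 0.5, up to floating-point error. This gives a quick sanity check for any implementation of the formula.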