Comparison with cr-sparse#
In this notebook, we compare the performance of different implemention of common sparse constrained optimization algorithms in skscope and cr-sparse:
IHT: Iterative Hard Thresholding
OMP: Orthogonal Matching Pursuit
HTP: Hard Thresholding Pursuit
Grasp or CoSaMP: Compressive Sampling Matching Pursuit.
[7]:
import numpy as np
import pandas as pd
import time
import jax.numpy as jnp
from skscope.solver import *
import cr.sparse.dict as crdict
from cr.sparse.pursuit import iht, omp, htp, cosamp
from abess.datasets import make_glm_data
The following function generate synthetic data and solve the sparse constrained least-square problem.
The algorithm implemented in skscope and cr-sparse libraries are compared and the recovery accuracy and computation time are reported.
[8]:
def test(n=500, p=1000, s=5, random_state=None, rep=1):
print('='*20 + f' n={n}, p={p}, s={s} ' + '='*20 )
# rng = np.random.default_rng(random_state)
# true_support_set = rng.choice(np.arange(p), size=s, replace=False)
# real_coef = np.zeros(p)
# real_coef[true_support_set] = rng.choice(np.arange(1, 4), size=s) * rng.choice([1, -1], size=s)
# data = make_glm_data(n=n, p=p, k=s, family='gaussian', coef_=real_coef)
# X, y = data.x, data.y
# iterables = [['OMP', 'IHT', 'HTP', 'Grasp'], ['cr-sparse', 'skscope']]
# index = pd.MultiIndex.from_product(iterables, names=['Algorithm', 'Package'])
# res = pd.DataFrame(columns=['Accuracy', 'Time'], index = index)
res = pd.DataFrame(columns=['Algorithm', 'Package', 'Accuracy', 'Time'])
def objective(params):
loss = jnp.mean((y - X @ params) ** 2)
return loss
for i in range(rep):
rng = np.random.default_rng(i)
true_support_set = rng.choice(np.arange(p), size=s, replace=False)
real_coef = np.zeros(p)
real_coef[true_support_set] = rng.choice(np.arange(1, 4), size=s) * rng.choice([1, -1], size=s)
data = make_glm_data(n=n, p=p, k=s, family='gaussian', coef_=real_coef)
X, y = data.x, data.y
for algo in ['OMP', 'IHT', 'HTP', 'Grasp']:
if algo == 'OMP':
solver = OMPSolver(p, sparsity=s)
model = omp
elif algo == 'IHT':
solver = IHTSolver(p, sparsity=s)
model = iht
elif algo == 'HTP':
solver = HTPSolver(p, sparsity=s)
model = htp
elif algo == 'Grasp':
solver = GraspSolver(p, sparsity=s)
model = cosamp
# cr-sparse
t_begin = time.time()
solution = model.matrix_solve(jnp.array(X), y, s)
t_cr = time.time() - t_begin
acc_cr = len(set(solution.I.tolist()) & set(true_support_set)) / s
# res.loc[(algo, 'cr-sparse')] = [acc_cr, np.round(t_cr, 4)]
res.loc[len(res)] = [algo, 'cr-sparse', acc_cr, t_cr]
# skscope
t_begin = time.time()
params = solver.solve(objective, jit=True)
t_skscope = time.time() - t_begin
acc_skscope = len(set(np.nonzero(params)[0]) & set(np.nonzero(data.coef_)[0])) / s
# res.loc[(algo, 'skscope')] = [acc_skscope, np.round(t_skscope, 4)]
res.loc[len(res)] = [algo, 'skscope', acc_skscope, t_skscope]
res_mean = res.groupby(['Algorithm', 'Package']).mean()
res_std = res.groupby(['Algorithm', 'Package']).std()
res_mean['Accuracy'] = res_mean['Accuracy'].map(lambda x: f'{x:.2f}')
res_mean['Time'] = res_mean['Time'].map(lambda x: f'{x:.2f}')
res_std['Accuracy'] = res_std['Accuracy'].map(lambda x: f' ({x:.2f})')
res_std['Time'] = res_std['Time'].map(lambda x: f' ({x:.2f})')
res_all = res_mean + res_std
print(res_all)
return res_all
The results are shown in the following three tables and each correspons to a specific data dimension.
Both recovery accuracy and computation time show the superiority of skscope over cr-sparse for all the above algorithms.
[9]:
settings = [
(500, 1000, 10),
(2500, 5000, 50),
(5000, 10000, 100),
]
for setting in settings:
n, p, s = setting
res = test(n=n, p=p, s=s, rep=10)
==================== n=500, p=1000, s=10 ====================
Accuracy Time
Algorithm Package
Grasp cr-sparse 1.00 (0.00) 1.68 (0.95)
skscope 1.00 (0.00) 0.47 (0.25)
HTP cr-sparse 0.51 (0.14) 1.11 (0.66)
skscope 0.84 (0.10) 0.26 (0.18)
IHT cr-sparse 0.34 (0.24) 0.33 (0.16)
skscope 0.83 (0.09) 0.33 (0.19)
OMP cr-sparse 0.10 (0.00) 0.05 (0.03)
skscope 1.00 (0.00) 0.49 (0.29)
==================== n=2500, p=5000, s=50 ====================
Accuracy Time
Algorithm Package
Grasp cr-sparse 1.00 (0.00) 190.02 (29.06)
skscope 1.00 (0.00) 6.25 (2.44)
HTP cr-sparse 0.81 (0.03) 157.74 (27.27)
skscope 0.89 (0.05) 5.13 (1.42)
IHT cr-sparse 0.33 (0.18) 12.83 (3.10)
skscope 0.86 (0.04) 4.53 (1.56)
OMP cr-sparse 0.02 (0.00) 4.18 (2.84)
skscope 1.00 (0.00) 13.59 (4.54)
==================== n=5000, p=10000, s=100 ====================
Accuracy Time
Algorithm Package
Grasp cr-sparse 1.00 (0.00) 952.16 (470.47)
skscope 1.00 (0.00) 11.27 (7.43)
HTP cr-sparse 0.81 (0.03) 797.22 (400.70)
skscope 0.91 (0.04) 9.05 (4.53)
IHT cr-sparse 0.00 (0.00) 11.33 (8.57)
skscope 0.85 (0.03) 9.30 (7.41)
OMP cr-sparse 0.01 (0.00) 22.49 (28.15)
skscope 1.00 (0.00) 47.42 (28.42)