Stream: helpdesk (published)

Topic: SVM with imbalanced dataset


Júlio Hoffimann (Feb 13 2023 at 19:21):

I am trying to use LIBSVM.jl with an imbalanced dataset.

Providing weights for each class doesn't seem to affect the result:

using CSV
using DataFrames
using LIBSVM

df = CSV.read("svm.csv", DataFrame)

X = [df.x1 df.x2 df.x3]'
y = df.y
w = Dict(l => 1 / count(==(l), y) for l in unique(y))
svm = svmtrain(X, y, weights=w)

x1 = range(-0.5,0.5, length=100)
x2 = range(-0.5,0.5, length=100)
x3 = range(-0.5,0.5, length=100)
xs = [collect(x) for x in Iterators.product(x1,x2,x3)]
X̂ = reduce(hcat, xs)
ŷ, _ = svmpredict(svm, X̂)

Can you reproduce the issue? I've uploaded the dataset in this gist: https://gist.github.com/juliohm/a0c98c0d386d297e2105818652faa076

Júlio Hoffimann (Feb 14 2023 at 16:39):

Cross-posted on Discourse: https://discourse.julialang.org/t/libsvm-jl-with-imbalanced-dataset/94635
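
One thing that may be worth checking (a guess, not a confirmed diagnosis): in LIBSVM a class weight multiplies the penalty C for that class, so weights of 1 / count shrink the effective C for every class by orders of magnitude instead of only rebalancing it. The sketch below compares no weights, the 1 / count weights from the post, and a rescaled variant n / (k * count) that keeps the same ratio between classes but has weights around 1. The synthetic data, the names w_inv and w_bal, and the rescaling heuristic are assumptions made for illustration; they are not taken from the gist.

```julia
using LIBSVM
using Random

Random.seed!(42)

# Hypothetical imbalanced data standing in for svm.csv:
# 500 points of class 0 and 25 points of class 1, 3 features each (columns are samples).
X = hcat(0.2 .* randn(3, 500), 0.2 .* randn(3, 25) .+ 0.25)
y = vcat(fill(0, 500), fill(1, 25))

# Weights as in the post: 1 / class count, so every weight is far below 1.
w_inv = Dict(l => 1 / count(==(l), y) for l in unique(y))

# Rescaled weights (assumed heuristic): n / (k * class count).
# The ratio between classes is unchanged, but the weights are of order 1.
n, k = length(y), length(unique(y))
w_bal = Dict(l => n / (k * count(==(l), y)) for l in unique(y))

# Train once per weighting and count how often the minority class is predicted.
for (name, w) in [("none", nothing), ("1/count", w_inv), ("n/(k*count)", w_bal)]
    model = svmtrain(X, y; weights = w)
    ŷ, _ = svmpredict(model, X)
    println(name, ": predicted minority = ", count(==(1), ŷ))
end
```

If the three counts differ on the real dataset as well, the weights are being passed through and the scale of the weights (or the value of cost) is the thing to tune.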


Last updated: Oct 02 2023 at 04:34 UTC