sklearn.datasets.make_multilabel_classification()

sklearn.datasets.make_multilabel_classification(n_samples=100, n_features=20, n_classes=5, n_labels=2, length=50, allow_unlabeled=True, sparse=False, return_indicator='dense', return_distributions=False, random_state=None)

Generate a random multilabel classification problem.
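
A minimal usage sketch follows. It passes only the first four parameters from the signature above plus random_state, and it assumes an installed scikit-learn release in which return_indicator defaults to 'dense', so that Y comes back as a binary indicator array. The seed value 0 is an arbitrary choice for reproducibility.

```python
from sklearn.datasets import make_multilabel_classification

# Generate a small multilabel problem with the documented default sizes.
# random_state=0 is an arbitrary seed chosen here for reproducibility.
X, Y = make_multilabel_classification(
    n_samples=100,
    n_features=20,
    n_classes=5,
    n_labels=2,
    random_state=0,
)

# X holds one row of features per sample.
print(X.shape)  # (100, 20)

# Assuming the dense indicator format, Y has one column per class and
# Y[i, j] == 1 means sample i carries label j.
print(Y.shape)  # (100, 5)

# Average number of labels per sample; n_labels sets the Poisson mean,
# so this should land in the vicinity of 2 rather than match it exactly.
print(Y.sum(axis=1).mean())
```

The average label count is only approximately n_labels, because the number of labels per sample is drawn at random and repeated draws of the same class collapse into a single label.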

For each sample, the generative process is:
  • pick the number of labels: n ~ Poisson(n_labels)
  • n times, choose a class c: c ~ Multinomial(theta)
  • pick the document length: k ~ Poisson(length)