AugmenterR: Data Augmentation for Machine Learning on Tabular Data
Implementation of a data augmentation technique based on conditional entropy.
It was devised by both authors during their master's studies and is discussed in detail in the second author's dissertation.
It creates novel samples conditioned on a desired value of a categorical attribute, as a way to augment data for classification tasks.
Tests discussed in the dissertation and in a forthcoming paper indicate that the novel samples satisfy several statistical assumptions.
They also indicate a significant improvement for machine learning models trained on small datasets.
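For reference, the textbook definition of the conditional entropy of a discrete attribute Y given a discrete attribute X is sketched below. The symbols X, Y, and p are illustrative notation and do not come from the package; how the technique uses this quantity to generate samples is covered in the dissertation.

```latex
% Conditional entropy of Y given X for discrete attributes,
% where p(x, y) is the joint distribution and p(y | x) the conditional.
H(Y \mid X) = -\sum_{x} \sum_{y} p(x, y) \log p(y \mid x)
            = \sum_{x} p(x) \, H(Y \mid X = x)
```

A value of zero means that Y is fully determined by X, while larger values mean that knowing X leaves more uncertainty about Y.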
Suggests: knitr, ggplot2, markdown
Author: Rafael S. Pereira [aut, cre, cph], Henrique Matheus Ferreira da Silva [aut, cph], Fabio A.M Porto [aut, ths, cph]
Maintainer: Rafael S. Pereira <r.s.p.models at gmail.com>
License: MIT + file LICENSE