Usage

Run

git clone https://github.com/FaramirHurin/ADV-O.git
cd ADV-O
pip install -r requirements.txt
python main.py

Output

Table 6: Synthetic data: R2 scores for the predicted features for various regressors.
                                            x_terminal_id  y_terminal_id  TX_AMOUNT
MLPRegressor(max_iter=2000, random_state=42)           0.85           0.59       0.94
Ridge(random_state=42)                                 0.85           0.58       0.93
RandomForestRegressor(random_state=42)                 0.85           0.59       0.90
Naive                                                  0.39           0.54       0.91
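
For readers unfamiliar with the metric in Table 6, the following is a minimal sketch of how an R2 score (coefficient of determination) can be computed for one predicted feature. It is not taken from the repository's code: the function name `r2_score` and the toy values are illustrative assumptions, and the README does not say how the 'Naive' row is built.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, not drawn from the repository's data.
y_true = [10.0, 20.0, 30.0, 40.0]

perfect = r2_score(y_true, y_true)                    # exact predictions
mean_only = r2_score(y_true, [25.0] * 4)              # always predict the mean
close = r2_score(y_true, [12.0, 18.0, 33.0, 39.0])    # small errors

print(perfect, mean_only, round(close, 3))
```

A score of 1 means the feature is predicted exactly, while 0 means the regressor does no better than always predicting the mean of the target.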


Table 7: Synthetic data: accuracy of oversampling algorithms. All oversampling algorithms have been tested using a Balanced Random Forest. The no-oversampling case has been tested with a classic Random Forest ('Baseline') and a Balanced Random Forest ('Baseline_balanced').
            Baseline  Baseline_balanced  SMOTE  Random  KMeansSMOTE  ADVO
PRAUC           0.32               0.37   0.36    0.37         0.36  0.37
PRAUC_Card      0.45               0.50   0.46    0.49         0.48  0.48
Precision       0.34               0.23   0.27    0.26         0.25  0.27
Recall          0.29               0.89   0.68    0.72         0.73  0.69
F1 score        0.31               0.36   0.39    0.38         0.37  0.39
PK50            0.76               0.36   0.56    0.30         0.40  0.42
PK100           0.78               0.37   0.52    0.38         0.39  0.45
PK200           0.74               0.38   0.50    0.44         0.36  0.55
PK500           0.61               0.40   0.50    0.40         0.40  0.55
PK1000          0.48               0.42   0.46    0.44         0.40  0.48
PK2000          0.36               0.38   0.40    0.39         0.38  0.41
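
The PK rows in Table 7 are not defined in this README. Assuming they denote precision among the k highest-scored transactions (k = 50, 100, and so on), a minimal sketch of that metric could look like the following. The function name `precision_at_k` and the toy scores are hypothetical and are not taken from the repository.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true positives among the k highest-scored items.

    Assumption: higher score means more suspicious, and labels are 0/1.
    """
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores must be non-empty")
    # Rank items from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


# Hypothetical toy data: six transactions, three of them fraudulent.
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]
labels = [1, 0, 1, 0, 1, 0]

p_at_2 = precision_at_k(scores, labels, 2)
p_at_3 = precision_at_k(scores, labels, 3)
p_at_6 = precision_at_k(scores, labels, 6)

print(p_at_2, p_at_3, p_at_6)
```

Under this reading, a classifier can have low overall precision and recall and still rank its most confident alerts well, which is why the PK rows are worth reading alongside the Precision and Recall rows.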


Table 8: Synthetic data: AUC of absolute differences between KDEs
            x_terminal_id  y_terminal_id  TX_AMOUNT
SMOTE                 0.11           0.10       0.18
Random                0.05           0.11       0.02
KMeansSMOTE           0.05           0.10       0.02
ADVO                  0.09           0.12       0.03
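
To illustrate the kind of quantity reported in Table 8, here is a minimal sketch that integrates the absolute difference between two Gaussian kernel density estimates. It is an assumption-laden illustration, not the repository's implementation: the fixed bandwidth, the integration grid, the function names, and the choice of which two samples are compared are all hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels of fixed bandwidth."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_abs_difference_area(samples_a, samples_b, bandwidth, lo, hi, n_points=2001):
    """Trapezoidal integral of |f_a(x) - f_b(x)| over [lo, hi]."""
    f_a = gaussian_kde(samples_a, bandwidth)
    f_b = gaussian_kde(samples_b, bandwidth)
    step = (hi - lo) / (n_points - 1)
    diffs = [abs(f_a(lo + i * step) - f_b(lo + i * step)) for i in range(n_points)]
    # Trapezoid rule: endpoints carry half weight.
    return step * (sum(diffs) - 0.5 * (diffs[0] + diffs[-1]))


# Hypothetical toy samples, not drawn from the repository's data.
near_zero = [-0.1, 0.0, 0.1]
near_ten = [9.9, 10.0, 10.1]

same_area = kde_abs_difference_area(near_zero, near_zero, 0.5, -5.0, 15.0)
far_area = kde_abs_difference_area(near_zero, near_ten, 0.5, -5.0, 15.0)

print(same_area, round(far_area, 3))
```

With this definition the area is 0 for identical distributions and approaches 2 when the two densities do not overlap, so smaller values indicate that the two distributions of a feature are closer.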