Machine Learning

How to Pick ML Algorithms

Algorithms are chosen based on intuition and practical benefits, rather than math and theory.

Data scientists actually spend most of their time on the earlier steps:

    - Exploring the data.
    - Cleaning the data.
    - Engineering new features.

Again, that’s because better data beats fancier algorithms.

Training the Model

Hyperparameters
Cross-Validation

Cross-validation is a method for getting a reliable estimate of model performance using only your training data.
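
To make the idea concrete, here is a minimal sketch of k-fold cross-validation in plain Python. It is an illustration under stated assumptions, not a reference implementation: the helper names (`k_fold_indices`, `cross_val_mse`, `fit_mean`, `predict_mean`) and the toy data are hypothetical, and the "model" is just a baseline that predicts the mean of its training targets. The training data is split into k folds; each fold is held out once while the model is fit on the rest, and the held-out scores are averaged.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_mse(xs, ys, fit, predict, k=5, seed=0):
    """Average the held-out mean squared error over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit only on the folds that are NOT held out.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append(mean(errors))
    return mean(scores)


# Hypothetical baseline model: always predict the mean training target.
def fit_mean(xs, ys):
    return mean(ys)


def predict_mean(model, x):
    return model


# Toy data, made up for illustration only.
xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
score = cross_val_mse(xs, ys, fit_mean, predict_mean, k=5)
```

Because every sample is held out exactly once, the averaged score uses all of the training data for evaluation without ever scoring a model on points it was fit on.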

Select Winning Model

Select the best-performing model on the test dataset, according to performance metrics:

    - For regression tasks, we recommend Mean Squared Error (MSE) or Mean Absolute Error (MAE). (Lower values are better.)
    - For classification tasks, we recommend Area Under ROC Curve (AUROC). (Higher values are better.)
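
The metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming binary labels coded as 0 and 1 for AUROC; the function names are hypothetical, and the AUROC version uses the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive gets the higher score, with ties counted as half), which is simple but quadratic in the number of samples.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; lower is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison; higher is better.

    Assumes labels are 0 (negative) or 1 (positive) and that both
    classes appear at least once.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy values, made up for illustration only.
mse = mean_squared_error([1, 2, 3], [1, 2, 5])
mae = mean_absolute_error([1, 2, 3], [1, 2, 5])
auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

To pick a winner, compute the same metric for each candidate on the same test set, then take the lowest value for MSE or MAE and the highest for AUROC.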