[Advanced Learning Algorithms] Machine Learning Development Process - 4
1. Machine Learning Development Process
1.1 Cycle of ML process
Step 1: Choose architecture (model, data, etc.)
Step 2: Train Model
Step 3: Diagnostics (bias, variance, and error analysis)
Step 4: Deploy in production (deploy, monitor, and maintain the system)
1.2 Example: Spam Classification
Supervised Learning Algorithm
- x = features of an email
- y = whether the email is spam (1) or not (0)
How can we reduce the model's error?
- Collect more data
- Develop sophisticated features based on email routing
- Define sophisticated features from email body
- Design algorithms to detect misspellings
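The feature ideas above can be sketched as code. A minimal illustration, assuming a made-up vocabulary, of how an email might be turned into a binary feature vector x:

```python
# Minimal sketch: turn an email into a binary feature vector x over a
# small vocabulary (the vocabulary here is a made-up example).
vocabulary = ["buy", "deal", "discount", "meeting", "project"]

def email_to_features(email_text):
    """Return x[j] = 1 if vocabulary word j appears in the email, else 0."""
    words = set(email_text.lower().split())
    return [1 if w in words else 0 for w in vocabulary]

x = email_to_features("huge discount buy now and get the best deal")
print(x)  # one binary indicator per vocabulary word
```

Real spam features (email routing, misspelling detection) would be built the same way: each feature is one column of x.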
2. Methods to reduce the error of the model
2.1 Error Analysis
Problem:
- 100 out of 500 cross-validation examples were misclassified
Analysis:
- Manually examine the 100 misclassified examples and categorize them by common traits
- Collect more data for the traits that account for the most errors
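The manual-categorization step boils down to counting errors per trait. A sketch with hypothetical (made-up) trait labels for the 100 misclassified emails:

```python
from collections import Counter

# Hypothetical traits assigned by hand to 100 misclassified emails
# (the categories and counts here are invented for illustration).
traits = (["pharma"] * 21 + ["deliberate misspellings"] * 12 +
          ["phishing / fake URLs"] * 53 + ["other"] * 14)

counts = Counter(traits)
for trait, n in counts.most_common():
    print(f"{trait}: {n}")
# The largest category ("phishing / fake URLs" here) is where
# collecting more data is most likely to help.
```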
2.2 Adding data
- Rather than adding more data of everything, add more data of the types where error analysis indicated it might help
- Rather than collecting brand-new training examples, we can use a technique called data augmentation
2.3 Data Augmentation / Data Synthesis
Def: Modifying or augmenting existing data to create new training examples
Example: OCR model for Character recognition
- We can shrink, rotate, or flip the image of an "A" to get additional training examples
- Adding purely random or meaningless noise to our data does not help
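A minimal sketch of label-preserving augmentation using NumPy on a tiny made-up 3x3 "image" (real OCR images would be larger, but the transformations are the same):

```python
import numpy as np

# One labeled training image (tiny made-up example).
image = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1]])

augmented = [
    np.rot90(image),   # rotation
    np.fliplr(image),  # horizontal flip
    np.flipud(image),  # vertical flip (only if it preserves the label)
]
# Purely random noise, by contrast, adds no useful signal:
# noisy = image + np.random.randn(*image.shape)  # not a helpful example
print(len(augmented))  # 3 extra examples from 1 original
```

The key constraint: each transformation must keep the label meaningful (a flipped "A" is still an "A"; a flipped "b" is not a "b").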
2.4 Transfer Learning
Def: Reusing parameters learned by a model trained on a different task
Example: We have a 5-layer model trained to classify and detect objects
- For a digit classification problem, reuse the trained parameters of all layers except the output layer, which we replace
- The object classification/detection model has learned parameters that detect basic components of an image such as lines, edges, corners, and shapes
- Option 1: train only the output layer's parameters
- Option 2: train all parameters, initialized from the pre-trained values
This method is also called fine-tuning
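The two options above can be sketched with plain NumPy arrays standing in for layer parameters (the layer shapes and the 10-class output are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are the parameters of a 5-layer network trained on
# object detection (shapes are made up for illustration).
pretrained = {f"layer{i}": rng.standard_normal((4, 4)) for i in range(1, 6)}

# Transfer: copy layers 1-4, replace the output layer for the new task
# (digit classification with 10 classes is an assumed example).
new_model = {k: v.copy() for k, v in pretrained.items() if k != "layer5"}
new_model["layer5"] = rng.standard_normal((4, 10))  # fresh output layer

# Option 1: train only the output layer.
trainable_opt1 = ["layer5"]
# Option 2 (fine-tuning): train all layers, starting from pretrained values.
trainable_opt2 = list(new_model.keys())
print(trainable_opt1, trainable_opt2)
```

Option 1 suits small target datasets; Option 2 usually works better when the new task has plenty of data.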
2.5 Error Metrics for skewed datasets
2.5.1 Precision and recall
We use error metrics to check whether our algorithm is flawed.
Ex) Rare disease classification example
We got 1% error on the test set, but only 0.5% of patients have the disease, so a trivial classifier that always predicts y = 0 would get just 0.5% error. Plain test error is therefore not a helpful measure of performance on this skewed dataset.
Then how are we going to measure the performance of the algorithm with the skewed dataset?
By using Precision and Recall
y = 1 if disease present
y = 0 otherwise
                 | Actual 1       | Actual 0
Predicted 1      | True Positive  | False Positive
Predicted 0      | False Negative | True Negative
Precision
= of all patients where we predicted y = 1, what fraction actually have the rare disease
= True positives / Total predicted positive
-> high precision means positive predictions are usually correct
Recall
= of all patients who actually have the rare disease, what fraction did we correctly detect as having it
= True positives / Total actual positive
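Both metrics follow directly from the confusion-matrix counts. A sketch with hypothetical (made-up) counts for the rare-disease example:

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp)   # true positives / total predicted positive
    recall = tp / (tp + fn)      # true positives / total actual positive
    return precision, recall

# Hypothetical counts: 15 true positives, 5 false positives,
# 10 false negatives (numbers invented for illustration).
p, r = precision_recall(tp=15, fp=5, fn=10)
print(p, r)  # 0.75 0.6
```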
Sum: Precision and recall together reveal whether your algorithm is flawed, even on a skewed dataset
2.5.2 Trading off precision and recall
By manipulating the threshold of the logistic regression, we can manage the algorithm's precision and recall.
Suppose we want to predict y = 1 only if very confident
-> higher precision, lower recall
Suppose we want to avoid missing too many cases of the rare disease
-> lower precision, higher recall
F1 Score
= This helps you to compare the algorithm's precision and recall, so that you can choose the best threshold value
= computing an average of sorts that pays more attention to whichever is lower
= 1 / ((1/2)(1/P + 1/R)) = 2PR / (P + R)
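Using the F1 score to pick a threshold can be sketched as follows; the (precision, recall) pairs per threshold are made-up numbers for illustration:

```python
def f1(p, r):
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2 * p * r / (p + r)

# Hypothetical (precision, recall) pairs at three thresholds (invented):
candidates = {0.3: (0.60, 0.90), 0.5: (0.75, 0.60), 0.7: (0.95, 0.20)}

best = max(candidates, key=lambda t: f1(*candidates[t]))
print(best)  # the threshold with the highest F1 score
```

Because the harmonic mean is dominated by the smaller of P and R, a threshold with a very low recall (like 0.7 above) scores poorly even if its precision is high.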