Add `🤔Key takeaways` and `Summary` sections
probability/21_logistic_regression.py (CHANGED)
@@ -611,6 +611,57 @@ def _(mo):
     return
 
 
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+        ## 🤔 Key Takeaways
+
+        Click on the statements below that you think are correct to verify your understanding:
+
+        /// details | Logistic regression tries to find parameters (θ) that minimize the error between predicted and actual values using ordinary least squares.
+        ❌ **Incorrect.** Logistic regression uses maximum likelihood estimation (MLE), not ordinary least squares. It finds parameters that maximize the probability of observing the training data, which is different from minimizing squared errors as in linear regression.
+        ///
+
+        /// details | The sigmoid function maps any real number to a value between 0 and 1, which allows logistic regression to output probabilities.
+        ✅ **Correct!** The sigmoid function σ(z) = 1/(1+e^(-z)) takes any real number as input and outputs a value between 0 and 1. This is perfect for representing probabilities and is a key component of logistic regression.
+        ///
+
+        /// details | The decision boundary in logistic regression is always a straight line, regardless of the data's complexity.
+        ✅ **Correct!** Standard logistic regression produces a linear decision boundary (a straight line in 2D or a hyperplane in higher dimensions). This is why it works well for linearly separable data but struggles with more complex patterns, like concentric circles (as you might've noticed from the interactive demo).
+        ///
+
+        /// details | The logistic regression model params are typically initialized to random values and refined through gradient descent.
+        ✅ **Correct!** Parameters are often initialized to zeros or small random values, then updated iteratively using gradient descent (or ascent for maximizing likelihood) until convergence.
+        ///
+
+        /// details | Logistic regression can naturally handle multi-class classification problems without any modifications.
+        ❌ **Incorrect.** Standard logistic regression is inherently a binary classifier. To handle multi-class classification, techniques like one-vs-rest or softmax regression are typically used.
+        ///
+        """
+    )
+    return
+
+
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+        ## Summary
+
+        So we've just explored logistic regression. Despite its name (seriously though, why not call it "logistic classification"?), it's actually quite elegant in how it transforms a simple linear model into a powerful decision _boundary_ maker.
+
+        The training process boils down to finding the values of θ that maximize the likelihood of seeing our training data. What's super cool is that even though the math looks _scary_ at first, the gradient has this surprisingly simple form: just the error (y - predicted) multiplied by the feature values.
+
+        Two key insights to remember:
+
+        - Logistic regression creates a _linear_ decision boundary, so it works great for linearly separable classes but struggles with more _complex_ patterns
+        - It directly gives you probabilities, not just classifications, which is incredibly useful when you need confidence measures
+        """
+    )
+    return
+
+
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(