Understanding the Cost Function in Logistic Regression
Logistic Regression is a type of supervised learning algorithm used primarily for classification tasks. To assess the performance of the model, a cost function is employed, which measures the discrepancy between predicted outcomes and actual values. In contrast to linear regression, the cost function in Logistic Regression is based on log loss, also known as cross-entropy loss.
- This cost function evaluates the error between the predicted probabilities and the true class labels, which are either 0 or 1.
- Unlike Linear Regression, which predicts along a straight line, Logistic Regression uses probabilities between 0 and 1, facilitated by the sigmoid function.
- The cost function places greater penalties on incorrect predictions, especially when the model is confidently incorrect.
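The last point can be made concrete with a small sketch. The per-example log-loss penalty (using the formula defined below) stays modest for mildly wrong predictions but grows sharply when the model is confidently wrong; the probability values here are hypothetical, chosen only to illustrate the effect:

```python
import numpy as np

def cost(y, p):
    # Per-example log loss for true label y and predicted probability p
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# True label is 1; compare a mildly uncertain prediction with a
# confidently wrong one (both probabilities are illustrative).
print(cost(1, 0.6))   # ~0.51: moderate penalty
print(cost(1, 0.01))  # ~4.61: confidently wrong -> much larger penalty
```

The penalty is `-log(p)` for the true class, so it tends to infinity as the predicted probability of the correct class approaches zero.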
The cost function in Logistic Regression is defined as follows:
[ \text{Cost}(h_\theta(x), y) = -y \cdot \log(h_\theta(x)) - (1 - y) \cdot \log(1 - h_\theta(x)) ]
Where:
- ( h_\theta(x) ) is the predicted probability calculated using the sigmoid function.
- ( y ) is the actual class value (0 or 1).
For all training examples, the log loss function is expressed as:
[ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[y^{(i)} \log(h_\theta(x^{(i)})) + (1-y^{(i)}) \log(1-h_\theta(x^{(i)}))\right] ]
Why Mean Squared Error is Not Used
- Mean Squared Error (MSE) is effective for regression, but when combined with the sigmoid it produces a non-convex cost function for Logistic Regression, with multiple local minima.
- Log loss guarantees a convex cost function, simplifying optimization via Gradient Descent and ensuring a global minimum.
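This difference can be checked numerically. The sketch below evaluates both cost curves over a grid of theta values for some hypothetical 1-D data (no intercept) and inspects the discrete second differences: a convex curve has nonnegative second differences everywhere, while the sigmoid-composed MSE curve flattens at both ends and develops concave regions:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical 1-D data, used only to compare the two cost curves.
X = np.array([0.5, 1.5, 2.5])
y = np.array([0, 1, 1])
thetas = np.linspace(-10, 10, 401)

def log_loss_cost(theta):
    p = sigmoid(X * theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def mse_cost(theta):
    p = sigmoid(X * theta)
    return np.mean((p - y) ** 2)

ll = np.array([log_loss_cost(t) for t in thetas])
mse = np.array([mse_cost(t) for t in thetas])

# Nonnegative discrete second differences indicate a convex curve.
print("log loss convex:", np.all(np.diff(ll, 2) >= -1e-9))
print("MSE has concave regions:", np.any(np.diff(mse, 2) < 0))
```

On this data the log-loss curve passes the convexity check while the MSE curve does not, which is why Gradient Descent on log loss cannot get trapped in a spurious local minimum.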
Implementing Logistic Regression Cost Function in Python
Below is a Python implementation demonstrating how Logistic Regression calculates predicted probabilities using the sigmoid function and evaluates model performance using the log loss (binary cross-entropy) cost function. It highlights how the confidence of predictions affects the overall error in a straightforward, numerically stable manner.
import numpy as np

def sigmoid(z):
    # Map any real-valued input to a probability in (0, 1)
    return 1 / (1 + np.exp(-z))

def log_loss(y_true, y_pred, eps=1e-15):
    # Clip predictions away from exactly 0 and 1 so log() stays finite
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(
        y_true * np.log(y_pred) +
        (1 - y_true) * np.log(1 - y_pred)
    )

X = np.array([0.2, 0.4, 0.6])
y = np.array([0, 1, 1])
theta = 0.5

z = X * theta        # linear combination (single parameter, no intercept)
y_pred = sigmoid(z)  # predicted probabilities

print("Predicted Probabilities:", y_pred)
print("Log Loss Value:", log_loss(y, y_pred))
Output:

Predicted Probabilities: [0.52497919 0.549834   0.57444252]
Log Loss Value: 0.6322969246412298

- Predicted Probabilities: these values are the model's estimated likelihood of the positive class for each example; since they all sit close to 0.5, the model has low confidence in its predictions.
- Log Loss Value: a log loss of about 0.63 indicates moderate error, meaning the model's predictions leave clear room for improvement.
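Because the cost is convex, plain Gradient Descent can drive it below this starting value. Here is a minimal sketch on the same tiny dataset and starting parameter as above; the learning rate and iteration count are illustrative choices, not tuned values:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Same tiny dataset and starting parameter as the example above.
X = np.array([0.2, 0.4, 0.6])
y = np.array([0, 1, 1])
theta = 0.5

def cost(theta):
    p = np.clip(sigmoid(X * theta), 1e-15, 1 - 1e-15)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

lr = 0.5  # learning rate (illustrative choice)
for _ in range(2000):
    p = sigmoid(X * theta)
    grad = np.mean((p - y) * X)  # dJ/dtheta for the log-loss cost
    theta -= lr * grad

print("theta:", theta)
print("final cost:", cost(theta))
```

Since the log-loss curve has a single global minimum, each step of the descent reduces the cost below the starting value of about 0.632.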
Related Topics
- Logistic Regression
- Mean Squared Error (MSE)