1. SVM Objective
The core goal of the Support Vector Machine (SVM) is to find the decision boundary (hyperplane) that maximizes the margin between the two classes.
The equation of the decision boundary is typically written as:
w · x + b = 0
Where:
- w = [w1, w2] is the weight vector (which is perpendicular to the hyperplane).
- x = [x1, x2] is the feature vector (the input data).
- b is the bias term (the offset from the origin).
This equation defines a hyperplane in a multidimensional space.
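The decision function above can be sketched in a few lines of NumPy. The parameter values here (w = [1, 1], b = -7) are the example values used later in this walkthrough, not universal constants:

```python
import numpy as np

# Example learned parameters for a 2-feature problem (see the worked example below).
w = np.array([1.0, 1.0])  # weight vector, perpendicular to the hyperplane
b = -7.0                  # bias term (offset from the origin)

def decision_value(x):
    """Signed value of w . x + b; zero exactly on the hyperplane."""
    return np.dot(w, x) + b

print(decision_value(np.array([3.0, 4.0])))  # 0.0 -> this point lies on the boundary
```

Points with a positive decision value fall on one side of the hyperplane, negative on the other.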
2. Finding the Margin
In SVM, the objective is to maximize the margin, which is the distance between the decision boundary (the hyperplane) and the support vectors. The margin is mathematically defined as:
Margin = 2 / ||w||

Where ||w|| is the Euclidean norm (magnitude) of the weight vector.
The margin boundaries (parallel to the decision boundary) are:
w · x + b = +1 (for Class +1)
w · x + b = -1 (for Class -1)
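The margin formula is easy to verify numerically. Using the example weight vector w = [1, 1] from the worked example below (an assumption for illustration):

```python
import numpy as np

w = np.array([1.0, 1.0])        # example weight vector
margin = 2 / np.linalg.norm(w)  # distance between the two margin boundaries

print(margin)  # 2 / sqrt(2) = sqrt(2) ~= 1.414
```

A smaller ||w|| means a wider margin, which is why the SVM optimization minimizes ||w|| subject to the margin constraints.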
3. The Decision Boundary Equation
Once the model has learned the values of w and b through training, the decision boundary is the hyperplane where the equation:
w · x + b = 0
holds true.
In the case of the simplified example where we only have two features x1 and x2, the equation for the decision boundary could be something like:
w1 x1 + w2 x2 + b = 0
4. Example with Given Data (Simplified Case)
Let’s say after training the SVM on the dataset, we find that the learned values for the weight vector w = [w1, w2] = [1, 1] and the bias term b = -7.
The decision boundary equation becomes:
1 · x1 + 1 · x2 - 7 = 0
Simplifying:
x1 + x2 = 7
So, the decision boundary is derived from the training process, and it's not an assumption. The equation x1 + x2 = 7 is the optimal hyperplane that SVM finds to separate the two classes based on the training data.
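This can be demonstrated end to end with scikit-learn. The four training points below are a hypothetical dataset constructed so that the closest points of the two classes straddle x1 + x2 = 7; a linear SVM trained on them recovers approximately w = [1, 1] and b = -7 (a large C approximates a hard-margin SVM):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy dataset, separable by x1 + x2 = 7.
X = np.array([[3, 3], [2, 4],   # Class -1 (sums <= 6)
              [4, 4], [5, 3]])  # Class +1 (sums >= 8)
y = np.array([-1, -1, 1, 1])

# Large C ~ hard-margin SVM.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]       # approximately [1, 1]
b = clf.intercept_[0]  # approximately -7
print(w, b)
```

The solver finds these values because the closest points between the two classes, (3, 3) and (4, 4), force the maximum-margin hyperplane through their midpoint, perpendicular to the direction (1, 1).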
5. Support Vectors and Margin Boundaries
The support vectors are the training points closest to the decision boundary, and they alone determine the maximal margin. These points lie on the two margin boundaries, which are:
x1 + x2 = 8 (for Class +1)
x1 + x2 = 6 (for Class -1)
These margins are parallel to the decision boundary and define the region in which the SVM tries to maximize the separation between the classes.
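A quick check confirms that points on these margin boundaries produce decision values of exactly +1 and -1 (the support vectors here are illustrative points consistent with the example above):

```python
import numpy as np

w, b = np.array([1.0, 1.0]), -7.0

# Hypothetical support vectors lying on the two margin boundaries.
sv_pos = np.array([4.0, 4.0])  # on x1 + x2 = 8
sv_neg = np.array([3.0, 3.0])  # on x1 + x2 = 6

print(np.dot(w, sv_pos) + b)  # +1.0
print(np.dot(w, sv_neg) + b)  # -1.0
```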
6. Threshold (Decision Rule)
The threshold is 7 because that is the value of x1 + x2 at which the decision boundary lies. The threshold tells us how the classifier assigns a label to a new point:
- If x1 + x2 > 7, classify the point as Class +1.
- If x1 + x2 < 7, classify the point as Class -1.
If a point lies exactly on x1 + x2 = 7, it would be on the decision boundary, and the classifier would be indifferent between the two classes (the decision would be based on further rules or data).
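The decision rule above can be written as a small function. Returning 0 for points exactly on the boundary is one way to represent the "indifferent" case (a design choice, not part of the standard SVM formulation, which typically breaks the tie arbitrarily):

```python
def classify(x1, x2, threshold=7):
    """Decision rule for the learned boundary x1 + x2 = 7."""
    s = x1 + x2
    if s > threshold:
        return +1   # Class +1 side
    if s < threshold:
        return -1   # Class -1 side
    return 0        # exactly on the boundary: classifier is indifferent

print(classify(5, 5))  # 1
print(classify(1, 2))  # -1
```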
Conclusion
- The equation x1 + x2 = 7 is derived from the training process of the SVM, not arbitrarily assumed.
- The SVM optimizes the margin to maximize the separation between the two classes, which results in a decision boundary equation like x1 + x2 = 7.
- The threshold value (7) is the value of x1 + x2 on the decision boundary. Points on this boundary are equidistant from both classes.
So, 7 is derived as part of the training process and is not assumed. It's the value that separates the two classes.