Feature Subset Optimization I: Search Space Modeling and Feasibility Checks
Intermediate
Combinatorial Search in DS
Created by Pavel
· 12.03.2026 at 07:54 UTC
· 1 completed
Case setup:
A classifier has n candidate features. Each feature can be either included or excluded, so an exhaustive search must consider 2^n subsets (2^n - 1 if the empty subset is ruled out).
Why modeling comes first:
- estimate combinatorial cost before training,
- identify when brute force is infeasible,
- decide between exact, heuristic, or hybrid search.
Examples:
- n=12 -> 4,096 subsets,
- n=25 -> 33,554,432 subsets.
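The counts above follow directly from 2^n. A minimal sketch for reproducing them is shown here; the function name `subset_count` and its `include_empty` flag are illustrative assumptions, not part of the original card.

```python
def subset_count(n_features: int, include_empty: bool = True) -> int:
    """Number of feature subsets an exhaustive search would visit.

    Hypothetical helper: each of the n features is either in or out,
    which gives 2**n combinations (one of them is the empty subset).
    """
    if n_features < 0:
        raise ValueError("n_features must be non-negative")
    total = 2 ** n_features
    return total if include_empty else total - 1


for n in (12, 25):
    print(f"n={n}: {subset_count(n):,} subsets")
```

Adding one feature doubles the count, so going from n=12 to n=25 multiplies the search space by 2^13 = 8,192.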
Practical constraints to include in the model:
- max training time,
- max number of selected features,
- fairness and interpretability constraints,
- minimum acceptable validation score.
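Some of these constraints shrink the search space before any model is trained. As a rough sketch, a cap on the number of selected features replaces 2^n with a sum of binomial coefficients, and the remaining numeric limits can be checked per candidate. All names below (`capped_subset_count`, `is_feasible`, and their parameters) are hypothetical, and fairness and interpretability checks are left out because they depend on the domain.

```python
from math import comb


def capped_subset_count(n_features: int, max_selected: int) -> int:
    """Count non-empty subsets that use at most max_selected features."""
    k = min(max_selected, n_features)
    return sum(comb(n_features, size) for size in range(1, k + 1))


def is_feasible(n_selected: int, val_score: float, fit_seconds: float, *,
                max_selected: int, min_score: float,
                max_fit_seconds: float) -> bool:
    """Check one evaluated candidate against the numeric constraints.

    Assumption: score and training time were measured elsewhere;
    this function only compares them with the stated limits.
    """
    return (
        1 <= n_selected <= max_selected
        and val_score >= min_score
        and fit_seconds <= max_fit_seconds
    )


# With n=25 and at most 5 features, the space is far smaller than 2**25.
print(capped_subset_count(25, 5))
```

The size cap can be applied while generating candidates, whereas the score and time limits can only be checked after a candidate has been trained and validated.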
Edge cases:
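- objective computed on a single unstable split, so subset rankings change with the random seed,
- data leakage when features are selected on data that is later used for validation,
- compute budget underestimated because the per-subset cross-validation cost was ignored.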
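The last edge case can be made concrete with a back-of-the-envelope estimate: every subset is trained once per cross-validation fold, so the total cost is roughly subsets × folds × seconds per fit. The sketch below assumes a constant fit time, which is a simplification, and the function names and example numbers are hypothetical.

```python
def estimated_search_seconds(n_subsets: int, cv_folds: int,
                             seconds_per_fit: float) -> float:
    """Rough wall-clock cost of evaluating every subset with k-fold CV.

    Assumption: each fit takes about the same time regardless of
    how many features the subset contains.
    """
    return n_subsets * cv_folds * seconds_per_fit


def exhaustive_search_fits_budget(n_features: int, cv_folds: int,
                                  seconds_per_fit: float,
                                  budget_seconds: float) -> bool:
    """True if brute force over all 2**n subsets stays within the budget."""
    cost = estimated_search_seconds(2 ** n_features, cv_folds, seconds_per_fit)
    return cost <= budget_seconds


four_hours = 4 * 3600
for n in (12, 25):
    ok = exhaustive_search_fits_budget(n, cv_folds=5, seconds_per_fit=0.5,
                                       budget_seconds=four_hours)
    print(f"n={n}: exhaustive search within budget? {ok}")
```

Under these assumed numbers, n=12 needs about 10,240 seconds and fits a four-hour budget, while n=25 needs about 83.9 million seconds and points toward a heuristic or hybrid search.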