Select Model with SymetryML
Select Model introduction
SymetryML Select Model is a feature selection facility. It automatically selects the best features for a given model algorithm, leveraging SymetryML's unique ability to build different predictive models quickly. It builds several models, each with a different set of input attributes chosen by a predefined heuristic, scores each model on out-of-sample data, and retains the best one. The following table describes the available heuristics:
Select Heuristic
| Name | Description |
| --- | --- |
| Forward Backward | A heuristic that (1) iteratively adds as many features as possible while keeping the best model, (2) iteratively removes as many features as possible while keeping the best model, and (3) repeats both steps a specified number of times. |
| Brute Force | Tries all possible combinations of the input attributes. It should not be used with more than 17-18 attributes. |
| Max Number of Iterations | Randomly creates models by trying a specified number of random combinations of the features. |
| Max. Number of Seconds | Randomly creates models by trying random combinations of the features for a maximum number of seconds. |
| Simple | Starts with one feature and incrementally adds one feature at a time until all features have been tried, keeping track of the best model. |
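The heuristics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SymetryML's implementation: `score` is a hypothetical callable that stands in for "build a model on these features and score it on out-of-sample data" (higher is better), and the greedy add/remove rule in `forward_backward_select` is one plausible reading of the Forward Backward description.

```python
from itertools import combinations

def simple_select(features, score):
    """Simple heuristic: grow the feature list one at a time, keep the best prefix."""
    best, best_score = (), float("-inf")
    for i in range(1, len(features) + 1):
        candidate = tuple(features[:i])
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

def brute_force_select(features, score):
    """Brute force: score every non-empty subset (2**n - 1 candidates)."""
    best, best_score = (), float("-inf")
    for k in range(1, len(features) + 1):
        for candidate in combinations(features, k):
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
    return best, best_score

def forward_backward_select(features, score, max_iterations=5):
    """Forward/backward: greedy add pass, greedy remove pass, repeated."""
    selected, best_score = [], float("-inf")
    for _ in range(max_iterations):
        changed = False
        # Forward pass: keep every feature whose addition improves the score.
        for f in features:
            if f in selected:
                continue
            s = score(tuple(selected + [f]))
            if s > best_score:
                selected.append(f)
                best_score, changed = s, True
        # Backward pass: drop every feature whose removal improves the score.
        for f in list(selected):
            if len(selected) == 1:
                break
            trial = [x for x in selected if x != f]
            s = score(tuple(trial))
            if s > best_score:
                selected, best_score, changed = trial, s, True
        if not changed:
            break  # a full round changed nothing, so further rounds cannot help
    return tuple(selected), best_score

# Hypothetical toy score: features "a" and "c" are useful, every feature costs 0.1.
USEFUL = {"a", "c"}

def toy_score(subset):
    return len(USEFUL & set(subset)) - 0.1 * len(subset)

FEATURES = ["a", "b", "c", "d"]
```

On this toy problem the simple heuristic settles on the prefix `("a", "b", "c")`, while brute force and forward/backward both find `("a", "c")`. The subset count in `brute_force_select` also shows why the brute force heuristic is limited to 17-18 attributes: 17 attributes already give 2^17 - 1 = 131,071 candidate models, and each extra attribute doubles that number.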
Selector Types
| Selector type | Description |
| --- | --- |
| selector_type_fw_bw | Forward / Backward heuristic. The number of iterations defaults to 5 and can be controlled with the selector_max_iterations parameter. |
| selector_type_simple | Simple heuristic. |
| selector_type_brute | Brute force selector. |
| selector_type_iteration | A selector that either tries a specified number of random combinations or runs for a specified number of seconds. selector_max_iterations or selector_max_seconds must also be specified with this selector type. |
| selector_type_genetic | (Experimental) Genetic Algorithm feature selector. Uses evolutionary optimization to find optimal feature subsets. See the Genetic Algorithm Selector section. |
| selector_type_bayesian | (Experimental) Bayesian Optimization feature selector. Uses probabilistic modeling to efficiently search the feature space. See the Bayesian Optimization Selector section. |
Selector Grid
The Elastic Net model has two hyperparameters that can be optimized: eta and lambda. The auto-select algorithm tries various combinations of these parameters using a grid search. The size of the grid can be controlled via the autoselect_grid_type extra parameter in the MLContext request body. Please see this section for such an example.
| Grid type | Grid (eta x lambda) |
| --- | --- |
| autoselect_grid_type_tiny | eta [0, 0.5, 1.0] x lambda [1e-3, 1e-2, 0.1] |
| autoselect_grid_type_small | eta [0, 0.5, 1.0] x lambda [1e-3, 1e-2, 0.1, 1] |
| autoselect_grid_type_normal | eta [0, 0.3333, 0.6666, 1.0] x lambda [1e-4, 1e-3, 1e-2, 0.1, 1, 10] |
| autoselect_grid_type_large | eta [0, 0.2, 0.4, 0.6, 0.8, 1.0] x lambda [1e-9, 1e-8, 1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 0.1, 1, 10, 100, 1000] |
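To get a feel for how much work each grid type implies, the grids can be expanded into explicit (eta, lambda) pairs. The sketch below copies the values from the table; the `GRIDS` dictionary and `grid_points` helper are illustrative names, not part of the SymetryML API.

```python
from itertools import product

# Values copied from the Selector Grid table: (eta values, lambda values).
GRIDS = {
    "autoselect_grid_type_tiny": ([0, 0.5, 1.0], [1e-3, 1e-2, 0.1]),
    "autoselect_grid_type_small": ([0, 0.5, 1.0], [1e-3, 1e-2, 0.1, 1]),
    "autoselect_grid_type_normal": (
        [0, 0.3333, 0.6666, 1.0],
        [1e-4, 1e-3, 1e-2, 0.1, 1, 10],
    ),
    "autoselect_grid_type_large": (
        [0, 0.2, 0.4, 0.6, 0.8, 1.0],
        [1e-9, 1e-8, 1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 0.1, 1, 10, 100, 1000],
    ),
}

def grid_points(grid_type):
    """Return every (eta, lambda) pair of the named grid."""
    etas, lambdas = GRIDS[grid_type]
    return list(product(etas, lambdas))

for name in GRIDS:
    print(name, len(grid_points(name)))
# autoselect_grid_type_tiny 9
# autoselect_grid_type_small 12
# autoselect_grid_type_normal 24
# autoselect_grid_type_large 78
```

The large grid has almost nine times as many points as the tiny one, so it is worth choosing only when the extra search time is acceptable.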
Select Model Rest API
Invokes the select model functionality, using an external data source, identified by its ID, as the out-of-sample data for model assessment.
URL
Query Parameters
| Parameter | Required | Description |
| --- | --- | --- |
| modelid | Required | ID to assign to the new model. |
| algo | Required | Algorithm to fit. |
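As a small illustration, the two required query parameters can be encoded with Python's standard library. Only the query string is built here, because the endpoint URL is not reproduced in this section; `select_model_query` is a hypothetical helper name, and the `algo` value in the usage line is a placeholder rather than a real algorithm identifier.

```python
from urllib.parse import urlencode

def select_model_query(modelid, algo):
    """Encode the required query parameters; both must be non-empty."""
    if not modelid or not algo:
        raise ValueError("modelid and algo are both required")
    return urlencode({"modelid": modelid, "algo": algo})

# "some_algo" is a placeholder, not a documented algorithm name.
print(select_model_query("my model", "some_algo"))
# modelid=my+model&algo=some_algo
```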
MLContext Build Parameters
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| rnd_seed | Optional | Integer | Sets the seed of the random number generator. |
| selector_type | Optional | String | Default is selector_type_fw_bw. See the Select Heuristic and Selector Types sections for details. |
| autoselect_grid_type | Optional | String | Default is autoselect_grid_type_tiny. See the Selector Grid table for details. |
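The build parameters and their documented defaults can be assembled and checked before a request is sent. The sketch below is a hypothetical client-side helper: the function name, the flat dictionary it returns, and the validation rules are assumptions for illustration, and how these values are nested inside the MLContext request body is not shown here.

```python
SELECTOR_TYPES = {
    "selector_type_fw_bw", "selector_type_simple", "selector_type_brute",
    "selector_type_iteration", "selector_type_genetic", "selector_type_bayesian",
}
GRID_TYPES = {
    "autoselect_grid_type_tiny", "autoselect_grid_type_small",
    "autoselect_grid_type_normal", "autoselect_grid_type_large",
}

def build_select_params(selector_type="selector_type_fw_bw",
                        grid_type="autoselect_grid_type_tiny",
                        rnd_seed=None, max_iterations=None, max_seconds=None):
    """Return the select-model parameters as a flat dict, with basic checks."""
    if selector_type not in SELECTOR_TYPES:
        raise ValueError(f"unknown selector_type: {selector_type}")
    if grid_type not in GRID_TYPES:
        raise ValueError(f"unknown autoselect_grid_type: {grid_type}")
    # Documented rule: the iteration selector needs a budget of some kind.
    if (selector_type == "selector_type_iteration"
            and max_iterations is None and max_seconds is None):
        raise ValueError("selector_type_iteration requires "
                         "selector_max_iterations or selector_max_seconds")
    params = {"selector_type": selector_type, "autoselect_grid_type": grid_type}
    if rnd_seed is not None:
        params["rnd_seed"] = int(rnd_seed)
    if max_iterations is not None:
        params["selector_max_iterations"] = int(max_iterations)
    if max_seconds is not None:
        params["selector_max_seconds"] = int(max_seconds)
    return params

print(build_select_params("selector_type_iteration", max_seconds=30, rnd_seed=42))
```

Failing early on the client avoids submitting a job that the selector cannot run, for example an iteration selector with no iteration or time budget.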
HTTP Responses
| Code | Status | Description |
| --- | --- | --- |
| 202 | OK | Job accepted. |
| 400 | BAD REQUEST | Unknown SymetryML project. {"statusCode":"BAD_REQUEST","statusString":" + Cannot Find SYMETRYML id[r2] for Customer id [c1]","values":{}} |
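A client can branch on the two documented responses as sketched below. The error body is the one shown in the table; the `SelectModelError` class and `interpret_response` function are illustrative names, and the handling of any other status code is an assumption.

```python
import json

class SelectModelError(Exception):
    """Raised when the service rejects a select-model request."""

def interpret_response(status_code, body):
    """Return "accepted" for 202; raise with the server message for 400."""
    if status_code == 202:
        # The job was accepted; the model is built asynchronously.
        return "accepted"
    if status_code == 400:
        payload = json.loads(body)
        message = payload.get("statusString", "").strip()
        raise SelectModelError(f"{payload.get('statusCode')}: {message}")
    raise SelectModelError(f"unexpected HTTP status {status_code}")

sample_error = ('{"statusCode":"BAD_REQUEST","statusString":" + Cannot Find '
                'SYMETRYML id[r2] for Customer id [c1]","values":{}}')
try:
    interpret_response(400, sample_error)
except SelectModelError as err:
    print(err)
```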
Sample Request Response Classifier
Sample Request Response Regression
Select Model Dataframe Rest API
Invokes the select model functionality, using a DataFrame passed in the request body as the out-of-sample data for model assessment.
URL
Query Parameters
| Parameter | Required | Description |
| --- | --- | --- |
| modelid | Required | ID to assign to the new model. |
| algo | Required | Algorithm to fit. |
MLContext Build Parameters
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| rnd_seed | Optional | Integer | Sets the seed of the random number generator. |
| selector_type | Optional | String | Default is selector_type_fw_bw. See the Select Heuristic and Selector Types sections for details. |
| autoselect_grid_type | Optional | String | Default is autoselect_grid_type_tiny. See the Selector Grid table for details. |
HTTP Responses
| Code | Status | Description |
| --- | --- | --- |
| 202 | OK | Job accepted. |
| 400 | BAD REQUEST | Unknown SymetryML project. {"statusCode":"BAD_REQUEST","statusString":" + Cannot Find SYMETRYML id[r2] for Customer id [c1]","values":{}} |
Sample Request Response Classifier
Sample Request Response Regression
Genetic Algorithm Selector (Experimental)
The Genetic Algorithm selector uses evolutionary optimization to find optimal feature subsets. It evolves a population of candidate feature sets over multiple generations, using selection, crossover, and mutation operations to discover high-performing feature combinations.
When to Use
When you have a large number of features and want to explore the feature space more thoroughly than forward/backward selection
When feature interactions are important and simple greedy approaches may miss optimal combinations
When you can afford more computation time for potentially better results
Genetic Algorithm Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| genetic_population_size | Integer | 50 | Number of candidate feature sets in each generation |
| genetic_num_generations | Integer | 100 | Maximum number of generations to evolve |
| genetic_mutation_rate | Double | 0.05 | Probability of flipping each feature (gene) during mutation |
| genetic_crossover_rate | Double | 0.8 | Probability of performing crossover between two parents |
| genetic_elite_count | Integer | 2 | Number of top-performing individuals preserved unchanged each generation |
| genetic_tournament_size | Integer | 3 | Number of individuals competing in tournament selection |
| genetic_initial_feature_prob | Double | 0.1 | Probability that each feature is included in initial random population |
| genetic_min_features | Integer | 1 | Minimum number of features allowed in any individual |
| genetic_max_features | Integer | unlimited | Maximum number of features allowed in any individual |
| genetic_parallel_threads | Integer | 4 | Number of parallel threads for model evaluation |
| genetic_stagnation_limit | Integer | 20 | Number of generations without improvement before early stopping |
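To show how these parameters interact, here is a compact single-threaded sketch of a genetic feature selector. It is a generic textbook-style loop written for illustration, not SymetryML's implementation: each individual is a list of booleans, `fitness` is a hypothetical callable standing in for an out-of-sample model score, single-point crossover is an assumed operator, and `genetic_parallel_threads` is left out. It assumes at least two features and a tournament size no larger than the population.

```python
import random

def genetic_select(n_features, fitness, *, population_size=50, num_generations=100,
                   mutation_rate=0.05, crossover_rate=0.8, elite_count=2,
                   tournament_size=3, initial_feature_prob=0.1, min_features=1,
                   max_features=None, stagnation_limit=20, seed=0):
    rng = random.Random(seed)
    max_features = n_features if max_features is None else max_features

    def repair(mask):
        """Switch random genes on or off until the size limits are respected."""
        on = [i for i, bit in enumerate(mask) if bit]
        off = [i for i, bit in enumerate(mask) if not bit]
        rng.shuffle(on)
        rng.shuffle(off)
        while len(on) < min_features:
            i = off.pop()
            mask[i] = True
            on.append(i)
        while len(on) > max_features:
            mask[on.pop()] = False
        return mask

    def tournament(scored):
        """Pick the fittest of `tournament_size` randomly drawn individuals."""
        return max(rng.sample(scored, tournament_size), key=lambda p: p[0])[1]

    population = [
        repair([rng.random() < initial_feature_prob for _ in range(n_features)])
        for _ in range(population_size)
    ]
    best_mask, best_score, stagnant = None, float("-inf"), 0

    for _ in range(num_generations):
        scored = sorted(((fitness(m), m) for m in population),
                        key=lambda p: p[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask, stagnant = scored[0][0], list(scored[0][1]), 0
        else:
            stagnant += 1
            if stagnant >= stagnation_limit:
                break  # early stopping: no improvement for too long
        # Elitism: carry the top individuals over unchanged.
        next_population = [list(m) for _, m in scored[:elite_count]]
        while len(next_population) < population_size:
            a, b = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = list(a)
            # Mutation: flip each gene independently with probability mutation_rate.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(repair(child))
        population = next_population
    return best_mask, best_score

# Hypothetical toy fitness: features 0, 3 and 7 help, every feature costs 0.1.
def toy_fitness(mask):
    return sum(mask[i] for i in (0, 3, 7)) - 0.1 * sum(mask)

mask, score = genetic_select(10, toy_fitness, seed=1)
print([i for i, bit in enumerate(mask) if bit], score)
```

A low `initial_feature_prob` starts the search from small feature sets, while the elite count and stagnation limit trade exploration against run time.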
Sample Request
Bayesian Optimization Selector (Experimental)
The Bayesian Optimization selector uses probabilistic modeling to efficiently search the feature space. It builds a surrogate model of the objective function and uses an acquisition function to balance exploration and exploitation when selecting which feature combinations to evaluate.
When to Use
When model evaluation is expensive and you want to minimize the number of evaluations
When you want a more sample-efficient search compared to random or genetic approaches
When the feature space is large but you suspect good solutions exist in specific regions
Bayesian Optimization Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| bayesian_num_iterations | Integer | 100 | Total number of optimization iterations |
| bayesian_initial_random | Integer | 20 | Number of random samples before starting Bayesian optimization |
| bayesian_exploration_weight | Double | 0.1 | Exploration weight for UCB (Upper Confidence Bound) acquisition function |
| bayesian_num_candidates | Integer | 100 | Number of candidate feature sets evaluated per iteration |
| bayesian_local_search_steps | Integer | 10 | Number of local search steps for solution refinement |
| bayesian_embedding_dim | Integer | 50 | Dimension for random embedding when dealing with high-dimensional feature spaces |
| bayesian_top_k_memory | Integer | 200 | Number of top observations kept in memory for surrogate model |
| bayesian_stagnation_limit | Integer | 30 | Number of iterations without improvement before early stopping |
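The role of `bayesian_exploration_weight` is easiest to see in the UCB acquisition function itself. The sketch below uses the common form mean + weight x standard deviation; whether SymetryML uses exactly this form is an assumption. The candidate predictions are made-up numbers standing in for a surrogate model's output.

```python
def ucb(mean, std, exploration_weight=0.1):
    """Upper Confidence Bound: predicted score plus a bonus for uncertainty."""
    return mean + exploration_weight * std

def pick_next(candidates, exploration_weight=0.1):
    """Choose the candidate feature set with the highest UCB value.

    `candidates` is a list of (feature_set, predicted_mean, predicted_std).
    """
    best = max(candidates, key=lambda c: ucb(c[1], c[2], exploration_weight))
    return best[0]

# Made-up surrogate predictions for two candidate feature sets.
candidates = [
    (("a", "b"), 0.80, 0.01),  # well understood, slightly better mean
    (("c",), 0.75, 0.30),      # uncertain, lower mean
]

print(pick_next(candidates, exploration_weight=0.1))  # ('a', 'b')
print(pick_next(candidates, exploration_weight=0.5))  # ('c',)
```

With the default weight of 0.1 the search favors candidates that already look good (exploitation); raising the weight makes uncertain candidates more attractive (exploration).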
Sample Request