HTPS Parameters =============== This writeup provides detailed descriptions of parameters available for configuring the HyperTree Proof Search (HTPS) algorithm within the **open-htps** library. Parameters ---------- - **exploration** (*double*): Exploration parameter in UCB (Upper Confidence Bound), often referred to as `c`. - **policy_type** (*enum*): Policy selection, either `"AlphaZero"` or `"RPO"`. - **num_expansions** (*int*): Total number of expansions to perform before terminating the search. - **succ_expansions** (*int*): Number of expansions performed per `theorems_to_expand` call. Determines how many theorems are returned for expansion at each step. - **early_stopping** (*bool*): Whether already solved nodes should be excluded from further expansions. Not to be confused with `early_stopping_solved_if_root_not_proven`. - **no_critic** (*bool*): Indicates if a critic model should be used or omitted during the search. - **backup_once** (*bool*): If set to `true`, each subtree is backed up only once, useful when a subtree is selected multiple times due to high `succ_expansions`. - **backup_one_for_solved** (*bool*): Determines whether solved nodes should always back up with a value of 1.0. If `false`, uses the critic value instead. - **depth_penalty** (*double*): A discount factor applied as a penalty based on node depth. - **count_threshold** (*int*): Minimum number of visits a node must have to qualify as a training example. - **tactic_p_threshold** (*double*): Probability threshold from the tree policy for a tactic to be considered a training example. - **tactic_sample_q_conditioning** (*bool*): Enables returning tactic samples for Q-conditioning (for details, see the original HTPS paper). - **only_learn_best_tactics** (*bool*): If `true`, returns only tactic samples involved in minimal proofs. - **tactic_init_value** (*double*): Initial tactic value assigned before a goal has been evaluated by the critic or expanded. - **q_value_solved** (*enum*): Determines the Q-value assignment for solved node-tactic pairs. - **policy_temperature** (*double*): Temperature parameter applied to the tree policy distribution. - **metric** (*enum*): Metric used in minimal proof search. Options include: - `"depth"` - `"length"` - `"time"` - **node_mask** (*enum*): Specifies nodes returned as training examples: - `"NoMask"`: all samples are returned - `"Solving"`: only samples from solved nodes are returned (does not require the root to be proven) - `"Proof"`: only samples from proof nodes are returned (requires the root to be proven) - `"MinimalProof"`: only samples from nodes on the minimal proof are returned - `"MinimalProofSolving"`: same as `"MinimalProof"` if the root is proven, otherwise same as `"Solving"` - **effect_subsampling_rate** (*double*): Subsampling rate applied to effect samples. - **critic_subsampling_rate** (*double*): Subsampling rate applied to critic samples. - **early_stopping_solved_if_root_not_proven** (*bool*): Stops expansion of solved nodes if the root has not yet been proven, prioritizing proving the root first. - **virtual_loss** (*int*): The number of virtual counts added for each node visit to encourage exploration.