HTPS Parameters

This writeup provides detailed descriptions of parameters available for configuring the HyperTree Proof Search (HTPS) algorithm within the open-htps library.

Parameters

  • exploration (double): Exploration parameter in UCB (Upper Confidence Bound), often referred to as c.

  • policy_type (enum): Policy selection, either “AlphaZero” or “RPO”.

  • num_expansions (int): Total number of expansions to perform before terminating the search.

  • succ_expansions (int): Number of expansions performed per theorems_to_expand call. Determines how many theorems are returned for expansion at each step.

  • early_stopping (bool): Whether already solved nodes should be excluded from further expansions. Not to be confused with early_stopping_solved_if_root_not_proven.

  • no_critic (bool): Indicates if a critic model should be used or omitted during the search.

  • backup_once (bool): If set to true, each subtree is backed up only once, useful when a subtree is selected multiple times due to high succ_expansions.

  • backup_one_for_solved (bool): Determines whether solved nodes should always back up with a value of 1.0. If false, uses the critic value instead.

  • depth_penalty (double): A discount factor applied as a penalty based on node depth.

  • count_threshold (int): Minimum number of visits a node must have to qualify as a training example.

  • tactic_p_threshold (double): Probability threshold from the tree policy for a tactic to be considered a training example.

  • tactic_sample_q_conditioning (bool): Enables returning tactic samples for Q-conditioning (for details, see the original HTPS paper).

  • only_learn_best_tactics (bool): If true, returns only tactic samples involved in minimal proofs.

  • tactic_init_value (double): Initial tactic value assigned before a goal has been evaluated by the critic or expanded.

  • q_value_solved (enum): Determines the Q-value assignment for solved node-tactic pairs.

  • policy_temperature (double): Temperature parameter applied to the tree policy distribution.

  • metric (enum): Metric used in minimal proof search. Options include:

    • “depth”

    • “length”

    • “time”

  • node_mask (enum): Specifies nodes returned as training examples:

    • “NoMask”: all samples are returned

    • “Solving”: only samples from solved nodes are returned (does not require the root to be proven)

    • “Proof”: only samples from proof nodes are returned (requires the root to be proven)

    • “MinimalProof”: only samples from nodes on the minimal proof are returned

    • “MinimalProofSolving”: same as “MinimalProof” if the root is proven, otherwise same as “Solving”

  • effect_subsampling_rate (double): Subsampling rate applied to effect samples.

  • critic_subsampling_rate (double): Subsampling rate applied to critic samples.

  • early_stopping_solved_if_root_not_proven (bool): Stops expansion of solved nodes if the root has not yet been proven, prioritizing proving the root first.

  • virtual_loss (int): The number of virtual counts added for each node visit to encourage exploration.