gwkokab.analysis.core.inference_io ================================== .. py:module:: gwkokab.analysis.core.inference_io Classes ------- .. autoapisummary:: gwkokab.analysis.core.inference_io.AnalyticalPELoader gwkokab.analysis.core.inference_io.DiscretePELoader gwkokab.analysis.core.inference_io.FlowMCGlobalConfig gwkokab.analysis.core.inference_io.NumpyroGlobalConfig gwkokab.analysis.core.inference_io.NumpyroMCMCConfig gwkokab.analysis.core.inference_io.NumpyroNUTSSamplerConfig gwkokab.analysis.core.inference_io.PoissonMeanEstimationLoader gwkokab.analysis.core.inference_io.SamplerConfig Package Contents ---------------- .. py:class:: AnalyticalPELoader(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Loader for Analytical PE (Parameter Estimation) samples from files matching a regex. This class handles the ingestion of gravitational-wave posterior samples, manages parameter aliasing, performs subsampling, and calculates log-prior weights for population inference. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:method:: from_json(config_path: str) -> AnalyticalPELoader :classmethod: Initializes the loader from a JSON configuration file. :param config_path: Path to the JSON file containing loader settings. :type config_path: str :returns: An instance of AnalyticalPELoader. :rtype: AnalyticalPELoader :raises KeyError: If the 'regex' field is missing in the configuration. :raises FileNotFoundError: If no files match the provided regex pattern. .. py:method:: load(parameters: tuple[str, Ellipsis], seed: int = 37) -> dict[str, list[numpy.ndarray]] Loads analytical PE data from disk. 
This method reads the mean, covariance, and limits for each event specified in `self.event_paths`, validates that the necessary parameters are present, and returns them as lists of NumPy arrays, one per event. :param parameters: The names of the parameters to extract from each file. :type parameters: tuple[str, ...] :param seed: Random seed used for deterministic subsampling. Defaults to 37. :type seed: int, optional :returns: A dictionary containing lists of arrays of mean, covariance, and limits for each event. :rtype: dict[str, list[np.ndarray]] .. py:method:: load_file(filename: pathlib.Path | str, waveform_name: str) -> AnalyticalPEFileData :classmethod: Loads a single analytical PE file into an `AnalyticalPEFileData` named tuple. :param filename: Path to the sample file. :type filename: Path | str :param waveform_name: Name of the waveform model used. :type waveform_name: str :returns: NamedTuple containing the samples and metadata from the file. :rtype: AnalyticalPEFileData .. py:attribute:: alternate_waveforms :type: dict[str, str] :value: None Mapping of filenames to alternate waveform names, if needed. .. py:attribute:: default_waveform :type: str :value: None Default waveform name to use when loading samples. .. py:attribute:: event_paths :type: tuple[pathlib.Path, Ellipsis] Tuple of absolute paths to the files containing PE samples. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: parameter_aliases :type: dict[str, str] :value: None Mapping of internal parameter names to the column names used in the CSV files. .. py:attribute:: sample_transformer :type: gwkokab.analysis.core.utils.SampleTransformer :value: None An instance of a SampleTransformer that defines how to transform the samples from the analytical PE format to the model's expected format. This allows for flexible handling of different coordinate systems or parameterizations used in the PE samples. .. 
py:class:: DiscretePELoader(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Loader for Discrete PE (Parameter Estimation) samples from files matching a regex. This class handles the ingestion of gravitational-wave posterior samples, manages parameter aliasing, performs subsampling, and calculates log-prior weights for population inference. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:method:: from_json(config_path: str) -> DiscretePELoader :classmethod: Initializes the loader from a JSON configuration file. :param config_path: Path to the JSON file containing loader settings. :type config_path: str :returns: An instance of DiscretePELoader. :rtype: DiscretePELoader :raises KeyError: If the 'regex' field is missing in the configuration. :raises FileNotFoundError: If no files match the provided regex pattern. .. py:method:: load(parameters: tuple[str, Ellipsis], seed: int = 37) -> tuple[list[numpy.ndarray], list[numpy.ndarray]] Loads samples from disk and computes the corresponding log-prior weights. It is inspired by :func:`~gwpopulation_pipe.data_collection.evaluate_prior`. :param parameters: The list of parameters to extract from each file. :type parameters: tuple[str, ...] :param seed: Random seed used for deterministic subsampling, by default 37 :type seed: int, optional :returns: A tuple containing: - A list of arrays (one per event) containing the requested parameters. - A list of arrays (one per event) containing the log-prior weights. :rtype: tuple[list[np.ndarray], list[np.ndarray]] .. py:method:: load_file(filename: pathlib.Path | str, datasets: str | tuple[str, Ellipsis]) -> pandas.DataFrame :classmethod: Loads a single PE sample file into a DataFrame. 
:param filename: Path to the sample file. :type filename: Path | str :param datasets: Name or tuple of names of the dataset(s) to load from the HDF5 file, in order of preference. :type datasets: str | tuple[str, ...] :returns: DataFrame containing the samples from the file. :rtype: pd.DataFrame .. py:attribute:: alternate_datasets :type: dict[str, str] :value: None Mapping of filenames to an alternate dataset name, overriding the default dataset(s). .. py:attribute:: alternate_distance_priors :type: dict[str, Literal[None, 'comoving', 'euclidean']] :value: None Mapping of filenames to an alternate distance prior, overriding the default distance prior. .. py:attribute:: alternate_mass_priors :type: dict[str, Literal[None, 'flat-detector-components', 'flat-detector-chirp-mass-ratio', 'flat-source-components']] :value: None Mapping of filenames to an alternate mass prior, overriding the default mass prior. .. py:attribute:: alternate_spin_priors :type: dict[str, Literal[None, 'component']] :value: None Mapping of filenames to an alternate spin prior, overriding the default spin prior. .. py:attribute:: default_datasets :type: tuple[str, Ellipsis] :value: None Default dataset names to look for in HDF5 files, in order of preference. .. py:attribute:: default_distance_prior :type: Literal[None, 'comoving', 'euclidean'] :value: None The distance prior assumed; used to calculate volume-sensitive weights. .. py:attribute:: default_mass_prior :type: Literal[None, 'flat-detector-components', 'flat-detector-chirp-mass-ratio', 'flat-source-components'] :value: None The mass prior assumed during the original PE run to be removed/reweighted. .. py:attribute:: default_spin_prior :type: Literal[None, 'component'] :value: None The spin prior assumed during the original PE run. .. py:attribute:: filenames :type: tuple[pathlib.Path, Ellipsis] Tuple of absolute paths to the sample files. .. 
py:attribute:: max_samples :type: Optional[pydantic.PositiveInt] :value: None If set, limits the number of samples loaded per event to this value. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: parameter_aliases :type: dict[str, str] :value: None Mapping of internal parameter names to the column names used in the CSV files. .. py:class:: FlowMCGlobalConfig(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Configuration for the FlowMC sampler. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:method:: from_json(config_path: str) -> FlowMCGlobalConfig :classmethod: Initializes the configuration from a JSON file. :param config_path: Path to the JSON file containing the FlowMC sampler settings. :type config_path: str :returns: An instance of FlowMCGlobalConfig. :rtype: FlowMCGlobalConfig .. py:attribute:: batch_size :type: pydantic.PositiveInt :value: None Number of samples per training batch for the Normalizing Flow. .. py:attribute:: chain_batch_size :type: Annotated[int, Field(ge=0)] :value: None Batch size for processing chains. If 0, processes all chains simultaneously. .. py:attribute:: global_thinning :type: pydantic.PositiveInt :value: None Thinning factor applied to global (Normalizing Flow) proposals. .. py:attribute:: history_window :type: pydantic.PositiveInt :value: None Size of the rolling history window used for training data or adaptation. .. py:attribute:: learning_rate :type: pydantic.PositiveFloat :value: None Learning rate for the Normalizing Flow optimizer. .. py:attribute:: local_sampler_name :type: Literal['mala', 'hmc'] :value: None The underlying local MCMC sampler to use ('mala' for MALA or 'hmc' for HMC). .. 
py:attribute:: local_thinning :type: pydantic.PositiveInt :value: None Thinning factor applied to local sampler steps. .. py:attribute:: mass_matrix :type: pydantic.PositiveFloat | NumPyArrayTypeForPydantic :value: None Mass matrix diagonal elements or scalar value for HMC trajectory dynamics. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: n_NFproposal_batch_size :type: pydantic.PositiveInt :value: None Batch size used when generating proposal steps from the Normalizing Flow. .. py:attribute:: n_chains :type: pydantic.PositiveInt :value: None Number of chains to sample. .. py:attribute:: n_epochs :type: pydantic.PositiveInt :value: None Number of training epochs for the Normalizing Flow per training loop. .. py:attribute:: n_global_steps :type: pydantic.PositiveInt :value: None Number of global production/exploration steps to take using the NF proposal. .. py:attribute:: n_leapfrog :type: pydantic.PositiveInt :value: None Number of leapfrog steps per HMC trajectory (ignored if using MALA). .. py:attribute:: n_local_steps :type: pydantic.PositiveInt :value: None Number of local sampler steps to take between global updates. .. py:attribute:: n_max_examples :type: pydantic.PositiveInt :value: None Maximum number of total samples/examples to store in the training history. .. py:attribute:: n_production_loops :type: pydantic.PositiveInt :value: None Number of production loops to run after the model is trained. .. py:attribute:: n_training_loops :type: pydantic.PositiveInt :value: None Number of initial loops dedicated to tuning and training the Normalizing Flow. .. py:attribute:: rq_spline_hidden_units :type: list[pydantic.PositiveInt] :value: None Layer widths of the neural network conditioning the Rational-Quadratic Splines. .. 
py:attribute:: rq_spline_n_bins :type: pydantic.PositiveInt :value: None Number of bins used in each Rational-Quadratic Spline transformation layer. .. py:attribute:: rq_spline_n_layers :type: pydantic.PositiveInt :value: None Total number of flow layers (coupling blocks) in the Normalizing Flow. .. py:attribute:: rq_spline_range :type: tuple[float, float] :value: None The bounding box interval (min, max) where the spline transformations are active. .. py:attribute:: sampler_name :type: Literal['flowMC'] :value: 'flowMC' The name of the sampler to use. Currently only 'flowMC' is supported. .. py:attribute:: step_size :type: pydantic.PositiveFloat :value: None The initial step size (or integration step size) for the local sampler. .. py:attribute:: verbose :type: bool :value: None If True, prints execution progress logs and loss metrics to the console. .. py:class:: NumpyroGlobalConfig(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Configuration for the Numpyro sampler, including both kernel and MCMC settings. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:method:: from_json(config_path: str) -> NumpyroGlobalConfig :classmethod: Initializes the configuration from a JSON file. :param config_path: Path to the JSON file containing the Numpyro sampler settings. :type config_path: str :returns: An instance of NumpyroGlobalConfig. :rtype: NumpyroGlobalConfig .. py:attribute:: kernel :type: NumpyroNUTSSamplerConfig :value: None Configuration for the NUTS sampler kernel. .. py:attribute:: mcmc :type: NumpyroMCMCConfig :value: None Configuration for the MCMC sampling procedure. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. 
py:attribute:: sampler_name :type: Literal['numpyro'] :value: 'numpyro' The name of the sampler to use. Currently only 'numpyro' is supported. .. py:class:: NumpyroMCMCConfig(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Configuration for the Numpyro MCMC. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:attribute:: chain_method :type: Literal['parallel', 'sequential', 'vectorized'] :value: None One of `"parallel"` (default), `"sequential"`, or `"vectorized"`. The `"parallel"` method draws chains in parallel on XLA devices (CPUs/GPUs/TPUs). If there are not enough devices for `"parallel"`, sampling falls back to the `"sequential"` method, which draws chains one after another. The `"vectorized"` method is an experimental feature that vectorizes the drawing method, allowing samples to be collected in parallel on a single device. .. py:attribute:: jit_model_args :type: bool :value: None If set to `True`, this will compile the potential energy computation as a function of model arguments. As such, calling :meth:`~numpyro.infer.MCMC.run` again on a different dataset of the same size will not result in additional compilation cost. Note that currently, this does not take effect for the case `num_chains > 1` and `chain_method == 'parallel'`. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: num_chains :type: pydantic.PositiveInt :value: None Number of MCMC chains to run. By default, chains will be run in parallel using :func:`~jax.pmap`. If there are not enough devices available, chains will be run in sequence. .. 
py:attribute:: num_samples :type: pydantic.PositiveInt :value: None Number of samples to generate from the Markov chain. .. py:attribute:: num_warmup :type: pydantic.PositiveInt :value: None Number of warmup steps. .. py:attribute:: progress_bar :type: bool :value: None Whether to enable progress bar updates. Defaults to `True`. .. py:attribute:: progress_rate :type: pydantic.PositiveInt | None :value: None Number of iterations per progress bar update. Defaults to `None`, which is 5% of total iterations when there are more than 20 iterations, otherwise every iteration. .. py:attribute:: thinning :type: pydantic.PositiveInt :value: None Positive integer that controls the fraction of post-warmup samples that are retained. For example if thinning is 2 then every other sample is retained. Defaults to 1, i.e. no thinning. .. py:class:: NumpyroNUTSSamplerConfig(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Configuration for the Numpyro NUTS. Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. `self` is explicitly positional-only to allow `self` as a field name. .. py:attribute:: adapt_mass_matrix :type: bool :value: None A flag to decide if we want to adapt mass matrix during warm-up phase using Welford scheme. .. py:attribute:: adapt_step_size :type: bool :value: None A flag to decide if we want to adapt step_size during warm-up phase using Dual Averaging scheme. .. py:attribute:: dense_mass :type: bool | list[tuple[str, Ellipsis]] :value: None This flag controls whether mass matrix is dense (i.e. full-rank) or diagonal (defaults to dense_mass=False). To specify a structured mass matrix, users can provide a list of tuples of site names. Each tuple represents a block in the joint mass matrix. 
For example, assuming that the model has latent variables "x", "y", "z" (where each variable can be multi-dimensional), possible specifications and corresponding mass matrix structures are as follows: - `dense_mass=[("x", "y")]`: use a dense mass matrix for the joint (x, y) and a diagonal mass matrix for z - `dense_mass=[]` (equivalent to `dense_mass=False`): use a diagonal mass matrix for the joint (x, y, z) - `dense_mass=[("x", "y", "z")]` (equivalent to `dense_mass=True`): use a dense mass matrix for the joint (x, y, z) - `dense_mass=[("x",), ("y",), ("z",)]`: use dense mass matrices for each of x, y, and z (i.e. block-diagonal with 3 blocks) .. py:attribute:: find_heuristic_step_size :type: bool :value: None Whether or not to use a heuristic function to adjust the step size at the beginning of each adaptation window. Defaults to False. .. py:attribute:: forward_mode_differentiation :type: bool :value: None Whether to use forward-mode differentiation instead of reverse-mode differentiation. By default, reverse mode is used, but forward mode can improve performance in some cases. In addition, some JAX control-flow utilities, such as jax.lax.while_loop or jax.lax.fori_loop, only support forward-mode differentiation. .. py:attribute:: inverse_mass_matrix :type: None | NumPyArrayTypeForPydantic | dict :value: None Initial value for inverse mass matrix. This may be adapted during warmup if `adapt_mass_matrix = True`. If no value is specified, then it is initialized to the identity matrix. For a `potential_fn` with general JAX pytree parameters, the order of entries of the mass matrix is the order of the flattened version of pytree parameters obtained with :func:`~jax.tree_util.tree_flatten`, which is a bit ambiguous (see more at https://jax.readthedocs.io/en/latest/pytrees.html). If model is not None, here we can specify a structured block mass matrix as a dictionary, where keys are tuples of site names and values are the corresponding blocks of the mass matrix. 
For more information about structured mass matrix, see dense_mass argument. .. py:attribute:: max_tree_depth :type: pydantic.PositiveInt | tuple[pydantic.PositiveInt, pydantic.PositiveInt] :value: None Max depth of the binary tree created during the doubling scheme of NUTS sampler. Defaults to 8. This argument also accepts a tuple of integers (d1, d2), where d1 is the max tree depth during warmup phase and d2 is the max tree depth during post warmup phase. .. py:attribute:: model_config Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict]. .. py:attribute:: regularize_mass_matrix :type: bool :value: None .. py:attribute:: step_size :type: pydantic.PositiveFloat :value: None Determines the size of a single step taken by the verlet integrator while computing the trajectory using Hamiltonian dynamics. If not specified, it will be set to 1. .. py:attribute:: target_accept_prob :type: Annotated[float, Field(gt=0.0, le=1.0)] :value: None Target acceptance probability for step size adaptation using Dual Averaging. Increasing this value will lead to a smaller step size, hence the sampling will be slower but more robust. Defaults to 0.8. .. py:class:: PoissonMeanEstimationLoader(/, **data: Any) Bases: :py:obj:`pydantic.BaseModel` Create a new model by parsing and validating input data from keyword arguments. Raises [`ValidationError`][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. 
`self` is explicitly positional-only to allow `self` as a field name. .. py:method:: from_json(config_path: str, key: jaxtyping.PRNGKeyArray, parameters: Tuple[str, Ellipsis]) :classmethod: .. py:method:: get_estimators() -> Tuple[Optional[Callable[[jaxtyping.Array], jaxtyping.Array]], Callable[Ellipsis, jaxtyping.Array], dict[str, Any]] .. py:attribute:: loader :type: Union[NeuralVolumeTimeSensitivityPoissonMeanLoader, NeuralVolumeProbabilityOfDetectionPoissonMeanLoader, GWTCInjectionLoader, CustomPoissonMeanEstimationLoader] :value: None .. py:class:: SamplerConfig Factory interface for generating Sampler Configs. .. py:method:: from_json(config_path: str) -> NumpyroGlobalConfig | FlowMCGlobalConfig :staticmethod: Initializes and returns the specific config instance directly from JSON.
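
The factory behaviour described for ``SamplerConfig.from_json`` can be pictured with a small sketch. It assumes the factory selects the concrete config class from the ``sampler_name`` field in the JSON file, which this reference does not confirm. The dataclasses ``NumpyroStandIn`` and ``FlowMCStandIn``, the ``_REGISTRY`` table, and the function ``sampler_config_from_json`` are hypothetical stand-ins, not part of gwkokab; the real classes are pydantic models with many more fields.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


# Hypothetical stand-ins for NumpyroGlobalConfig and FlowMCGlobalConfig.
@dataclass(frozen=True)
class NumpyroStandIn:
    sampler_name: str
    num_chains: int


@dataclass(frozen=True)
class FlowMCStandIn:
    sampler_name: str
    n_chains: int


# Assumed dispatch table keyed on the ``sampler_name`` literal of each config.
_REGISTRY = {"numpyro": NumpyroStandIn, "flowMC": FlowMCStandIn}


def sampler_config_from_json(config_path):
    """Read a JSON file and build the config selected by ``sampler_name``."""
    data = json.loads(Path(config_path).read_text())
    name = data.get("sampler_name")
    if name not in _REGISTRY:
        raise ValueError(f"unknown sampler_name: {name!r}")
    return _REGISTRY[name](**data)


# Usage: write a throwaway config file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sampler.json"
    path.write_text(json.dumps({"sampler_name": "flowMC", "n_chains": 4}))
    config = sampler_config_from_json(path)

print(type(config).__name__, config.n_chains)
```

Keying the lookup on a literal field keeps the caller unaware of which sampler backend is configured; an unknown name fails early with a ``ValueError`` rather than producing a partially built config.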