spras.config package
Submodules
spras.config.algorithms module
Dynamic construction of algorithm parameters with runtime type information for parameter combinations. This is isolated from schema.py because it is not declarative; it mainly contains validators and lower-level pydantic code.
spras.config.config module
This config module is used as a singleton. Because Python creates a single instance of a module when it is imported, we rely on the Snakefile instantiating the module. In particular, when the Snakefile calls init_config, it will reassign config to take the value of the actual config provided by Snakemake. After that point, any module that imports this module can access a config option by checking the object’s value. For example,
    import spras.config.config as config
    container_framework = config.config.container_settings.framework
will grab the container framework option as parsed from the config file.
- class spras.config.config.Config(raw_config: dict[str, Any])
Bases: object
- classmethod from_file(filepath: str | PathLike[str])
- spras.config.config.init_from_file(filepath)
- spras.config.config.init_global(config_dict)
spras.config.container_schema module
The separate container schema specification file. For information about pydantic, see schema.py.
We move this to a separate file to allow containers.py to explicitly take in this subsection of the configuration.
- class spras.config.container_schema.ContainerFramework(*values)
Bases: CaseInsensitiveEnum
- apptainer = 'apptainer'
- docker = 'docker'
- dsub = 'dsub'
- singularity = 'singularity'
- class spras.config.container_schema.ContainerRegistry(*, base_url: str = 'docker.io', owner: str = 'reedcompbio')
Bases: BaseModel
- base_url: str
The domain of the registry
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_attribute_docstrings': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- owner: str
The owner or project of the registry
- class spras.config.container_schema.ContainerSettings(*, framework: ContainerFramework = ContainerFramework.docker, unpack_singularity: bool = False, enable_profiling: bool = False, registry: ContainerRegistry)
Bases: BaseModel
- enable_profiling: bool
A Boolean indicating whether to enable container runtime profiling (apptainer/singularity only)
- framework: ContainerFramework
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_attribute_docstrings': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- registry: ContainerRegistry
- unpack_singularity: bool
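As a rough illustration of the shape these two models accept, the containers section of a parsed config could look like the dictionary below. The values are hypothetical; the keys mirror the ContainerSettings and ContainerRegistry fields listed above. The unknown_keys helper is an assumption added here to imitate, at the top level only, what 'extra': 'forbid' implies: unrecognized keys are rejected rather than silently ignored.

```python
# Hypothetical values; keys mirror the ContainerSettings / ContainerRegistry fields.
containers = {
    "framework": "docker",  # one of: apptainer, docker, dsub, singularity
    "unpack_singularity": False,
    "enable_profiling": False,
    # registry has no default in the signature above, so it must be provided.
    "registry": {
        "base_url": "docker.io",
        "owner": "reedcompbio",
    },
}

ALLOWED_KEYS = {"framework", "unpack_singularity", "enable_profiling", "registry"}


def unknown_keys(section: dict) -> set:
    """Return top-level keys that a model with 'extra': 'forbid' would reject."""
    return set(section) - ALLOWED_KEYS
```

A misspelled key such as frmework would show up in the returned set, which is the kind of mistake the forbid setting is meant to surface early.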
- class spras.config.container_schema.ProcessedContainerSettings(framework: spras.config.container_schema.ContainerFramework = <ContainerFramework.docker: 'docker'>, unpack_singularity: bool = False, prefix: str = 'docker.io/reedcompbio', enable_profiling: bool = False, hash_length: int = 7)
Bases: object
- enable_profiling: bool = False
- framework: ContainerFramework = 'docker'
- static from_container_settings(settings: ContainerSettings, hash_length: int) ProcessedContainerSettings
- hash_length: int = 7
The hash length for container-specific usage. This does not appear in the output folder, but it may show up in logs, and it rarely needs to be changed. This will be the top-level hash_length specified in the config.
We prefer this hash_length in our container-running logic to avoid a (future) dependency diamond.
- prefix: str = 'docker.io/reedcompbio'
- unpack_singularity: bool = False
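The relationship between ContainerSettings and ProcessedContainerSettings can be sketched with plain dataclasses. This is an assumption-laden illustration, not the SPRAS code: the class names Registry, Settings, and Processed are hypothetical, and the rule that prefix equals base_url and owner joined by a slash is inferred only from the defaults shown above ('docker.io', 'reedcompbio', and 'docker.io/reedcompbio').

```python
from dataclasses import dataclass


@dataclass
class Registry:
    base_url: str = "docker.io"
    owner: str = "reedcompbio"


@dataclass
class Settings:
    registry: Registry
    framework: str = "docker"
    unpack_singularity: bool = False
    enable_profiling: bool = False


@dataclass
class Processed:
    framework: str = "docker"
    unpack_singularity: bool = False
    prefix: str = "docker.io/reedcompbio"
    enable_profiling: bool = False
    hash_length: int = 7

    @staticmethod
    def from_settings(settings: Settings, hash_length: int) -> "Processed":
        # Assumption: the image prefix is "<base_url>/<owner>".
        prefix = f"{settings.registry.base_url}/{settings.registry.owner}"
        return Processed(
            framework=settings.framework,
            unpack_singularity=settings.unpack_singularity,
            prefix=prefix,
            enable_profiling=settings.enable_profiling,
            hash_length=hash_length,
        )
```

Passing hash_length in explicitly, rather than reading it from a global config, keeps the container logic independent of the rest of the configuration, which matches the stated goal of avoiding a dependency diamond.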
spras.config.dataset module
- class spras.config.dataset.DatasetSchema(*, label: ~typing.Annotated[str, ~pydantic.functional_validators.AfterValidator(func=~spras.config.util.label_validator.<locals>.validate)], node_files: list[str | ~os.PathLike[str]], edge_files: list[str | ~os.PathLike[str]], other_files: list[str | ~os.PathLike[str]], data_dir: str | ~os.PathLike[str])
Bases: BaseModel
Collection of information related to Dataset objects in the configuration.
- data_dir: str | PathLike[str]
- edge_files: list[str | PathLike[str]]
- label: Annotated[str, AfterValidator(func=label_validator.<locals>.validate)]
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- node_files: list[str | PathLike[str]]
- other_files: list[str | PathLike[str]]
spras.config.revision module
The revision is an optional hash associated with all files in the designated output directory to make sure that file _names_ are immutable. We attach the revision to three labels:
Datasets
Gold standards
Algorithms
In the future, the spras revision may change depending on what files are affected (e.g. specific algorithms will have specific revisions that change as they get updated) to avoid unnecessary reruns in the Reed-CompBio/spras-benchmarking repository.
This is an optional feature, as the spras_revision function below is dependent on a RECORD file (described in the docstring associated with spras_revision).
We provide the convenient attach_spras_revision used in ./config.py, and detach_spras_revision used to get rid of the revision for algorithms specifically.
- spras.config.revision.attach_spras_revision(immutable_files: bool, label: str) str
Attaches the SPRAS revision to a label. This function signature may become more complex as specific labels get versioned.
@param label: The label to attach the SPRAS revision to.
@param immutable_files: if False, this function is equivalent to the identity function (the label is returned unchanged).
- spras.config.revision.detach_spras_revision(immutable_files: bool, attached_label: str) str
The inverse of attach_spras_revision.
- spras.config.revision.spras_revision() str
Gets the current revision of SPRAS.
Note: This depends on neither the SPRAS release version number nor the git commit, but solely on the PyPA RECORD file (https://packaging.python.org/en/latest/specifications/recording-installed-packages/#the-record-file), which contains hashes of all of the installed SPRAS files [excluding RECORD itself] and is also included in the package distribution. This means that, when developing SPRAS, spras_revision will be updated when spras is initially installed. However, for editable pip installs (e.g. from pip install -e .), the spras_revision will not be updated, as the RECORD file only contains metadata: https://setuptools.pypa.io/en/latest/userguide/development_mode.html.
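The three functions above can be approximated with the standard library alone. This is a sketch under stated assumptions, not the SPRAS implementation: hashing the RECORD text with SHA-256, truncating to 8 characters, and the "<label>_<revision>" format are all hypothetical, and the explicit revision argument is added here only so the example does not require SPRAS to be installed.

```python
import hashlib
from importlib.metadata import PackageNotFoundError, distribution
from typing import Optional


def read_record(dist_name: str) -> Optional[str]:
    """Return the RECORD text of an installed distribution, or None if unavailable."""
    try:
        return distribution(dist_name).read_text("RECORD")
    except PackageNotFoundError:
        return None


def revision_from_record(record_text: str, length: int = 8) -> str:
    """Assumed scheme: a short prefix of the SHA-256 digest of the RECORD contents."""
    return hashlib.sha256(record_text.encode("utf-8")).hexdigest()[:length]


def attach_revision(immutable_files: bool, label: str, revision: str) -> str:
    """Assumed format "<label>_<revision>"; identity when immutable_files is False."""
    return f"{label}_{revision}" if immutable_files else label


def detach_revision(immutable_files: bool, attached_label: str, revision: str) -> str:
    """Inverse of attach_revision for the same immutable_files flag and revision."""
    suffix = f"_{revision}"
    if immutable_files and attached_label.endswith(suffix):
        return attached_label[: -len(suffix)]
    return attached_label
```

Since RECORD lists a hash for each installed file, any change to an installed file changes the RECORD text and therefore the derived revision, without consulting the release version or git history.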
spras.config.schema module
Contains the raw pydantic schema for the configuration file.
Using Pydantic as our backing config parser allows us to declaratively type our config, giving us more robust user errors with guarantees that parts of the config exist after parsing it through Pydantic.
We declare models using two classes here:
- BaseModel (docs: https://docs.pydantic.dev/latest/concepts/models/)
- CaseInsensitiveEnum (see ./util.py)
- class spras.config.schema.Analysis(*, summary: ~spras.config.schema.SummaryAnalysis = SummaryAnalysis(include=False), cytoscape: ~spras.config.schema.CytoscapeAnalysis = CytoscapeAnalysis(include=False), ml: ~spras.config.schema.MlAnalysis = MlAnalysis(include=False, aggregate_per_algorithm=False, components=2, labels=True, kde=False, remove_empty_pathways=False, linkage=<MlLinkage.ward: 'ward'>, metric=<MlMetric.euclidean: 'euclidean'>), evaluation: ~spras.config.schema.EvaluationAnalysis = EvaluationAnalysis(include=False, aggregate_per_algorithm=False))
Bases: BaseModel
- cytoscape: CytoscapeAnalysis
- evaluation: EvaluationAnalysis
- ml: MlAnalysis
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- summary: SummaryAnalysis
- class spras.config.schema.CytoscapeAnalysis(*, include: bool)
Bases: BaseModel
- include: bool
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class spras.config.schema.EvaluationAnalysis(*, include: bool, aggregate_per_algorithm: bool = False)
Bases: BaseModel
- aggregate_per_algorithm: bool
- include: bool
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class spras.config.schema.GoldStandard(*, label: ~typing.Annotated[str, ~pydantic.functional_validators.AfterValidator(func=~spras.config.util.label_validator.<locals>.validate)], node_files: list[str] = [], edge_files: list[str] = [], data_dir: str, dataset_labels: list[str])
Bases: BaseModel
- data_dir: str
- dataset_labels: list[str]
- edge_files: list[str]
- label: Annotated[str, AfterValidator(func=label_validator.<locals>.validate)]
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- node_files: list[str]
- class spras.config.schema.Locations(*, reconstruction_dir: str)
Bases: BaseModel
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- reconstruction_dir: str
- class spras.config.schema.MlAnalysis(*, include: bool, aggregate_per_algorithm: bool = False, components: int = 2, labels: bool = True, kde: bool = False, remove_empty_pathways: bool = False, linkage: MlLinkage = MlLinkage.ward, metric: MlMetric = MlMetric.euclidean)
Bases: BaseModel
- aggregate_per_algorithm: bool
- components: int
- include: bool
- kde: bool
- labels: bool
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- remove_empty_pathways: bool
- class spras.config.schema.MlLinkage(*values)
Bases: CaseInsensitiveEnum
- average = 'average'
- complete = 'complete'
- single = 'single'
- ward = 'ward'
- class spras.config.schema.MlMetric(*values)
Bases: CaseInsensitiveEnum
- cosine = 'cosine'
- euclidean = 'euclidean'
- manhattan = 'manhattan'
- class spras.config.schema.RawConfig(*, containers: ~spras.config.container_schema.ContainerSettings, immutable_files: bool = False, hash_length: int = 7, algorithms: list[~typing.Annotated[~spras.config.algorithms.allpairsModel | ~spras.config.algorithms.bowtiebuilderModel | ~spras.config.algorithms.diamondModel | ~spras.config.algorithms.dominoModel | ~spras.config.algorithms.meoModel | ~spras.config.algorithms.mincostflowModel | ~spras.config.algorithms.omicsintegrator1Model | ~spras.config.algorithms.omicsintegrator2Model | ~spras.config.algorithms.pathlinkerModel | ~spras.config.algorithms.responsenetModel | ~spras.config.algorithms.rwrModel | ~spras.config.algorithms.strwrModel, FieldInfo(annotation=NoneType, required=True, discriminator='name')]], datasets: list[~spras.config.dataset.DatasetSchema], gold_standards: list[~spras.config.schema.GoldStandard] = [], analysis: ~spras.config.schema.Analysis = Analysis(summary=SummaryAnalysis(include=False), cytoscape=CytoscapeAnalysis(include=False), ml=MlAnalysis(include=False, aggregate_per_algorithm=False, components=2, labels=True, kde=False, remove_empty_pathways=False, linkage=<MlLinkage.ward: 'ward'>, metric=<MlMetric.euclidean: 'euclidean'>), evaluation=EvaluationAnalysis(include=False, aggregate_per_algorithm=False)), reconstruction_settings: ~spras.config.schema.ReconstructionSettings)
Bases: BaseModel
- algorithms: list[Annotated[allpairsModel | bowtiebuilderModel | diamondModel | dominoModel | meoModel | mincostflowModel | omicsintegrator1Model | omicsintegrator2Model | pathlinkerModel | responsenetModel | rwrModel | strwrModel, FieldInfo(annotation=NoneType, required=True, discriminator='name')]]
- containers: ContainerSettings
- datasets: list[DatasetSchema]
- gold_standards: list[GoldStandard]
- hash_length: int
The length of the hash used to identify a parameter combination
- immutable_files: bool
If enabled, this tags all files with their local file version. Most files do not have a specific version, and by default, this will be the hash of all the SPRAS files in the PyPA installation. This option will not work if SPRAS was not installed in a PyPA-compliant manner (PyPA-compliant installations include but are not limited to pip, poetry, uv, conda, pixi.)
By default, this is disabled, as it can make output file names confusing.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_attribute_docstrings': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- reconstruction_settings: ReconstructionSettings
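To make the role of hash_length concrete, here is one way a parameter combination could be reduced to a short identifier. The hashing scheme SPRAS actually uses is not shown in this reference; the canonical JSON encoding and SHA-1 digest below, as well as the function name params_hash, are assumptions chosen only for illustration.

```python
import hashlib
import json


def params_hash(params: dict, hash_length: int = 7) -> str:
    """Hypothetical: identify a parameter combination by a truncated digest."""
    # sort_keys makes the result independent of the order keys were written in.
    canonical = json.dumps(params, sort_keys=True)
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return digest[:hash_length]


# Example parameter combination (hypothetical parameter name).
short_id = params_hash({"k": 100})
```

A shorter hash_length gives more readable identifiers at the cost of a higher chance that two parameter combinations collide.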
spras.config.util module
General config utilities. This is the only config file that should be imported by algorithms, and algorithms should only import this config file.
- class spras.config.util.CaseInsensitiveEnum(new_class_name, /, names, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: str, Enum
We prefer this over Enum to make sure the config parsing is more relaxed when it comes to string enum values.
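One common way to get this relaxed behavior in Python is to override the Enum _missing_ hook, which is consulted only when the normal value lookup fails. The sketch below uses that approach; whether SPRAS implements CaseInsensitiveEnum this way is an assumption, and the class names RelaxedStrEnum and Framework are hypothetical.

```python
from enum import Enum


class RelaxedStrEnum(str, Enum):
    """Hypothetical case-insensitive string enum."""

    @classmethod
    def _missing_(cls, value):
        # Reached only when the exact value lookup fails.
        if isinstance(value, str):
            lowered = value.lower()
            for member in cls:
                if member.value.lower() == lowered:
                    return member
        # Returning None lets Enum raise its usual ValueError.
        return None


class Framework(RelaxedStrEnum):
    apptainer = "apptainer"
    docker = "docker"
    dsub = "dsub"
    singularity = "singularity"


framework = Framework("Docker")
```

With this in place, a config value written as Docker or DOCKER resolves to the same member as docker, while a value that matches no member still raises ValueError.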
- class spras.config.util.Empty
Bases: BaseModel
The empty base model. Used for specifying that an algorithm takes no parameters, yet is deterministic.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- spras.config.util.label_validator(name: str)
A validator that takes in a label and ensures that it contains only letters, numbers, or underscores.
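The signatures above suggest that label_validator is a factory returning an inner validate function (it appears as label_validator.<locals>.validate). A minimal standard-library sketch of that shape follows. The function name make_label_validator, the error message, the restriction to ASCII letters, and the interpretation of name as the kind of object being labeled are all assumptions.

```python
import re
from typing import Callable


def make_label_validator(name: str) -> Callable[[str], str]:
    """Hypothetical factory: `name` only personalizes the error message."""
    # Assumption: "letters" means ASCII letters; empty labels are rejected.
    pattern = re.compile(r"[A-Za-z0-9_]+")

    def validate(label: str) -> str:
        if not pattern.fullmatch(label):
            raise ValueError(
                f"{name} label '{label}' may contain only letters, numbers, or underscores."
            )
        # Returning the value unchanged is what an after-validator is expected to do.
        return label

    return validate


validate_dataset_label = make_label_validator("Dataset")
```

A function of this shape can be wrapped in pydantic's AfterValidator inside an Annotated type, which is how the label fields of DatasetSchema and GoldStandard are declared above.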