lyscripts

MIT license

What are these lyscripts?

This package provides convenient scripts for learning and inference with models of lymphatic spread in head & neck cancer. Essentially, it provides a command line interface (CLI) to the lymph library.

We are making these "convenience" scripts public because doing so is a necessary step toward making our research easily and fully reproducible. A separate repository, lynference, persistently stores the pipelines that produce(d) our published results. Head over there to learn more about how to reproduce our work.

Installation

These scripts can be installed via pip:

pip install lyscripts

or from source by cloning this repository:

git clone https://github.com/rmnldwg/lyscripts.git
cd lyscripts
pip install .

Usage

After installing the package, run lyscripts --help to see the following output:

USAGE: lyscripts [-h] [-v]
                 {app,data,evaluate,plot,predict,sample,temp_schedule} ...

Utility for performing common tasks w.r.t. the inference and prediction tasks one
can use the `lymph` package for.

POSITIONAL ARGUMENTS:
  {app,data,evaluate,plot,predict,sample,temp_schedule}
    app                 Module containing scripts to run different `streamlit`
                        applications.
    data                Provide a range of commands related to datasets on
                        patterns of lymphatic progression. Currently, the
                        following modules provide additional commands: 1. The
                        `lyscripts.data.clean` module that converts a LyProX-style
                        table of patient information into a simplified format that
                        is used by the `lymph` model. 2. `lyscripts.data.enhance`,
                        a module for computing consensus diagnoses and to ensure
                        that super- and sublevels are consistently reported. 3.
                        The module `lyscripts.data.generate` for creating
                        synthetic datasets with certain characteristics. 4.
                        Submodule `lyscripts.data.join` to concatenate two
                        datasets, e.g. from different institutions. 5.
                        `lyscripts.data.split`, a module with which datasets may
                        be split into random sets of patient data. The split data
                        may then be used e.g. for cross-validation.
    evaluate            Evaluate the performance of the trained model by computing
                        quantities like the Bayesian information criterion (BIC)
                        or (if thermodynamic integration was performed) the actual
                        evidence (with error) of the model.
    plot                Provide various plotting utilities for displaying results
                        of e.g. the inference or prediction process. At the
                        moment, three subcommands are grouped under
                        `lyscripts.plot`: 1. `lyscripts.plot.corner`, which simply
                        outputs a corner plot with nice labels for a set of drawn
                        samples. 2. The module `lyscripts.plot.histograms` can be
                        used to draw histograms, e.g. the ones over risks and
                        prevalences as computed by the `lyscripts.predict` module.
                        3. Module `lyscripts.plot.thermo_int` allows comparing
                        rounds of thermodynamic integration for different models.
    predict             This module provides functions and scripts to predict the
                        risk of hidden involvement, given observed diagnoses, and
                        prevalences of patterns for diagnostic modalities. The
                        submodules for prediction are currently: 1. The
                        `lyscripts.predict.prevalences` module for computing
                        prevalences of certain involvement patterns that may also
                        be compared to observed prevalences. 2. A module
                        `lyscripts.predict.risks` for predicting the risk of any
                        specific pattern of involvement given any particular
                        diagnosis.
    sample              Learn the spread probabilities of the HMM for lymphatic
                        tumor progression using the preprocessed data as input and
                        MCMC as sampling method. This is the central script
                        performing for our project on modelling lymphatic spread
                        in head & neck cancer. We use it for model comparison via
                        the thermodynamic integration functionality and use the
                        sampled parameter estimates for risk predictions. This
                        risk estimate may in turn some day guide clinicians to
                        make more objective decisions with respect to defining the
                        *elective clinical target volume* (CTV-N) in radiotherapy.
    temp_schedule       Generate inverse temperature schedules for thermodynamic
                        integration using various different methods. Thermodynamic
                        integration is quite sensitive to the specific schedule
                        which is used. I noticed in my models, that within the
                        interval $[0, 0.1]$, the increase in the expected
                        log-likelihood is very steep. Hence, the inverse
                        temparature $\beta$ must be more densely spaced in the
                        beginning. This can be achieved by using a power sequence:
                        Generate $n$ linearly spaced points in the interval $[0,
                        1]$ and then transform each point by computing $\beta_i^k$
                        where $k$ could e.g. be 5.

OPTIONAL ARGUMENTS:
  -h, --help            show this help message and exit
  -v, --version         Display the version of lyscripts (default: False)

Each subcommand provides its own help page like this one, detailing its positional and optional arguments along with their function.
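The power sequence described in the temp_schedule help text above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the implementation used by lyscripts: the function name `power_schedule` and its parameters `num` and `power` are assumptions made for this example.

```python
def power_schedule(num: int, power: float = 5.0) -> list[float]:
    """Return `num` inverse temperatures in [0, 1], denser near 0.

    Hypothetical helper for illustration; not part of the lyscripts API.
    """
    if num < 2:
        raise ValueError("need at least two points to span [0, 1]")
    # linearly spaced points 0, 1/(num-1), ..., 1
    linear = [i / (num - 1) for i in range(num)]
    # raising each point to a power > 1 pushes the values toward 0
    return [x ** power for x in linear]


schedule = power_schedule(num=5, power=5.0)
# with 5 points and power 5, three of the five values fall below 0.1
print(schedule)
```

Because the end points 0 and 1 are fixed under any positive power, the schedule still spans the full interval; only the spacing in between changes.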

"""
.. include:: ../README.md
"""
import argparse
import logging
import re

from rich.containers import Lines
from rich.logging import RichHandler
from rich.text import Text
from rich_argparse import RichHelpFormatter

from lyscripts import app, data, evaluate, plot, predict, sample, temp_schedule
from lyscripts._version import version
from lyscripts.utils import report

__version__ = version
__description__ = "Package containing scripts used in lynference pipelines"
__author__ = "Roman Ludwig"
__email__ = "roman.ludwig@usz.ch"
__uri__ = "https://github.com/rmnldwg/lyscripts"

# nopycln: file


logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())


class RichDefaultHelpFormatter(
    RichHelpFormatter,
    argparse.ArgumentDefaultsHelpFormatter,
):
    """
    Formatter that combines the functionality of displaying the default value with
    the beauty of the `rich` formatter. It overrides `_rich_fill_text` to re-wrap
    help text paragraph by paragraph.
    """
    def _rich_fill_text(self, text: Text, width: int, indent: Text) -> Text:
        text_cls = type(text)
        # drop a single leading newline, as left behind by triple-quoted docstrings
        if text[0] == text_cls("\n"):
            text = text[1:]

        # paragraphs are separated by blank lines
        paragraphs = text.split(separator="\n\n")
        text_lines = Lines()
        for par in paragraphs:
            # collapse all whitespace inside a paragraph, then wrap it to `width`
            no_newline_par = text_cls(" ").join(line for line in par.split())
            wrapped_par = no_newline_par.wrap(self.console, width)

            for line in wrapped_par:
                text_lines.append(line)

            # separate paragraphs from each other
            text_lines.append(text_cls("\n"))

        return text_cls("\n").join(indent + line for line in text_lines) + "\n\n"


# each highlight regex has a named group that is rendered in the style
# registered under "argparse.<group name>"
RichDefaultHelpFormatter.styles["argparse.syntax"] = "red"
RichDefaultHelpFormatter.styles["argparse.formula"] = "green"
RichDefaultHelpFormatter.highlights.append(
    r"\$(?P<formula>[^$]*)\$"
)
RichDefaultHelpFormatter.styles["argparse.bold"] = "bold"
RichDefaultHelpFormatter.highlights.append(
    r"\*(?P<bold>[^*]*)\*"
)
RichDefaultHelpFormatter.styles["argparse.italic"] = "italic"
RichDefaultHelpFormatter.highlights.append(
    r"_(?P<italic>[^_]*)_"
)


def exit_cli(args: argparse.Namespace):
    """Exit the cmd line tool"""
    if args.version:
        report.print("lyscripts ", __version__)
    else:
        report.print("No command chosen. Exiting...")


def main():
    """
    Utility for performing common tasks w.r.t. the inference and prediction tasks one
    can use the `lymph` package for.
    """
    parser = argparse.ArgumentParser(
        prog="lyscripts",
        description=re.sub(r"\s+", " ", main.__doc__)[1:],
        formatter_class=RichDefaultHelpFormatter,
    )
    # fallback that runs when no subcommand overrides `run_main`
    parser.set_defaults(run_main=exit_cli)
    parser.add_argument(
        "-v", "--version", action="store_true",
        help="Display the version of lyscripts"
    )
    parser.add_argument(
        "--log-level", default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )

    subparsers = parser.add_subparsers()

    # the individual scripts add `ArgumentParser` instances and their arguments to
    # this `subparsers` object
    app._add_parser(subparsers, help_formatter=parser.formatter_class)
    data._add_parser(subparsers, help_formatter=parser.formatter_class)
    evaluate._add_parser(subparsers, help_formatter=parser.formatter_class)
    plot._add_parser(subparsers, help_formatter=parser.formatter_class)
    predict._add_parser(subparsers, help_formatter=parser.formatter_class)
    sample._add_parser(subparsers, help_formatter=parser.formatter_class)
    temp_schedule._add_parser(subparsers, help_formatter=parser.formatter_class)

    args = parser.parse_args()

    # route log messages through the shared `rich` console
    handler = RichHandler(
        console=report,
        show_time=False,
        markup=True,
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(args.log_level)

    args.run_main(args)
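The three highlight patterns appended to RichDefaultHelpFormatter.highlights in the listing above can be tried out with the standard `re` module alone. The sketch below copies the patterns verbatim and reports what each named group captures. The helper `find_highlights` and the sample sentence are made up for this example, and the sketch does not involve `rich` itself, which (as far as we understand) styles the captured span according to the group name.

```python
import re

# patterns copied verbatim from the listing above
HIGHLIGHTS = {
    "formula": r"\$(?P<formula>[^$]*)\$",
    "bold": r"\*(?P<bold>[^*]*)\*",
    "italic": r"_(?P<italic>[^_]*)_",
}


def find_highlights(text: str) -> dict[str, list[str]]:
    """Hypothetical helper: collect the captured spans per named group."""
    return {
        name: [match.group(name) for match in re.finditer(pattern, text)]
        for name, pattern in HIGHLIGHTS.items()
    }


# made-up help text containing one instance of each markup
help_text = r"Space $\beta$ densely near zero for the *elective* target, _not_ linearly."
found = find_highlights(help_text)
print(found)
```

One consequence of the italic pattern worth keeping in mind: any two underscores in a help string delimit a match, so a sentence that mentions two identifiers such as temp_schedule and _add_parser would have the text between the underscores captured as well.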
logger = <Logger lyscripts (WARNING)>
class RichDefaultHelpFormatter(rich_argparse.RichHelpFormatter, argparse.ArgumentDefaultsHelpFormatter):

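Formatter class that combines the functionality of displaying the default value with the beauty of the rich formatter. It is not entirely empty: it overrides _rich_fill_text to re-wrap help text paragraph by paragraph.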
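What the _rich_fill_text override does can be approximated with the standard library. The sketch below swaps rich's Text.wrap for textwrap.fill and works on plain strings, so it is an analogue under that assumption rather than the method itself. The name `fill_paragraphs` is hypothetical, and the exact trailing newlines differ from the original.

```python
import textwrap


def fill_paragraphs(text: str, width: int, indent: str = "") -> str:
    """Re-wrap `text` paragraph by paragraph (plain-string analogue)."""
    # drop a single leading newline, as left behind by triple-quoted docstrings
    if text.startswith("\n"):
        text = text[1:]

    wrapped = []
    for par in text.split("\n\n"):
        # collapse line breaks and runs of whitespace inside the paragraph
        single_line = " ".join(par.split())
        wrapped.append(
            textwrap.fill(
                single_line,
                width=width,
                initial_indent=indent,
                subsequent_indent=indent,
            )
        )

    # keep one blank line between paragraphs
    return "\n\n".join(wrapped)


doc = "\n  First paragraph that is\n  broken over lines.\n\n  Second paragraph.\n"
result = fill_paragraphs(doc, width=40)
print(result)
```

The point of splitting on blank lines first is that paragraph breaks in a docstring survive, while the hard line breaks inside a paragraph are replaced by wrapping at the requested width.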

Inherited Members
rich_argparse.RichHelpFormatter
RichHelpFormatter
group_name_formatter
styles
highlights
usage_markup
console
add_text
add_renderable
add_usage
add_argument
format_help
argparse.HelpFormatter
start_section
end_section
add_arguments
def exit_cli(args: argparse.Namespace):

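Exit the command line tool. This is the fallback that runs when no subcommand is chosen: it prints the version if --version was given and a short notice otherwise.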

def main():

Utility for performing common tasks w.r.t. the inference and prediction tasks one can use the lymph package for.
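The dispatch pattern used by main() relies on each submodule registering its own parser and storing its entry point under run_main. A minimal, self-contained sketch of that pattern with the standard library is shown below. The `greet` subcommand, the `fallback` function, and the body of `_add_parser` are hypothetical stand-ins, since the actual submodule implementations are not shown here.

```python
import argparse


def greet(args: argparse.Namespace) -> str:
    """Hypothetical entry point of the `greet` subcommand."""
    return f"Hello, {args.name}!"


def _add_parser(subparsers, help_formatter) -> None:
    """Hypothetical registration hook, mirroring the calls in main()."""
    parser = subparsers.add_parser(
        "greet",
        help="Print a greeting",
        formatter_class=help_formatter,
    )
    parser.add_argument("name")
    # the subcommand overrides the top-level default for `run_main`
    parser.set_defaults(run_main=greet)


def fallback(args: argparse.Namespace) -> str:
    """Runs when no subcommand was chosen (the role of exit_cli)."""
    return "No command chosen."


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="demo",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.set_defaults(run_main=fallback)
    subparsers = parser.add_subparsers()
    _add_parser(subparsers, help_formatter=parser.formatter_class)
    return parser


parser = build_parser()
with_command = parser.parse_args(["greet", "world"])
without_command = parser.parse_args([])
print(with_command.run_main(with_command))
print(without_command.run_main(without_command))
```

The top-level code never needs to know which subcommands exist: it calls args.run_main(args) and gets either the chosen subcommand's function or the fallback.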