lyscripts
What are these lyscripts?
This package provides convenient scripts for performing inference and learning regarding the lymphatic spread of head & neck cancer. Essentially, it provides a command line interface (CLI) to the `lymph` library.
We are making these "convenience" scripts public because doing so is a necessary requirement for making our research easily and fully reproducible. There exists another repository, lynference, where we persistently store the pipelines that produce(d) our published results. Head over there to learn more about how to reproduce our work.
Installation
These scripts can be installed via `pip`:
pip install lyscripts
or installed from source by cloning this repo:
git clone https://github.com/rmnldwg/lyscripts.git
cd lyscripts
pip install .
Usage
After installing the package, run `lyscripts --help` to see the following output:
USAGE: lyscripts [-h] [-v]
{app,data,evaluate,plot,predict,sample,temp_schedule} ...
Utility for performing common tasks w.r.t. the inference and prediction tasks one
can use the `lymph` package for.
POSITIONAL ARGUMENTS:
{app,data,evaluate,plot,predict,sample,temp_schedule}
app Module containing scripts to run different `streamlit`
applications.
data Provide a range of commands related to datasets on
patterns of lymphatic progression. Currently, the
following modules provide additional commands: 1. The
`lyscripts.data.clean` module that converts a LyProX-style
table of patient information into a simplified format that
is used by the `lymph` model. 2. `lyscripts.data.enhance`,
a module for computing consensus diagnoses and to ensure
that super- and sublevels are consistently reported. 3.
The module `lyscripts.data.generate` for creating
synthetic datasets with certain characteristics. 4.
Submodule `lyscripts.data.join` to concatenate two
datasets, e.g. from different institutions. 5.
`lyscripts.data.split`, a module with which datasets may
be split into random sets of patient data. The split data
may then be used e.g. for cross-validation.
evaluate Evaluate the performance of the trained model by computing
quantities like the Bayesian information criterion (BIC)
or (if thermodynamic integration was performed) the actual
evidence (with error) of the model.
plot Provide various plotting utilities for displaying results
of e.g. the inference or prediction process. At the
moment, three subcommands are grouped under
`lyscripts.plot`: 1. `lyscripts.plot.corner`, which simply
outputs a corner plot with nice labels for a set of drawn
samples. 2. The module `lyscripts.plot.histograms` can be
used to draw histograms, e.g. the ones over risks and
prevalences as computed by the `lyscripts.predict` module.
3. Module `lyscripts.plot.thermo_int` allows comparing
rounds of thermodynamic integration for different models.
predict This module provides functions and scripts to predict the
risk of hidden involvement, given observed diagnoses, and
prevalences of patterns for diagnostic modalities. The
submodules for prediction are currently: 1. The
`lyscripts.predict.prevalences` module for computing
prevalences of certain involvement patterns that may also
be compared to observed prevalences. 2. A module
`lyscripts.predict.risks` for predicting the risk of any
specific pattern of involvement given any particular
diagnosis.
sample Learn the spread probabilities of the HMM for lymphatic
tumor progression using the preprocessed data as input and
MCMC as sampling method. This is the central script
performing for our project on modelling lymphatic spread
in head & neck cancer. We use it for model comparison via
the thermodynamic integration functionality and use the
sampled parameter estimates for risk predictions. This
risk estimate may in turn some day guide clinicians to
make more objective decisions with respect to defining the
*elective clinical target volume* (CTV-N) in radiotherapy.
temp_schedule Generate inverse temperature schedules for thermodynamic
integration using various different methods. Thermodynamic
integration is quite sensitive to the specific schedule
which is used. I noticed in my models, that within the
interval $[0, 0.1]$, the increase in the expected
log-likelihood is very steep. Hence, the inverse
temparature $\beta$ must be more densely spaced in the
beginning. This can be achieved by using a power sequence:
Generate $n$ linearly spaced points in the interval $[0,
1]$ and then transform each point by computing $\beta_i^k$
where $k$ could e.g. be 5.
OPTIONAL ARGUMENTS:
-h, --help show this help message and exit
-v, --version Display the version of lyscripts (default: False)
Each individual subcommand provides its own help page like this one, detailing its positional and optional arguments along with their function.
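The power sequence described in the `temp_schedule` help text can be sketched in a few lines of plain Python. This only illustrates the formula and is not the implementation inside lyscripts; the function name `power_schedule` and its parameters are assumptions made for this example.

```python
def power_schedule(n: int, k: float = 5.0) -> list[float]:
    """Return n inverse temperatures in [0, 1], denser near 0 when k > 1."""
    if n < 2:
        raise ValueError("need at least two points to span [0, 1]")
    # n linearly spaced points in the interval [0, 1] ...
    linear = [i / (n - 1) for i in range(n)]
    # ... each transformed by raising it to the power k
    return [beta ** k for beta in linear]


print(power_schedule(n=5, k=2))
# prints [0.0, 0.0625, 0.25, 0.5625, 1.0]
```

With a larger exponent such as `k=5`, even more of the points fall into the steep region close to zero.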
"""
.. include:: ../README.md
"""
import argparse
import logging
import re

from rich.containers import Lines
from rich.logging import RichHandler
from rich.text import Text
from rich_argparse import RichHelpFormatter

from lyscripts import app, data, evaluate, plot, predict, sample, temp_schedule
from lyscripts._version import version
from lyscripts.utils import report

__version__ = version
__description__ = "Package containing scripts used in lynference pipelines"
__author__ = "Roman Ludwig"
__email__ = "roman.ludwig@usz.ch"
__uri__ = "https://github.com/rmnldwg/lyscripts"

# nopycln: file


logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())


class RichDefaultHelpFormatter(
    RichHelpFormatter,
    argparse.ArgumentDefaultsHelpFormatter,
):
    """
    Empty class that combines the functionality of displaying the default value with
    the beauty of the `rich` formatter
    """
    def _rich_fill_text(self, text: Text, width: int, indent: Text) -> Text:
        text_cls = type(text)
        if text[0] == text_cls("\n"):
            text = text[1:]

        paragraphs = text.split(separator="\n\n")
        text_lines = Lines()
        for par in paragraphs:
            no_newline_par = text_cls(" ").join(line for line in par.split())
            wrapped_par = no_newline_par.wrap(self.console, width)

            for line in wrapped_par:
                text_lines.append(line)

            text_lines.append(text_cls("\n"))

        return text_cls("\n").join(indent + line for line in text_lines) + "\n\n"


RichDefaultHelpFormatter.styles["argparse.syntax"] = "red"
RichDefaultHelpFormatter.styles["argparse.formula"] = "green"
RichDefaultHelpFormatter.highlights.append(
    r"\$(?P<formula>[^$]*)\$"
)
RichDefaultHelpFormatter.styles["argparse.bold"] = "bold"
RichDefaultHelpFormatter.highlights.append(
    r"\*(?P<bold>[^*]*)\*"
)
RichDefaultHelpFormatter.styles["argparse.italic"] = "italic"
RichDefaultHelpFormatter.highlights.append(
    r"_(?P<italic>[^_]*)_"
)


def exit_cli(args: argparse.Namespace):
    """Exit the cmd line tool"""
    if args.version:
        report.print("lyscripts ", __version__)
    else:
        report.print("No command chosen. Exiting...")


def main():
    """
    Utility for performing common tasks w.r.t. the inference and prediction tasks one
    can use the `lymph` package for.
    """
    parser = argparse.ArgumentParser(
        prog="lyscripts",
        description=re.sub(r"\s+", " ", main.__doc__)[1:],
        formatter_class=RichDefaultHelpFormatter,
    )
    parser.set_defaults(run_main=exit_cli)
    parser.add_argument(
        "-v", "--version", action="store_true",
        help="Display the version of lyscripts"
    )
    parser.add_argument(
        "--log-level", default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )

    subparsers = parser.add_subparsers()

    # the individual scripts add `ArgumentParser` instances and their arguments to
    # this `subparsers` object
    app._add_parser(subparsers, help_formatter=parser.formatter_class)
    data._add_parser(subparsers, help_formatter=parser.formatter_class)
    evaluate._add_parser(subparsers, help_formatter=parser.formatter_class)
    plot._add_parser(subparsers, help_formatter=parser.formatter_class)
    predict._add_parser(subparsers, help_formatter=parser.formatter_class)
    sample._add_parser(subparsers, help_formatter=parser.formatter_class)
    temp_schedule._add_parser(subparsers, help_formatter=parser.formatter_class)

    args = parser.parse_args()

    handler = RichHandler(
        console=report,
        show_time=False,
        markup=True,
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(args.log_level)

    args.run_main(args)
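To see which parts of a help text the three highlight patterns registered above would pick up, they can be tried with the standard `re` module. This is only a sketch: the patterns are copied from the module source, while the helper `find_spans` and the sample sentence are made up for illustration, and how `rich_argparse` applies the patterns internally is not shown here.

```python
import re

# Patterns copied verbatim from the module source; the named group
# decides which style ("argparse.formula", ...) would be applied.
PATTERNS = {
    "formula": r"\$(?P<formula>[^$]*)\$",
    "bold": r"\*(?P<bold>[^*]*)\*",
    "italic": r"_(?P<italic>[^_]*)_",
}


def find_spans(text: str) -> dict[str, list[str]]:
    """Collect the text captured by each named group (hypothetical helper)."""
    return {
        name: [match.group(name) for match in re.finditer(pattern, text)]
        for name, pattern in PATTERNS.items()
    }


sample = r"Space $\beta$ densely for the *elective* volume, _not_ uniformly."
print(find_spans(sample))
# prints {'formula': ['\\beta'], 'bold': ['elective'], 'italic': ['not']}
```

Note that each pattern matches from one delimiter to the next, so a help text containing two identifiers with underscores could be highlighted in between them.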
class RichDefaultHelpFormatter(
    RichHelpFormatter,
    argparse.ArgumentDefaultsHelpFormatter,
):
    """
    Empty class that combines the functionality of displaying the default value with
    the beauty of the `rich` formatter
    """
    def _rich_fill_text(self, text: Text, width: int, indent: Text) -> Text:
        text_cls = type(text)
        if text[0] == text_cls("\n"):
            text = text[1:]

        paragraphs = text.split(separator="\n\n")
        text_lines = Lines()
        for par in paragraphs:
            no_newline_par = text_cls(" ").join(line for line in par.split())
            wrapped_par = no_newline_par.wrap(self.console, width)

            for line in wrapped_par:
                text_lines.append(line)

            text_lines.append(text_cls("\n"))

        return text_cls("\n").join(indent + line for line in text_lines) + "\n\n"
Empty class that combines the functionality of displaying the default value with the beauty of the `rich` formatter
Inherited Members
- rich_argparse.RichHelpFormatter
- RichHelpFormatter
- group_name_formatter
- styles
- highlights
- usage_markup
- console
- add_text
- add_renderable
- add_usage
- add_argument
- format_help
- argparse.HelpFormatter
- start_section
- end_section
- add_arguments
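The idea behind `_rich_fill_text` (keep paragraph breaks, collapse all other whitespace, then wrap each paragraph) can be illustrated with the standard library alone. The sketch below deliberately swaps `rich`'s `Text` objects for plain strings and `textwrap`, so it is an analogy rather than the formatter's actual code; the name `fill_paragraphs` is an assumption.

```python
import textwrap


def fill_paragraphs(text: str, width: int, indent: str = "") -> str:
    """Wrap each blank-line-separated paragraph on its own (plain-string sketch)."""
    # Drop leading/trailing newlines, then split on blank lines.
    paragraphs = text.strip("\n").split("\n\n")
    wrapped = []
    for par in paragraphs:
        # Collapse single newlines and runs of spaces inside a paragraph.
        collapsed = " ".join(par.split())
        wrapped.append(
            textwrap.fill(
                collapsed,
                width=width,
                initial_indent=indent,
                subsequent_indent=indent,
            )
        )
    # Re-join the paragraphs with one empty line in between.
    return "\n\n".join(wrapped) + "\n"


docstring = """
First paragraph that was
broken over two lines.

Second paragraph.
"""
print(fill_paragraphs(docstring, width=60, indent="  "))
```

Treating paragraphs separately is what lets multi-paragraph docstrings, like the one of `temp_schedule`, survive in the help page instead of being merged into one block.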
def exit_cli(args: argparse.Namespace):
    """Exit the cmd line tool"""
    if args.version:
        report.print("lyscripts ", __version__)
    else:
        report.print("No command chosen. Exiting...")
Exit the cmd line tool
def main():
    """
    Utility for performing common tasks w.r.t. the inference and prediction tasks one
    can use the `lymph` package for.
    """
    parser = argparse.ArgumentParser(
        prog="lyscripts",
        description=re.sub(r"\s+", " ", main.__doc__)[1:],
        formatter_class=RichDefaultHelpFormatter,
    )
    parser.set_defaults(run_main=exit_cli)
    parser.add_argument(
        "-v", "--version", action="store_true",
        help="Display the version of lyscripts"
    )
    parser.add_argument(
        "--log-level", default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )

    subparsers = parser.add_subparsers()

    # the individual scripts add `ArgumentParser` instances and their arguments to
    # this `subparsers` object
    app._add_parser(subparsers, help_formatter=parser.formatter_class)
    data._add_parser(subparsers, help_formatter=parser.formatter_class)
    evaluate._add_parser(subparsers, help_formatter=parser.formatter_class)
    plot._add_parser(subparsers, help_formatter=parser.formatter_class)
    predict._add_parser(subparsers, help_formatter=parser.formatter_class)
    sample._add_parser(subparsers, help_formatter=parser.formatter_class)
    temp_schedule._add_parser(subparsers, help_formatter=parser.formatter_class)

    args = parser.parse_args()

    handler = RichHandler(
        console=report,
        show_time=False,
        markup=True,
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(args.log_level)

    args.run_main(args)
Utility for performing common tasks w.r.t. the inference and prediction tasks one can use the `lymph` package for.
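`main` relies on a common `argparse` dispatch pattern: the top-level parser sets a fallback `run_main`, and each subcommand is expected to replace it with its own entry point. The bodies of the `_add_parser` functions in the submodules are not shown on this page, so the following is a minimal, self-contained sketch of how such a function might look. The `greet` subcommand, the `demo` program name, and the `run` helper are hypothetical and not part of lyscripts.

```python
import argparse


def exit_cli(args: argparse.Namespace) -> str:
    """Fallback that runs when no subcommand was chosen."""
    return "No command chosen. Exiting..."


def greet(args: argparse.Namespace) -> str:
    """Hypothetical subcommand body, not part of lyscripts."""
    return f"Hello, {args.name}!"


def _add_parser(subparsers, help_formatter) -> None:
    """Register the hypothetical `greet` subcommand and its entry point."""
    parser = subparsers.add_parser(
        "greet",
        description=greet.__doc__,
        help=greet.__doc__,
        formatter_class=help_formatter,
    )
    parser.add_argument("--name", default="world", help="Who to greet")
    # Overrides the top-level default, so `run_main` points at this command.
    parser.set_defaults(run_main=greet)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="demo",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.set_defaults(run_main=exit_cli)
    subparsers = parser.add_subparsers()
    _add_parser(subparsers, help_formatter=parser.formatter_class)
    return parser


def run(argv: list[str]) -> str:
    """Parse `argv` and dispatch to whichever function `run_main` refers to."""
    args = build_parser().parse_args(argv)
    return args.run_main(args)


print(run(["greet", "--name", "lymph"]))  # prints Hello, lymph!
print(run([]))                            # prints No command chosen. Exiting...
```

Passing the parent's `formatter_class` down, as `main` does, keeps the help pages of all subcommands visually consistent with the top-level one.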