lyprox

package documentation

LyProX is a Django app to interactively explore lymphatic progression patterns.

Introduction

This is the documentation for the source code of the LyProX web app. It is intended for developers who want to contribute to the project or use the code for their own purposes.

We try to keep this documentation as self-contained as possible because, in our experience, the Django documentation is not always easy to navigate, and learning by example is often more effective.

The most important modules in this package are listed below. We recommend starting with their documentation for a deep dive into how the web app works and how it may be modified, enhanced, or extended.

- the settings, where all the configuration of the website is defined. Most importantly, it documents all the environment variables that need to be set to run the website.
- the dataexplorer module, which provides the interactive data exploration tool that is the heart of the website.
- the riskpredictor module, which computes the risk of occult disease for an individual diagnosis, as predicted by a specified model.

Beyond the documented code, you will also find the following information in this documentation:

- Introduction
- Maintenance
  - Bash Commands
    - Systemd Commands
    - Environment
  - Django Commands
    - Built In
    - Custom
- Conventions

Maintenance

Beyond understanding and modifying the source code of the app, there is also general maintenance work to be done. Right now, the web app runs on an Azure virtual machine.

On this machine, the repository is cloned into the /srv/www/lyprox.org directory. A systemd service runs the gunicorn server, which in turn acts as the interface between the Django app and the nginx web server that handles all incoming requests.

We have chosen systemd over, e.g., Docker because in this simple setup, where we don't need a separate database server or other services, it is easier to set up and maintain. Note, however, that this limits deployment to machines that actually have systemd installed.
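
To make the chain of systemd, gunicorn, and nginx described above more concrete, here is a minimal sketch of what a gunicorn configuration file could look like. It is not the project's actual configuration: the socket path and the worker count are assumptions chosen for illustration, and it assumes that the `wsgi` module exposes the usual Django callable named `application`.

```python
# gunicorn.conf.py (hypothetical sketch, not the project's actual config)
import multiprocessing

# The WSGI callable that gunicorn should serve. Assumes that the ``wsgi``
# module of the ``lyprox`` package exposes Django's usual ``application``.
wsgi_app = "lyprox.wsgi:application"

# Listen on a Unix socket that nginx forwards requests to.
# The socket path is an assumption made for this example.
bind = "unix:/run/lyprox.org.sock"

# A common rule of thumb for the number of worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# "-" sends the access log to stdout and the error log to stderr, so that
# the messages end up in the systemd journal of the service.
accesslog = "-"
errorlog = "-"
```

With a file like this, the systemd unit would only need to start gunicorn with the `--config` option pointing at it, and nginx would be configured to proxy requests to the same socket.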

Bash Commands

Here, we give a list of useful commands to see what is going on and perform certain tasks on the server.

Systemd Commands

First, some commands related to the systemd services:

sudo systemctl status lyprox.org.service

Check the status of the gunicorn server. Ideally, it should show - in addition to some general info - the green text active (running).
sudo systemctl start lyprox.org.service

Launch the gunicorn server. One can also use stop or restart to perform the corresponding action.
sudo journalctl -u lyprox.org.service -f

Show a continuous stream of the latest log messages emitted by the app. This includes the messages produced by the logging statements in the app's source code.

Similarly, one can inspect the status and logs of the nginx.service that runs the interface between the gunicorn server and the outside world with the corresponding commands. Simply replace lyprox.org.service with nginx.service in the above commands.
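
Since the commands above differ only in the unit name, they are easy to script. The following is a small sketch of a helper that assembles them for an arbitrary unit; the function names `systemd_commands` and `is_active` are made up for this example and are not part of the LyProX code base.

```python
import subprocess


def systemd_commands(unit: str) -> dict[str, list[str]]:
    """Assemble the maintenance commands from this section for ``unit``."""
    return {
        "status": ["sudo", "systemctl", "status", unit],
        "start": ["sudo", "systemctl", "start", unit],
        "logs": ["sudo", "journalctl", "-u", unit, "-f"],
    }


def is_active(unit: str) -> bool:
    """Return whether ``unit`` is currently active.

    ``systemctl is-active --quiet`` prints nothing and exits with status 0
    only if the unit is active. This requires a machine with systemd.
    """
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode == 0


# Print the log command for nginx instead of the gunicorn service.
print(" ".join(systemd_commands("nginx.service")["logs"]))
```

On the server, one could then call `is_active("lyprox.org.service")` from a monitoring script instead of reading the colored output of `systemctl status` by eye.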

Environment

Next, some commands related to the virtual Python environment and the environment variables that Django uses for certain settings and secrets:

uv sync

We use uv to manage virtual environments. This command synchronizes the virtual environment with the dependencies declared in the project's pyproject.toml file and pinned in the uv.lock lockfile (uv sync does not read a requirements.txt file). For this to work, one needs to be inside the /srv/www/lyprox.org directory.
set -a; source .env; set +a

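This command loads all the environment variables from the .env file into the current shell. The set -a part marks every variable defined afterwards for export, so that child processes such as the lyprox CLI can see them, and set +a switches this behavior off again. It is strongly recommended to put the secret key as well as some config and passwords into the .env file and not directly in the settings. Of course, never commit the .env file to source control.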
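
On the Python side, settings that come from the environment are typically read via `os.environ`. The sketch below shows one way to do this that fails early with a readable message when a variable is missing. The helper names and the variable names `DJANGO_SECRET_KEY` and `DJANGO_DEBUG` are assumptions for illustration; the settings module documents the variables that LyProX actually uses.

```python
import os


def require_env(name: str) -> str:
    """Return the value of the environment variable ``name`` or fail loudly."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set. Was the .env file loaded?"
        )
    return value


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret the environment variable ``name`` as a boolean switch."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical variable names, set here only so that the example runs.
os.environ["DJANGO_SECRET_KEY"] = "not-a-real-secret"
os.environ["DJANGO_DEBUG"] = "false"

SECRET_KEY = require_env("DJANGO_SECRET_KEY")
DEBUG = env_flag("DJANGO_DEBUG")
```

Raising an error for a missing secret is a deliberate choice: a silently used fallback value is much harder to notice in production than a service that refuses to start.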

Django Commands

Built In

Django itself also provides a number of commands that can be run from the command line. We have configured the project so that all of them are available through the lyprox CLI, which is installed by the uv sync command (provided the repo is the current working directory and the virtual environment is activated).
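
For readers wondering how a CLI like lyprox can expose Django's management commands, here is a minimal sketch of such an entry point. It is an illustration of the general technique and not necessarily how the `__main__` module of LyProX is implemented; the function names `configure` and `main` are made up, and the settings path `lyprox.settings` is inferred from the module list of this package.

```python
import os
import sys
from typing import Optional


def configure(settings_module: str = "lyprox.settings") -> None:
    """Tell Django which settings to load, unless this is already defined."""
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


def main(argv: Optional[list[str]] = None) -> None:
    """Hand the command line over to Django's management utility."""
    configure()
    # Imported lazily, so that importing this module does not require Django.
    from django.core.management import execute_from_command_line

    execute_from_command_line(argv if argv is not None else sys.argv)
```

If a function like `main` is registered as a console script in pyproject.toml, then running `lyprox migrate` behaves like running `python manage.py migrate` in a conventional Django project.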

lyprox runserver

This starts a Python webserver that is only accessible from the local machine. This should NOT be used in production, but is useful for development, mostly on local machines.
lyprox collectstatic

This command collects all static files from the various apps and puts them into the directory that is exposed to the internet by the nginx server. This is necessary whenever static files are changed or added.
lyprox migrate

This command applies all migrations to the database. In a typical Django app, the database models change from time to time, and this command ensures that such changes are reflected in the database itself without losing any data. However, we do not really care about losing data, since nothing is stored exclusively in the SQLite3 database. So, this command is mainly used to initialize the database.
lyprox shell

This command starts a Python shell with the Django environment loaded. This is useful for testing code snippets or inspecting the database.

Custom

All of the above commands are provided by Django itself and are well documented in the Django docs. In addition, we have implemented a few commands ourselves. They are all about populating the database:

lyprox add_institutions --from-file initial/institutions.json

The add_institutions command creates all the institutions that are defined in the initial/institutions.json file. Institutions must be initialized first, because both the users and the datasets must belong to an institution.
lyprox add_users --from-file initial/users.json

The add_users command allows creating all the users that are defined in the initial/users.json file. Users must be initialized after the institutions, because they must belong to an institution. Note that the passwords are not stored in the JSON file, but in environment variables (i.e., in the .env file). The name of the environment variable is DJANGO_<EMAIL>_PASSWORD, where <EMAIL> is the part of the email address before the @ symbol, with all dots removed.
lyprox add_datasets --from-file initial/datasets.json

The add_datasets command fetches and loads CSV tables of patient records from the lyDATA repo. The loaded pandas dataframes are cached using joblib, and thus the patient data never reaches the SQLite3 database.
lyprox add_riskmodels --from-file initial/riskmodels.json

Lastly, the add_riskmodels command loads a model definition from the config files that the initial/riskmodels.json file points to. Then, it fetches the MCMC samples for that model and precomputes posterior state distributions for a subset of the samples. This is then used in the riskpredictor app to compute the personalized risk on demand. Again, joblib is used to cache and speed up the results.
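
The naming rule for the password variables described for the add_users command above can be sketched in a few lines. This is an illustration of the rule as stated in the text, not the project's implementation: the function names are hypothetical, and converting the name to upper case is an assumption that the text does not confirm.

```python
import os


def password_env_var(email: str) -> str:
    """Derive the name of the environment variable holding a user's password.

    The part of ``email`` before the ``@`` symbol is taken and all dots are
    removed, as described for the ``add_users`` command.
    """
    local_part = email.split("@", 1)[0]
    # Upper-casing is an assumption made for this example.
    key = local_part.replace(".", "").upper()
    return f"DJANGO_{key}_PASSWORD"


def lookup_password(email: str) -> str:
    """Fetch the password for ``email`` from the environment."""
    name = password_env_var(email)
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"Please define {name} in the .env file.") from None


# The address below is a made-up example.
print(password_env_var("jane.doe@example.org"))
```

Keeping the passwords out of initial/users.json means that this file can be committed to source control, while the secrets stay in the untracked .env file.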

Conventions

To keep the code and documentation clean and readable, we follow a few conventions. Any contributions should adhere to these style conventions.

Code Style

We use ruff to format the code. The selected and ignored rules are specified in the pyproject.toml file at the root of the repository.

Also, we are big fans of type hints in function arguments and class attributes and use them wherever possible. While not enforced, as there may be cases where type hints are not feasible or useful, we generally strive to use them extensively.

Docstrings

Docstrings should follow the format below. Note that we don't like the styles where the docstring lists all function arguments and their types again. We think this is redundant as long as

- the argument names are meaningful
- all arguments are type hinted
- the use and effect of each argument is described in the body of the docstring

Therefore, we write docstrings that produce documentation similar in structure to the main Python docs (note the linked example of getattr()).

def my_function(arg1: int, arg2: str) -> float:
    """Briefly describe what the function does.

    Then go on and describe it in detail. Make sure to mention what ``arg1`` does
    and also what the effect of ``arg2`` is.

    You can also link to other symbols using single backticks. E.g., if the main
    Python docs are linked, then you could mention the `print` function and then
    pydoctor would turn that into a link to this built-in function's docs.

    We also really like doctests. First, they are tested. Second, they directly
    showcase some examples:

    >>> my_function(5, "hey")
    hey
    8.14
    """
    print(arg2)
    return 3.14 + arg1

When using docstrings as above, the documentation can be auto-generated from the code using pydoctor. Its output is simple, clean, and comprehensive. It lists all modules, classes, and functions in the same hierarchy as they appear in the codebase. If the docstrings are well written, they can effectively guide the reader through the codebase. Cross-references can be added using single backticks. For example, when it encounters `my_symbol`, pydoctor searches the entire codebase (and even other linked docs) for this symbol and adds a link to it if found.
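
The statement above that doctests "are tested" can be verified with the doctest module from the standard library. The following minimal sketch reuses the my_function example and runs its doctest programmatically. The helper name `run_doctests` is made up for this illustration, and the project may well run its doctests through a different tool.

```python
import doctest


def my_function(arg1: int, arg2: str) -> float:
    """Print ``arg2`` and return ``arg1`` shifted by 3.14.

    >>> my_function(5, "hey")
    hey
    8.14
    """
    print(arg2)
    return 3.14 + arg1


def run_doctests(obj) -> doctest.TestResults:
    """Run all doctests found in the docstring of ``obj``."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed, attempted = 0, 0
    # The examples need to be able to look up ``obj`` by its name.
    globs = {obj.__name__: obj}
    for test in finder.find(obj, name=obj.__name__, module=False, globs=globs):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return doctest.TestResults(failed=failed, attempted=attempted)


results = run_doctests(my_function)
print(f"{results.attempted} example(s) run, {results.failed} failed")
```

Both the printed line and the return value in the docstring are compared against the actual output, which is why the example lists hey before 8.14.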

Pre-Commit Hooks and Conventional Commits

We use pre-commit to run ruff and other checks before every commit. This ensures that the code adheres to some basic standards and that all commit messages are so-called conventional commits. To enable this, you need to install pre-commit (e.g., by running uv add pre-commit or, better yet, by installing it via pipx) and then install the following two hooks:

pre-commit install
pre-commit install --hook-type=commit-msg
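
For those unfamiliar with conventional commits, the header of such a commit message has the form `type(optional scope): description`, for example `fix(dataexplorer): handle empty selection`. The sketch below parses this header with a simplified regular expression. It only approximates the format for illustration and is not the implementation of the commit-msg hook; the function name `parse_commit_header` is made up.

```python
import re
from typing import Optional

# Simplified pattern for the first line of a conventional commit message:
# a lowercase type, an optional scope in parentheses, an optional "!" that
# marks a breaking change, then a colon, a space, and the description.
HEADER = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_commit_header(message: str) -> Optional[dict]:
    """Split the first line of ``message`` into its parts, if it conforms."""
    first_line = message.splitlines()[0] if message else ""
    match = HEADER.match(first_line)
    return match.groupdict() if match else None


for msg in ["fix(dataexplorer): handle empty selection", "Updated some stuff"]:
    verdict = "ok" if parse_commit_header(msg) else "rejected"
    print(f"{verdict}: {msg}")
```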

Package	`accounts`	No package docstring; 3/7 modules, 0/2 package documented
Module	`context_processors`	Context processors for the LyProX app.
Package	`dataexplorer`	The app that makes the data interactively explorable by the user.
Module	`manage`	Django's command-line utility for administrative tasks.
Package	`riskpredictor`	Risk predictions for involvement patterns, given personalized diagnoses.
Module	`settings`	Main configurations. Explanations of all options can be found in the Django docs.
Package	`templatetags`	No package docstring; 1/1 module documented
Module	`urls`	LyProX' URL configuration.
Module	`utils`	Utility functions.
Module	`views`	Define the home view and a maintenance view.
Module	`wsgi`	The WSGI configuration for the LyProX project. Don't touch it.
Module	`__main__`	Undocumented
Module	`_version`	Undocumented