module documentation

Module for translating the user input into a database query. It retrieves the information of interest and returns it in a format that can then be put into the response of the server.

The main interaction point is the run_query function. It takes an initial QuerySet of patients.models.Patient objects and filters it down according to the cleaned data from a forms.DashboardForm.

Class ModalityCombinor: Utility class that defines and helps to select the various methods for combining diagnoses from different modalities.
Function collect_info: Collect the patient, tumor, and diagnosis information in one dictionary.
Function combine_diagnoses: Combine the potentially conflicting diagnoses for each patient and each side according to the chosen combination method.
Function compute_statistics: Use the collected information as returned by collect_info and generate statistics for them.
Function diagnose_specific: Filter the diagnoses based on selected involvement patterns.
Function does_patient_match: Compare the diagnosis of a patient with the involvement pattern to filter for.
Function extract_filter_pattern: Sort the kwargs from the request.
Function patient_specific: Filter QuerySet of patients based on patient-specific properties.
Function run_query: Run a database query using the cleaned form data from the forms.DashboardForm.
Function sort_diagnoses_by_patient: Use a QuerySet of patients.models.Diagnose and sort its entries into a nested dictionary.
Function sort_patients_by_id: Collect patient information by patient ID in a dictionary.
Function sort_tumors_by_patient: Collect tumor information by patient ID in a dictionary.
Function subsite2arr: Map different subsites to a one-hot array of subsite groups.
Function tf2arr: Map True, False & None to one-hot arrays of length 3.
Function tumor_specific: Filter QuerySet of tumors based on tumor-specific properties.
Variable logger: Undocumented
def collect_info(patients, tumors, diagnoses):

Collect the patient, tumor, and diagnosis information in one dictionary.

Parameters
patients (Dict[int, Any]): Patients sorted by ID. Should be the output of sort_patients_by_id.
tumors (Dict[int, Any]): Tumors by patient ID, as returned by sort_tumors_by_patient.
diagnoses (Dict[int, Any]): Combined involvement by patient ID. Output of sort_diagnoses_by_patient.
Returns
Dict[int, Any]: A dictionary with patient IDs as keys and the combined information underneath.
def combine_diagnoses(method, diagnoses):

Combine the potentially conflicting diagnoses for each patient and each side according to the chosen combination method.

Parameters
method (Callable): The function used to combine the diagnoses. It should take a single tuple of values, where each value represents the involvement of the same LNL as reported by a different modality. The order of the values corresponds to the order of the modalities in patients.models.Diagnose.Modalities.
diagnoses (Dict[int, Dict[str, np.ndarray]]): This should be the output of the sort_diagnoses_by_patient function.
Returns
Dict[int, Dict[str, Dict[str, Optional[bool]]]]: A dictionary where the combined involvement per LNL is stored under the corresponding patient ID, side (ipsi or contra) and the respective LNL's name (e.g. IIa).
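The calling convention described for combine_diagnoses can be illustrated with a small sketch. Everything named here is an assumption made for the example: logical_or is a hypothetical combination method (it is not claimed to match any method offered by ModalityCombinor), combine_sketch and its lnl_names argument are invented for illustration, and nested lists stand in for the np.ndarray matrices (rows are modalities, columns are LNLs).

```python
from typing import Dict, List, Optional, Tuple


def logical_or(values: Tuple[Optional[bool], ...]) -> Optional[bool]:
    """Hypothetical method: involved if any modality that reported says so."""
    known = [v for v in values if v is not None]
    if not known:
        return None  # no modality reported anything for this LNL
    return any(known)


def combine_sketch(method, diagnoses, lnl_names: List[str]) -> Dict:
    """Apply `method` to each LNL column of each patient's side matrix."""
    combined = {}
    for patient_id, sides in diagnoses.items():
        combined[patient_id] = {}
        for side, matrix in sides.items():
            # zip(*matrix) turns rows (modalities) into columns (LNLs).
            columns = zip(*matrix)
            combined[patient_id][side] = {
                lnl: method(tuple(column))
                for lnl, column in zip(lnl_names, columns)
            }
    return combined


# Two modalities (rows) reporting on two LNLs (columns) for one patient.
diagnoses = {
    1: {
        "ipsi": [[True, None], [False, None]],
        "contra": [[None, False], [None, False]],
    }
}
result = combine_sketch(logical_or, diagnoses, lnl_names=["I", "II"])
```

Passing the method as a plain callable keeps the looping over patients and sides independent of how conflicts between modalities are resolved.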
def compute_statistics(patients):

Use the collected information as returned by collect_info and generate statistics for them.

Many of these statistics come in the form of a list of length 3. They indicate for how many patients the corresponding field was True (index 1), False (index -1, meaning the last entry), or None (index 0).

So, for example, "hpv_status": [23, 82, 72] means that 82 patients are HPV positive, 72 HPV negative and for 23 we do not have any information.

In the beginning, this kind of encoding seemed smart, because I could give semantic meaning to an index (positive being +1, negative -1). But in Django's HTML templates that doesn't work and I have to use 0, 1, and 2 there...

Parameters
patients (Dict[int, Any]): Undocumented
Returns
Dict[str, Any]: Undocumented
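The three-element encoding described for compute_statistics can be illustrated with a short counting sketch. The helper count_tristate is hypothetical and is not part of the module; the numbers reproduce the "hpv_status" example from the description.

```python
def count_tristate(values):
    """Count None, True and False at list indices 0, 1 and -1."""
    index_of = {None: 0, True: 1, False: -1}
    counts = [0, 0, 0]
    for value in values:
        counts[index_of[value]] += 1
    return counts


# 82 HPV positive, 72 HPV negative, 23 patients without information.
statuses = [True] * 82 + [False] * 72 + [None] * 23
hpv_counts = count_tristate(statuses)

# Index -1 and index 2 address the same entry of a length-3 list,
# which is why 0, 1 and 2 can be used where negative indices are unavailable.
unknown, positive, negative = hpv_counts[0], hpv_counts[1], hpv_counts[2]
```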
def diagnose_specific(diagnoses, modalities, modality_combine='maxLLH', n_status=None, **kwargs):

Filter the diagnoses based on selected involvement patterns.

In contrast to the patient_specific and tumor_specific functions, this one is more complicated: It needs to combine multiple modalities' information before checking if a patient's involvement matches the selected pattern.

Parameters
diagnoses (Optional[QuerySet]): QuerySet of diagnoses. If None, all Diagnose entries in the database are used.
modalities (Optional[List[int]]): List of diagnostic modalities to consider in the filtering.
modality_combine (str): Name of the method to use for combining multiple, possibly conflicting diagnoses from different modalities.
n_status (Optional[bool]): If True, only N+ patients are kept. If False, only N0 patients are kept. None ignores the N-status.
**kwargs: Undocumented
Returns
Dict[int, Dict[str, Dict[str, Optional[bool]]]]: A nested dictionary with patient IDs as top-level keys. Under each patient, there are the keys "ipsi" and "contra". Under each of those, a dictionary stores the combined involvement per LNL (True for metastatic, False for healthy, or None for unknown).
def does_patient_match(patient_diagnose, filter_pattern, n_status):

Compare the diagnosis of a patient with the involvement pattern to filter for.

A 'match' occurs when, for both sides (ipsi & contra) and all LNLs, the patient's diagnosis and the filter_pattern are the same or the latter is undefined (None).

Parameters
patient_diagnose (Dict[str, Dict[str, Optional[bool]]]): The diagnosis of a single patient, as stored under the patient's ID in the dictionary returned by combine_diagnoses.
filter_pattern (Dict[str, Dict[str, Optional[bool]]]): The involvement pattern that was selected in the dashboard by the user, collected in the same format as the patient_diagnose.
n_status (Optional[bool]): If True, only N+ patients are kept. If False, only N0 patients are kept. None ignores the N-status.
Returns
bool: Whether patient_diagnose and filter_pattern match or not.
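The matching rule stated for does_patient_match can be sketched as follows. The function matches_pattern is a hypothetical stand-in that covers only the pattern comparison; the n_status check is left out because the documentation does not say how N-status is derived from a diagnosis.

```python
def matches_pattern(patient_diagnose, filter_pattern):
    """Return True if every defined pattern entry equals the patient's value."""
    for side in ("ipsi", "contra"):
        for lnl, wanted in filter_pattern.get(side, {}).items():
            if wanted is None:
                continue  # undefined pattern entries match anything
            if patient_diagnose.get(side, {}).get(lnl) != wanted:
                return False
    return True


patient = {
    "ipsi": {"IIa": True, "III": False},
    "contra": {"IIa": None, "III": False},
}
loose_pattern = {
    "ipsi": {"IIa": True, "III": None},
    "contra": {"IIa": None, "III": None},
}
strict_pattern = {
    "ipsi": {"IIa": True, "III": None},
    "contra": {"IIa": True, "III": None},
}

loose_match = matches_pattern(patient, loose_pattern)
strict_match = matches_pattern(patient, strict_pattern)
```

Note that under this reading an unknown (None) involvement on the patient's side does not satisfy a pattern entry that asks for True or False.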
def extract_filter_pattern(kwargs):

Sort the kwargs from the request.

The filter patterns are sent in the request (e.g. as {ipsi_IIa: True}). This method sorts them into a dictionary by side (ipsi or contra) and by LNL.

Parameters
kwargs (Dict[str, Optional[bool]]): Undocumented
Returns
Dict[str, Dict[str, Optional[bool]]]: Undocumented
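A minimal sketch of the sorting described for extract_filter_pattern is shown below. It assumes that request keys have the form "<side>_<LNL>", as in the ipsi_IIa example, and that keys with any other prefix are ignored; the name extract_sketch is hypothetical.

```python
def extract_sketch(kwargs):
    """Sort keys such as 'ipsi_IIa' into a dictionary by side and LNL."""
    pattern = {"ipsi": {}, "contra": {}}
    for key, value in kwargs.items():
        # Split only at the first underscore: 'ipsi_IIa' -> ('ipsi', 'IIa').
        side, separator, lnl = key.partition("_")
        if separator and side in pattern:
            pattern[side][lnl] = value
    return pattern


request_kwargs = {"ipsi_IIa": True, "contra_III": None, "t_stage__in": [1, 2]}
filter_pattern = extract_sketch(request_kwargs)
```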
def patient_specific(patients=None, nicotine_abuse=None, hpv_status=None, neck_dissection=None, dataset__in=None, **rest):

Filter QuerySet of patients based on patient-specific properties.

This function is designed in such a way that one can simply add another argument to its definition, without changing the logic inside it, and it will then be able to filter for that added argument (given that it is also a field in the model patients.models.Patient).

Parameters
patients (Optional[QuerySet]): The QuerySet of patients to begin with. If None, this is just the entire patient dataset.
nicotine_abuse (Optional[bool]): Filter smokers or non-smokers?
hpv_status (Optional[bool]): Select based on HPV status.
neck_dissection (Optional[bool]): Filter those that did or didn't undergo neck dissection.
dataset__in (Optional[Institution]): Select based on the dataset that describes the respective patient.
**rest: Undocumented
Returns
QuerySet: The filtered QuerySet.
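One way such an "add an argument and it just works" design could be achieved is to derive the filter lookups from the function's own named arguments. The sketch below is an assumption about the mechanism and not the module's actual code; build_filter_lookups is a hypothetical helper that leaves out the patients argument and the database call.

```python
def build_filter_lookups(nicotine_abuse=None, hpv_status=None,
                         neck_dissection=None, dataset__in=None, **rest):
    """Collect every named argument that was actually set."""
    # locals() holds exactly the parameters at this point; the catch-all
    # `rest` is dropped so unrelated form fields are ignored.
    named = dict(locals())
    named.pop("rest")
    return {name: value for name, value in named.items() if value is not None}


lookups = build_filter_lookups(hpv_status=True, dataset__in=["a"], t_stage__in=[1])

# With Django, the lookups could then be applied in a single call:
#     patients = patients.filter(**lookups)
```

Because the argument names double as Django field lookups (for example dataset__in), adding a new keyword argument extends the filter without touching the function body.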
def run_query(patients, cleaned_form_data, do_compute_statistics=True):

Run a database query using the cleaned form data from the forms.DashboardForm.

It first filters all patients in the database by patient-specific characteristics, then all tumors by tumor features. Afterwards, it only keeps those patients that have tumors which were not yet filtered out. It continues to remove patients based on their diagnosed lymph node involvement and the selected involvement patterns.

The filtering parameters are provided by the cleaned data of a forms.DashboardForm. The default initial QuerySet of patients is simply all patients in the database, unless patients is specified. In that case, the provided QuerySet is the starting point of the filtering query.

The computation of statistics can be skipped using the do_compute_statistics parameter (default is True).

Parameters
patients (Optional[QuerySet]): Undocumented
cleaned_form_data (Dict[str, Any]): Undocumented
do_compute_statistics (bool): Undocumented
Returns
Dict[int, Any]: Undocumented
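The order of the filtering steps described for run_query can be sketched with plain Python data in place of QuerySets. All names below (run_query_sketch, the keep_* predicates, and the record fields) are hypothetical and chosen only to make the sequence of steps visible.

```python
def run_query_sketch(patients, tumors, involvement,
                     keep_patient, keep_tumor, keep_involvement):
    """Apply the four filtering steps in the documented order."""
    # Step 1: filter patients by patient-specific characteristics.
    patient_ids = {p["id"] for p in patients if keep_patient(p)}
    # Step 2: filter tumors by tumor features.
    tumor_owner_ids = {t["patient_id"] for t in tumors if keep_tumor(t)}
    # Step 3: keep only patients that still have at least one tumor.
    patient_ids &= tumor_owner_ids
    # Step 4: remove patients whose involvement does not fit the pattern.
    return sorted(
        pid for pid in patient_ids
        if keep_involvement(involvement.get(pid, {}))
    )


patients = [
    {"id": 1, "hpv_status": True},
    {"id": 2, "hpv_status": True},
    {"id": 3, "hpv_status": False},
]
tumors = [
    {"patient_id": 1, "t_stage": 1},
    {"patient_id": 2, "t_stage": 4},
    {"patient_id": 3, "t_stage": 1},
]
involvement = {
    1: {"ipsi": {"IIa": True}},
    2: {"ipsi": {"IIa": True}},
    3: {"ipsi": {"IIa": False}},
}

matching_ids = run_query_sketch(
    patients, tumors, involvement,
    keep_patient=lambda p: p["hpv_status"] is True,
    keep_tumor=lambda t: t["t_stage"] in (1, 2),
    keep_involvement=lambda inv: inv.get("ipsi", {}).get("IIa") is True,
)
```

In this toy data, patient 2 is dropped in step 3 because its only tumor was filtered out, and patient 3 is already dropped in step 1.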
def sort_diagnoses_by_patient(diagnoses):

Use a QuerySet of patients.models.Diagnose and sort its entries into a nested dictionary.

The top level of the dictionary has the patients' IDs as keys. Underneath, it is sorted by side (ipsi & contra). The values of those are then NumPy matrices that are indexed by modality and by LNL. They hold the involvement that was observed by the corresponding modality for the respective LNL.

Parameters
diagnoses (QuerySet): The QuerySet of diagnoses.
Returns
Dict[int, Dict[str, np.ndarray]]: The sorted, nested dictionary.
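The nested layout described for sort_diagnoses_by_patient can be pictured with the sketch below. The modality and LNL names, the record fields, and the function sort_sketch are all hypothetical, and nested lists are used in place of NumPy matrices so that the example has no dependencies.

```python
MODALITIES = ["CT", "MRI", "PET"]  # hypothetical modality order
LNLS = ["I", "IIa", "IIb"]         # hypothetical LNL order


def sort_sketch(records):
    """Sort flat diagnosis records into {patient_id: {side: matrix}}."""
    sorted_diagnoses = {}
    for record in records:
        sides = sorted_diagnoses.setdefault(record["patient_id"], {})
        # One row per modality, one column per LNL, None meaning "unknown".
        matrix = sides.setdefault(
            record["side"], [[None] * len(LNLS) for _ in MODALITIES]
        )
        row = MODALITIES.index(record["modality"])
        matrix[row] = [record["involvement"].get(lnl) for lnl in LNLS]
    return sorted_diagnoses


records = [
    {"patient_id": 7, "side": "ipsi", "modality": "CT",
     "involvement": {"I": False, "IIa": True}},
    {"patient_id": 7, "side": "ipsi", "modality": "PET",
     "involvement": {"IIa": True, "IIb": False}},
]
by_patient = sort_sketch(records)
```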
def sort_patients_by_id(patients):

Collect patient information by patient ID in a dictionary.

Parameters
patients (QuerySet): Undocumented
Returns
Dict[int, Any]: Undocumented
def sort_tumors_by_patient(tumors):

Collect tumor information by patient ID in a dictionary.

Parameters
tumors (QuerySet): Undocumented
Returns
Dict[int, Any]: Undocumented
def subsite2arr(subsite):

Map different subsites to a one-hot array of subsite groups.

For example, a one in the first position means "base of tongue", a one in the second position means "tonsil", and so on.

def tf2arr(value):

Map True, False & None to one-hot arrays of length 3.

This particular mapping comes from the fact that in the form True, None, False are represented by integers 1, 0, -1. So, the one-hot encoding uses an array of length 3 that is one only at these respective indices, where -1 is the last item.

See also the documentation of the compute_statistics function for an explanation of this encoding.
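The mapping described above can be sketched in a few lines. The function tf2arr_sketch is a hypothetical re-implementation based only on this description, and it returns a plain list rather than an array to keep the example dependency-free.

```python
def tf2arr_sketch(value):
    """One-hot encode True, None and False at indices 1, 0 and -1."""
    index = {True: 1, None: 0, False: -1}[value]
    one_hot = [0, 0, 0]
    one_hot[index] = 1  # index -1 addresses the last of the three entries
    return one_hot


encoded_true = tf2arr_sketch(True)
encoded_false = tf2arr_sketch(False)
encoded_none = tf2arr_sketch(None)
```

Summing such one-hot vectors element-wise over all patients yields per-field counts in the same three-element layout.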

def tumor_specific(tumors=None, subsite__in=None, t_stage__in=None, central=None, extension=None, **rest):

Filter QuerySet of tumors based on tumor-specific properties.

It works almost exactly like the patient-specific querying function patient_specific in terms of adding new arguments to filter by.

Parameters
tumors (Optional[QuerySet]): QuerySet of tumors to begin with. If None, this simply includes all tumors in the database.
subsite__in (Optional[List[str]]): Is the tumor's subsite in this list of subsites?
t_stage__in (Optional[List[int]]): Does the tumor's T-stage match one of this list?
central (Optional[bool]): Is the tumor symmetric w.r.t. the mid-sagittal line?
extension (Optional[bool]): Does the tumor extend over the mid-sagittal line?
**rest: Undocumented
Returns
QuerySet: The QuerySet of tumors filtered for the requested characteristics.
logger =

Undocumented