Documentation of individual methods - Part 2


merger
peaks_list
ppm_entry, ppm_list
ppm_library
prob_mat
project
project_db
pseudoatom_entry, pseudoatom_lib
residue_type_handler
residue_ppm_entry, residue_ppm_list
sequential_handler
sequence_info
strip_comparison
strip_residue
manip_thd - compare peaklists



project

(see also: The project concept)
residue_type_handler * res_hd
calculates and provides information about probable residue types for the individual fragments.
sequential_handler * seq_hd
calculates and provides information about n - n+1 connections.
pal_list * current_pl
a hook for peaklists on which some macro commands can be executed. If the list is deleted, pj->current_pl is automatically set to zero.
param_list * speclist
all spectra known to the project
peaks_list * pkllist
all peaklists known to the project
assignment_manager * am
assignment manager in mode SPSCAN_PJ
project_db * db
pal_list * current_pl
spec_io * current_spio
methods:
void project::initialize(void)
void project::read/write(void)
The project is written into a text file, in the order general, speclist, assglist, pkllist. Loading is done in the same order. The individual read functions of speclist, assglist and pkllist test whether the files exist. pkllist->read() checks whether references between peaklists and spectra / assignment groups are valid.
If a resource "project","Project" exists, pj is loaded when the program starts up; otherwise it is loaded only when needed. If pj has been created, it is always saved when the program exits normally.
void project::add(void)
Adds peaklists and spectra to the project. When peaklists are added, a link to an assignment group is made, and the assignment numbers in the list are checked/adapted.
int project::check_peaklist_known(pal_list * pl)
int project::check_spectrum_known(spec_io * sp)
checks whether a file with this full name exists. Returns the index of the file, or -1.
void project::simulate_peaklist(void)
Write a peaklist from a proton list and a library entry that defines pairs/triples of atoms of a particular fragment (or "ALL"). Sort the list according to fragment numbers and remove duplicate entries.
Pseudoatoms are not checked - this must be handled in the library.
The first step is to create a mini proton list with all resonances of known ppm for a particular fragment. Currently the function cannot evaluate n-1 or n+1 fragments.

The library spec_sim.lib contains only very few spectra types that can be simulated.

void project::get_sequential(void) - sequence.cc
Construction of the sequential_handler, and calculation of a matrix of sequential connections.
Currently (1.0.50) only a specific version is implemented, using seq_score_1.
void project::organize_target_sequence(void)
Define the part of the sequence to which the residues are mapped. (The routine supports "branches").
void project::check_valid_matrices(void)
If res_hd and seq_hd do not contain pf_mat, the files that were stored last time in this project are loaded.
spec_io * project::get_spio_id(char * id)
Access to the spectrum with "id" from speclist, or NULL. Non-interactive.

project_db

A database that knows a lot about fragments, resonances, mapping, next and previous residues ...
int base_new_frag
new fragment numbers which are created are >= base_new_frag. Fragment numbers < base_new_frag are the fragments to which the other fragments are mapped.
proton_list * d_prot
all known resonances should be available in d_prot. The ppm value depends on the priority of the peaklists.
sequence_info * q_seq
all known fragments should be available in q_seq.
methods:
sequence_info * project_db::count_residues(bool print=true, bool mapped_too=true)
returns a sequence list that contains in "number" the expected number of occurrences of this residue type (instead of the fragment number). It counts only fragments with flags & SF_FINAL. If print, the result is printed. If !mapped_too, only free target positions are counted.
int project_db::get_highest_fragment_nr(void)
returns the highest fragment number used in the project.
float project_db::get_ppm(int fragment_nr, char * atom_name)
returns the ppm value. Depending on seq_prot_exists, either d_prot or q_seq->pp[]->prot is searched.

void project_db::build_sp_index(void)
build all prot lists in q_seq from entries in d_prot. set seq_prot_exists=true and allow fast access to info in d_prot.
void project_db::clean_sp_index(void)
remove all prot lists in q_seq without destroying the entries in d_prot.
void project_db::check_seq_prot(void) in sequence.cc
Adapts q_seq and d_prot to the peaklists currently known to project. Set SF_FINAL and SF_USED flags in q_seq. d_prot is set to ppm values of default priority. Obsolete fragments and resonances are removed.
Calls the following 3 routines.
void project_db::include_peaklist(pal_list * pl, bool mesg, int * pri)
void project_db::remove_obsolete_fragments(bool mesg)
void project_db::remove_obsolete_protons(bool mesg)
removes fragments from q_seq and protons from d_prot, to which no peaks of the project are assigned.
void project_db::check_assignment_numbers(void)
checks whether the assignment numbers in d_prot are according to standard rules, whether the resonances there are in the library, and whether the assignment numbers are unique.
Prints messages, but does not attempt to fix anything.
int project_db::nr_res_frag(int fr_nr)
returns the number of resonances in d_prot that belong to fragment number fr_nr.
void project_db::find_sidechain_NH2(void) (tools.cc)
Function to identify sidechain NH2 groups after picking a N/HN 2D cosy and a few other spectra.
void project_db::check_consistent_ppm(void)
Goes through the proton list and displays all peaks with identical assignment if some of them have different chemical shifts. The resource "equal_shift_factor" influences the criterion for a different shift: (i) the average ppm value is calculated considering priorities; (ii) if the deviation of a ppm value from the average is larger than equal_shift_factor * expected linewidth Lw_e[d], the atom is displayed.
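The two-step criterion of check_consistent_ppm can be illustrated with a short sketch. This is not the program's code: the record shift_obs and both function names are hypothetical, and using the priority as a linear weight in the average is an assumption, since the text above only says that priorities are "considered".

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record: one observation of a resonance in some peaklist.
struct shift_obs {
    float ppm;       // chemical shift found in the peaklist
    float priority;  // priority of that peaklist dimension (assumed > 0)
};

// Step (i): average weighted by priority (the weighting scheme is an assumption).
float weighted_average(const std::vector<shift_obs>& obs) {
    float sum = 0.0f, wsum = 0.0f;
    for (const shift_obs& o : obs) {
        sum += o.priority * o.ppm;
        wsum += o.priority;
    }
    return wsum > 0.0f ? sum / wsum : 0.0f;
}

// Step (ii): true if any observation deviates from the average by more than
// equal_shift_factor * expected linewidth.
bool shifts_inconsistent(const std::vector<shift_obs>& obs,
                         float equal_shift_factor, float linewidth) {
    const float avg = weighted_average(obs);
    for (const shift_obs& o : obs)
        if (std::fabs(o.ppm - avg) > equal_shift_factor * linewidth)
            return true;
    return false;
}
```

With a larger equal_shift_factor the test becomes more tolerant, so fewer atoms would be reported.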

peaks_list (peaks_entry)

class peaks_list : public list is the class definition for pj->pkllist. It organizes all peaklists that are known to the project. peaks_entry has the following entries and pointers:
char f_name[FILENAME_L]
complete filename of the peaklist
char id[SNLN]
short identifier for the peaklist
char spec[SNLN], int spec_nr
identifier and index (in pj->speclist) of corresponding spectrum
int priority, int c_priority[3]
priority of the list and of its different dimensions (*priorities of corr. spectrum)
int backup_mode
0=no backup; 1=last-->.backup; 2=all-->.001, .002, ...
char * comment
bool pj_ass;
whether assigned to pj->am
pal_list * pl
peaklist in memory, or NULL
int in_use
current number of pointers to pl.
methods of peaks_list:
int nr(char * xid)
Returns index of entry with "id" xid, or INVALID.
int nr_f_name(char * f_name)
Returns index of entry with complete file name f_name, or INVALID.
pal_list * get_peaklist(int n)
Returns pointer to peaklist with index n. The peaklist is first loaded into memory if it is not already there. Increments in_use. Several routines can access the same peaklist. Peaklists obtained in this way have to be returned with ::peaklist_back(pal_list * pl).
void peaklist_back(int n)
Decrement in_use. If in_use<=0 (list no longer used) delete it from memory.
void get_spec_match(int n)
Find the spectrum that corresponds to this peaklist and permutate the peaklist such that it corresponds to the default permutation of the spectrum. Set c_priority.
void peaks_list::open/close(void)
Get all peaklists to memory / put all peaklists back (decrease in_use once). Useful to call before and after routines that might otherwise load an individual peaklist to memory and delete it several times.
void peaks_list::change_ass_number(int from, int to)
Change a particular assignment number in all peaklists of this.
void peaks_list::set_keep_open(int mask)
void peaks_list::unset_keep_open(int mask);
(keep_open|=mask, keep_open&=~mask) If keep_open, unused peaklists are not written to disc and deleted. They stay open until a call to unset_keep_open() sets keep_open==0.
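The interplay of in_use counting and the keep_open mask described in this section can be sketched as follows. The types entry_sketch and peaks_list_sketch are hypothetical stand-ins with a single entry, a bool replaces the pal_list pointer, and releasing unused lists as soon as the last mask bit is cleared is an assumption about the timing.

```cpp
#include <cassert>

// Hypothetical stand-in for one peaks_entry; "loaded" replaces pal_list * pl.
struct entry_sketch {
    bool loaded = false;
    int in_use = 0;  // current number of users of the list
};

struct peaks_list_sketch {
    entry_sketch e;     // a single entry is enough for the illustration
    int keep_open = 0;  // bit mask; non-zero keeps unused lists in memory

    void get_peaklist() {
        if (!e.loaded) e.loaded = true;  // load on first access
        ++e.in_use;
    }
    void peaklist_back() {
        --e.in_use;
        if (e.in_use <= 0 && keep_open == 0) unload();
    }
    void set_keep_open(int mask) { keep_open |= mask; }
    void unset_keep_open(int mask) {
        keep_open &= ~mask;
        // Assumption: unused lists are released once the last bit is cleared.
        if (keep_open == 0 && e.in_use <= 0) unload();
    }
    void unload() { e.loaded = false; e.in_use = 0; }
};
```

Because each caller sets and clears its own bit, two routines can request keep-open behaviour independently without one of them closing the lists under the other.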

sequence_info

sequence_info "q_seq" is part of the project database. If it has been updated, it contains all fragments that are assigned in any peaklist of the project, and all sequential residues from the target sequence. sequence_info_entry contains the following information:
int number
fragment number
char name[ATOMNAME_L]
residue type of fragment
int prev
fragment number of previous residue, INVALID=-1 if unknown.
int next
fragment number of next residue, INVALID=-1 if unknown.
int map
fragment number of corresponding residue in the sequence, or INVALID=-1. This information exists only in an intermediate stage - after the mapping is done, the non-sequential fragment does not exist any longer.
int flags
describe the status of the fragment.
#define SF_FINAL 0x200  // Target of map. (used to determine number of residues)
#define SF_REAL 0x400  // =Position in sequence. REAL is always FINAL.
#define SF_USED 0x800  // there are resonances assigned to this fragment
#define SF_BRANCH_F 0x1000  // two or more FINAL fragments follow
#define SF_BRANCH_P 0x2000  // two or more FINAL fragments before
// 0x01-0x08 reserved for index to sequence for multiple resonances
proton_list * prot
for a faster access to the resonances, prot provides a list of pointers to the entries in pj->db->d_prot for this fragment. prot=NULL unless constructed with project_db::build_sp_index.
Access to the information in sequence_info for a fragment number nr is provided by: (all in sequence.cc)

residue_type_handler

residue_type_handler contains methods to determine the likely residue type from resonances assigned to a fragment.

methods: (all in sequence.cc)

bool mapped_too
Whether those fragments that are already mapped are treated the same as other fragments. Determines size and sums of matrix.
int max_fragment_nr
int * index
index[fragment number] = column in fr_mat
prob_mat * fr_mat
matrix with probabilities
e1 - fragment numbers, e2 - residue types
sequence_info * res_count
rows of fr_mat. The number field contains the total number of residues in the part of the sequence to which the fragments are mapped (depending on mapped_too). res_count->pp[n]->name is the fragment type in row n.
ppm_library * ppm_lib
library of expected ppm values and standard deviation.
methods:
void residue_type_handler::initialize(void)
Creates matrix fr_mat, res_count and index, set fr_mat->id2.
float p_fr_m(int fn, char * res_name)
Provides the probability that fragment number fn is a residue of type res_name from the current state of the matrix.
float p_fr_0(int fn, char * res_name)
Calculates the match between fragment number fn and residue type res_name. The routine uses various information from the project. Different routines may be called depending on the state of the project - $$.

The number of residues of a particular type or the match with other residue types are not considered.

float residue_type_handler::p_fr_AB(int fn, char * res_name)
A version that matches only alpha and beta carbon resonances. Only routine currently implemented (1.0.41).
void norm(void)
Treats the matrix of probabilities in such a way that the sum of probabilities is approximately 1 for each fragment, and equal to the number of residues in the sequence for a particular type.
int residue_type_handler::res_index(char * name)
returns the index for a residue type in fr_mat, or INVALID. Uses res_count.
float residue_type_handler::best_prob(int res_nr)
the highest probability of the residue number res_nr for any of the fragment types.
void residue_type_handler::show_residue(void)
Create strip_residue * srs, get spectra from library file, ask for residue to display.

sequential_handler

methods: parameters and pointers:
bool ignore_current
present settings of prev/foll are ignored - even mapped residues are included in the connection
bool only_passive
if !ignore_current && only_passive, the current state is not changed, but it is possible to connect unconnected fragments with fragments that are already connected. This may be useful to find multiple conformations. (1.0.41) not yet implemented.
float nr_chainstarts, nr_chainends
1 for start+end, further breaks for prolines, if there are no proline fragments that can be linked to the other fragments.
prob_mat * pf_mat
matrix with "probabilities" that i1 and i2 are sequential. The matrix always has the same number of elements in both dimensions. i1=0 and i2=0 are for chain starts and chain ends.
int * index
index[fragment number] = row / column of pf_mat
int max_fragment_nr
size of index
int nr_chains, sum_chains, nr_mapped
seq_chain * chain
sequential areas to which the fragments must be mapped. Within one chain, fragment numbers have to be prev++.
int nr_chains, sum_chains
number of entries in chain, and total number of residues in all entries of chain.
int nr_mapped
number of residues that are already mapped to their final position.
seq_score * seqscore
provides the scoring scheme for n - n+1 connection.
methods:
void sequential_handler::initialize(void)
Build the pf_mat matrix and index. col&row 0 are for chain starts and chain ends. All other positions are for residues with SF_USED.
The size of pf_mat depends on ignore_current. If ignore_current == false, residues for which both prev and next are defined are not considered, except when flags&SF_FINAL is set and prev or next are not SF_USED, i.e. when the setting of prev and next does not really mean that a sequential connectivity was found. If ignore_current == false, residues for which prev or next are known or don't exist get zero entries in their ts1 or ts2 fields, respectively.
Calculates max_fragment_nr, chain, nr_chains, sum_chains, nr_chainstarts, nr_chainends, nr_mapped, and defines index[].
void sequential_handler::build_matrix(void)
feed the matrix pf_mat with likely sequential connections. Calls match.
void sequential_handler::norm_matrix(void)
iteratively applies norm_1 to the matrix until all sums of rows and columns are within 1 +/- MX_DEV (=0.1) or 10 iterations are made.
mmatch_3 sequential_handler::map(int nf, int * fragments)
returns the "probability" and the first residue for the three best matches of the nf fragments with fragment numbers fragments[0], fragments[1] ... fragments[nf-1] to the sequences in chain[]. If nf > chain[i].ln, the chain is not used. If there are fewer than 3 possibilities, the remaining mmatch entries contain 0, INVALID.

float sequential_handler::map_p(int mpos, int nf, int * fragments)
returns the score for mapping nf fragments with fragment numbers fragments[0], fragments[1] ... fragments[nf-1] to fragment numbers mpos, mpos+1 ... mpos+nf-1. Called from map.
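The search performed by map can be sketched as a scan over all start positions in all chains, keeping the three best scores. The struct mmatch and the function best_three are hypothetical; the callback score stands in for map_p(mpos, nf, fragments), whose scoring is not reproduced here, and scores are assumed to be positive.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

const int INVALID = -1;

struct seq_chain { int ln; int fi; };    // length, first fragment number
struct mmatch { float p; int first; };   // hypothetical layout: score, first residue

// score(mpos) stands in for map_p(mpos, nf, fragments).
std::array<mmatch, 3> best_three(const std::vector<seq_chain>& chain, int nf,
                                 const std::function<float(int)>& score) {
    std::array<mmatch, 3> best;
    best.fill(mmatch{0.0f, INVALID});    // default when fewer than 3 matches exist
    for (const seq_chain& c : chain) {
        if (nf > c.ln) continue;         // chain too short: not used
        for (int mpos = c.fi; mpos + nf <= c.fi + c.ln; ++mpos) {
            mmatch m{score(mpos), mpos};
            // Insert into the sorted top three, pushing weaker entries down.
            for (int k = 0; k < 3; ++k)
                if (m.p > best[k].p) std::swap(m, best[k]);
        }
    }
    return best;
}
```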
float sequential_handler::match(int prev, int next)
determines the "probability" of a sequential connection between prev and next. prev and next are fragment numbers, not indices. Calls seqscore->match(prev, next), suppresses self-connections in some way and transforms the linear score as score=exp(score)/pf_mat->e1; score=score/(1+score);.
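The score transform quoted for match can be written as a small function. The name score_to_probability is hypothetical, and the suppression of self-connections is left out because the text does not say how it is done.

```cpp
#include <cassert>
#include <cmath>

// raw: linear score from seqscore->match(prev, next)
// e1:  size of the matrix (pf_mat->e1)
float score_to_probability(float raw, int e1) {
    float score = std::exp(raw) / static_cast<float>(e1);
    return score / (1.0f + score);   // maps (0, inf) to (0, 1)
}
```

Algebraically this equals 1/(1 + e1*exp(-raw)), a logistic function of the raw score, so a larger matrix lowers the "probability" that any single pair receives for the same raw score.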
void sequential_handler::connect(int pr, int nr)
Set the probability that residues pr and nr are sequential to 1. Clear the probability for alternative connections with pr and nr. Map one of the residues if the other is mapped. Check for errors.
void sequential_handler::set_prob(int pr, int nr, float w)
Set the probability that residues pr and nr are sequential to w. Do not (immediately) influence other probabilities.
void sequential_handler::show_spectra(void) - strip_disp.cc
create strip_comparison * stc, get spectra from library file, start display of spectra.

ppm_library, residue_ppm_list, residue_ppm_entry, ppm_list, ppm_entry

ppm_library reads libraries that contain information about the expected range of ppm values for different residue types. 2 different formats are automatically recognized.
The library has a residue_ppm_list * rs_list, which contains for each residue either a ppm_list * atoms, or NULL - if no entry exists for that residue.
The ppm_entry in ppm_list contains the atom name "name", the average value "ppm", and the standard deviation of ppm values (which are assumed to have Gaussian distribution) "ppm_r".

methods:

ppm_library
read_residue
get_atoms
ppm_list
match
index

ppm_list * ppm_library::get_atoms(char * residue)
returns the list of entries for the residue type "residue". If necessary, it is loaded first. Returns NULL if no such residue exists in the file.
ppm_list * ppm_library::read_residue(char *residue)
actually reads the information for one residue from the file into the library. Called from get_atoms.
float ppm_list::match(float p_x, char * an)
float ppm_list::match(float p_x, int idx)
"probability" exp(-2*d^2) where d is the difference between the ppm value p_x and the ppm value of the atom with name an or index idx, divided by the standard deviation of ppm values for that atom.
int ppm_list::index(char * an)
index of a named atom.
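The Gaussian-type match value and the name lookup can be sketched as below. The types ppm_entry_sketch and ppm_list_sketch are hypothetical; returning INVALID from index for an unknown atom and 0 from match in that case are assumptions, since the text does not describe that situation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstring>
#include <vector>

const int INVALID = -1;

struct ppm_entry_sketch {
    const char* name;  // atom name
    float ppm;         // average ppm value
    float ppm_r;       // standard deviation of ppm values
};

struct ppm_list_sketch {
    std::vector<ppm_entry_sketch> pp;

    int index(const char* an) const {
        for (std::size_t i = 0; i < pp.size(); ++i)
            if (std::strcmp(pp[i].name, an) == 0) return static_cast<int>(i);
        return INVALID;  // assumption: value returned for an unknown atom
    }
    // exp(-2*d^2), d = deviation in units of the standard deviation.
    float match(float p_x, int idx) const {
        float d = (p_x - pp[idx].ppm) / pp[idx].ppm_r;
        return std::exp(-2.0f * d * d);
    }
    float match(float p_x, const char* an) const {
        int idx = index(an);
        return idx == INVALID ? 0.0f : match(p_x, idx);  // assumption
    }
};
```

A value exactly at the library average gives 1, and a value one standard deviation away gives exp(-2), about 0.135.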

merger, merge_table_handler

merger makes one fragment out of two. It can also be used to change the fragment type ($$$ not yet implemented). The two fragments are not treated symmetrically: fragment 2 is merged into fragment 1, so that fragment 1 contains all the resonances.
merger can only be used if a valid project is loaded.
merge_table_handler * thd
is used to change the mapping between the two fragments interactively
int f1, f2
fragment numbers of the two fragments
sequence_info_entry *si1, *si2
pointers to the two fragments
proton_list * prot1, prot2
lists of known protons of the two fragments, pointers to si?->prot
atom_list * at1
list of all possible protons of fragment 1
methods:
void merger::do_merge(void)
merges fragment 2 into fragment 1 with the following consequences:
  • a temp. list umr of fragment numbers to unmap is created, including all prot1 resonances with prot1st[]==M_unass1 and all prot2 resonances with prot2st[]==M_unass2.
  • a temp. list umm of pairs of fragment numbers to map is created, including all prot2 resonances with prot2st[]==M_map, and their partners prot2r.
  • all prot2 resonances with prot2st[]==M_new and their partners prot2r are added to umm, while the new proton entries are created. If one of the new resonance numbers is in umr, it is removed from umr and the ppm data of the entry in d_prot is replaced with the data in prot2.
  • all assignments in pj->pkllist to the resonances in umr are set to 0.
  • all assignments in pj->pkllist to the resonances in umm are changed.
  • all fragments with the fragment numbers in umr are removed from d_prot and prot1.
  • fragment f2 is removed from q_seq
void merger::map_p(int from, int to)
map prot2->pp[from] to prot1->pp[to]
void merger::map_a(int from, int to)
map prot2->pp[from] to at1->pp[to]
void merger::unmap_1(int p1)
map prot1->pp[p1] to 0
void merger::unmap_2(int p2)
map prot2->pp[p2] to 0
void merger::set_thd_arrays(void)
fills thd->p1, p2 and st using the data in prot1st, prot1r, at1st, at1r, prot2st, prot2r
bool merger::start_merge(int f1x, int f2x)
Entry point for a merge. Tries to make an automatic merge, and depending on interactive_level it calls thd to resolve remaining problems. If interactive_level==0 the table always shows what is to be done.
Returns false if the merge was aborted.
void merge_table_handler::restart(void)
print buffer using p1, p2, st and set the table.

strip_comparison

class strip_comparison : public main_window is a tool to display spectra that define a sequential connection between strips. strip_comparison * pj->seq_hd->stc is started with sequential_handler::show_spectra

class stc_commands : public table_handler(itb) is the handler behind the command window that appears with strip_comparison. parameters and pointers:

#define MAX_CON 3
max. number of residues for each connection
#define MAX_SP 5
max. number of spectra per residue
spec_window * sw[2][MAX_CON][MAX_SP]
spectral windows
int res[2][MAX_CON]
residue numbers to display
float cow[MAX_CON]
"probabilities" for connection to the residues
char sp[2][MAX_SP][SNLN]
id names of spectra to display
int fwd
fwd=1: show connection of "n-1" residue to several possible "n" residues. fwd=0: connect residue "n" with possible "n-1"
methods:
void strip_comparison::show_connection(int resc)
Find the best matching residues for resc and start the display.
void strip_comparison::draw(void)
organize the geometry of the display.
void strip_comparison::build_sw(void)
set up of spectral windows, using id names in sp[o][s].

strip_residue

public methods:

class strip_residue : public main_window is a tool to compare spectral strips for one fragment and resolve ambiguities interactively. strip_residue * pj->seq_hd->str is started with residue_type_handler::show_residue. Most of the display functions are analogous to strip_comparison described above.

class srs_commands : public table_handler(rtb) is the handler behind the command window that appears with strip_residue. parameters and pointers:

#define MAX_SPR 10
max. number of spectra per residue
spec_window * sw[MAX_SPR]
pal_list * pl[MAX_SPR]
spectral windows and peaklists to display there.
int res
residue numbers to display
int types[MAX_TYP]; int nr_typ;
indices of the possible fragment types sorted by probability, number of fragment types. The i-th entry in the table corresponds to pj->res_hd->res_count->pp[xxx->types[i]]

methods:

void strip_residue::get_atoms(void)
set atoms[] and nr_atom to the atoms of res.
void strip_residue::change_prob(int ti, float prob= -1)
If prob is outside 0..1, prob is requested interactively. Sets the probability that residue res is of type res_hd->res_count->pp[ti]->name to prob. If prob==1, the probability of all other fragment types is set to 0.

manip_thd : public table_handler

Comparison of two peaklists: Find peaks that are only in one of the lists, or merge the lists such that corresponding peaks are not duplicated. This table handler is invoked with "project"/"peaklist manipulations".

parameters and pointers:

pal_list *pl1, *pl2
the two peaklists.
pal_pointer *plx
com_pl_mode mode
enum com_pl_mode { CMP_2D, CMP_strip, CMP_3D }
float dst[3]
max. distance in ppm for comparison: peaks with a larger distance are considered different.
All methods are called from handle(int col, int row). They are described under spscan project database: peaklist manipulations:
display_status(void)
merge_peaklists(void)
difference_peaklists(void)
take_assignments(void)
diff_plx(void)
include all peaks from peaklist 2 in plx, which have no corresponding peak in peaklist 1.
get_dst(void)
permutate(pal_list *&pl, pal_list *plr=NULL)
adapt pl to plr, or check the order of dimensions interactively. (called when loading the peaklists)
remove_dup1(void)
check_ass1(void)

seq_chain

seq_chain is a sequence of residues to which fragments can be mapped. The fragment number of the n-th fragment (n=0..ln-1) is always fi+n.
int ln
length of the chain
int fi
fragment number of first fragment
no methods

Matrix of sequential probabilities: prob_mat

methods: (all in: sequence.cc) parameters and pointers:
int e1, e2
size of matrix: number of rows and columns
float * fp
the matrix elements [i1,i2]
float * s1, s2
sum of elements in rows and columns. s1 = sum of i1,0 .. i1,e2. s2 = sum of 0,i2 .. e1,i2.
float * ts1, ts2
expected sum of elements in rows and columns.
int * id1, id2
row and column identification numbers, e.g. indices in corresponding lists.

s1, ts1 and id1 are arrays of size e1, s2, ts2 and id2 are arrays of size e2.
methods:

void prob_mat::dump(FILE * F)
Writes the matrix to the file F in binary form.
static prob_mat * prob_mat::recover(FILE *F)
construct from file (written with dump), at the current position.
void prob_mat::norm_1(void)
For each element, a factor f=sqrt(ts1/col_sum * ts2/row_sum) is calculated, and each probability w in the matrix is replaced by w=w*f/(w*f-w+1). This is one step of an iterative procedure to bring the sums of all rows and columns to the target sums ts1[] and ts2[].
Rows or columns with a target sum ts == 0 are not corrected.
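One norm_1 step, together with the stopping rule given for sequential_handler::norm_matrix (deviation within MX_DEV=0.1 or 10 iterations), can be sketched as follows. The struct prob_mat_sketch and the helpers scale, max_deviation and normalize are hypothetical. Several points are assumptions: row-major storage, pairing ts1 with the row sums and ts2 with the column sums, using a ratio of 1 for a dimension whose target sum is 0, and measuring the deviation against the target sums.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// The update w*f/(w*f-w+1) multiplies the odds w/(1-w) by f.
float scale(float w, float f) { return w * f / (w * f - w + 1.0f); }

struct prob_mat_sketch {
    int e1, e2;                   // number of rows and columns
    std::vector<float> fp;        // elements; row-major storage is an assumption
    std::vector<float> ts1, ts2;  // target sums of rows and columns

    float& at(int i1, int i2) { return fp[i1 * e2 + i2]; }
    float row_sum(int i1) {
        float s = 0.0f;
        for (int i2 = 0; i2 < e2; ++i2) s += at(i1, i2);
        return s;
    }
    float col_sum(int i2) {
        float s = 0.0f;
        for (int i1 = 0; i1 < e1; ++i1) s += at(i1, i2);
        return s;
    }
    // One step: f = sqrt(row ratio * column ratio), then w = scale(w, f).
    void norm_1() {
        std::vector<float> r(e1, 1.0f), c(e2, 1.0f);
        for (int i1 = 0; i1 < e1; ++i1) {
            float s = row_sum(i1);
            if (ts1[i1] != 0.0f && s > 0.0f) r[i1] = ts1[i1] / s;
        }
        for (int i2 = 0; i2 < e2; ++i2) {
            float s = col_sum(i2);
            if (ts2[i2] != 0.0f && s > 0.0f) c[i2] = ts2[i2] / s;
        }
        for (int i1 = 0; i1 < e1; ++i1)
            for (int i2 = 0; i2 < e2; ++i2)
                at(i1, i2) = scale(at(i1, i2), std::sqrt(r[i1] * c[i2]));
    }
    // Largest |sum - target| over rows and columns with a non-zero target.
    float max_deviation() {
        float d = 0.0f;
        for (int i1 = 0; i1 < e1; ++i1)
            if (ts1[i1] != 0.0f) d = std::fmax(d, std::fabs(row_sum(i1) - ts1[i1]));
        for (int i2 = 0; i2 < e2; ++i2)
            if (ts2[i2] != 0.0f) d = std::fmax(d, std::fabs(col_sum(i2) - ts2[i2]));
        return d;
    }
};

// Repeat norm_1 until the sums are close enough or max_iter steps were made.
int normalize(prob_mat_sketch& m, float mx_dev = 0.1f, int max_iter = 10) {
    int it = 0;
    while (it < max_iter && m.max_deviation() > mx_dev) {
        m.norm_1();
        ++it;
    }
    return it;
}
```

Scaling the odds rather than the value itself keeps every element between 0 and 1, which a plain multiplication by f would not guarantee.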
void prob_mat::norm_b(void)
The matrix is treated in such a way that no more than one value remains in each row and col. The highest probability in the matrix is set to P_FIX, all other values in this row and column are set zero, and norm_1 is called. This procedure is repeated until all values are either 0, or P_FIX. Used to force exactly one next and previous.
float get(int x1, int x2)
void put(int x1, int x2, float v)
access to the matrix itself.
int prob_mat::next_highest(bool fwd, int ir, unsigned char no_flag=0)
Returns the index of the matrix entry with highest probability that has not fl[ir]&no_flag. If fwd==true ir is dim 1, and dim 2 is searched. If fwd=false ir is dim 2, dim 1 is searched.
void prob_mat::set_flag(bool fwd, int ir, unsigned char flag)
Sets flx[ir]|=flag, where the flag array belongs to dim 2 if fwd==true and to dim 1 if fwd==false.
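How next_highest and set_flag might work together is sketched below: the best unflagged candidate is returned, the caller flags it, and the next call returns the runner-up. The free function and its parameter list are hypothetical. That the flags belong to the searched dimension is an interpretation of the text above, and returning INVALID when no candidate is left, as well as row-major storage, are assumptions.

```cpp
#include <cassert>
#include <vector>

const int INVALID = -1;

// fp: e1 x e2 matrix in row-major order (assumption).
// fl: flags of the searched dimension (dim 2 if fwd, dim 1 otherwise).
int next_highest(const std::vector<float>& fp, int e1, int e2, bool fwd, int ir,
                 const std::vector<unsigned char>& fl, unsigned char no_flag = 0) {
    const int n = fwd ? e2 : e1;   // size of the searched dimension
    int best = INVALID;
    float best_v = 0.0f;
    for (int k = 0; k < n; ++k) {
        if (fl[k] & no_flag) continue;   // skip candidates already flagged
        float v = fwd ? fp[ir * e2 + k] : fp[k * e2 + ir];
        if (best == INVALID || v > best_v) {
            best = k;
            best_v = v;
        }
    }
    return best;
}
```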

prob_mat_2 - should become obsolete:

int prob_mat::analyse_sym(void)
Checks whether each row and column contains only 0.0 values apart from at most one value equal to P_FIX (=1024). Sets next and prev. Returns SUCCESS if everything is ok.
int prob_mat::number_of_pieces(bool print)
Check whether there are several pieces and whether pieces of sequence are closed. Return number of pieces, including isolated elements.
int prob_mat::get_pieces(piece **& p)
p is set to an array of pointers, and the pieces (including isolated elements) are made available as p[]->... Returns number of pieces.

pseudoatom_entry, pseudoatom_lib

Undocumented. Classes to recognize the relation between pseudoatoms and "real" atoms. in assignment.h / check.cc
Part 3: methods connected to display of spectra
Dr. Ralf W. Glaser
FSU Jena, Institut fuer Molekularbiologie
Winzerlaer Strasse 10
D-07745 Jena, Germany
Tel.: +49-3641-65-7573
Fax: +49-3641-65-7520
E-mail: Ralf.Glaser@uni-jena.REMOVSPMTAG.de