MOLREP (CCP4: Supported Program)

NAME

molrep - automated program for molecular replacement

SYNOPSIS

molrep [HKLIN in.mtz] [MAPIN EM_map.ccp4] [MODEL in.pdb (or EM_mod_map.ccp4)] [MODEL2 in2.pdb] [PATH_OUT path_out] [PATH_SCR path_scr]
[Keyworded input]

DESCRIPTION

Version 7.3 /01.07.2002/

FEATURES

  - standard molecular replacement method by cross rotation function (RF), full-symmetry translation function (TF) and packing function (PF)
  - also: Self Rotation Function with PostScript plots; Spherically averaged phased translation function (SAPTF); Phased Rotation function (PRF); Phased Translation function (PTF); Locked cross rotation function (LRF)
  - allows input of a priori knowledge of similarity and completeness of the model
  - scaling by Patterson origin peak
  - soft low resolution cut-off
  - anisotropic correction and scaling
  - can use modified structure factors instead of Fobs for RF
  - rigid body refinement
  - can use a second fixed model
  - if the number of monomers is known, MOLREP can position the input number of monomers in a single run
  - can check and manage pseudo-translation
  - fitting two models
  - can just rotate and position the model and compute R-factor and CC
  - search for a model in an electron density map
  - multi-copy search
  - can choose, from symmetry-related models, the one closest to those found before
  - can improve the model before use
  - model correction by sequence alignment
  - can use an NMR model
  - can use an EM or electron density map as the model, or instead of Fobs when searching for a model in an EM map
  - search for the model orientation in an electron density map at a particular position by phased RF
  - find HA positions by MR solution
  - heavy atom search

CONTENTS

  References
  Input/output files
  What MOLREP can do
  How to use MOLREP
  Keywords
  Molecular replacement method. Some theory.
  Command file examples
  Convention for rotation and coord. system
  Memory control parameters

REFERENCES

Author: A.A.Vagin
email: alexei@ysbl.york.ac.uk

References:

  A.A.Vagin. New translation and packing functions. Newsletter on protein crystallography, Daresbury Laboratory (1989) 24, pp 117-121.
  A.Vagin and A.Teplyakov. An approach to multi-copy search in molecular replacement. Acta Cryst. D (2000) 56, pp 1622-1624.
  A.A.Vagin and M.N.Isupov. Spherically averaged phased translation function and its application to the search for molecules and fragments in electron-density maps. Acta Cryst. D (2001) 57, pp 1451-1456.
  main: A.Vagin and A.Teplyakov. MOLREP: an automated program for molecular replacement. J. Appl. Cryst. (1997) 30, pp 1022-1025.

INPUT/OUTPUT FILES

Input files

HKLIN
  Input MTZ file or CIF file of structure factors. In the case of fitting two atomic models, leave this option out.
  A CIF file contains indices, structure factors (and phases if you need them): h,k,l,|F|,sig(F),Phi or h,k,l,|F|,sig(F),Phi,Fom
  An MTZ file must have the extension "mtz".

MAPIN
  MAPIN can be used instead of HKLIN. Input file with a CCP4 map: EM or electron density file in CCP4 format (the file must have the extension "ccp4"). This file will be converted to files of |F| and phases. In the case of fitting two atomic models, leave this option out.

MODEL
  Input PDB file with the coordinates, or a CCP4 map (a CCP4 map file must have the extension "ccp4").
  Without this file the program computes a Self Rotation Function and plots RF(theta,phi,chi) for chi = 180, 90, 120, 60.
  You can change the fourth value of chi (60) with keyword CHI.
  You can change the scale of these RF plots with keyword SCALE.

MODEL2
  Input PDB file with the second, fixed model.

PATH_SCR
  Use this variable to redirect all scratch files to a particular directory. The default is the value of the TEMP1 or CCP4_SCR variable. With 'PATH_SCR .' all scratch files are written to the current directory.

PATH_OUT
  Use this variable to redirect output files to a particular directory.
Space group and unit cell parameters of the unknown structure are taken from the file of structure factors. You can change the space group of the structure factor file with keyword NOSG.

Output files

molrep.pdb
  New PDB coordinate file of the model (plus model_2) corresponding to the best solution of the Cross Rotation and Translation Functions.

molrep_fcalc.cif
  Formatted CIF file of the molrep solution (plus model_2) with Fobs, Fcalc, Phcalc. This file is created if your model is an EM or density map in CCP4 format.

molrep.doc
  Additional protocol (like a log file in CCP4). This file is created if keyword DOC is 'Y' or 'A'.

molrep_rf.tab
  List of peaks of the rotation function, created as soon as the program calculates a Rotation Function. molrep_rf.tab is the default name; you can choose another name with keyword FILE_T.
  You can edit this file and use it for subsequent calculations without computing the rotation function again (keyword FUN=T). The program then reads (in free format!) "Sol_", the peak number and the polar angles (theta,phi,chi), e.g.:
    "Sol_ 23 10.0 22.2 40.0"
  In 'Rotate and position the model' (keyword FUN=S), the program reads "Sol_", the peak number, the polar angles and the shift (sx,sy,sz), e.g.:
    "Sol_ 23 10.0 22.2 40.0 .564 .443 .032"
  If you prefer Euler angles, use "Sol_A" instead of "Sol_".
  In 'Search model orientation by PRF' (keyword PRF=P), the program reads "Sol_", the peak number, the polar angles and the shift (sx,sy,sz), e.g.:
    "Sol_ 23 10.0 22.2 40.0 .564 .443 .032"
  but uses only the shift (sx,sy,sz).

molrep_srf.tab
  List of peaks of the Self Rotation Function. molrep_srf.tab is the default name; you can choose another name with keyword FILE_TSR.

molrep_rf.ps
  PostScript file of the Self Rotation Function.

also (if you used keyword FILE_S):

align.pdb
  Input model corrected by sequence alignment.

WHAT MOLREP CAN DO

MOLREP
 !
 +-- Self RF (FUN=R, without any model)
 !
 +-- Standard MR
 !    +-- Cross RF (FUN=A or FUN=R)
 !    +-- Locked Cross RF (FUN=A or R and LOCK=Y)
 !    +-- TF (FUN=A or FUN=T)
 !
 +-- Multi-copy search for MR
 !    +-- Dyad search (DYAD=D)
 !    !    +-- two identical models
 !    !    +-- two different models
 !    +-- Multi-copy for one model (DYAD=Y)
 !
 +-- Fitting two models
 !    +-- RF and PTF (PRF=N)
 !    +-- SAPTF, RF and PTF (PRF=S)
 !    +-- SAPTF, PRF and PTF (PRF=Y)
 !
 +-- Searching in ED map
 !    +-- RF and PTF (PRF=N)
 !    +-- SAPTF, RF and PTF (PRF=S)
 !    +-- SAPTF, PRF and PTF (PRF=Y)
 !
 +-- Rotate and position the model (FUN=S FILE_T)
 !
 +-- Search model orientation in electron density map
 !   for particular position by phased RF (PRF=P FILE_T)
 !
 +-- HA search
 !    +-- find HA positions by MR solution (FUN=D, model_2)
 !    +-- HA search for SIR or SAD (DIFF=H, FUN=T, without any model)
 !    +-- Self RF for HA position (DIFF=H, FUN=R, without any model)
 !
 +-- pure RB refinement (in Patterson or real space, FUN=B, model_2)

where:

  FUN, DYAD, PRF, LOCK, DIFF - keywords
  MR    - Molecular Replacement
  RF    - Rotation function
  TF    - Translation function
  PRF   - Phased Rotation function
  PTF   - Phased Translation function
  SAPTF - Spherically Averaged Phased Translation function
  ED    - Electron density
  HA    - heavy atom
  RB    - rigid body

  Standard molecular replacement method
  Self Rotation Function only
  Search model in electron density map
  Search model orientation by PRF
  Fitting two models
  Just rotate and position the model
  Multi-copy search
  Model correction
  Use sequence alignment
  NMR model
  EM or electron density model
  Locked Cross Rotation Function
  Rigid body refinement

* Standard molecular replacement method

The program performs molecular replacement in two steps:

  - the Rotation function (RF) searches for the orientation of the model
  - the Cross Translation function (TF) and Packing function (PF) search for the position of the oriented model

The result of the Rotation function depends on the radius of a spherical domain in the centre of the Patterson function (the so-called cut-off radius).
This radius must be chosen so as to maximize the proportion of intramolecular vectors relative to intermolecular vectors inside the sphere. The program chooses the value of this radius as twice the radius of gyration, but can also use an input value. Instead of computing the RF, the program can use a list of orientations from a previously prepared Rotation function (keyword FILE_T).

Anisotropic correction of the data before computing the RF can be useful for data with high anisotropy (keyword ANISO).

With a second fixed model, using modified structure factors instead of |Fobs| for the RF (keyword DIFF) may make the RF clearer. The modified structure factor is:

  ||Fobs|-|Fmod2|*(P2/100)|

where P2 is the percentage of model_2 in the whole structure.

The Translation function can check several peaks of the rotation function by computing a correlation coefficient for each peak and sorting the results. To scale observed and calculated structure factors, the program uses scaling by the origin peak of the Patterson, but for data with high anisotropy it can use anisotropic scaling.

The Translation function can take the second fixed model into account. If the number of monomers is known, MOLREP can also position the input number of monomers in a single run (keyword NMON). In this case it is useful to choose, from the symmetry-related models, the one closest to those found before (keyword STICK).

The program can detect and use pseudo-translation vectors. In this case the pseudo-translation related copy is added to the final model (keyword PST).

The Packing function is very important for removing wrong solutions which correspond to overlapping symmetry-related or different models (keyword PACK).
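As a small illustration of the modified structure factor defined above, the following Python sketch evaluates ||Fobs|-|Fmod2|*(P2/100)| for one reflection. The function and argument names are hypothetical, and the sketch assumes that Fobs and Fmod2 are already on a common scale; it is not MOLREP's own code.

```python
def modified_structure_factor(f_obs: float, f_mod2: float, p2_percent: float) -> float:
    """Return ||Fobs| - |Fmod2| * (P2/100)| for a single reflection.

    f_obs      -- observed amplitude |Fobs|
    f_mod2     -- amplitude calculated from the fixed model_2 (assumed pre-scaled)
    p2_percent -- P2, the percentage of model_2 in the whole structure
    """
    if not 0.0 <= p2_percent <= 100.0:
        raise ValueError("P2 is a percentage and must lie between 0 and 100")
    return abs(abs(f_obs) - abs(f_mod2) * (p2_percent / 100.0))
```

For example, with |Fobs| = 100, |Fmod2| = 40 and P2 = 50, the fixed model contributes 20 and the modified amplitude is 80.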
Use keywords: COMPL, DIFF, FUN, NMON, NP, NPT, P2, PACK, PST, RAD, RESMAX, SIM, STICK, SURF, VPST, NREF, NREFP, FILE_T, FILE_TSR, NSRF

* Self Rotation Function only

If you define only a file of structure factors (Fobs), the program computes a Self Rotation function with cut-off radius RAD = 30 as default. Use keyword RAD if you want another value.

Other useful keywords: RESMAX, RES_R, COMPL, SIM.

Resulting output:

molrep_srf.tab
  List of peaks of the Self Rotation Function. molrep_srf.tab is the default name; you can choose another name with keyword FILE_TSR.

molrep_rf.ps
  PostScript file of the Self Rotation Function, which contains four plots RF(theta,phi,chi) for chi = 180, 90, 120, 60.
  You can change the fourth value of chi (60) with keyword CHI.
  You can change the scale of these RF plots with keyword SCALE.

* Search model in electron density map

In some cases it is difficult to solve an X-ray structure by molecular replacement even when the structure of a homologous molecule is known. If prior phase information is available, either from SIR/MAD or from a partial structure, it can be used in a six-dimensional search. The program divides the six-dimensional search with phases into three steps:

  - a spherically averaged phased translation function (SAPTF) is used to locate the position of the molecule or its fragment. It compares the locally spherically averaged experimental electron density with that calculated from the model and tabulates the highest scoring positions.
  - then, for each such position, a local phased rotation function (PRF) is used to find the orientation of the molecule. Another possibility is to use the usual rotation function (RF) on a modified map, i.e. the program sets to 0 the density outside a sphere centred on the current point with radius = twice the radius of the model.
  - the phased translation function (PTF) for the orientation found checks and refines the position found.
You need to have the phases in a CIF file of structure factors, or use the corresponding keywords for an MTZ file, or use an EM map as input instead of the Fobs file.

with keyword PRF = 'N' (default value):
  the usual Rotation function and the Phased Translation function are used.

with keyword PRF = 'Y':
  SAPTF (Spherically averaged phased translation function), Phased Rotation function and Phased Translation function are used.

with keyword PRF = 'S':
  1. SAPTF (Spherically averaged phased translation function).
  2. For the current point of the SAPTF solution the input map is modified, i.e. the program sets to 0 the density outside a sphere centred on the current point with radius = twice the radius of the model.
  3. The usual Rotation function for this modified map.
  4. Phased Translation function.

Other useful keywords: COMPL, NMON, NP, NPT, RAD, RESMAX, SIM, SURF, INVER

You can also refine the solution by Pure Rigid Body Refinement.

* Search model orientation in electron density map for particular position by PRF

Use this possibility (keywords PRF=P and FUN=R or A) if you want to find the model orientation in an ED map by rotating the model around a defined point in the map. The program puts the origin of the model coordinate system at the defined point and performs the phased rotation function (PRF). Use keyword RAD to define the radius of the sphere for the PRF.

You must define the list of points in the ED map in the file FILE_T, which must contain lines with "Sol_", the peak number, the polar angles and the shift (sx,sy,sz), e.g.:
  "Sol_ 23 10.0 22.2 40.0 .564 .443 .032"
The program uses only the shift (sx,sy,sz).

The model is rotated around the origin of the model coordinate system. If keyword SURF = Y, A, 2 or O, the program automatically puts the centre of the model at the origin of the model coordinate system.
If you want, for example, to rotate the model around some atom, shift the origin to this atom and use SURF=N.

Other useful keywords: COMPL, NMON, NP, NPT, RESMAX, SIM, INVER

* Fitting two models (FM)

The idea is to fit the electron densities instead of the atomic models, trying to find the best overlap. The advantages are:

  - can be used for cases with very low homology;
  - can be used when the amino acid sequence is absent;
  - no need for a list of equivalent atoms.

If you define only two model files (search model and model_2), without a file of structure factors (HKLIN), the program fits the search model (MODEL) to the second model (MODEL2). The search model must be smaller than or equal in size to the second model.

with keyword PRF = 'N' (default value):
  the usual Rotation function (RF) is used to search for the orientation and the Phased Translation function (PTF) to search for the position.

with keyword PRF = 'Y':
  The Spherically averaged phased translation function (SAPTF) gives the expected position of the model.
  The Phased Rotation function (PRF) at the expected position gives the orientation.
  The Phased Translation function (PTF) checks and refines the translation vector.

with keyword PRF = 'S':
  see above, Search model in electron density map.

Other useful keywords: COMPL, NP, NPT, RAD, RESMAX, SIM, SURF

The result is the file molrep.pdb: the model fitted to the second model.

* Just rotate and position the model

This possibility may be useful if you want to place the model in a particular orientation and position, or to compare several solutions. Use keyword FUN=S and define three files: a model (MODEL), a file of structure factors (HKLIN) and a file with polar angles and shifts (keyword FILE_T). The program shifts the model to the origin, rotates it (by the polar angles) and then positions it (in fractional units). The new model is written to an output coordinate file. The program also computes an R-factor and a Correlation Coefficient.
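The free-format "Sol_" lines that FILE_T carries for this mode can be read with a few lines of Python. This is only a sketch based on the layout described in this document ("Sol_" or "Sol_A", peak number, three angles, optional shift); the `Solution` type and `parse_sol_line` function are hypothetical names, and a real TAB file may contain other lines or columns that this sketch simply skips or ignores.

```python
from typing import NamedTuple, Optional, Tuple


class Solution(NamedTuple):
    peak: int                                     # peak number
    angles: Tuple[float, float, float]            # polar (theta, phi, chi), or Euler if euler is True
    euler: bool                                   # True when the line starts with "Sol_A"
    shift: Optional[Tuple[float, float, float]]   # fractional (sx, sy, sz), if present


def parse_sol_line(line: str) -> Optional[Solution]:
    """Parse one free-format 'Sol_' line; return None for any other line."""
    fields = line.split()
    if not fields or fields[0] not in ("Sol_", "Sol_A"):
        return None
    if len(fields) < 5:
        raise ValueError("expected a peak number and three angles: " + line)
    peak = int(fields[1])
    angles = (float(fields[2]), float(fields[3]), float(fields[4]))
    # The shift is only present for FUN=S and PRF=P style lines.
    shift = None
    if len(fields) >= 8:
        shift = (float(fields[5]), float(fields[6]), float(fields[7]))
    return Solution(peak, angles, fields[0] == "Sol_A", shift)
```

Calling `parse_sol_line("Sol_ 23 10.0 22.2 40.0 .564 .443 .032")` yields peak 23, angles (10.0, 22.2, 40.0) and shift (0.564, 0.443, 0.032); a line without the three trailing numbers gives `shift = None`.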
Other useful keywords: COMPL, RESMAX, RES_T, SIM

* Multi-copy search

There are two modes: "dyad_search" and "Multi-copy search".

Dyad_search - search for two copies of a model simultaneously (keyword DYAD=D).

Sometimes you cannot find a solution starting with one molecule if there are several copies of the molecule in the asymmetric unit. In this case a search with two independent molecules may give a solution.

The central point of the method is the construction of a multi-copy search model from properly oriented monomers using a special TF (STF), which gives the intermolecular vector between properly oriented monomers (the dyad). This dyad can then be used for a positional search with a conventional TF.

  - The program checks all pairs of NP peaks of the Rotation Function (RF). For each pair the program uses the first rotation to prepare model-1. Model-2 is prepared by using the second rotation and one rotation from the crystallographic symmetry operators. The total number of pairs to be checked is ((NP+1)*NP*Nsym)/2.
  - Next, for model-1 and model-2, the program computes the Special Translation Function (STF) to find the inter-molecular vector of the dyad.
  - For NPT peaks (i.e. inter-molecular vectors) of the STF, the program computes the standard Translation Function (TF) using the current dyad as a model, and calculates a Correlation Coefficient for the first NPTD peaks of the TF.

Solution and output file: molrep.pdb will be the dyad with the best Correlation Coefficient (or several dyads if keyword NMON > 1).

WARNING: the procedure takes quite some time, because the total number of Translation Functions to be calculated is NMON*NPT*((NP+1)*NP*Nsym)/2.
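To get a feel for the cost quoted in the warning above, the two counts can be evaluated directly. The Python sketch below just codes the formulas ((NP+1)*NP*Nsym)/2 and NMON*NPT*((NP+1)*NP*Nsym)/2; the function names are hypothetical and the input values in the example are illustrative, not program defaults for the dyad search.

```python
def dyad_pair_count(np_peaks: int, nsym: int) -> int:
    """Number of RF peak pairs checked: ((NP+1)*NP*Nsym)/2."""
    # (NP+1)*NP is always even, so the integer division is exact.
    return (np_peaks + 1) * np_peaks * nsym // 2


def dyad_tf_count(nmon: int, npt: int, np_peaks: int, nsym: int) -> int:
    """Total number of Translation Functions: NMON*NPT*((NP+1)*NP*Nsym)/2."""
    return nmon * npt * dyad_pair_count(np_peaks, nsym)
```

With NP = 10, NPT = 20, NMON = 1 and a space group with 4 symmetry operators this gives 220 pairs and 4400 Translation Functions, which shows how quickly the search grows with NP.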
In the output .log (.doc) file you can find the following information:

  Sol_      R1 R2 Rs Rslf STF TF  Shift_1            PFmax PFmin Rfac  Corr
  Sol_      1  1  1  0    2   1   0.059 0.000 0.201  1.01  0.99  0.569 0.379

and

  Sol_best  1  1  1  0    2   1   0.059 0.000 0.201  1.01  0.99  0.569 0.379
  Sol_best  Rot1-->2       Dyad_vector           dist d_ort d_par
  Sol_best  0.0 0.0 0.0    -0.210 0.000 -0.487   39.2 19.6  33.9

These lines mean:

  R1           peak number of the rotation for model-1
  R2           peak number of the rotation for model-2
  Rs           number of the CS operator which is applied before the rotation for model-2
  Rslf         peak number of the self rotation function
  STF          peak number of the special translation function
  TF           peak number of the translation function
  Shift_1      position of model-1
  PFmax PFmin  max and min values of the Packing function
  Rfac Corr    R-factor and Correlation Coefficient
  Rot1-->2     polar angles of the rotation from model-1 to model-2
  Dyad_vector  vector (in fractional units) from model-1 to model-2
  dist d_ort d_par
               first number - distance between the models (in Angstrom)
               second number - distance orthogonal to rotation 1->2
               third number - distance parallel to the rotation, i.e. for a pure dimer this is 0

With keyword LIST=L you can find additional information:

  Sol_  angles_1             angles_2             shift_2
  Sol_  90.63 98.70 118.12   90.63 98.70 118.12   0.189 0.256 -0.415

[Figure: monomer_1, rotated by angles_1, is placed at shift_1 from the origin; monomer_2, rotated by angles_2, is placed at shift_2 from the origin; the dyad vector joins monomer_1 to monomer_2.]

If you believe the Self-RF, you can try to find a dyad in which the rotation between the monomers corresponds to a rotation of the Self-RF (use keywords NSRF, FILE_TSR).

Model-2 can be different from model-1.
Use keyword FILE_M2 to define the file of search model-2, FILE_T2 to give the list of rotation function peaks for this model (this RF has to be computed beforehand) and NP2 to give the number of peaks to be used.

Multi-copy search - search for many copies of a model (not only a dyad) (keyword DYAD=Y).

The program starts by searching for a single monomer, then performs the dyad search, repeats the dyad search for the next dyad with the first one fixed and, finally, tries to add a single monomer.

Use keywords: DYAD, DIST, NP, NSRF, NPT, NPTD, NP2, AXIS, FILE_M2, FILE_T2, FILE_T, FILE_TSR, NMON, ALL, PACK
and also: COMPL, SIM, RESMAX, SURF, STICK

* Model correction

You can improve your model beforehand by using keyword SURF.

* Using sequence alignment

Another way to improve your model is to use the sequence of the unknown structure. Use keyword FILE_S to define a file containing a sequence. This sequence file must be ASCII:

 !
 !
 !# sequence
 !SVIGSDDRTRVTNTTAYPYRAIVHISSSIGSCTGWMIGPKTVATAGHCIY
 !# this is comment
 ! DTSSG--SFAGTATVSP GRNGTSYPYG
 !NRGTRITKEVFDNLTNWKNSAQ
 !

If the first symbol in a line is "#", the line is a comment. Blanks are ignored.

The program performs a sequence alignment and creates a new, corrected model with the atoms corresponding to the alignment. The output file with the corrected model is align.pdb. The results of the alignment are written to the DOC-file, if this was defined. Without an Fobs file, the program only performs model correction.

* NMR Model

You can use a PDB file with NMR models, or a pseudo-NMR file with several homologous structures which have been superimposed beforehand. The algorithm is equivalent to summing the RF and/or TF over the individual structures. The program can find the best model in the NMR file or use all models (see keyword NMR). In the PDB file the different models must be separated by a MODEL record. For example:

  HEADER HYDROLASE (ENDORIBONUCLEASE)
  CRYST1 64.900 78.320 38.790 90.00 90.00 ...
  MODEL 1
  ATOM 1 N ASP A 1 45.161 12.836 ...
  ATOM 2 CA ASP A 1 45.220 12.435 ...
  ...
  ATOM 745 SG CYS A 96 58.398 6.673 ...
  ATOM 746 OXT CYS A 96 62.238 7.178 ...
  ENDMDL
  MODEL 2
  ATOM 1 N ASP B 1 44.487 11.386 ...
  ATOM 2 CA ASP B 1 44.559 11.129 ...
  ...

Use keyword NMR

* EM or electron density model

The search model can be an Electron Microscopy (EM) model or an electron density map. If keyword ROLIM is defined, only values higher than this limit are used. The map must have space group P1 and must contain the whole model. The vector ORIGIN defines the centre of the model, and the rotation is performed around this point. If parameter DRAD (radius of the model) is defined, the program uses only the density inside the sphere centred on ORIGIN with radius = DRAD.

[Figure: map box with the grid running from 0 to nx along A_cell and from 0 to nz along C_cell, with the level izmax marked below nz; the model lies inside a sphere of radius DRAD around its centre, and the vector ORIGIN points from the map origin to that centre.]

The program gets the vector ORIGIN from the file automatically. If it is not possible to get a correct vector, the program uses ORIGIN = ( 0.5, 0.5, izmax/(2*nz)). If you want, you can define ORIGIN yourself.

Use keywords: DSCALEM, INVERM, ROLIM, DRAD, ORIGIN

You can also use an EM or electron density map instead of the file of Fobs. In this case the map is converted into |F| and phases, and Search model in electron density map is performed as usual.

Use keywords: DSCALE, INVER, DLIM

* Locked Cross Rotation Function

The Locked Cross Rotation function (LRF) averages the Cross Rotation function over the NCS, which can be determined with the Self Rotation function. The LRF is especially useful when the NCS forms a group.

Use keywords: LOCK, NSRF, FILE_TSR

* Rigid body refinement

If keyword MODE = S, the program performs Rigid Body refinement for each peak of the TF.
The number of cycles is controlled by keyword NREF (default 10). The program can also refine the orientation given by the RF before the TF. In this case it performs Rigid Body refinement (in space group P1) for each peak of the RF. The number of cycles is controlled by keyword NREFP. The default value is 0, i.e. without this refinement.

Use keywords: MODE, NREF, NREFP

If your model contains several domains you can use multi-domain Rigid body refinement. For this you must put additional lines into the PDB file before each domain. Each additional line contains the word '#DOMAIN' and the number of the domain (free format). For example:

  HEADER HYDROLASE (ENDORIBONUCLEASE)
  CRYST1 64.900 78.320 38.790 90.00 90.00 ...
  #DOMAIN 1
  ATOM 1 N ASP A 1 45.161 12.836 ...
  ATOM 2 CA ASP A 1 45.220 12.435 ...
  ...
  ATOM 745 SG CYS A 96 58.398 6.673 ...
  ATOM 746 O CYS A 96 62.238 7.178 ...
  #DOMAIN 2
  ATOM 747 N PHE A 97 44.487 11.386 ...
  ATOM 748 CA PHE A 97 44.559 11.129 ...
  ...
  ATOM 945 C VAL A 196 58.398 6.673 ...
  ATOM 946 O VAL A 196 62.238 7.178 ...
  #DOMAIN 1
  ATOM 947 N ASP A 197 44.487 11.386 ...
  ATOM 948 CA ASP A 197 44.559 11.129 ...
  ...

You can also use Pure Rigid Body Refinement in Patterson or real space. This possibility is useful in the last stage of MR, for example after fitting the model into an EM map. If you want to use multi-domain Rigid body refinement, define the domain structure in the PDB file (see above) and use keyword DOM = 'Y'.

Use keywords: FUN = B, DOM, NREF

* Find HA positions by MR solution

Use keywords: FUN = D, MODEL_2

* Heavy atom search

In this case you do not need to use any model.

Use keywords: DIFF = H, FUN = T or R
  'FUN = T' means Heavy atom search (experimental version)
  'FUN = R' means Self RF for the Heavy atom structure.

HOW TO USE MOLREP

A simple way to use MOLREP is to define the files for Fobs (HKLIN) and the model (MODEL) and the number of models to search for (keyword NMON), and to use default values for all other parameters (i.e. without using any other keywords). There is always a chance of solving the structure automatically.
If this does not work, use a common strategy of molecular replacement.

Planning ahead

Success of the molecular replacement method depends on:

  - quality of the experimental data
  - scaling of |F|_obs and |F|_calc
  - low resolution limit
  - high resolution limit
  - quality of the model, homology, conformation

Things to look out for:

data
  Look at your data quality. Completeness is very important. Absence of low resolution reflections may cause problems, especially if the model is only part of the whole structure. Look at anisotropy and twinning. Think carefully: can you 'safely' use the high resolution reflections? If not, use keyword SIM to remove the potentially bad effect of this part of the data. It might be a good idea to use some program to check the data, for example SFCHECK.

model
  Look at the shape of the model. The automatic choice of the cut-off radius for the RF is twice the radius of gyration. This is good enough if the shape is approximately spherical. If the model is very asymmetrical, it is better to choose the radius yourself.
  Remove from your model the heavy atoms and any terminal residues that lie 'outside' the model.
  Make a choice for SIM and COMPL. If you have no idea about the similarity, SIM=0.5 is a good approximation.
  If you have a dimer use it, but use the RAD corresponding to a monomer.
  It is very useful to shift the model to the origin of coordinates. Use keyword SURF = O or Y (Y is default).

Self-RF
  Compute the Self-RF. It may give you some idea about the NCS or about the number of copies in the asymmetric unit. Choose the radius of integration carefully. The program cannot make an informed choice about it without a model (default is 30Å).

Cross-RF only
  Compute the Cross-RF with LIST=L and DOC = Y. In the DOC-file you can find the list of expected orientations of the model and also the rotations between them. Compare this with the Self-RF. This is an additional check of the correctness of the expected orientation.
  But sometimes no corresponding peaks can be found in the Self-RF for a correct orientation.
  If you have high anisotropy in the data, use anisotropic correction.

Translation function
  If there are several copies of the model in the asymmetric unit, use keyword NMON, or the multi-copy or dyad search.
  You cannot use the Pseudo-translation option with a dyad search, since the dyad search can recognize pseudo-translation itself.
  If you have high anisotropy in the data, use anisotropic scaling.

Pseudo-translation
  MOLREP can detect pseudo-translation and define a pseudo-translation vector. If keyword PST = Y, the program applies pseudo-translation with a vector defined by the program or by the user. When calculating a Translation Function, the program uses this vector to modify the structure factors. The pseudo-translation copy is added to the final model at the end of the run.
  If FUN=R and LIST=L, MOLREP computes a list of Patterson peaks and writes these to molrep.doc. This may be helpful in detecting pseudo-translation.
  Use keywords: PST, VPST

Flexible model
  If your model is flexible, for example if it consists of two domains, you can try to solve the problem in two ways:
  1. Create a separate file for each domain and use the dyad search (DYAD = D).
  2. Combine the two domain files into a single file with a "MODEL" line between the domains (like an NMR file). Use the usual Molecular Replacement methods with keyword NREFP, or with MODE = S and NREFP.

The use of many homologous models
  If you have several homologous models you can create a pseudo-NMR file with these models and use them together (see NMR model). These models must be superimposed beforehand, for example by MOLREP (see Fitting two models).

Keep in mind

  - If you want to play with parameters, also use the Keywords for special cases.
  - Without a model file, the program only computes a Self Rotation Function.
  - Model correction can be performed by using keyword SURF, or by including FILE_S, a file with a sequence for sequence alignment.
  - If FUN=R, the program computes all symmetry-related peaks of the Rotation Function and writes them to molrep.doc. If keywords NSRF and FILE_TSR are also used, you can find the pairs of peaks of the cross RF which correspond to the NCS.
  - If you want to change the space group of the structure factor file, use keyword NOSG, i.e. the new space group number.
  - The Packing Function (PF) is very important for removing wrong solutions which correspond to overlapping symmetry-related or different models. You can switch this option off (PACK = N), for example if you want to find the model in a special position. A value of PF = 1 corresponds to no overlap; a value of -1 corresponds to two completely overlapping models.
  - When MOLREP is trying to find several models (NMON > 1) it is useful to use keyword STICK = Y. Then, for each additional molecule, the program chooses the symmetry-related molecule closest to those found before. This option does not work with pseudo-translation.
  - Do not use MODE = S without serious consideration.
  - For the PRF and the SAPTF, the default cut-off radius is equal to the radius of gyration, whereas for a Patterson calculation the cut-off radius is twice the radius of gyration.
KEYWORDED INPUT

The available keywords are:

General keywords

  Common: DOC, LABIN, FILE_T, FUN, NMON, NP, NPT, RAD, PATH_SCR
  And for structure factors control: COMPL, RESMAX, SIM
  And for model control: SURF
  And for multi-copy search: DYAD, FILE_M2, FILE_T2, NP2, NPTD, NSRF
  And for search in ED: PRF, INVER
  And for fitting two models: PRF
  And for EM or electron density model: DSCALEM, INVERM, ROLIM, DRAD, ORIGIN
  And for EM or electron density instead of Fobs: DSCALE, INVER, DLIM

Keywords for special cases

  Common: ANISO, BADD, LIST, LMAX, LMIN, MODE, PACK, RES_R, RES_T
  And for standard MR: DIFF, FILE_S, NMR, NOSG, P2, PST, STICK, VPST, LOCK, NREF, NREFP
  And for Self RF: CHI, PST, SCALE, FILE_TSR
  And for multi-copy search: AXIS, DIFF, DIST, P2, ALL, STICK
  And for search in ED: DIFF, P2, NPTD
  And for fitting two models: NPTD
  And for Pure Rigid Body Refinement: DOM

General keywords

LABIN =...
  Specify input column labels. The program labels defined are: F, SIGF, F(-), SIGF(-), I, SIGI, I(-), SIGI(-), PHIC, FOM

  F        label of F or F(+)
  SIGF     label of sigma F or sigma F(+)
  F(-)     label of F(-)
  SIGF(-)  label of sigma F(-)
  I        Structure Intensity of hkl
  SIGI     Standard deviation of the above
  I(-)     Structure Intensity of -h -k -l
  SIGI(-)  Standard deviation of the above
  PH       label of phases
  FOM      label of figure of merit

DOC < N | Y | A >
  Default:
  Use the additional file with the protocol of the running of the program: DOC-file molrep.doc

  N  do not produce a DOC-file
  Y  produce a DOC-file with new contents
  A  keep old contents and add new information, i.e. if a file molrep.doc already exists, the program adds any new information to the end of this file

  The DOC-file contains the protocol of the running of the program.

NP
  Default: <10>
  The number of peaks from the rotation function to be used/checked (maximum: 50). In special cases (e.g. for a dyad search), the use of keywords FUN (with option 'T') and FILE_T is closely linked to NP.
NPT
  Default: <20>
  The number of peaks from the translation function to be used/checked (maximum: 50). For use in dyad search, see NPT for dyad search.

NMON
  Default: <1>
  The number of monomers. The program will try to create a full model, which will consist of NMON initial models plus model_2.

COMPL
  Default: automatic choice
  The completeness of the model: from 0.1 to 1.0. It corresponds to Boff: from RESMAX*2 to RESMAX*6. If COMPL is used, keywords RES_R and RES_T are ignored. For example: if you have a dimer in the asymmetric unit, COMPL=0.5.

SIM
  Default: automatic choice
  Similarity of the model: from 0.1 to 1.0. It corresponds to Badd: from Boverall to -Boverall. SIM=1 means normalized F will be used. When no knowledge of the similarity is available, SIM=0.5 is recommended as a starting value. If SIM is used, the keyword BADD is ignored.

  SIM (Badd) controls the high resolution data
  COMPL (Boff) controls the low resolution data

  The use of Boff and Badd changes Fobs and Fmodel as follows:
    |F|_new = |F|_input * exp(-Badd*s^2) * (1 - exp(-Boff*s^2))

FUN < A | R | T | S | B | D >
  Default:

  R  calculate only the Rotation Function
  T  calculate only the Translation Function, reading the list of RF peaks from file (molrep_rf.tab) or from the TAB_file
  A  calculate both: RF and TF
  S  rotate and position the model and compute R-factor and Correlation Coefficient
  B  pure Rigid Body refinement
  D  find HA positions by MR solution

FILE_T
  Default: molrep_rf.tab
  Input or output TAB_file (see also molrep_rf.tab)

SURF < N | Y | A | O | 2 >
  Default: Y
  Perform model correction.

  N  do not perform any model correction. For FUN=S (just_rotate_and_position) the program changes N to O
  O  only shift to the origin
  A  make the protein into a polyalanine model (i.e. remove from the model: water molecules, H atoms, atoms with alternative conformation (except the first), atoms with occupancy = 0), make all B = 20, and shift to the origin
  Y  remove various atoms from the model (water molecules, H atoms, atoms with alternative conformation (except the first), atoms with occupancy = 0), shift to the origin, compute the atomic accessible surface area and replace the atomic B with B = 15.0 + SURFACE_AREA*10.0
  2  set all B = 20 and shift to the origin

RAD
  Default: automatically calculated from the model:
    for Rotation Function calculations: twice the radius of gyration
    for PRF and SAPTF: the radius of gyration
  unless in case of Self-RF calculations: 30Å
  Cut-off radius for the Patterson search or for the electron density search.

RESMAX
  Default: <3>
  High resolution limit.

Keywords for special cases

PST < N | Y | C >
  Default:
  How to deal with pseudo-translation.

  N  ignore pseudo-translation altogether
  C  check only, but do not use pseudo-translation. If FUN=R and LIST=L, the program computes a list of Patterson peaks and writes these to 'molrep.doc'. This may be useful for detecting pseudo-translation.
  Y  use pseudo-translation. For the Translation Function, the program adds to the model a copy of the model translated by the pseudo-translation vector.

VPST
  Default: automatically from Patterson
  Pseudo-translation vector (in fractional units), used when PST = Y.

MODE
  Default:

  F  standard rotation and translation functions are used without rigid body refinement
  S  advanced rotation and translation functions and rigid body refinement are used
  M  standard rotation and translation functions are used. Rigid body refinement is possible. Slower than MODE=F, but the correlation coefficient is calculated more correctly.

RES_R
  Default: automatic choice
  Low resolution limit for the Rotation Function. Instead of applying RES_R directly, the program uses all data and applies Boff = 4*(RES_R)^2.

RES_T
  Default: automatic choice
  Low resolution limit for the Translation Function.
    Instead of applying RES_T directly, the program uses all data and applies Boff = 4*(RES_T)^2.

BADD
    Default: <0>
    BOFF and BADD mean:
        |F|_new = |F|_input * exp(-Badd*s^2) * (1 - exp(-Boff*s^2))

ANISO < N | Y | C | S | K >
    Default:
    N  do not use anisotropic correction and/or scaling
    Y  use anisotropic correction and scaling
    C  use anisotropic correction of Fobs for RF only
    S  use anisotropic scaling for TF only
    K  use scaling without B-factor

PACK < Y | N >
    Default:
    Y  use Packing Function with Translation Function
    N  do not use Packing Function with Translation Function

LMIN
    Default: <4>
    Minimum L-index of spherical coefficients. The program does not use coefficients with L=0. Possible values are 2,4,6,... L = 2 means to use all coefficients up to Lmax.

LMAX
    Default: automatic choice
    Maximum L-index of spherical coefficients. Possible values are 2,4,6,8,...,58,60.

PRF < N | Y | S | P >
    Default:
    N  standard RF and Phased Translation Function are calculated
    Y  SAPTF (Spherically averaged phased translation function), Phased Rotation Function (PRF) and Phased Translation Function will be used.
    S  SAPTF (Spherically averaged phased translation function), usual Rotation Function (RF) for modified map and Phased Translation Function will be used.
    P  search the model orientation in the ED map by rotating the model around defined points in the ED map. The list of points must be in the file FILE_T.
    The program will use the phases from the MTZ file or from the EM map.
    If keyword FUN=T is given, rather than computing the Rotation Function, the program reads rotation function results from file FILE_T (or "molrep_rf.tab"): "Sol_ peak number, polar angles (theta,phi,chi) and shift (sx,sy,sz)"

NOSG
    Default: <0>
    Number of the new space group, if you want to change the space group for the file of structure factors. The program changes only the space group name, group number and cryst. symmetry operators, not the cell or the data.

LIST < S | L >
    Default:
    S  short DOC-file
    L  long DOC-file

DIFF < N | P | F | H >
    Default:
    N  use unmodified structure factors
    P  use modified structure factors instead of Fobs for RF, as follows: ||Fobs|-|Fmod2|*(P2/100)|
    F  use modified structure factors instead of Fobs for RF, as follows: vector difference (Fobs - Fmod2*(P2/100))
    H  for heavy atom search

P2
    Default: <0>
    Percentage of model_2 in the structure.

NREF
    Default: <10>
    Number of cycles of rigid body refinement for each TF solution. See keyword MODE.

NREFP
    Default: <0>
    Number of cycles of rigid body refinement before TF for each RF peak. The default is to skip this refinement.

STICK < N | Y >
    Default:
    From the symmetry-related models, choose the one closest to the model found before (this option does not work with the pseudo-translation possibility).

FILE_S
    File with sequence for model correction by sequence alignment.

NMR < 0 | 1 | 2 | 3 >
    Default: <0>
    0  use PDB file with NMR structures as single model
    1  use NMR possibility only for RF
    2  use NMR possibility for RF and TF. Best NMR model will be found and used as solution.
    3  use NMR possibility for RF and TF. Averaged TF will be used. All NMR models will be used as solution.

LOCK < Y | N >
    Default:
    Locked Cross Rotation function will be performed. Use also keywords: FILE_TSR and NSRF

Keywords specific for multi-copy search

DYAD < N | Y | D >
    Default:
    Y  multi-copy search
    D  dyad search

DIST
    Three distances for dyad search.
    Dmin  Default: radius of gyration. Minimal distance between molecules
    Dmax  Default: 1000 Å. Maximal distance between molecules
    Dpar  Default: 1000 Å. Maximal shift along rotation axis

AXIS
    Default: <0,0>
    Chi    check only rotation by Chi (in degrees). 0 means to check all orientations.
    Delta  delta for Chi (in degrees)

NSRF
    Default: <0>
    Number of peaks of Self-RF which will be used. 0 means not to use Self-RF. The list of Self-RF peaks is taken from the file defined by keyword FILE_TSR, which must be prepared in advance (see Self Rotation Function).

NPT
    This meaning applies only in conjunction with keyword DYAD: number of peaks in the STF (Special Translation Function) to be checked through Translation Function calculations, for inter-molecular vector search. If keyword DYAD is not given, the standard meaning of keyword NPT is used.

NPTD
    Number of peaks in TF to be checked through Correlation Coefficient calculations, for dyad search.

NP2
    Number of peaks in RF for second searching model to be checked for dyad search.

FILE_M2
    File of second searching model

FILE_T2
    File with list of peaks of RF for second searching model

ALL < N | Y >
    Default:
    If ALL = Y, the program will use all crystallographic symmetry operators.

Keywords for Self Rotation Function

Without a file of the model, the program computes a Self Rotation Function.

CHI
    Default: <60>
    Angle chi of the additional fourth section of RF(theta,phi,chi).

SCALE
    Default: <6>
    Maximum value of RF is SCALE * SIGMA(RF).

FILE_TSR
    Default:
    Input or output TAB_file with peaks of Self_RF.

Keywords for EM or electron density as model:

DSCALEM
    Default: <1>
    Scale factor of correction of density cell

INVERM < N | Y >
    Default:
    If Y, inverted phases will be used

ROLIM
    Default:
    Minimal value of density which will be used

DRAD
    Default: <0>
    Radius of the model (in Å). If parameter DRAD is defined, the program will use the density only inside the sphere with radius = DRAD centred on the vector ORIGIN.

ORIGIN
    Default: <0,0,0>
    Centre of the model in the cell (in fractional units)

Keywords for EM or electron density instead of Fobs:

DSCALE
    Default: <1>
    Scale factor of correction of density cell

INVER < N | Y >
    Default:
    If Y, inverted phases will be used

DLIM
    Default:
    Minimal value of density which will be used

Keywords for pure Rigid Body Refinement:

DOM < N | Y | I | S >
    Default:
    N  RB refinement as single body.
    Y  Multi-domain refinement.
    I  Give only information about molecule-domain structure. Useful for RB refinement with constraints.
    S  Multi-domain refinement with constraints.
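
The Boff/Badd weighting quoted under SIM, BADD, RES_R and RES_T can be sketched in a few lines of Python. This is only an illustration of the formula, not MOLREP code. Two points are assumptions that the text does not state: s is taken as sin(theta)/lambda = 1/(2d), and "no low resolution cut-off" is represented by boff=None (putting Boff = 0 literally into the formula would zero every reflection). The function names are hypothetical.

```python
import math


def s2_from_resolution(d):
    """Return s^2 for a reflection at resolution d (in Angstrom).

    ASSUMPTION: s = sin(theta)/lambda = 1/(2d). The documentation does not
    define s; with this choice Boff = 4*resmin^2 gives Boff*s^2 = (resmin/d)^2.
    """
    return 1.0 / (4.0 * d * d)


def soft_weight(s2, badd=0.0, boff=None):
    """Factor applied to |F|: exp(-Badd*s^2) * (1 - exp(-Boff*s^2)).

    boff=None (an assumption of this sketch) skips the low resolution term.
    """
    weight = math.exp(-badd * s2)
    if boff is not None:
        weight *= 1.0 - math.exp(-boff * s2)
    return weight


if __name__ == "__main__":
    res_low = 10.0                 # e.g. RES_R 10
    boff = 4.0 * res_low ** 2      # Boff = 4*(RES_R)^2
    for d in (40.0, 20.0, 10.0, 5.0, 3.0):
        w = soft_weight(s2_from_resolution(d), badd=0.0, boff=boff)
        print(f"d = {d:5.1f} A   weight = {w:.3f}")
```

Under these assumptions the weight rises smoothly from near zero at very low resolution towards one at high resolution, instead of switching abruptly at the cut-off.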

MOLECULAR REPLACEMENT METHOD - THEORY

Molecular replacement method (MR)

There are two major steps in the molecular replacement method: the orientation search and the translation search. They are performed by the Rotation and Translation functions. Both are correlation functions (or overlap functions) between the observed Patterson and the Patterson calculated from the model.

Rotation function (RF):

    ROT(R) = Int_rad Pobs(r) * Pcalc(R,r) dr

where
    R        operator of rotation
    Int_rad  integral inside a sphere at the centre of the Patterson with radius = rad (i.e. the cut-off radius)
    Pobs     observed Patterson
    Pcalc    calculated Patterson for the rotated (R) model

Translation function (TF):

    TR(s) = Int_cell Pobs(r) * Pcalc(s,r) dr
          = Sum_{i#j} ( Int_cell Pobs(r) * Pcalc_ij(s,r) dr )
          = Sum_{i#j} TRij(s)

where
    s              vector of translation
    Int_cell       integral over the cell
    i,j            cryst. symmetry operator numbers (i#j: i not equal to j)
    Pcalc_ij(s,r)  calculated Patterson for the model corresponding to the ith operator and the model corresponding to the jth operator
    TRij(s)        translation function of Pattersons Pobs(r) and Pcalc_ij(s,r)

The Translation Function is the sum of translation functions for each pair of different cryst. symmetry operators.

The best rotation function algorithm is the Crowther Fast Rotation Function, which we use here. It utilizes FFT. MOLREP can compute the Rotation Function for three different orientations of the model and average them. That reduces the noise of the Rotation Function.

The Translation Function algorithm was developed by the author and performs calculations in reciprocal space using FFT. There are two major differences from other translation functions. First, instead of summing the translation functions for two operators TRij, we multiply them, which gives the resulting map far more contrast. Second, we can multiply the translation function by the Packing Function to remove peaks corresponding to incorrect solutions with bad packing.

The Packing function (PF) is an overlap function:

    P(s) = Sum_{i#j} ( Int_cell Ro_i(r) * Ro_j(r) dr )

where Ro_i(r) is the electron density of the model which corresponds to the ith cryst. symmetry operator.

The algorithm for calculating the Packing Function is similar to the one for the Translation Function and is performed by the same program. Finally, the 'advanced' Translation function is:

    TR(s) = [ Prod_{i#j} TRij(s) ] * P(s)

where Prod means multiplication of the different TRij.

Scaling by Patterson

For scaling we use a completely new strategy based on the Patterson origin peak, which is approximated by a Gaussian. This peak is computed for both the observed and calculated amplitudes, and in each case the B_overall is computed. The difference B_diff_overall = B_obs_overall - B_calc_overall is then added to the calculated B_overall so as to make the width of the calculated Patterson origin peak equal to that of the observed peak. This method gives a good approximation for the scaling problem even if only low resolution data are available, where other methods do not work. Scaling by Patterson is also useful for the Cross Rotation Function, where the model and the unknown structure have different cells.

Low resolution cut-off (Boff)

A low resolution cut-off introduces systematic errors in the electron density, especially near the surface of the model. This is known as the series termination effect. Instead of using the usual low resolution cut-off, MOLREP multiplies the moduli of the structure factors by a special coefficient:

    Fnew = Fold * (1 - exp(-Boff*s^2)),   where Boff = 4*resmin^2

Boff is called the "soft low resolution cut-off". It allows removal of structure factors in this resolution range without introducing the series termination effect.

The use of a priori knowledge of similarity and completeness of the model

For low similarity the high resolution reflections are weighted down.
For this, MOLREP uses an additional overall factor Badd:

    Fnew = Fold * exp(-Badd*s^2)

The value of similarity 'SIM' can be from 0.1 to 1.0. It corresponds to Badd: from (B_limit-Boverall) to -Boverall, where B_limit = 80. SIM=1 means normalized F will be used.

For low completeness, e.g. when there are several molecules in the a.u., the contribution of low resolution reflections is weighted down. To manage the completeness of the model, MOLREP uses a low resolution cut-off (Boff). Completeness of the model 'COMPL' can be from 0.2 to 1.0. It corresponds to Boff: from 400 to 1600.

Functions of electron density searching (SM)

We suggest a new approach that divides a phased six-dimensional search into three steps:

    1. A spherically averaged translation function is used to locate the position of a molecule or its fragment. It compares locally spherically averaged experimental electron density with that calculated from the model and tabulates highly probable positions accordingly.
    2. Then, for each position, a local phased rotation function is used to find the orientation of the molecule.
    3. The third step is the phased translation function, used to check and refine the found position.

Spherically averaged phased translation function (SAPTF)

SAPTF gives the expected position of a model in an electron density map by comparing the spherically averaged density of the model with the locally spherically averaged observed density.

    SAPTF(s) = Int_rad(s) Robs(r,s) * Rcalc(r) dr

where
    Int_rad(s)  integral inside a sphere centred on point s of the electron density with radius = rad (i.e. the cut-off radius)
    Robs        observed electron density, spherically averaged around point s
    Rcalc       calculated electron density for the model, spherically averaged around the origin of the coordinate system

Phased Rotation function (PRF)

PRF gives the orientation of a model placed at some point of the electron density.

    PROT(O) = Int_rad(s) Robs(r) * Rcalc(O,r) dr

where
    O           operator of rotation
    Int_rad(s)  integral inside a sphere centred on point s of the electron density with radius = rad
    Robs        observed electron density
    Rcalc       calculated electron density for the rotated (O) model

Phased Translation function (PTF)

Translation search in an electron density map.

    PTR(s) = Int_cell Robs(r) * Rcalc(s,r) dr

where
    s           vector of translation
    Int_cell    integral over the cell
    Robs        observed electron density
    Rcalc(s,r)  calculated electron density for the model placed at the vector s

Fitting two models (FM)

Fitting through electron density. The second model (MODEL_2) is the target model, which is converted to electron density. To search for the best overlap of the electron densities of the models there are two algorithms:

    1. Rotation Function (Patterson) and Phased Translation Function (electron density).
    2. All functions for electron density. The Spherically Averaged Phased Translation Function gives the expected position for the model. The Phased Rotation Function for the expected position gives the orientation. The Phased Translation Function checks and refines the translation vector.

Special Translation Function (STF) for dyad search

Multi-copy search

Search for two copies of a model simultaneously. There are three stages to this:

    1. Rotation function. The program checks all pairs of the first NP peaks of the Rotation Function (RF). For each pair the program uses the first rotation to prepare model-1. Model-2 is prepared by using the second rotation and one rotation from the crystallographic symmetry operators.
    2. Next, for the current pair (model-1 and model-2), MOLREP computes the Special Translation Function (STF) to find the inter-molecular vector of this dyad.
    3. For NPT peaks of the previous Special Translation Function (STF) (i.e. for NPT inter-molecular vectors) the program computes a standard Translation Function (TF) using the current dyad as model and calculates a Correlation Coefficient for the first NPTD peaks of TF.

Special Translation Function (STF)

Imagine two models in the asymm. part of the unit cell:

    F1(h)  structure factor of model_1 with the centre of gravity at the origin of the coord. system
    F2(h)  structure factor of model_2

Let
    S1  vector in the unit cell from the origin of the coord. system to the centre of gravity of model_1
    S2  the same vector for model_2

When F(h) is the total structure factor (for the whole crystal structure):

    F(h) = F1(h)*exp(-2*pi*i*h*S1) + F2(h)*exp(-2*pi*i*h*S2)

Then the Patterson is:

    P(h) = F(h)*F'(h)
         = F1(h)*F1'(h) + F1'(h)*F2(h)*exp(-2*pi*i*h*(S2-S1))
           + F2'(h)*F2(h) + F1(h)*F2'(h)*exp(-2*pi*i*h*(S1-S2))
         = P0(0) + P1(S2-S1) + P1(S1-S2)

The Special Translation Function is a Phased TF with a Patterson function as electron density and P1 = F1'(h)*F2(h) as structure factors of the model. The solution of this function is the dyad vector S1-S2.

Anisotropic correction and scaling

Aniso correction:

For structure factors we can estimate:

    1. isotropic B_overall:
           F(s) ~ Scale_overall * exp(-B_overall*s^2)
    2. anisotropic B_overall (tensor):
           F(s) ~ Scale_overall * exp(-(B11*astar*astar*h*h + 2*B12*astar*bstar*h*k + ...))

where astar, bstar, ... are the reciprocal cell lengths.

Aniso correction means making the data isotropic with B_overall:

    F_new(s) = F_old(s) * exp(+(B11*astar*astar*h*h + 2*B12*astar*bstar*h*k + ...)) * exp(-B_overall*s^2)

Aniso scaling:

    Fnew = Scale * Fold * exp(-(B11*astar*astar*h*h + 2*B12*astar*bstar*h*k + ...))

Scale and aniso B are found by minimization of: sum(|Fobs - Fnew|)
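
The packing function defined earlier in this section is an overlap integral between the densities of symmetry-related copies of the model. The toy below shows only that idea, on a one-dimensional periodic grid with a single pair of operators (identity and the inversion x -> -x) and a box-shaped "molecule". All names and sizes are hypothetical. It is not MOLREP's algorithm, which works in reciprocal space with FFT, and it does not show how the overlap is converted into the factor that multiplies the translation function, since the text above does not describe that step.

```python
def model_density(n, centre, half_width):
    """Box-shaped 1D 'molecule' on an n-point periodic grid (toy model)."""
    return [
        1.0 if min((x - centre) % n, (centre - x) % n) <= half_width else 0.0
        for x in range(n)
    ]


def inversion_mate(rho):
    """Density of the copy generated by the symmetry operator x -> -x."""
    n = len(rho)
    return [rho[(-x) % n] for x in range(n)]


def packing_overlap(n, shift, half_width=3):
    """Overlap Sum_x Ro_1(x) * Ro_2(x) for one pair of operators.

    Ro_1 is the model placed at 'shift', Ro_2 its inversion mate.
    A large value means the two copies clash at this position.
    """
    rho_1 = model_density(n, shift, half_width)
    rho_2 = inversion_mate(rho_1)
    return sum(a * b for a, b in zip(rho_1, rho_2))


if __name__ == "__main__":
    n = 64
    for s in (0, 2, 8, 16, 32):
        print(f"shift = {s:2d}   overlap = {packing_overlap(n, s):.0f}")
```

In this toy the overlap is largest when the model sits on an inversion centre (shift 0 or 32) and vanishes once the model is more than its own width away from one, which is the kind of position a packing check should favour.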

COMMAND FILE EXAMPLES

Example of Cross Rotation and Translation functions:

    # --------------------------------
    molrep HKLIN test.mtz MODEL 2sar.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    NP 8
    RAD 27
    ANISO C
    sim .1
    compl .5
    eor

Example of Self Rotation function:

    # --------------------------------
    molrep HKLIN test.mtz << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    #
    RAD 27
    END
    eor

Example using phases. For searching in the electron density map for some model (standard Rotation Function will be used):

    # --------------------------------
    molrep HKLIN test.mtz MODEL mod.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF PH=PH_FO FOM=FOM
    #
    NP 8
    END
    eor

Example of fitting two models:

    # --------------------------------
    molrep MODEL mod.pdb MODEL2 mod2.pdb << eor
    # --------------------------------
    #
    PRF Y
    eor

Example of dyad search:

    # --------------------------------
    molrep HKLIN test.mtz MODEL mod.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    #
    dyad y
    axis 0,10
    dist 0,300,300
    NPT 3
    NPTD 3
    eor

Example of dimer search:

    # --------------------------------
    molrep HKLIN test.mtz MODEL mod.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    #
    dyad y
    axis 180,10
    dist 0,300,1
    NPT 3
    NPTD 3
    eor

Example of dimer search for Self-RF orientations:

    # --------------------------------
    molrep HKLIN test.mtz MODEL mod.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    #
    dyad y
    axis 180,10
    dist 0,300,1
    NSRF 20
    NPT 3
    NPTD 3
    FILE_TSR srf.tab
    eor

Example of using a sequence file:

    # --------------------------------
    molrep HKLIN test.mtz MODEL mod.pdb << eor
    # --------------------------------
    #
    LABIN F=F SIGF=SIGF
    #
    NP 8
    NMON 2
    FILE_S new.seq
    sim .1
    compl .5
    eor

Convention for rotation

Rotation by Eulerian angles Alpha, Beta, Gamma:

    1. A( Z )  - alpha around axis Z
    2. B( Y')  - beta around new axis Y
    3. G( Z')  - gamma around new axis Z

Rotation by Polar angles Theta, Phi, Chi:

    Theta, Phi are the polar coordinates of the rotation axis:
    Theta  - angle between the rotation axis and Z
    Phi    - angle in the XY plane between X and the projection of the rotation axis
    Chi    - rotation angle around the rotation axis

Convention for Orthonormal coordinate system

Orthonormal axes are defined to have: A parallel to X, Cstar parallel to Z

MEMORY CONTROL PARAMETERS

In main_molrep_mtz.f:

    CC --- MEMORY - common memory for maps and coordinates
          PARAMETER ( MEMORY =4000000 )
    CC --- NCRDMAX - maximal number of coordinates
          PARAMETER ( NCRDMAX = 100000 )
    CC --- IPRSYM - maximal number of symmetry operators
          PARAMETER ( IPRSYM=96 )
          INTEGER*2 ISYM(5,3,IPRSYM)
          PARAMETER ( MEM = MEMORY/2 )
          REAL*8 POOL(MEM)

If the program stops with the message "ERROR: not memory enough", change parameter MEMORY in main_molrep_mtz.f
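
The two rotation conventions listed under "Convention for rotation" can be written out as matrices. The sketch below is a hedged illustration, not code taken from MOLREP: it assumes right-handed rotations acting on column vectors (active rotations), which the text does not specify, and the function names are hypothetical. Under that assumption the Eulerian rotation about Z, new Y and new Z is the product Rz(alpha) * Ry(beta) * Rz(gamma), and the polar rotation is a turn by Chi about the axis with direction (sin Theta cos Phi, sin Theta sin Phi, cos Theta).

```python
import math


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_to_matrix(alpha, beta, gamma):
    """Z, new Y, new Z rotation (degrees). ASSUMPTION: active, right-handed."""
    a, b, g = (math.radians(v) for v in (alpha, beta, gamma))
    return matmul(matmul(rot_z(a), rot_y(b)), rot_z(g))


def polar_to_matrix(theta, phi, chi):
    """Rotation by chi about the axis with polar coordinates theta, phi (degrees).

    Uses the Rodrigues formula. ASSUMPTION: active, right-handed rotation.
    """
    t, p, c = (math.radians(v) for v in (theta, phi, chi))
    x = math.sin(t) * math.cos(p)
    y = math.sin(t) * math.sin(p)
    z = math.cos(t)
    cc, sc = math.cos(c), math.sin(c)
    v = 1.0 - cc
    return [
        [cc + x * x * v, x * y * v - z * sc, x * z * v + y * sc],
        [y * x * v + z * sc, cc + y * y * v, y * z * v - x * sc],
        [z * x * v - y * sc, z * y * v + x * sc, cc + z * z * v],
    ]


def close(m1, m2, tol=1e-9):
    return all(abs(m1[i][j] - m2[i][j]) < tol for i in range(3) for j in range(3))


if __name__ == "__main__":
    # A rotation about Z looks the same in both descriptions.
    print(close(euler_to_matrix(90, 0, 0), polar_to_matrix(0, 0, 90)))
```

A twofold axis along X, for example, is Theta = 90, Phi = 0, Chi = 180 in the polar description, which is the Chi = 180 section plotted by the Self Rotation Function.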