!===================================================================================! ! LIN_SOLVER --- solves S*x=b for a nb-block-band diagonal matrix S with optional ! ! right parameter bands by a band version of Gauss elimination with ! ! partial or full pivoting. ! ! STANDALONE VERSION ! !-----------------------------------------------------------------------------------! ! STATUS: gra 06-VI-98 CREATED: gra 15.IX.97 ! !-----------------------------------------------------------------------------------! ! INPUT: ! ! s --- block nb-diagonal matrix with npar columns at the ! ! right border of the matrix with nequ x nequ blocks. ! ! s(nequ x nb+2 x nequ x 0:nb_rows) : ! ! 1st index: column number in each block ! ! 2nd index: block number, ! ! s(1..npar,nb+1,*,*) for eigenvalue parts ! ! s(1..nrhs,nb+2,*,*) for the rhs parts ! ! 3rd index: equation number in each of the blocks ! ! 4th index: node number, block-row number ! ! (for more details: see the notes below ) ! ! N.B.: s should be zeroed before fill ! ! ! nb_rows --- maximum number of block rows of s (not counting the 0th one). ! ! Expressed in Fortran-like integer arithmetics: ! ! nb_rows >= nodes, ! ! nodes =(neqs-jb1)/nequ+(jb2+nequ-1)/nequ, with ! ! jb2 = mod(jb1+npar,nequ). ! ! To choose nb_rows >= neqs/nequ+1 is always safe. ! ! nb --- bandwidth of matrix s (can be choosen > than the actual ! ! bandwidth of the problem) ! ! neqs --- total number of equations ! ! nequ --- number of equations in each block ! ! npar --- number of parameters (eigenvalues); must be <= nequ ! ! nrhs --- number of right hand sides; must be <= nequ ! ! nbe --- number of equations intended to refer to one of the first ! ! nequ variables (=first block of row) ! ! jb1 --- number of equations in s(*,*,*,0) (can be zero or <= nequ) ! ! debug --- set true if extensive debugging output is desired ! ! full_piv --- set true if full pivoting is desired ! !-----------------------------------------------------------------------------------! ! OUTPUT: ! ! s --- is destroyed on output ! ! sol --- nequ x nb_rows x nrhs solutions at each node ! ! The eigenvalues are stored in sol(*,(neqs-npar)/nequ+1,*). ! ! failed --- true if an error occured ! !-----------------------------------------------------------------------------------! ! CONTAINS: internal subroutine error_check for error detection ! !-----------------------------------------------------------------------------------! ! NOTES: ! ! All real variables are of kind dp (see module precision). ! ! This code solves a nb-block-diagonal matrix with optional npar eigenvalues ! ! which show up as npar right colums of the nequ x nb+2 x nequ x 0:nb_rows ! ! matrix s, which is solved with a Gauss elimination scheme. ! ! o The first dimension specifies the columns in each block. ! ! o The second dimension denotes the sub-, principal, and superdiagonal ! ! blocks of matrix M. Its dimension extends from 1:nb+2. ! ! The eigenvalue part is stored in S(1..npar,nb+1,*,*). ! ! The right-hand sides are contained in S(1..nrhs,nb+2,*,*). ! ! o The third dimension numbers the equations (the rows) in each block ! ! o The fourth dimension counts the node number ! ! The upper(lower) jb1(jb2) equations are stored in s(.,.,.,0) ! ! ( s(.,.,.,nodes) ), leaving the last unused rows empty. The zeroth blocks ! ! can be left empty or filled entirely, giving the user some flexibility. ! ! The eigenvalue dependencies e are collected in the rightmost columns. ! ! The form of the matrix at the boundaries is specified with the parameter ! ! nbe, which counts the number of equations including the first nequ variables! ! (full blocks * nequ + jb1). The form of the lower boundary then follows ! ! from the other input parameters. As an example consider the following: ! ! nb = 3, neqs = 20, nequ = 3, npar = 2, nrhs =2, nbe = 7, jb1 = 1, ! ! nodes = 7, x = any value, r = rhs, V = variable, E = eigenvalue ! ! \ block 1 2 3 4 sol 5=r.h.s. ! ! node ----------------- ------- --------- --- ----- ! ! 0 | x x x | x x x | x x x | jb1 \ | e | e | |V1 |r|r| ! ! ----------------- ------- | --------- |V| ----- ! ! | x x x | x x x | x x x | | | e | e | |V| |r|r| ! ! 1 | x x x | x x x | x x x | nequ | | e | e | --- |r|r| ! ! | x x x | x x x | x x x | |= nbe | e | e | |V| |r|r| ! ! ------------------------- | --------- |V| ----- ! ! 2 | x x x | x x x | x x x | | | e | e | |V| |r|r| ! ! | x x x | x x x | x x x | nequ | | e | e | --- |r|r| ! ! | x x x | x x x | x x x | / | e | e | |V| |r|r| ! ! --------------------------------- --------- |V| ----- ! ! | x x x | x x x | x x x | | e | e | |V| |r|r| ! ! 3 | x x x | x x x | x x x | | e | e | --- |r|r| ! ! | x x x | x x x | x x x | | e | e | |V| |r|r| ! ! --------------------------------- --------- x |V| = ----- ! ! | x x x | x x x | x x x | | e | e | |V| |r|r| ! ! 4 | x x x | x x x | x x x | | e | e | --- |r|r| ! ! | x x x | x x x | x x x | | e | e | |V| |r|r| ! ! ... ----------------------------------------- |V| ----- ! ! | x x x | x x x | x x x | e | e | |V| |r|r| ! ! nodes-2 = 5 nequ | x x x | x x x | x x x | e | e | --- |r|r| ! ! | x x x | x x x | x x x | e | e | |V| |r|r| ! ! --------------------------------- |V| ----- ! ! | x x x | x x x | x x x | e | e | |V| |r|r| ! ! nodes-1 = 6 nequ | x x x | x x x | x x x | e | e | --- |r|r| ! ! | x x x | x x x | x x x | e | e | |E| |r|r| ! ! --------------------------------- --- ----- ! ! nodes = 7 jb2 | x x x | x x x | x x x | e | e | |E| |r|r| ! ! --------------------------------- --- ----- ! ! block 1 2 3 4 5 ! ! ! ! Notice the order of the blocks at the lower boundary, especially if the ! ! bandwidth is larger than the mathematical one. ! ! The matrix is reduced to upper triangular form by Gauss elimination with ! ! extended partial or full pivoting on successive elimination frames of ! ! size nbe x nb+2 and then solved by backsubstitution. Full pivoting is ! ! done on the first nbe x nequ block column. Possible fill-in is handled by ! ! shifting remaining blocks to the left, so no additional memory is required. ! ! Left bands can be simulated by extra variables Vx=V1/2/.., which are used ! ! as parameters, thus leading to additional right bands, and extra equations ! ! Vx=V1/2/.. at the top. ! !-----------------------------------------------------------------------------------! ! AUTHOR: Dr.U.Grabowski, Astronomical Institute Basel, Switzerland. ! ! Any suggestions to improve this code are welcome. ! ! Send mail to gra@astro.unibas.ch. ! !-----------------------------------------------------------------------------------! ! LEGAL: This code can be used and exchanged freely except for ! ! commercial purposes. ! ! There is no guarantee that this code is bug-free. ! !-----------------------------------------------------------------------------------! ! ACKNOWLEDGEMENTS: ! ! This code has been developed as part of the DarkStar Project, ! ! supported by the Swiss National Science Foundation. ! !-----------------------------------------------------------------------------------! ! MAINTENANCE HISTORY: ! ! gra 6. VI.98: variable debug format; minor edits ! ! gra 23. IV.98: no access to sol(:,npar_node,:) if npar=0 ! ! gra 18. XI.97: full pivoting of last block in all cases ! ! gra 10. XI.97: corrected spurious error in storage of last block of solution, ! ! reestablished full pivoting. ! ! gra 6. XI.97: corrected errors ! ! gra 5. XI.97: parameter nrhs ! ! gra 4. XI.97: optimized version ! ! gra 2. XI.97: new input conception ! ! gra 23. X.97: changed dimension of corr, ngrid => nodes ! ! gra 15. IX.97: created from 3-block version ! !===================================================================================! subroutine LIN_SOLVER (s, sol, nb_rows, nb, neqs, nequ, npar, nrhs, nbe, jb1, & & failed, debug, full_piv) USE precision, only : dp, zero, one, two implicit NONE ! --> declaration of constants real (KIND=dp) , parameter :: eps = EPSILON (one) character (LEN=*), parameter :: debug_file = "lin_solver.debug" ! --> declaration of formal parameters integer, intent(IN) :: nb_rows, nb, nequ, neqs, npar, nrhs, nbe, jb1 logical, intent(IN) :: debug, full_piv real (KIND=dp), intent(INOUT) :: s(nequ,nb+2,nequ,0:nb_rows) real (KIND=dp), intent(OUT) :: sol(nequ,nb_rows,nrhs) logical, intent(OUT) :: failed ! --> declaration of variables real (KIND=dp) :: sh((nb+2)*nequ,(nb+2)*nequ) real (KIND=dp) :: shh((nb+2)*nequ),ev(npar+nequ,nrhs) real (KIND=dp) :: p_scale((nb+2)*nequ), epsp((nb+2)*nequ), pivot integer :: ind_p(2), ind_c((nb+2)*nequ), nib, lf, kl, next_free_unit integer :: nodes, inodes, npar_node, jb2, last_frame, kp, kp1, io_debug integer :: last_rows, n_diag, i_rows, kcol, nb1, i, j, k logical :: error_checked = .false., full character (LEN=18) :: c_format="(1p, (1x,E11.3:))" save error_checked, io_debug, c_format ! --> global settings of computable matrix parameters ! --> nib : initial blockrows except boundary blockrow in S(:,:,:,0) ! --> last_rows : number of rows in last elimination block ! --> jb2 : possible excess lines in last blockrow (if not completely filled) ! --> inodes : completely filled nodes except upper boundary in S(:,:,:,0) ! --> nodes : all nodes except upper boundary in S(:,:,:,0) ! --> npar_node : node where eigenvalues start (always the last npar elements in sol) nib = (nbe-jb1)/nequ last_rows = nb*nequ+npar jb2 = mod(nequ-jb1+npar,nequ) inodes = (neqs-jb1)/nequ nodes = inodes+(jb2+nequ-1)/nequ npar_node = (neqs-npar)/nequ+1 last_frame = inodes-(last_rows-jb1)/nequ+1 ! print*,'nb_rows,nb, neqs, nequ, npar, nbe, jb1:' ! print*,nb_rows,nb, neqs, nequ, npar, nbe, jb1 ! print*,'nib,last_rows,jb2,inodes,nodes,last_frame:' ! print*,nib,last_rows,jb2,inodes,nodes,last_frame ! --> error checking if (.not. error_checked) then call error_check error_checked = .true. endif full = full_piv ! --> forward loop over elimination frames frame_loop: do kp=1,last_frame ! --> copy nbe x (nb+2)*nequ submatrix from s to sh ! --> diagonalize first nequ x nequ block of submatrix sh and ! --> eliminate (nib-1)*nequ+jb1 x nequ block below; pivoting includes ! --> all blocks to be reduced (generates max. nb-1 blocks of fill-in); ! --> store first nequ remaining rows to s (aligned to the left) ! --> get further frames partly from sh of last step if (kp == last_frame) full = .true. ! ! -----> first jb1 rows ! nb1 = (nb+2)*nequ if (kp == 1) then if (jb1 /= 0) sh (:,1:jb1) = RESHAPE( s(:,:,1:jb1,0), (/ nb1,jb1 /) ) sh (:,jb1+1:jb1+nib*nequ) = RESHAPE( s(:,:,:,1:nib), (/ nb1,nib*nequ /) ) else sh (1:(nb-1)*nequ,1:nbe-nequ) = sh(nequ+1:nb*nequ,nequ+1:nbe) sh ((nb-1)*nequ+1:nb*nequ,1:nbe-nequ) = zero sh (nb*nequ+1:(nb+2)*nequ,1:nbe-nequ) = sh(nb*nequ+1:(nb+2)*nequ,nequ+1:nbe) sh(:,nbe-nequ+1:nbe) = RESHAPE( s(:,:,:,kp+nib-1), (/ (nb+2)*nequ, nequ /) ) if (kp == last_frame) then lf = last_rows+mod(nequ-jb2,nequ)-nbe sh (:,nbe+1:nbe+lf) = RESHAPE( s(:,:,:,kp+nib:nodes), (/ nb1,lf /) ) endif endif ! --> now we can diagonalize and reduce first block column of sh : ! -----> find scale factors for implicit pivoting in first block column ! -----> reduce rows in column ! ind_sc = 0 i_rows = nib*nequ+jb1 if (kp == last_frame) i_rows = last_rows if (debug) then if (kp == 1) then io_debug = next_free_unit(11) open (io_debug,file=debug_file, access='SEQUENTIAL', form='FORMATTED') if (nequ*(nb+2) > 99) write (*,"(a)") "### LIN_SOLVER WARNING: debug output truncated !" write (c_format(5:6),"(i2)") nequ*(nb+2) endif write (io_debug,*) write (io_debug,*) '***** Point ',kp do j=1,i_rows write(io_debug,c_format) ( sh(i,j), i=1,nb1) enddo endif ! !!!### implicit pivoting appears to be inappropriate for our problem ! do i=1,i_rows ! ind_sc = MAXLOC(ABS(sh(1:nb*nequ,i))) ! p_scale(i) = ABS(sh(ind_sc(1),i)) ! if (p_scale(i) == zero) p_scale(i) = one ! enddo ! -----> epsp is minimum precision possible with given order of magnitude ! epsp(1:i_rows) = two*eps*p_scale(1:i_rows) ! if (debug) write (io_debug,*) 'epsp=',epsp(1:i_rows) ! --> loop over columns of first block column n_diag = nequ if (kp == last_frame) n_diag = i_rows ind_c(1:n_diag) = (/ (i,i=1,n_diag) /) diag_loop: do k=1,n_diag ! -----> exchange rows according to largest elements in successive columns ! -----> for explicit extended partial or full pivoting kcol = k if (full) kcol = n_diag ind_p = MAXLOC(abs(sh(k:kcol,k:i_rows)))+k-1 if (ind_p(2) /= k .or. (full .and. ind_p(1) /= k)) then shh = sh(:,ind_p(2)) sh(:,ind_p(2)) = sh(:,k) sh(:,k) = shh pivot = shh(ind_p(1)) else pivot = sh(k,k) endif ! -----> in case of full pivoting exchange columns if (full .and. ind_p(1) /= k) then shh(1:i_rows) = sh(k,1:i_rows) sh(k,1:i_rows) = sh(ind_p(1),1:i_rows) sh(ind_p(1),1:i_rows) = shh(1:i_rows) ind_c(k) = ind_p(1) endif ! shh(1) = epsp(k) ! epsp(k) = epsp(ind_p(2)) ! epsp(ind_p(2)) = shh(1) if (pivot == zero) then write (*,'(a,2i5)') '### Error in LIN_SOLVER: zero pivot, frame/line=',kp,k failed = .true. return endif ! -----> normalize pivot row sh(k:,k) = sh(k:,k)/pivot if (debug ) then write(io_debug,*) write(io_debug,*) 'Before reduction:' do j=1,i_rows write(io_debug,c_format) ( sh(i,j), i=1,nb1) enddo endif ! -----> reduce elements of column except pivot ! -----> (zero out elements dominated by numerical errors) !!!### change of : to kp1: is slower kp1 = k+1 do i=1,i_rows if (i == k) cycle sh(:,i) = sh(:,i)-sh(k,i)*sh(:,k) ! sh(kp1:,i) = sh(kp1:,i)-sh(k,i)*sh(kp1:,k) ! where (sh(k:,i) < epsp(i)) sh(k:,i) = zero enddo if (debug) then write (io_debug,*) write (io_debug,*) 'After reduction:' do j=1,i_rows write(io_debug,c_format) ( sh(i,j), i=1,nb1) enddo endif enddo diag_loop ! ! -----> in case of full pivoting, descramble columns and rows ! if (full) then do i=n_diag,1,-1 if (ind_c(i) /= i) then shh(1:i_rows) = sh(i,1:i_rows) sh(i,1:i_rows) = sh(ind_c(i),1:i_rows) sh(ind_c(i),1:i_rows) = shh(1:i_rows) endif enddo do i=n_diag,1,-1 if (ind_c(i) /= i) then shh = sh(:,i) sh(:,i) = sh(:,ind_c(i)) sh(:,ind_c(i)) = shh endif enddo endif if (debug) then write (io_debug,*) write (io_debug,*) '##### Point ',kp do j=1,i_rows write(io_debug,c_format) ( sh(i,j), i=1,nb1) enddo if (kp == last_frame) close (io_debug,status='KEEP') endif ! --> store remaining nontrivial blocks (1:nequ,1:nb+2 = parameters and rhs) ! --> filling s starting at the leftmost block ! --> last block need not to be stored if (kp < last_frame) then if (jb1 /= 0) then if (kp == 1) s(:,:,1:nequ-jb1,0) = zero s (:, 1:nb-1 ,nequ-jb1+1:nequ, kp-1) = RESHAPE( sh( nequ+1: nb*nequ ,1:jb1), & & (/nequ,nb-1,jb1/) ) s (:, nb ,nequ-jb1+1:nequ, kp-1) = zero s (:,nb+1:nb+2,nequ-jb1+1:nequ, kp-1) = RESHAPE( sh(nb*nequ+1:(nb+2)*nequ,1:jb1), & & (/nequ, 2 ,jb1/) ) endif s (:, 1:nb-1 ,1:nequ-jb1,kp) = RESHAPE( sh( nequ+1: nb*nequ ,jb1+1:nequ), & & (/nequ,nb-1,nequ-jb1/) ) s (:, nb ,1:nequ-jb1,kp) = zero s (:,nb+1:nb+2,1:nequ-jb1,kp) = RESHAPE( sh(nb*nequ+1:(nb+2)*nequ,jb1+1:nequ), & & (/nequ, 2 ,nequ-jb1/) ) endif ! --> end of big loop end do frame_loop ! --> now we are ready for backsubstitution : ! --> last filled sh block holds solutions for the last_rows variables sol(:,nodes:,:) = zero if (npar /= 0) then ev = zero do j=1,nrhs ev(1:npar,j) = sh((nb+1)*nequ+j,last_rows-npar+1:last_rows) sol(1:npar,npar_node,j) = ev(1:npar,j) enddo endif lf = last_rows-npar kl = npar_node-1 do j=1,nrhs sol (:,last_frame:kl,j) = RESHAPE( sh((nb+1)*nequ+j,1:lf), (/nequ,kl-last_frame+1/) ) enddo ! --> loop up in s to compute the remaining variables do j=1,nrhs do kp=last_frame-1,1,-1 do i=1,nequ-jb1 sol (i+jb1,kp,j) = s(j,nb+2,i,kp) - & & SUM( s( : ,1:nb-1,i,kp) * sol(:,kp+1:kp+nb-1,j) ) - & & SUM( s(1:npar, nb+1 ,i,kp) * ev(1:npar,j) ) enddo do i=1,jb1 sol (i,kp,j) = s(j,nb+2,nequ-jb1+i,kp-1) - & & SUM( s( : ,1:nb-1,nequ-jb1+i,kp-1) * sol(:,kp+1:kp+nb-1,j) ) - & & SUM( s(1:npar, nb+1 ,nequ-jb1+i,kp-1) * ev(1:npar,j) ) enddo enddo enddo ! --> exit failed = .false. return !---------------------------------------------------------------------------------------------- CONTAINS !===================================================================================! ! ERROR_CHECK --- check against input errors ! !-----------------------------------------------------------------------------------! ! STATUS: gra 02-II-98 CREATED: gra 15.IX.97 ! !-----------------------------------------------------------------------------------! ! MAINTENANCE HISTORY: ! ! gra 2. II.98: corrected nb_rows check ! ! gra 10. XI.97: added matrix size check ! ! gra 2. XI.97: new input conception ! !===================================================================================! subroutine error_check implicit NONE if (size(s,1) /= nequ .or. size(s,2) /= nb+2 .or. & & size(s,3) /= nequ .or. size(s,4) /= nb_rows+1) then write (*,'(a)') '### ERROR in LIN_SOLVER: s and parameters are incompatible' write (*,'(a,3i5)') ' nequ, nb+2, nb_rows=', nequ, nb+2, nb_rows write (*,'(a)') ' HALTED ###' stop else if (nib < 1 ) then write (*,'(a)') '### ERROR in LIN_SOLVER: No first block' write (*,'(a,3i3)') ' nbe, nequ, npar=', nbe, nequ, npar write (*,'(a)') ' HALTED ###' stop else if (jb1 > nequ) then write (*,'(a)') '### ERROR in LIN_SOLVER: jb1 > nequ' write (*,'(a,3i3)') ' jb1, nequ =', jb1, nequ write (*,'(a)') ' HALTED ###' stop else if (inodes*nequ+jb1+jb2 /= neqs) then write (*,'(a)') '### ERROR in LIN_SOLVER: neqs,nequ,npar,jb1 inconsistent !' write (*,'(a,4i5)') ' neqs,nequ,npar,jb1 =', neqs,nequ,npar,jb1 write (*,'(a)') ' HALTED ###' stop else if (nodes < 2*nb) then write (*,'(a,i5)') '### ERROR in LIN_SOLVER: nodes = ',nodes write (*,'(a,i3,a)') ' specify at least ',2*nb,' points !' write (*,'(a)') ' HALTED ###' stop else if (nbe > nb*nequ+jb1) then write (*,'(a,i3)') '### ERROR in LIN_SOLVER: nbe = ',nbe write (*,'(a,i3,a)') ' specify max. ',nb*nequ+jb1,' initial equations !' write (*,'(a)') ' HALTED ###' stop else if (npar > nequ) then write (*,'(a,i3)') '### ERROR in LIN_SOLVER: npar > ',nequ write (*,'(a,i3,a)') ' specify max. ',nequ,' parameters !' write (*,'(a)') ' HALTED ###' stop else if (nb_rows < nodes) then write (*,'(a,i5,a)') '### ERROR in LIN_SOLVER: nb_rows=',nb_rows,' too small' write (*,'(a,i5,a)') ' specify min. ',nodes+1,' rows !' write (*,'(a)') ' HALTED ###' stop endif return end subroutine error_check end subroutine LIN_SOLVER !===================================================================================! ! NEXT_FREE_UNIT --- look for a free unit number for opening a file connection ! !-----------------------------------------------------------------------------------! ! STATUS: gra 31-I-98 CREATED: gra 31.I.98 ! !-----------------------------------------------------------------------------------! ! INPUT: iu_requested --- requested file unit ! !-----------------------------------------------------------------------------------! ! RESULT: next_free_unit --- unit assigned ! !-----------------------------------------------------------------------------------! ! MAINTENANCE HISTORY: ! !===================================================================================! function NEXT_FREE_UNIT (iu_requested) RESULT (iu_free) implicit NONE integer, intent (IN) :: iu_requested integer :: iu_free logical :: is_open iu_free = iu_requested-1 is_open = .true. do while (is_open) iu_free = iu_free+1 inquire (unit=iu_free, opened=is_open) enddo end function NEXT_FREE_UNIT