pghpf Version 2.1
Release Notes
The Portland Group, Inc.
9150 SW Pioneer Court, Suite H
Wilsonville, Oregon 97070
While every precaution has been taken in the preparation of this document, The Portland Group, Inc. makes no warranty for the use of its products and assumes no responsibility for any errors which may appear, or for damages resulting from the use of the information contained herein. The Portland Group, Inc. retains the right to make changes to this information at any time, without notice. The software described in this document is distributed under license from The Portland Group, Inc and may be used or copied only in accordance with the terms of the license agreement. No part of this document may be reproduced or transmitted in any form or by any means, for any purpose other than the purchaser's personal use without the express written permission of The Portland Group, Inc. Commercial uses are strictly prohibited.
PGI, pghpf, pgf77, pgcc, pgprof, and pgdbg are trademarks of The Portland Group, Inc. Other brands and names are the property of their respective owners.
pghpf Version 2.1 Release Notes
Copyright (c) 1996 The Portland Group, Inc.
All rights reserved.
Printed in the United States of America
Printing History
May 1996: First Printing
Part Number: 2401-990-990-0596
Phone: (503) 682-2806
Fax: (503) 682-2637
e-mail: trs@pgroup.com
Most features of full HPF are included in pghpf 2.1, with the few exceptions noted in these release notes. If you encounter any HPF feature that is not supported, and not listed in Section 3, "Restrictions and Omissions - pghpf 2.1", you should consider it a bug and report it to PGI at the e-mail address trs@pgroup.com.
HPF language features include:
Section 6, "Independent Loops" describes this directive and the new clauses.
USE HPF_LOCAL_LIBRARY
-Mnofree
-Mnofreeform
These two options serve the same function. Using pghpf 2.1, the compiler treats files with a .f90 extension as Fortran 90 files using free source form. Using either of these options specifies fixed source form for files with the .f90 extension. For files with other extensions, for example .F or .hpf, the option -Mfreeform specifies Fortran 90 free source form.
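The source-form rule above can be summarized as a small decision function. This is only an illustrative Python sketch, not part of pghpf: the function name `source_form` is hypothetical, and it assumes that files without a .f90 extension default to fixed source form unless -Mfreeform is given.

```python
import os

def source_form(filename, options=()):
    """Return 'free' or 'fixed' for a source file (hypothetical helper)."""
    ext = os.path.splitext(filename)[1]
    if ext == ".f90":
        # .f90 files are free form unless -Mnofree or -Mnofreeform is given
        if "-Mnofree" in options or "-Mnofreeform" in options:
            return "fixed"
        return "free"
    # Assumption: other extensions are fixed form unless -Mfreeform is given
    return "free" if "-Mfreeform" in options else "fixed"

print(source_form("solver.f90"))                  # free
print(source_form("solver.f90", ["-Mnofree"]))    # fixed
print(source_form("solver.hpf", ["-Mfreeform"]))  # free
```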
-Mnoindependent
Independent DO loop processing has been significantly expanded in this release. This new option disables parallelization associated with INDEPENDENT DO loops.
-Moverlap=size:n
This option allows the programmer to set the size of the overlap shift area created for the overlap shift optimization. By default the size is set to 4. If the programmer wants a different overlap shift area size, either to save memory allocated or to reduce communications when the compiler generates the overlap shift optimization, a size other than 4 can be set. A size of 0 disables the overlap shift optimization. For more details on this option, refer to section 4.1 "Setting the Size of the Overlap Area".
% setenv PGI /usr/pgi
% set path=($PGI/platform/bin $path)
% setenv LM_LICENSE_FILE $PGI/license.dat
You should now be able to compile and run HPF programs as follows:
% pghpf hello.hpf
% a.out options -pghpf pghpf_options
If you wish to link and run with a version of pghpf other than the default for your system, refer to the pghpf User's Guide for more details.
subroutine foo(a,b)
common /c1/ n,m
integer, dimension(n) :: a
integer, dimension(m) :: b
!hpf$ distribute (block) :: a,b
a(:) = b(:)
end
The assignment a(:) = b(:) says that a and b must be equal-sized arrays, since the assignment implies the arrays are conformable. When using either n or m in the declarations for both a and b, the compiler performs additional optimizations, as compared with the code shown above.
DATE (2:5) = DATE(1:4)
subroutine test1(a,b)
integer, dimension(10):: a,b
optional :: a
!hpf$ template t(n)
!hpf$ distribute (block)::t
!hpf$ align a(i) with t(i)
!hpf$ align b(i) with a(i) ! THIS IS A PROBLEM
The code above should be rewritten as:
subroutine test1(a,b)
integer, dimension(10):: a,b
optional :: a
!hpf$ template ta(n)
!hpf$ distribute (block):: ta
!hpf$ align a(i) with ta(i)
!hpf$ template tb(n)
!hpf$ distribute (block):: tb
!hpf$ align b(i) with tb(i) ! THIS IS FINE
integer, pointer :: p
integer, target :: a(10),b(10)
!hpf$ distribute (block) :: a
p => a(1) ! unsupported
p => b(1) ! supported
end
Finally, do not use a pointer dummy variable to declare other variables, such as automatic arrays, using the lbound(), ubound() and size() intrinsics. For example:
subroutine sub(p)
integer, pointer, dimension(:,:) :: p
integer, dimension(lbound(p,1): &
+ ubound(p,1),size(p,2))::a ! does not work
The compiler error messages for the pointer limitations are:
PGHPF-S-0000-Internal error. POINTER common block member not supported
PGHPF-S-0155-DYNAMIC object may not have the POINTER attribute
PGHPF-S-0000-Internal error. POINTER component of derived type not supported
PGHPF-W-0155-Complex TARGET may not be properly aligned
PGHPF-F-0155 scalar POINTER associated with distributed object is unsupported
Runtime Error Message:
POINTER: cyclic distribution of target unsupported
The following Fortran 90 Intrinsics will not work with variables or arrays of derived type:
ALLOCATED(ARRAY)
CSHIFT(ARRAY,SHIFT,DIM)
EOSHIFT(ARRAY,SHIFT,BOUNDARY,DIM)
LBOUND(ARRAY,DIM)
MERGE(TSOURCE,FSOURCE,MASK)
PACK(ARRAY,MASK,VECTOR)
PRESENT(A)
RESHAPE(SOURCE,SHAPE,PAD,ORDER)
SHAPE(SOURCE)
SIZE(ARRAY,DIM)
SPREAD(SOURCE,DIM,NCOPIES)
TRANSFER(SOURCE,MOLD,SIZE)
TRANSPOSE(MATRIX)
UBOUND(ARRAY,DIM)
UNPACK(VECTOR,MASK,FIELD)
The following HPF library and intrinsic procedures will not work with variables or arrays of derived type:
COPY_PREFIX()
COPY_SCATTER()
COPY_SUFFIX()
HPF_ALIGNMENT()
HPF_DISTRIBUTE()
HPF_TEMPLATE()
INTEGER, PARAMETER, DIMENSION(3):: X=(/1,2,3/)
The following will not work:
INTEGER, PARAMETER:: Y=X(1) !WILL NOT WORK
Because of this limitation, named array or structure constants cannot be used in the following places:
Named array or structure constants found in modules cannot be used in the following cases:
The module PUBLIC/PRIVATE access statements cannot reference a CONTAINed subprogram.
A MODULE cannot contain forward references to procedures defined in the same module. For example, the module B below will not work in the current release, while module C will work:
MODULE B
CONTAINS
FUNCTION G
. . .
CALL H
END FUNCTION G
SUBROUTINE H
. . .
END SUBROUTINE H
END MODULE B

MODULE C
CONTAINS
SUBROUTINE H
. . .
END SUBROUTINE H
FUNCTION G
. . .
CALL H
END FUNCTION G
END MODULE C
PGHPF-W-3011-Non-replicated mapping for character/struct/union array, char_table, ignored (file.F: lineno)
%a.out mppexec_opt user_opt -pghpf HPF_opt
The mppexec options are described in the mppexec(1) man page. The -npes mppexec_opt option is required and specifies the number of processors. The number of processors must be a power of 2.
The only supported HPF options are -stat and -np. The HPF -np option may be specified to reduce the number of processors from the value specified by the -npes option. The use of the -np option is not recommended as the unused processors are not available for other uses.
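The processor-count rules for this platform can be checked before launching a job. The following Python sketch is illustrative only: `effective_processors` is a hypothetical helper, not a pghpf or mppexec facility, and it assumes that a -np value may not exceed the -npes value.

```python
def is_power_of_two(n):
    """True when n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0

def effective_processors(npes, np=None):
    """Return the processor count the program would run on (hypothetical helper)."""
    if not is_power_of_two(npes):
        raise ValueError("-npes must be a power of 2")
    if np is None:
        return npes
    # Assumption: -np can only reduce the count given by -npes
    if np < 1 or np > npes:
        raise ValueError("-np must be between 1 and the -npes value")
    return np

print(effective_processors(8))      # 8
print(effective_processors(8, 4))   # 4; the other 4 processors sit idle
```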
%pghpf -Ojump file1.hpf
The -Ojump switch will pass "-Wf,-ojump" to the T3D Fortran 77 compiler and link a version of the runtime library compiled with -h jump. See the documentation on -h jump for the T3D C compiler for more details.
%mpirun -np numberofprocs a.out user_opt -pghpf HPF_opt
The only supported HPF option is -stat. The HPF -np option is not supported.
For the IBM SP2, the MPL communications library is also available. To use the MPL library, include the option -Mmpl on the compiler command line (this library is loaded when linking occurs).
set path=($path $PGI/rs6000/bin)
In pghpf 2.1, the directory structure for rs6000 workstations is as follows:
set path=($path $PGI/rs6000/bin)
And for SP2 systems it is:
set path=($path $PGI/sp2/bin)
ERROR 104: GOT page/offset relocation out of range: x.o
ERROR 104: GOT page/offset relocation out of range: x.o
where x.o is one of the object files being linked. This problem should not occur with the version of the linker included in IRIX 6.1. No known work-around is available.
The directory $PGI/patches/ contains a patch script and a README file with the latest information for changing an installation for the different MPI versions.
With the patches for SGI MPI 2.0 installed, every link will give the following warnings:
ld64: WARNING 85: definition of atexit in /usr/pgi/pcxl/lib/mips4/lib...
ld64: WARNING 85: definition of exit in /usr/pgi/pcxl/lib/mips4/libpg...
These warning messages can be ignored.
Too many HIPPI messages in input queue without matching receives.
MPI can hold only 64 such unexpected messages per process at a time. The environment variable MPIRUN_UNEX may be used to increase this limit. Setting MPIRUN_UNEX to 1024 should work for most codes.
The Convex Exemplar PVM runtime implementation has a limited buffer size. This may cause compiled programs to fail. The buffer size can be increased. Refer to your system administrator or the bugs section in the PVM Readme.mp file for more information.
When compiling extrinsic routines, the Fortran 77 compiler option +ppu should be used. This option appends underscores at the end of definitions of and references to externally visible symbols. Since the caller appends underscores for extrinsic names, the callee extrinsic needs this option when it is compiled.
For example:
setenv PARAGON_XDEV /usr/local
The environment variable PGI needs to be set:
setenv PGI /usr/local/paragon/pgi
Then two elements need to be added to the path:
set path=($PARAGON_XDEV/paragon/bin.arch \
$PGI/pgon/bin.arch $path)
where arch is the architecture on which the compilation is performed. Choices for arch include: sgi, solaris, and sun4.
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
A = CSHIFT(B, DIM=1, SHIFT=2)

INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
FORALL(I=1:N,J=1:N) A(I,J) = B(J,I)

INTEGER, DIMENSION(N,N) :: A,B,C
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B,C
FORALL(I=1:N,J=1:N) A(C(1,I),C(2,J)) = B(J,I)

INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B

INTEGER, DIMENSION(N,N) :: A,B,C,D
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B,D
!HPF$ DISTRIBUTE (CYCLIC,CYCLIC) :: C
A(:,:) = C(:,:)
B(:,:) = C(:,:)

A = B(V)
C = D(V)

A = B
C = D

DO I = 1,N
A(I) = B(1) + A(I)*2
CALL FOO(A(I))
END DO
0: ALLOCATE: xxxx bytes requested; not enough memory
The new overlap option is available for such cases. Use -Moverlap as follows:
%pghpf -Moverlap[=size:n]
This option controls the size of the overlap area the compiler generates for certain arrays. In some cases, increasing or decreasing the size of the overlap area may improve a program's performance. The default size is 4. You may want to change the size from the default to improve performance in cases where pghpf generates overlap_shift() communications. For example, in the code:
!hpf$ distribute (block) :: a,b
forall(i=1:n) a(i) = b(i+10)
using the default overlap size of 4, pghpf does not use the overlap shift optimization, since 4 is too small. By increasing the overlap size to 10, pghpf generates overlap shifts.
The compiler only performs overlap shift communications for BLOCK distributed dimensions where an array's shift amount is a compile time constant and is less than the overlap default, or the size specified with -Moverlap.
Reducing the overlap size may also improve performance for some codes. Setting the size to 0 completely disables overlap shifts. If the expressions in a program that use the overlap optimization never use an offset greater than one or two, then specifying an overlap size smaller than the default, for example a value of 2, will reduce memory usage and may reduce communications. For example, the following code shows an expression that would only require an overlap size of 1.
!hpf$ distribute (block) :: a,b
forall(i=1:n) a(i) = b(i-1) + b(i) + b(i+1)
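The relationship between shift amounts and the -Moverlap size can be modeled with a short Python sketch. It is not the compiler's actual algorithm: the function `uses_overlap_shift` is hypothetical, and it assumes that a constant shift qualifies when its magnitude does not exceed the overlap size, which is how the two examples in this section read.

```python
def uses_overlap_shift(offsets, overlap_size=4):
    """Model whether constant shifts fit in the overlap area (hypothetical helper).

    offsets: compile-time constant shift amounts, e.g. [-1, 0, 1] for
             b(i-1) + b(i) + b(i+1) on a BLOCK distributed dimension.
    """
    if overlap_size == 0:
        # A size of 0 disables the overlap shift optimization entirely
        return False
    # Assumption: the largest shift magnitude must not exceed the overlap size
    return max(abs(k) for k in offsets) <= overlap_size

print(uses_overlap_shift([10]))             # False: default size 4 is too small
print(uses_overlap_shift([10], 10))         # True:  -Moverlap=size:10
print(uses_overlap_shift([-1, 0, 1], 1))    # True:  -Moverlap=size:1 suffices
```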
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
FORALL(I=1:N,J=1:N) A(I,J) = B(J,I)
PRINT *, A,B
END
Variables used in namelist groups (NAMELIST) may not be mapped; the compiler issues a warning message if an attempt is made to map a variable in a namelist group:
PGHPF-W-0311-Non-replicated mapping for namelist array, name, ignored (test1.hpf:4)
Input and output are currently serialized. One processor reads or writes the data and sends or receives it to or from the other processors owning the data.
There are two methods pghpf uses to perform I/O, depending upon the data items being read or written. For example, assuming a and b are arrays, the command:
read(...) (a(i),b(i),i=1,1000)
will not be very efficient running on anything other than a small number of processors. All of a and b are read by a single processor and then broadcast to all nodes.
read(...) a(1),a(3),a(5),...
The code above reads a list of scalars. This is not the most efficient I/O for pghpf.
read(...) a,b
The example above should perform better: a single processor still reads all data, but it only sends the parts of arrays a and b that each node requires. The example below will generate similar code:
read(...) (a(i),i=1,1000)
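The cost difference between the two I/O methods can be estimated with a rough Python model. This is a sketch under stated assumptions, not a measurement of pghpf: the helper names are hypothetical, the array is assumed to be BLOCK distributed, and processor 0 is assumed to be the one that reads.

```python
def block_bounds(n, nprocs):
    """Split indices 0..n-1 into contiguous blocks, one per processor."""
    size = -(-n // nprocs)  # ceiling division
    return [(min(n, p * size), min(n, (p + 1) * size)) for p in range(nprocs)]

def elements_sent(n, nprocs, whole_array):
    """Count elements the reading processor sends to the other processors."""
    if whole_array:
        # read(...) a style: each node receives only the block it owns
        return sum(hi - lo for lo, hi in block_bounds(n, nprocs)[1:])
    # element-by-element style: every value is broadcast to every other node
    return n * (nprocs - 1)

print(elements_sent(1000, 4, whole_array=True))   # 750
print(elements_sent(1000, 4, whole_array=False))  # 3000
```

Under this model the broadcast method grows with the number of processors, while the whole-array method never sends more than the array size.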
An INDEPENDENT DO loop is designated by the programmer by preceding it with the INDEPENDENT directive. For example:
!HPF$ INDEPENDENT
The compiler accepts the above, or any of the standard HPF directive prefixes, as well as additional INDEPENDENT clauses (section 6.2 "INDEPENDENT Clauses").
No command-line switches are needed to invoke parallelization of INDEPENDENT loops. The -Mnoindependent switch is available to inhibit parallelization of all INDEPENDENT loops. The -Minfo command-line switch reports which loops have been parallelized.
At present, only INDEPENDENT loops with Fortran-77 constructs can be parallelized. In particular, the presence of array assignments, WHERE statements, FORALL statements, and ALLOCATE statements will eliminate loops from consideration for parallelization. INDEPENDENT loops can be nested, currently to a depth of seven loops, but there can be at most one INDEPENDENT loop directly nested within another INDEPENDENT loop. For example, the following loop nest will not be parallelized since two independent loops are present at the same level.
!HPF$ INDEPENDENT
DO i = 1, n
!HPF$ INDEPENDENT
DO 10 j = 1, m
10 A(j,i) = (j-1) * n + i
!HPF$ INDEPENDENT
DO 20 k = m, 1, -1
20 B(k,i) = A(m-k+1,i)
ENDDO
This restriction has been added to ensure that a unique home array can be found for the entire INDEPENDENT loop nest (see section 6.2 "The On Home Clause" for a discussion of home arrays). For the same reason, trip counts and strides for non-outermost INDEPENDENT loops must be invariant with respect to the entire loop nest.
There are additional cases where INDEPENDENT loops are not parallelized or are only parallelized if an INDEPENDENT clause is used (refer to the following section for a description of INDEPENDENT clauses). To describe these cases, we must first define several terms. Each INDEPENDENT DO loop defines an INDEPENDENT index, which is the DO loop's index. In processing INDEPENDENT loops, the compiler will replicate those variables that do not contain subscripts that are functions of INDEPENDENT indices. As a degenerate case, all scalars will be replicated. Variables that the compiler replicates may originally be distributed. To perform parallelization, the compiler will create replicated copies. The resulting variables are compiler-replicated.
The compiler must ensure that values of compiler-replicated variables will be identical across all processors. If a compiler-replicated variable can be modified within an INDEPENDENT loop, and is used outside the loop, the loop will not be parallelized.
Modifications to compiler-replicated variables can be made through assignment statements, or through procedure calls. Any modification to a compiler replicated variable disables parallelization of the INDEPENDENT loop unless NEW or REDUCTION clauses are specified for the modified variable or there are no uses (refer to section 6.2). The presence of INTERFACE blocks for procedures describing the INTENTs of parameters will help the compiler to identify variables that are not modified across procedure calls (refer to section 6.3 "Procedure calling").
Uses of variables may be explicit, and can occur either after the INDEPENDENT loop nest, or within the same loop nest. For example, the following INDEPENDENT loop has a likely programming error because variable j is both read and written on different iterations, violating Bernstein's conditions (refer to page 193 of The High Performance Fortran Handbook).
!HPF$ INDEPENDENT
DO 10 i = 1, n
10 j = j + A(i)
Implicit uses of variables arise either because the variables exist in COMMON blocks, or because the variables occur as dummy parameters with INTENT INOUT or INTENT OUT.
Another reason that INDEPENDENT loops may not be parallelized is the presence of array aliases: there may be distinct array references, where at least one reference is a store, that refer to the same array locations on certain iterations. When the compiler must copy programmer-defined arrays to compiler-created arrays and array aliasing arises, the compiler cannot determine how to replace a given array reference. This problem can arise in the following INDEPENDENT loop.
!HPF$ INDEPENDENT
DO i = 1, n
A(J1(I)) = 0
A(J2(I)) = 1
ENDDO
If the first reference to array A is replaced with A$TMP1 and the second is replaced by A$TMP2, the compiler cannot determine which temporary array to copy back to A after the loop.
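The copy-back ambiguity can be reproduced in a simplified Python model. This sketch is an illustration, not the compiler's implementation: the two lists stand in for the temporaries A$TMP1 and A$TMP2, indices are 0-based, and the index vectors `j1` and `j2` are made-up sample data.

```python
def run_with_temporaries(n, j1, j2):
    """Redirect each store to its own temporary, as in the aliasing example."""
    tmp1 = [None] * n   # stands in for A$TMP1 (stores of 0)
    tmp2 = [None] * n   # stands in for A$TMP2 (stores of 1)
    for i in range(len(j1)):
        tmp1[j1[i]] = 0
        tmp2[j2[i]] = 1
    return tmp1, tmp2

def ambiguous_elements(tmp1, tmp2):
    """Elements written in both temporaries: either value could be copied back."""
    return [k for k in range(len(tmp1))
            if tmp1[k] is not None and tmp2[k] is not None]

# Hypothetical index vectors that alias on element 1
j1, j2 = [0, 1], [1, 2]
tmp1, tmp2 = run_with_temporaries(3, j1, j2)
print(ambiguous_elements(tmp1, tmp2))  # [1]
```

When the index vectors never overlap, the list of ambiguous elements is empty, but the compiler generally cannot prove that at compile time.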
INDEPENDENT [, ON HOME ( home-array )]
[, NEW ( var-list )]
[, REDUCTION ( var-list )]
The following sections describe the NEW, ON HOME, and REDUCTION clauses.
The following example demonstrates use of the NEW clause.
!HPF$ INDEPENDENT, NEW (S)
DO I = 1, n
S = SQRT(A(i)**2 + B(i)**2)
C(i) = S
ENDDO
After execution of the INDEPENDENT loop, values of compiler-replicated variables appearing in NEW clauses may be different across different processors, causing errors if these variables are used without intervening assignments.
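Why a NEW variable can differ across processors after the loop is easiest to see in a small simulation. The Python sketch below is a model under assumptions, not pghpf output: it assumes iterations are divided into contiguous blocks, and `run_new_loop` is a hypothetical name.

```python
import math

def run_new_loop(a, b, nprocs):
    """Simulate the NEW (S) loop with one private copy of S per processor."""
    n = len(a)
    size = -(-n // nprocs)      # ceiling division: iterations per processor
    c = [0.0] * n
    last_s = [None] * nprocs    # value of S left on each processor
    for p in range(nprocs):
        for i in range(min(n, p * size), min(n, (p + 1) * size)):
            s = math.sqrt(a[i] ** 2 + b[i] ** 2)
            c[i] = s
            last_s[p] = s
    return c, last_s

c, last_s = run_new_loop([3, 6, 5, 8], [4, 8, 12, 15], nprocs=2)
print(c)       # [5.0, 10.0, 13.0, 17.0]
print(last_s)  # [10.0, 17.0]: S differs between the two processors
```

The array C is correct on every element, but any later read of S without an intervening assignment would see a processor-dependent value.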
The ON HOME clause is optional. If it is not specified, the compiler will select a suitable home array from array references within the INDEPENDENT loop, or will create a home array (without actually allocating space for it).
Each INDEPENDENT index of a loop nest should be a subscript in a mapped dimension of the home array reference in the ON HOME clause. Valid distribution attributes are BLOCK and BLOCK(N). The home-array should reference valid array locations for all values of the INDEPENDENT indices. When a subscript is not an INDEPENDENT index, it can be a triple. The following example demonstrates use of the ON HOME clause.
DIMENSION A(0:n+1,1:m)
!HPF$ DISTRIBUTE A(BLOCK,*)
!HPF$ INDEPENDENT, ON HOME (A(i,:))
DO 1 i = 1, n
1 B(i) = i
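For the ON HOME example, the processor that executes iteration i is the one owning row i of the home array A. The Python sketch below models that mapping, assuming the HPF definition of BLOCK (block size equal to the extent divided by the number of processors, rounded up); the function names and the sample sizes are hypothetical.

```python
def block_owner(i, lower, upper, nprocs):
    """Processor (0-based) owning index i of a BLOCK distributed dimension."""
    extent = upper - lower + 1
    block = -(-extent // nprocs)   # ceiling division
    return (i - lower) // block

def iterations_by_processor(n, nprocs):
    """Group iterations i = 1..n by the owner of A(i,:), where A is A(0:n+1,1:m)."""
    owners = {}
    for i in range(1, n + 1):
        owners.setdefault(block_owner(i, 0, n + 1, nprocs), []).append(i)
    return owners

print(iterations_by_processor(6, 4))
# {0: [1], 1: [2, 3], 2: [4, 5], 3: [6]}
```

Because A has lower bound 0 and upper bound n+1, the first and last processors execute fewer iterations than the others in this model.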
!HPF$ INDEPENDENT, REDUCTION (S)
DO I = 1, n
S = S + A(I)
ENDDO
A reduction statement is an assignment statement in one of the forms below:
A = A + E
A = A * E
A = A .or. E
A = A .and. E
A = A .neqv. E
A = iand(A, E1, ..., En)
A = ior(A, E1, ..., En)
A = ieor(A, E1, ..., En)
A = min(A, E1, ..., En)
A = max(A, E1, ..., En)
In these reduction statements, A is an accumulator appearing in a REDUCTION clause, and expressions E, E1, ..., En do not contain A. The compiler produces statements to perform reductions locally on all processors, then combines all local accumulators globally.
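The local-then-global scheme described for reductions can be sketched in Python as follows. This is a model, not generated compiler code: it assumes contiguous blocks of iterations per processor, starts each accumulator from the operator's identity rather than from a prior value of the accumulator, and uses the hypothetical name `block_reduce`.

```python
def block_reduce(a, nprocs, op=lambda x, y: x + y, identity=0):
    """Reduce locally on each processor, then combine the local accumulators."""
    n = len(a)
    size = -(-n // nprocs)          # ceiling division: elements per processor
    local = []
    for p in range(nprocs):
        acc = identity              # each processor's private accumulator
        for i in range(min(n, p * size), min(n, (p + 1) * size)):
            acc = op(acc, a[i])
        local.append(acc)
    total = identity
    for acc in local:               # global combination step
        total = op(total, acc)
    return local, total

print(block_reduce(list(range(1, 11)), 3))                 # ([10, 26, 19], 55)
print(block_reduce([3, 9, 2, 7], 2, max, float("-inf")))   # ([9, 7], 9)
```

The scheme relies on the operator being associative and commutative, which holds for every form listed above when applied to integers and logicals; floating-point sums may differ slightly from the sequential result because the order of additions changes.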
PGHPF-W-0313-Sub-program x within INDEPENDENT loop not PURE
Every INDEPENDENT loop nest is assigned a home array by the compiler. All array references in an INDEPENDENT loop nest are examined to see if they are aligned with the home array. Array references that are not aligned are replaced with new temporary arrays which are aligned with the home array. The time required to allocate and deallocate new temporary arrays, as well as the time to copy data both to the temporary arrays and then back to the original arrays can be substantial, and is the primary cause of slowdown in performance of INDEPENDENT loops.
The compiler's -Minfo command-line switch informs programmers about the presence of temporary arrays for which performance overhead of array copying may be substantial. In this case, the compiler produces messages such as the following:
14, Independent loop parallelized
expensive communication: all-to-all communication (copy_section)
18, expensive communication: all-to-all communication (copy_section)
The first "expensive communication" message is produced for the copy into a temporary array, and is associated with the first line of the INDEPENDENT loop nest (line 14 in the above message). The second "expensive communication" message is produced for the copy from the temporary array to the original array, and is associated with the last line of the INDEPENDENT loop nest (line 18 in the above message). Small changes to a program can lead to a substantial reduction in the number of temporary arrays created by the compiler. There are two primary strategies that can be followed:
!HPF$ DISTRIBUTE (BLOCK,*) :: A
!HPF$ INDEPENDENT
DO 1 i = 1, m
!HPF$ INDEPENDENT
DO 1 j = 1, n
1 A(i,j) = (i-1) * n + j
For this loop nest, a temporary copy will be created for A. The temporary array will be distributed over both of its dimensions. This temporary can be eliminated in one of two ways:
%pghpf -Mprof=lines -otest1 test_prog.hpf
The profiler pgprof is invoked as follows:
% pgprof [options] [-I srcdir] [-o prog] [datafile]
If invoked without any command-line options or arguments, pgprof looks for the pgprof.out data file and the program source files in the current directory. The program's executable name, as specified when the program was run, is usually stored in the profile data file. If all program-related activity occurs in a single directory, pgprof needs no arguments. If present, the arguments are interpreted as follows:
To prepare an HPF program for debugging, use the -Mg compile-time option to pghpf. The generated Fortran 77 output will be saved and the Fortran 77 node compiler will be invoked with the -g compile-time option to provide symbolic information in the image file. If you wish to execute the program on only a single processor, use the -Mrpm1 compiler option when linking the program (this is not available on all platforms). You can then use a standard debugger on the image file.
If you need to execute the program on multiple processors, the following sequence is useful:
subroutine sub(a)
implicit none
common /c/ n
integer n
character*8 a(n)
end
Compiler Command-line Options
Option Description
-c Stops after assembling (results placed in
filename.o).
-Dname[=val ] Defines a preprocessor macro name with value
val.
-dryrun Show but do not execute all commands created by
the driver.
-E Displays preprocessed HPF file to the standard
output.
-F Saves a preprocessed HPF file in filename.f.
-help Display the complete list of driver options.
-Idirectory Adds a directory directory to the search path
for #include files.
-Ldirectory Adds a directory directory to the search path
for library files.
-llibrary Loads the library, in addition to the standard
libraries.
-O[level] Specifies code optimization at the specified
level.
-ofilename Names the object file filename.
-r4 Interpret DOUBLE PRECISION variables as REAL.
-r8 Interpret REAL variables as DOUBLE PRECISION.
-time Print execution times for the various compiler
steps.
-Uname Undefine a preprocessor macro name.
-V Displays the compiler phase version messages.
-v Displays the compiler, assembler and linker
phase invocation.
-W0,arg Passes arguments arg to the node compiler.
-Wa,arg Passes arguments arg to the assembler.
-Wl,arg Passes arguments arg to the linker.
-Wh,arg Passes arguments arg to the HPF compiler.
-w Do not print warning messages.
pghpf Compiler Options
Option Description
-Mautopar Auto-parallelize Fortran DO loops.
-M[no]backslash Determines how the backslash character is treated
in quoted strings.
-Mcmf Provides limited support for CM Fortran
compatibility.
-Mextract Perform a manual extract phase for procedures
within INDEPENDENT DO loops that are to be
inlined. See the -Minline option.
-M[no]dclchk Determines whether all program variables must be
declared.
-M[no]depchk Compiler checks for potential data dependencies.
-M[no]dlines The compiler treats lines containing "D" in
column 1 as executable statements. With nodlines
the compiler does not treat lines containing "D"
in column 1 as executable statements (it does not
ignore the "D").
-Mextend The compiler accepts 132-column source code;
without this option lines are 72 columns.
-Mfreeform Process source using Fortran 90 freeform input
specifications.
-Mftn Stop after compiling HPF and keep the
intermediate Fortran 77 output.
-Mg Set the debug option, as well as the -Mkeepftn
option, and also set the pghpf compiler flag that
makes debugging the Fortran 77 output easy by
suppressing HPF line numbers in the generated
Fortran 77 intermediate file.
-Minfo Instructs the compiler to produce size, time, and
other compilation information.
-Minform Specify the minimum level of error severity that
the compiler will display.
-Minline Perform procedure inlining within INDEPENDENT DO
loops.
-Mkeepftn Retain Fortran 77 intermediate files.
-Mmpi Link a version of the HPF runtime libraries and
startup routines for the PGI mpi environment
(valid only on certain platforms).
-Mmpl Link a version of the HPF runtime libraries and
startup routines for the PGI mpl environment
(valid only on certain platforms).
-M[no]list Specifies whether the compiler creates a listing
file.
-Mnofree[form] Use fixed form formatting for file processing.
-Mnohpfc Skip the HPF compilation step and compile using
the Fortran 77 compiler if a file with a .f or
.F extension is supplied.
-Mnoindependent Do not apply the INDEPENDENT directive to DO
loops.
-Moverlap Set the size of the overlap area for BLOCK
distributed arrays.
-Mpreprocess Run the preprocessor on the input source file.
-Mprof Select profiling. Insert calls to profile
routines and link profiler libraries.
-Mpvm Generate code using runtime libraries and startup
routines for the PVM environment.
-Mr8 Promote REAL variables and constants to DOUBLE
PRECISION and COMPLEX to DOUBLE COMPLEX.
-Mreplicate The array replicator eliminates calls to
pghpf_get_scalar() by replicating distributed
arrays that satisfy certain conditions.
-Mrpm Link a version of the HPF runtime libraries and
startup routines for the PGI RPM environment
(valid only on certain platforms).
-Mrpm1 Link a version of the HPF runtime libraries and
startup routines for the PGI RPM single-process
environment for debugging (valid only on certain
platforms).
-M[no]sequence All variables are created as SEQUENCE variables,
where sequential storage is assumed. With
-Mnosequence, all variables are created as
nonsequential variables unless an explicit
SEQUENCE directive is supplied or the variable is
an assumed size array.
-Mstats Link a version of the runtime libraries for
printing runtime communications and message
passing statistics.
-Mstandard Causes the compiler to flag source code that does
not conform to the ANSI Fortran 90 standard.
-Mupcase Allow uppercase letters in identifiers.
The Portland Group, Inc.
9150 SW Pioneer Ct, Suite H
Wilsonville, OR 97070
+1-503-682-2806 (voice)
+1-503-682-2637 (FAX)
The Portland Group, Inc. also maintains a WWW home page with information on PGI and its products; the URL is http://www.pgroup.com.
To obtain further assistance on pghpf 2.1, or on other PGI products, you can also use the address and numbers shown above.
http://www.mcs.anl.gov/mpi/index.html
http://www.netlib.org/pvm3
%xmosaic $PGI/doc/hpf/html/pghpf.index.html
This appendix lists the HPF_LOCAL_LIBRARY procedures. Table B.1 briefly lists the procedures. Refer to Appendix A and B for details on the intrinsics defined in the Fortran 90 Language Specification and for HPF LIBRARY procedures.
For complete descriptions of the HPF_LOCAL_LIBRARY routines, and the current standards for HPF_LOCAL extrinsics, refer to Annex A, "Coding Local Routines in HPF and Fortran 90", in the High Performance Fortran Language Specification (Version 1.1, November 10, 1994, http://www.erc.msstate.edu/hpff/hpf-report/hpf-report/hpf-report.html or http://www.crpc.rice.edu/HPFF/home.html)
HPF_LOCAL_LIBRARY Procedures
Intrinsic Description
ABSTRACT_TO_PHYSICAL Returns processor identification for
physical processor associated with a
specified abstract processor.
GLOBAL_ALIGNMENT Returns information about the global HPF
array argument.
GLOBAL_DISTRIBUTION Returns information about the global HPF
array argument.
GLOBAL_LBOUND Returns lower bounds of the actual HPF
global array associated with a dummy
array.
GLOBAL_SHAPE Returns the shape of the global HPF
actual argument.
GLOBAL_SIZE Returns the global extent of the
specified argument.
GLOBAL_TEMPLATE Returns template information for the
global HPF array argument.
GLOBAL_TO_LOCAL Converts a set of global coordinates
within a global HPF actual argument to
an equivalent set of local coordinates.
GLOBAL_UBOUND Returns upper bounds of the actual HPF
global array associated with a dummy
array.
LOCAL_BLKCNT Returns the number of blocks of elements
in each dimension on a given processor.
LOCAL_LINDEX Returns the lowest local index of all
blocks of an array dummy argument.
LOCAL_TO_GLOBAL Converts set of local coordinates within
a local dummy array to an equivalent set
of global coordinates.
LOCAL_UINDEX Returns the highest local index of all
blocks of an array dummy argument.
MY_PROCESSOR Returns the identifying number of the
calling physical processor.
PHYSICAL_TO_ABSTRACT Returns coordinates for an abstract
processor, relative to a global actual
argument array.