
Testing the efficiency of returning arrays from Fortran subprograms


Fortran has two kinds of subprograms: subroutines and functions. A subroutine usually groups several operations that act through side effects and returns no value, while a function performs some operations and returns a value. A subroutine can nevertheless return values through dummy arguments declared with intent(out) or intent(inout). Compared with a function call, the drawback is that the result cannot be used in a natural assignment such as:

A = MathFunc(...)

Further, when the right-hand side of the assignment chains several function calls, the whole formula can be written as a single expression, which also gives the compiler a chance to optimize it:

A = MathFunc1(...) * MathFunc2(...) - MathFunc3(...)
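The difference between the two calling styles can be sketched as follows. The module demo_returns and the names scaled and scale_into are hypothetical and chosen only for illustration; they stand in for the MathFunc placeholders above.

```fortran
module demo_returns
  implicit none
contains
  ! Function form: the result can be used directly inside an expression.
  function scaled(x, factor) result(y)
    real(8), intent(in) :: x(:)
    real(8), intent(in) :: factor
    real(8) :: y(size(x))
    y = factor * x
  end function scaled

  ! Subroutine form: the result comes back through an intent(out) argument.
  subroutine scale_into(x, factor, y)
    real(8), intent(in)  :: x(:)
    real(8), intent(in)  :: factor
    real(8), intent(out) :: y(:)
    y = factor * x
  end subroutine scale_into
end module demo_returns

program compare_forms
  use demo_returns
  implicit none
  real(8) :: x(3), a(3), t1(3), t2(3)

  x = [1.0d0, 2.0d0, 3.0d0]

  ! Function calls compose in a single assignment.
  a = scaled(x, 2.0d0) - scaled(x, 0.5d0)
  print *, a

  ! The subroutine version needs explicit temporaries and extra statements.
  call scale_into(x, 2.0d0, t1)
  call scale_into(x, 0.5d0, t2)
  a = t1 - t2
  print *, a
end program compare_forms
```

Both assignments compute the same array; the subroutine form simply makes the intermediate storage explicit in the caller.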

However, returning a large chunk of data from a function can be inefficient. The return value of a function is created on the function's own stack when the function is entered. When the function returns, the data held in that memory is copied out into an external variable, such as the A on the left-hand side of the assignments shown above. After that, the data on the stack is popped and lost. If the data is large, for example element or DOF data in a FEM analysis, this copy is time-consuming.

In contrast, if the large matrix is returned through a subroutine argument (a function argument works as well) whose intent is out or inout, no such copy takes place. This is because arguments in Fortran are, in practice, passed to a subprogram by reference, unlike the default pass-by-value behavior in C or C++.
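One way to check that an array argument shares storage with the caller's variable is to compare addresses. The following is a minimal sketch, assuming a compiler that supports the ISO_C_BINDING intrinsic module; the names addr_demo, fill, and the small size n = 4 are hypothetical. The Fortran standard does not itself mandate pass by reference, so the result depends on the compiler, although for a contiguous whole array like this one the comparison would typically be expected to print T.

```fortran
module addr_demo
  use iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  integer, parameter :: n = 4   ! hypothetical small size, for the demo only
contains
  ! Fills AA and reports the address of the dummy argument's storage.
  subroutine fill(AA, addr)
    real(8), dimension(n,n), intent(out), target :: AA
    integer(c_intptr_t), intent(out) :: addr
    AA = 10.0d0
    addr = transfer(c_loc(AA), addr)
  end subroutine fill
end module addr_demo

program show_reference
  use addr_demo
  implicit none
  real(8), target :: A(n,n)
  integer(c_intptr_t) :: addr_caller, addr_callee

  ! Address of the array as seen by the caller.
  addr_caller = transfer(c_loc(A), addr_caller)

  ! Address of the same argument as seen inside the subroutine.
  call fill(A, addr_callee)

  print *, 'same storage: ', addr_caller == addr_callee
  print *, 'A(1,1) = ', A(1,1)
end program show_reference
```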

To verify this assumption, a test was performed. Two subprograms were written, both returning a 10000×10000 matrix. One is a function, in which the matrix is returned by copy; the other is a subroutine, in which the matrix is returned by reference as an argument. The two subprograms are:

subroutine ret_by_para(AA)
  real(8), dimension(10000,10000), intent(out) :: AA
  integer m, n
  do m = 1, 10000
     do n = 1, 10000
        AA(m,n) = 10.
     end do
  end do
end subroutine ret_by_para

function ret_by_func()
  real(8), dimension(10000,10000) :: ret_by_func
  integer m, n
  do m = 1, 10000
     do n = 1, 10000
        ret_by_func(m,n) = 10.
     end do
  end do
end function ret_by_func
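A driver for these two subprograms might look like the sketch below. It is not the original test harness: it uses the intrinsic cpu_time instead of gprof, and the module name ret_tests, the program name time_returns, the matrix size nsize = 500, and ncalls = 100 are all assumptions. The size is deliberately much smaller than 10000 so that the function result fits within a typical default stack limit; absolute timings will therefore differ from those reported in this article.

```fortran
module ret_tests
  implicit none
  integer, parameter :: nsize = 500   ! assumed size, far below the article's 10000
contains
  ! Matrix returned through an intent(out) argument.
  subroutine ret_by_para(AA)
    real(8), dimension(nsize,nsize), intent(out) :: AA
    integer :: m, n
    do m = 1, nsize
       do n = 1, nsize
          AA(m,n) = 10.
       end do
    end do
  end subroutine ret_by_para

  ! Matrix returned as the function result.
  function ret_by_func()
    real(8), dimension(nsize,nsize) :: ret_by_func
    integer :: m, n
    do m = 1, nsize
       do n = 1, nsize
          ret_by_func(m,n) = 10.
       end do
    end do
  end function ret_by_func
end module ret_tests

program time_returns
  use ret_tests
  implicit none
  integer, parameter :: ncalls = 100  ! assumed number of repetitions
  real(8), allocatable :: A(:,:)
  real :: t0, t1, t2
  integer :: k

  allocate(A(nsize,nsize))

  ! Time the function version.
  call cpu_time(t0)
  do k = 1, ncalls
     A = ret_by_func()
  end do
  call cpu_time(t1)
  print *, 'function   (s):', t1 - t0, ' A(1,1) =', A(1,1)

  ! Time the subroutine version.
  do k = 1, ncalls
     call ret_by_para(A)
  end do
  call cpu_time(t2)
  print *, 'subroutine (s):', t2 - t1, ' A(nsize,nsize) =', A(nsize,nsize)
end program time_returns
```

Printing an element of A after each loop keeps the result in use. Measured differences will depend on the compiler and optimization level, since an optimizing compiler may avoid the temporary for the function result altogether.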

Each subprogram was called once and then 10 times. The running times collected by gprof are summarized as follows:

Call for 1 time

% time   Cumulative seconds   Self seconds   name
72.29    4.46                 4.46           ret_by_func.1512
27.71    6.17                 1.71           ret_by_para.1516

Call for 10 times

% time   Cumulative seconds   Self seconds   name
82.12    71.58                71.58          ret_by_func.1512
17.88    87.17                15.59          ret_by_para.1516

The time spent returning the matrix from the function is much larger than that spent returning it by reference from the subroutine: about 2.6 times larger for one call (4.46 s versus 1.71 s) and about 4.6 times larger for ten calls (71.58 s versus 15.59 s). The gap therefore widens as the number of calls grows. It is thus recommended that a large matrix be passed by reference as a subroutine argument rather than returned as a function value.