Test of returning array efficiency from Fortran subprograms
Fortran has two kinds of subprograms: subroutines and functions. Usually, a subroutine groups several operations that produce side effects without returning a value, while the purpose of a function is to return a value after some computation. In fact, a subroutine can also return values if some of its dummy arguments are declared with intent(out) or intent(inout). Compared with a function call, the inconvenience of returning values from a subroutine is that the result cannot be used in a natural assignment like this:
A = MathFunc(...)
Further, if the right-hand side of the assignment combines several function calls, the compiler can consider the expression as a whole when optimizing it:
A = MathFunc1(...) * MathFunc2(...) - MathFunc3(...)
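The two calling forms described above can be sketched as follows. This is a minimal illustration only: the names square and square_sub are hypothetical and do not come from the original test.

```fortran
program call_forms
  implicit none
  real(8) :: a, b

  ! Function form: results can be combined directly in one expression.
  ! Here a = 9*4 - 1 = 35.
  a = square(3.0d0) * square(2.0d0) - square(1.0d0)

  ! Subroutine form: the result comes back through an intent(out)
  ! argument, so every value needs its own call statement.
  ! Here b = 9.
  call square_sub(3.0d0, b)

  print *, a, b

contains

  ! Hypothetical helper: returns x squared as a function result.
  function square(x) result(y)
    real(8), intent(in) :: x
    real(8) :: y
    y = x * x
  end function square

  ! Hypothetical helper: returns x squared through an output argument.
  subroutine square_sub(x, y)
    real(8), intent(in)  :: x
    real(8), intent(out) :: y
    y = x * x
  end subroutine square_sub

end program call_forms
```

Writing the same expression with the subroutine form would take three calls and three temporary variables.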
However, when a large chunk of data has to be returned from a subprogram, a function call may be inefficient. The return value of a function is created on the function's own stack when the function is entered. When the function returns, the data held in that memory is copied out into an external variable, such as the 'A' on the left-hand side of the assignments shown above. After that, the data on the stack is popped and lost. If the data is large, for example element or DOF data in an FEM analysis, this copy operation will clearly be time-consuming.
In contrast, if the large matrix is returned through a subroutine argument (a function argument works as well) whose intent is set to out or inout, no such copy takes place. This is because arguments in Fortran are normally passed to a subprogram by reference, whereas C and C++ pass arguments by value by default.
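One way to check the pass-by-reference behaviour on a given compiler is to compare the address of the array seen inside the subroutine with the address seen by the caller. The sketch below is an assumption-laden illustration, not part of the original test: the names by_reference and fill are hypothetical, and real(c_double) is used in place of real(8) so that c_loc can be applied cleanly.

```fortran
program by_reference
  use iso_c_binding, only: c_ptr, c_loc, c_associated, c_double
  implicit none
  real(c_double), target :: v(3)
  type(c_ptr) :: addr_inside

  v = 0.0d0
  call fill(v, addr_inside)

  ! Compare the address recorded inside the subroutine with the
  ! address of the caller's array. If the array was passed by
  ! reference, both refer to the same storage.
  print *, 'same storage: ', c_associated(addr_inside, c_loc(v(1)))
  print *, v

contains

  ! Hypothetical helper: fills x in place and records where x lives.
  subroutine fill(x, addr)
    real(c_double), intent(out), target :: x(:)
    type(c_ptr), intent(out) :: addr
    x = 10.0d0
    addr = c_loc(x(1))
  end subroutine fill

end program by_reference
```

If the comparison reports true, the subroutine wrote directly into the caller's array and no separate result buffer was involved.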
To verify this assumption, a test was performed. Two subprograms were written, both of which return a 10000×10000 matrix. One of the subprograms is a function, in which the matrix is returned by copy; the other is a subroutine, in which the matrix is returned by reference as an argument. The two subprograms are:
! Returns the matrix through an intent(out) argument.
subroutine ret_by_para(AA)
    real(8), dimension(10000,10000), intent(out) :: AA
    integer m, n
    do m = 1, 10000
        do n = 1, 10000
            AA(m,n) = 10.
        end do
    end do
end subroutine ret_by_para

! Returns the matrix as the function result.
function ret_by_func()
    real(8), dimension(10000,10000) :: ret_by_func
    integer m, n
    do m = 1, 10000
        do n = 1, 10000
            ret_by_func(m,n) = 10.
        end do
    end do
end function ret_by_func
Each subprogram was called once and then ten times. The running times collected by gprof are summarized as follows:
Called once
% time | Cumulative seconds | Self seconds | name |
72.29 | 4.46 | 4.46 | ret_by_func.1512 |
27.71 | 6.17 | 1.71 | ret_by_para.1516 |
Called 10 times
% time | Cumulative seconds | Self seconds | name |
82.12 | 71.58 | 71.58 | ret_by_func.1512 |
17.88 | 87.17 | 15.59 | ret_by_para.1516 |
The results show that returning the matrix as a function value costs much more time than returning it by reference from a subroutine: about 2.6 times as much for a single call (4.46 s versus 1.71 s) and about 4.6 times as much for ten calls (71.58 s versus 15.59 s). The gap widens as the number of calls grows. It is therefore suggested that a large matrix be passed by reference as a subroutine argument rather than returned as a function value.
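For readers who want to repeat the comparison without a profiler, the following is a minimal sketch that times both variants with the intrinsic system_clock instead of gprof. It makes several assumptions that differ from the original test: the module name ret_demo, the program name time_returns, the matrix size n = 500 (chosen so the function result fits comfortably in memory), and whole-array assignment in place of the explicit loops.

```fortran
module ret_demo
  implicit none
  ! Assumed size, far smaller than the 10000 used in the original test.
  integer, parameter :: n = 500
contains

  ! Returns the matrix through an intent(out) argument.
  subroutine ret_by_para(aa)
    real(8), intent(out) :: aa(n, n)
    aa = 10.0d0
  end subroutine ret_by_para

  ! Returns the matrix as the function result.
  function ret_by_func() result(res)
    real(8) :: res(n, n)
    res = 10.0d0
  end function ret_by_func

end module ret_demo

program time_returns
  use ret_demo
  implicit none
  integer, parameter :: ncalls = 10
  real(8), allocatable :: a(:, :)
  integer(8) :: t0, t1, t2, rate
  integer :: k

  allocate(a(n, n))

  ! Time the function variant.
  call system_clock(t0, rate)
  do k = 1, ncalls
    a = ret_by_func()
  end do
  call system_clock(t1)

  ! Time the subroutine variant.
  do k = 1, ncalls
    call ret_by_para(a)
  end do
  call system_clock(t2)

  print *, 'function   (s): ', real(t1 - t0, 8) / real(rate, 8)
  print *, 'subroutine (s): ', real(t2 - t1, 8) / real(rate, 8)
  ! Use the result so the calls are less likely to be optimized away.
  print *, 'check value:    ', a(n, n)
end program time_returns
```

The measured times depend heavily on the compiler and optimization level. An optimizing compiler may build the function result directly in the destination array, in which case the gap can shrink or disappear.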