Contruction of call multigraphs revisited

Arun Lakhotia

Extended abstract

This paper presents an algorithm for constructing callmulti graphs for procedural language programs. The call graph construction (CGC) problem is analogous to the problem of performing control flow analysis (CFA) of Scheme [Shi88, Shi91b]. As shown later, this problem is also analogous to the problem of determining types (TD) at specific program points for object-oriented languages [PS91, PR93, Suz81]. our CGC algorithm may be easily adapted to perform CFA and TD. The algorithm is significant since the CGC, CFA, and TD problems are of fundamental importance for efficient compilation of programs in procedural, functional, and object-oriented languages, respectively.

Pande & Ryder [PR93] have shown that the TD problem is NP-hard in the presence of single-level pointers. We show that the CGC problem is analogous to the TD problem. The CGC problem too is, therefore, NP-hard. Any polynomial algorithm for it can only give approximate results. The algorithm presented makes the following contributions:

  1. It constructs call graphs for a larger class of procedural language programs than previous algorithms [Bur87, CCHK90, HK93, Lak93, Wei80, Spi71, Ryd79].
  2. When compared under suitable mapping, the results of our algorithm are more precise than Pande and Ryder's type determination algorithm for C++ [PR93].
  3. When compared under suitable mapping, our algorithm is equivalent to Shivers' 1CFA [Shi88, Shi91b]. Our algorithm is however a) more efficient since it does not require CPS conversion and b) gives more precise results for pre-CPS converted programs.

Unlike our previous algorithm for the same problem [Lak931], the new algorithm works for programs with global variables and reference parameters. This increase in scope is brought about by using control flow graph (CFG) representation of programs; the previous algorithm used program dependence graphs (PDG) [OO84]. The improvement comes at additional computational costs. Since in a CFG the data dependence between statements are not available information is propagated through all statements, as against directly between the definition and use statements. Part of this cost may be recovered by using sparse evaluation graphs (SEG) [CCF91] instead of CFGs.

The call graph of a program is affected by program aliases. On the other hand, the set of aliases in a program is influenced by the call graph [Ban79, CBC93, CK89, LR91]. To obtain precise results these two problems should be solved simultaneously. Though we treat global variables and call by references, two of the sources of aliasing, we ignore the aliasing problem. It may be observed that the pointer-induced aliasing algorithms of [CBC93, LR91] and our algorithm have similar "structure." Our algorithm may be combined with one of the aliasing algorithms to simultaneously construct alias sets and call graph. We restrict our focus to the call graph problem and do not delve into the details of how these algorithms may be combined.

Full paper