Thursday, October 21, 2010

Common Subexpression Elimination (CSE) by GCC

Test Program

    main()
        {
             int i, j, k, r;
             scanf("%d%d", &i, &j);

             k = i + j + 10;
             r = i + j + 30;  

             printf("%d %d %d\n", k, r);
        }

Assemly Code

AT&T format of assembly code is used.
  
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $32, %esp
            leal    24(%esp), %eax
            movl    %eax, 8(%esp)
            leal    28(%esp), %eax
            movl    %eax, 4(%esp)
            movl    $.LC0, (%esp)
            call    scanf
            movl    28(%esp), %edx
            movl    24(%esp), %eax
                leal    (%edx,%eax), %eax
                addl    $10, %eax
                movl    %eax, 20(%esp)
            movl    28(%esp), %edx
            movl    24(%esp), %eax
                leal    (%edx,%eax), %eax
                addl    $30, %eax
                movl    %eax, 16(%esp)
            movl    16(%esp), %eax
            movl    %eax, 8(%esp)
            movl    20(%esp), %eax
            movl    %eax, 4(%esp)
            movl    $.LC1, (%esp)
            call    printf
            leave
            ret

The two blocks in bold represents the evaluation of 'k' and 'r' in the test program respectively.
The 'leal    (%edx,%eax), %eax' command adds the two values in the 'edx' and 'eax' and stores the result in 'eax'. The 'addl' command adds a constant to the value in the 'eax'.
Here, both 'leal' and 'addl' are called two times, for the evaluation of 'k' and 'r' respectively.

 After optimization as:
    gcc -S -O3 -fomit-frame-pointer opt2.c
    less opt2.s

    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $32, %esp
            leal    24(%esp), %eax
            movl    %eax, 8(%esp)
            leal    28(%esp), %eax
            movl    %eax, 4(%esp)
            movl    $.LC0, (%esp)
            call    scanf
                movl    24(%esp), %eax
                addl    28(%esp), %eax
                movl    $.LC1, (%esp)
                leal    30(%eax), %edx
                addl    $10, %eax
            movl    %edx, 8(%esp)
            movl    %eax, 4(%esp)
            call    printf
            leave
            ret

Here, what is seen to be done is:
1)
 
'i' in the test program stored in 'eax'



2) 'j' added to 'eax'



        Now 'eax' contains 'i' + 'j'.


3) 'r' is obtained as " 30 + the value in 'eax' "



4) 'k' is obtained by adding 10 to the value in 'eax'

Observation is:
        'i' + 'j' was evaluated only once !

Common Subexpression Evaluation (CSE)

As observed, CSE is an optimization technique employed by the compiler, when the same subexpression is present in more than one expressions.

It is as if the subexpression is evaluated first, and the result is stored in a temporary variable. For all further calculations where this subexpression was a part originally, the value of this newly created temporary variable will be used.
In the test program used above, the so evaluated subexpression is ' i + j '.

Also, CSE is performed only when, in that environment, the cost to use such a temporary variable is lesser than the cost to perform the operations in the subexpression itself. Here, the operation is '+'. 

Tuesday, October 19, 2010

Depicting Function Inlining by GCC

Inline Function

In C, if a particular function used has only a few lines in its body, and if the optimization level is set to 03 (preferably), some unexpected changes can be observed about how gcc handles this function.

What the compiler will do is that it replaces the call for this function, with the actual code of the function, called inlining.

The limit on the number of lines below which inlining is performed, strictly depends upon the gcc heuristics.

This is not all. In  the extreme case, if the small function mentioned above only does something like calculating a value after taking an input, then gcc will evaluate the function call, calculate the value, and directly paste it in the program instead of the function call itself.

Sweet, isn't it?  

Test Program



    int sqr(int x)
        {
            int a;
            return x*x;
        }

    main()
        {
            printf("%d\n", sqr(10));
        }

Assembly Code

To view the assembly code.
    gcc -S -fomit-frame-pointer opt1.c

    less opt1.s


The assembly code is:
    sqr:
            subl    $16, %esp
            movl    20(%esp), %eax
            imull   20(%esp), %eax
            addl    $16, %esp
            ret
          
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $10, (%esp)
            call    sqr
            movl    %eax, 4(%esp)
            movl    $.LC0, (%esp)
            call    printf
            leave
            ret
          
On optimization,

    gcc -S -O3 -fomit-frame-pointer opt1.c

        less opt1.s

The new code is:

    sqr:
            movl    4(%esp), %eax
            imull   %eax, %eax
            ret
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $100, 4(%esp)
            movl    $.LC0, (%esp)
            call    printf
            leave
            ret

Here, the function sqr( ) does something very simple, and the input to the function is statically assigned. It means that the value of the input (10) will never change during runtime. Hence, the compiler will optimize the program even further, to the extreme that the square of 10 will be evaluated and the result pasted in the program instead of the original call to the function sqr( ).  

Sunday, October 17, 2010

User Mode Linux Built From Scratch !!!

Linux From Scratch
"Linux From Scratch (LFS) is a project that provides you with step-by-step instructions for building your own custom Linux system, entirely from source code."
Homepage is : http://www.linuxfromscratch.org/ .

Use Mode Linux
"User-Mode Linux is a safe, secure way of running Linux versions and Linux processes. Run buggy software, experiment with new Linux kernels or distributions, and poke around in the internals of Linux, all without risking your main Linux setup.
User-Mode Linux gives you a virtual machine that may have more hardware and software virtual resources than your actual, physical computer. Disk storage for the virtual machine is entirely contained inside a single file on your physical machine. You can assign your virtual machine only the hardware access you want it to have. With properly limited access, nothing you do on the virtual machine can change or damage your real computer, or its software."

UML - The kernel on top of a kernel 

To get the complete idea, it is true that the UML kernel can be booted and shutdown from your Linux system, just like another application. It will not cause your Linux system to halt in any way.

How is the required privilege levels setup for the UML kernel?
The privilege levels in a Linux system ranges from 0 (ring 0) to 3 (ring 3). Ring 0 gives you complete power. You can change the contents of any register, do anything. Ring 3 is the user mode. It also has the lowest privilege.

This is the same in the UML kernel too.

Can a C code get privilege level 0?
Yes it can. Through system calls. But it cannot be allowed just like that. Allowing a C code full control will be like allowing viruses to grow in Linux! The C code must be able to make system calls, and simultaneously not be the one who is in possession of the control flow.

This is the specific design technique employed in Linux. When a system call occurs in a C code, there will be a switching from ring 0 to ring 3. It will be simultaneously accompanied with transfer of control from the C program to the Linux kernel. No hassle there.

Thus, total safety is ensured.

How is the UML kernel designed then?
A Linux kernel comprises of two parts:
1) the hardware dependent part - specifically, everything inside the 'arch'
                                                    folder in the kernel source code.
2) others

What is done in the UML kernel is that:
1) take away all the hardware dependent part of the kernel.
2) simply replace it with the system calls of the kernel layer below
    it (pure C code).
    (the UML kernel will behave just as an application)

Consider a sample executable binary 'a.out' compiled inside the UML kernel, from a sample file 'a.c'.

Fig 1. The kernel layers

a.out makes a system call
e.g. read( )


replace a.out's call with the address of
its own read( )










The mechanism:
The UML kernel uses ptrace( ) to freeze 'a.out', the moment it invokes a system call. Then, the address of this function call is replaced with a corresponding system call address that is part of the UML kernel itself.

Everything works fine, in a cute way.

Compiling and Booting the UML kernel

While compiling the kernel, just add an extra parameter 'ARCH=um' to all the steps outlined in the Linux kernel README.
After compilation, an executable binary called 'linux' will be created.

Assuming 'linux' is present in your current directory, to boot into the UML kernel give the command as:
    ./linux ubda=< path of the filesystem >

where filesystem can be a physical partition, or one created with the dd and mkfs/mke2fs commands.

Some Snapshots

'Make'ing Glibc 
Fig 2. Running 'make' for glibc
 'Configure'ing Bash
Fig 3. 'Config'uring Bash
Linguistic Perl 
    The configuration settings for Perl, created by Larry Wall, was the most "linguistic" out of these!  Some excerpts are:



Fig 4. Excerpts from the 'configure' settings for Perl5

Man pages


    These had a 'make install' with one of the shortest SBU, and looked a bit of a variety too!

Fig 5. 'make install' of man pages
Bash without name !
    During the process, there is a time when 'chroot' is used to completely move into the LFS installation and start using the programs already setup inside it.  At this point, the Bash will be setup without creating the /etc/passwd file. Now the Bash will say that it has no name !
Fig 6. Bash without /etc/passwd
After the Bash has been recompiled and installed properly with respect to the LFS system, and the /etc/passwd file created, the Bash prompt reverts back to normal.
Fig 7. Bash after recompiling and creating /etc/passwd
Booting in ...
Fig 8. Booting into the UML kernel
Powering off ...
Fig 9. Powering off the UML kernel