swapElements:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %edx          # R[r3] = address of array a
        movl    12(%ebp), %ecx         # R[r2] = i
        movl    16(%ebp), %eax         # R[r1] = j
        pushl   %ebx                   # save callee-save register %ebx (r4)
        leal    (%edx,%ecx,4), %ecx    # R[r2] = address of a[i]
        leal    (%edx,%eax,4), %eax    # R[r1] = address of a[j]
        movl    (%ecx), %ebx           # R[r4] = M[r2]  (i.e., R[r4] = a[i])
        movl    (%eax), %edx           # R[r3] = M[r1]  (i.e., R[r3] = a[j])
        movl    %edx, (%ecx)           # M[r2] = R[r3]  (i.e., a[i] = R[r3])
        movl    %ebx, (%eax)           # M[r1] = R[r4]  (i.e., a[j] = R[r4])
        popl    %ebx                   # restore %ebx
        popl    %ebp                   # restore the caller's frame pointer
        ret                            # return
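
The two leal instructions compute element addresses as base + 4*index without touching memory. A minimal C sketch of that arithmetic follows; the helper name element_address is hypothetical, and the scale factor 4 assumes a 4-byte int, as on IA32.

```c
#include <stddef.h>

/* Hypothetical helper mirroring  leal (%edx,%ecx,4), %ecx :
   result = base address + index * element size.
   On IA32, sizeof(int) is 4, which is the scale factor in the leal. */
int *element_address(int *a, int i)
{
    /* Byte-level arithmetic made explicit; equivalent to &a[i]. */
    return (int *)((char *)a + (ptrdiff_t)i * (ptrdiff_t)sizeof(int));
}
```

For an int array a, element_address(a, i) yields the same pointer as &a[i]; the movl instructions that follow the leal pair then load from and store to those addresses.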

The variable tmp is not allocated a memory location. Instead, register %ebx (denoted by r4) holds it.
The assembly code actually uses a slightly different algorithm: two register temporaries (%ebx and %edx) instead of one memory temporary.
With two register temporaries, the swap takes 4 movl instructions instead of 3. This is because movl cannot have memory locations for both operands, so each element must first be loaded into a register and then stored.

(But 4 is still better than the 6 that would result from using 2 movl instructions for each of the 3 assignments.)
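
The difference between the two algorithms can be sketched in C. The first function is an assumed form of the original source (one temporary, three assignments); the second mirrors the compiled code (two loads, then two stores). The function names and the locals pi, pj, t3, and t4 are hypothetical, chosen to match the r3/r4 labels above.

```c
/* Assumed shape of the original C: one temporary, three assignments. */
void swap_one_temp(int *a, int i, int j)
{
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

/* What the assembly does: load both elements, then store both. */
void swap_two_temps(int *a, int i, int j)
{
    int *pi = &a[i];   /* leal (%edx,%ecx,4), %ecx */
    int *pj = &a[j];   /* leal (%edx,%eax,4), %eax */
    int t4 = *pi;      /* movl (%ecx), %ebx */
    int t3 = *pj;      /* movl (%eax), %edx */
    *pi = t3;          /* movl %edx, (%ecx) */
    *pj = t4;          /* movl %ebx, (%eax) */
}
```

Both functions leave the array in the same state. The second form needs one register-to-memory or memory-to-register move per line, which is where the count of 4 comes from.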
Register %ebx, which is used as the temporary r4, is a callee-save register:

If the called function uses %ebx, it must save the current value before using the register and restore that value before the function returns. In swapElements, the pushl %ebx and popl %ebx instructions do this.

Caller save: %eax, %ecx, %edx

Callee save: %ebx, %esi, %edi

IA32 Assembly for: accessing array elements (3)