Most of you already know that I did an internship at Sun Microsystems India Engineering Center this summer (I am sure I have bragged enough about it). The internship has been a fantastic experience. On the very first day I was welcomed by Rajesh, the Sun CA program co-ordinator for the south of India, and Ganesh Hiregoudar, the APAC Head of the Sun CA Program. The next day I got a complete list of projects in my preferred area, High Performance Computing (HPC). I chose a project, and a conference call was scheduled with the Sun engineer concerned. In my case that was Verdi March, who works at the Sun Singapore office and is also a researcher at the National University of Singapore (NUS). I was also allotted an account at the Singapore Grid Discovery Zone to run my apps. My project was to find the per-function speed-up of a parallel application across multiple profiles. If you haven’t understood anything, don’t worry. I will explain.
Any computer application is made up of functions (I trust you already knew that). Profiling an application means executing it and collecting performance metrics from the run: the number of system calls, the time taken by each such call, the time taken by each function to complete, the amount of memory used, the number of bytes of inter-process communication done by the application, and so on. In my case, all the applications were parallel ones, that is, applications which use more than one CPU, core or node. Each time you run an app under different conditions, such as a different number of CPUs, its performance characteristics will be different. A function will take different amounts of time to complete under different conditions, sometimes less time (a speed-up) and sometimes more (a slowdown, i.e. a speed-up of less than 1). The speed-up of a function is simply its time in the reference run divided by its time in the run being compared. My job was to write a program that could read the performance profiles and calculate this speed-up.
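To make that concrete, here is a minimal sketch in Python of the calculation. It is not the SCALASCA code; the function names, the timings and the helper `per_function_speedup` are all made up for illustration, and I am assuming each profile has already been boiled down to a simple mapping from function name to time in seconds.

```python
def per_function_speedup(baseline, candidate):
    """Return {function: baseline_time / candidate_time}.

    Both arguments are hypothetical profiles: dicts mapping a function
    name to the time (in seconds) it took in that run. Functions that
    are missing from the candidate run, or have a non-positive time
    there, are skipped because no ratio can be computed for them.
    """
    speedups = {}
    for name, base_time in baseline.items():
        new_time = candidate.get(name)
        if new_time is None or new_time <= 0:
            continue
        speedups[name] = base_time / new_time
    return speedups


# Made-up timings for the same app run on 2 CPUs and on 8 CPUs.
profile_2_cpus = {"compute": 8.0, "exchange_halo": 1.0, "write_output": 2.0}
profile_8_cpus = {"compute": 2.0, "exchange_halo": 2.0, "write_output": 2.0}

speedups = per_function_speedup(profile_2_cpus, profile_8_cpus)
for name in sorted(speedups):
    print(f"{name}: {speedups[name]:.2f}x")
```

With these invented numbers, `compute` gets a speed-up above 1, `exchange_halo` gets one below 1 (it became slower with more CPUs), and `write_output` stays at exactly 1.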
Here comes SCALASCA (Scalable Performance Analysis of Large-Scale Applications). SCALASCA profiles and analyses parallel apps. It has been developed by the Julich Supercomputing Centre in Germany and has been used on such well-known supercomputers as the Blue Gene…
My job was to extend SCALASCA to calculate the per-function speed-up. SCALASCA already does the job of profiling your application and generating a trace in XML format. It also provides an API to read and write those files. However, there isn’t much documentation available on SCALASCA, as the developers are mainly people at the Julich Supercomputing Centre. Nor is there a mailing list for discussing problems. Hence I spent a hell of a lot of time going through the source and trying to understand the underlying design. The problem was compounded by the near absence of comments in the source code. However, the SCALASCA developers, particularly Markus Geimer and Brian Wylie, replied to all my questions, which enabled me to understand the CUBE API (CUBE is the tool that displays the performance data in a GUI and allows you to perform operations on it).
Understanding the architecture of SCALASCA and the CUBE API took me quite a few days. Finally, in the fourth week, I had nearly finished writing code that I believe will do the job. However, there are still a few compilation errors to be taken care of (they are down from 26 to 4 now). I should complete the whole thing in the next couple of days.
I also met many of the other CAs and made great friends with Abhishek Uppala, Vasusen Patil, Jay, Avinash, Abhishek Gupta, Okendra and others. Rajesh and Ganesh were always helpful and approachable. They are the best bosses you can get. Rajesh, Uppala, Vasusen and I also made a trip to Pondicherry. It was a fantastic experience. We played foosball daily, and my performance as a defender improved from pathetic to passable. We also had a few CA conferences and a team lunch as a farewell to Vijaya Santosh, a CA co-ordinator.
Overall the experience was amazing, and I am looking forward to going back to IEC.