GLOBAL EDDY-RESOLVING SIMULATION BY THE EARTH SIMULATOR: BRIEF REPORT ON THE FIRST RUN
 
Hirofumi Sakuma1, Hideharu Sasaki1, Keiko Takahashi1, Takashi Kagimoto2, Toshio Yamagata2,3 and Tetsuya Sato1
 
1Earth Science Program, Earth Simulator Center
Yokohama, Kanagawa, JAPAN
sakuma@es.jamstec.go.jp
 
2Variation Research Program, Institute for Global Change Research
Yokohama, Kanagawa, JAPAN
 
3Department of Earth and Planetary Science, Graduate School of Science
The University of Tokyo, Tokyo, JAPAN
 
ABSTRACT
 
The outcomes of the first test run of a global eddy-resolving simulation using the Earth Simulator are reported briefly in this short paper. The aim of the first run is to assess not only the computational performance of our newly tuned code on the machine but also its overall physical performance in reproducing basin-scale characteristics of the current and temperature fields, together with the important mesoscale eddy activities of the world ocean. One noteworthy accomplishment of the first run is that the Earth Simulator enables us to complete the time integration of a 50-year-long global eddy-resolving simulation in less than half a month, which will greatly accelerate high-resolution climate modeling studies from now on.
 
INTRODUCTION
 
Among the key elements that determine the basic properties of the general circulation of the world ocean, the nonlinear scale interactions between mesoscale eddies and basin-scale circulations, which affect the global statistics of heat, momentum and tracer transports, are challenging research subjects for which the enormous computational power of the Earth Simulator is of great help. Setting this as an initial goal of our Earth Simulator Initiative, we have developed a MOM3-based high-performance OGCM code optimized for our machine. The physical performance of ocean models using the Bryan (1969) formulation adopted in MOM3 has been checked extensively by many research groups around the world, and the high-resolution behavior of the model was investigated notably by Semtner and Chervin (1992) and Fu and Smith (1996). Those studies revealed that the model can reproduce comprehensive three-dimensional structures of the ocean circulation, some of which are fairly accurate while others are rather inaccurate. Well-known persistent problems related to resolution are the sluggishness of the simulated circulations and the improper separation points of the western boundary currents; both are expected to improve with higher resolution in the horizontal and vertical directions (Chao et al., 1996).
 
Simulated oceanic fields depend not only on the physical performance of a given model but also on the quality and appropriateness of the data imposed as boundary forcing, including bathymetry, and successful global eddy-resolving simulations are relatively new, time-consuming efforts in which these two factors must be carefully evaluated. A major obstacle to performing a global eddy-resolving simulation has been that the computational capability of available machines was simply insufficient for its effective execution. The situation has changed drastically, however, with the advent of the Earth Simulator together with an OGCM code optimized for it. Our machine enables us to complete a decade-long (near-)global eddy-resolving simulation with a horizontal resolution of 0.1 degree in several days, which gave us a great impetus to start high-resolution simulation studies of climate variability. As a first step towards such studies, we set up a series of basic numerical experiments to assess the computational and physical performance of our newly developed code. The aim of this short paper is to report the main outcomes of the first experiment, in which the choice of scheme options, model parameters and types of boundary forcing was made on a rather trial basis. Nevertheless, the overall characteristics of the simulated fields turned out to be quite realistic; in particular, the two drawbacks mentioned above, namely the sluggishness of the currents and the separation points of the Kuroshio and the Gulf Stream, are improved. In what follows, our code optimization strategy and the sustained performance of the newly developed MOM3-based OGCM are briefly explained in the first section. The second section covers the outline of the computational settings of our first experiment, and in the third section we assess the overall physical performance of our code, putting emphasis on the fine structures our high-resolution simulation could reproduce. A brief summary and future plans are given in the summary section.
 
CODE OPTIMIZATION AND COMPUTATIONAL PERFORMANCE
 
To attain high performance with our eddy-resolving code, a number of different optimization techniques have been utilized, taking into account the distinctive characteristics of the Earth Simulator. First of all, each routine must be vectorized to improve performance on vector processors. In addition, we applied such common techniques as inline expansion, loop merging, and loop unrolling/rerolling together with loop re-ordering, in order to reduce the number of procedure calls and to lengthen the average do-loop. In some cases, loop fission/splitting and loop fusion were applied in a balanced manner.
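As an illustration of the kind of loop restructuring involved, the hypothetical sketch below (written in C rather than in the Fortran of the actual OGCM, with made-up array names) contrasts two separate passes over a grid line with a fused and two-way unrolled version that reduces loop overhead and keeps the vector pipelines filled.

    /* Illustrative sketch only: two independent passes over one grid line. */
    void update_split(int n, double *t, double *s,
                      const double *dt_t, const double *ds_t)
    {
        for (int i = 0; i < n; i++)      /* first pass: temperature tendency */
            t[i] += dt_t[i];
        for (int i = 0; i < n; i++)      /* second pass: salinity tendency */
            s[i] += ds_t[i];
    }

    /* Fused and two-way unrolled version: one pass updates both arrays,
       halving the loop overhead while keeping the loop body vectorizable. */
    void update_fused(int n, double *t, double *s,
                      const double *dt_t, const double *ds_t)
    {
        int i;
        for (i = 0; i + 1 < n; i += 2) {
            t[i]   += dt_t[i];   s[i]   += ds_t[i];
            t[i+1] += dt_t[i+1]; s[i+1] += ds_t[i+1];
        }
        for (; i < n; i++) {             /* remainder when n is odd */
            t[i] += dt_t[i];
            s[i] += ds_t[i];
        }
    }

Whether fusion or fission pays off depends on register pressure and memory access patterns, which is why the two transformations are applied "in a balanced manner" rather than uniformly.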
 
As the first step of the optimization, we attempted to make the vector ratio of almost all routines exceed 99.5 percent. The attained average vector length, vector ratio and sustained Mflops of the main routines are listed in Table 1; note that the maximum vector length and the peak performance of each processor are 256 and 8 Gflops respectively.
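The emphasis on such a high vector ratio follows from Amdahl's law: if a fraction f of the operations is vectorized and the vector units run S times faster than scalar execution, the overall speedup is bounded by

    speedup = 1 / ((1 - f) + f / S).

Taking S = 16 purely as an illustrative assumption (not a figure measured on the Earth Simulator), raising f from 0.95 to 0.995 improves the overall speedup from roughly 9 to roughly 15, which is why every major routine was pushed above the 99.5 percent mark.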
 
Table 1: CPU-time share, sustained Mflops, vector ratio and average vector length of the main routines

Main computation [routine name]                  CPU time (%)   Mflops   Vector ratio (%)   Avg. vector length
baroclinic computation [baroclinic]                  14.3        4900.7        99.8                240.0
implicit vertical mixing [invtri]                    13.0        3374.6        99.6                240.0
barotropic computation [expl_freesurf]               11.0        4902.6        99.7                240.0
UNESCO density [unesco_density]                       8.4        5202.2        99.8                256.0
biharmonic computation [delseq_velocity]              7.5        4255.7        99.7                240.0
main computation of tracers [tracer]                  7.1        5025.9        99.8                240.0
calculation of advection velocities [adv_vel]         4.2        4838.7        99.6                240.0
construction of diagnostics [diagtl]                  2.6        3801.6        99.5                243.0
computation of normalized densities [statec]          2.4        6613.0        99.8                240.1
 
We employed one-dimensional domain decomposition in the meridional direction, in which the degree of parallelism is limited by the number of latitudinal circles; namely, the maximum number of CPUs that can be used for a near-global domain extending from 75°S to 75°N is 1500, provided that the meridional resolution is 0.1 degree. Each processor is assigned the computation in a zonal strip, and the number of meridional grid points in a strip depends on the number of processors. As the number of processors increases, the meridional extent of a strip becomes comparable to or smaller than a halo region, which means that some measures are necessary to reduce the computational burden, especially in the halo regions. To this end, we employed micro-tasking techniques for intra-node parallelization, while inter-node communication was achieved via the MPI library to obtain the best communication performance on the Earth Simulator. Using 188 nodes, the sustained performance of our aggregate code turned out to be 2.75 Tflops, which is 23% of the peak performance. The horizontal and vertical resolutions employed to obtain this performance are 1/10 degree and 54 levels respectively; the other details of our simulation settings are given in the following section. Our tuning effort is still going on and, as a result, the latest parallel efficiency has reached 99.9%, and a 30-day integration is completed in 1395 seconds of wall-clock time, which allows us to execute a 100-year integration within 20 days.
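To make the decomposition concrete, the following sketch shows how the halo rows of a zonal strip might be exchanged between neighbouring ranks with MPI. It is an illustrative C fragment with an assumed halo width, grid size and array layout, not an excerpt from the actual code, which combines this inter-node MPI communication with intra-node micro-tasking.

    #include <mpi.h>

    #define NX   3600         /* zonal grid points at 1/10 degree (illustrative) */
    #define HALO 2            /* assumed halo width in rows                      */

    /* Exchange the halo rows of a zonal strip with the neighbouring ranks.
       'field' holds (nrows + 2*HALO) x NX values; rows HALO .. HALO+nrows-1
       are owned by this rank, the rest are the southern/northern halos.     */
    void exchange_halo(double *field, int nrows, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        int south = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int north = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
        int count = HALO * NX;

        /* send the lowest owned rows southward, receive the northern halo */
        MPI_Sendrecv(field + HALO * NX,           count, MPI_DOUBLE, south, 0,
                     field + (HALO + nrows) * NX, count, MPI_DOUBLE, north, 0,
                     comm, MPI_STATUS_IGNORE);

        /* send the highest owned rows northward, receive the southern halo */
        MPI_Sendrecv(field + nrows * NX,          count, MPI_DOUBLE, north, 1,
                     field,                       count, MPI_DOUBLE, south, 1,
                     comm, MPI_STATUS_IGNORE);
    }

With only a few meridional rows per rank, the halo volume exchanged here becomes comparable to the owned strip itself, which is the communication burden noted above.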
 
SIMULATION SETTING
 
The computational domain covers a near-global region extending from 75°S to 75°N. The horizontal resolution and the number of vertical levels are 1/10° and 54 respectively. The thickness of the vertical layers increases with depth, starting from 5 m (the upper-most grid point is at 2.5 m depth), and the maximum depth of the model ocean is 6,065 m. The model bottom topography was interpolated from the 1/30° "OCCAM Topography" dataset, obtained by courtesy of GFDL and originally created by the OCCAM project at the Southampton Oceanography Centre. The upper boundary forcing of the momentum, heat and salinity fluxes is specified using monthly mean NCEP reanalysis data (Kistler et al., 2001), with the surface salinity restored to its climatological value. At the artificially introduced northern and southern boundaries, we placed restoring zones of three-degree meridional width in which the temperature and salinity fields are also restored to their monthly climatological values. The annual mean temperature and salinity fields obtained from the World Ocean Atlas 1998 (henceforth WOA98; Antonov et al., 1998a, 1998b, 1998c; Boyer et al., 1998a, 1998b, 1998c) are used as the initial condition for the density field, and the initial current velocity is set to zero at all levels. To suppress grid-scale noise, we introduced scale-selective damping of biharmonic type, and the KPP scheme (Troen et al., 1986; Large et al., 1994) is employed for vertical mixing. In the present version of the model, no sea-ice model is implemented yet.
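The surface-salinity and boundary-zone restoring described above amounts to Newtonian nudging toward climatology. The fragment below is a minimal C sketch of such a restoring step, assuming a restoring time scale tau and array names that are purely illustrative and not taken from the model.

    /* Minimal sketch of Newtonian restoring (nudging) toward climatology:
       ds/dt = ... + (s_clim - s) / tau.  The time step dt, the time scale
       tau and the array names are illustrative assumptions.               */
    void restore_to_climatology(double *s, const double *s_clim, int n,
                                double dt, double tau)
    {
        for (int i = 0; i < n; i++)
            s[i] += dt * (s_clim[i] - s[i]) / tau;
    }

In the three-degree-wide zones at the artificial boundaries, the same form is applied to both temperature and salinity; whether and how the restoring coefficient is tapered toward the interior is not specified in this paper.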






