Talk abstract:
Simulation of Incompressible Flows Using the Lattice Boltzmann Method
Viswanathan Babu, Ford Scientific Research Lab.
The lattice Boltzmann (LB) method, which is closely related
to the lattice gas (LG) method, will be discussed in detail.
The LG method is Boolean in nature, using only bits to indicate
the presence or absence of a particle moving in a particular
direction at a particular speed. The absence of floating-point operations
gives the LG method unconditional numerical stability but restricts
it to specialized hardware that can perform the Boolean logical
operations efficiently. The Boolean character also results in
a noisy signal that must be averaged over space/time for reliable
estimates. The LB method, on the other hand, tracks the distribution
functions (or time averages) of the particles. As a result,
floating-point numbers have to be used and so the method is
not Boolean. While this makes the LB method susceptible to instabilities
due to accumulation of round-off errors, it allows the LB method
to use a variety of existing computer platforms. The signal-to-noise
ratio of the LB method is also significantly higher than that of the
LG method. Both the LB method and the LG method are highly parallel.
In fact, the LB method optimizes extremely well on current computer
platforms, as will be demonstrated in this talk. The current
version of the LB code developed at FRL runs at speeds around
1.7 Gflops (2D code) and 2.0 Gflops (3D code) on a 32-processor
Cray T3D. The 3D code shows super-linear speedup (linear being
the theoretical maximum) and runs at a speed of 33 Gflops on
a 512-processor T3D.
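
The super-linear speedup claim can be checked directly from the figures quoted for the 3D code. The following is a minimal sketch, not part of the FRL code; the function name `parallel_efficiency` and its parameters are hypothetical and chosen only for illustration.

```python
def parallel_efficiency(base_rate, base_procs, scaled_rate, scaled_procs):
    """Compare a measured speedup against ideal linear scaling.

    Hypothetical helper for illustration only. Rates may be in any
    consistent unit (Gflops here).
    Returns (measured speedup, ideal linear speedup, efficiency).
    """
    measured = scaled_rate / base_rate   # how much faster the larger run is
    ideal = scaled_procs / base_procs    # what linear scaling would predict
    return measured, ideal, measured / ideal


# Figures quoted in the abstract for the 3D code on the Cray T3D:
# 2.0 Gflops on 32 processors, 33 Gflops on 512 processors.
measured, ideal, efficiency = parallel_efficiency(2.0, 32, 33.0, 512)
print(f"measured speedup: {measured:.2f}x")
print(f"ideal linear speedup: {ideal:.0f}x")
print(f"parallel efficiency: {efficiency:.1%}")
```

An efficiency above 100% is what "super-linear" means here: going from 32 to 512 processors multiplies the processor count by 16 but the quoted rate by 16.5. The same figures correspond to roughly 62.5 Mflops per processor on 32 processors and about 64.5 Mflops per processor on 512.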
The objective of this talk is to demonstrate the potential
of this method as a viable tool for performing time-accurate
simulations of incompressible flows. To that end, two issues
will be examined in detail: accuracy and speed. The spatial
and temporal accuracy of the LB method will be established through
suitable benchmark studies. It is important to note here that
the method is formally second-order accurate in both space and
time, an accuracy that exceeds that of many commercial codes
today. The speed at which the code runs will be demonstrated
through actual production runs on the parallel Cray T3D. As
an example of the phenomenal performance of the LB code developed
at FRL, consider the following fact. Each processor on the T3D
has a read bandwidth from memory to cache that is limited to
320 Mbytes/sec, or 40 Mwords/sec since the T3D is a 64-bit
machine (8 bytes/word). At one floating-point operation per word
read, this translates to 40 Mflops. Codes that do not have cache
reuse will be hardware-limited to this speed. Our code runs
at a speed of 60 Mflops on this processor through cache reuse
and other optimizations.
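
The bandwidth argument above is simple arithmetic and can be written out as a short sketch. The function name `bandwidth_bound_mflops` and the `flops_per_word` parameter are hypothetical; the default of one floating-point operation per word read from memory is an assumption that matches the 40 Mflops figure in the abstract.

```python
def bandwidth_bound_mflops(bandwidth_mbytes_per_s, bytes_per_word, flops_per_word=1.0):
    """Estimate the flop rate of a code with no cache reuse.

    Hypothetical helper for illustration only. Assumes every operand
    comes from memory and that each word read feeds `flops_per_word`
    floating-point operations (1.0 is an assumption, not a measured value).
    """
    words_per_s = bandwidth_mbytes_per_s / bytes_per_word  # Mwords/sec
    return words_per_s * flops_per_word                    # Mflops


# Figures quoted in the abstract: 320 Mbytes/sec read bandwidth,
# 8 bytes per 64-bit word, 60 Mflops achieved per processor.
bound = bandwidth_bound_mflops(320.0, 8)
achieved = 60.0
print(f"bandwidth-limited rate: {bound:.0f} Mflops")
print(f"achieved / bound: {achieved / bound:.2f}")
```

Under these assumptions the quoted 60 Mflops is 1.5 times the no-reuse bound, which is only possible if, on average, each word fetched from memory is used for more than one floating-point operation while it sits in cache.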
This is joint work with Gary S. Strumolo.
1996-1997
Mathematics in High Performance Computing