Record Details

An Optimizing Code Generator for a Class of Lattice-Boltzmann Computations

Electronic Theses of Indian Institute of Science


Field	Value

Title	An Optimizing Code Generator for a Class of Lattice-Boltzmann Computations

Creator	Pananilath, Irshad Muhammed

Subject	Lattice-Boltzmann Computations; Computational Fluid Dynamics; Tiling; Stencil Computations; Single Instruction Multiple Data (SIMD); Parallel Computers; Parallel Processing; Loop Transformations; Lattice-Boltzmann Method (LBM); Lattice Boltzmann Method; Lattice-Boltzmann Equation; Computer Science

Description	The Lattice-Boltzmann method (LBM), a promising new particle-based simulation technique for complex and multiscale fluid flows, has seen tremendous adoption in computational fluid dynamics in recent years. Even with a state-of-the-art LBM solver such as Palabos, users still have to write their programs manually using the library-supplied primitives. We propose an automated code generator for a class of LBM computations, with the objective of achieving high performance on modern architectures. Tiling is a very important loop transformation used to improve the performance of stencil computations by exploiting locality and parallelism. In the first part of the work, we explore diamond tiling, a new tiling technique that exploits the inherent ability of most stencils to allow tile-wise concurrent start. This enables perfect load balance during execution and reduces the frequency of synchronization required. Few studies have looked at time tiling for LBM codes. We exploit a key similarity between stencils and LBM to enable polyhedral optimizations and, in turn, time tiling for LBM. Besides polyhedral transformations, we also describe a number of complementary transformations and post-processing steps necessary to obtain good parallel and SIMD performance on modern architectures. We also characterize the performance of LBM with the Roofline performance model. Experimental results for standard LBM simulations such as Lid Driven Cavity, Flow Past Cylinder, and Poiseuille Flow show that our scheme consistently outperforms Palabos, on average by 3x, while running on 16 cores of an Intel Xeon Sandy Bridge system. We also obtain a very significant improvement of 2.47x over the native production compiler on the SPECLBM benchmark.

Contributor	Bondhugula, Uday

Date	2018-03-09T06:54:29Z 2018-03-09T06:54:29Z 2018-03-09 2014

Type	Thesis

Identifier	http://hdl.handle.net/2005/3259 http://etd.ncsi.iisc.ernet.in/abstracts/4120/G26635-Abs.pdf

Language	en_US

Relation	G26635

ICAR Research Data Repository for Knowledge Management
