HPC programming techniques are widely applied in many fields. New-generation research projects often require HPC code to process large data volumes efficiently. In this context, optimising the data model and the data format is critical.
An HPC data format is designed to help CPU data prefetching. Data have to be contiguous and cache friendly, and data locality has to be preserved. The data tables also have to be aligned on vector registers according to their types, so that no loop peeling is needed before a loop.
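The alignment requirement can be illustrated with a minimal C++17 sketch. It is not the code produced by the generator: the names `kVectorAlignment`, `roundUp`, `allocateAlignedTable`, `isAligned` and `sumTable` are hypothetical, and the 32-byte alignment is an assumption that matches an AVX/AVX2 register (SSE would use 16 bytes, AVX512 64 bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Assumed register width in bytes (AVX/AVX2); illustrative value only.
constexpr std::size_t kVectorAlignment = 32;

// Round value up to the next multiple of 'multiple'.
// std::aligned_alloc expects a size that is a multiple of the alignment.
constexpr std::size_t roundUp(std::size_t value, std::size_t multiple) {
    return ((value + multiple - 1) / multiple) * multiple;
}

// Memory obtained from std::aligned_alloc is released with std::free.
struct FreeDeleter {
    void operator()(void* ptr) const noexcept { std::free(ptr); }
};
using AlignedFloatTable = std::unique_ptr<float[], FreeDeleter>;

// Allocate one contiguous table of floats whose first element is aligned
// on a vector register boundary.
AlignedFloatTable allocateAlignedTable(std::size_t nbElement) {
    const std::size_t nbBytes = roundUp(nbElement * sizeof(float), kVectorAlignment);
    return AlignedFloatTable(
        static_cast<float*>(std::aligned_alloc(kVectorAlignment, nbBytes)));
}

// Check that an address is a multiple of the requested alignment.
bool isAligned(const void* ptr, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Simple reduction over a contiguous table. Because the first element is
// already aligned, a vectorised version of this loop needs no scalar
// prologue (peeling) to reach an aligned address.
float sumTable(const float* table, std::size_t nbElement) {
    assert(isAligned(table, kVectorAlignment));
    float sum = 0.0f;
    for (std::size_t i = 0; i < nbElement; ++i) {
        sum += table[i];
    }
    return sum;
}
```

A caller would allocate with `allocateAlignedTable(n)`, fill the table, and pass `table.get()` to `sumTable`. Note that `std::aligned_alloc` is not available on every toolchain, so a real implementation may need a platform-specific allocator.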
All of these points increase the computing intensity, especially for short vectors (hundreds of elements). Consequently, developing an HPC data format is a complex, challenging and time-consuming task. Code generators have been introduced to provide flexibility and, hopefully, to reduce development time. Furthermore, they can provide a larger set of functionalities, format versioning and language compatibility.
Well-known existing data format generators such as Protocol Buffers, Avro or Thrift are able to perform versioning and serialization. Serialization provides flexibility but has a significant cost in data transfer or data copy time for big binary files. In these cases, the data tables cannot be aligned on vector registers. Since the stored data are flagged, they are not really contiguous, hence the data locality is not optimal.
Other protocols like SOAP rely heavily on XML, which has a high parsing overhead. JSON or plain text could be used; however, they are too slow for binary data. In this work, we address the problem of code generation for HPC data formats. The presented data format generator creates an HPC data format from a simple configuration. It adapts itself to the targeted computing architecture: data are automatically aligned on vector registers depending on their types for each of Intel's SSE, SSE2, SSSE3, SSE4, AVX, AVX2 and AVX512 instruction sets. The generated data format is known by both the generator and the compiler, therefore no serialization is needed and no serialization overhead is incurred. The generator language has been designed to be as simple as possible. It creates C++, Python and wrapped Python data formats. The generated code is fully human-readable, in contrast to the output of other generators.
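How an alignment can depend on both the target architecture and the element type can be sketched as follows. This is only an illustration under stated assumptions, not the generator's actual output: `kRegisterBytes`, `elementsPerRegister` and `paddedSize` are hypothetical names, and the register width is chosen from the `__AVX512F__` and `__AVX__` macros that GCC and Clang predefine when the matching instruction set is enabled, with a 16-byte SSE width assumed otherwise.

```cpp
#include <cstddef>

// Width of the widest available vector register, in bytes.
// Selected at compile time from compiler-defined macros (assumption:
// a GCC- or Clang-compatible compiler; 16 bytes is the SSE fallback).
#if defined(__AVX512F__)
constexpr std::size_t kRegisterBytes = 64;
#elif defined(__AVX__)
constexpr std::size_t kRegisterBytes = 32;
#else
constexpr std::size_t kRegisterBytes = 16;
#endif

// Number of elements of type T that fit in one vector register.
template <typename T>
constexpr std::size_t elementsPerRegister() {
    return kRegisterBytes / sizeof(T);
}

// Number of elements to reserve so that a table of nbElement values of
// type T spans a whole number of vector registers.
template <typename T>
constexpr std::size_t paddedSize(std::size_t nbElement) {
    constexpr std::size_t lanes = elementsPerRegister<T>();
    return ((nbElement + lanes - 1) / lanes) * lanes;
}

// Example of a type-dependent table declaration: the storage is aligned
// on the register width and padded to a whole number of registers.
struct alignas(kRegisterBytes) SmallFloatTable {
    float data[paddedSize<float>(100)];
};
```

With this scheme the same source yields, for example, 4 floats per register on an SSE build and 8 on an AVX build, so the padding and alignment follow the compilation flags without any change to the user configuration.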
Up-to-date documentation of the generated code is also provided. This simplicity potentially makes the generator easily applicable in many research domains. Moreover, a fast Python interface can be generated on demand. This combines the speed of the HPC data format with the versatility of Python, with minimal wrapper impact on performance. Some programs and libraries such as SWIG or pybind11 provide generated Python wrappers, but they are not as optimized as users need.
All the following are enabled for C++ and Python through the wrapped Python data format generator:
Pierre Aubert and Jean Jacquemier
High Performance Computing data format generator - 2018 Release