HPC programming techniques are widely applied in many fields. New-generation research projects often require HPC code for the efficient processing of large data volumes. In this scope, the optimisation of the data model and data format is critical. An HPC data format is designed to help CPU data pre-fetching. Data have to be contiguous and cache friendly, and data locality has to be preserved. The data tables also have to be aligned on vector registers with respect to their types, so that no peel loop is needed before a vectorized loop.
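The alignment and contiguity requirements above can be sketched in C++17 as follows. This is only an illustration under stated assumptions, not the generator's actual output: the 32-byte register width and the names `kVectorBytes`, `paddedBytes`, `FloatTable`, `makeFloatTable`, `isAligned` and `scaleAdd` are all hypothetical. It also relies on `std::aligned_alloc`, which some standard libraries (notably MSVC's) do not provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Assumed vector register width in bytes (32 = AVX/AVX2); hypothetical constant.
constexpr std::size_t kVectorBytes = 32;

// Round a byte count up to the next multiple of the register width,
// as std::aligned_alloc requires the size to be a multiple of the alignment.
constexpr std::size_t paddedBytes(std::size_t bytes) {
    return (bytes + kVectorBytes - 1) / kVectorBytes * kVectorBytes;
}

// Deleter releasing memory obtained from std::aligned_alloc.
struct AlignedFree {
    void operator()(void* p) const noexcept { std::free(p); }
};

using FloatTable = std::unique_ptr<float[], AlignedFree>;

// Allocate one contiguous table of nbElement floats whose first element
// sits on a register boundary. Returns an empty pointer on failure.
FloatTable makeFloatTable(std::size_t nbElement) {
    void* raw = std::aligned_alloc(kVectorBytes,
                                   paddedBytes(nbElement * sizeof(float)));
    return FloatTable(static_cast<float*>(raw));
}

// True when the address is a multiple of the register width.
bool isAligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kVectorBytes == 0;
}

// Plain loop over two contiguous tables. When both tables start on a
// register boundary, a vectorized version can begin on full vectors
// without a scalar peel loop for the leading elements.
void scaleAdd(float* y, const float* x, float a, std::size_t nbElement) {
    for (std::size_t i = 0; i < nbElement; ++i) {
        y[i] += a * x[i];
    }
}
```

Calling `makeFloatTable(300)` and then `isAligned` on the returned pointer checks the alignment property for a short vector of a few hundred elements. A compiler only benefits from the alignment if it is told about it, for example through an alignment hint or aligned intrinsics, which this sketch leaves out.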
All of these points increase the computing intensity, especially for short vectors (hundreds of elements). Consequently, developing an HPC data format is a complex and challenging task that results in a time-consuming process. Code generators have been introduced to provide flexibility and, hopefully, to reduce development time. Furthermore, they can provide a larger set of functionalities, format versioning and language compatibility. Well-known existing data-format generators such as Protocol Buffers, Avro or Thrift are able to perform versioning and serialization. Serialization provides flexibility but has a significant cost in data transfer or data copy time for big binary files. In these cases, the data tables cannot be aligned on vector registers.
Since the stored data are flagged, they are not really contiguous; hence data locality is not optimal. Other protocols like SOAP rely heavily on XML, which has a high parsing overhead. JSON or plain text could be used; however, they are too slow for binary data. In this work, we address the problem of code generation for HPC data formats. The presented data format generator creates an HPC data format from a simple configuration. It adapts itself to the targeted computing architecture: data are automatically aligned on vector registers depending on their types for each of Intel's SIMD instruction sets (SSE, SSE2, SSSE3, SSE4, AVX, AVX2 and AVX512). The generated data format is known by both the generator and the compiler; therefore no serialization is needed, and there is no serialization overhead.
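As an illustration of architecture- and type-dependent alignment, the following C++ sketch picks a register width at compile time and derives the padding for each element type. It is a minimal sketch, not the generator's real output: the names `kRegisterBytes`, `lanes`, `paddedCount` and `Column` are hypothetical, and the 16-byte fallback assumes an SSE-class target. The macros `__AVX512F__` and `__AVX__` are the ones commonly predefined by compilers when the corresponding instruction set is enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Register width in bytes, selected from compiler-predefined macros.
#if defined(__AVX512F__)
constexpr std::size_t kRegisterBytes = 64;  // AVX-512: 512-bit registers
#elif defined(__AVX__)
constexpr std::size_t kRegisterBytes = 32;  // AVX / AVX2: 256-bit registers
#else
constexpr std::size_t kRegisterBytes = 16;  // assumed SSE-class fallback: 128-bit
#endif

// Number of elements of type T that fit in one vector register.
template <typename T>
constexpr std::size_t lanes() {
    return kRegisterBytes / sizeof(T);
}

// Smallest multiple of lanes<T>() able to hold nbElement values, so that
// the table ends on a full register as well as starting on one.
template <typename T>
constexpr std::size_t paddedCount(std::size_t nbElement) {
    return (nbElement + lanes<T>() - 1) / lanes<T>() * lanes<T>();
}

// Hypothetical fixed-size column: storage is contiguous, starts on a
// register boundary, and is padded to a whole number of registers.
template <typename T, std::size_t N>
struct alignas(kRegisterBytes) Column {
    T data[paddedCount<T>(N)];
};
```

Because the layout is fixed at compile time, both the producer and the consumer of a `Column<float, 100>` agree on its size and alignment without any tagging or serialization step. The padding depends on the element type: a register holds twice as many `float` as `double` values.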
The generator language has been designed to be as simple as possible. It creates C++, Python and wrapped Python data formats. The generated code is fully human-readable, in contrast to that of other generators. Up-to-date documentation of the generated code is also produced. This simplicity potentially makes the generator easily applicable in many research domains. Moreover, a fast Python interface can be generated on user demand. This combines the speed of the HPC data format with the versatility of Python, with minimal wrapper impact on performance. Some programs and libraries, like SWIG or pybind11, provide generated Python wrappers, but they are not as optimized as users need.
All of the following are enabled for C++ and Python through the wrapped Python data format generator:
Pierre Aubert and Jean Jacquemier, LAPP/CNRS
High Performance Computing data format generator - 2019 Release