As the title says, this MR removes the CUDA layout.
Layouts should not be backend-specific, and they should not carry information other than the data layout itself. If destination-specific information needs to be passed to the DMA, it should instead be embedded in the copy operator arguments or in the DMA data structure.
This greatly simplifies writing higher-level blocks that involve DMAs. A block that makes several DMA requests has to create several layouts. With backend-specific layouts, each DMA would therefore need to be associated with source and destination layout constructors; for instance, copying to CUDA devices would imply a CUDA layout. A higher-level abstraction that instantiates layouts, such as a deepcopy mapper, would then need layout creation operators passed as arguments in order to perform DMA requests on different pieces of data.
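To illustrate the intended direction, here is a minimal sketch in C. It is not this project's API: `struct layout`, `struct dma`, `struct device_copy_args`, `device_copy`, `dma_request` and `copy_pieces` are all hypothetical names, and the "device" copy is a plain `memcpy` standing in for a real backend call. The point is only that the layout describes data and nothing else, while backend details travel through the DMA's operator argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend-agnostic layout: it only describes the data. */
struct layout {
    void *ptr;        /* start of the data */
    size_t elem_size; /* size of one element in bytes */
    size_t num_elems; /* number of elements */
};

/* Hypothetical copy operator: backend details arrive through op_arg. */
typedef int (*dma_copy_fn)(struct layout *dst, const struct layout *src,
                           void *op_arg);

/* Hypothetical DMA: owns the operator and its backend-specific argument. */
struct dma {
    dma_copy_fn copy;
    void *op_arg;
};

/* Hypothetical destination-specific data (assumed fields). */
struct device_copy_args {
    int device_id;   /* which device the copy targets */
    int copies_done; /* bookkeeping for this sketch only */
};

static size_t layout_bytes(const struct layout *l)
{
    return l->elem_size * l->num_elems;
}

/* Stand-in for a device copy: memcpy on the host, no real backend call. */
static int device_copy(struct layout *dst, const struct layout *src,
                       void *op_arg)
{
    struct device_copy_args *args = op_arg;

    if (layout_bytes(dst) < layout_bytes(src))
        return -1; /* destination too small */
    memcpy(dst->ptr, src->ptr, layout_bytes(src));
    args->copies_done++;
    return 0;
}

/* Issue one request; the caller never touches backend details. */
static int dma_request(struct dma *dma, struct layout *dst,
                       const struct layout *src)
{
    assert(dma != NULL && dma->copy != NULL);
    return dma->copy(dst, src, dma->op_arg);
}

/* Higher-level block: several requests, no layout constructor argument. */
static int copy_pieces(struct dma *dma, struct layout *dst,
                       const struct layout *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int err = dma_request(dma, &dst[i], &src[i]);
        if (err != 0)
            return err;
    }
    return 0;
}
```

In this sketch `copy_pieces` plays the role of a higher-level block such as a deepcopy mapper: it takes only a DMA and plain layouts, so switching backend means passing a different `struct dma`, not a different layout constructor.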
The MR also removes the script for CUDA device code compilation. That removal is kept in a separate commit to ease possible future reintegration.