The ATLAS (Automatically Tuned Linear Algebra Software) project is an ongoing research effort focusing on applying empirical techniques in order to provide portable performance. At present, it provides C and Fortran77 interfaces to a portably efficient BLAS implementation, as well as a few routines from LAPACK.

Using atlas

atlas provides libraries. To embed atlas in your application, all you need to do is link your executable against the atlas library of your choice (libsatlas is the single-threaded version, libtatlas the multithreaded one). Be aware that the deployed libraries are largely compiled for a generic platform and are therefore not optimized for a specific hardware architecture. For maximum performance you need to compile the libraries yourself with the appropriate flags. For example, hyperthreading is usually enabled on maxwell nodes, and atlas cannot detect the physical cores, so you need to supply the appropriate information for the particular hardware in the atlas configuration. See http://math-atlas.sourceforge.net/atlas_install/ for details.
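Before building atlas yourself, it can help to know how many physical cores a node has, as opposed to hyperthreaded logical CPUs. The following is a minimal sketch, not part of atlas, that counts both from /proc/cpuinfo-style text. It assumes a Linux node whose /proc/cpuinfo exposes the "physical id" and "core id" fields (typical on x86); the function name count_cores and the SAMPLE text are hypothetical and only serve as illustration. How to pass the resulting number to the atlas configuration is described in the install guide linked above.

```python
import os


def count_cores(cpuinfo_text):
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text.

    Logical CPUs are the "processor" entries. Physical cores are the
    distinct (physical id, core id) pairs. If those fields are missing,
    each logical CPU is counted as its own physical core.
    """
    logical = 0
    physical = set()
    # Entries for individual logical CPUs are separated by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields and "core id" in fields:
            physical.add((fields["physical id"], fields["core id"]))
        else:
            # No topology information: assume no hyperthreading siblings.
            physical.add(("cpu", fields["processor"]))
    return logical, len(physical)


# Hypothetical excerpt: one socket, two cores, hyperthreading enabled.
SAMPLE = """\
processor\t: 0
physical id\t: 0
core id\t: 0

processor\t: 1
physical id\t: 0
core id\t: 1

processor\t: 2
physical id\t: 0
core id\t: 0

processor\t: 3
physical id\t: 0
core id\t: 1
"""

if __name__ == "__main__":
    print("sample (logical, physical):", count_cores(SAMPLE))
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as handle:
            print("this node (logical, physical):", count_cores(handle.read()))
```

If the two numbers differ on a node, hyperthreading is most likely enabled there, and the physical count is the one to consider when configuring a threaded atlas build.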