Loading Model Weights with fastsafetensors


The fastsafetensors library loads model weights directly into GPU memory by leveraging GPU Direct Storage. See https://github.com/foundation-model-stack/fastsafetensors for more details. To enable this feature, set the environment variable USE_FASTSAFETENSOR to true.
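As a minimal sketch, the variable can be set from Python before the model is loaded. The helper name `enable_fastsafetensors` is hypothetical and only used for illustration. The sketch also assumes the variable is read when weights are loaded, so it is set before any engine or model object is created.

```python
import os


def enable_fastsafetensors(env=None):
    """Set USE_FASTSAFETENSOR=true in the given environment mapping.

    Hypothetical helper for illustration; it only writes the variable
    named in this document and does not call any library API.
    """
    target = os.environ if env is None else env
    target["USE_FASTSAFETENSOR"] = "true"
    return target


# Assumption: this runs before the model is loaded, so that the loader
# sees the variable when it decides how to read the weights.
enable_fastsafetensors()
print(os.environ["USE_FASTSAFETENSOR"])
```

From a shell, running `export USE_FASTSAFETENSOR=true` before starting the server process has the same effect.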