Add Tesla T4 compatibility for the 0.5B dense model (fp32 fallback + SDPA fallback)

This issue is specifically about the `0.5B` dense model. While the MoE variants may still require bf16-capable hardware, the `0.5B` dense model is small enough that supporting Tesla T4-class GPUs seems both practical and useful. T4 is widely available through platforms such as Colab and Kaggle, so enabling stable inference there would improve accessibility for many researchers and developers.

## Proposed approach

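### Runtime load dtype selection

Load in bf16 when the GPU supports it natively, and fall back to fp16 otherwise:

```py
# Prefer bf16 only when the hardware supports it natively (no emulation);
# Turing GPUs such as the T4 take the fp16 branch.
self.llm_dtype = (
    torch.bfloat16
    if torch.cuda.is_bf16_supported(including_emulation=False)
    else torch.float16
)
```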

### fp32 promotion during generation on fp16 hardware

On T4, running the audio pipeline in fp16 produces NaNs. Upcasting the full model to fp32 before generation avoids this:

```py
# Only models loaded in fp16 are promoted; bf16 models are left untouched.
if next(self.model.parameters()).dtype == torch.float16:
    self.float()
```
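
As a standalone illustration of the promotion step, here is a minimal sketch that can be run on CPU. The helper name `promote_fp16_to_fp32` is hypothetical and not taken from the repository; it only mirrors the dtype check shown above on a generic `nn.Module`.

```python
import torch
from torch import nn


def promote_fp16_to_fp32(module: nn.Module) -> nn.Module:
    """Upcast a module to fp32 only if its parameters are fp16.

    Hypothetical helper for illustration; the actual PR may structure this differently.
    """
    first_param = next(module.parameters(), None)
    if first_param is not None and first_param.dtype == torch.float16:
        module.float()  # converts floating-point parameters and buffers in place
    return module


# An fp16 module is promoted; a bf16 module is left as-is.
fp16_layer = promote_fp16_to_fp32(nn.Linear(4, 2).half())
bf16_layer = promote_fp16_to_fp32(nn.Linear(4, 2).to(torch.bfloat16))
print(fp16_layer.weight.dtype, bf16_layer.weight.dtype)
```

The trade-off is memory and speed: at four bytes per parameter, a model with roughly 0.5B parameters needs on the order of 2 GB for weights in fp32, which is well within a T4's 16 GB.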

### flash-attn fallback

Fall back to PyTorch's native SDPA (`torch.nn.functional.scaled_dot_product_attention`) when flash-attn is not installed or does not support the current hardware.
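
One possible way to make that decision is sketched below. It assumes the model is loaded through Hugging Face Transformers, where `attn_implementation` accepts `"flash_attention_2"` and `"sdpa"`; whether this repository wires attention that way is an assumption, and `pick_attn_implementation` is a hypothetical helper name. The capability threshold reflects that FlashAttention-2 targets Ampere or newer GPUs (compute capability 8.0+), while the T4 is Turing (7.5).

```python
import importlib.util
from typing import Optional, Tuple


def pick_attn_implementation(
    compute_capability: Tuple[int, int],
    flash_attn_installed: Optional[bool] = None,
) -> str:
    """Return "flash_attention_2" when usable, otherwise "sdpa".

    Hypothetical helper: the return values follow the Transformers
    `attn_implementation` naming, which is assumed here.
    """
    if flash_attn_installed is None:
        # Detect the flash-attn package without importing it.
        flash_attn_installed = importlib.util.find_spec("flash_attn") is not None
    major, _minor = compute_capability
    # FlashAttention-2 kernels require Ampere (sm_80) or newer.
    if flash_attn_installed and major >= 8:
        return "flash_attention_2"
    return "sdpa"


# A T4 reports compute capability (7, 5), so SDPA is chosen even if flash-attn is installed.
print(pick_attn_implementation((7, 5), flash_attn_installed=True))
```

At runtime the capability tuple could come from `torch.cuda.get_device_capability()`, and the result would be passed wherever the attention backend is configured.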


## Validation

I tested this approach on a Kaggle T4 with all the examples from `cookbook.ipynb` plus `test.py`, and it appears to work. The notebook is available [here](https://www.kaggle.com/code/akshatnayak/test-ming-tts-gradio).

The draft PR is [here](https://github.com/inclusionAI/Ming-omni-tts/pull/13).
