diff --git a/docs/source/en/model_doc/quasarv4.md b/docs/source/en/model_doc/quasarv4.md
new file mode 100644
index 00000000000..ccd29b641a5
--- /dev/null
+++ b/docs/source/en/model_doc/quasarv4.md
@@ -0,0 +1,68 @@
+
+
+# QuasarV4
+
+## Overview
+
+The QuasarV4 model is a transformer-based architecture with an innovative token temperature mechanism for natural language processing tasks. It was developed by SILX AI Labs and the model weights are available at [silx-ai/QuasarV4-600M-Transformer](https://huggingface.co/silx-ai/QuasarV4-600M-Transformer).
+
+The transformer-based implementation is compatible with the Hugging Face Transformers library and can be used for causal language modeling tasks.
+
+### Model Architecture
+
+The QuasarV4 transformer-based model extends the standard transformer architecture with several innovative components:
+
+1. **Token Temperature Mechanism**: A new approach that dynamically adjusts the importance of tokens based on their contextual significance. This mechanism allows the model to focus more computational resources on critical tokens and less on filler words, improving generation quality.
+
+2. **Temperature Aggregation Plus**: A multi-layer network that processes token temperatures across the sequence to create a global temperature focus, enabling better long-range dependencies.
+
+3. **Cross-token Temperature Attention**: Applies attention mechanisms specifically to temperature values, allowing tokens to influence each other's importance based on semantic relationships.
+
+4. **DenseNet-style Residual Connections**: Incorporates connections from earlier layers (at 1/3 and 2/3 depth) directly to the output, creating richer gradient flows and enabling better information preservation through the network depth.
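
The DenseNet-style residual connections in item 4 can be illustrated with a toy sketch. This is not the QuasarV4 implementation: the helper names `skip_layer_indices`, `forward_with_skips`, and `add_one` are hypothetical, the layers are stand-in functions, and combining the tapped states by summation is an assumption, since the description only says that layers at 1/3 and 2/3 depth connect directly to the output.

```python
def skip_layer_indices(num_layers):
    """Return 1-based layer indices at roughly 1/3 and 2/3 of the depth."""
    return (num_layers // 3, (2 * num_layers) // 3)


def forward_with_skips(hidden, layers):
    """Run `layers` in order and add the tapped outputs to the final output.

    `hidden` is a list of per-token vectors (lists of floats). Summation is
    an assumption; the real model may combine the tapped states differently.
    """
    taps = set(skip_layer_indices(len(layers)))
    saved = []
    for index, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        if index in taps:
            # Keep the output of the 1/3- and 2/3-depth layers for later.
            saved.append(hidden)
    for earlier in saved:
        # Element-wise addition of each saved state to the final state.
        hidden = [[a + b for a, b in zip(vec, prev)]
                  for vec, prev in zip(hidden, earlier)]
    return hidden


def add_one(hidden):
    """Toy stand-in for a transformer layer: adds 1.0 to every feature."""
    return [[x + 1.0 for x in vec] for vec in hidden]


# Six toy layers: the outputs of layers 2 and 4 are added to layer 6's output.
result = forward_with_skips([[0.0, 0.0]], [add_one] * 6)
print(result)  # [[12.0, 12.0]]
```

With six layers the taps fall at layers 2 and 4, so the final state (6.0 per feature) picks up 2.0 and 4.0 from the earlier layers. In a real network the taps would carry hidden states rather than counters, giving the output a shorter gradient path to the early layers.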
+
+The architecture includes:
+- Multi-headed attention with separate query, key, and value projections
+- MLP blocks with SiLU activation functions
+- Token temperature layers that dynamically adjust the importance of different tokens
+- Global temperature scaling for context-aware processing
+
+
+
+### Model Details
+
+- **Model Type**: Causal Language Model
+- **Implementation**: Transformer-based architecture with token temperature mechanisms
+- **Language(s)**: English
+- **License**: Apache 2.0
+- **Resources for more information**:
+  - [Model Card](https://huggingface.co/silx-ai/QuasarV4-600M-Transformer)
+
+## Usage
+
+The QuasarV4 model can be loaded using the `AutoModelForCausalLM` class:
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained("silx-ai/QuasarV4-600M-Transformer")
+
+# Load model
+model = AutoModelForCausalLM.from_pretrained("silx-ai/QuasarV4-600M-Transformer")
+
+# Generate text
+input_text = "The capital of France is"
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+outputs = model.generate(input_ids, max_length=50)
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(generated_text)