RWKV-Raven-14B

The description below comes from the original repository:

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (that is, training is parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Model Details

Details of the architecture can be found in the blog post mentioned above and in the Hugging Face blog post on the integration.

Usage

Convert the raw weights to the HF format

Use the convert_rwkv_checkpoint_to_hf.py script, specifying the repo_id of the original weights, the checkpoint filename, and the output directory. Optionally, you can push the converted model directly to the Hub by passing the --push_to_hub flag together with the --model_name argument, which specifies where to push the converted weights.

python convert_rwkv_checkpoint_to_hf.py --repo_id RAW_HUB_REPO --checkpoint_file RAW_FILE --output_dir OUTPUT_DIR --push_to_hub --model_name dummy_user/converted-rwkv
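If you run the conversion from a Python script rather than a shell, the same command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the helper name build_convert_command is hypothetical and not part of the conversion script, and it assumes the script sits in the current working directory.

```python
from typing import List, Optional


def build_convert_command(
    repo_id: str,
    checkpoint_file: str,
    output_dir: str,
    model_name: Optional[str] = None,
) -> List[str]:
    """Assemble the conversion command as an argument list (hypothetical helper).

    Only the flags documented above are used. When model_name is given,
    --push_to_hub and --model_name are appended so the converted weights
    are pushed to that Hub location.
    """
    cmd = [
        "python",
        "convert_rwkv_checkpoint_to_hf.py",
        "--repo_id", repo_id,
        "--checkpoint_file", checkpoint_file,
        "--output_dir", output_dir,
    ]
    if model_name is not None:
        cmd += ["--push_to_hub", "--model_name", model_name]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the conversion.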

Generate text

You can use the AutoModelForCausalLM and AutoTokenizer classes to generate text from the model. The "Raven" models need to be prompted in a specific way; see the integration blog post to learn more.
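A minimal generation sketch follows. It makes two assumptions that this card does not confirm: that the converted weights are available under the Hub identifier RWKV/rwkv-raven-14b, and that the Raven prompt uses the "### Instruction: / ### Response:" template described in the integration blog post. The helper names build_raven_prompt and generate_reply are illustrative only.

```python
def build_raven_prompt(instruction: str) -> str:
    """Wrap an instruction in the instruction/response template.

    Assumption: this template follows the integration blog post;
    check that post for the exact format your checkpoint expects.
    """
    return f"### Instruction: {instruction}\n### Response:"


def generate_reply(
    instruction: str,
    model_id: str = "RWKV/rwkv-raven-14b",  # assumed Hub id, not stated in this card
    max_new_tokens: int = 100,
) -> str:
    """Load the model and tokenizer, then generate a continuation of the prompt."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_raven_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    # The decoded text contains the prompt followed by the generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_reply("Tell me about ravens") downloads the checkpoint on first use. A 14B-parameter model needs substantial memory, so you may want to try the same code with a smaller RWKV checkpoint first.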

Citation

If you use this model, please consider citing the original work from the original repository.

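Disclaimer

The RWKV-Raven-14B model comes from a third party. The Baidu AI Cloud Qianfan large model platform does not guarantee its compliance. Please consider carefully before using it, make sure your use is lawful and compliant, and follow the third party's requirements. For details, see the model's open-source license (Apache 2.0) and the information shown on the model's open-source page. If you find any problem with the model, datasets, files, or related materials, please contact us promptly so that we can address it.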