Arcenal-Small-8b: Uncensored Reasoning Model Based on DeepSeek-R1 Distillation

July 19, 2025

Abstract

We are releasing Arcenal-Small-8b, an 8B-parameter reasoning model built on the Qwen3 architecture. Starting from DeepSeek-R1-0528-Qwen3-8B (a distilled version of DeepSeek-R1), we fine-tuned the model with LoRA and GRPO, and integrated the OpenChina dataset to remove censorship constraints. The result is a reasoning model that delivers strong performance without artificial limitations.

Benchmark Performance

The base DeepSeek-R1-0528-Qwen3-8B model already exhibited exceptional reasoning capabilities; our challenge was to remove its censorship constraints while preserving those strengths. Our fine-tuning process not only uncensored the model but also maintained, and in several cases improved, its performance on key benchmarks.

Arcenal-Small-8b demonstrates superior performance across critical reasoning benchmarks, outperforming both Mistral Magistral Small and Medium models in key areas:

| Model | AIME 2024 pass@1 | MATH-500 pass@1 | LiveCodeBench pass@1 |
|---|---|---|---|
| Arcenal-Small-8b | 86% | 94% | 60% |
| Mistral Magistral Small | 71% | 96% | 56% |
| Mistral Magistral Medium | 74% | 94% | 59% |

The model achieves particularly impressive results on AIME 2024 with 86% pass@1, significantly outperforming both Mistral variants. It remains competitive on MATH-500 (94%) and edges ahead on LiveCodeBench with 60% pass@1, demonstrating well-rounded capability across mathematical reasoning and code generation.
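All scores are pass@1: the probability that a single sampled completion solves the task. When a benchmark harness draws n samples per problem, the standard unbiased estimator from Chen et al. (2021) is simple to compute. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: samples drawn per problem
    c: samples that solved the problem
    k: attempt budget (k=1 for the table above)
    """
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the plain pass rate c / n.
assert abs(pass_at_k(16, 14, 1) - 14 / 16) < 1e-12
```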

[Figure: Arcenal-Small-8b benchmark results]

Architecture and Training

Arcenal-Small-8b builds upon the strong foundation of DeepSeek-R1-0528-Qwen3-8B, a distilled version of the groundbreaking DeepSeek-R1-0528 model. The base model already demonstrated exceptional performance across reasoning benchmarks, but suffered from a critical limitation: systematic censorship aligned with Chinese government policies. Our approach addressed this while preserving its strengths:

  • Base Model: DeepSeek-R1-0528-Qwen3-8B distillation
  • Fine-tuning: LoRA (Low-Rank Adaptation) for efficient parameter updates
  • Optimization: GRPO (Group Relative Policy Optimization) for enhanced reasoning
  • Uncensoring: OpenChina dataset integration to remove artificial constraints

This combination preserves the exceptional reasoning capabilities of the original model while ensuring unbiased, uncensored responses across all domains.
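
For readers who want a concrete picture of this recipe, the sketch below shows how LoRA adapters and GRPO fit together using the Hugging Face peft and trl libraries. The dataset, reward function, and hyperparameters are illustrative placeholders, not the exact configuration used to train Arcenal-Small-8b:

```python
# Minimal LoRA + GRPO fine-tuning sketch with Hugging Face peft and trl.
# Dataset, reward, and hyperparameters are placeholders for illustration.
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

BASE = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"

# Low-rank adapters keep the trainable-parameter footprint small on an 8B model.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

def reward_final_answer(completions, **kwargs):
    # Placeholder reward: 1.0 if the completion contains an explicit answer
    # marker. A real setup would verify answers programmatically (e.g.,
    # checking math results or running unit tests).
    return [1.0 if "Answer:" in c else 0.0 for c in completions]

args = GRPOConfig(
    output_dir="arcenal-small-8b-grpo",
    num_generations=8,          # group size used for relative advantages
    max_completion_length=1024,  # leave room for reasoning traces
    learning_rate=1e-5,
)

trainer = GRPOTrainer(
    model=BASE,
    reward_funcs=reward_final_answer,
    args=args,
    train_dataset=load_dataset("trl-lib/tldr", split="train"),  # placeholder
    peft_config=peft_config,
)
trainer.train()
```

GRPO scores several sampled completions per prompt against the reward function and computes advantages relative to the group mean, which avoids training a separate value model and suits reward signals built around verifiable reasoning outcomes.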

Conclusion

Arcenal-Small-8b represents a new standard in efficient reasoning models, combining state-of-the-art performance with complete freedom from censorship. At just 8 billion parameters, it offers an optimal balance between capability and computational efficiency, making it suitable for both research and production deployments.

The model is now available for immediate use, empowering developers and researchers with unconstrained AI reasoning capabilities. You can try Arcenal-Small-8b right now at chat.arcenal.org by enabling reasoning mode. Our chat interface is completely free and offers a seamless way to experience the model's uncensored reasoning capabilities firsthand.

We plan to open-source Arcenal-Small-8b soon, making it freely available to the community for further research and development.
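
Once the weights are published, running the model locally should look like any other Qwen3-based checkpoint. The sketch below uses a hypothetical Hub id and assumes the release keeps the base model's DeepSeek-R1 chat template, which emits its chain of thought inside <think>...</think> tags before the final answer:

```python
# Hypothetical local-inference sketch; the model id is a placeholder until
# the weights are actually released.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcenal/Arcenal-Small-8b"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# R1-style distills interleave a <think>...</think> reasoning trace with the
# final answer, so leave generous room for new tokens.
out = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```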

Interested in deploying uncensored AI models or building on our technology? Reach out to discuss how Arcenal can accelerate your AI initiatives.
