
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report

Zhuoran Yang, Ed Li, Jianliang He, Aman Priyanshu, Baturay Saglam, Paul Kassianik, Sajana Weerawardhena, Anu Vellore, Blaine Nelson, Neusha Javidnia, Arthur Goldblatt, Fraser Burch, Avi Zohary, Assaf Eisenman, Mahdi Sabbaghi, Supriti Vijay, Rahim Dharssi, Dhruv Kedia, Kojin Oshiba, Yaron Singer, Amin Karbasi

Published: January 28, 2026

Abstract

We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks shows performance on cybersecurity tasks competitive with significantly larger models while preserving strong general capabilities. The model generalizes effectively on multi-hop reasoning tasks and exhibits strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can excel on specialized tasks without sacrificing broad general capability. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.

Keywords

supervised fine-tuning, reinforcement learning from verifiable rewards, cybersecurity analysis, multi-hop reasoning, safety performance
