Norwegian Legal Q&A Dataset
7,603 expert-curated question-answer pairs covering Norwegian legislation, case law, and regulatory compliance. Used to fine-tune dobetter-norge-v2.
Dataset Overview
This dataset contains 7,603 question-answer pairs specifically designed for training language models on Norwegian legal reasoning. Each pair was derived from authoritative legal sources and validated against established legal interpretations.
Dataset Statistics
| Metric | Value |
|---|---|
| Total Q&A Pairs | 7,603 |
| Source Documents | 2,847 |
| Training Examples (augmented) | 31,842 |
| Average Question Length | 47 tokens |
| Average Answer Length | 186 tokens |
| Legal Domains Covered | 12 |
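
Two ratios follow from the figures in the table and give a rough sense of dataset density. The sketch below assumes that every augmented training example is derived from one of the 7,603 original pairs, which the table implies but does not state. The variable names are illustrative.

```python
# Figures copied from the statistics table above.
total_pairs = 7_603
source_documents = 2_847
augmented_examples = 31_842

# Average number of Q&A pairs drawn from each source document.
pairs_per_document = total_pairs / source_documents

# Average number of training examples per original pair,
# assuming all augmented examples derive from the original pairs.
augmentation_factor = augmented_examples / total_pairs

print(f"{pairs_per_document:.2f} pairs per source document")
print(f"{augmentation_factor:.2f} training examples per original pair")
```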
Domain Coverage
- Contract Law (Avtaleloven)
- Employment Law (Arbeidsmiljøloven)
- Company Law (Aksjeloven)
- Tax Law (Skatteloven)
- Data Protection (Personopplysningsloven / GDPR)
- Consumer Protection (Forbrukerkjøpsloven)
- Criminal Law (Straffeloven)
- Administrative Law (Forvaltningsloven)
- Environmental Law (Forurensningsloven)
- Planning and Building (Plan- og bygningsloven)
- Immigration Law (Utlendingsloven)
- Intellectual Property (Åndsverkloven)
Data Format
Records are stored in JSON Lines format, one JSON object per line. Each record has the following structure (pretty-printed here for readability):
{
  "id": "qa-0001",
  "question": "Hva er vilkårene for...",
  "answer": "I henhold til § 36...",
  "source_doc": "LOV-2023-06-16-40",
  "domain": "contract_law",
  "difficulty": "intermediate"
}
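
As a minimal sketch of how records in this format might be read, the Python below parses JSON Lines input and checks that each record carries the six fields shown above. The helper names `parse_records` and `load_dataset` are hypothetical and are not part of any published tooling for this dataset; the sample line reuses the example record.

```python
import json
from collections import Counter

# Field names taken from the example record above.
REQUIRED_FIELDS = ("id", "question", "answer", "source_doc", "domain", "difficulty")


def parse_records(lines):
    """Yield one dict per non-blank JSON Lines entry, checking required fields."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        yield record


def load_dataset(path):
    """Read a JSON Lines file (hypothetical helper; the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_records(handle))


# Demonstration on the example record, written as a single line.
sample = [
    '{"id": "qa-0001", "question": "Hva er vilkårene for...", '
    '"answer": "I henhold til § 36...", "source_doc": "LOV-2023-06-16-40", '
    '"domain": "contract_law", "difficulty": "intermediate"}',
]
records = list(parse_records(sample))
domain_counts = Counter(record["domain"] for record in records)
```

Opening the file with `encoding="utf-8"` matters here because the questions and answers contain Norwegian characters such as `å` and `§`. Counting by `domain` is one quick way to check how the pairs are distributed across the twelve legal domains.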