arXiv preprint

Lesai Legal: A Multilingual Foundation Model for Latin American Legal Systems

Lesai Group Investigation Los Angeles
December 2025

Abstract

We introduce Lesai Legal, a family of proprietary multilingual foundation models designed for legal applications in Latin American jurisdictions. Our models achieve state-of-the-art performance on legal benchmarks in Spanish, Portuguese, and English. We present evaluations on legal document analysis, contract review, and regulatory compliance tasks.

1. Introduction

The legal industry in Latin America presents a unique challenge for natural language processing systems. With over 650 million Spanish and Portuguese speakers across the region, there is a critical need for AI systems that can understand and process legal documents in these languages with the same accuracy and reliability that English-focused models achieve on English texts.

Latin American legal systems share common civil law traditions but vary significantly in terminology, procedural requirements, and regulatory frameworks from one jurisdiction to the next. This diversity, combined with the scarcity of high-quality multilingual legal corpora, has historically limited the development of effective AI tools for legal professionals in the region.

In this paper, we present Lesai Legal, a family of foundation models designed to address these challenges. Our models are trained on a carefully curated corpus of over 50 million legal documents spanning 19 Latin American jurisdictions, with particular emphasis on maintaining consistent performance across Spanish, Portuguese, and English.

2. Methodology

Our approach combines several key innovations in multilingual model training:

**2.1 Data Collection and Curation**

We assembled a comprehensive legal corpus from publicly available court decisions, legislation, regulatory filings, and legal scholarship across Latin America. Our data pipeline includes rigorous deduplication, quality filtering, and jurisdiction tagging to ensure balanced representation.

**2.2 Multilingual Pre-training**

Unlike traditional approaches that rely on machine translation, our models are pre-trained directly on native legal texts in each language. This preserves the nuances of legal terminology and ensures that the models understand jurisdiction-specific concepts.

**2.3 Domain-Specific Fine-tuning**

We developed specialized fine-tuning datasets for key legal tasks including contract analysis, regulatory compliance checking, and legal research assistance. These datasets were created in collaboration with practicing attorneys across multiple jurisdictions.
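The paper does not describe how the curation steps are implemented, so the following is only a minimal sketch of what deduplication, quality filtering, and jurisdiction tagging could look like. The `LegalDoc` record, the `curate` and `jurisdiction_counts` helpers, the exact-hash deduplication, and the `min_words` threshold are all assumptions made for illustration and are not taken from the Lesai Legal pipeline.

```python
import hashlib
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LegalDoc:
    """Hypothetical record for one document in the corpus."""
    text: str
    jurisdiction: str  # assumed tag format, e.g. "MX" or "BR"
    language: str      # "es", "pt", or "en"


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def curate(docs, min_words=5):
    """Drop very short documents and exact duplicates (after normalization)."""
    seen = set()
    kept = []
    for doc in docs:
        norm = normalize(doc.text)
        if len(norm.split()) < min_words:
            continue  # quality filter: too short to be useful (assumed threshold)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate of a document that was already kept
        seen.add(digest)
        kept.append(doc)
    return kept


def jurisdiction_counts(docs):
    """Count documents per jurisdiction tag to check how balanced the corpus is."""
    return Counter(doc.jurisdiction for doc in docs)
```

Exact hashing only catches copies that are identical after normalization. A production pipeline would more likely use near-duplicate detection, but the source does not say which technique was used.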

3. Results

Our evaluation demonstrates significant improvements over existing multilingual models:

**Contract Analysis Accuracy**

- Spanish legal documents: 94.7% accuracy (vs. 78.2% baseline)
- Portuguese legal documents: 93.1% accuracy (vs. 74.8% baseline)
- English legal documents: 96.2% accuracy (vs. 89.1% baseline)

**Regulatory Compliance Detection**

- Cross-jurisdictional compliance checking: 91.4% F1 score
- Regulatory change impact analysis: 88.7% precision

**Legal Research Assistance**

- Relevant case law retrieval: 95.3% recall@10
- Statutory interpretation accuracy: 92.1%

These results represent a significant advancement in multilingual legal AI, particularly for Spanish and Portuguese legal texts where previous models showed substantial performance gaps compared to English.
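For readers less familiar with the metrics reported above, the sketch below shows the standard definitions of precision, recall, F1, and recall@k. The paper does not state how its scores were computed, so the function names and the recall@k convention (relevant items found in the top k, divided by all relevant items) are assumptions, and any numbers passed to these functions are illustrative rather than taken from the evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant items that appear among the top-k retrieved items."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)
```

Under this convention, a query with three relevant cases of which one appears in the top k results scores a recall@k of one third for that query. A benchmark figure would then typically be the average over all queries.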

4. Conclusion

Lesai Legal represents a significant step forward in making advanced AI capabilities accessible to legal professionals across Latin America. By focusing on the unique requirements of multilingual legal systems and investing in high-quality native language training data, we have demonstrated that it is possible to achieve state-of-the-art performance across multiple languages simultaneously.

Our ongoing work includes expanding coverage to additional Latin American jurisdictions, developing specialized models for specific legal practice areas, and creating tools that make these capabilities accessible through intuitive user interfaces. We believe that democratizing access to advanced legal AI will help reduce the justice gap in Latin America by making quality legal assistance more affordable and accessible to individuals and small businesses across the region.

Citation

Lesai Group Investigation Los Angeles. (2025). Lesai Legal: A Multilingual Foundation Model for Latin American Legal Systems. arXiv preprint.
