An Albanian Edge LLM – Part 1: The Vision


Why I’m Trying to Build an Albanian Translation Model

TL;DR: Albanian is a low-resource language with around 7.5 million speakers. I want to build a private, on-device translation model that (hopefully) runs in under 300ms on a modern iPhone. No servers, no data leaving your phone.


The Idea

I’ve been thinking a lot about what it means to translate Albanian well.

When I try to translate proverbs or idioms using whatever app is handy, whether that’s ChatGPT, Gemini, Google Translate, or any of the cloud-based services, something gets lost. The translation comes back technically correct but… flat? The cultural richness disappears.

Take “Më mirë shëndet, se mbret” which roughly means “Better health than king” (or more idiomatically, health is more valuable than power). These services handle it okay, but they don’t quite capture centuries of embedded Albanian wisdom about values and priorities.

I started wondering: could I build something better? Something that runs entirely on my phone, keeps my data private, and maybe handles Albanian nuance a little better?

I don’t know if I can. But I figured I’d try and document the journey.

Why On-Device Matters

Every time you use a cloud translation service, your text travels to someone else’s servers. (There’s a reason for that: the compute lives there. But I don’t like every translation I make having to pass through someone else’s machines.)

Maybe you’re translating a message from a family member. Or discussing something medical. Or working through a legal document. Having that text leave your phone and travel across the internet to be processed… it just feels different than keeping it local.

On-device translation means the text never leaves your pocket. The model lives on your phone. The processing happens on your phone. No internet connection needed. Complete privacy.

I think that matters, especially for diaspora communities who communicate across borders about sensitive family or business matters.

The Albanian Context

Here’s something I find fascinating about Albanian (though I should note I’m not a linguist—this is just what I’ve gathered from reading):

Albanian appears to be a linguistic isolate within the Indo-European family. When researchers draw the family tree of European languages, Albanian sits on its own branch with no close relatives. Its closest relatives apparently died out centuries ago and were poorly documented.

This isolation seems to make Albanian uniquely interesting to linguists studying how languages evolve. It also seems to make translation harder: Albanian patterns don’t map as cleanly onto other European languages the way, say, Spanish relates to Italian.

The language has survived Ottoman occupation, decades of communist isolation, and ongoing globalization pressure. That resilience is remarkable.

What I’m Trying to Build

I set some goals for myself. Whether I can actually hit them remains to be seen:

Speed: Under 300ms per translation on recent iPhones
Size: Under 1GB (so it fits on phones without eating all your storage)
Accuracy: Better than I currently get with cloud services like ChatGPT, Google Translate, Gemini, or Claude
Privacy: 100% on-device, no server calls

These targets might be too aggressive. A 1GB model is tiny by modern LLM standards. But I’ve read that specialized models can sometimes outperform larger generalist models on narrow tasks. I want to test that hypothesis.
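To make the 1GB target concrete, here’s a back-of-the-envelope sketch of how many parameters fit in a given size budget at different precisions. The 10% overhead reserved for the tokenizer and metadata, and the ~4.5 effective bits per parameter for 4-bit quantization (group scales cost a little extra), are my own assumptions, not figures from this series:

```python
def max_params(budget_gb: float, bits_per_param: float, overhead: float = 0.1) -> float:
    """Rough parameter budget for a model of a given on-disk size.

    budget_gb: total size budget in gigabytes
    bits_per_param: 32 for float32, 16 for float16, ~4.5 for 4-bit
                    quantization (group scales add a little overhead)
    overhead: fraction reserved for tokenizer, metadata, etc. (assumed)
    """
    usable_bits = budget_gb * 1e9 * 8 * (1 - overhead)
    return usable_bits / bits_per_param

# Under these assumptions, a 1 GB budget fits roughly:
print(f"{max_params(1.0, 32) / 1e6:.0f}M params at float32")   # ~225M
print(f"{max_params(1.0, 4.5) / 1e9:.2f}B params at 4-bit")    # ~1.6B
```

So under these assumptions, 4-bit quantization is what makes a model in the low billions of parameters even thinkable inside 1GB; at float32 you’d be stuck in the low hundreds of millions.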

Why Proverbs Keep Coming Up

I keep returning to proverbs because they seem like the hardest test case.

Proverbs encode cultural knowledge in compressed form. “Fjala pa punë, si peshku pa lumë” translates literally as “Words without work, like fish without river.” But that misses the deeper meaning—the skepticism toward empty promises, the primacy of action over speech.

If I can get proverbs working reasonably well, I figure simpler translations should follow. If I can’t, at least I’ll learn something about the limits of small models.

These little compressed wisdom packets fascinate me. They’re also a brutal test for translation systems. If a model can handle “fish without river” correctly, it probably understands something about Albanian that goes beyond dictionary lookups. (See Part 5 on benchmarking for more on this.)

The Bigger Picture (Maybe)

I’ve been thinking about this project in a larger context.

There are thousands of languages spoken on Earth. Many are what researchers call “low-resource”—meaning there’s limited digital data available for training AI systems. Albanian, with its 7.5 million speakers, is somewhere in the middle. (One encouraging project for low-resource languages is OmniASR from Meta, which I managed to get running on an iPhone; more on that in Part 4.)

If I can figure out a reasonable approach for Albanian (synthetic data generation, efficient fine-tuning, on-device deployment), maybe the same techniques could help other language communities? I don’t want to overclaim here. I’m just one person messing around with models. But the possibility is exciting.

It could be cool to see Albania become a shining example of a nation leveraging local AI capabilities. Maybe that’s overly optimistic, but why not dream a little?

Technical Concepts (For Those New to This)

I’ll be using some technical terms throughout this series. Here’s my attempt to explain them in plain language (keeping in mind I’m learning too):

LLM (Large Language Model): Think of it as the engine. A neural network trained on massive amounts of text that learns to predict what words come next. GPT, Claude, Llama, Qwen—all LLMs.

Fine-tuning: Taking a general-purpose model and teaching it to specialize. The model already knows language; I’m trying to teach it specifically how Albanian maps to English. Like teaching a chef who knows cooking generally to make specifically Albanian food.

Quantization: Compressing the model to make it smaller. Neural networks usually use 32-bit numbers for their internal values. Quantization reduces this to 4-bit numbers. Result: roughly 8x smaller model, with (hopefully) minor quality loss.
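To make that idea concrete, here’s a toy sketch of group-wise quantization: each small group of weights shares one floating-point scale, and the weights themselves are stored as 4-bit integers in [-8, 7]. This is a simplified illustration, not how MLX or any production library actually implements it:

```python
import numpy as np

def quantize_4bit(weights: np.ndarray, group_size: int = 32):
    """Toy group-wise 4-bit quantization: each group of `group_size`
    weights shares one float scale; values become ints in [-8, 7]."""
    w = weights.reshape(-1, group_size)
    # One scale per group, chosen so the largest weight maps near 7
    scales = np.maximum(np.abs(w).max(axis=1, keepdims=True) / 7.0, 1e-8)
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float weights from ints + scales."""
    return (q * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
q, s = quantize_4bit(w)
err = np.abs(w - dequantize(q, s)).max()
print(f"max reconstruction error: {err:.4f}")
```

The storage win comes from keeping only the int8-packed 4-bit codes plus one scale per group; the reconstruction error is the “quality loss” the glossary entry mentions.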

On-device: Everything runs on your phone. Model weights stored locally. Processing happens locally. No internet, no servers, no data leaving the device.

LoRA (Low-Rank Adaptation): A fine-tuning technique that only updates about 2% of the model’s parameters. Much faster and cheaper than updating everything. Also helps prevent the model from forgetting what it already knows.
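The trainable-parameter share is easy to estimate: for each adapted d×d matrix, LoRA trains two thin factors A (d×r) and B (r×d) instead of the full matrix. A sketch under assumed numbers (the hypothetical model shape, rank, and choice of adapting only two projections per layer are mine; the actual share depends heavily on those choices, which is why quoted figures like “about 2%” vary):

```python
def lora_params(d_model: int, n_layers: int, rank: int, n_targets: int = 2) -> int:
    """Trainable parameters when LoRA adapts `n_targets` square d x d
    projection matrices per layer: each gets factors A (d x r) and
    B (r x d), i.e. 2 * d * r parameters per matrix."""
    return n_layers * n_targets * 2 * d_model * rank

# Hypothetical 1.5B-parameter model: d_model=2048, 28 layers, rank 16,
# adapting two attention projections per layer
trainable = lora_params(2048, 28, 16)
total = 1_500_000_000
print(f"{trainable:,} trainable params ({trainable / total:.2%} of the model)")
```

Even at a fraction of a percent trainable, the adapter can steer the model’s behavior, which is why LoRA is so much cheaper than full fine-tuning.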

Tokens: Chunks of text the model processes. Roughly 0.75 words per token for English. Albanian tokenization seems slightly less efficient—more tokens per word on average.
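The 0.75 rule of thumb makes token budgets easy to estimate. A tiny sketch (the 0.6 figure for a less efficient tokenization is an illustrative assumption, not a measured number for Albanian):

```python
def estimate_tokens(n_words: int, words_per_token: float = 0.75) -> int:
    """Rough token count from a word count, using the common
    ~0.75 words-per-token rule of thumb for English text."""
    return round(n_words / words_per_token)

print(estimate_tokens(100))        # 100 English words -> ~133 tokens
print(estimate_tokens(100, 0.6))   # a less efficient ratio -> ~167 tokens
```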

SFT (Supervised Fine-Tuning): Teaching by example. You show the model thousands of input-output pairs (Albanian → English), and it learns the pattern. This is the bread-and-butter of making models do specific tasks.
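Concretely, those input-output pairs often live in a JSONL file, one example per line. A sketch of what that might look like; the field names and prompt format here are my own choices, not the ones used later in this series:

```python
import json

# Hypothetical SFT dataset: one JSON object per line, each an
# Albanian -> English pair the model learns to imitate.
pairs = [
    {"prompt": "Translate to English: Më mirë shëndet, se mbret",
     "completion": "Better health than a king."},
    {"prompt": "Translate to English: Fjala pa punë, si peshku pa lumë",
     "completion": "Words without deeds are like a fish without a river."},
]

with open("sft_pairs.jsonl", "w", encoding="utf-8") as f:
    for p in pairs:
        f.write(json.dumps(p, ensure_ascii=False) + "\n")

# A trainer then learns to produce `completion` given `prompt`.
with open("sft_pairs.jsonl", encoding="utf-8") as f:
    print(sum(1 for _ in f), "training pairs")
```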

Inference: Running the trained model to get output. When you type Albanian text and get English back, that’s “inference.” The training happens once; inference happens every time you use the app.

MLX: Apple’s machine learning framework for their chips. I ended up using this for running the model on iPhone. More on why in Part 4.

The Road Ahead

This series documents my journey:

Part 2: Sourcing the Beans — How I tried to find training data and eventually built a synthetic data pipeline. Spoiler: finding clean, licensed Albanian data was really hard.

Part 3: The Roast — All the training approaches I tried. Most failed! What eventually worked, and what I learned from the failures.

Part 4: The Pour — Getting the model onto an iPhone. Compression, MLX framework, the actual iOS app.

Part 5: The Taste Test — Current results (around 69% accuracy), why that’s both encouraging and frustrating, and what I want to try next.

A Note on Expertise

I want to be clear: I’m not an ML researcher. I’m not a linguist. And I’m not even Albanian lol.

What follows is my best attempt to document what I tried, what worked, what didn’t, and what I think it means. I’ve probably made mistakes. I’ve definitely made suboptimal choices. I’m sharing this anyway because I think the process of learning in public has value, and because maybe someone smarter than me will see this and do it better.

That said, I’ve tried to be careful about fair use and intellectual property throughout this project. The opinions expressed here are my own personal views based on my limited experience. I’m not trying to criticize any company or product—I’m just trying to solve a problem I personally care about.

Let’s see what happens.


First sip. The water’s heating. Let’s see if I can find the right beans.