Understanding the Anki APKG Format
Eiko Wagenknecht
If you’re building educational software or working with Anki flashcards programmatically, you’ve discovered that Anki’s APKG file format lacks proper documentation.
Anki is the most popular open-source spaced repetition software for memorizing information through flashcards. It uses the custom APKG format to store and share flashcard decks.
There’s no official specification, though - just outdated reverse-engineering attempts scattered across the web. This leaves developers guessing at the format’s structure.
I ran into this exact issue while building tools that needed to read and write APKG files. After spending too much time piecing together fragments, I analyzed the format myself.
I’ll cover the different APKG format versions and their technical details in this series, starting with the format structure below.
This isn’t an official spec, just what I’ve figured out through research and reverse engineering. If you spot any mistakes, please let me know and I’ll fix them.
Table of Contents
- Current Documentation Problems
- APKG vs. COLPKG: Two Sides of the Same Coin
- Format Evolution: Three Format Versions
- What’s Inside an APKG File
- collection.anki2 file - Compatibility Layer
- When Each Format Matters
- What’s Coming Next
- Building Better Spaced Repetition Tools
Current Documentation Problems
The official Anki documentation briefly mentions the APKG format, but says nothing about its structure.
The anki-cards-web-browser documentation was published in 2017 (when the current Anki version was 2.0.47) and provides detailed structure and content descriptions. However, a lot has changed in the past 8 years, and current Anki versions no longer use exactly the same format.
Many sources link to a wiki page that’s no longer available. The latest snapshot is from 2018 and contains experimental information from examining generated databases.
The most recent source I found is the AnkiDroid wiki, updated in 2024 with detailed SQLite database descriptions. However, it’s based on database version 11 and doesn’t account for recent changes to the database structure. It also lacks information about the APKG file structure.
For developers, there’s also the source code of Anki. Some relevant files are:
- The database storage connection describes how the database is created and updated.
- The SQLite database schema v11.
- The SQLite database schema upgrade files up to v18, which are also referenced in the AnkiDroid wiki above.
APKG vs. COLPKG: Two Sides of the Same Coin
Anki exports use either COLPKG (collection package) or APKG (deck package) formats. Both use the same structure - ZIP archives containing the same file types. They differ in what gets included.
For the rest of this post, I’ll refer to both as the “APKG format” since they share the same underlying structure.
COLPKG (Collection Package): Used for backing up your entire collection or migrating between devices.
Importing a COLPKG replaces your existing collection with the package contents.
Collection packages created with previous versions of Anki were called collection.apkg.
APKG (Deck Package): Used for sharing specific decks or adding content to existing collections. Importing an APKG adds contents to your existing collection without replacing anything. For previously imported notes, Anki keeps the most recent version.
Both formats can be exported in an older, more compatible format (see Format Evolution: Three Format Versions).
Data Type | COLPKG | APKG |
---|---|---|
Deck scope | Always all decks | Single deck or all decks |
Scheduling data | Always included | Optional (when excluded, removes marked/leech tags) |
Note types | All (even unused) | Only used note types |
Deck presets | Always included | Optional |
Media files | Optional | Optional |
Format Evolution: Three Format Versions
The APKG format evolved significantly, with major changes in 2012, 2018, and 2020 - 2022. Here are the three main versions and their differences. I’ll use the names from the Anki code, with added emojis for easier distinction:
- 📜 Legacy 1
- 🔄 Legacy 2
- ⚡ Latest
📜 Legacy 1 (Older Shared Decks, 2012 - 2018)
Modern Anki doesn’t use this format, but it’s worth mentioning for context. Anki 2.0 introduced it in 2012 as the first 2.x file format.
- Deck data (cards, notes, note types, etc.) is stored in a collection.anki2 SQLite database file (compressed with “deflate”).
- It uses database schema v11 with configuration data stored as JSON in TEXT columns.
- Media information is stored as a JSON HashMap in a media file.
- It doesn’t contain a meta file.
🔄 Legacy 2 (Maximum Compatibility, 2018 - 2019)
Anki 2.1 introduced this format in 2018. It’s still widely used.
Export with Support older Anki versions (slower/larger files) creates this format.
Changes from 📜 Legacy 1 to 🔄 Legacy 2:
- Deck data (cards, notes, note types, etc.) is stored in a collection.anki21 SQLite database file (instead of collection.anki2).
- A meta file was added to store information about the file format version.
⚡ Latest (Modern Format, 2020 - Present)
This format emerged between 2020 and 2022.
First, the database schema evolved through several versions: v11 → v14 (April 2020, separate tables deck_config, config and tags) → v15 (May 2020, separate tables fields, templates, notetypes, decks) → v17 (January 2021, additional fields for tags) → v18 (May 2021, primary key for graves).
Anki 2.1.50 (April 2022) finally added zstd compression, introducing collection.anki21b files.
The v16 schema is somewhat of an oddity because it only contained semantic changes, not actual changes to the database structure.
Modern Anki versions use this schema internally and when exporting decks without the Support older Anki versions (slower/larger files) option enabled.
Changes from 🔄 Legacy 2 to ⚡ Latest:
- Deck data (cards, notes, note types, etc.) is stored in a collection.anki21b SQLite database file, which is compressed separately with zstd.
- It uses database schema v18 with configuration data stored as Protobuf messages in BLOB columns. Some values that were previously stored in a specific column now have their own table.
- Media information is stored as a Protobuf MediaEntries message in a media file.
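Because the ⚡ Latest database is zstd-compressed rather than a plain SQLite file, a tool can tell the two apart by looking at the first few bytes of the extracted entry. The following is a minimal Python sketch, not Anki's own logic. It relies on two well-known signatures: the 16-byte SQLite header string and the 4-byte zstd frame magic number. The function name `sniff_database_encoding` is made up for this example.

```python
# First 16 bytes of every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"
# zstd frame magic number 0xFD2FB528, stored little-endian.
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def sniff_database_encoding(head: bytes) -> str:
    """Classify the leading bytes of a database entry taken from a package.

    Hypothetical helper: returns "sqlite", "zstd", or "unknown".
    """
    if head.startswith(SQLITE_MAGIC):
        return "sqlite"
    if head.startswith(ZSTD_MAGIC):
        return "zstd"
    return "unknown"
```

If the result is "zstd", the entry has to be decompressed with a zstd library before any SQLite tool can open it. This sketch only identifies the encoding, it does not decompress anything.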
Format Comparison
Here’s how the three formats compare:
Feature | 📜 Legacy 1 | 🔄 Legacy 2 | ⚡ Latest |
---|---|---|---|
Database file | .anki2 | .anki21 | .anki21b |
Database schema | v11 | v11 | v18 |
Number of tables | 5 | 5 | 12 |
ZIP compression | deflate | deflate | store (database compressed individually) |
Database compression | ❌ none | ❌ none | ✅ zstd |
Configuration storage | 📄 JSON in TEXT | 📄 JSON in TEXT | 📊 Protobuf in BLOB |
Media mapping | 📄 JSON | 📄 JSON | 📊 Protobuf |
Meta file | ❌ | ✅ | ✅ |
Data readability | 🟡 Medium | 🟡 Medium | 🔴 Low (binary format) |
File size | 🔴 Large | 🔴 Large | 🟢 Small |
Compatibility | 🟡 Old Anki only | 🟢 Wide compatibility | 🔴 Modern Anki only |
What’s Inside an APKG File
APKG files are standard ZIP archives that open with any ZIP tool. 📜🔄 Legacy formats use “deflate” compression for databases and store other files uncompressed. The ⚡ Latest format stores all files uncompressed, but compresses the database file inside the ZIP archive with zstd.
Despite different database names and formats, all APKG files share the same basic structure. Each contains a database with deck data, a media mapping file, numbered media files (if applicable), and a metadata file. All files are stored in the archive root with no subdirectories.
- Database file: Contains all the deck data (cards, notes, configurations, etc.)
  - collection.anki2 (📜 Legacy 1), collection.anki21 (🔄 Legacy 2), or collection.anki21b (⚡ Latest)
  - For 🔄 Legacy 2 and ⚡ Latest formats, there’s also a compatibility database file named collection.anki2 for older Anki versions, containing a single note that displays “Please update to the latest Anki version, then import the .colpkg/.apkg file again.” when opened in older Anki versions.
- Media mapping file: Named media, maps media filenames like photo.jpg to numbered files (starting from 0).
- Media files: Numbered sequentially (0, 1, 2, …) containing images, audio, etc.
- Metadata file: Named meta (🔄 Legacy 2 and ⚡ Latest only)
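Since a package is a plain ZIP archive, Python's standard zipfile module is enough to see which of these entries are present and how each one is stored. Here is a minimal sketch. The names `list_package_entries` and `build_sample_package` are invented for illustration, and the sample archive contains placeholder bytes rather than a real database.

```python
import io
import zipfile

# Readable labels for the two ZIP methods discussed above.
METHOD_NAMES = {zipfile.ZIP_STORED: "store", zipfile.ZIP_DEFLATED: "deflate"}


def list_package_entries(package):
    """Return (name, compression method, uncompressed size) per archive entry.

    `package` can be a file path or a binary file-like object. zipfile does
    not care about the file extension, so .apkg and .colpkg work as-is.
    """
    with zipfile.ZipFile(package) as archive:
        return [
            (info.filename, METHOD_NAMES.get(info.compress_type, "other"), info.file_size)
            for info in archive.infolist()
        ]


def build_sample_package() -> io.BytesIO:
    """Build a tiny in-memory stand-in for a Legacy 2 package (placeholder bytes only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("collection.anki21", b"placeholder", compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("collection.anki2", b"placeholder", compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("meta", b"\x08\x02")
        archive.writestr("media", b"{}")
    buffer.seek(0)
    return buffer


for name, method, size in list_package_entries(build_sample_package()):
    print(f"{name:20} {method:8} {size} bytes")
```

For a real file, pass its path instead of the sample buffer, for example `list_package_entries("example-deck.apkg")`.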
Example File Structure (Legacy 2)
example-deck.apkg (ZIP archive)
├── collection.anki21 # Main SQLite database (🔄 Legacy 2 format)
├── collection.anki2 # Dummy compatibility database (📜 Legacy 1 format)
├── meta # Format metadata (JSON or protobuf)
├── media # Media file mapping
├── 0 # Media file (image, audio, etc.)
├── 1 # Media file
└── ... # Additional media files
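For the two legacy formats, the media entry is JSON, so it can be read with a few lines of Python. This sketch rests on an assumption that isn't spelled out above: that the JSON object is keyed by the numbered archive entry ("0", "1", …) with the original filename as the value. Verify that against a real package before relying on it. Both function names are hypothetical, and none of this applies to the Protobuf-based media file of the ⚡ Latest format.

```python
import json


def parse_legacy_media_map(raw: bytes) -> dict[str, str]:
    """Parse a legacy JSON media file into {archive entry name: original filename}.

    Assumption: keys are the numbered entries ("0", "1", ...) and values are
    the original filenames such as "photo.jpg".
    """
    mapping = json.loads(raw.decode("utf-8"))
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object in the media file")
    return {str(number): str(filename) for number, filename in mapping.items()}


def invert_media_map(mapping: dict[str, str]) -> dict[str, str]:
    """Build the reverse lookup: original filename -> numbered archive entry."""
    return {filename: number for number, filename in mapping.items()}


sample = b'{"0": "photo.jpg", "1": "audio.mp3"}'  # made-up example content
by_number = parse_legacy_media_map(sample)
by_filename = invert_media_map(by_number)
```

With the reverse lookup, a tool that finds photo.jpg referenced in a card can locate the matching numbered entry in the archive.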
Format Detection
To determine an APKG file’s format, check which database files are present after extracting the ZIP archive:
- Rename the .apkg or .colpkg file to .zip and extract it.
- Check the database files:
  - collection.anki2 only → 📜 Legacy 1
  - collection.anki2 + collection.anki21 → 🔄 Legacy 2
  - collection.anki2 + collection.anki21b → ⚡ Latest
This relies on the current state of Anki and its file structure.
For future Anki versions, you might need to check the meta file for version information.
The collection.anki2 file in 🔄 Legacy 2 and ⚡ Latest formats is only a compatibility dummy.
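The same check can be done programmatically without renaming or extracting anything, because a ZIP library can list entry names directly. The following Python sketch mirrors the rules above. The function names and the returned labels ("legacy1", "legacy2", "latest") are choices made for this example, not identifiers from Anki.

```python
import zipfile


def detect_package_format(names) -> str:
    """Map the entry names of a package to one of the three format labels.

    The newest database file wins, since collection.anki2 is only a
    compatibility dummy whenever a newer database sits next to it.
    """
    names = set(names)
    if "collection.anki21b" in names:
        return "latest"
    if "collection.anki21" in names:
        return "legacy2"
    if "collection.anki2" in names:
        return "legacy1"
    return "unknown"


def detect_package_format_from_file(package) -> str:
    """Open a .apkg/.colpkg path (or binary file object) and classify it."""
    with zipfile.ZipFile(package) as archive:
        return detect_package_format(archive.namelist())
```

Checking for the newest database name first keeps the logic short. A stricter tool might additionally confirm that the dummy collection.anki2 is present, or cross-check the result against the meta file.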
Edge Cases in Format Detection
During the evolution from 🔄 Legacy 2 to ⚡ Latest, intermediate database schema versions (v14-v17) also used collection.anki21 files.
Anki used these internally but never exported them, so you won’t find them in shared decks.
If you encounter these transitional formats, check the meta file version field for the authoritative version.
To find the exact database schema version, examine the ver column in the col table of the SQLite database.
As Anki evolves, future formats will likely use the meta file for definitive version information.
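Reading the schema version is a single query with Python's built-in sqlite3 module. This is a minimal sketch that assumes the database has already been extracted from the archive (and, for collection.anki21b, decompressed). The function name `read_schema_version` is made up for this example.

```python
import sqlite3


def read_schema_version(connection: sqlite3.Connection) -> int:
    """Return the value of the ver column from the col table."""
    row = connection.execute("SELECT ver FROM col LIMIT 1").fetchone()
    if row is None:
        raise ValueError("the col table has no rows")
    return int(row[0])


# Usage with an extracted database (hypothetical path):
#   connection = sqlite3.connect("collection.anki21")
#   print(read_schema_version(connection))
#   connection.close()
```

Taking an open connection instead of a path keeps the helper easy to try against an in-memory database.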
The meta File - Version Information
Only needed in edge cases currently, the meta file is a Protobuf-encoded file that contains metadata about the APKG file.
It contains a single field, version, which shows the APKG version:
syntax = "proto3";

message PackageMetadata {
  enum Version {
    VERSION_UNKNOWN = 0;
    VERSION_LEGACY_1 = 1;
    VERSION_LEGACY_2 = 2;
    VERSION_LATEST = 3;
  }
  Version version = 1;
}
It is set to 2 for 🔄 Legacy 2 and to 3 for ⚡ Latest.
The 📜 Legacy 1 format does not have a meta file.
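With only one varint field, the message is small enough to decode by hand, without generated Protobuf classes. Under the standard Protobuf wire encoding, field 1 with a varint value is written as the tag byte 0x08 followed by the value, so a 🔄 Legacy 2 meta file should be the two bytes 08 02. The sketch below assumes the meta entry holds the raw, uncompressed message. The function names are invented for this example.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos
        shift += 7


def parse_package_version(meta: bytes) -> int:
    """Return the version field (field number 1) of a PackageMetadata message."""
    pos = 0
    version = 0  # proto3 omits fields holding the default value 0
    while pos < len(meta):
        tag, pos = read_varint(meta, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError(f"wire type {wire_type} is not handled by this sketch")
        value, pos = read_varint(meta, pos)
        if field_number == 1:
            version = value
    return version
```

An empty file decodes to 0 (VERSION_UNKNOWN). The sketch deliberately rejects anything other than varint fields, so it would need extending if future versions add other field types to the message.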

collection.anki2 file - Compatibility Layer
This file is an SQLite database with the same structure as the collection.anki21 file (see this post for a detailed explanation).
However, it only contains dummy content for compatibility purposes: a default deck with a single card saying “Please update to the latest Anki version, then import the .colpkg/.apkg file again.”
Since it contains no actual data, I won’t examine it further.
When Each Format Matters
Anki has been around for many years, resulting in a large number of decks created and shared at different times - and in different formats. Here’s where you’ll most likely encounter each format and when to use them for exporting your decks.
📜 Legacy 1 (Older Shared Decks)
You’ll encounter the 📜 Legacy 1 format when downloading older shared decks, like those on AnkiWeb that haven’t been updated in years. Modern Anki no longer creates files in this format, but you may need to work with 📜 Legacy 1 files when using popular older decks. There’s no practical reason to create these files yourself.
🔄 Legacy 2 (Maximum Compatibility)
🔄 Legacy 2 has been around for years and is supported by most tools in the broader Anki ecosystem. It will likely continue to be supported even if Anki ceases to exist. This format is created when you select “Support older Anki versions (slower/larger files)” when exporting. Choose 🔄 Legacy 2 when sharing decks with users who may have older Anki versions, or when working with tools that haven’t been updated to handle the ⚡ Latest format. The JSON-based configuration storage also makes it easier to inspect and modify deck data programmatically.
⚡ Latest (Modern Format)
The ⚡ Latest format is optimized for modern Anki installations and offers the best performance and size characteristics. It uses more efficient compression (zstd) and protobuf for data serialization, resulting in smaller file sizes and faster processing. However, the binary protobuf format complicates manual inspection and the development of supporting tools. Consequently, few tools besides Anki itself can properly handle this format. It will, however, support all the latest features and improvements in Anki, like the new FSRS algorithm. As long as you’re only using Anki, there’s no reason not to use it, since you can always export to the 🔄 Legacy 2 format if needed.
Which Format Should You Choose?
Here’s how to choose the right format for your needs:
Priority | Recommended Format | Why |
---|---|---|
Wide compatibility | 🔄 Legacy 2 | Works with older Anki versions |
File size / performance | ⚡ Latest | Better compression and processing |
Data inspection / modification | 🔄 Legacy 2 | Human-readable JSON configuration |
Using the latest Anki features | ⚡ Latest | Current standard, ongoing development |
Sharing with unknown users | 🔄 Legacy 2 | Safer compatibility choice |
What’s Coming Next
Now that I’ve covered the landscape and high-level structure, let’s dive deeper. This series will continue with a detailed analysis of the two main formats: 🔄 Legacy 2 and ⚡ Latest.
Part 1: Overview (this post): This post provides an overview of the Anki APKG format, its evolution, and the differences between the 📜 Legacy 1, 🔄 Legacy 2, and ⚡ Latest formats.
Part 2: The 🔄 Legacy 2 Format in Detail: In the next post, I’ll cover the 🔄 Legacy 2 format in depth: the SQLite database structure, tables and their relationships, JSON configuration fields, and media file handling.
Part 3: The ⚡ Latest Format in Detail (not yet published): This covers the ⚡ Latest format including the protobuf schema, database schema v18, and the key differences from 🔄 Legacy 2.
Part 4: APKG Format Critique (not yet published): The final post will be a critique of the APKG format, its strengths and weaknesses, and my take on how spaced repetition software could do better.
Building Better Spaced Repetition Tools
I didn’t reverse-engineer this format just for fun.
What started as figuring out how to improve my own memory turned into building a spaced repetition app. Before you think “oh great, yet another flashcard app” - I’m focused on solving the user experience and data ownership problems that existing tools haven’t addressed.
The main thing that’s stopped me from fully committing to existing tools is vendor lock-in. I want to own my learning data - the cards, decks and knowledge that I create, now and forever. Our study materials are valuable and shouldn’t be trapped in proprietary formats that make third-party development a nightmare.
Step one: Build a TypeScript library that converts between spaced repetition formats from tools like Anki, Mochi and Mnemosyne. To do it right, I need to understand exactly how these formats work.
That’s why you’re getting these detailed technical breakdowns.
Want Updates?
I’ll share progress through blog posts - more format deep-dives, implementation details, and early tool access.
Join the newsletter for updates when there’s something worth sharing. No spam, just occasional progress reports.
I’ll replace this with a proper signup form soon. For now, send that email to get added.