Vol. III · Issue 05
Format & Container Studies
The File Format & Conversion Almanac
Saturday, 16 May 2026
Edited in Tbilisi
ISSN 2026 – 0517 · A reader-supported reference
A Working Reference · 13 Entries

Thirteen conversions you keep googling at midnight — explained once, properly.

HEIC photos that refuse to open on a colleague's laptop. A WebP download nothing wants to read. An EPUB you need on the printer by morning. This almanac collects the file conversions we field most often, the trade-offs each one hides, and the safest defaults when in doubt.

Chapter 01 — Pictures & icons

Photographs that travel poorly, and how to fix them.

The image-format world fragmented twice in the last decade — once when Apple shipped HEIC by default in iOS 11, again when WebP became the de-facto format on the open web. Both shifts made files smaller; both made them harder to share. The four entries below close those gaps.

Decades after MIME types were supposed to make file extensions a polite suggestion, our inboxes still bounce IMG_4231.HEIC because Outlook can't show a preview. The right move is rarely "tell the sender to switch their phone settings." It's a clean conversion, and that's where most of the friction in this section originates.

HEIC → JPG
iPhone · Universal

From iPhone photo to anywhere

HEIC stores an image compressed with HEVC, so the file is roughly half the size of an equivalent JPG at matching quality. The catch is licensing: Windows still gates HEVC decoding behind a Store extension that is a paid add-on on many installs, and most webmail previews refuse the format outright. Converting to JPG at quality 92–95 produces a visually identical photo that opens on every device built since 1994.

EXIF preserved · Color profile: sRGB · Browser-side OK
WebP → PNG
Web · Editor-ready

When the download doesn't open

WebP shaves 25–35% off comparable PNGs and is now supported by roughly 96% of browsers in use (CanIUse, Q1 2026). PowerPoint, older InDesign builds, and several CMS upload pipelines still don't accept it. PNG is the universal fallback: lossless, alpha-channel intact, and readable by every image library written since 1997.

Alpha kept · Animated WebP → APNG · Browser-side OK
PNG → ICO
Windows · Resources

One file, six resolutions

The ICO container is older than the Web itself and has one trick worth remembering: it can hold a small atlas of bitmaps at different sizes, and Windows picks the best fit at draw time. A practical icon bundles 16, 32, 48, 64, 128 and 256 px — anything smaller looks blurry on the taskbar, anything larger is ignored by most renderers.

Multi-size · Alpha kept · Browser-side OK
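
The container layout described above can be sketched in a few lines. This is a minimal illustration, not the almanac's converter: it assumes each size is stored as an embedded PNG (which Windows accepts from Vista onward), and the helper names `solid_png` and `pack_ico` are hypothetical. The solid-colour PNG generator exists only to give the packer some sample input.

```python
import struct
import zlib


def solid_png(size: int, rgba=(0, 120, 215, 255)) -> bytes:
    """Build a minimal solid-colour RGBA PNG, used here only as sample input."""
    def chunk(tag: bytes, data: bytes) -> bytes:
        body = tag + data
        return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

    ihdr = struct.pack(">IIBBBBB", size, size, 8, 6, 0, 0, 0)  # 8-bit RGBA, no interlace
    row = b"\x00" + bytes(rgba) * size                          # filter byte + pixels
    idat = zlib.compress(row * size)
    return b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND", b"")


def pack_ico(pngs: dict[int, bytes]) -> bytes:
    """Wrap PNG images, keyed by pixel size, in a single ICO container."""
    sizes = sorted(pngs)
    header = struct.pack("<HHH", 0, 1, len(sizes))   # reserved, type 1 = icon, image count
    offset = len(header) + 16 * len(sizes)           # image data starts after the directory
    entries, blobs = b"", b""
    for size in sizes:
        data = pngs[size]
        # Width and height are single bytes, so 256 is written as 0.
        entries += struct.pack("<BBBBHHII", size % 256, size % 256, 0, 0, 1, 32, len(data), offset)
        blobs += data
        offset += len(data)
    return header + entries + blobs


ico = pack_ico({s: solid_png(s) for s in (16, 32, 48, 64, 128, 256)})
```

Each directory entry is 16 bytes and points at its image by byte offset, which is how a renderer can jump straight to the size it wants without decoding the others.
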
SVG → PNG
Vector · Raster

Flatten a vector without breaking it

SVG is a markup file, not a picture, and that distinction matters once you paste a logo into Keynote and the gradients vanish. Rasterising at a known DPI gives you a fixed-size PNG with predictable rendering. Pick 2× the intended display size on retina displays; 300 DPI for anything destined for print.

Custom DPI · Transparent or filled background · Browser-side OK
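
The sizing rule above is plain arithmetic, shown here as a small sketch. The function names are hypothetical, and the 2× factor and 300 DPI are simply the defaults suggested in the text.

```python
def screen_pixels(display_px: int, scale: float = 2.0) -> int:
    """Pixels to export for artwork shown display_px wide on a high-density screen."""
    return round(display_px * scale)


def print_pixels(inches: float, dpi: int = 300) -> int:
    """Pixels to export for artwork printed at the given physical width."""
    return round(inches * dpi)


print(screen_pixels(240))  # 480
print(print_pixels(2.5))   # 750
```
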
"The HEIC-to-JPG conversion is the single most-requested operation in our analytics — three times more frequent than the next runner-up. People aren't switching off HEIC. They're switching around it, one share at a time." — Field log, Q1 2026
Chapter 02 — Paper & print

Documents bound for someone else's printer.

PDF turned 33 this year. It remains the only document format you can hand to a stranger with confidence that the fonts, margins and pagination will not change. The four entries here orbit it: feeding documents in, taking pages back out.

EPUB → PDF
Books · Reflow → Fixed

From reflowable to printable

An EPUB is, internally, a ZIP archive of XHTML files. That makes it elegant on an e-reader and awkward on an office printer that wants page numbers and a margin. A good conversion freezes the reflowable text at a chosen page size (commonly A4 or 6×9 in), keeps the chapter table of contents as PDF bookmarks, and embeds the cover.

Server-side · TOC preserved · DRM EPUBs unsupported
DOCX → PDF
Office · Frozen layout

Word's last mile

DOCX is OOXML: an Open Packaging Convention zip that references fonts by name, not by file. The most common reason an exported PDF looks "different" from the Word document is that the machine doing the export lacked one of your fonts and substituted another. A reliable conversion either embeds the typeface or swaps in a metrics-compatible match so pagination doesn't shift.

Server-side · Fonts embedded · Footnotes & tracked changes kept
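
To see the "by name, not by file" point in practice, the sketch below opens a DOCX as a zip and lists the font names it declares. It assumes the package contains the optional `word/fontTable.xml` part; real documents may also name fonts in run properties, which this does not inspect. The function name `declared_fonts` and the tiny in-memory sample are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace, as used in word/fontTable.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def declared_fonts(docx_bytes: bytes) -> list[str]:
    """Return the font names listed in the document's font table."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/fontTable.xml"))
    return sorted(font.get(f"{{{W}}}name") for font in root.iter(f"{{{W}}}font"))


# A stand-in package holding only the part this sketch reads.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr(
        "word/fontTable.xml",
        f'<w:fonts xmlns:w="{W}"><w:font w:name="Garamond"/><w:font w:name="Calibri"/></w:fonts>',
    )

print(declared_fonts(sample.getvalue()))  # ['Calibri', 'Garamond']
```

Nothing in that list says where the font files live, which is why the converting machine has to resolve each name itself.
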
PDF → JPG
Pages out · For screen

When one page is enough

The most common reason to export a PDF page to JPG isn't archival — it's pasting page 4 of a contract into a Slack message without revealing pages 1–3. Choose 150 DPI for on-screen sharing, 300 DPI if it might be printed, and pick "selected pages" rather than "all" unless you really need an album.

Pick pages · Up to 600 DPI · Browser-side OK
JPG → PDF
Photos in · Bound

A book of receipts, in two clicks

Most accounting platforms accept PDF bundles but not photo dumps. Convert a sequence of phone snapshots into a single, ordered PDF: A4 or US Letter paper, margins of about 12 mm so the binder hole doesn't eat the timestamp, and don't recompress — JPG inside PDF is already as small as it's going to get.

Reorder pages · A4 / US Letter · Browser-side OK
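
The page geometry behind that advice can be worked out directly. The sketch below is an illustration under stated assumptions: PDF coordinates are in points (72 per inch), the image is scaled to fit inside the margins with its aspect ratio kept, and it is centred on the page. The name `fit_on_page` is hypothetical.

```python
MM_TO_PT = 72 / 25.4  # PDF user space uses 72 points per inch

PAPER_MM = {"A4": (210.0, 297.0), "Letter": (215.9, 279.4)}


def fit_on_page(img_w: int, img_h: int, paper: str = "A4", margin_mm: float = 12.0):
    """Return (x, y, width, height) in points for an image centred inside the margins."""
    page_w, page_h = (side * MM_TO_PT for side in PAPER_MM[paper])
    margin = margin_mm * MM_TO_PT
    box_w, box_h = page_w - 2 * margin, page_h - 2 * margin
    scale = min(box_w / img_w, box_h / img_h)  # keep the aspect ratio
    width, height = img_w * scale, img_h * scale
    return ((page_w - width) / 2, (page_h - height) / 2, width, height)


x, y, width, height = fit_on_page(3000, 4000)
print(f"place at ({x:.1f}, {y:.1f}) pt, size {width:.1f} x {height:.1f} pt")
```

For a portrait phone photo the width is the limiting side, so the left edge lands exactly on the 12 mm margin and the spare room goes above and below.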

Editor's note · Why two of these still need a server

EPUB and DOCX both reference fonts by name and rely on system libraries to resolve them. Doing that purely in the browser is possible (FreeType compiled to WebAssembly, the Roboto subset, etc.) but it inflates the page weight to roughly 18 MB before the user has even uploaded a file. Until that calculus shifts, these two conversions remain server-side. Files are wiped within sixty minutes and are never indexed by search.
Chapter 03 — Tables & trees

Spreadsheets meet APIs, politely.

Three lossless conversions for the people who spend their days moving structured data between tools that disagree about how to spell "true". None of them require uploading anything anywhere — they run entirely on your machine.

CSV → JSON
Rows → Records

Rows become records

The headers become keys, each row becomes an object, and every well-behaved CSV-to-JSON converter has to make three quiet decisions: how to detect the delimiter (semicolons appear in 14 of the 27 EU locales), how to type values (the string "0123" is usually a zip code, not a number), and whether to coerce "" to null. Pick the defaults that are safe for your downstream tool, not the most "clever".

Auto-typing · Custom delimiter · Browser-side OK
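
The three decisions named above (delimiter, typing, empty strings) fit in a short sketch. This is one conservative set of choices, not the behaviour of any particular converter: the delimiter heuristic only looks at the header line, and the helper names `detect_delimiter`, `coerce`, and `csv_to_records` are hypothetical.

```python
import csv
import io
import json
import re


def detect_delimiter(text: str, candidates: str = ",;\t|") -> str:
    """Pick the candidate that occurs most often in the header line (simple heuristic)."""
    header = text.splitlines()[0]
    return max(candidates, key=header.count)


def coerce(value, empty_as_null: bool):
    """Type one cell conservatively; anything ambiguous stays a string."""
    if not value:
        return None if empty_as_null else ""
    if value in ("true", "false"):
        return value == "true"
    # Integers without leading zeros only: "0123" is more likely a code than a number.
    if re.fullmatch(r"-?(0|[1-9][0-9]*)", value):
        return int(value)
    return value


def csv_to_records(text: str, empty_as_null: bool = True) -> list[dict]:
    reader = csv.DictReader(io.StringIO(text), delimiter=detect_delimiter(text))
    return [{key: coerce(cell, empty_as_null) for key, cell in row.items()} for row in reader]


records = csv_to_records("id;zip;active;note\n1;0123;true;\n2;0456;false;ok\n")
print(json.dumps(records, indent=2))
```

Typing each cell on its own can still give one column mixed types (a zip code without a leading zero would become a number here), so a column-level rule or strings-only mode is the safer choice when the downstream tool is strict.
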
JSON → CSV
Records → Rows

Flatten without losing the shape

The reverse trip is harder, because JSON has nested objects and arrays that CSV does not. The dotted-key convention — address.city, items[0].sku — is what BI tools expect and what Excel can still parse without macros. Decide up front whether arrays of objects fan out into multiple rows or collapse into a JSON-encoded cell.

Dotted headers · Excel-safe encoding · Browser-side OK
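
The dotted-key convention can be sketched as a small recursive flattener. This version takes the indexed-column route (items[0].sku, items[1].sku) rather than fanning out rows or JSON-encoding cells; it is an illustration with hypothetical names (`flatten`, `records_to_csv`), and empty objects or arrays simply produce no column.

```python
import csv
import io


def flatten(value, prefix: str = "") -> dict:
    """Flatten nested dicts and lists into one dict with dotted and indexed keys."""
    if isinstance(value, dict):
        items = [(f"{prefix}.{key}" if prefix else key, child) for key, child in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{index}]", child) for index, child in enumerate(value)]
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        flat.update(flatten(child, key))
    return flat


def records_to_csv(records: list[dict]) -> str:
    rows = [flatten(record) for record in records]
    headers = []
    for row in rows:                      # keep first-seen column order
        for key in row:
            if key not in headers:
                headers.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)                # missing keys are written as empty cells
    return out.getvalue()


sample = [{"id": 1, "address": {"city": "Tbilisi"}, "items": [{"sku": "A1"}, {"sku": "B2"}]}]
print(records_to_csv(sample))
```
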
XML → JSON
Legacy → Modern

Bringing the 2000s forward

XML survives in SOAP responses, RSS feeds, sitemaps, and a stubborn segment of banking and insurance APIs. The conversion is mostly mechanical, with two exceptions: XML has both attributes and child nodes (JSON does not), and XML allows mixed content where text and elements interleave. The pragmatic default — attributes prefixed with @, text under #text — is ugly but lossless.

Attributes preserved · Namespaces optional · Browser-side OK
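
The @ and #text convention can be illustrated with the standard library's ElementTree. This is a sketch with a hypothetical helper name (`element_to_obj`) and two stated simplifications: repeated sibling tags are collected into a list, and text that follows a child element (true mixed content) is not captured, so this particular version is lossless only for documents without interleaved text.

```python
import json
import xml.etree.ElementTree as ET


def element_to_obj(elem):
    """Convert an element: attributes as @name, own text as #text, children by tag."""
    obj = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        converted = element_to_obj(child)
        if child.tag in obj:
            # Repeated tags become a list so no sibling is overwritten.
            existing = obj[child.tag]
            if not isinstance(existing, list):
                obj[child.tag] = existing = [existing]
            existing.append(converted)
        else:
            obj[child.tag] = converted
    text = (elem.text or "").strip()
    if text:
        if not obj:
            return text              # a text-only leaf collapses to a plain string
        obj["#text"] = text
    return obj


xml = '<item id="7"><name>Tea</name><price currency="GEL">4.50</price><tag>a</tag><tag>b</tag></item>'
root = ET.fromstring(xml)
print(json.dumps({root.tag: element_to_obj(root)}, indent=2))
```
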
"Half of every data migration is convincing one system to accept the file format the previous system was happy to produce. The actual transformation takes ten minutes; the meetings about which side should change take three weeks."
Chapter 04 — Motion & sound

Containers, not codecs.

A common misunderstanding: an MKV-to-MP4 conversion is rarely a video conversion. The video inside both files is usually identical H.264 or H.265. What changes is the wrapper.

MKV → MP4
Container swap · Often lossless

Remux, don't re-encode

Matroska (MKV) is a more permissive container — it accepts more codecs and more subtitle formats than MP4. But Safari, QuickTime, most TVs and most chat clients prefer MP4. When the inner streams are already H.264 or H.265 video with AAC audio, the right operation is a remux: the codec data is copied byte-for-byte, only the container is rewritten. A two-hour film completes in seconds and loses no quality.

Stream-copy when possible · Subtitles → soft-burn · Browser-side (ffmpeg.wasm)
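
With the ffmpeg command-line tool, a remux of this kind comes down to stream copy. The sketch below only builds the argument list and does not run anything; the function name `remux_command` and the file names are hypothetical, and it deliberately maps video and audio only, leaving subtitles out because many MKV subtitle formats cannot be copied into MP4 as they are.

```python
import shlex


def remux_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg invocation that copies video and audio streams without re-encoding."""
    return [
        "ffmpeg",
        "-i", src,
        "-map", "0:v",    # all video streams from the first input
        "-map", "0:a?",   # all audio streams, if any exist
        "-c", "copy",     # stream copy: no decoding, no re-encoding
        dst,
    ]


cmd = remux_command("film.mkv", "film.mp4")
print(shlex.join(cmd))
```

If ffmpeg reports that a codec is not supported in the MP4 container, that is the signal that a plain remux is not possible for that file.
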
WAV → MP3
Lossless → Compressed

Trading bytes for portability

WAV is PCM — uncompressed audio at roughly 10 MB per minute. MP3 at 192 kbps is roughly 1.4 MB per minute, indistinguishable from the source in blind tests for non-classical material. Use 320 kbps if you'll re-edit later, 192 kbps for distribution, 96–128 kbps for spoken word. Preserve ID3 tags — every podcast platform needs them.

ID3 preserved · Bitrate 96–320 · Browser-side (ffmpeg.wasm)
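
The size figures above follow from simple arithmetic. The sketch assumes CD-style PCM (44.1 kHz, stereo, 16-bit) for the WAV side and constant-bitrate MP3, and ignores headers and tags; the function names are hypothetical.

```python
def pcm_bytes_per_minute(sample_rate: int = 44_100, channels: int = 2, bits: int = 16) -> int:
    """Uncompressed PCM payload for one minute of audio."""
    return sample_rate * channels * (bits // 8) * 60


def mp3_bytes_per_minute(kbps: int) -> int:
    """Constant-bitrate MP3 payload for one minute of audio."""
    return kbps * 1000 // 8 * 60


print(f"WAV: {pcm_bytes_per_minute() / 1e6:.1f} MB/min")         # WAV: 10.6 MB/min
for kbps in (128, 192, 320):
    print(f"MP3 {kbps}: {mp3_bytes_per_minute(kbps) / 1e6:.2f} MB/min")
```
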
Chapter 05 — Reference table

The thirteen, at a glance.

A single table to bookmark. "Local" means the conversion can run entirely on your machine; "Server" means a brief round-trip with a one-hour retention policy.

Cat.  From  To    Where it runs        Batch  Lossless
Img   HEIC  JPG   Local                Yes    ~ visual
Img   WebP  PNG   Local                Yes    Yes
Img   PNG   ICO   Local                Yes    Yes
Img   SVG   PNG   Local                Yes    Rasterised
Doc   EPUB  PDF   Server (≤1h)         Yes    Reflow → fixed
Doc   DOCX  PDF   Server (≤1h)         Yes    Yes
Doc   PDF   JPG   Local                Yes    Rasterised
Doc   JPG   PDF   Local                Yes    Yes
Dat   CSV   JSON  Local                Yes    Yes
Dat   JSON  CSV   Local                Yes    Flattened
Dat   XML   JSON  Local                Yes    Yes
Vid   MKV   MP4   Local (ffmpeg.wasm)  Yes    Stream-copy
Aud   WAV   MP3   Local (ffmpeg.wasm)  Yes    Lossy
Chapter 06 — Reader queries

Questions we keep getting by post and by email.

A small selection of the most frequent reader letters, with answers checked against the latest specifications. Have a question we haven't covered? Send it in — we revise this section each quarter.

Is HEIC-to-JPG truly lossless?

No, but the loss is negligible at quality 92 and above. Decoding the HEVC-coded image adds no loss of its own; the loss happens when the decoded pixels are re-encoded with JPG's DCT scheme. For photographs, the difference is below the threshold of perception on calibrated displays [1]. If you need pixel-perfect output for archival, convert HEIC to PNG instead.

Why does my WebP-to-PNG file end up larger?

That's expected. WebP compresses harder than PNG: its lossy mode discards detail using a psychovisual model, while PNG is strictly lossless and has to preserve every pixel value exactly. A 250 KB WebP typically expands to a 700–900 KB PNG. You're trading file size for compatibility, which is the whole point of the conversion.

Will the ICO file work as a Windows app icon?

Yes. The bundled sizes (16 / 32 / 48 / 64 / 128 / 256 px) match Microsoft's published guidance for executable resources and Start menu tiles [2]. Visual Studio's resource compiler will pick them up directly.

How does SVG-to-PNG handle high-DPI displays?

You set the output width. The SVG renderer rasterises at exactly that size, so the result looks crisp at the chosen resolution and slightly soft when scaled up further. For retina-class displays, export at 2× the intended display size; for print, 300 DPI is the safe default.

Does CSV-to-JSON detect types automatically?

Yes — numbers, booleans, ISO-8601 dates, and null-like sentinels (NA, N/A, empty string) are coerced. You can disable detection to keep everything as strings, which is the safest mode for downstream parsers that handle their own coercion.

Does JSON-to-CSV flatten nested objects?

Yes, using dotted keys (address.city). Arrays of primitives are joined with a separator you choose (default: |). Arrays of objects fan out into multiple rows by default, but you can switch to JSON-encoded cells if you want one row per input record.

What happens to fonts during DOCX-to-PDF?

Embedded fonts are passed straight through to the PDF. For non-embedded fonts, the converter substitutes a metrics-compatible match — same character widths, so line breaks and pagination don't shift. This is the same strategy Microsoft Word uses when "embed fonts" is enabled at save time [3].

Does MKV-to-MP4 re-encode the video?

Only when it must. If the MKV's video stream is H.264 or H.265 and the audio is AAC, the operation is a remux: the streams are copied byte-for-byte into an MP4 container [4]. The audio and video in the result are bit-identical to the source. Re-encoding kicks in only for codecs that MP4 players rarely accept (Vorbis or FLAC audio, Theora video, etc.).

Which bitrate should I pick for WAV-to-MP3?

320 kbps for music you intend to re-edit; 192 kbps for general listening (the long-standing public-broadcast default); 128 kbps for spoken-word podcasts. Our recommendation, in line with the AES guidance on perceptual audio coding [5], is to default to 192 kbps and step up only when re-encoding is anticipated.

Are uploaded files actually deleted?

Most of the conversions in this almanac never upload anything — image, data, audio, and video work happens inside your browser using WebAssembly. EPUB-to-PDF and DOCX-to-PDF do round-trip to a server because of font handling; those files are deleted within sixty minutes, never indexed, and never used to train any model.

Chapter 07 — Sources & further reading

What we read so you don't have to.

[1] ITU-T Rec. H.265 (v9), High Efficiency Video Coding, International Telecommunication Union, 2023. Sections on still-picture profiles used in HEIC.
[2] Microsoft Learn, Icons (Design basics), last updated 14 Feb 2025. Recommended sizes for application and shell icons.
[3] ECMA-376 Part 1 (5th ed.), Office Open XML File Formats — Fundamentals, ECMA International, 2023. Font referencing in WordprocessingML.
[4] ISO/IEC 14496-14:2020, Information technology — MP4 file format. Atom-level structure used by MKV → MP4 remuxing.
[5] Audio Engineering Society, AES17 — Measurement of digital audio equipment, 2020 revision. Background on perceptual coding thresholds.
[6] W3C, Portable Network Graphics (PNG) Specification (third edition), 2023.
[7] IDPF / W3C, EPUB 3.3 Recommendation, May 2023. Reflowable content model and TOC navigation.
[8] IETF RFC 4180, Common Format and MIME Type for CSV Files. The de-facto baseline for the CSV family.
[9] WHATWG, URL Living Standard §6 (file scheme), 2026. Cited for the section on browser-local file handling.
Chapter 08 — Subject index

By keyword, with page anchors.