Base64 Encoding Explained: How It Works and When to Use It
Base64 appears everywhere in web development—JWT tokens, data URIs, email attachments, API authentication headers, and binary-to-text serialization. Yet many developers treat it as a black box: data goes in, a long string of letters comes out. Understanding how it actually works helps you use it correctly and avoid common pitfalls. This guide walks through the algorithm, variants, use cases, and when to choose an alternative.
Encode and decode Base64 strings instantly with our Base64 Encoder/Decoder.
The Algorithm: 3 Bytes → 4 Characters
Base64 works by taking 3 bytes of binary data (24 bits) and encoding them as 4 ASCII characters (6 bits each). This is why it's called Base64—each character represents 6 bits, and 2^6 = 64 possible values.
Let's encode the ASCII string "Man" step by step:
Character: M a n
ASCII: 77 97 110
Binary: 01001101 01100001 01101110
Combined 24 bits: 010011010110000101101110
Split into 6-bit groups:
010011 = 19 → T
010110 = 22 → W
000101 = 5 → F
101110 = 46 → u
Result: "TWFu"
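The same steps can be sketched in a few lines of Python. This is an illustration of the bit manipulation rather than how a production encoder is written; the helper name `encode_group` is hypothetical, and the result is checked against the standard library's `base64.b64encode`.

```python
import base64
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, then + and /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_group(group: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (hypothetical helper)."""
    if len(group) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Pack the 3 bytes into a single 24-bit integer.
    bits = int.from_bytes(group, "big")
    # Read four 6-bit values, most significant first, and map each to a character.
    return "".join(ALPHABET[(bits >> shift) & 0b111111] for shift in (18, 12, 6, 0))

print(encode_group(b"Man"))  # TWFu
assert encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii")
```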
The 64-Character Alphabet
Standard Base64 uses exactly 64 characters:
- Uppercase letters: A–Z (values 0–25)
- Lowercase letters: a–z (values 26–51)
- Digits: 0–9 (values 52–61)
- Plus sign: + (value 62)
- Forward slash: / (value 63)
Plus the padding character =, which is not part of the alphabet but indicates the input was not a multiple of 3 bytes.
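Decoding runs the lookup in the opposite direction: each character maps back to its 6-bit value. A minimal sketch, assuming the alphabet order listed above (the names `ALPHABET` and `VALUE_OF` are just illustrative):

```python
import string

# Index in this string = 6-bit value of the character
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Reverse table used when decoding: character -> 6-bit value
VALUE_OF = {char: value for value, char in enumerate(ALPHABET)}

values = [VALUE_OF[c] for c in "TWFu"]
print(len(ALPHABET))  # 64
print(values)         # [19, 22, 5, 46]
```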
Padding with =
Since we process 3 bytes at a time, input lengths that are not divisible by 3 require padding:
// 1 remaining byte → 2 Base64 chars + == padding
"M" → 01001101 00000000 00000000
→ 010011 010000 000000 000000
→ T Q = =
Result: "TQ=="
// 2 remaining bytes → 3 Base64 chars + = padding
"Ma" → 01001101 01100001 00000000
→ 010011 010110 000100 000000
→ T W E =
Result: "TWE="
// 3 bytes exactly → 4 Base64 chars, no padding
"Man" → "TWFu"
This means the length of a padded Base64 string is always a multiple of 4.
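The padding rules can be folded into a small encoder. The sketch below is unoptimized and exists only to make the rules concrete; the function name `encode` is hypothetical, and each result is compared with Python's built-in `base64.b64encode`.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode(data: bytes) -> str:
    """Encode bytes as padded standard Base64 (illustrative, not optimized)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Fill the group with zero bytes so it can be read as 24 bits.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that come only from filler bytes are replaced by '='.
        if missing:
            chars[4 - missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)

for sample in (b"M", b"Ma", b"Man"):
    assert encode(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, "->", encode(sample))
```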
The base64url Variant
Standard Base64's + and / characters have special meaning in URLs—they would need to be percent-encoded as %2B and %2F. The base64url variant solves this with two substitutions:
- + → - (hyphen)
- / → _ (underscore)
- Trailing = padding is typically omitted
Base64url is used in JWTs, OAuth 2.0 authorization codes, and any context where the encoded data appears in URLs or HTTP headers. Use our JWT Decoder to inspect the base64url-encoded payload of any JWT.
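Because the two variants differ only in two characters and the padding, converting between them is plain string manipulation. A sketch, with hypothetical helper names `to_base64url` and `from_base64url`:

```python
import base64

def to_base64url(standard: str) -> str:
    """Convert padded standard Base64 text to unpadded base64url."""
    return standard.rstrip("=").replace("+", "-").replace("/", "_")

def from_base64url(urlsafe: str) -> str:
    """Convert base64url text back to padded standard Base64."""
    standard = urlsafe.replace("-", "+").replace("_", "/")
    # Restore the padding so the length is a multiple of 4 again.
    return standard + "=" * (-len(standard) % 4)

raw = bytes([0xFB, 0xFF, 0xBF])  # chosen so the standard encoding is "+/+/"
standard = base64.b64encode(raw).decode("ascii")
print(standard, "->", to_base64url(standard))

# Round trip through the URL-safe form gives back the original bytes.
assert base64.b64decode(from_base64url(to_base64url(standard))) == raw
```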
Data URIs: Embedding Files in HTML/CSS
The data: URI scheme allows embedding file content directly in HTML or CSS, eliminating a network request:
/* Format: data:[mediatype][;base64],data */
<!-- Inline PNG favicon -->
<link rel="icon" href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAB...">
/* Inline SVG in CSS */
.icon {
  background-image: url('data:image/svg+xml;base64,PHN2ZyB4bWxucz0i...');
}
/* Inline SVG without base64 (URL-encode instead) */
.icon {
  background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg'...");
}
<!-- Small inline images in HTML -->
<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
     alt="1x1 transparent pixel">
Convert any image to a data URI with our Image to Base64 tool.
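If you would rather build a data URI in code, the format is simple enough to assemble by hand. A minimal sketch in Python, assuming you already know the media type; the helper name `to_data_uri` is hypothetical, and the sample bytes are the 1x1 GIF from the HTML example above:

```python
import base64

def to_data_uri(data: bytes, media_type: str) -> str:
    """Build a base64 data URI from raw bytes (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

# The 1x1 transparent GIF used in the HTML example
gif_b64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
gif_bytes = base64.b64decode(gif_b64)

uri = to_data_uri(gif_bytes, "image/gif")
print(gif_bytes[:6])  # b'GIF89a'
print(uri)
```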
The 33% Size Overhead
Base64 encodes every 3 bytes as 4 characters. This is an inherent 33.3% size increase (4/3 = 1.333...). Additionally, each character in ASCII/UTF-8 is 1 byte, so the byte overhead is also 33%.
Original file: 100 KB
Base64 encoded: ~133 KB (33% larger)
Original file: 1 MB PNG
As data URI: ~1.33 MB of text
1 MB image in data URI vs external file:
- Data URI: 1.33 MB in HTML (not separately cacheable)
- External file: 1 MB with proper caching headers
HTTP compression (gzip/brotli) offsets part of this overhead: each Base64 character carries only 6 bits of information in an 8-bit byte, so the encoded text compresses back toward the original size. But for files larger than ~5 KB, the caching and parallel loading benefits of external files usually outweigh the single-request convenience of data URIs.
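You can check these numbers yourself. The sketch below makes two assumptions: random bytes stand in for an already-compressed file such as a PNG, and `zlib` (DEFLATE) stands in for gzip on the wire. The helper name `base64_length` is hypothetical, and the compressed size will vary slightly from run to run.

```python
import base64
import os
import zlib

def base64_length(n_bytes: int) -> int:
    """Length of the padded Base64 output for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)  # one 4-character group per started 3 bytes

payload = os.urandom(100 * 1024)         # assumption: random data ~ compressed image
encoded = base64.b64encode(payload)
compressed = zlib.compress(encoded, 9)   # assumption: DEFLATE approximates gzip

assert len(encoded) == base64_length(len(payload))
print("original:  ", len(payload))
print("base64:    ", len(encoded))
print("compressed:", len(compressed))
```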
Base64 in JavaScript: Browser Patterns
The browser's built-in btoa() and atob() functions are limited to Latin-1 characters (code points 0 to 255); btoa() throws on anything else. For Unicode strings, convert to UTF-8 bytes with TextEncoder first; for binary data, work with typed arrays or FileReader:
// btoa/atob: only for ASCII/Latin-1 strings
btoa('Hello World'); // "SGVsbG8gV29ybGQ="
atob('SGVsbG8gV29ybGQ='); // "Hello World"
// For Unicode strings: encode to UTF-8 bytes first
function encodeBase64Unicode(str) {
  const bytes = new TextEncoder().encode(str);
  const binStr = Array.from(bytes, b => String.fromCodePoint(b)).join('');
  return btoa(binStr);
}
function decodeBase64Unicode(b64) {
  const binStr = atob(b64);
  const bytes = Uint8Array.from(binStr, c => c.codePointAt(0));
  return new TextDecoder().decode(bytes);
}
// ArrayBuffer to Base64 via FileReader (asynchronous)
async function arrayBufferToBase64(buffer) {
  const blob = new Blob([buffer]);
  const dataUrl = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error); // surface read failures instead of hanging
    reader.readAsDataURL(blob);
  });
  return dataUrl.split(',')[1]; // Remove "data:...;base64," prefix
}
// Base64 to ArrayBuffer
function base64ToArrayBuffer(base64) {
  const binStr = atob(base64);
  const bytes = new Uint8Array(binStr.length);
  for (let i = 0; i < binStr.length; i++) {
    bytes[i] = binStr.charCodeAt(i);
  }
  return bytes.buffer;
}
// base64url variant in browser
function encodeBase64Url(str) {
  return btoa(str)
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=/g, '');
}
function decodeBase64Url(str) {
  // Restore standard Base64
  str = str.replace(/-/g, '+').replace(/_/g, '/');
  // Add padding if needed
  while (str.length % 4) str += '=';
  return atob(str);
}
Node.js and Python
// Node.js: Buffer-based (handles binary correctly)
const encoded = Buffer.from('Hello, World!').toString('base64');
// "SGVsbG8sIFdvcmxkIQ=="
const decoded = Buffer.from('SGVsbG8sIFdvcmxkIQ==', 'base64').toString('utf8');
// "Hello, World!"
// Node.js: base64url
const encoded_url = Buffer.from('data').toString('base64url');
// "ZGF0YQ" (no padding; the 'base64url' encoding needs a recent Node.js release)
// Node.js: from binary file
const fs = require('fs');
const imageData = fs.readFileSync('image.png');
const base64Image = imageData.toString('base64');
const dataUri = `data:image/png;base64,${base64Image}`;
# Python
import base64
# Standard Base64
encoded = base64.b64encode(b'Hello, World!').decode('utf-8')
# 'SGVsbG8sIFdvcmxkIQ=='
decoded = base64.b64decode('SGVsbG8sIFdvcmxkIQ==').decode('utf-8')
# 'Hello, World!'
# base64url (URL-safe)
encoded_url = base64.urlsafe_b64encode(b'data+/test').decode('utf-8')
# 'ZGF0YSsvdGVzdA==' (Python keeps the = padding; strip it with .rstrip('=') if needed)
# From file
with open('image.png', 'rb') as f:
    encoded = base64.b64encode(f.read()).decode('utf-8')
When NOT to Use Base64
Base64 is not appropriate for every binary-to-text serialization task:
- Large file uploads: Use multipart/form-data or presigned S3 URLs. Base64-encoding a 10 MB video adds 3.3 MB of overhead and puts the entire file in memory.
- Password storage: Never use Base64 as a security measure. It is encoding, not encryption. Use bcrypt, scrypt, or Argon2 for password hashing.
- Data stored in databases: Store binary files in blob storage (S3, GCS) and save the URL. Storing Base64 in a database column is wasteful and slow to query.
- Large image in CSS: Images larger than ~5 KB as data URIs bloat your CSS file, prevent caching the image independently, and delay rendering.
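To see why Base64 offers no protection for secrets, note that decoding needs no key at all. A short illustration; the sample password is made up:

```python
import base64

# "Protecting" a password with Base64...
stored = base64.b64encode(b"hunter2").decode("ascii")

# ...is undone by anyone with a single call and no secret.
recovered = base64.b64decode(stored)

print(stored)
print(recovered)
```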
Alternatives:
- Presigned S3/GCS URLs for time-limited access to large files
- Multipart upload for files over 5 MB
- Hex encoding when working with cryptographic hashes (longer than Base64 at 2 characters per byte, but case-insensitive and easier to read and compare)
- Protocol Buffers or MessagePack for binary serialization without the overhead
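For a sense of how hex and Base64 compare in size, here is a SHA-256 digest rendered both ways. The input string is arbitrary; only the lengths matter:

```python
import base64
import hashlib

digest = hashlib.sha256(b"hello").digest()           # 32 raw bytes
as_hex = digest.hex()                                # 2 characters per byte
as_b64 = base64.b64encode(digest).decode("ascii")    # 4 characters per 3 bytes, padded

print(len(digest), len(as_hex), len(as_b64))  # 32 64 44
```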
For JWT inspection, our JWT Decoder handles base64url decoding automatically. For generating secure tokens, see our Base64 Encoder.