Understand Base64 encoding and decoding with practical examples. Learn when to use it, how it works under the hood, and encode/decode strings online for free.
Every developer eventually runs into Base64. Maybe you are staring at a long string of letters and numbers in an API response and wondering what it means. Maybe you need to embed an image inside a JSON payload. Maybe you are debugging an email that arrived as a wall of seemingly random characters.
Base64 is one of those things that is everywhere but rarely explained well. Most people learn just enough to call btoa() and move on. That works until it does not — until you hit a Unicode string that breaks, or you need to understand why your encoded payload is 33 percent larger than the original, or you are asked in a code review why you chose Base64 instead of hex encoding.
This post covers all of it. What Base64 actually is, how the algorithm works step by step, when you should use it, when you should not, and practical code examples in JavaScript, Python, and the command line.
Base64 is a binary-to-text encoding scheme. It takes arbitrary binary data — bytes that could represent anything from a JPEG image to a cryptographic key — and converts it into a string of printable ASCII characters.
The name comes from the fact that it uses 64 characters to represent data. Those 64 characters are:
A through Z (26 characters)
a through z (26 characters)
0 through 9 (10 characters)
+ and / (2 characters)
There is also =, which is used as a padding character. More on that later.
The key thing to understand: Base64 is not encryption. It does not protect data. Anyone can decode a Base64 string. It is a transport encoding — a way to represent binary data in contexts that only support text.
The internet was originally built around text. Protocols like SMTP (email), HTTP headers, and early versions of HTML were designed to handle ASCII text, not raw binary data. Try sending a JPEG file through a protocol that expects printable characters and you will get corruption. Some bytes in a JPEG happen to be control characters, null bytes, or values above 127 that older systems do not handle correctly.
Base64 solves this by mapping every possible byte sequence to safe, printable ASCII characters. The tradeoff is size: Base64-encoded data is about 33 percent larger than the original, because you are representing 6 bits of data with each character instead of 8.
That 33 percent overhead matters. It is why you should not Base64-encode an entire video file and stick it in a JSON response. But for small payloads — authentication tokens, inline images in CSS, embedded files in API calls — the convenience outweighs the cost.
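The overhead figure falls straight out of the 3-bytes-in, 4-characters-out ratio. Here is a rough Python sketch of that arithmetic, checked against the standard library; the helper name encoded_length is made up for illustration.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input (hypothetical helper)."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


for size in (1, 2, 3, 1000, 10_000_000):
    actual = len(base64.b64encode(bytes(size)))
    assert actual == encoded_length(size)
    overhead = (actual - size) / size * 100
    print(f"{size:>10} bytes -> {actual:>10} chars ({overhead:.1f}% larger)")
```

For tiny inputs the relative overhead is much higher (1 byte still costs 4 characters), and it settles toward 33.3 percent as the input grows.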
Understanding the algorithm makes everything else click. Here is the step-by-step process.
Take your input string and get its binary representation. Each ASCII character is one byte (8 bits).
For the string Man:
M = 77 = 01001101
a = 97 = 01100001
n = 110 = 01101110
Concatenate the bits: 010011010110000101101110
Base64 uses 6-bit groups instead of 8-bit bytes. Why 6? Because 2 to the power of 6 equals 64, which gives us exactly 64 possible values — one for each character in the Base64 alphabet.
Split the 24 bits into groups of 6:
010011 010110 000101 101110
Each 6-bit group is a number between 0 and 63. Look up that number in the Base64 index table:
010011 = 19 = T
010110 = 22 = W
000101 = 5 = F
101110 = 46 = u
So Man encodes to TWFu.
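If you want to see those steps run, here is a minimal Python sketch for one full 3-byte group. The names ALPHABET and encode_triplet are illustrative only; real code should call the standard library.

```python
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (illustrative helper)."""
    assert len(chunk) == 3
    # Pack the three bytes into one 24-bit integer.
    bits = int.from_bytes(chunk, "big")
    # Peel off four 6-bit groups, most significant first.
    indexes = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)


print(f"{int.from_bytes(b'Man', 'big'):024b}")  # 010011010110000101101110
print(encode_triplet(b"Man"))  # TWFu
```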
The algorithm processes 3 bytes at a time (24 bits, which splits evenly into four 6-bit groups). But input is not always a multiple of 3 bytes.
When the input length is not divisible by 3, the algorithm pads the final group with zero bits and appends = characters to show how many bytes that group was missing.
For the string Ma (2 bytes, 16 bits):
M = 01001101
a = 01100001
Concatenated: 0100110101100001
We need 18 bits (three 6-bit groups), so pad with two zero bits:
010011 010110 000100
That gives us TWE. Since we had one byte of padding, we append one = sign: TWE=.
For a single character M (1 byte, 8 bits):
M = 01001101
We need 12 bits (two 6-bit groups), so pad with four zero bits:
010011 010000
That gives us TQ. Since we had two bytes of padding, we append two = signs: TQ==.
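The rule is simple:
Input length is a multiple of 3: no padding
One byte left over: two = signs (==)
Two bytes left over: one = sign (=)
You will never see three = signs because the cycle repeats every 3 bytes.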
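The padding rule can be folded into a small teaching encoder. This is a sketch under the assumption that clarity matters more than speed; b64encode_manual is a hypothetical name, and base64.b64encode is what you would use in practice.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64encode_manual(data: bytes) -> str:
    """Teaching-only Base64 encoder (hypothetical name); prefer base64.b64encode."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Zero-fill the chunk to 24 bits so the shifts below always work.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0b111111] for shift in (18, 12, 6, 0)]
        # Characters built purely from fill bits are replaced by "=".
        out.append("".join(chars[:4 - missing]) + "=" * missing)
    return "".join(out)


for sample in (b"Man", b"Ma", b"M", b""):
    assert b64encode_manual(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, "->", b64encode_manual(sample))
```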
For reference, here is the complete mapping:
Value Char Value Char Value Char Value Char
0 A 16 Q 32 g 48 w
1 B 17 R 33 h 49 x
2 C 18 S 34 i 50 y
3 D 19 T 35 j 51 z
4 E 20 U 36 k 52 0
5 F 21 V 37 l 53 1
6 G 22 W 38 m 54 2
7 H 23 X 39 n 55 3
8 I 24 Y 40 o 56 4
9 J 25 Z 41 p 57 5
10 K 26 a 42 q 58 6
11 L 27 b 43 r 59 7
12 M 28 c 44 s 60 8
13 N 29 d 45 t 61 9
14 O 30 e 46 u 62 +
15 P 31 f 47 v 63 /
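Decoding is the same table read in reverse. The sketch below assumes padded input in the standard alphabet and does no validation beyond a length check; b64decode_manual and VALUE_OF are illustrative names.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# Reverse lookup: character -> 6-bit value, e.g. "T" -> 19.
VALUE_OF = {char: value for value, char in enumerate(ALPHABET)}


def b64decode_manual(text: str) -> bytes:
    """Teaching-only decoder for padded, standard-alphabet input (hypothetical name)."""
    if len(text) % 4 != 0:
        raise ValueError("padded Base64 length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        padding = group.count("=")
        # Treat "=" as zero bits, then rebuild the 24-bit integer.
        bits = 0
        for char in group:
            bits = (bits << 6) | (0 if char == "=" else VALUE_OF[char])
        # Drop the bytes that exist only because of padding.
        out += bits.to_bytes(3, "big")[:3 - padding]
    return bytes(out)


assert b64decode_manual("TWFu") == b"Man"
assert b64decode_manual("TWE=") == b"Ma"
assert b64decode_manual("TQ==") == b"M"
assert b64decode_manual("SGVsbG8sIFdvcmxkIQ==") == base64.b64decode("SGVsbG8sIFdvcmxkIQ==")
```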
The standard Base64 alphabet uses + and /. These characters have special meaning in URLs and file systems, which causes problems. Several variants exist to address this.
Defined in RFC 4648, Base64url replaces + with - and / with _. It also typically omits padding (=). This is the variant used in JWTs, data URIs in some contexts, and anywhere Base64 data appears in a URL.
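As a rough illustration in Python, the standard library already knows the URL-safe alphabet; stripping and restoring padding is left to you. The helper names to_base64url and from_base64url are assumptions for this sketch.

```python
import base64


def to_base64url(data: bytes) -> str:
    """Encode bytes as unpadded Base64url (helper name is illustrative)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def from_base64url(text: str) -> bytes:
    """Decode Base64url text, restoring any padding that was stripped."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


raw = bytes([0xFB, 0xFF, 0xFE])        # chosen so the standard form contains + and /
print(base64.b64encode(raw).decode())  # +//+
print(to_base64url(raw))               # -__-
assert from_base64url(to_base64url(b"Hello, World!")) == b"Hello, World!"
```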
Used in email (MIME), this variant inserts line breaks every 76 characters. The actual character set is the same as standard Base64, but the line wrapping means you cannot compare MIME-encoded strings directly without stripping whitespace first.
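Python's base64.encodebytes gives a feel for the wrapped form. Note one assumption in this sketch: encodebytes breaks lines with a bare newline, whereas real MIME messages use CRLF, so treat it as an approximation of what you would see in an email source.

```python
import base64

data = bytes(range(256))
one_line = base64.b64encode(data)
wrapped = base64.encodebytes(data)  # newline after every 76 characters of output

lines = wrapped.splitlines()
print(len(one_line), len(lines), max(len(line) for line in lines))

# The raw strings differ, so compare only after removing whitespace.
assert wrapped != one_line
assert b"".join(wrapped.split()) == one_line
# b64decode discards non-alphabet characters by default, so both decode identically.
assert base64.b64decode(wrapped) == base64.b64decode(one_line) == data
```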
JavaScript has built-in functions for Base64, but they have a well-known limitation with Unicode.
// Encode
const encoded = btoa("Hello, World!");
console.log(encoded); // "SGVsbG8sIFdvcmxkIQ=="
// Decode
const decoded = atob("SGVsbG8sIFdvcmxkIQ==");
console.log(decoded); // "Hello, World!"
Try encoding a string with characters outside Latin-1 (code points above 255), such as an emoji, and btoa throws an error:
btoa("Hello, 🌍"); // DOMException: The string contains characters outside of the Latin1 range
The fix is to encode the string as UTF-8 first, then Base64-encode the bytes:
// Encode Unicode to Base64
function encodeBase64(str) {
  const bytes = new TextEncoder().encode(str);
  const binary = Array.from(bytes, (byte) => String.fromCodePoint(byte)).join("");
  return btoa(binary);
}
// Decode Base64 to Unicode
function decodeBase64(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (char) => char.codePointAt(0));
  return new TextDecoder().decode(bytes);
}
console.log(encodeBase64("Hello, 🌍")); // "SGVsbG8sIPCfjI0="
console.log(decodeBase64("SGVsbG8sIPCfjI0=")); // "Hello, 🌍"
In Node.js, the Buffer class handles this more cleanly:
// Encode
const encoded = Buffer.from("Hello, World!", "utf-8").toString("base64");
console.log(encoded); // "SGVsbG8sIFdvcmxkIQ=="
// Decode
const decoded = Buffer.from("SGVsbG8sIFdvcmxkIQ==", "base64").toString("utf-8");
console.log(decoded); // "Hello, World!"
// Base64url variant
const urlSafe = Buffer.from("Hello, World!", "utf-8").toString("base64url");
console.log(urlSafe); // "SGVsbG8sIFdvcmxkIQ"
Python makes Base64 handling straightforward with the base64 module.
import base64
# Encode a string
text = "Hello, World!"
encoded = base64.b64encode(text.encode("utf-8"))
print(encoded) # b'SGVsbG8sIFdvcmxkIQ=='
# Decode back to string
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded) # "Hello, World!"
# Base64url variant
url_encoded = base64.urlsafe_b64encode(text.encode("utf-8"))
print(url_encoded) # b'SGVsbG8sIFdvcmxkIQ=='
# Encode a file
with open("image.png", "rb") as f:
    file_encoded = base64.b64encode(f.read())
print(file_encoded[:50]) # First 50 bytes of the encoded file
Python also handles Unicode naturally since you explicitly encode to bytes before Base64 encoding:
import base64
text = "Merhaba Dunya"
encoded = base64.b64encode(text.encode("utf-8"))
print(encoded) # b'TWVyaGFiYSBEdW55YQ=='
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded) # "Merhaba Dunya"
The base64 command is available on most Unix systems:
# Encode a string
echo -n "Hello, World!" | base64
# SGVsbG8sIFdvcmxkIQ==
# Decode a string
echo "SGVsbG8sIFdvcmxkIQ==" | base64 --decode
# Hello, World!
# Encode a file
base64 image.png > image_encoded.txt
# Decode a file
base64 --decode image_encoded.txt > image_decoded.png
The -n flag on echo is important. Without it, echo appends a newline character, and that newline gets encoded too.
OpenSSL provides another way to handle Base64 from the terminal:
# Encode
echo -n "Hello, World!" | openssl base64
# SGVsbG8sIFdvcmxkIQ==
# Decode
echo "SGVsbG8sIFdvcmxkIQ==" | openssl base64 -d
# Hello, World!
In PowerShell, the .NET Convert class does the same job:
# Encode
[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("Hello, World!"))
# SGVsbG8sIFdvcmxkIQ==
# Decode
[Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("SGVsbG8sIFdvcmxkIQ=="))
# Hello, World!
Base64 is the right choice in several specific scenarios.
JSON does not support binary data natively. If you need to include an image, PDF, or any binary file in a JSON payload, Base64 is the standard approach:
{
  "filename": "document.pdf",
  "content": "JVBERi0xLjQKMSAwIG9iago8PCAvVHlwZSAvQ2F0YWxvZw...",
  "contentType": "application/pdf"
}
You can embed small images directly in HTML or CSS using data URIs:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..." alt="icon" />
.icon {
  background-image: url("data:image/svg+xml;base64,PHN2ZyB4bWxu...");
}
This eliminates an HTTP request for small assets. For icons under a few kilobytes, the reduced latency often outweighs the size overhead.
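Building such a URI yourself takes only a few lines. The sketch below uses a made-up helper, to_data_uri, and a tiny placeholder SVG rather than a real icon.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a Base64 data URI from raw bytes (hypothetical helper)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


# Placeholder asset for illustration, not a real icon.
svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="8" height="8"/>'
uri = to_data_uri(svg, "image/svg+xml")
print(uri[:40])
print(len(svg), "bytes of SVG ->", len(uri), "characters of data URI")
```

The printed lengths make the tradeoff concrete: the URI carries both the Base64 overhead and the data: prefix.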
MIME encoding uses Base64 to attach binary files to emails. When you attach a PDF to an email, your mail client Base64-encodes it behind the scenes.
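You can watch this happen with Python's standard email package. In this sketch the attachment bytes are a stand-in rather than a real PDF, and the subject and filename are placeholders.

```python
import base64
from email.message import EmailMessage

pdf_bytes = b"%PDF-1.4\n" + bytes(range(256))  # stand-in for a real PDF file

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See the attached file.")
msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                   filename="report.pdf")

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])  # base64
print(attachment.get_payload()[:12])            # start of the Base64 body

# The wire format is Base64 text; get_content() reverses it.
assert base64.b64decode(attachment.get_payload()) == pdf_bytes
assert attachment.get_content() == pdf_bytes
```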
HTTP Basic Authentication Base64-encodes the username:password pair:
Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
Note: this is encoding, not encryption. The credentials are trivially decodable. Always use HTTPS when sending Basic Auth headers.
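To make the point concrete, here is a short Python sketch that builds the header value and then reverses it. The function name basic_auth_header is illustrative, not part of any library.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value (illustrative helper)."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=

# Anyone who sees the header can reverse it in one call.
scheme, token = header.split(" ", 1)
print(base64.b64decode(token).decode("utf-8"))  # username:password
```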
Some legacy systems or configurations only support text fields. Base64 lets you store binary data in those fields, though a proper binary column is always preferable when available.
The 33 percent size overhead adds up fast. A 10 MB file becomes roughly 13.3 MB after Base64 encoding. For large files, use multipart form uploads or binary protocols instead.
Base64 is not a security mechanism. Encoding credentials in Base64 does not protect them. If you see someone describing Base64 as "encrypted," correct them. Use actual encryption (AES, RSA, or a higher-level library like libsodium) for sensitive data.
Some developers Base64-encode strings to hide them from casual inspection. This is security through obscurity and provides zero protection. Any developer can decode it in seconds.
Standard Base64 contains +, /, and =, all of which have special meaning in URLs. If you must put Base64 data in a URL, use the Base64url variant or properly URL-encode the string.
MIME-encoded Base64 includes line breaks every 76 characters. If you copy a Base64 string from an email source and try to decode it, strip the whitespace first:
const clean = encodedString.replace(/\s/g, "");
const decoded = atob(clean);
Some implementations omit padding (=). Most decoders handle this gracefully, but if you get errors, try adding padding:
function addPadding(base64) {
  const remainder = base64.length % 4;
  if (remainder === 2) return base64 + "==";
  if (remainder === 3) return base64 + "=";
  return base64;
}
A surprisingly common bug: encoding data that is already Base64-encoded. If you see output like U0dWc2JHOHNJRmR2Y214a0lRPT0=, you might be looking at double-encoded data: a single decode gives SGVsbG8sIFdvcmxkIQ==, which is itself Base64. Decode it twice.
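A small Python sketch reproduces the double-encoding bug and its fix, using Hello, World! as the sample input.

```python
import base64

once = base64.b64encode(b"Hello, World!")
twice = base64.b64encode(once)  # the bug: encoding already-encoded data

print(once.decode())   # SGVsbG8sIFdvcmxkIQ==
print(twice.decode())  # U0dWc2JHOHNJRmR2Y214a0lRPT0=

# One decode yields text that is still valid Base64, a hint that it was encoded twice.
first_pass = base64.b64decode(twice)
assert first_pass == once
assert base64.b64decode(first_pass) == b"Hello, World!"
```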
If you encode a string as UTF-8 on one system and decode it as Latin-1 on another, you will get garbled output. Always agree on the character encoding (UTF-8 is the safe default) before Base64 encoding.
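The mismatch is easy to reproduce. In this sketch the sample word is arbitrary; any string with a non-ASCII character would show the same effect.

```python
import base64

original = "café"
encoded = base64.b64encode(original.encode("utf-8"))

wrong = base64.b64decode(encoded).decode("latin-1")  # receiver assumes Latin-1
right = base64.b64decode(encoded).decode("utf-8")    # receiver matches the sender

print(wrong)  # cafÃ©
print(right)  # café
```

Base64 itself round-trips the bytes perfectly in both cases; only the final bytes-to-text step differs.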
JSON Web Tokens use Base64url encoding (not standard Base64) for their header and payload segments. A JWT looks like this:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
Each segment between dots is Base64url-encoded JSON. You can decode the payload to inspect claims without verifying the signature:
function decodeJwtPayload(token) {
  const payload = token.split(".")[1];
  // Map the Base64url alphabet back to standard Base64 for atob
  const base64 = payload.replace(/-/g, "+").replace(/_/g, "/");
  return JSON.parse(atob(base64));
}
const claims = decodeJwtPayload("eyJhbGciOiJIUzI1NiIs...");
console.log(claims);
// { sub: "1234567890", name: "John Doe", iat: 1516239022 }
This is useful for debugging, but never trust decoded JWT claims without verifying the signature server-side.
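The same inspection in Python needs one extra step, because base64.urlsafe_b64decode expects padding that JWT segments leave off. This is a debugging sketch only, with a function name chosen for illustration; it does not verify anything.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment without verifying the signature (debugging only)."""
    payload = token.split(".")[1]
    # JWT segments drop "=" padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)
print(decode_jwt_payload(token))
```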
Base64 encoding and decoding is fast. On modern hardware, you can encode hundreds of megabytes per second. The bottleneck is almost never the encoding itself — it is the increased data size.
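If you want a number for your own machine, a rough timing sketch like the one below is enough. The 16 MiB buffer of zero bytes is an arbitrary stand-in, and a single run is only a ballpark figure.

```python
import base64
import time

data = bytes(16 * 1024 * 1024)  # 16 MiB of zero bytes as a stand-in payload

start = time.perf_counter()
encoded = base64.b64encode(data)
elapsed = time.perf_counter() - start

mib = len(data) / (1024 * 1024)
print(f"encoded {mib:.0f} MiB in {elapsed:.3f} s ({mib / elapsed:.0f} MiB/s)")
```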
In a web context, the 33 percent overhead is negligible for most use cases: API tokens, small embedded images, configuration values. For anything over a few hundred kilobytes, consider whether Base64 is really the right tool.
If you need a quick way to encode or decode Base64 strings without writing code, you can use the free Base64 Encoder and Decoder tool on akousa.net. It runs entirely in your browser — nothing is sent to a server — so it is safe for sensitive data like API keys or tokens. It handles Unicode correctly and supports both standard Base64 and the Base64url variant.
Base64 is a simple, well-defined encoding that converts binary data to printable ASCII text. It is not encryption, not compression, and not obfuscation. It is a transport encoding that solves a real problem: getting binary data through text-only channels.
The algorithm regroups the input bits into 6-bit chunks and maps each chunk to one of 64 printable characters. The output is about 33 percent larger than the input, and slightly more for very short inputs once padding is counted. Padding with = handles inputs that are not a multiple of 3 bytes.
Use it for embedding binary data in JSON, data URIs, email attachments, and authentication headers. Do not use it for large files, security, or anything where the size overhead matters. Always use the Base64url variant when putting encoded data in URLs or JWTs.
Every major language has built-in Base64 support. In JavaScript, watch out for the Unicode limitation of btoa and atob. In Python, remember to encode strings to bytes first. On the command line, remember the -n flag with echo to avoid encoding a trailing newline.
Base64 is one of those tools that rewards understanding. Once you know how it works, you stop treating it as a black box and start making better decisions about when and how to use it.