CSV to JSON Conversion: A Practical Guide

CodeKit
csv, json, data-conversion

Why Convert CSV to JSON?

CSV (Comma-Separated Values) and JSON (JavaScript Object Notation) are two of the most common data formats, but they serve different purposes. CSV is great for spreadsheets and tabular data, while JSON excels at representing structured, hierarchical data. Converting between them is a frequent task in data processing, API development, and configuration management.

CSV vs JSON: A Quick Comparison

Feature        | CSV                      | JSON
Structure      | Flat, tabular            | Hierarchical, nested
Data types     | All strings              | Strings, numbers, booleans, null
Headers        | First row (optional)     | Keys in each object
Nesting        | Not supported natively   | Fully supported
Human-readable | Yes (for small datasets) | Yes (especially when formatted)
File size      | Smaller                  | Larger (repeated keys)
Parsing speed  | Faster                   | Slightly slower

Basic Conversion

Here’s a simple CSV file and its JSON equivalent:

CSV:

name,age,city
Alice,30,New York
Bob,25,San Francisco
Charlie,35,Chicago

JSON:

[
  { "name": "Alice", "age": 30, "city": "New York" },
  { "name": "Bob", "age": 25, "city": "San Francisco" },
  { "name": "Charlie", "age": 35, "city": "Chicago" }
]

Notice that in JSON the age values are numbers rather than strings. JSON distinguishes strings, numbers, booleans, and null, while CSV stores every value as text.

Writing a CSV-to-JSON Converter

Simple Implementation

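function csvToJson(csv) {
  // Split into lines, accepting both \n and \r\n line endings
  const lines = csv.trim().split(/\r?\n/);
  const headers = lines[0].split(',');

  // Turn every remaining line into an object keyed by header name
  return lines.slice(1).map(line => {
    const values = line.split(',');
    const obj = {};
    headers.forEach((header, index) => {
      // Missing trailing values become empty strings
      obj[header.trim()] = values[index]?.trim() || '';
    });
    return obj;
  });
}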

const csv = `name,age,city
Alice,30,New York
Bob,25,San Francisco`;

console.log(JSON.stringify(csvToJson(csv), null, 2));

This works for simple cases, but real-world CSV data is rarely this clean.

Handling Common CSV Pitfalls

1. Quoted Fields with Commas

CSV fields containing commas must be wrapped in quotes:

name,description
"Smith, John","A developer, designer"

A naive split(',') would break this row into four values instead of two. You need a parser that tracks whether it is inside quotes:

function parseCSVLine(line, delimiter = ',') {
  const result = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (char === '"') {
      if (inQuotes && line[i + 1] === '"') {
        current += '"';
        i++; // Skip escaped quote
      } else {
        inQuotes = !inQuotes; // Opening or closing quote
      }
    } else if (char === delimiter && !inQuotes) {
      // A delimiter outside quotes ends the current field
      result.push(current);
      current = '';
    } else {
      current += char;
    }
  }
  result.push(current); // Last field has no trailing delimiter
  return result;
}
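
One case the line-based parser above cannot cover is a quoted field that spans several lines, because the input has already been split on newlines by the time the parser runs. The sketch below shows one way to handle it by scanning the whole text character by character. The name parseCSV and its delimiter parameter are illustrative assumptions, not part of the converter shown in this guide.

```javascript
// Hypothetical helper: parses a whole CSV string into an array of rows,
// keeping newlines that appear inside quoted fields.
function parseCSV(text, delimiter = ',') {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const char = text[i];
    if (inQuotes) {
      if (char === '"' && text[i + 1] === '"') {
        field += '"'; // A doubled quote inside quotes is a literal quote
        i++;
      } else if (char === '"') {
        inQuotes = false; // Closing quote
      } else {
        field += char; // Delimiters and newlines are kept as data here
      }
    } else if (char === '"') {
      inQuotes = true;
    } else if (char === delimiter) {
      row.push(field);
      field = '';
    } else if (char === '\n' || char === '\r') {
      if (char === '\r' && text[i + 1] === '\n') i++; // Treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else {
      field += char;
    }
  }
  // Flush the last row unless the input ended with a line break
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const sample = 'name,note\n"Smith, John","line one\nline two"\n';
console.log(parseCSV(sample));
```

The first row returned is the header row, so a caller would pair rows[0] with each later row to build objects, just as csvToJson does.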

2. Different Delimiters

Not all CSV files use commas. Tab-separated values (TSV) and semicolon-separated files are common, especially in European locales where commas are used as decimal separators.

function csvToJson(csv, delimiter = ',') {
  // Accept both \n and \r\n line endings
  const lines = csv.trim().split(/\r?\n/);
  // parseCSVLine must accept the delimiter as its second parameter and
  // compare against it instead of a hard-coded comma
  const headers = parseCSVLine(lines[0], delimiter);

  return lines.slice(1).map(line => {
    const values = parseCSVLine(line, delimiter);
    const obj = {};
    headers.forEach((header, index) => {
      obj[header.trim()] = values[index]?.trim() || '';
    });
    return obj;
  });
}

// For tab-separated data
const tsvData = csvToJson(tsvContent, '\t');

// For semicolon-separated data
const csvData = csvToJson(csvContent, ';');
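
When the delimiter is not known in advance, one common approach is to guess it from the first line. The following is a minimal sketch of such a heuristic. The function name detectDelimiter and the candidate list are assumptions made for illustration, and the guess can be wrong for unusual files, so an explicit delimiter is preferable whenever it is known.

```javascript
// Hypothetical helper: guesses the delimiter by counting how often each
// candidate appears outside quotes on the first line. A heuristic only.
function detectDelimiter(csv, candidates = [',', ';', '\t', '|']) {
  const firstLine = csv.split(/\r?\n/, 1)[0];
  const counts = new Map(candidates.map(c => [c, 0]));
  let inQuotes = false;

  for (const char of firstLine) {
    if (char === '"') {
      inQuotes = !inQuotes; // A doubled quote toggles twice, which nets out
    } else if (!inQuotes && counts.has(char)) {
      counts.set(char, counts.get(char) + 1);
    }
  }

  // Pick the most frequent candidate; default to a comma if none was seen
  let best = ',';
  let bestCount = 0;
  for (const [candidate, count] of counts) {
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  }
  return best;
}

console.log(JSON.stringify(detectDelimiter('name;price\nWidget;1,50')));
console.log(JSON.stringify(detectDelimiter('a\tb\tc\n1\t2\t3')));
```

The result can be passed straight to csvToJson as its second argument.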

3. Type Inference

CSV values are all strings, but you often want numbers, booleans, or null in your JSON:

function inferType(value) {
  if (value === '' || value === 'null' || value === 'NULL') return null;
  if (value === 'true' || value === 'TRUE') return true;
  if (value === 'false' || value === 'FALSE') return false;
  if (!isNaN(value) && value.trim() !== '') return Number(value);
  return value;
}

// Apply type inference during conversion: inside csvToJson, the loop that
// fills each row object passes every value through inferType
headers.forEach((header, index) => {
  obj[header.trim()] = inferType(values[index]?.trim() || '');
});
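
Type inference can also change data you wanted to keep as text. The isNaN-based check above accepts any string that Number() can convert, so a postal code such as 02134 loses its leading zero and a value such as 0x1F is read as hexadecimal. The sketch below is one stricter alternative. The name inferTypeStrict and the exact rules (plain decimals only, case-insensitive keywords) are assumptions chosen for illustration, not a standard.

```javascript
// Plain decimal numbers only: optional minus sign, no leading zeros,
// optional fraction. Hex ("0x1F"), exponents ("1e3") and "Infinity" stay text.
const DECIMAL = /^-?(0|[1-9]\d*)(\.\d+)?$/;

// Hypothetical stricter variant of inferType
function inferTypeStrict(value) {
  const text = value.trim();
  const lower = text.toLowerCase();
  if (text === '' || lower === 'null') return null;
  if (lower === 'true') return true;
  if (lower === 'false') return false;
  if (DECIMAL.test(text)) {
    const number = Number(text);
    // Keep very large integers as text rather than silently losing digits
    if (Number.isInteger(number) && !Number.isSafeInteger(number)) return text;
    return number;
  }
  return text;
}

for (const sample of ['30', '02134', '0x1F', '-1.50', 'TRUE', '']) {
  console.log(JSON.stringify(sample), '->', JSON.stringify(inferTypeStrict(sample)));
}
```

Even with stricter rules, columns such as IDs or phone numbers are often safer left as strings, so a per-column opt-out is worth considering.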

Handling Nested Data

CSV is inherently flat, but sometimes you need nested JSON structures. A common convention is to use dot notation in headers:

user.name,user.email,address.city,address.zip
Alice,alice@example.com,New York,10001

Converts to:

[
  {
    "user": { "name": "Alice", "email": "alice@example.com" },
    "address": { "city": "New York", "zip": 10001 }
  }
]
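
A helper along these lines builds the nested object from each dotted header:

function setNestedValue(obj, path, value) {
  const keys = path.split('.');
  let current = obj;
  // Walk (and create) every intermediate object along the path
  for (let i = 0; i < keys.length - 1; i++) {
    const next = current[keys[i]];
    // Replace anything that is not an object so the next step can descend
    if (typeof next !== 'object' || next === null) current[keys[i]] = {};
    current = current[keys[i]];
  }
  // The last key receives the value itself
  current[keys[keys.length - 1]] = value;
}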
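
To show how the dot-notation convention fits into a conversion, here is a self-contained sketch that turns one parsed row into a nested object. The helper rowToNestedObject and the guard against headers such as __proto__ are assumptions added for illustration. The guard is there because header names come from the file and should not be able to modify object prototypes.

```javascript
// Header segments that must never be used as object keys
const UNSAFE_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

// Same idea as setNestedValue, repeated so this sketch runs on its own
function setNestedValue(obj, path, value) {
  const keys = path.split('.');
  if (keys.some(key => UNSAFE_KEYS.has(key))) {
    throw new Error(`Unsafe header: ${path}`);
  }
  let current = obj;
  for (let i = 0; i < keys.length - 1; i++) {
    if (typeof current[keys[i]] !== 'object' || current[keys[i]] === null) {
      current[keys[i]] = {};
    }
    current = current[keys[i]];
  }
  current[keys[keys.length - 1]] = value;
}

// Hypothetical helper: builds one nested object from parallel arrays
function rowToNestedObject(headers, values) {
  const obj = {};
  headers.forEach((header, index) => {
    setNestedValue(obj, header.trim(), values[index] ?? '');
  });
  return obj;
}

const headers = ['user.name', 'user.email', 'address.city', 'address.zip'];
const values = ['Alice', 'alice@example.com', 'New York', '10001'];
console.log(JSON.stringify(rowToNestedObject(headers, values), null, 2));
```

In this sketch the zip value stays a string because no type inference is applied. Passing each value through a type inference function first would produce the number shown in the JSON example.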

Large File Considerations

For large CSV files (millions of rows), keep these tips in mind:

  • Stream processing: Don’t load the entire file into memory; process it line by line
  • Batch writes: Write JSON output in chunks rather than building one giant array
  • Memory: A 100 MB CSV file can expand to 200+ MB as JSON due to repeated keys
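
// Streaming approach (Node.js)
const readline = require('readline');
const fs = require('fs');
const { once } = require('events');

async function convertLargeCSV(inputPath, outputPath) {
  const stream = fs.createReadStream(inputPath);
  // crlfDelay: Infinity treats \r\n as a single line break
  const rl = readline.createInterface({ input: stream, crlfDelay: Infinity });
  const output = fs.createWriteStream(outputPath);

  output.write('[\n');
  let headers = null;
  let first = true;

  for await (const line of rl) {
    if (line === '') continue; // Skip blank lines
    if (!headers) {
      // Naive split: use a quote-aware parser if fields may contain commas
      headers = line.split(',');
      continue;
    }
    const values = line.split(',');
    const obj = {};
    headers.forEach((h, i) => { obj[h] = values[i] ?? ''; });

    if (!first) output.write(',\n');
    // If write() reports a full buffer, wait for it to drain before reading on
    if (!output.write(JSON.stringify(obj))) await once(output, 'drain');
    first = false;
  }
  output.write('\n]');
  // Close the file and wait until everything has been flushed
  output.end();
  await once(output, 'finish');
}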
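
The batch-writes tip can be sketched separately from file handling. The generator below takes any iterable of row objects and yields pieces of JSON text that together form one valid array, so the caller never holds the full output in memory. The name jsonArrayChunks and the default batch size are assumptions for illustration.

```javascript
// Hypothetical helper: turns an iterable of row objects into chunks of JSON
// text that, concatenated in order, form one valid JSON array.
function* jsonArrayChunks(rows, batchSize = 1000) {
  let batch = [];
  let wroteAny = false;

  yield '[';
  for (const row of rows) {
    batch.push(JSON.stringify(row));
    if (batch.length === batchSize) {
      // Prefix a comma when earlier batches have already been emitted
      yield (wroteAny ? ',' : '') + batch.join(',');
      wroteAny = true;
      batch = [];
    }
  }
  // Emit whatever is left in the final, possibly partial, batch
  if (batch.length > 0) {
    yield (wroteAny ? ',' : '') + batch.join(',');
  }
  yield ']';
}

const rows = [{ id: 1 }, { id: 2 }, { id: 3 }];
const chunks = [...jsonArrayChunks(rows, 2)];
console.log(chunks);
console.log(JSON.parse(chunks.join('')));
```

In a streaming converter, each yielded chunk would be handed to the output stream's write() call instead of being collected into an array.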

Using the CodeKit Tools

For quick conversions without writing code, use these CodeKit tools:

  • CSV to JSON: Paste your CSV data and get instant JSON output with configurable delimiters and type inference
  • JSON Formatter: Format and validate your JSON output for readability

Both tools run entirely in your browser, so no data is sent to any server.

Conclusion

Converting CSV to JSON is a common task that seems simple but has many edge cases. Quoted fields, different delimiters, type inference, and nested structures all require careful handling. Whether you write your own converter or use a tool, understanding these nuances will help you avoid data corruption and build more robust data pipelines.

Ready to convert some data? Try the CSV to JSON converter and JSON formatter on CodeKit for instant, browser-based conversions.