How AWS Lambda and SQS Actually Work

Most misconfiguration in Lambda and SQS comes from skipping the mental model. Visibility timeouts, partial batch responses, idempotency, reserved concurrency. Getting these wrong is easy and the failures are subtle. Getting them right before you ship is not complicated once you understand what the queue and the function are each responsible for.

March 2, 2026 · 12 min read

AWS Lambda and SQS are two of the most-used services in the AWS ecosystem. Together they form one of the cleanest patterns for async background processing: something happens, a message goes into a queue, a function processes it, done. No servers to manage, scales from zero.

But the mental model matters before the code does. If you do not know what a visibility timeout is or why Lambda polls instead of receives, you will misconfigure this in a way that is hard to debug later. This post covers the fundamentals: the terms, the mechanics, local development with LocalStack, and what to watch for before you go live.


What is SQS?

SQS stands for Simple Queue Service. It is a fully managed message queue: you send messages into it, consumers read messages out of it, and SQS handles the durability and delivery guarantees in between.

The core use case is decoupling. Instead of service A calling service B directly and blocking until B responds, A sends a message to a queue and returns immediately. B processes it whenever it is ready. If B is temporarily down, the message sits in the queue. When B comes back up, it processes it. Neither service needs to know about the other's availability.

What is a message?

A message is a unit of data in SQS. It has a body (a string, usually JSON) and optional attributes (key-value metadata). The maximum message size is 256 KB. If your payload is larger, the pattern is to store the data in S3 and put the S3 key in the message body.
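As a sketch, sending one with the AWS SDK v3 for JavaScript looks like this (the queue URL, payload shape, and attribute are examples):

import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';

const client = new SQSClient({ region: 'us-east-1' });

// QueueUrl is assumed to come from the environment.
await client.send(
  new SendMessageCommand({
    QueueUrl: process.env.QUEUE_URL,
    MessageBody: JSON.stringify({ orderId: 'order-001', userId: 'user-123' }),
    MessageAttributes: {
      source: { DataType: 'String', StringValue: 'checkout-service' },
    },
  })
);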

What is a queue?

A queue is a named buffer that holds messages until they are consumed and deleted. Two types exist:

Standard queues deliver messages at least once, in best-effort order. Duplicates are possible. Throughput is essentially unlimited. This is the right choice for most workloads.

FIFO queues guarantee exactly-once processing and strict ordering within a message group. Throughput is capped at 300 API calls per second, or 3,000 messages per second with batching (more with high-throughput mode enabled). Use this when order genuinely matters and cannot be enforced at the application layer, such as a payment state machine or a sequential workflow.
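For illustration, a FIFO send differs only in requiring a message group ID (and a deduplication ID, unless content-based deduplication is enabled on the queue). Reusing the client and command from the sketch above, with example names:

await client.send(
  new SendMessageCommand({
    QueueUrl: process.env.FIFO_QUEUE_URL, // FIFO queue names must end in .fifo
    MessageBody: JSON.stringify({ paymentId: 'pay-001', state: 'captured' }),
    MessageGroupId: 'pay-001', // ordering is guaranteed within this group
    MessageDeduplicationId: 'pay-001-captured', // or enable content-based dedup
  })
);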

What is the visibility timeout?

When a consumer reads a message from SQS, the queue does not delete it immediately. Instead, it makes the message invisible to all other consumers for a period of time called the visibility timeout. The default is 30 seconds.

If the consumer finishes processing and explicitly deletes the message, it is gone. If the consumer crashes or takes longer than the visibility timeout, the message becomes visible again and another consumer can pick it up. This is what gives you retry behavior without any extra configuration.

The implication: SQS message processing is inherently at-least-once. The same message may be delivered more than once. Your processing logic should be idempotent if that matters for your use case.
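To make the mechanics concrete, here is a sketch of what a raw consumer does (processMessage is a hypothetical stand-in for your logic; with Lambda, AWS runs this receive/delete cycle for you):

import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} from '@aws-sdk/client-sqs';

const client = new SQSClient({ region: 'us-east-1' });
const queueUrl = process.env.QUEUE_URL;

// Long-poll for up to 20 seconds. Anything received stays invisible to
// other consumers for VisibilityTimeout seconds.
const { Messages = [] } = await client.send(
  new ReceiveMessageCommand({
    QueueUrl: queueUrl,
    MaxNumberOfMessages: 10,
    WaitTimeSeconds: 20,
    VisibilityTimeout: 60,
  })
);

for (const message of Messages) {
  await processMessage(JSON.parse(message.Body));
  // Only this explicit delete removes the message. If processing throws
  // first, the message reappears after the visibility timeout expires.
  await client.send(
    new DeleteMessageCommand({
      QueueUrl: queueUrl,
      ReceiptHandle: message.ReceiptHandle,
    })
  );
}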

What is a Dead-Letter Queue (DLQ)?

A DLQ is a separate queue where messages go after they have failed to process more than a configured number of times. You set this with a redrive policy on the source queue, whose maxReceiveCount controls the threshold. When a message has been received and returned to the queue more times than maxReceiveCount, SQS automatically routes it to the DLQ.

Without a DLQ, a poison pill message (one that always causes an error due to a malformed payload or a bug in your handler) will cycle through the queue indefinitely. A DLQ isolates it so the rest of your queue can keep moving, and gives you a place to inspect and replay failures.
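Wiring a DLQ is one queue attribute on the source queue. A sketch with the CLI (the ARNs and the count are examples):

# Attach an existing DLQ to the source queue via a redrive policy.
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders \
  --attributes '{
    "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:orders-dlq\",\"maxReceiveCount\":\"5\"}"
  }'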

What is the message retention period?

Messages are not held forever. The default retention period is 4 days. The maximum is 14 days. If a message sits in the queue unprocessed for longer than the retention period, SQS deletes it silently. For most workloads this never matters. For workloads where a downstream outage could last days, it is worth setting the maximum.
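Setting it is a single attribute; for example, the maximum:

# 1209600 seconds = 14 days, the maximum retention SQS allows.
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders \
  --attributes MessageRetentionPeriod=1209600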


What is Lambda?

Lambda is a compute service that runs your code in response to events, without you managing any servers. You upload a function, configure a trigger, and AWS runs it on demand. You pay per invocation and per duration, billed in 1 ms increments.

What is a function?

A Lambda function is a piece of code that AWS knows how to invoke. It has a handler, which is the entry point AWS calls, a runtime (Node.js, Python, Go, Java, etc.), memory allocation (128 MB to 10 GB, which also determines CPU), and a timeout (up to 15 minutes).

// A minimal Lambda handler in Node.js
export const handler = async (event) => {
  console.log('Received event:', JSON.stringify(event));
  return { statusCode: 200, body: 'ok' };
};

What is a cold start?

Lambda functions do not run constantly. AWS spins up a container to execute your function, runs it, and keeps that container warm for a period in case another invocation arrives. If no invocation arrives, the container is torn down.

The first invocation that requires a new container has a cold start: the time to initialize the runtime and your code before the handler runs. For Node.js and Python, this is typically 100 to 500ms. For JVM runtimes, it can be seconds.

For Lambda + SQS patterns, cold starts rarely matter because the workload is async and latency-tolerant. They matter more for API-facing Lambdas where a user is waiting.

What is an execution environment?

Each Lambda container can handle one concurrent invocation at a time. If ten messages arrive simultaneously, AWS spins up ten containers. This is automatic. The implication is that Lambda is not a shared runtime: there is no shared in-memory state between invocations. If you store something in a module-level variable, it may persist within the same container for the lifetime of that container, but you cannot rely on it across invocations.
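A common consequence: initialize expensive clients at module level so warm invocations reuse them, but treat that state as a cache, never as a source of truth. A sketch:

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

// Runs once per container, during the cold start. Warm invocations that
// land on the same container reuse this client and this counter.
const dynamo = new DynamoDBClient({});
let invocationsOnThisContainer = 0;

export const handler = async (event) => {
  invocationsOnThisContainer += 1; // resets whenever AWS recycles the container
  console.log('Invocations on this container:', invocationsOnThisContainer);
  // ... use dynamo here
};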

What is reserved concurrency?

By default, all your Lambda functions share a pool of concurrent executions from your AWS account limit (1,000 by default in most regions; this is a raisable quota). Reserved concurrency lets you allocate a fixed number of concurrent executions to a specific function, both as a guarantee that it will have capacity and as a ceiling so it cannot consume the entire pool.

For Lambda + SQS, setting reserved concurrency is how you control how fast your queue gets consumed. If your downstream (a database, a third-party API) can handle 20 concurrent requests, set reserved concurrency to 20.
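One CLI call sets it (the function name is an example):

aws lambda put-function-concurrency \
  --function-name process-orders \
  --reserved-concurrent-executions 20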


How do Lambda and SQS connect?

Lambda does not listen to SQS. Lambda polls SQS. AWS manages this polling through a component called the event source mapping. When you create an event source mapping between an SQS queue and a Lambda function, AWS runs internal pollers that continuously long-poll your queue. When messages arrive, the poller fetches a batch and invokes your Lambda function with that batch as the event payload.
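Creating the mapping with the CLI might look like this (function and queue names are examples):

aws lambda create-event-source-mapping \
  --function-name process-orders \
  --event-source-arn arn:aws:sqs:us-east-1:123456789012:orders \
  --batch-size 10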

The event payload for an SQS trigger looks like this:

{
  "Records": [
    {
      "messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
      "receiptHandle": "AQEBwJnKyrHigUMZj6reyasLmvuBTDHMxU8",
      "body": "{\"orderId\": \"123\", \"userId\": \"456\"}",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1545082650636"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:my-queue",
      "awsRegion": "us-east-1"
    }
  ]
}

Each record in Records is one SQS message. Your handler receives a batch of these and is responsible for processing all of them.

What is batch size?

Batch size controls how many messages the poller fetches and passes to a single Lambda invocation. The default is 10. The maximum for standard queues is 10,000, though batch sizes above 10 require a maximum batching window to be configured. Larger batches mean higher throughput but also larger blast radius if a failure causes the whole batch to retry.
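As a sketch, raising the batch size on an existing mapping (the UUID is a placeholder):

# Fetch up to 100 messages or wait at most 5 seconds, whichever comes first.
aws lambda update-event-source-mapping \
  --uuid a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 \
  --batch-size 100 \
  --maximum-batching-window-in-seconds 5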

What is partial batch response?

If your Lambda function throws an exception, the entire batch is considered failed and all messages return to the queue. This is often wrong: if 9 of 10 messages processed successfully and 1 failed, you want to retry only that 1.

Partial batch responses solve this. Instead of throwing, your handler returns a batchItemFailures list containing the messageId values of the messages that failed. Lambda deletes the successful ones from the queue and only the failures become visible again for retry.

export const handler = async (event) => {
  const failures = [];

  for (const record of event.Records) {
    try {
      await processMessage(JSON.parse(record.body));
    } catch (err) {
      console.error('Failed to process message', record.messageId, err);
      failures.push({ itemIdentifier: record.messageId });
    }
  }

  return { batchItemFailures: failures };
};

Enable this with functionResponseTypes: ['ReportBatchItemFailures'] on the event source mapping. Treat it as a default for any non-trivial batch processing.
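On an existing mapping, the CLI equivalent is (the UUID is a placeholder):

aws lambda update-event-source-mapping \
  --uuid a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 \
  --function-response-types ReportBatchItemFailures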


How do you develop and test this locally?

Running Lambda + SQS locally without hitting real AWS requires an emulator. LocalStack is the standard tool for this. It runs a local version of SQS, Lambda, and most other AWS services inside a Docker container.

How do you run LocalStack with Docker?

Pull and start the LocalStack container. No CLI wrapper needed.

docker run \
  --rm \
  --name localstack \
  -p 4566:4566 \
  -e SERVICES=sqs \
  -e DEFAULT_REGION=us-east-1 \
  -e DEBUG=0 \
  localstack/localstack:latest

-p 4566:4566 maps LocalStack's gateway port to your host. All SQS operations go through http://localhost:4566. -e SERVICES=sqs scopes LocalStack to only start the SQS service, which is faster than booting the full suite.

If you want LocalStack to run in the background:

docker run \
  --rm \
  --detach \
  --name localstack \
  -p 4566:4566 \
  -e SERVICES=sqs \
  -e DEFAULT_REGION=us-east-1 \
  localstack/localstack:latest

# Tail logs to confirm it is ready
docker logs -f localstack
# Ready when you see: "Ready."

Stop it when done:

docker stop localstack

How do you wire your SDK to LocalStack?

LocalStack does not validate credentials. Use any non-empty string. The only meaningful change from a real AWS client is the endpoint override.

import { SQSClient } from '@aws-sdk/client-sqs';

const isLocal = process.env.NODE_ENV === 'development';

const client = new SQSClient({
  region: 'us-east-1',
  ...(isLocal && {
    endpoint: 'http://localhost:4566',
    credentials: {
      accessKeyId: 'test',
      secretAccessKey: 'test',
    },
  }),
});

Keep the local endpoint conditional, controlled by NODE_ENV or an equivalent environment variable, so the same client config works in both environments without changes.

How do you create a queue and send a test message?

Use the AWS CLI pointed at LocalStack. The --endpoint-url flag redirects all API calls to your local container.

# Create a standard queue
aws sqs create-queue \
  --queue-name orders-local \
  --region us-east-1 \
  --endpoint-url http://localhost:4566

# Output
{
  "QueueUrl": "http://localhost:4566/000000000000/orders-local"
}

# Send a test message
aws sqs send-message \
  --queue-url http://localhost:4566/000000000000/orders-local \
  --message-body '{"orderId": "order-001", "userId": "user-123"}' \
  --region us-east-1 \
  --endpoint-url http://localhost:4566

# Verify the message is in the queue
aws sqs get-queue-attributes \
  --queue-url http://localhost:4566/000000000000/orders-local \
  --attribute-names ApproximateNumberOfMessages \
  --region us-east-1 \
  --endpoint-url http://localhost:4566

How do you test your Lambda handler locally?

For unit-level testing, invoke the handler directly with a mock SQS event. You do not need Lambda runtime emulation to validate handler logic.

// scripts/run-local.js
import { handler } from '../src/handler.js';

const mockSqsEvent = {
  Records: [
    {
      messageId: 'a1b2c3d4-1234-5678-abcd-ef0123456789',
      receiptHandle: 'mock-receipt-handle',
      body: JSON.stringify({ orderId: 'order-001', userId: 'user-123' }),
      attributes: {
        ApproximateReceiveCount: '1',
        SentTimestamp: String(Date.now()),
      },
      messageAttributes: {},
      md5OfBody: 'mock-md5',
      eventSource: 'aws:sqs',
      eventSourceARN: 'arn:aws:sqs:us-east-1:000000000000:orders-local',
      awsRegion: 'us-east-1',
    },
  ],
};

const result = await handler(mockSqsEvent);
console.log('Handler result:', JSON.stringify(result, null, 2));

Run it with:

NODE_ENV=development node scripts/run-local.js

This tests the full handler logic, including partial batch response behavior. Any AWS calls the handler makes are routed to your LocalStack instance by the NODE_ENV guard. No Lambda container required. The handler code does not know whether it is being invoked by AWS or by your script.

How do you wire LocalStack into a Docker Compose project?

If your project already uses Compose for other services (a database, a cache), add LocalStack as a service and configure your application to use it.

# docker-compose.yml
services:
  localstack:
    image: localstack/localstack:latest
    container_name: localstack
    ports:
      - '4566:4566'
    environment:
      SERVICES: sqs
      DEFAULT_REGION: us-east-1
      DEBUG: 0
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost:4566/_localstack/health']
      interval: 5s
      timeout: 3s
      retries: 10
    volumes:
      - ./infra/localstack:/etc/localstack/init/ready.d

  app:
    build: .
    environment:
      NODE_ENV: development
      AWS_ENDPOINT_URL: http://localstack:4566
      AWS_REGION: us-east-1
      AWS_ACCESS_KEY_ID: test
      AWS_SECRET_ACCESS_KEY: test
      QUEUE_URL: http://localstack:4566/000000000000/orders-local
    depends_on:
      localstack:
        condition: service_healthy

The healthcheck ensures your application container does not start before LocalStack is ready to accept requests. The condition: service_healthy on depends_on enforces this.

Put any queue provisioning in ./infra/localstack/init.sh. LocalStack runs scripts in that directory on startup:

#!/bin/bash
# infra/localstack/init.sh

awslocal sqs create-queue \
  --queue-name orders-local \
  --region us-east-1

awslocal sqs create-queue \
  --queue-name orders-local-dlq \
  --region us-east-1

echo "Queues provisioned."

awslocal is the LocalStack-aware wrapper for the AWS CLI, pre-installed in the LocalStack image. Inside the container it does not need --endpoint-url.
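The script above creates the DLQ but does not attach it to the source queue. A sketch of wiring the redrive policy in the same init script (000000000000 is LocalStack's default account ID; the maxReceiveCount of 3 is an example):

awslocal sqs set-queue-attributes \
  --queue-url http://localhost:4566/000000000000/orders-local \
  --attributes '{
    "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:000000000000:orders-local-dlq\",\"maxReceiveCount\":\"3\"}"
  }'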


What configuration actually matters before you ship?

Visibility timeout relative to Lambda timeout

Set your SQS visibility timeout to at least 6 times your Lambda function timeout. If Lambda can run for up to 30 seconds, set the visibility timeout to 180 seconds minimum. If the visibility timeout is shorter than the Lambda timeout, the message can become visible again while your function is still processing it. Another invocation picks it up. Now you are processing the same message twice concurrently.

This is the most common production incident with this pattern.
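Fixing it is one attribute on the queue; for a 30-second function timeout (queue URL is an example):

# 6 x the 30-second Lambda timeout = 180 seconds.
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders \
  --attributes VisibilityTimeout=180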

Dead-letter queue configuration

Set up a DLQ before you go live. Configure maxReceiveCount on the source queue (3 to 5 for transient failures, lower for workloads where a failure likely means a bad payload). Monitor DLQ depth as an operational alarm. A non-zero DLQ depth means something is broken.
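A minimal alarm sketch with the CLI (queue name, SNS topic, and thresholds are examples):

# Alert whenever anything at all lands in the DLQ.
aws cloudwatch put-metric-alarm \
  --alarm-name orders-dlq-not-empty \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=orders-dlq \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 0 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:oncall-alerts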

Partial batch responses

Enable ReportBatchItemFailures on your event source mapping and return the batchItemFailures array from your handler. Without this, a single message failure in a batch of 50 causes all 50 to retry.

Reserved concurrency

If your downstream has a limit (a database with max connections, a third-party API with rate limits), set reserved concurrency on your Lambda function. Without it, a large queue backlog will trigger as many concurrent invocations as your account allows, which may be more than your downstream can absorb.

Idempotent processing

Lambda + SQS is at-least-once delivery. The same message can be delivered more than once, especially when visibility timeouts expire or network issues cause redelivery. Your handler should be safe to run multiple times with the same message. The simplest approach is to check a unique ID against a processed-IDs store before doing any work. The cleanest approach is to make the operation itself idempotent by design: upserting instead of inserting, using conditional updates, using request deduplication on downstream APIs.


What does a complete working example look like?

A handler that processes order events from an SQS queue, with partial batch responses and structured error handling:

import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';

const dynamo = new DynamoDBClient({});

export const handler = async (event) => {
  const failures = [];

  for (const record of event.Records) {
    const { messageId } = record;

    try {
      const body = JSON.parse(record.body);
      const { orderId, userId } = body;

      if (!orderId || !userId) {
        // Non-retryable: bad payload. Retrying will not fix it, but
        // reporting it as a failure lets SQS exhaust maxReceiveCount
        // and route it to the DLQ for inspection.
        console.error('Invalid message payload', { messageId, body });
        failures.push({ itemIdentifier: messageId });
        continue;
      }

      await dynamo.send(
        new PutItemCommand({
          TableName: process.env.ORDERS_TABLE,
          Item: {
            orderId: { S: orderId },
            userId: { S: userId },
            processedAt: { S: new Date().toISOString() },
          },
          ConditionExpression: 'attribute_not_exists(orderId)', // idempotency
        })
      );

      console.log('Processed order', { messageId, orderId });
    } catch (err) {
      if (err.name === 'ConditionalCheckFailedException') {
        // Already processed. Safe to acknowledge.
        console.log('Duplicate message, already processed', { messageId });
        continue;
      }

      // Retryable failure: transient error, network issue, etc.
      console.error('Failed to process message', { messageId, err: err.message });
      failures.push({ itemIdentifier: messageId });
    }
  }

  return { batchItemFailures: failures };
};

A few things worth noting in this handler:

The bad payload case is reported as a failure even though a malformed message will not get better on retry. Letting the retry cycle play out until maxReceiveCount is hit is intentional: that is what routes the message to the DLQ, where you can inspect and replay it. Silently acknowledging it instead would delete it with nothing but a log line left behind.

The ConditionalCheckFailedException case means the order was already written, so the message is a duplicate. Acknowledging it (by not adding to failures) is the right call.

Everything else that throws is a retryable failure. Those go into failures so SQS re-queues only those specific messages.


What is the right mental model for this pattern?

Lambda + SQS is a producer-consumer pattern with managed infrastructure on both ends. The producer sends messages when something happens. The consumer (Lambda) processes them asynchronously. The queue sits in the middle and absorbs rate differences between the two.

The queue is not a stream. Messages are not ordered (on standard queues). A message is not a log entry you can replay indefinitely. It is a unit of work that gets deleted once processed. If you need an immutable ordered log of events, look at Kinesis or Kafka. If you need background processing that can scale independently of your main application without managing infrastructure, Lambda + SQS is the right tool.

The failure modes are real but manageable: duplicate delivery (handle with idempotency), poison pills (handle with DLQ), concurrency overload (handle with reserved concurrency), and visibility timeout mismatches (handle with correct configuration). Get those four right before you ship and the pattern is solid.
