Reducing Story Generation from 20 Minutes to 2 with Temporal.io

How durable workflow orchestration and parallel processing cut AI story generation latency by 10x — a practical guide for Python developers.

Mokshit Jain · January 10, 2026 · 9 min read

When building an AI-powered personalized storybook platform, I faced a critical performance challenge: generating a single story took 20 minutes. After implementing Temporal.io for workflow orchestration with parallel processing, I reduced this to just 2 minutes—a 10x improvement. Here’s how I did it.

The Problem: Sequential Processing Bottleneck

The initial story generation pipeline looked like this:

async def generate_story(prompt: str) -> bytes:
    # Step 1: Generate story text (5 minutes)
    story_text = await openai_client.generate(prompt)

    # Step 2: Generate chapter images (5 min x 5 chapters = 25 minutes)
    images = []
    for chapter in story_text.chapters:
        image = await image_client.generate(chapter.description)
        images.append(image)

    # Step 3: Generate narration (5 minutes)
    narration = await elevenlabs_client.generate(story_text)

    # Step 4: Render final PDF (5 minutes)
    pdf = await render_pdf(story_text, images, narration)

    return pdf

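Total time from the figures above: ~40 minutes with 5 chapters (5 + 25 + 5 + 5). These per-step figures are rough; the end-to-end time I observed was about 20 minutes per story.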

This sequential approach was a bottleneck. Each step waited for the previous one to complete, even though many operations could run in parallel.
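
To make the bottleneck concrete, here is a minimal sketch in plain asyncio, with no Temporal involved. The name `fake_image_call` and the 0.05-second delay are stand-ins I made up for the real image API. The point is only that awaiting calls one by one costs the sum of their latencies, while `asyncio.gather` costs roughly as much as the slowest call.

```python
import asyncio
import time


async def fake_image_call(chapter: int, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a slow image API call."""
    await asyncio.sleep(delay)
    return f"image-{chapter}"


async def run_sequential(chapters: list[int]) -> tuple[list[str], float]:
    """Await each call in turn, as the original pipeline does."""
    start = time.perf_counter()
    images = [await fake_image_call(ch) for ch in chapters]
    return images, time.perf_counter() - start


async def run_concurrent(chapters: list[int]) -> tuple[list[str], float]:
    """Start every call up front and wait for all of them together."""
    start = time.perf_counter()
    images = await asyncio.gather(*(fake_image_call(ch) for ch in chapters))
    return list(images), time.perf_counter() - start


async def compare() -> tuple[float, float]:
    chapters = [1, 2, 3, 4, 5]
    seq_images, seq_time = await run_sequential(chapters)
    con_images, con_time = await run_concurrent(chapters)
    assert seq_images == con_images  # same results, same order
    return seq_time, con_time


if __name__ == "__main__":
    seq_time, con_time = asyncio.run(compare())
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same images in the same order; only the waiting pattern differs.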

Enter Temporal.io

Temporal.io is a durable execution platform that makes it easy to:

Orchestrate complex workflows
Handle failures and retries automatically
Run activities in parallel
Maintain workflow state across failures

Solution: Parallelized Workflow

Here’s how I restructured the workflow with Temporal:

Define Activities

from datetime import timedelta
from temporalio import activity, workflow

@activity.defn
async def generate_story_text(prompt: str) -> dict:
    """Generate the story text with chapters"""
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return parse_chapters(response.choices[0].message.content)

@activity.defn
async def generate_chapter_image(chapter: dict) -> str:
    """Generate image for a single chapter"""
    response = await openai_client.images.generate(
        model="dall-e-3",
        prompt=chapter["image_prompt"]
    )
    return response.data[0].url

@activity.defn
async def generate_narration(text: str) -> bytes:
    """Generate audio narration"""
    response = await elevenlabs_client.generate(text=text)
    return response.content

@activity.defn
async def render_story_pdf(story: dict, images: list, audio: bytes) -> bytes:
    """Render final PDF"""
    return await pdf_generator.render(story, images, audio)

Define the Workflow

import asyncio
from datetime import timedelta

from temporalio import workflow
from temporalio.workflow import execute_activity

@workflow.defn
class StoryGenerationWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> bytes:
        # Step 1: Generate story text
        story_data = await execute_activity(
            generate_story_text,
            prompt,
            start_to_close_timeout=timedelta(minutes=10)
        )

        # Step 2: Generate all chapter images IN PARALLEL
        image_tasks = [
            execute_activity(
                generate_chapter_image,
                chapter,
                start_to_close_timeout=timedelta(minutes=5)
            )
            for chapter in story_data["chapters"]
        ]
        images = await asyncio.gather(*image_tasks)

        # Step 3: Generate narration
        narration = await execute_activity(
            generate_narration,
            story_data["full_text"],
            start_to_close_timeout=timedelta(minutes=5)
        )

        # Step 4: Render final PDF
        # Multiple activity arguments must be passed via `args=[...]`
        pdf = await execute_activity(
            render_story_pdf,
            args=[story_data, images, narration],
            start_to_close_timeout=timedelta(minutes=5)
        )

        return pdf

Performance Breakdown

With the parallelized approach:

Story Text Generation:     2 minutes  (sequential)
├── Chapter Images:        2 minutes  (parallel, not 10!)
├── Narration:             1 minute   (can overlap with images)
└── PDF Rendering:         1 minute   (final step)

Total: ~6 minutes per story
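
The total above is the length of the critical path: stages run one after another, and a stage whose activities run in parallel costs only as much as its slowest member. The sketch below redoes that arithmetic with the per-step figures from the breakdown. The `Stage` alias and the function name are my own, and the durations are the rounded ones quoted above, not fresh measurements.

```python
# Each stage is a list of activity durations (minutes) that run in parallel;
# stages themselves run one after another.
Stage = list[float]


def critical_path_minutes(stages: list[Stage]) -> float:
    """Sum, over sequential stages, the slowest activity in each stage."""
    return sum(max(stage) for stage in stages)


CHAPTERS = 5

# Images generated one at a time: every image is its own sequential stage.
images_in_sequence = [[2.0]] + [[2.0]] * CHAPTERS + [[1.0], [1.0]]

# Images generated in parallel: one stage holding all five images.
images_in_parallel = [[2.0], [2.0] * CHAPTERS, [1.0], [1.0]]

print(critical_path_minutes(images_in_sequence))  # 14.0
print(critical_path_minutes(images_in_parallel))  # 6.0
```

Parallelism shortens only the stage it is applied to, so the remaining sequential steps set the floor.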

But we can do even better by also running narration alongside the chapter images, since both depend only on the story text:

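@workflow.defn
class OptimizedStoryGenerationWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> bytes:
        # The story text must exist before anything else can start
        story_data = await execute_activity(
            generate_story_text,
            prompt,
            start_to_close_timeout=timedelta(minutes=10)
        )

        # Fire off all remaining activities simultaneously
        # (same timeouts as in StoryGenerationWorkflow)
        results = await asyncio.gather(
            # All chapter images in parallel
            *[
                execute_activity(
                    generate_chapter_image,
                    ch,
                    start_to_close_timeout=timedelta(minutes=5)
                )
                for ch in story_data["chapters"]
            ],
            # Narration in parallel with images
            execute_activity(
                generate_narration,
                story_data["full_text"],
                start_to_close_timeout=timedelta(minutes=5)
            ),
            return_exceptions=True
        )

        # With return_exceptions=True, failures come back as values,
        # so surface the first one instead of passing it to the renderer
        for result in results:
            if isinstance(result, BaseException):
                raise result

        images = results[:-1]  # All except narration
        narration = results[-1]

        # Multiple activity arguments must be passed via `args=[...]`
        pdf = await execute_activity(
            render_story_pdf,
            args=[story_data, images, narration],
            start_to_close_timeout=timedelta(minutes=5)
        )
        return pdf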

Final result: ~2 minutes per story
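
One detail in the optimized workflow deserves a closer look. `asyncio.gather` returns results in the order the awaitables were passed, which is what makes the `results[:-1]` and `results[-1]` split safe. With `return_exceptions=True`, a failed call shows up as an exception object in that list rather than being raised. Here is a small plain-asyncio sketch of that behavior, with made-up `make_image` and `make_narration` helpers standing in for the activities:

```python
import asyncio


async def make_image(chapter: int) -> str:
    """Hypothetical stand-in for the image activity; chapter 3 fails."""
    # Later chapters finish first, to show that order is still preserved.
    await asyncio.sleep(0.01 * (5 - chapter))
    if chapter == 3:
        raise RuntimeError("image API failed for chapter 3")
    return f"image-{chapter}"


async def make_narration() -> bytes:
    """Hypothetical stand-in for the narration activity."""
    await asyncio.sleep(0.01)
    return b"audio"


async def gather_assets() -> tuple[list, object]:
    results = await asyncio.gather(
        *(make_image(ch) for ch in range(1, 5)),
        make_narration(),
        return_exceptions=True,
    )
    return results[:-1], results[-1]


def split_failures(items: list) -> tuple[list, list]:
    """Separate successful results from exception objects."""
    ok = [item for item in items if not isinstance(item, BaseException)]
    failed = [item for item in items if isinstance(item, BaseException)]
    return ok, failed


if __name__ == "__main__":
    images, narration = asyncio.run(gather_assets())
    ok, failed = split_failures(images)
    print(ok, failed, narration)
```

Whether to fail the whole story on one bad image or to render with a placeholder is a product decision; either way, the exception objects need to be checked before the results are used.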

Setting Up Temporal with Python

Installation

pip install temporalio

Worker Configuration

import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

async def main():
    # Connect to Temporal server
    client = await Client.connect("localhost:7233")

    # Run the worker
    worker = Worker(
        client,
        task_queue="story-generation-queue",
        workflows=[StoryGenerationWorkflow],
        activities=[
            generate_story_text,
            generate_chapter_image,
            generate_narration,
            render_story_pdf
        ]
    )

    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())

Starting a Workflow from FastAPI

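import uuid
from datetime import timedelta

from fastapi import FastAPI, Response
from temporalio.client import Client

app = FastAPI()

@app.post("/generate-story")
async def create_story(prompt: str):
    # For brevity this connects per request; a long-lived client can be reused
    client = await Client.connect("localhost:7233")

    # Blocks until the workflow finishes and returns the rendered PDF bytes
    pdf_bytes = await client.execute_workflow(
        StoryGenerationWorkflow.run,
        prompt,
        id=f"story-{uuid.uuid4()}",
        task_queue="story-generation-queue",
        execution_timeout=timedelta(minutes=15)
    )

    # The workflow returns bytes, not a URL, so send the PDF itself
    return Response(content=pdf_bytes, media_type="application/pdf")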

Key Benefits of Temporal

1. Automatic Retries

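Activities that raise an exception are retried automatically, by default with exponential backoff between attempts:

@activity.defn
async def generate_chapter_image(chapter: dict) -> str:
    # An exception raised here (e.g. an API failure) fails this attempt,
    # and Temporal schedules another one
    activity.heartbeat()  # Report liveness before the slow API call
    response = await image_api.generate(chapter)
    return response.url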
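
How long Temporal waits between attempts is governed by the activity's retry policy, whose main knobs are an initial interval, a backoff coefficient, a maximum interval, and a maximum number of attempts. The sketch below is plain Python, not the Temporal SDK: it only computes the wait schedule those four numbers imply, using example values I picked for illustration.

```python
from datetime import timedelta


def backoff_schedule(
    initial: timedelta,
    coefficient: float,
    maximum: timedelta,
    max_attempts: int,
) -> list[timedelta]:
    """Waits before retry 1, 2, ... for an exponential backoff policy.

    With max_attempts total attempts there are max_attempts - 1 waits.
    """
    waits = []
    for retry in range(max_attempts - 1):
        wait = initial * (coefficient ** retry)
        waits.append(min(wait, maximum))  # cap each wait at the maximum
    return waits


# Example values chosen for illustration only.
schedule = backoff_schedule(
    initial=timedelta(seconds=1),
    coefficient=2.0,
    maximum=timedelta(seconds=10),
    max_attempts=6,
)
print([w.total_seconds() for w in schedule])  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

In the Temporal Python SDK these settings live on `temporalio.common.RetryPolicy`, which is passed to `execute_activity` through its `retry_policy` argument.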

2. Durable Execution

Workflow state persists even if your server crashes:

# Temporal remembers where it left off
@workflow.defn
class ResumableWorkflow:
    @workflow.run
    async def run(self, prompt: str):
        # If this crashes at step 3, it resumes from there
        step1_result = await step1()
        step2_result = await step2()
        step3_result = await step3()  # Will resume from here
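
The idea behind resuming can be illustrated with a toy model: record each step's result in a history, and when the workflow function runs again, replay the recorded results instead of re-executing the steps. This is a deliberately simplified sketch in plain Python with hypothetical names (`DurableRunner`, `step`), not Temporal's actual implementation.

```python
from typing import Callable


class DurableRunner:
    """Toy event history: completed step results survive a 'crash'."""

    def __init__(self) -> None:
        self.history: dict[str, object] = {}
        self.executed: list[str] = []  # steps that actually ran, for inspection

    def step(self, name: str, fn: Callable[[], object]) -> object:
        if name in self.history:
            return self.history[name]  # replay: reuse the recorded result
        result = fn()  # may raise, simulating a crash
        self.history[name] = result
        self.executed.append(name)
        return result


def workflow(runner: DurableRunner, crash_at_step3: bool) -> str:
    a = runner.step("step1", lambda: "text")
    b = runner.step("step2", lambda: "images")

    def step3() -> str:
        if crash_at_step3:
            raise RuntimeError("worker crashed")
        return "pdf"

    c = runner.step("step3", step3)
    return f"{a}+{b}+{c}"


runner = DurableRunner()
try:
    workflow(runner, crash_at_step3=True)  # first run dies at step 3
except RuntimeError:
    pass
result = workflow(runner, crash_at_step3=False)  # rerun replays steps 1 and 2
print(result, runner.executed)
```

Each step body runs exactly once across both runs, which is the property that keeps expensive API calls from being repeated after a crash.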

3. Visibility Dashboard

Temporal provides a web UI to monitor workflow executions, see history, and debug issues.

Production Tips

Set appropriate timeouts: Each activity should have a realistic timeout
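Use heartbeats: For long-running activities, send heartbeats and set a heartbeat timeout so a stalled activity is detected quickly instead of waiting out the full start-to-close timeout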
Handle signals: Allow workflows to be cancelled or updated mid-execution
Monitor resources: Parallel activities can spike API usage—implement rate limiting
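
For the last tip, one simple way to cap how many API calls are in flight at once is a semaphore around the call. The sketch below uses plain asyncio with a hypothetical `call_image_api` stand-in and a limit of 2 chosen for illustration. In a real deployment the cap might instead be enforced in the worker configuration or by the API client itself.

```python
import asyncio


class InFlightTracker:
    """Counts concurrent calls so the cap can be observed."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    def enter(self) -> None:
        self.current += 1
        self.peak = max(self.peak, self.current)

    def exit(self) -> None:
        self.current -= 1


async def call_image_api(chapter: int, tracker: InFlightTracker) -> str:
    """Hypothetical stand-in for a rate-limited image API."""
    tracker.enter()
    try:
        await asyncio.sleep(0.01)
        return f"image-{chapter}"
    finally:
        tracker.exit()


async def generate_all(chapters: list[int], limit: int) -> tuple[list[str], int]:
    semaphore = asyncio.Semaphore(limit)
    tracker = InFlightTracker()

    async def limited(chapter: int) -> str:
        async with semaphore:  # at most `limit` calls run at once
            return await call_image_api(chapter, tracker)

    images = await asyncio.gather(*(limited(ch) for ch in chapters))
    return list(images), tracker.peak


if __name__ == "__main__":
    images, peak = asyncio.run(generate_all([1, 2, 3, 4, 5], limit=2))
    print(images, peak)
```

The trade-off is latency: with five chapters and a limit of 2, the images finish in three waves instead of one.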

Results

After implementing Temporal.io:

Story generation: 20 mins → 2 mins (10x faster)
Throughput: Capable of hundreds of stories per minute
Reliability: Automatic retries handle API failures gracefully
Scalability: Easy to scale workers horizontally

Conclusion

Temporal.io transformed our AI story generation from a slow, sequential process into a fast, parallel workflow. The key insight was identifying independent operations and running them concurrently. For any AI pipeline with multiple steps, Temporal is a game-changer.

Want to discuss workflow orchestration or share your own experiences? I’d love to connect!

Tags #Temporal #Python #Automation
