Reducing Story Generation from 20 mins to 2 mins with Temporal.io
- Mokshit Jain
- Backend Development, DevOps
- 10 Jan, 2026
When building an AI-powered personalized storybook platform, I faced a critical performance challenge: generating a single story took 20 minutes. After implementing Temporal.io for workflow orchestration with parallel processing, I reduced this to just 2 minutes—a 10x improvement. Here’s how I did it.
The Problem: Sequential Processing Bottleneck
The initial story generation pipeline looked like this:
```python
async def generate_story(prompt: str) -> bytes:
    # Step 1: Generate story text (5 minutes)
    story_text = await openai_client.generate(prompt)

    # Step 2: Generate chapter images (5 min x 5 chapters = 25 minutes)
    images = []
    for chapter in story_text.chapters:
        image = await image_client.generate(chapter.description)
        images.append(image)

    # Step 3: Generate narration (5 minutes)
    narration = await elevenlabs_client.generate(story_text)

    # Step 4: Render final PDF (5 minutes)
    pdf = await render_pdf(story_text, images, narration)
    return pdf
```
Total time: ~40 minutes (with 5 chapters)
This sequential approach was a bottleneck. Each step waited for the previous one to complete, even though many operations could run in parallel.
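The core issue is easy to demonstrate with plain asyncio, independent of any AI APIs: awaiting independent operations one by one costs the sum of their latencies, while gathering them costs roughly the maximum. A minimal sketch (the 0.1-second sleeps stand in for the real API calls):

```python
import asyncio
import time

async def fake_api_call(name: str, delay: float) -> str:
    """Stand-in for a slow external API call (image generation, etc.)."""
    await asyncio.sleep(delay)
    return name

async def sequential() -> float:
    """Await five independent calls one after another."""
    start = time.monotonic()
    for i in range(5):
        await fake_api_call(f"img{i}", 0.1)
    return time.monotonic() - start

async def parallel() -> float:
    """Run the same five calls concurrently with gather."""
    start = time.monotonic()
    await asyncio.gather(*(fake_api_call(f"img{i}", 0.1) for i in range(5)))
    return time.monotonic() - start

seq = asyncio.run(sequential())   # ~0.5s: sum of the delays
par = asyncio.run(parallel())     # ~0.1s: max of the delays
print(f"sequential: {seq:.2f}s, parallel: {par:.2f}s")
```

The same arithmetic applies to the real pipeline: five 5-minute image calls cost 25 minutes sequentially but only ~5 minutes concurrently.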
Enter Temporal.io
Temporal.io is a durable execution platform that makes it easy to:
- Orchestrate complex workflows
- Handle failures and retries automatically
- Run activities in parallel
- Maintain workflow state across failures
Solution: Parallelized Workflow
Here’s how I restructured the workflow with Temporal:
Define Activities
```python
from temporalio import activity

@activity.defn
async def generate_story_text(prompt: str) -> dict:
    """Generate the story text with chapters"""
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return parse_chapters(response.choices[0].message.content)

@activity.defn
async def generate_chapter_image(chapter: dict) -> str:
    """Generate image for a single chapter"""
    response = await openai_client.images.generate(
        model="dall-e-3",
        prompt=chapter["image_prompt"]
    )
    return response.data[0].url

@activity.defn
async def generate_narration(text: str) -> bytes:
    """Generate audio narration"""
    response = await elevenlabs_client.generate(text=text)
    return response.content

@activity.defn
async def render_story_pdf(story: dict, images: list, audio: bytes) -> bytes:
    """Render final PDF"""
    return await pdf_generator.render(story, images, audio)
```
Define the Workflow
```python
import asyncio
from datetime import timedelta

from temporalio import workflow

@workflow.defn
class StoryGenerationWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> bytes:
        # Step 1: Generate story text
        story_data = await workflow.execute_activity(
            generate_story_text,
            prompt,
            start_to_close_timeout=timedelta(minutes=10),
        )

        # Step 2: Generate all chapter images IN PARALLEL
        image_tasks = [
            workflow.execute_activity(
                generate_chapter_image,
                chapter,
                start_to_close_timeout=timedelta(minutes=5),
            )
            for chapter in story_data["chapters"]
        ]
        images = await asyncio.gather(*image_tasks)

        # Step 3: Generate narration
        narration = await workflow.execute_activity(
            generate_narration,
            story_data["full_text"],
            start_to_close_timeout=timedelta(minutes=5),
        )

        # Step 4: Render final PDF
        # (multi-argument activities take args=[...] in the Python SDK)
        pdf = await workflow.execute_activity(
            render_story_pdf,
            args=[story_data, images, narration],
            start_to_close_timeout=timedelta(minutes=5),
        )
        return pdf
```
Performance Breakdown
With the parallelized approach:
```
Story Text Generation: 2 minutes (sequential)
├── Chapter Images:    2 minutes (parallel, not 10!)
├── Narration:         1 minute (can overlap with images)
└── PDF Rendering:     1 minute (final step)

Total: ~6 minutes per story
```
But we can do even better by overlapping narration with image generation:
```python
@workflow.defn
class OptimizedStoryGenerationWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> bytes:
        # Step 1: Generate the story text (everything else depends on it)
        story_data = await workflow.execute_activity(
            generate_story_text,
            prompt,
            start_to_close_timeout=timedelta(minutes=10),
        )

        # Fire off all remaining activities simultaneously:
        # every chapter image AND the narration run concurrently
        *images, narration = await asyncio.gather(
            *[
                workflow.execute_activity(
                    generate_chapter_image,
                    ch,
                    start_to_close_timeout=timedelta(minutes=5),
                )
                for ch in story_data["chapters"]
            ],
            workflow.execute_activity(
                generate_narration,
                story_data["full_text"],
                start_to_close_timeout=timedelta(minutes=5),
            ),
        )

        # Render the PDF once everything is ready
        pdf = await workflow.execute_activity(
            render_story_pdf,
            args=[story_data, images, narration],
            start_to_close_timeout=timedelta(minutes=5),
        )
        return pdf
```
Final result: ~2 minutes per story
Setting Up Temporal with Python
Installation
```shell
pip install temporalio
```
Worker Configuration
```python
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

async def main():
    # Connect to the Temporal server
    client = await Client.connect("localhost:7233")

    # Run the worker
    worker = Worker(
        client,
        task_queue="story-generation-queue",
        workflows=[StoryGenerationWorkflow],
        activities=[
            generate_story_text,
            generate_chapter_image,
            generate_narration,
            render_story_pdf,
        ],
    )
    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())
```
Starting a Workflow from FastAPI
```python
import uuid
from datetime import timedelta

from fastapi import FastAPI
from temporalio.client import Client

app = FastAPI()

@app.post("/generate-story")
async def create_story(prompt: str):
    client = await Client.connect("localhost:7233")
    # execute_workflow blocks until the workflow completes
    pdf_bytes = await client.execute_workflow(
        StoryGenerationWorkflow.run,
        prompt,
        id=f"story-{uuid.uuid4()}",
        task_queue="story-generation-queue",
        execution_timeout=timedelta(minutes=15),
    )
    # The workflow returns the raw PDF; in production, upload it to
    # object storage and return a URL instead of the bytes themselves
    return {"status": "completed", "size_bytes": len(pdf_bytes)}
```
Key Benefits of Temporal
1. Automatic Retries
Activities retry automatically on failure:
```python
@activity.defn
async def generate_chapter_image(chapter: dict) -> str:
    # On failure, Temporal retries this activity automatically
    # according to its retry policy (exponential backoff by default)
    activity.heartbeat("generating")  # report liveness during long calls
    response = await image_api.generate(chapter)
    return response.url
```
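Conceptually, what Temporal does on each failure is an exponential-backoff retry loop. A stdlib-only sketch of that behavior (the `flaky` API and the intervals are illustrative, not Temporal's actual implementation):

```python
import asyncio

async def with_retries(fn, max_attempts=5, initial_interval=1.0, backoff=2.0):
    """Retry an async callable with exponential backoff -- roughly what
    Temporal's default retry policy does for a failing activity."""
    delay = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return await fn()
        except Exception:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the failure
            await asyncio.sleep(delay)
            delay *= backoff

# Usage with a fake API that fails twice, then succeeds
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient API error")
    return "image-url"

result = asyncio.run(with_retries(flaky, initial_interval=0.01))
print(result, calls["n"])  # image-url 3
```

With Temporal, none of this loop lives in your code: you attach a `RetryPolicy` to the activity and the server drives the retries, even across worker restarts.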
2. Durable Execution
Workflow state persists even if your server crashes:
```python
# Temporal remembers where it left off
@workflow.defn
class ResumableWorkflow:
    @workflow.run
    async def run(self, prompt: str):
        # Each step is an activity; completed results are replayed from
        # history, so a crash during step 3 resumes at step 3
        step1_result = await step1()
        step2_result = await step2()
        step3_result = await step3()  # resumes here after a crash
```
3. Visibility Dashboard
Temporal provides a web UI to monitor workflow executions, see history, and debug issues.
Production Tips
- Set appropriate timeouts: Each activity should have a realistic timeout
- Use heartbeats: For long-running activities, send heartbeats to avoid timeout
- Handle signals: Allow workflows to be cancelled or updated mid-execution
- Monitor resources: Parallel activities can spike API usage—implement rate limiting
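The last tip deserves a sketch: without a cap, a burst of parallel image activities can hammer the API. Temporal workers can bound this server-side via the `max_concurrent_activities` worker option; the same idea in plain asyncio uses a semaphore (the numbers and the fake 0.01-second "API call" are illustrative):

```python
import asyncio

async def generate_with_limit(chapters: int, max_concurrent: int = 3):
    """Run many generation tasks, but never more than max_concurrent
    at once. Tracks peak concurrency to show the cap is respected."""
    sem = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def one(ch: int) -> str:
        nonlocal active, peak
        async with sem:  # blocks while max_concurrent tasks hold the semaphore
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the image API call
            active -= 1
            return f"image-for-chapter-{ch}"

    results = await asyncio.gather(*(one(c) for c in range(chapters)))
    return results, peak

results, peak = asyncio.run(generate_with_limit(10, max_concurrent=3))
print(len(results), peak)
```

Ten tasks complete, but at most three are ever in flight, so the downstream API sees a steady trickle instead of a spike.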
Results
After implementing Temporal.io:
- Story generation: 20 mins → 2 mins (10x faster)
- Throughput: Capable of hundreds of stories per minute
- Reliability: Automatic retries handle API failures gracefully
- Scalability: Easy to scale workers horizontally
Conclusion
Temporal.io transformed our AI story generation from a slow, sequential process into a fast, parallel workflow. The key insight was identifying independent operations and running them concurrently. For any AI pipeline with multiple steps, Temporal is a game-changer.
Want to discuss workflow orchestration or share your own experiences? I’d love to connect!