Learn Agentic AI
Learn Agentic AI archive page 69 of 146

Learn Agentic AI — Build Voice & Chat Agents

Step-by-step tutorials on building voice and chat AI agents with the OpenAI Agents SDK, the Realtime API, function calling, multi-agent orchestration, and production deployment patterns.

9 of 1313 articles

14 min read · Mar 16, 2026

Building a Video Analysis Agent: Frame Extraction, Scene Detection, and Summarization

Learn how to build a video analysis agent in Python that extracts key frames, detects scene changes, performs temporal analysis, and generates structured summaries using vision language models.

13 min read · Mar 16, 2026

Screenshot Analysis Agent: Understanding UI Elements and Generating Descriptions

Build a screenshot analysis agent that detects UI elements, analyzes layouts, and generates accessibility descriptions. Learn techniques for button detection, form analysis, and hierarchical layout understanding.

13 min read · Mar 16, 2026

Building a Diagram Understanding Agent: Flowcharts, Architecture Diagrams, and Charts

Create an AI agent that classifies diagram types, extracts elements and relationships from flowcharts and architecture diagrams, and converts visual diagrams into structured data and code representations.

13 min read · Mar 16, 2026

Audio Analysis Agent: Music Classification, Speaker Identification, and Sound Events

Build an audio analysis agent in Python that classifies music genres, identifies speakers through diarization, and detects sound events. Covers audio feature extraction, classification models, and structured audio understanding.

12 min read · Mar 16, 2026

Building a Multi-Input Agent: Combining User Text with Uploaded Files for Rich Interactions

Build a multi-input AI agent that handles user text alongside uploaded files of any format. Learn file upload handling, automatic format detection, unified processing pipelines, and how to generate contextual responses from mixed inputs.

15 min read · Mar 16, 2026

Computer Use Agents: AI That Controls Browser and Desktop Applications

Learn how to build computer use agents that interact with browser and desktop applications by capturing screenshots, detecting UI elements, performing click and type actions, and verifying results through visual feedback loops.

14 min read · Mar 16, 2026

Generating Multimodal Outputs: AI Agents That Create Images, Audio, and Documents

Build AI agents that generate rich multimodal outputs including images with DALL-E, speech with TTS, PDF documents, and formatted reports. Learn how to orchestrate multiple generation APIs into cohesive, multi-format responses.

14 min read · Mar 16, 2026

Building an Email Automation Agent: Reading, Classifying, and Responding to Emails

Learn how to build an AI agent that connects to your inbox via IMAP and the Gmail API, classifies incoming messages by intent, and generates context-aware draft responses automatically.