
Building a Floor Plan Analysis Agent: Room Detection, Measurement, and Description

Build an AI agent that analyzes architectural floor plans to detect rooms, classify their types, estimate areas, identify furniture, and generate natural language descriptions for real estate and interior design applications.

Why Floor Plan Analysis Matters

Real estate listings, interior design platforms, and property management systems all need structured data from floor plans. A floor plan image contains room layouts, dimensions, furniture placement, and spatial relationships — but this information is trapped in pixels. An AI agent that can parse floor plans into structured data unlocks automated property descriptions, accurate square footage calculations, and intelligent room comparisons.

The challenge is that floor plans come in wildly different styles: architectural blueprints, hand-drawn sketches, 3D-rendered marketing plans, and everything in between. A robust agent must handle this variety.

The Analysis Pipeline

The agent processes floor plans through four stages: wall detection and room segmentation, room type classification, dimension estimation, and description generation.

Wall Detection and Room Segmentation

Walls are the structural elements that define rooms. Detect them using morphological operations:

import cv2
import numpy as np
from dataclasses import dataclass, field


@dataclass
class Room:
    room_id: int
    contour: np.ndarray
    bbox: tuple           # (x, y, w, h)
    area_pixels: float
    area_sqft: float = 0.0
    room_type: str = "unknown"
    center: tuple = (0, 0)
    furniture: list[str] = field(default_factory=list)


def detect_walls(image_path: str) -> np.ndarray:
    """Detect walls in a floor plan image."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Binarize: walls are typically dark lines
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)

    # Use morphological closing to connect broken wall segments
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    walls = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Thicken walls slightly for better room segmentation
    walls = cv2.dilate(walls, kernel, iterations=1)

    return walls


def segment_rooms(wall_mask: np.ndarray) -> list[Room]:
    """Segment the floor plan into individual rooms."""
    # Invert: rooms are the spaces between walls
    room_spaces = cv2.bitwise_not(wall_mask)

    # Remove small noise regions
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    room_spaces = cv2.morphologyEx(room_spaces, cv2.MORPH_OPEN, kernel)

    # Find connected components (each is a room)
    num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(
        room_spaces, connectivity=4
    )

    rooms = []
    for i in range(1, num_labels):  # Skip background (label 0)
        area = stats[i, cv2.CC_STAT_AREA]

        # Filter out very small regions (noise) and the exterior
        if area < 1000 or area > 0.5 * wall_mask.size:
            continue

        # Create contour for this room
        room_mask = (labels == i).astype(np.uint8) * 255
        contours, _ = cv2.findContours(
            room_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
        )

        if contours:
            rooms.append(Room(
                room_id=len(rooms),
                contour=contours[0],
                bbox=(
                    stats[i, cv2.CC_STAT_LEFT],
                    stats[i, cv2.CC_STAT_TOP],
                    stats[i, cv2.CC_STAT_WIDTH],
                    stats[i, cv2.CC_STAT_HEIGHT],
                ),
                area_pixels=area,
                center=(int(centroids[i][0]), int(centroids[i][1])),
            ))

    return rooms

Scale Detection and Area Estimation

Floor plans often include a scale bar or dimension annotations. Detect these to convert pixel areas to real-world measurements:


import pytesseract
from PIL import Image
import re


def detect_scale(image_path: str) -> float:
    """Detect the scale factor (pixels per foot) from annotations."""
    img = Image.open(image_path)
    text = pytesseract.image_to_string(img)

    # Look for dimension patterns like 12' or 3.5 m
    ft_pattern = r"(\d+)['′]"
    m_pattern = r"(\d+\.?\d*)\s*m"

    ft_matches = re.findall(ft_pattern, text)
    m_matches = re.findall(m_pattern, text)

    if ft_matches:
        # Find the dimension annotation and its pixel length.
        # This is a simplified version; production code would
        # locate the dimension line in the image.
        return estimate_scale_from_annotation(image_path, ft_matches[0])

    if m_matches:
        # Convert metres to feet (1 m ≈ 3.28 ft) before estimating
        return estimate_scale_from_annotation(
            image_path, str(float(m_matches[0]) * 3.28084)
        )

    return 10.0  # Fallback: assume 10 pixels per foot (rough estimate)


def estimate_scale_from_annotation(
    image_path: str,
    known_dimension: str,
) -> float:
    """Estimate pixels-per-foot from a known dimension annotation."""
    # In production, you would locate the dimension line endpoints
    # and compute: pixels_between_endpoints / dimension_in_feet
    known_ft = float(known_dimension)
    estimated_pixel_length = 100  # Placeholder
    return estimated_pixel_length / known_ft


def calculate_room_areas(
    rooms: list[Room],
    pixels_per_foot: float,
) -> list[Room]:
    """Convert pixel areas to square feet."""
    sqft_per_pixel = 1.0 / (pixels_per_foot ** 2)

    for room in rooms:
        room.area_sqft = round(room.area_pixels * sqft_per_pixel, 1)

    return rooms
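The dimension regexes used in `detect_scale` can be sanity-checked without an image or OCR. Note that the metric pattern will also match strings like "10 min", so production code should filter candidates by where they appear on the plan:

```python
import re

# Same patterns as detect_scale; plain strings stand in for OCR output
ft_pattern = r"(\d+)['′]"
m_pattern = r"(\d+\.?\d*)\s*m"

text = "KITCHEN 12' x 10'  BEDROOM 3.5 m x 4 m"
print(re.findall(ft_pattern, text))  # ['12', '10']
print(re.findall(m_pattern, text))   # ['3.5', '4']
```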

Room Type Classification

Classify rooms based on their size, shape, position, and any detected text labels or furniture:

from openai import OpenAI


def classify_rooms_with_context(
    rooms: list[Room],
    image_path: str,
) -> list[Room]:
    """Classify room types using spatial context and LLM reasoning."""
    # Extract any text labels near each room
    img = Image.open(image_path)
    full_text = pytesseract.image_to_string(img)

    room_descriptions = []
    for room in rooms:
        desc = (
            f"Room {room.room_id}: "
            f"area={room.area_sqft:.0f} sqft, "
            f"dimensions={room.bbox[2]}x{room.bbox[3]} pixels, "
            f"center=({room.center[0]}, {room.center[1]})"
        )
        room_descriptions.append(desc)

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a floor plan analyst. Given room measurements "
                "and positions, classify each room as one of: living_room, "
                "bedroom, kitchen, bathroom, dining_room, hallway, closet, "
                "garage, office, laundry, entrance. Consider typical room "
                "sizes: bathrooms are small (30-80 sqft), bedrooms are "
                "medium (100-200 sqft), living rooms are large (200+ sqft)."
            )},
            {"role": "user", "content": (
                f"Text found on floor plan: {full_text}\n\n"
                f"Rooms:\n" + "\n".join(room_descriptions)
            )},
        ],
    )

    classifications = response.choices[0].message.content or ""

    # Parse the LLM response line by line so each room gets its own label,
    # rather than matching any type found anywhere in the response
    room_types = [
        "living_room", "bedroom", "kitchen", "bathroom", "dining_room",
        "hallway", "closet", "garage", "office", "laundry", "entrance",
    ]
    for room in rooms:
        for line in classifications.splitlines():
            if f"Room {room.room_id}" not in line:
                continue
            for room_type in room_types:
                if room_type in line.lower():
                    room.room_type = room_type
                    break
            break

    return rooms
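The parsing step assumes the model answers one room per line (e.g. `Room 0: kitchen`). That line-oriented matching can be checked standalone, with plain dicts standing in for the Room dataclass:

```python
ROOM_TYPES = [
    "living_room", "bedroom", "kitchen", "bathroom", "dining_room",
    "hallway", "closet", "garage", "office", "laundry", "entrance",
]

def parse_classifications(text: str, room_ids: list[int]) -> dict[int, str]:
    """Map each room id to the first room type found on its line."""
    result = {rid: "unknown" for rid in room_ids}
    for line in text.splitlines():
        for rid in room_ids:
            if f"Room {rid}" in line:
                for room_type in ROOM_TYPES:
                    if room_type in line.lower():
                        result[rid] = room_type
                        break
    return result

sample = "Room 0: kitchen\nRoom 1: living_room\nRoom 2: bathroom"
print(parse_classifications(sample, [0, 1, 2]))
# {0: 'kitchen', 1: 'living_room', 2: 'bathroom'}
```

For more than ten rooms, the substring match `f"Room {rid}"` would need a word boundary (`Room 1` also matches `Room 12`), so a regex like `rf"Room {rid}\b"` is safer in production.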

Furniture and Fixture Detection

Detect common furniture symbols in the floor plan:

FURNITURE_TEMPLATES = {
    "toilet": {"min_area": 200, "max_area": 800, "aspect_range": (0.5, 1.5)},
    "bathtub": {"min_area": 800, "max_area": 3000, "aspect_range": (0.3, 0.6)},
    "sink": {"min_area": 100, "max_area": 500, "aspect_range": (0.7, 1.3)},
    "bed": {"min_area": 2000, "max_area": 8000, "aspect_range": (0.5, 0.8)},
    "table": {"min_area": 500, "max_area": 3000, "aspect_range": (0.6, 1.4)},
}


def detect_furniture(
    image: np.ndarray,
    rooms: list[Room],
) -> list[Room]:
    """Detect furniture symbols within each room."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if len(image.shape) == 3 else image
    _, binary = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

    for room in rooms:
        x, y, w, h = room.bbox
        room_region = binary[y:y+h, x:x+w]

        contours, _ = cv2.findContours(
            room_region, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
        )

        for contour in contours:
            area = cv2.contourArea(contour)
            if area < 100:
                continue

            bx, by, bw, bh = cv2.boundingRect(contour)
            aspect = bw / max(bh, 1)

            for name, props in FURNITURE_TEMPLATES.items():
                if (props["min_area"] <= area <= props["max_area"] and
                        props["aspect_range"][0] <= aspect <= props["aspect_range"][1]):
                    if name not in room.furniture:
                        room.furniture.append(name)

    return rooms
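Because the templates are plain area/aspect ranges, they can overlap. This standalone check (pure arithmetic, no OpenCV needed) shows a 600-pixel square blob matching both the toilet and table templates, which is why the code above collects all plausible matches rather than forcing a single label:

```python
FURNITURE_TEMPLATES = {
    "toilet": {"min_area": 200, "max_area": 800, "aspect_range": (0.5, 1.5)},
    "bathtub": {"min_area": 800, "max_area": 3000, "aspect_range": (0.3, 0.6)},
    "sink": {"min_area": 100, "max_area": 500, "aspect_range": (0.7, 1.3)},
    "bed": {"min_area": 2000, "max_area": 8000, "aspect_range": (0.5, 0.8)},
    "table": {"min_area": 500, "max_area": 3000, "aspect_range": (0.6, 1.4)},
}

def matching_templates(area: float, aspect: float) -> list[str]:
    """Return every template whose area and aspect ranges admit this blob."""
    return [
        name for name, p in FURNITURE_TEMPLATES.items()
        if p["min_area"] <= area <= p["max_area"]
        and p["aspect_range"][0] <= aspect <= p["aspect_range"][1]
    ]

print(matching_templates(600, 1.0))   # ['toilet', 'table'] — ambiguous
print(matching_templates(4000, 0.6))  # ['bed']
```

Disambiguating overlapping matches usually falls to room context: a 600 px² blob in a bathroom is far more likely a toilet than a table.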

Natural Language Description Generation

Generate listing-quality descriptions from the structured data:

def generate_property_description(rooms: list[Room]) -> str:
    """Generate a natural language property description."""
    total_sqft = sum(r.area_sqft for r in rooms)
    bedrooms = [r for r in rooms if r.room_type == "bedroom"]
    bathrooms = [r for r in rooms if r.room_type == "bathroom"]

    client = OpenAI()
    room_details = "\n".join(
        f"- {r.room_type.replace('_', ' ').title()}: "
        f"{r.area_sqft:.0f} sqft"
        f"{', with ' + ', '.join(r.furniture) if r.furniture else ''}"
        for r in rooms
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Write a professional real estate listing description "
                "based on these floor plan details. Be factual, "
                "highlight the layout flow, and mention room sizes."
            )},
            {"role": "user", "content": (
                f"Total area: {total_sqft:.0f} sqft\n"
                f"Bedrooms: {len(bedrooms)}\n"
                f"Bathrooms: {len(bathrooms)}\n\n"
                f"Room details:\n{room_details}"
            )},
        ],
    )

    return response.choices[0].message.content

FAQ

How accurate are the area measurements from floor plan analysis?

Pixel-based area estimation typically achieves 85-95% accuracy when a reliable scale reference is detected. The main error sources are perspective distortion in photographs of floor plans, inconsistent line weights, and scale bars that are not detected correctly. For critical measurements, always include a disclaimer that areas are estimates and should be verified by professional measurement.
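Because area scales with the square of the linear scale factor, a small scale error roughly doubles in the area estimate. A quick illustration, assuming a hypothetical true scale of 10 pixels per foot:

```python
def area_sqft(area_pixels: float, pixels_per_foot: float) -> float:
    """Convert a pixel area to square feet given a linear scale factor."""
    return area_pixels / pixels_per_foot ** 2

true_area = area_sqft(150_000, 10.0)          # 1500.0 sqft at the true scale
biased = area_sqft(150_000, 10.0 * 1.05)      # scale overestimated by 5%
error_pct = (true_area - biased) / true_area * 100
print(round(error_pct, 1))  # 9.3 — a 5% scale error became ~9% area error
```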

Can this work on hand-drawn floor plans?

Yes, but with reduced accuracy. Hand-drawn plans have inconsistent line weights, imprecise angles, and often lack scale references. The wall detection stage needs more aggressive morphological operations, and room classification relies more heavily on text labels (which may be handwritten and harder to OCR). Expect 70-80% accuracy on room detection for clean hand-drawn plans.

How do I handle multi-story buildings?

Process each floor plan image independently, then use the LLM to identify common elements (staircases, elevators) that connect floors. Generate a combined description that references the flow between levels. The key challenge is maintaining consistent room numbering across floors.
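One simple scheme for keeping room ids unique across floors is to renumber sequentially while tagging each room with its floor. This is a sketch with plain dicts standing in for the per-floor `segment_rooms` output:

```python
def renumber_across_floors(floors: list[list[dict]]) -> list[dict]:
    """Give every room a globally unique id and a floor tag."""
    combined = []
    for floor_idx, rooms in enumerate(floors, start=1):
        for room in rooms:
            combined.append({**room, "floor": floor_idx,
                             "room_id": len(combined)})
    return combined

floor1 = [{"room_type": "kitchen"}, {"room_type": "living_room"}]
floor2 = [{"room_type": "bedroom"}]
rooms = renumber_across_floors([floor1, floor2])
print([(r["room_id"], r["floor"], r["room_type"]) for r in rooms])
# [(0, 1, 'kitchen'), (1, 1, 'living_room'), (2, 2, 'bedroom')]
```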


#FloorPlanAI #RoomDetection #RealEstateAI #ComputerVision #ArchitectureAI #PropertyTech #AgenticAI #Python

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
