How Weave is Replacing Story Points with LLMs and AI

Jun 17, 2025

Software engineering teams have long relied on story points to estimate and track work. But as teams grow and projects become more complex, traditional metrics often fall short. This gap leads to missed deadlines, unclear productivity signals, and frustration for both engineers and managers. Weave is changing this by using LLMs and domain-specific machine learning to provide a clearer, more objective view of engineering team performance.

Why Traditional Story Points Fall Short

The Problem with Story Points

Story points were designed to help teams estimate effort and complexity. But they are subjective, often inconsistent across teams, and can be influenced by team dynamics or external pressure. This makes it hard to compare work across teams or track progress over time.

Story points are not standardized, so a “5” for one team might be a “2” for another.
Teams often inflate or deflate points to meet targets.
Story points don’t capture the quality or impact of work, only perceived effort.

Industry Frameworks and Metrics

To address these gaps, many organizations have adopted frameworks like DORA, SPACE, and CORE 4 metrics. These models focus on outcomes such as deployment frequency, lead time, and team satisfaction. While they offer a broader view, several of their inputs, such as satisfaction surveys, still rely on manual data entry and subjective reporting.

DORA metrics: Deployment frequency, lead time for changes, change failure rate, and time to restore service.
SPACE metrics: Satisfaction and well-being, Performance, Activity, Communication and collaboration, and Efficiency and flow.
CORE 4 metrics: Speed, Effectiveness, Quality, and Impact.
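As an illustration, the four DORA metrics listed above can be computed from a log of deployments. The sketch below is a minimal example, not any vendor's implementation: the `Deployment` record, its field names, and the `dora_metrics` helper are hypothetical, and it assumes each deployment has already been linked to its first commit and to any incident it caused.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # first commit of the change
    deployed_at: datetime                   # when the change reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: List[Deployment], window_days: int) -> dict:
    """Compute the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]

    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(lt.total_seconds() for lt in lead_times) / 3600,
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has a recorded restore time.
        "median_time_to_restore_hours": (
            median(rt.total_seconds() for rt in restore_times) / 3600
            if restore_times
            else None
        ),
    }


# Made-up sample data, for illustration only.
t0 = datetime(2025, 6, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=10)),
    Deployment(t0, t0 + timedelta(hours=12)),
    Deployment(t0, t0 + timedelta(hours=24)),
]
metrics = dora_metrics(sample, window_days=2)
```

The arithmetic itself is simple. In practice the effort goes into collecting reliable timestamps and linking incidents back to the deployments that caused them.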

How Weave Uses LLMs and AI for Engineering Analytics

Objective Measurement with LLMs

Weave analyzes every pull request (PR) and code review using a combination of LLMs and proprietary machine learning models. Instead of relying on subjective estimates, Weave’s models are trained on expert-labeled datasets to answer a key question: “How long would this PR take for an expert engineer?”

Each PR is evaluated for complexity, scope, and quality.
The system estimates the actual time and effort required, not just lines of code or number of commits.
Weave classifies work into categories like new features, bug fixes, and maintenance, giving teams a clear view of where their time goes.
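To make the "where their time goes" idea concrete, here is a minimal sketch of how per-PR estimates could be rolled up into a time-allocation breakdown. It does not reflect Weave's actual data model or API: the `PrEstimate` record, the category labels, and the `time_allocation` helper are all assumptions, and the hour values are made up rather than produced by a model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PrEstimate:
    """Hypothetical output of an estimator for one pull request."""
    pr_id: str
    category: str        # e.g. "feature", "bug_fix", "maintenance"
    expert_hours: float  # estimated time for an expert engineer


def time_allocation(estimates: List[PrEstimate]) -> Dict[str, float]:
    """Return each category's share of total estimated hours (0.0 to 1.0)."""
    totals: Dict[str, float] = defaultdict(float)
    for est in estimates:
        totals[est.category] += est.expert_hours

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: hours / grand_total for category, hours in totals.items()}


# Made-up estimates, for illustration only.
estimates = [
    PrEstimate("pr-1", "feature", 6.0),
    PrEstimate("pr-2", "bug_fix", 1.5),
    PrEstimate("pr-3", "feature", 2.0),
    PrEstimate("pr-4", "maintenance", 0.5),
]
allocation = time_allocation(estimates)
```

Weighting by estimated hours rather than PR count keeps one large feature from being treated the same as a one-line fix.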

Key Features of Weave’s Analytics Platform

Tracks real output over time, not just activity.
Summarizes data and insights in dashboards for easy review.
Measures both output and quality, providing a balanced view of team performance.
Monitors time spent on code review and the usefulness of those reviews.
Measures the quality of code reviews by assessing the depth and practicality of the comments.

Technical Deep Dive: How the Model Works

Weave’s custom machine learning model is trained on a large, expert-labeled dataset of PRs. The model considers factors such as:

Code complexity and dependencies
Size and scope of changes
Review comments and feedback cycles
Historical performance data
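The factors above could be encoded as model inputs along the following lines. This is a rough sketch only: Weave's real features and model are proprietary and not described here, so the `PullRequest` fields, `FEATURE_NAMES`, and `extract_features` are hypothetical stand-ins. Historical performance data is left out because it would need a record of past PRs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    files_changed: List[str]
    lines_added: int
    lines_deleted: int
    imports_touched: int   # crude proxy for dependency impact
    review_comments: int
    review_rounds: int     # number of feedback cycles before merge


FEATURE_NAMES = [
    "files_changed",
    "lines_changed",
    "distinct_directories",
    "imports_touched",
    "review_comments",
    "review_rounds",
]


def extract_features(pr: PullRequest) -> List[float]:
    """Turn a PR into a numeric vector, ordered as in FEATURE_NAMES."""
    # Count distinct parent directories as a rough measure of scope;
    # files at the repository root share the empty-string directory.
    directories = {
        path.rsplit("/", 1)[0] if "/" in path else ""
        for path in pr.files_changed
    }
    return [
        float(len(pr.files_changed)),
        float(pr.lines_added + pr.lines_deleted),
        float(len(directories)),
        float(pr.imports_touched),
        float(pr.review_comments),
        float(pr.review_rounds),
    ]


# Made-up example PR, for illustration only.
example_pr = PullRequest(
    files_changed=["api/users.py", "api/auth.py", "README.md"],
    lines_added=120,
    lines_deleted=30,
    imports_touched=3,
    review_comments=4,
    review_rounds=2,
)
features = extract_features(example_pr)
```

A trained model would map a vector like this, probably alongside an LLM's reading of the diff itself, to a time estimate. The sketch stops at feature extraction because the source does not say what comes after.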

Comparing Weave’s Approach to Traditional Metrics

Criteria	Story Points	Weave LLM/AI Analytics
Subjectivity	High	Low
Standardization	Low	High
Measures Output Quality	No	Yes
Real-Time Insights	No	Yes
Tracks Review Quality	No	Yes
Gameable	High	Low

When to Use Each Approach

Story points work best for small, co-located teams with stable membership.
Weave’s analytics are ideal for distributed teams, organizations with multiple squads, or groups of any size seeking objective, scalable performance tracking.

Integrating Weave with Your Engineering Workflow

Seamless Integration with Existing Tools

Weave connects with popular platforms like GitHub and Jira, making it easy to start tracking engineering analytics without changing your workflow.

Step-by-Step: Getting Started with Weave

Connect your code repository (e.g., GitHub).
Allow Weave to analyze PRs (this usually takes about 5 hours).
Dive into the dashboards for output, quality, and time allocation.
Use insights to adjust team processes and improve performance.

The Future of Engineering Team Performance Tracking

The shift from story points to AI-driven analytics marks a significant step forward for engineering management. By providing objective, real-time insights into both output and quality, Weave helps teams identify strengths, address weaknesses, and deliver projects more reliably.

Teams that adopt data-driven performance tracking are better equipped to: