
How Weave Is Replacing Story Points with LLMs and AI
Software engineering teams have long relied on story points to estimate and track work. But as teams grow and projects become more complex, traditional metrics often fall short. This gap leads to missed deadlines, unclear productivity signals, and frustration for both engineers and managers. Weave is changing this by using LLMs and domain-specific machine learning to provide a clearer, more objective view of engineering team performance.
Why Traditional Story Points Fall Short
The Problem with Story Points
Story points were designed to help teams estimate effort and complexity. But they are subjective, often inconsistent across teams, and can be influenced by team dynamics or external pressure. This makes it hard to compare work across teams or track progress over time.
Story points are not standardized, so a “5” for one team might be a “2” for another.
Teams often inflate or deflate points to meet targets.
Story points don’t capture the quality or impact of work, only perceived effort.
Industry Frameworks and Metrics
To address these gaps, many organizations have adopted frameworks like DORA, SPACE, and CORE 4 metrics. These models focus on outcomes such as deployment frequency, lead time, and team satisfaction. While they offer a broader view, they still rely on manual data entry and subjective reporting.
DORA metrics: Deployment frequency, lead time for changes, change failure rate, and time to restore service.
SPACE metrics: Satisfaction and well-being, Performance, Activity, Communication and collaboration, and Efficiency and flow.
Core 4 metrics: Speed, Effectiveness, Quality, and Impact, a composite built on top of DORA and SPACE.
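These outcome metrics are straightforward to compute once you have deployment and incident records. As a rough illustration only (the record fields and sample data below are invented for this sketch and are not tied to any particular tool), here is how the four DORA metrics fall out of a handful of events in Python:

```python
# Illustrative only: computing the four DORA metrics from simple event records.
# The field names (merged_at, deployed_at, failed, restored_at) are assumptions for this sketch.
from datetime import datetime
from statistics import median

deploys = [
    {"merged_at": datetime(2024, 5, 1, 9), "deployed_at": datetime(2024, 5, 1, 15), "failed": False},
    {"merged_at": datetime(2024, 5, 2, 10), "deployed_at": datetime(2024, 5, 3, 11), "failed": True},
    {"merged_at": datetime(2024, 5, 4, 8), "deployed_at": datetime(2024, 5, 4, 12), "failed": False},
]
incidents = [
    {"opened_at": datetime(2024, 5, 3, 11), "restored_at": datetime(2024, 5, 3, 14)},
]
window_days = 7  # length of the observation window

# Deployment frequency: deploys per day over the window.
deployment_frequency = len(deploys) / window_days

# Lead time for changes: median time from merge to production deploy.
lead_time = median(d["deployed_at"] - d["merged_at"] for d in deploys)

# Change failure rate: share of deploys that caused a production failure.
change_failure_rate = sum(d["failed"] for d in deploys) / len(deploys)

# Time to restore service: median time from incident open to restoration.
time_to_restore = median(i["restored_at"] - i["opened_at"] for i in incidents)

print(f"Deploys/day: {deployment_frequency:.2f}")
print(f"Median lead time: {lead_time}")
print(f"Change failure rate: {change_failure_rate:.0%}")
print(f"Median time to restore: {time_to_restore}")
```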
How Weave Uses LLMs and AI for Engineering Analytics
Objective Measurement with LLMs
Weave analyzes every pull request (PR) and code review using a combination of LLMs and proprietary machine learning models. Instead of relying on subjective estimates, Weave’s models are trained on expert-labeled datasets to answer a key question: “How long would this PR take for an expert engineer?”
Each PR is evaluated for complexity, scope, and quality.
The system estimates the actual time and effort required, not just lines of code or number of commits.
Weave classifies work into categories like new features, bug fixes, and maintenance, giving teams a clear view of where their time goes.
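Weave’s production models are proprietary and trained on expert-labeled PRs, so the snippet below is only a toy sketch of the underlying idea: hand a general-purpose LLM the diff and ask for an expert-hours estimate plus a work category. The prompt, the output schema, and the choice of the OpenAI API with a `gpt-4o-mini` model are all assumptions made for this illustration, not Weave’s implementation.

```python
# Toy sketch only: this is NOT Weave's model. It shows the general idea of asking
# an LLM to estimate expert effort for a PR and classify the type of work.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = """You are an experienced staff engineer reviewing a pull request.
Estimate how many focused hours an expert engineer would need to produce this change,
and classify it as one of: "new_feature", "bug_fix", "maintenance".
Respond with JSON: {"expert_hours": <number>, "category": "<string>", "rationale": "<string>"}.

Diff:
"""

def estimate_pr_effort(diff_text: str) -> dict:
    """Ask the LLM for an effort estimate and work category for one PR diff."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # model choice is an assumption
        messages=[{"role": "user", "content": PROMPT + diff_text}],
        response_format={"type": "json_object"},  # request machine-readable output
    )
    return json.loads(response.choices[0].message.content)

# Example usage with a trivial diff:
# print(estimate_pr_effort("--- a/app.py\n+++ b/app.py\n+print('hello')"))
```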
Key Features of Weave’s Analytics Platform
Tracks real output over time, not just activity.
Summarizes data and insights in dashboards for easy review.
Measures both output and quality, providing a balanced view of team performance.
Monitors time spent on code review and gauges the quality of those reviews by assessing the depth and practicality of reviewer comments (a sketch of such a rubric follows this list).
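Review-quality scoring is likewise internal to Weave; the fragment below only sketches one plausible shape for such a rubric, with dimensions and wording that are assumptions for illustration:

```python
# Illustrative rubric only; Weave's actual review-quality scoring is proprietary.
REVIEW_RUBRIC = """Rate this code review comment on two 1-5 scales:
- depth: does it engage with logic, edge cases, or design, or is it only a surface nit?
- practicality: can the author act on it directly?
Respond with JSON: {"depth": <1-5>, "practicality": <1-5>}.

Comment:
"""

def build_review_prompt(comment: str) -> str:
    """Attach one review comment to the rubric, ready to send to an LLM."""
    return REVIEW_RUBRIC + comment

# build_review_prompt("Consider handling the empty-list case before indexing.")
# -> a prompt whose LLM answer might look like {"depth": 4, "practicality": 5}
```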
Technical Deep Dive: How the Model Works
Weave’s custom machine learning model is trained on a large, expert-labeled dataset of PRs. The model considers factors such as:
Code complexity and dependencies
Size and scope of changes
Review comments and feedback cycles
Historical performance data
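Those factors can be read as features for a supervised model trained against expert-labeled effort targets. The sketch below shows that general pattern with a small gradient-boosted regressor; the feature set, training data, and model choice are invented for illustration and are not Weave’s actual architecture.

```python
# Illustration of the general approach only: hand-crafted PR features feeding a
# supervised regressor trained against expert-labeled effort targets.
from sklearn.ensemble import GradientBoostingRegressor

# Each row: [files_changed, lines_added, lines_deleted, review_rounds, touched_core_module]
X_train = [
    [1,  12,   3, 1, 0],
    [4, 210,  40, 3, 1],
    [2,  35,  10, 1, 0],
    [9, 800, 150, 5, 1],
    [1,   5,   1, 1, 0],
]
# Expert-labeled targets: hours an expert engineer would need for each PR.
y_train = [0.5, 6.0, 1.5, 20.0, 0.25]

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

new_pr = [[3, 120, 25, 2, 1]]  # features for an unlabeled PR
print(f"Estimated expert hours: {model.predict(new_pr)[0]:.1f}")
```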
Comparing Weave’s Approach to Traditional Metrics
| Criteria | Story Points | Weave LLM/AI Analytics |
|---|---|---|
| Subjectivity | High | Low |
| Standardization | Low | High |
| Measures Output Quality | No | Yes |
| Real-Time Insights | No | Yes |
| Tracks Review Quality | No | Yes |
| Gameable | High | Low |
When to Use Each Approach
Story points work best for small, co-located teams with stable membership.
Weave’s analytics are ideal for distributed teams, organizations with multiple squads, or any group seeking objective, scalable performance tracking.
Integrating Weave with Your Engineering Workflow
Seamless Integration with Existing Tools
Weave connects with popular platforms like GitHub and Jira, making it easy to start tracking engineering analytics without changing your workflow.
Step-by-Step: Getting Started with Weave
Connect your code repository (e.g., GitHub).
Allow Weave to analyze your PRs (this usually takes about 5 hours).
Dive into the dashboards for output, quality, and time allocation.
Use insights to adjust team processes and improve performance.
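Weave handles the repository connection itself once you authorize it, but if you are curious about the kind of PR data that connection exposes, GitHub’s public REST API gives a sense of it. The owner, repo, and token below are placeholders for this sketch:

```python
# Sketch of the PR data GitHub's REST API exposes once a repository is connected.
# OWNER, REPO, and the token are placeholders; Weave's own integration handles this for you.
import os
import requests

OWNER, REPO = "your-org", "your-repo"
headers = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

# List recently closed pull requests (merged PRs are the ones an analysis would score).
resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}/pulls",
    headers=headers,
    params={"state": "closed", "per_page": 20},
    timeout=30,
)
resp.raise_for_status()

for pr in resp.json():
    if pr["merged_at"]:  # skip PRs that were closed without merging
        print(pr["number"], pr["title"], pr["merged_at"])
```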
The Future of Engineering Team Performance Tracking
The shift from story points to AI-driven analytics marks a significant step forward for engineering management. By providing objective, real-time insights into both output and quality, Weave helps teams identify strengths, address weaknesses, and deliver projects more reliably.
Teams that adopt data-driven performance tracking are better equipped to:
Spot and resolve bottlenecks quickly.
Allocate resources more effectively.
Improve code quality and team collaboration.
For engineering leaders looking to move beyond subjective metrics, Weave offers a clear, actionable path to better team performance.