Raising Organizational Productivity Floor through Harness in the Software 3.0 Era
Blog

Viva Republica
2026.02.27
Service · by μ„±μ‚°/λΆ€μ‚°/μž‘λΆ€
#AI Engineering #Harness #LLM #Software 3.0 #Workflow

Key Points

  1. The paper identifies a significant gap in team-wide LLM utilization due to varying "LLM Literacy" among engineers, leading to inefficiency despite common tools.
  2. It proposes that LLM marketplaces and plugins, like those in Claude Code, can serve as an "Executable Single Source of Truth" and a "Frictionless Harness" to standardize effective LLM workflows across the organization.
  3. This approach aims to elevate team productivity by deploying expert LLM practices as shareable modules, fostering continuous improvement through quality assurance, and ultimately creating an "AI-Native Data Flywheel."

This paper proposes a paradigm shift in how Large Language Models (LLMs) are integrated into software development teams, moving from individualistic, ad-hoc utilization to a standardized, systemic approach. The core argument is that while LLMs offer significant productivity gains, the current "every man for himself" (κ°μžλ„μƒ) adoption leads to severe performance disparities among engineers due to varying "LLM Literacy": specifically, the ability to design effective contexts for LLMs. An engineer with strong context-engineering skills can achieve a complex refactoring in minutes, while another without it might spend hours battling hallucinations. This gap, the author argues, is not a deficit in coding skill but in the ability to precisely control the LLM as a tool.

The proposed solution centers on leveraging tools like Claude Code's plugin and marketplace ecosystem to create a "core harness" (핡심 ν•˜λ„€μŠ€) that "raises the floor" (상ν–₯ 평쀀화) of LLM utilization across the entire organization. This involves treating LLM workflows not as personal tricks but as shared, executable components within a team's system.

Core Methodologies and Technical Concepts:

  1. The Frictionless Harness (Seamless Integration):
The paper emphasizes the critical need for "seamless integration" (λŠκΉ€ 없이 μ„žμ΄λŠ” κ²½ν—˜) to minimize context switching costs. Traditional LLM usage often involves copy-pasting code into a browser-based chat interface, which introduces "micro-friction" (λ―Έμ„Έν•œ 마찰). The proposed solution advocates for a Terminal User Interface (TUI) environment (as provided by Claude Code), where natural language and code can interleave directly within the developer's primary workspace. This "smoothness" (λ§€λ„λŸ¬μ›€) is crucial for the unresisted propagation of structured LLM workflows throughout the team.

  2. Executable Single Source of Truth (Executable SSOT):
The paper introduces the concept of "Executable SSOT" (μ‹€ν–‰ κ°€λŠ₯ν•œ SSOT) to address the obsolescence inherent in traditional documentation (wikis, Notion). Knowledge, when defined as plugins for LLM agents, possesses a dual nature:
  • For humans: It serves as a static business guideline or manual.
  • For LLMs: It functions as a dynamic, precise instruction set encoded as "system prompts" (μ •ν™•ν•œ μ§€μ‹œμ‚¬ν•­μ΄ λ‹΄κΈ΄ μ‹œμŠ€ν…œ ν”„λ‘¬ν”„νŠΈ) and agent logic.
This shifts the paradigm of knowledge management from mere "recording" to "execution." Updates to a plugin directly translate to immediate changes in the LLM agent's behavior, ensuring that the team's shared knowledge is always current and actionable, representing a true SSOT.
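This dual nature can be illustrated with a minimal Python sketch (the `Plugin` class and its fields are hypothetical, not Claude Code's actual plugin format): a single source string is rendered one way for human readers and another way as an LLM system prompt, so editing it updates both views at once.

```python
from dataclasses import dataclass

@dataclass
class Plugin:
    """Hypothetical plugin: one shared body with a dual nature."""
    name: str
    body: str  # the single shared source of truth

    def as_guideline(self) -> str:
        # For humans: a static, readable manual.
        return f"# {self.name}\n\n{self.body}"

    def as_system_prompt(self) -> str:
        # For LLMs: the same body, framed as a precise instruction set.
        return "You must follow these team rules exactly:\n" + self.body

rules = Plugin(
    name="commit-conventions",
    body="Prefix every commit message with a Jira issue key (e.g. PAY-123).",
)
# Editing `rules.body` changes both views at once -- the SSOT property.
```

Because there is only one `body`, the guideline a human reads and the instructions the agent executes can never drift apart, which is exactly the obsolescence problem wikis suffer from.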

  3. Domain-Optimized Harnesses (Raising the Floor):
While generic LLM plugins (e.g., oh-my-zsh equivalents for LLMs) can provide a baseline of best practices, the paper argues for a step further: domain-specific optimization. Generic tools understand "code" but lack "domain context" (도메인 λ§₯락). Teams need to define, for their own domain, which tasks are "AI-friendly" and which require "Human-in-the-Loop" (HITL) intervention (e.g., payment teams vs. settlement teams). The objective is to maximize the proportion of the work generated by the LLM while minimizing human intervention, reserving human approval for critical junctures.

This is an extension of Software 1.0 era "Platform Engineering," where common functionalities (Auth, Logging) were encapsulated into shared libraries. In Software 3.0, these common modules become "AI workflow plugins," distributed via a marketplace instead of library deployment. The crucial difference is that the contents of the modules shift from "code" to "prompts and agent logic."

Quality assurance (QA) for these AI workflows through peer feedback on the marketplace platform (e.g., "this prompt uses too many tokens," "this agent hallucinates in this scenario") is posited to evolve team AI capability from individual intuition into collective intelligence.
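A minimal sketch of such a domain policy, with the task names and policy table invented for illustration: the routing rule lets the LLM run autonomously on AI-friendly tasks and pauses for human approval at critical junctures, defaulting unknown tasks to HITL.

```python
# Hypothetical policy table for a payment team; all task names are invented.
PAYMENT_TEAM_POLICY = {
    "generate-unit-tests": "ai",        # AI-friendly: cheap to verify
    "refactor-module": "ai",
    "change-settlement-logic": "hitl",  # critical juncture: needs approval
    "modify-fee-calculation": "hitl",
}

def route_task(task: str, policy: dict[str, str]) -> str:
    """Let the LLM produce as much of the work as possible,
    reserving humans for critical junctures."""
    mode = policy.get(task, "hitl")  # unknown tasks default to human review
    return "run autonomously" if mode == "ai" else "pause for human approval"
```

Encoding the policy as data is what makes it shareable: a settlement team would ship the same routing function with a different table.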

  4. Marketplace for Predictability and Dev-Prod Parity:
The paper argues for a plugin marketplace over Retrieval-Augmented Generation (RAG) systems due to benefits in "predictability" and "Dev-Prod Parity."
  • Predictability: Unlike RAG systems, where the injected context (from hybrid search, rerankers, etc.) can be opaque, plugins are explicit documents and code. Developers retain 100% control over the logic and can visually verify the context provided to the LLM, enhancing reliability.
  • Dev-Prod Parity: Workflows can be validated locally in the TUI environment by modifying plugins and getting immediate feedback from the LLM. Using SDKs (e.g., Claude Agent SDK), locally validated plugins can be loaded directly into server-side agent environments, making the marketplace the SSOT connecting development and production.
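The parity argument can be sketched as follows, assuming a hypothetical JSON manifest (the field names are illustrative, not the real Claude Code plugin schema): because the local TUI and the server-side agent read the same file through the same loader, behavior cannot drift between environments.

```python
import json
import pathlib
import tempfile

def load_plugin(path: pathlib.Path) -> dict:
    """Shared loader: the local TUI and the server-side agent read the
    same manifest, so behavior cannot drift between environments."""
    return json.loads(path.read_text())

# One manifest, written once. Field names are illustrative only.
manifest = {"name": "release-checklist", "system_prompt": "Check the CHANGELOG first."}
with tempfile.TemporaryDirectory() as d:
    path = pathlib.Path(d) / "plugin.json"
    path.write_text(json.dumps(manifest))
    local_view = load_plugin(path)   # validated interactively in the TUI
    server_view = load_plugin(path)  # loaded by the server-side agent
```

The marketplace then plays the role the artifact registry plays for libraries: the one place both environments pull from.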
  5. Marketplace 1.0: Workflow Deployment Platform:
The marketplace is envisioned as "version 1.0 of a platform for deploying organizational workflows" (쑰직의 μΌν•˜λŠ” 방식(Workflow)을 λ°°ν¬ν•˜λŠ” ν”Œλž«νΌμ˜ 1.0 버전).
  • Governance: Team leads can package conventions (lint rules, Git branching strategies, testing policies) as plugins and distribute them via a private registry. Guided by these plugins, the LLM can actively enforce team disciplines and align engineer behavior with them (e.g., stopping a git commit on main and suggesting a feature branch instead). This transforms passive linters into active, guiding governance tools.
  • Knowledge Transfer: The expertise of highly proficient LLM users can be encapsulated into simple slash commands (e.g., /new-feature). When executed, these commands trigger a complex LLM-driven workflow (context gathering, Jira issue creation, branch generation, PR creation), enabling all team members to execute high-quality, standardized workflows, thereby "raising the floor" of collective productivity.
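The governance idea can be sketched as a small guard function (hypothetical; in Claude Code such a rule would live in the plugin's prompt and agent logic rather than in application Python): unlike a passive linter, it blocks the violation and proposes the team's convention as the next step.

```python
# Hypothetical guard for a branch-protection governance plugin.
PROTECTED_BRANCHES = {"main", "master"}

def guard_commit(branch: str) -> str:
    """Active governance: block the violation and suggest the convention.
    In practice `branch` would come from `git rev-parse --abbrev-ref HEAD`."""
    if branch in PROTECTED_BRANCHES:
        return (f"Blocked: direct commits to '{branch}' are not allowed. "
                "Suggestion: git switch -c feature/<ticket-id>")
    return f"OK: committing on '{branch}'"
```

The difference from a lint rule is the second half of the message: the agent does not just refuse, it carries the engineer to the team's preferred workflow.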
  6. Layered Architecture for Context Management:
Inspired by traditional layered architectures, the paper proposes a hierarchical structuring of knowledge within plugins for LLMs:
  • Global Layer: Enterprise-wide common regulations (security policies, basic coding styles).
  • Domain Layer: Team/business-specific knowledge (e.g., payment processing, settlement logic).
  • Local Layer: Repository-specific implementation details and project-specific rules.
This tiered structure ensures that the LLM is provided with only the "necessary knowledge" for the current task, preventing information overload and creating a "living knowledge base" that is always current and executable, obviating the need for complex external RAG systems.
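A minimal sketch of layer assembly, with every knowledge entry invented for illustration: the context for a task is composed from exactly the layers it needs, rather than retrieved opaquely from an external index.

```python
# Hypothetical layered knowledge base; all entries are invented.
KNOWLEDGE = {
    "global": "Never log card numbers. Follow the company Python style guide.",
    "domain": {"payments": "All amounts are integer KRW; never use floats."},
    "local": {"pay-api": "Handlers live in app/handlers; tests mirror that path."},
}

def build_context(domain: str, repo: str) -> str:
    """Hand the LLM only the layers the current task needs,
    instead of opaque chunks from an external RAG index."""
    return "\n".join([
        KNOWLEDGE["global"],
        KNOWLEDGE["domain"][domain],
        KNOWLEDGE["local"][repo],
    ])

ctx = build_context("payments", "pay-api")
```

Because the assembly is explicit, a developer can read exactly what the LLM will see, which is the predictability property argued for above.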

  7. The Data Flywheel Hypothesis:
Looking forward, the established marketplace and plugin system are hypothesized to form a "data flywheel" (데이터 ν”ŒλΌμ΄νœ ). The standardized, structured interactions facilitated by plugins generate high-quality instruction-tuning datasets. This data can then be used to fine-tune domain-specific small language models (sLLMs), with existing workflows serving as evaluation criteria for the refined models. This creates a virtuous cycle: more usage yields more data, better data yields more refined models, and better models in turn drive more usage. This is seen as the essence of an "AI-Native Data Flywheel."
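The first turn of the flywheel can be sketched as a log-to-dataset conversion (field names follow the common instruction/input/output convention rather than any specific training framework; the example values are invented):

```python
import json

def to_instruction_record(command: str, context: str, output: str) -> str:
    """One logged plugin interaction becomes one JSONL training example --
    the raw material of the data flywheel."""
    return json.dumps(
        {"instruction": command, "input": context, "output": output},
        ensure_ascii=False,
    )

# Invented example of a logged /new-feature run.
record = to_instruction_record(
    command="/new-feature add refund endpoint",
    context="Domain: payments. All amounts are integer KRW.",
    output="Created branch feature/refund-endpoint and opened a draft PR.",
)
```

Because every team member runs the same standardized workflow, the logs are uniform enough to serve directly as tuning data, which ad-hoc chat transcripts are not.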

In conclusion, the paper advocates for moving beyond individual LLM usage to a formalized, team-centric system built on executable plugins and a marketplace. This approach aims to standardize LLM literacy, enforce organizational conventions, propagate best practices, and ultimately build a self-improving AI-driven development ecosystem, transforming LLMs from isolated tools into an integrated, enterprise-level "harness."