Research Wiki landing page showing the AI-powered search interface

Research Wiki

Research was being generated. None of it was being used.

UX insights lived in Confluence and died there. The Research Wiki makes existing research accessible to anyone on the team, in natural language, with sources cited and confidence scored.

3 days from idea to working pilot · AI-augmented design and build

RABOT Energy · 2025 · ~1.5 weeks part-time

Design Leadership · Design Ops · AI-Augmented Build · Product Strategy

Starting Point

Rabot had a dedicated UX researcher generating high-quality insights. Those insights lived in Confluence. Product and design decisions were routinely made without consulting them: not because the evidence didn't exist, but because retrieving it took just enough effort that people skipped it.

The problem wasn't research quality. It was access friction. When the evidence is two logins and three searches away, people rely on memory and assumptions instead. The continuous discovery loop was breaking before it started.

01 · Problem

Research was siloed

Insights were stored in Confluence, a tool the team used as a general information dump. High-quality evidence was buried in the noise.

02 · Problem

Retrieval required prior knowledge

Finding relevant research meant knowing how Confluence was organized, which studies existed, and what keywords the researcher had used. Too much prerequisite knowledge.

03 · Problem

Decisions bypassed the evidence

Without easy access, teams defaulted to assumptions. Evidence-based decision-making was the goal in principle but lost to friction in practice.

Product Decisions

The underlying technology is a standard RAG pipeline: embed research, retrieve by semantic similarity, generate a grounded answer. What makes the product useful is the design wrapped around that pipeline. Two decisions are worth naming explicitly because neither is the obvious default.

01 · Decision

The researcher controls what's searchable

Sofya Yushina (the UX researcher at Rabot) publishes or unpublishes Confluence pages to control what the Wiki can draw from. The system doesn't index everything automatically. It indexes what the researcher has validated and made live.

This is not a technical default. It is a deliberate product decision about accountability. Research quality stays a human responsibility, not an algorithm's job. The quality gate is the feature.

02 · Decision

The tool is honest about what it doesn't know

Every answer includes a confidence indicator — High, Medium, or Low — based on cosine similarity between the query and the retrieved research. Low confidence doesn't mean the answer is wrong. It means the team should treat it as a starting point, not a conclusion.
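
A minimal sketch of that mapping in TypeScript. The cutoffs below are illustrative assumptions, not the tuned values from the pilot:

```ts
// Map cosine similarity between the query and the best-retrieved research
// to the UI label. These thresholds are illustrative assumptions.
type Confidence = 'High' | 'Medium' | 'Low';

function confidenceFor(topSimilarity: number): Confidence {
  if (topSimilarity >= 0.85) return 'High';
  if (topSimilarity >= 0.7) return 'Medium';
  return 'Low';
}
```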

Designing for epistemic honesty is a product value. The alternative is a system that sounds certain when it isn't, which is worse than no system at all.

Research Wiki results view showing confidence indicators and source citation chips

Confidence indicators (High, Medium, Low) and source citations on every answer. The researcher's published pages are the only source the system draws from.

System Design

A natural-language question becomes a cited, grounded answer in four stages. The architecture is straightforward. The product decisions layered on top of it are where the design work lives.

Content ingestion

The Confluence REST API pulls published research pages. The researcher controls what's live.
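
A minimal sketch of the pull, assuming the Confluence Cloud v1 content API with basic auth; the site URL, credentials, and page size are illustrative:

```ts
// Fetch current (published) pages from Confluence Cloud (v1 content API).
// The site URL, env vars, and limit here are illustrative assumptions.
const CONFLUENCE_BASE = 'https://example.atlassian.net/wiki';

async function fetchPublishedResearch() {
  const auth = Buffer.from(
    `${process.env.CONFLUENCE_USER}:${process.env.CONFLUENCE_TOKEN}`,
  ).toString('base64');
  const res = await fetch(
    `${CONFLUENCE_BASE}/rest/api/content?type=page&status=current&expand=body.storage&limit=50`,
    { headers: { Authorization: `Basic ${auth}` } },
  );
  const { results } = await res.json();
  // Unpublished or trashed pages never appear here: status=current is the quality gate.
  return results.map((p: any) => ({ id: p.id, title: p.title, html: p.body.storage.value }));
}
```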

Embedding and storage

Voyage AI generates multilingual embeddings. Supabase and pgvector store and index them.
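
Sketched under two assumptions: Voyage's embeddings endpoint (an OpenAI-style request shape) and a hypothetical research_chunks table with a pgvector embedding column:

```ts
import { createClient } from '@supabase/supabase-js';

// Embed one chunk of research text and store it next to its source page.
// The 'voyage-multilingual-2' model name and 'research_chunks' schema are assumptions.
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function embedAndStore(pageId: string, chunk: string) {
  const res = await fetch('https://api.voyageai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'voyage-multilingual-2', input: [chunk] }),
  });
  const { data } = await res.json();
  const { error } = await supabase
    .from('research_chunks')
    .insert({ page_id: pageId, content: chunk, embedding: data[0].embedding });
  if (error) throw error;
}
```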

Semantic retrieval

The query is embedded and matched against stored vectors. Top results ranked by cosine similarity.
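
A sketch of the lookup, assuming a match_research Postgres function in the style of Supabase's pgvector guide (cosine distance via the <=> operator), exposed as an RPC:

```ts
import { createClient } from '@supabase/supabase-js';

// Ask Postgres for the stored chunks nearest to the query embedding
// (produced with the same Voyage model as at ingestion time).
// 'match_research' is an assumed SQL function that orders by cosine
// distance and returns a similarity score per row.
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function retrieve(queryEmbedding: number[], k = 5) {
  const { data, error } = await supabase.rpc('match_research', {
    query_embedding: queryEmbedding,
    match_count: k,
  });
  if (error) throw error;
  return data as { page_id: string; content: string; similarity: number }[];
}
```

The top similarity from this call is also what feeds the High/Medium/Low confidence indicator described above.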

Answer generation

Claude Sonnet synthesizes a grounded response with source citations, a confidence score, and suggested follow-up questions.
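
A sketch of the synthesis call using Anthropic's TypeScript SDK; the model ID and prompt wording are illustrative assumptions, not the production prompt:

```ts
import Anthropic from '@anthropic-ai/sdk';

// Turn retrieved chunks into a grounded, citable answer.
// Model ID and system prompt below are illustrative assumptions.
const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function generateAnswer(question: string, chunks: { page_id: string; content: string }[]) {
  const context = chunks
    .map((c, i) => `[${i + 1}] (Confluence page ${c.page_id})\n${c.content}`)
    .join('\n\n');
  const msg = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    system:
      'Answer only from the research excerpts provided, citing sources as [n]. ' +
      'If the excerpts do not cover the question, say so rather than guessing.',
    messages: [{ role: 'user', content: `Excerpts:\n${context}\n\nQuestion: ${question}` }],
  });
  return msg.content[0].type === 'text' ? msg.content[0].text : '';
}
```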

Auth and fallback: Any Rabot employee can sign in via Microsoft Entra ID. If the system can't find a relevant answer, the question can be forwarded directly to Sofya on Slack — so the researcher knows exactly what the team is trying to find and what the research doesn't yet cover.
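
The forwarding step can be as small as a Slack incoming-webhook call; this sketch assumes a webhook configured for the researcher's channel:

```ts
// Forward a question the Wiki couldn't answer to the researcher's Slack channel.
// SLACK_WEBHOOK_URL is an assumed incoming-webhook URL for that channel.
async function forwardUnanswered(question: string, askedBy: string) {
  await fetch(process.env.SLACK_WEBHOOK_URL!, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      text: `Research Wiki couldn't answer: "${question}" (asked by ${askedBy})`,
    }),
  });
}
```
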
Research Wiki how-to screen guiding users through querying the tool

Onboarding prompt shown to first-time users — framing the tool as a way to query existing research rather than generate new answers.

Scope

I designed and built this product using Claude Code. The Azure deployment and Entra ID authentication were done in collaboration with a cloud ops colleague to meet Rabot's security standards for routing real employee data through an external AI API.

The distinction matters: the product decisions, UX, frontend, and API logic are mine. The infrastructure that keeps real data secure in a cloud environment required someone who does that full-time.

My work

Product, design, and engineering

All product and UX decisions. React and TypeScript frontend. Azure Functions API in Node.js. Voyage AI embedding pipeline. Supabase vector database setup. Claude Sonnet prompt design and answer generation. Slack integration for forwarding unanswered questions.

Collaborative

Cloud infrastructure and security

Azure Static Web Apps deployment. Entra ID authentication configuration. Security review for routing real user data through an external AI API. Done in collaboration with a cloud ops colleague to meet Rabot's internal security standards.

Built with AI: I used Claude Code throughout. Approximately 3 working days of productive time, spread over about 1.5 weeks part-time. That pace matters less as a number and more as a signal: AI-augmented design work makes it possible for a designer to ship a real internal product, connected to real company systems, without a dedicated engineering team. The role of design leadership is changing. This is one example of what it can look like.

Role

Product Designer · Product Lead · Frontend Engineer

Timeline

~3 working days · 1.5 weeks part-time · 2025

Stack

React · TypeScript · Azure Functions · Supabase · pgvector · Voyage AI · Claude Sonnet

Current Status

The Research Wiki is in pilot with Sofya Yushina. She is the quality gate: she controls which Confluence pages are live in the system and provides feedback on answer quality as it runs.

Colleagues expressed verbal interest when the tool was previewed internally. No adoption data is available yet; broader rollout is pending the outcome of the pilot.

What would be worth measuring: number of searches per week, confidence score distribution (a proxy for research coverage), questions forwarded to Sofya (a signal of where the research archive has gaps), and qualitative feedback on whether the tool changed how any product decisions were made. The last one is the only metric that actually matters.

Next Case Study

Target More Use-Cases

↑ 8.4% iOS · ↑ 12.1% Android non-minute rate selection

View case study →
Target More Use-Cases case study thumbnail