PIF AI Whitepaper (English edition)

A Multi-Tenant AI-Assisted Platform for Cosmetic Product Information File Documentation

Version: v0.2 · Date: 2026-04-30 · Author: Vincent Lin (Baiyuan Tech) · License: the whitepaper is licensed under CC BY-NC 4.0; the underlying PIF AI software is licensed under AGPL-3.0.

[!NOTE] This document is an academic-technical whitepaper. Any numbers related to performance, user counts, or revenue are labeled as target or expected values unless supported by measurement or live query — consistent with the project’s Development Constitution: no mock data, no hard-coded numbers, full testing before reporting.

The entire project (code and whitepaper) was developed with the assistance of Anthropic Claude Code, serving as an open-source case study of LLM-assisted engineering applied to regulatory-compliance domains.


🧭 Table of Contents

Part I — Introduction

| § | Chapter | Topic |
|---|---------|-------|
| 01 | Abstract | TL;DR, four design propositions, system overview diagram |
| 02 | Regulatory Background | Taiwan Cosmetic Hygiene & Safety Act Article 8, July 2026 deadline, penalties |
| 03 | The 16 PIF Items | Per-item data source, AI handling, database mapping |

Part II — System Architecture

| § | Chapter | Topic |
|---|---------|-------|
| 04 | System Architecture | Five-layer architecture, module boundaries, data flow |
| 05 | Frontend Stack | Next.js 15 App Router, RSC, shadcn/ui |
| 06 | Backend Stack | FastAPI, SQLAlchemy async, Alembic vs inline migration |

Part III — AI & Data

| § | Chapter | Topic |
|---|---------|-------|
| 07 | AI Engine | Claude Tool Use, Claude Code engineering practice, confidence scoring |
| 08 | Database & Multi-Tenancy | Schema, Row-Level Security, current_setting pattern |
| 09 | Toxicology Pipeline | PubChem / TFDA / ECHA / OECD cross-query |
| 10 | Central RAG Integration | Scheme C+ isolation, dual-header auth, fail-soft |

Part IV — Security & Compliance Process

| § | Chapter | Topic |
|---|---------|-------|
| 11 | Security Model | AES-256, JWT, TOTP, audit, threat model, 5-locale i18n |
| 12 | Roadmap, Deployment & Open-Source Strategy | Docker → K8s, AGPL rationale, Phase 1–3, contribution model |
| 13 | Compliance Engine Deep Dive (Phase 22-23) | Lifecycle 5 stages, business-type responsibility matrix, 14 cross-item lint rules, V0-V3 snapshots, penalty mapping, 14-page regulatory PDF |

Appendices

| § | Chapter | Topic |
|---|---------|-------|
| A | Glossary | PIF, SA, TFDA, INCI, 50+ entries |
| B | API Endpoint Reference | All frontend BFF + backend FastAPI endpoints |
| C | References | Statutes, standards, RFCs, academic papers |
| D | Changelog | Whitepaper revision history |

📖 How to Read

Linear reading: Academic or regulatory readers should start at §1 and proceed through §13, then the appendices.

Quick start (open-source contributors):

  1. Read §1 Abstract for the big picture.
  2. Jump to §4 System Architecture for module boundaries.
  3. Enter your area of interest (frontend → §5, backend → §6, AI → §7, RAG → §10).
  4. Read §12 Roadmap.
  5. Head to the code repo’s CONTRIBUTING.md to start coding.

Regulatory compliance: §2 → §3 → §9 → §11 (SA workflow) → Appendix C.

Security review: §10 → §11 + SECURITY.md.


📊 Whitepaper Scale

| Metric | Target | Current |
|--------|--------|---------|
| Chapters | 13 chapters + 4 appendices | v0.2 complete |
| English word count | 28,000+ words | v0.2 ≈ 32,000 words |
| Figures | 15+ Mermaid diagrams | v0.2 ≈ 16 diagrams |
| Code citations | 40+ (format file:line) | v0.2 complete |
| References | 30+ entries | v0.2 complete |

[!NOTE] This README is a ToC. The complete PDF is available on GitHub Releases. PDF convention: releases/download/<version>/whitepaper-en.pdf.
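The PDF convention above can be turned into a path mechanically. The following is a minimal sketch, not part of the project's tooling: the function names and the `repo_url` argument are hypothetical, and the example repository URL is a placeholder rather than the project's real address.

```python
from urllib.parse import quote


def whitepaper_pdf_path(version: str) -> str:
    """Return the release-relative path releases/download/<version>/whitepaper-en.pdf."""
    # Reject empty tags and tags that would add extra path segments.
    if not version or "/" in version:
        raise ValueError("version must be a non-empty release tag without '/'")
    # Percent-encode anything unusual in the tag; 'v0.2' passes through unchanged.
    return f"releases/download/{quote(version, safe='')}/whitepaper-en.pdf"


def whitepaper_pdf_url(repo_url: str, version: str) -> str:
    """Join a repository base URL (hypothetical argument) with the release path."""
    return f"{repo_url.rstrip('/')}/{whitepaper_pdf_path(version)}"


if __name__ == "__main__":
    # Placeholder repository URL, used for illustration only.
    print(whitepaper_pdf_url("https://github.com/example-org/example-repo", "v0.2"))
```

For the current whitepaper version, `whitepaper_pdf_path("v0.2")` yields `releases/download/v0.2/whitepaper-en.pdf`, which matches the convention stated in the note.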


