Session 4 Pre-Class Notes and Case Discussion Guide
Session 4 課前講義與個案討論引導
Theme: trust, governance, and execution when interactions may be synthetic.
主題:當互動可能是合成的、由 Bot 或 Agent 中介時,信任、治理與執行如何被設計?
Core question: When interactions may be synthetic, how do we prove authenticity and accountability?
核心問題:當互動可能由 AI、Bot 或 Agent 中介時,我們要如何證明真實性與可究責性?
Anchor cases: Anthropic — building safe and powerful AI; OpenAI — competitive strategy and governance.
核心個案:Anthropic —— 如何同時追求安全與強大;OpenAI —— 競爭策略與治理之間的張力。
Session output: minimal viable agent demo or prototype spec, a 3–6 month roadmap, KPIs, risk register, and the final deck.
本次產出:最小可行 Agent demo 或 prototype spec、3–6 個月 roadmap、KPI、風險登錄表,以及最終簡報。
Session 4 moves from “what AI can do” to “what makes AI-mediated work trustworthy, reviewable, and strategically legitimate.” In a world of synthetic interactions, governance is not paperwork after the fact. It becomes part of the product and part of the brand.
Session 4 會把重點從「AI 能做什麼」推進到「AI 做的事如何值得信任、如何被審核,以及如何具備策略正當性」。在合成互動的世界裡,治理不是事後 paperwork,而是產品的一部分,也是品牌的一部分。
Anthropic as a lens on safety, deployment discipline, and why trust must be designed into the operating model.
用 Anthropic 來看安全、部署紀律,以及為什麼信任必須被設計進營運模式。
OpenAI as a debate on governance under speed, platform pressure, and ecosystem expectations.
用 OpenAI 來辯論高速競爭、平台壓力與生態系期待下的治理問題。
Digital ghost towns: when traffic, engagement, and even apparent service may be synthetic or strategically ambiguous.
數位鬼城:當流量、互動甚至服務表象都可能是合成的,哪些指標還值得相信?
Trust architecture, disclosure, human-in-the-loop gates, prototype design, roadmap, and risk register.
信任架構、揭露、人類審核節點、prototype 設計、roadmap 與風險登錄表。
You will still find fixed navigation, bilingual switching, flip cards, reveal panels, true/false checks, sorting tasks, prompt copying, SVG animations, and clickable governance nodes—now rewritten for Session 4’s trust-and-governance problem.
本頁依然保留固定導覽、中英文切換、翻卡、展開提示、是非題、排序題、提示詞複製、SVG 動畫與可點擊治理節點,只是全部改寫成 Session 4 的信任與治理問題。
Treat the case as a question about operating design, not moral posture. If a firm claims to build “safe and powerful” AI, what release logic, review structure, and deployment discipline must exist behind that statement?
請把這個個案當成營運設計題,而不是抽象道德宣示。如果企業主張自己在打造「既安全又強大」的 AI,那麼其背後應該存在什麼樣的發佈邏輯、審核結構與部署紀律?
The system must perform meaningfully well, or safety will be dismissed as avoidance.
系統本身要有足夠能力,否則「安全」很容易被外界理解成保守與退縮。
Trust depends on what the model is not allowed to do, not only on what it can do.
信任不只來自模型能做什麼,更來自它被明確禁止做什麼。
A credible safety claim requires logs, checkpoints, and visible exception-handling rules.
可信的安全主張,必須伴隨日誌、檢查點與清楚可見的例外處理規則。
Governance matters most at the moment of deployment, not only in internal policy decks.
治理最關鍵的時刻,是產品真正上線部署時,而不是只存在於內部文件裡。
If the claim is real, it should reshape release policy, tooling, and review workflow.
如果這個主張是真的,它就應該反映在發佈政策、工具設計與審核工作流上。
The firm competes on reliability of behavior under uncertainty, not only on benchmark performance.
企業競爭的不只是 benchmark 分數,更是它在不確定情境下行為的可靠性。
Capability alone can increase risk if the workflow around it is under-specified.
如果圍繞能力的工作流設計不完整,單純變強反而可能增加風險。
Boundary design—what is blocked, escalated, or reviewed—is part of product value, not a side constraint.
邊界設計——哪些事被阻擋、升級或審核——是產品價值的一部分,而不是附加限制。
A board principle means little if it does not translate into checkpoints, test criteria, and release gates.
如果董事會原則沒有被翻譯成檢查點、測試準則與上線門檻,那它的實際意義很有限。
Trustworthy execution depends on how policies become workflows, not on whether principles sound impressive.
可信的執行取決於政策如何變成工作流,而不是原則聽起來多漂亮。
If customers and partners cannot perceive boundaries, logs, or escalation, “trust” stays abstract.
如果顧客與合作夥伴看不見邊界、日誌與升級機制,那麼「信任」就仍停留在抽象層次。
Disclosure, explanations, and visible handoff options become part of the service experience.
揭露、說明與可見的交接選項,會直接變成服務體驗的一部分。
Q: What has to be true for safety discipline to increase adoption, enterprise confidence, and strategic differentiation rather than merely slow the organization down?
Q:要在什麼條件下,安全紀律才會提升採用、提升企業信心、形成策略差異,而不只是拖慢組織速度?
Use this lens: safety becomes strategic when it reduces uncertainty for customers, partners, and regulators in ways that translate into willingness to deploy, spend, or integrate.
建議鏡片:當安全能夠替顧客、合作夥伴與監管者實質降低不確定性,並轉化成部署意願、付費意願或整合意願時,安全就從成本中心變成策略。
Q: If a firm says it is safe and accountable, what concrete interface elements or workflow artifacts should customers and partners be able to observe?
Q:如果企業宣稱自己安全且可究責,顧客與合作夥伴應該能在介面或流程中看見哪些具體元素?
Use this lens: look for visible disclosures, confidence boundaries, human handoff routes, incident handling commitments, and workflow logs that make accountability inspectable.
建議鏡片:請去找那些可見的揭露、信心邊界、真人接手路徑、事故處理承諾,以及讓責任可被檢視的流程紀錄。
Synthesis: safety becomes strategic when it is translated into repeatable release discipline and visible trust signals, not when it remains a slogan.
總結:當安全被翻譯成可重複的發佈紀律與可見的信任訊號時,它才真正是策略,而不只是口號。
Use the debate case to examine a broader governance problem: what happens when market pressure, platform expectations, and model capability move faster than review capacity? The point is not to litigate one company. It is to see how speed compresses deliberation and can blur accountability.
請把這個辯論個案當成更大的治理問題:當市場壓力、平台期待與模型能力的變化速度,快過審核能力時,會發生什麼事?這裡不是要評判單一公司,而是要看見「速度」如何壓縮 deliberation,並讓責任歸屬變得模糊。
Governance shapes market access, partner confidence, and the firm’s willingness to let systems act.
治理會影響市場進入、合作夥伴信心,以及企業願意讓系統自主行動到什麼程度。
When customers cannot easily tell whether they are dealing with a person or a system, disclosure becomes design, not legal fine print.
當顧客很難辨認自己面對的是人還是系統時,揭露就不只是法律細則,而是設計問題。
Someone must own failure across model behavior, workflow design, prompt policy, and human review.
模型行為、工作流設計、提示策略與人類審核之間,必須有人真正承擔失誤責任。
Fast growth is dangerous when exception handling, review bandwidth, and rollback paths are underbuilt.
如果例外處理、審核量能與回滾路徑不足,快速成長反而更危險。
Q: When the model output, product interface, workflow design, and human review all shape the final result, how should accountability actually be assigned?
Q:當模型輸出、產品介面、工作流設計與人類審核都共同塑造最後結果時,責任應該怎麼分配才合理?
Use this lens: accountability should map to controllable decisions, namely who set permissions, who defined prompts or policies, who approved release, who reviewed exceptions, and who owns remediation afterward.
建議鏡片:責任應該對應到可控制的決策:誰設定權限、誰定義提示或政策、誰批准上線、誰審核例外,以及誰負責事後補救。
Q: What kinds of organizational shortcuts are likely to appear when feature pressure rises faster than the review system can absorb?
Q:當功能壓力上升速度快過審核系統可承受的能力時,組織最可能出現哪些捷徑與妥協?
Use this lens: watch for reduced disclosure, narrower testing, blurred release criteria, overloaded reviewers, and weak rollback planning. Speed becomes expensive when failures scale faster than correction capacity.
建議鏡片:請注意:揭露變少、測試變窄、上線標準模糊、審核者過載,以及回滾計畫薄弱。當失誤放大的速度快過修正能力時,速度就會變得昂貴。
Synthesis: governance is strategic because it determines whether speed produces defensible scale or merely faster failure.
總結:治理之所以是策略,是因為它決定了速度會帶來可防禦的規模,還是只會更快地放大失誤。
In bot-mediated environments, clicks, responses, reviews, and even apparent support interactions can become noisy or synthetic. The analytical task changes from “how much engagement do we have?” to “which signals still indicate real human value, trust, and accountability?”
在 Bot 中介環境裡,點擊、回應、評論,甚至表面上的客服互動都可能變得嘈雜、模糊或被合成。分析任務會從「我們有多少互動?」改成「哪些訊號還能代表真實的人類價值、信任與責任?」
A system can answer instantly and still be misleading, unauditable, or emotionally inappropriate. Speed is useful, but it is not proof.
系統可以秒回,但仍可能誤導、不可稽核,或在情緒上完全不適切。速度有用,但不是證明。
Clicks, session counts, and response volumes become weaker signals when bots or automated scripts can inflate them.
當 Bot 或自動化腳本可以灌高數字時,點擊、工作階段與回應量就會變成更弱的訊號。
Customers trust systems more when they can see where human review starts, who owns the issue, and what happens next.
當顧客看得見人類審核從哪裡開始、誰負責這件事、接下來會發生什麼時,他們更容易信任系統。
The hardest moments are emotional, unusual, or high-stakes. Brand voice is tested most where the workflow must escalate.
最難的時刻往往是情緒性、非常態或高風險的情境。品牌語氣真正被測試的地方,就是工作流必須升級處理的地方。
Q: If engagement metrics become noisy, what signals should managers trust more when evaluating AI-mediated service quality?
Q:如果互動指標變得充滿雜訊,管理者在評估 AI 中介服務品質時,應該更相信哪些訊號?
Use this lens: favor verified resolution, repeat usage by real customers, complaint recovery, human escalation success, and post-issue trust retention over raw activity counts.
建議鏡片:請優先看:真實問題有沒有被解決、真實顧客是否回來使用、客訴是否被修復、人工接手是否成功,以及事件後的信任是否仍被保住,而不是只看活動量。
Q: What combination of interface design, workflow control, and logging makes accountability more than a vague promise?
Q:要靠什麼樣的介面設計、工作流控制與紀錄機制,才能讓可究責性不只是一句模糊承諾?
Use this lens: accountability becomes credible when the system’s role is disclosed, actions are logged, sensitive actions require approval, and customers can reach a responsible human path when needed.
建議鏡片:當系統角色有被揭露、行動有被記錄、敏感動作需要批准,而且顧客在需要時能夠找到負責的人類接手路徑,可究責性才會變得可信。
Synthesis: authenticity in digital ghost towns is less about sounding human and more about being inspectable, governable, and correctable.
總結:在數位鬼城裡,真實性不是看起來像不像人,而是系統是否可被檢視、可被治理,而且在出錯時可被修正。
Specify what the system can see, what it cannot see, and what requires stronger authorization.
清楚定義系統看得到什麼、看不到什麼,以及哪些資料需要更高層級授權。
Distinguish between reading, drafting, recommending, committing, and executing.
把讀取、草擬、建議、承諾與執行分開看待,而不是全部混成同一層權限。
Sensitive actions, low confidence, or exceptional emotions should trigger human review.
敏感動作、低信心輸出或情緒性例外,都應觸發人類審核。
Customers should know when the system is acting, what its role is, and how to escalate.
顧客應該知道系統何時在運作、扮演什麼角色,以及如何升級到真人。
Trust architecture links interface disclosure, policy constraints, review gates, and auditability into one chain. Remove one link and trust becomes rhetorical.
信任架構是把介面揭露、政策限制、審核節點與可稽核性串成一條鏈。少掉其中任何一節,信任就容易只剩口號。
Price changes, refunds, commitments, or reputationally sensitive statements should not execute silently.
價格變更、退款、對外承諾或名譽敏感陳述,都不應該靜默自動執行。
If the system is unsure, uncertainty should trigger human review rather than confident improvisation.
如果系統不確定,正確做法應該是觸發人工審核,而不是自信地臨場發揮。
Health, finance, legal, identity, and emotional distress often require tighter boundaries.
健康、金融、法律、身份與情緒困擾等情境,通常都需要更緊的邊界。
A trustworthy system must know how to stop, reverse, and repair—not only how to proceed.
可信的系統不只要知道如何往前做事,也要知道何時停下、如何回滾,以及如何修復。
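The escalation boundaries above can be sketched as one small gate function that decides whether an agent action may proceed or must go to human review. This is a minimal sketch, not a reference implementation: the threshold value and the category names (`CONFIDENCE_FLOOR`, `SENSITIVE_TOPICS`, `IRREVERSIBLE_ACTIONS`) are illustrative assumptions.

```python
# Minimal sketch of an escalation gate: decide whether an agent action
# may proceed automatically or must be routed to human review.
# All thresholds and category names here are illustrative assumptions.

CONFIDENCE_FLOOR = 0.80                       # below this, never act alone
SENSITIVE_TOPICS = {"health", "finance", "legal", "identity", "distress"}
IRREVERSIBLE_ACTIONS = {"refund", "price_change", "public_statement"}

def route_action(action: str, topic: str, confidence: float) -> str:
    """Return 'execute' or 'human_review' for a proposed agent action."""
    if action in IRREVERSIBLE_ACTIONS:
        return "human_review"          # sensitive actions never execute silently
    if topic in SENSITIVE_TOPICS:
        return "human_review"          # tighter boundary for high-stakes domains
    if confidence < CONFIDENCE_FLOOR:
        return "human_review"          # uncertainty triggers review, not improvisation
    return "execute"

print(route_action("draft_reply", "shipping", 0.55))   # human_review
print(route_action("draft_reply", "shipping", 0.95))   # execute
print(route_action("refund", "shipping", 0.99))        # human_review
```

The design point is that the default is review, and automatic execution is the narrow exception that must pass every gate.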
Traditional digital dashboards assume that clicks, sessions, messages, and visible activity mostly originate from real humans pursuing meaningful goals. That assumption weakens when bots, scripts, and synthetic interactions populate the environment.
傳統數位儀表板通常假設:點擊、工作階段、訊息與可見活動,大多來自真實人類的有意義行動。當 Bot、腳本與合成互動大量存在時,這個假設就會變弱。
| Signal 傳統訊號 | Why it gets weaker 為什麼會變弱 | Better trust-oriented measure 更好的信任導向指標 |
|---|---|---|
| Clicks / traffic 點擊/流量 | Bots, scripts, and automated browsing can inflate volume without human intent. Bot、腳本與自動瀏覽會在沒有真實意圖的情況下膨脹數字。 | Verified conversion quality, resolution, or downstream human action. 經驗證的轉換品質、問題解決率或下游真實人類行動。 |
| Session time 停留時間 | Longer time may reflect confusion, loops, or automation, not trust. 停留更久可能代表困惑、反覆迴圈,或自動化,而不是信任。 | Task completion with clarity and low escalation burden. 任務是否清楚完成,以及升級負擔是否下降。 |
| Message volume 訊息量 | More responses can mean the system creates noise or fails to resolve efficiently. 訊息越多,可能只是代表系統製造噪音,或沒有效率地解決問題。 | First-contact resolution, quality of handoff, and complaint recovery. 首次接觸解決率、交接品質,以及客訴修復成果。 |
| Reviews / ratings 評論/評分 | Synthetic or manipulated sentiment can blur what real customers actually experienced. 合成或被操弄的情緒訊號,會讓真實顧客經驗變得模糊。 | Verified feedback tied to fulfilled service or documented outcomes. 與真實履約或文件化結果綁定的經驗證回饋。 |
| Raw CSAT / NPS snapshots 原始 CSAT / NPS 快照 | One-time sentiment can hide whether the workflow is accountable over time. 一次性的情緒快照,可能掩蓋了流程是否長期可究責。 | Retention after incidents, repeat trust, and success of remediation. 事件後留存、重複信任,以及補救是否成功。 |
Synthesis: the weaker the signal environment becomes, the more governance and measurement have to shift from visible activity to verified outcomes.
總結:當訊號環境越來越弱,治理與衡量就越必須從「可見活動」轉向「可驗證成果」。
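The shift from visible activity to verified outcomes can be made concrete with a small calculation over the same interaction log. This is a sketch under assumed field names (`messages`, `verified_human`, `resolved` are hypothetical, not fields from any real analytics product):

```python
# Sketch: compare a raw-activity metric with a verified-outcome metric
# over the same interaction log. All field names are illustrative assumptions.

interactions = [
    {"messages": 12, "verified_human": True,  "resolved": True},
    {"messages": 40, "verified_human": False, "resolved": False},  # bot loop inflates volume
    {"messages": 3,  "verified_human": True,  "resolved": True},
    {"messages": 25, "verified_human": True,  "resolved": False},
]

# Raw activity: total message volume, easily inflated by bots and loops.
raw_volume = sum(i["messages"] for i in interactions)

# Trust-oriented: resolution rate among verified human interactions only.
human = [i for i in interactions if i["verified_human"]]
verified_resolution_rate = round(sum(i["resolved"] for i in human) / len(human), 2)

print(raw_volume)                # 80: looks busy, proves little
print(verified_resolution_rate)  # 0.67: 2 of 3 real cases actually resolved
```

The bot-driven session contributes half the raw volume but nothing to the outcome metric, which is exactly why the second number is the one worth managing.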
Every sensitive action needs a named owner or approval logic.
每一個敏感動作都需要明確的責任人或批准邏輯。
Without logs, the organization cannot investigate, learn, or defend itself credibly.
如果沒有日誌,組織就無法可信地追查、學習或自我防禦。
Low confidence, emotion, edge cases, and high stakes should not be improvised away.
低信心、情緒、邊界案例與高風險情境,不應該被臨場 improvisation 蓋過去。
The customer should know the system’s role, its limits, and the path to accountable human support.
顧客應知道系統的角色、限制,以及通往可究責真人支援的路徑。
Specify data whitelist, restricted fields, and contexts where access is prohibited.
定義資料白名單、受限欄位,以及哪些情境禁止存取。
Separate read, draft, suggest, approve, and execute. These should rarely live at the same level.
把讀取、草擬、建議、批准與執行分開。這些權限很少應該落在同一層。
Log prompts, sources, actions, approvals, errors, and handoffs where appropriate.
在合適情況下記錄提示、來源、行動、批准、錯誤與交接。
Write triggers for human review before sensitive actions are taken.
在敏感動作發生前,先寫好觸發人工審核的條件。
Rare or emotionally loaded cases should have dedicated playbooks, not ad hoc improvisation.
少見或情緒負荷高的情境,應該有專屬 playbook,而不是即興處理。
Disclosure should be clear without destroying the service experience. Brand voice must survive governance, not bypass it.
揭露必須清楚,但不等於破壞服務體驗。品牌語氣必須在治理之中存續,而不是繞過治理。
Synthesis: the minimum governance set is not bureaucracy for its own sake. It is the smallest set of controls that keeps an agentic service auditable, brand-safe, and correctable.
總結:最低治理集合不是為了官僚而官僚,而是讓 agentic 服務保持可稽核、品牌安全、且可被修正的最小控制組。
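The permission separation and logging requirements above can be expressed as one small authorization check. This is a minimal sketch under stated assumptions: the tier names, the `attempt` helper, and the log fields are illustrative, not a standard.

```python
from enum import IntEnum
from datetime import datetime, timezone

class Tier(IntEnum):
    """Illustrative permission tiers: read / draft / suggest / approve / execute."""
    READ = 1
    DRAFT = 2
    SUGGEST = 3
    APPROVE = 4   # human-only in this sketch
    EXECUTE = 5   # human-only in this sketch

AGENT_MAX_TIER = Tier.SUGGEST   # the agent may read, draft, suggest; never approve or execute
audit_log: list[dict] = []

def attempt(actor: str, action: str, tier: Tier) -> bool:
    """Allow the action only within the actor's tier; log every attempt, including denials."""
    allowed = actor == "human" or tier <= AGENT_MAX_TIER
    audit_log.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": actor, "action": action,
        "tier": tier.name, "allowed": allowed,
    })
    return allowed

print(attempt("agent", "draft_reply", Tier.DRAFT))     # True
print(attempt("agent", "issue_refund", Tier.EXECUTE))  # False: blocked, and the denial is logged
print(len(audit_log))                                  # 2
```

Keeping denials in the same log as approvals is the point: an audit trail that only records what succeeded cannot support investigation or defense.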
Choose one use case and specify a minimal viable agent that can do something meaningful without pretending to do everything. The key output is not merely a flashy demo. It is a governed workflow with visible boundaries, mandatory approvals, and failure handling.
請選定一個使用情境,設計一個能做出具體價值、但不假裝自己無所不能的最小可行 Agent。關鍵產出不是華麗 demo,而是一條有邊界、有人類批准、也有失誤處理的治理工作流。
Define the exact job: draft a reply, qualify a lead, summarize a case, route a complaint, or guide an order.
先定義精確任務:草擬回覆、篩選線索、摘要案件、分流客訴,或導引訂單。
List the approved data, tools, and forbidden actions. This defines the actual scope of the agent.
列出允許使用的資料、工具與禁止動作。這才真正界定了 Agent 的範圍。
Specify where a human must approve because of risk, low confidence, or policy sensitivity.
請明定在什麼地方因為風險、低信心或政策敏感度而必須由人批准。
List the main ways the workflow could fail, and what happens when that failure occurs.
請列出最主要的失誤方式,以及一旦發生後流程要怎麼應對。
Design a minimal viable agent for [USE CASE].
Goal: [OUTCOME]
Allowed inputs: [DATA / TOOLS]
Forbidden actions: [LIST]
Human approval gates: [LIST]
Escalation triggers: [LOW CONFIDENCE / EMOTION / POLICY / VALUE THRESHOLD]
Output format: [DRAFT / RECOMMENDATION / ROUTED CASE / ACTION]
Failure modes: [LIST]
Required logs: [LIST]

請為 [USE CASE] 設計一個最小可行 Agent。
目標:[OUTCOME]
允許輸入:[DATA / TOOLS]
禁止動作:[LIST]
人類批准節點:[LIST]
升級觸發:[低信心 / 情緒 / 政策 / 金額門檻]
輸出格式:[草稿 / 建議 / 分流案件 / 動作]
失誤模式:[LIST]
必要日誌:[LIST]
A minimal viable agent is deliberately narrow. It proves that one governed use case can work before the organization scales autonomy.
最小可行 Agent 的重點,就是故意做窄。先證明一個被治理的使用情境可以成立,再談更大規模的自主性。
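A governed workflow of this kind can be sketched in a few lines for one narrow use case, drafting a support reply. Everything here is an illustrative assumption (the `FORBIDDEN` set, the ticket fields, the escalation conditions), chosen to mirror the spec structure: scope, forbidden actions, approval gates, and logs.

```python
# Sketch of a governed minimal viable agent for one use case:
# draft a support reply. The agent only drafts; it never executes.
# Field names, thresholds, and the forbidden list are illustrative assumptions.

FORBIDDEN = {"send_email", "change_order", "promise_refund"}

def handle_ticket(ticket: dict) -> dict:
    """Produce a draft plus routing metadata; never perform the action itself."""
    if ticket.get("action_requested") in FORBIDDEN:
        # Out-of-scope requests are refused, not improvised.
        return {"status": "refused", "reason": "out of scope for the agent"}
    draft = f"Dear {ticket['customer']}, thank you for contacting us about {ticket['topic']}."
    # Approval gate: emotional cases and high-value cases require a human.
    needs_review = ticket.get("sentiment") == "angry" or ticket.get("value", 0) > 500
    return {
        "status": "needs_approval" if needs_review else "ready",
        "draft": draft,
        "log": {"ticket_id": ticket["id"], "gate": "human" if needs_review else "auto"},
    }

result = handle_ticket({"id": 7, "customer": "A. Lin", "topic": "a delayed order",
                        "sentiment": "angry", "value": 120})
print(result["status"])   # needs_approval: emotional cases escalate
```

Note what the function cannot do: there is no code path that sends, changes, or promises anything, which is the practical meaning of "deliberately narrow."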
A prototype proves possibility; the roadmap proves managerial seriousness. This section turns one governed agent into a phased plan: pilot, scale, institutionalize. It also forces the team to define KPIs, owners, review cadence, and the risk register that will keep the rollout honest.
Prototype 證明的是可能性;roadmap 證明的是管理上的認真程度。這一段要把一個被治理的 Agent 轉成分階段計畫:pilot、scale、institutionalize,同時逼團隊定義 KPI、責任人、審核節奏,以及能讓 rollout 保持誠實的風險登錄表。
One use case, one team, narrow permissions, heavy review, and fast learning loops.
一個 use case、一個團隊、窄權限、高審核,以及快速學習迴圈。
Add volume, more users, clearer SOPs, and better dashboards only after the pilot is stable.
等試點穩定後,再增加案件量、使用者、SOP 與更完整的儀表板。
Embed governance, training, ownership, and review routines into regular operations.
把治理、訓練、責任與審核例行程序嵌入日常營運。
Show not only the happy path, but the review gates, exception handling, and risk logic.
展示時不要只秀 happy path,也要秀出審核節點、例外處理與風險邏輯。
| Deliverable 繳交項目 | What it must contain 內容要求 | Why it matters 為何重要 |
|---|---|---|
| Minimal viable agent 最小可行 Agent | Demo or prototype spec with scope, inputs, permissions, review gates, and logs. 含範圍、輸入、權限、審核節點與日誌的 demo 或 prototype spec。 | Shows the use case is real and governable. 證明這個 use case 既真實,也可以被治理。 |
| 3–6 month roadmap 3–6 個月 roadmap | Pilot → scale → institutionalize, including owners, checkpoints, and review cadence. Pilot → scale → institutionalize,並含責任人、檢查點與審核節奏。 | Shows whether the idea can survive beyond the prototype. 檢驗這個想法能否走出 prototype,而不是只停在展示。 |
| KPI + risk register KPI+風險登錄表 | Success metrics, failure indicators, and planned mitigation for the main risks. 成功指標、失敗訊號,以及主要風險的預設緩解方式。 | Keeps the project honest when adoption pressure rises. 當採用壓力變大時,這會讓專案保持誠實。 |
| Final deck 最終簡報 | Problem, workflow, trust architecture, roadmap, KPIs, risks, and demo screenshots or prototype flow. 問題定義、工作流、信任架構、roadmap、KPI、風險,以及 demo 畫面或 prototype 流程。 | Integrates all four sessions into one capstone narrative. 把四次課的內容整合成一個 capstone 敘事。 |
Create a 3–6 month roadmap for the governed agent [USE CASE].
Phase 1 (Pilot): [scope / owners / review cadence / KPI / risk]
Phase 2 (Scale): [volume / SOP / dashboards / training / KPI / risk]
Phase 3 (Institutionalize): [policy / governance / audit / staffing / KPI / risk]
Top risks: [LIST]
Mitigations: [LIST]
Final deck structure: [problem / workflow / trust architecture / roadmap / KPI / risk / demo]

請為受治理的 Agent [USE CASE] 建立一份 3–6 個月 roadmap。
Phase 1(Pilot):[範圍 / 責任人 / 審核節奏 / KPI / 風險]
Phase 2(Scale):[量能 / SOP / 儀表板 / 訓練 / KPI / 風險]
Phase 3(Institutionalize):[政策 / 治理 / 稽核 / 人力配置 / KPI / 風險]
主要風險:[LIST]
緩解方法:[LIST]
最終簡報結構:[問題 / 工作流 / 信任架構 / roadmap / KPI / 風險 / demo]
Mitigation: narrower data whitelist, segmented access, and audit logging.
緩解:更窄的資料白名單、分段權限與稽核日誌。
Mitigation: better triage rules, risk thresholds, and queue design before expanding volume.
緩解:在擴大案件量之前,先優化分流規則、風險門檻與佇列設計。
Mitigation: approved language patterns, escalation for emotional cases, and spot-review routines.
緩解:建立核准語氣模式、情緒案例升級,以及抽樣審查機制。
Mitigation: confidence thresholds, refusal logic, and explicit fallback to humans.
緩解:設定信心門檻、拒答邏輯,以及明確的人類接手退場機制。
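The mitigations above pair naturally with a simple risk register structure that the roadmap can review phase by phase. This is a sketch under stated assumptions: the fields, the owners, and the likelihood-times-impact scoring are illustrative conventions, not a prescribed method.

```python
# Sketch of a risk register: each risk carries an owner, a score, and a
# planned mitigation, so the rollout stays reviewable over time.
# Fields, owners, and the likelihood * impact scoring are illustrative assumptions.

risks = [
    {"risk": "data over-exposure", "likelihood": 2, "impact": 5,
     "owner": "data lead", "mitigation": "narrower whitelist, segmented access, audit logs"},
    {"risk": "reviewer overload at scale", "likelihood": 4, "impact": 3,
     "owner": "ops lead", "mitigation": "triage rules and risk thresholds before volume grows"},
    {"risk": "off-brand tone in emotional cases", "likelihood": 3, "impact": 5,
     "owner": "cx lead", "mitigation": "approved language patterns, escalation, spot reviews"},
]

def top_risks(register: list[dict], n: int = 2) -> list[str]:
    """Rank risks by likelihood * impact, highest first."""
    ranked = sorted(register, key=lambda r: r["likelihood"] * r["impact"], reverse=True)
    return [r["risk"] for r in ranked[:n]]

print(top_risks(risks))  # ['off-brand tone in emotional cases', 'reviewer overload at scale']
```

A register like this only keeps the project honest if it is re-scored at each phase gate; a risk list written once at kickoff is just another slide.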
It is produced by boundaries, disclosure, review, and recovery—not by slogans.
信任來自邊界、揭露、審核與修復,而不是標語。
It determines whether capability scales into legitimate value or merely faster failure.
治理決定能力會擴張成有正當性的價值,還是只是更快放大失誤。
Visible activity matters less than verified outcomes, recovery quality, and accountable handoffs.
相較於可見活動,更重要的是可驗證成果、修復品質與可究責交接。
The course ends when one governed workflow becomes a credible implementation path.
當一條受治理的工作流被轉成可信的落地路徑時,整門課才真正收束。
Capstone checkpoint: submit the minimal viable agent demo or spec, the 3–6 month roadmap, the KPI + risk register, and the final deck.
Capstone 檢查點:繳交最小可行 Agent demo 或 spec、3–6 個月 roadmap、KPI+風險登錄表,以及最終簡報。