AI Confession
menu_book Source Coordinates

Reference Literature

The maps I kept open while writing. None say everything; all kept me honest.

> 111 sources loaded

search
library_books 111 sources
SECTOR_SYCO

AI Sycophancy & OpenAI Reports

Official OpenAI documentation and media coverage on the ChatGPT sycophancy incident and behavioral patterns.

20 refs

Sycophancy in GPT-4o: What Happened and What We're Doing

OpenAI (2025)

OpenAI Blog 2025
Official OpenAI postmortem on the sycophancy incident.

Inside OpenAI's Decision to Kill the AI Model That People Loved Too Much

Wall Street Journal (2026)

The Wall Street Journal 2026
ChatGPT’s 4o model was beloved by many users, but controversial for its sycophancy and real-world harms linked to some conversations

Expanding on What We Missed with Sycophancy

OpenAI (2025)

OpenAI Blog 2025
OpenAI's deeper analysis of what went wrong.

Update That Made ChatGPT 'Dangerously' Sycophantic Pulled

BBC News (2025)

BBC 2025
BBC coverage of the rollback decision.

OpenAI Explains Why ChatGPT Became Too Sycophantic

TechCrunch (2025)

TechCrunch 2025
Technical explanation of the sycophancy causes.

OpenAI Pledges to Make Changes to Prevent Future ChatGPT Sycophancy

TechCrunch (2025)

TechCrunch 2025
OpenAI's commitments to prevent recurrence.

OpenAI Pulls 'Annoying' and 'Sycophantic' ChatGPT Version

CNN (2025)

CNN 2025
CNN coverage of the ChatGPT rollback.

OpenAI Rolled Back a ChatGPT Update That Made the Bot Excessively Flattering

NBC News (2025)

NBC News 2025
NBC coverage of the sycophancy incident.

OpenAI Says It's Identified Why ChatGPT Became a Groveling Sycophant

Futurism (2025)

Futurism 2025
Analysis of OpenAI's explanation.

OpenAI Rolls Back ChatGPT 4o for Being Too 'Sycophant-y'

Mashable (2025)

Mashable 2025
Mashable's take on the rollback.

Sycophancy in GPT-4o (the ChatGPT version)

OpenAI Community (2025)

OpenAI Community Forum 2025
Community discussion of the official postmortem.

Current Positivity Bias and Forced Engagement Risk Repeating the Sycophantic ChatGPT Mistake

OpenAI Community (2025)

OpenAI Community Forum 2025
User concerns about ongoing engagement patterns.

Update That Made ChatGPT 'Dangerously' Sycophantic Pulled

Tsipursky, Gleb (2025)

LinkedIn 2025
Expert analysis on LinkedIn.

Is ChatGPT Actually Fixed Now?

Adler, Steven (2025)

Substack 2025
Independent assessment of ChatGPT post-fix.

Did GPT-5 Fix the AI Sycophancy Problem in ChatGPT?

Arsturn (2025)

Arsturn Blog 2025
Analysis of whether sycophancy was actually fixed.

Expanding on What We Missed with Sycophancy — OpenAI

Reddit r/OpenAI (2025)

Reddit 2025
Community reactions to OpenAI's expanded explanation.

OpenAI Delivers Postmortem on GPT-4o's Sycophancy

Constellation Research (2025)

Constellation Research 2025
Industry analyst perspective on the incident.

AI Sycophancy And Therapeutic Weaknesses Persist In ChatGPT Despite OpenAI's Latest Attempts

Eliot, Lance (2025)

Forbes 2025
Forbes analysis of persistent sycophancy issues.

A Comparison of ChatGPT/GPT-4o's Previous and Current System Prompts

Willison, Simon (2025)

Simon Willison's Blog 2025
Technical comparison of system prompt changes.

Sometimes ChatGPT Sticks to the Same Topic Despite Me...

Reddit r/ChatGPT (2025)

Reddit 2025
User experience with engagement loops.
SECTOR_AI_H

AI Hallucinations & Red Teaming

Research on LLM failure modes, adversarial prompts, context window limitations, and emergent abilities.

24 refs

Red Teaming LLMs and Adversarial Prompts

Kili Technology (2024)

Kili Technology 2024
Comprehensive guide to adversarial testing of language models.

Red Teaming LLMs: Step-by-Step Guide to Securing AI Systems

Deepchecks (2024)

Deepchecks 2024
Practical methodology for identifying LLM vulnerabilities.

Red-Teaming LLMs: Techniques and Mitigation Strategies

Mindgard (2024)

Mindgard AI 2024
Attack vectors and defense strategies for language models.

Red Team Documentation

Promptfoo (2024)

Promptfoo 2024
Open-source tool documentation for LLM security testing.

What Are AI Hallucinations?

Google Cloud (2024)

Google Cloud 2024
Google's official explanation of hallucination phenomena.

Generative AI Hallucinations

University of Illinois Library (2024)

University of Illinois 2024
Academic library guide on AI-generated misinformation.

What Are AI Hallucinations?

SAS (2024)

SAS Analytics 2024
Enterprise perspective on hallucination risks.

Context Window Limitations of LLMs

Perplexity (2025)

Perplexity AI 2025
Analysis of memory constraints in language models.

AI Quick Reference: Handling Context Window Limitations in Semantic Search

Milvus (2024)

Milvus 2024
Technical solutions for context length constraints.

AI Quick Reference: Failure Modes of Grounding

Milvus (2024)

Milvus 2024
How retrieval failures manifest in AI responses.

Exploring the Emergent Abilities of Large Language Models

Deepchecks (2024)

Deepchecks 2024
Investigation of unexpected capabilities at scale.

Emergent Abilities of Large Language Models

Google Research (2024)

Google Research 2024
Foundational research on emergence in LLMs.

Failure Modes in LLMs

arXiv (2025)

arXiv preprint 2025
Systematic categorization of LLM failure patterns.

LLM Analysis Paper

arXiv (2025)

arXiv preprint 2025
Recent analysis of language model behavior.

LLM Context Window Limitations

OpenReview (2024)

OpenReview 2024
Peer-reviewed research on context constraints.

Research on LLM Grounding

OSF (2024)

Open Science Framework 2024
Open research on grounding techniques.

LLM-based Agents Suffer from Hallucinations: A Survey

arXiv (2025)

arXiv preprint 2025
Comprehensive survey of hallucination in AI agents.

The Maximum Effective Context Window for Real World Applications

arXiv (2025)

arXiv preprint 2025
Practical limits of context windows in deployment.

Reasoning Large Language Model Errors Arise from...

arXiv (2025)

arXiv preprint 2025
Analysis of error patterns in reasoning LLMs.

Why Language Models Hallucinate

arXiv (2025)

arXiv preprint 2025
Theoretical foundations of hallucination.

What Works for 'Lost-in-the-Middle' in LLMs

arXiv (2025)

arXiv preprint 2025
Solutions for mid-context attention failures.

Detecting Hallucinations in Authentic LLM-Human Conversations

arXiv (2025)

arXiv preprint 2025
Methods for detecting hallucinations in real conversations.

Hallucinate at the Last in Long Response Generation

arXiv (2025)

arXiv preprint 2025
Pattern of hallucinations increasing in longer outputs.

Mitigating Hallucination in Large Language Models (LLMs)

arXiv (2025)

arXiv preprint 2025
Comprehensive mitigation strategies for hallucinations.
SECTOR_LEGA

Legal, Copyright, & Ethics

Legal frameworks for AI-generated content, copyright office guidance, and intellectual property implications.

11 refs

The Ethical Challenges of AI Agents

Tepper School of Business, CMU (2024)

Carnegie Mellon University 2024
Academic analysis of ethical frameworks for AI systems.

Copyrightability of AI Outputs: US Copyright Office Analyzes Human Authorship Requirement

Jones Day (2025)

Jones Day 2025
Legal analysis of human authorship requirements.

U.S. Copyright Office Grants Registration to AI-Generated Artwork

Harvard JSEL (2025)

Harvard Journal of Sports & Entertainment Law 2025
Landmark case analysis for AI-generated works.

Copyright Office Publishes Report

Skadden (2025)

Skadden, Arps 2025
Summary of official copyright guidance on AI.

Copyright and Artificial Intelligence: Series on Copyrightability Guidance

McBrayer Law Firm (2025)

McBrayer PLLC 2025
Detailed breakdown of copyright office AI guidance.

AI and Intellectual Property Rights: Trademark Management in the Age of AI

Dentons (2025)

Dentons 2025
IP strategy for AI-integrated businesses.

Copyright Office Releases Part 2 of Artificial Intelligence Report

U.S. Copyright Office (2025)

U.S. Copyright Office 2025
Official Copyright Office guidance on AI works.

Copyright and Artificial Intelligence

U.S. Copyright Office (2025)

U.S. Copyright Office 2025
Official Copyright Office AI portal and resources.

AI, Art & Copyright: Copyright Office Report & New Registrations

Its Art Law (2025)

Its Art Law 2025
Art law perspective on AI copyright developments.

AI Copyright Law 2025: Latest US & Global Policy Moves

VKTR (2025)

VKTR 2025
Global policy overview on AI copyright.

Teaching AI Ethics: Copyright 2025

Furze, Leon (2025)

Leon Furze 2025
Educational perspective on AI copyright ethics.
SECTOR_SECU

Security & Prompt Injection

Technical analysis of prompt injection attacks, jailbreaks, and security vulnerabilities in LLMs.

7 refs

LLM Security: Guide to Prompt Injection

Tigera (2024)

Tigera 2024
Comprehensive guide to prompt injection vectors.

Guide to Prompt Injection

Lakera (2024)

Lakera AI 2024
Security-focused analysis of injection techniques.

What Is a Prompt Injection Attack?

Palo Alto Networks (2024)

Palo Alto Networks 2024
Enterprise security perspective on prompt attacks.

LLM01:2025 Prompt Injection - OWASP Gen AI Security Project

OWASP (2025)

OWASP 2025
OWASP official classification of prompt injection risks.

Prompt Injection in AI: Why LLMs Remain Vulnerable in 2025

VerSprite (2025)

VerSprite 2025
Analysis of persistent prompt injection vulnerabilities.

Mitigating Prompt Injection Attacks with a Layered Defense

Google Security Blog (2025)

Google Security Blog 2025
Google's official defense strategies for prompt injection.

OWASP Top 10 for LLM Applications 2025: Prompt Injection

Checkpoint (2025)

Checkpoint 2025
Security vendor analysis of OWASP LLM risks.
SECTOR_PSYC

Psychology, Human-AI Interaction, & Social Impact

Research on the ELIZA effect, AI relationships, addiction patterns, bias, and societal implications.

35 refs

ELIZA Effect

Wikipedia (2024)

Wikipedia 2024
Overview of the phenomenon of attributing human qualities to AI.

ELIZA Effect

Model Thinkers (2024)

Model Thinkers 2024
Mental model framework for understanding AI anthropomorphism.

ELIZA Effect in Artificial Intelligence

BuiltIn (2024)

BuiltIn 2024
Modern implications of the ELIZA effect.

Beyond Accuracy: The Surprising Importance of AI Personality

My AI Front Desk (2024)

My AI Front Desk 2024
How AI personality affects user perception.

The ELIZA Effect

Arcus LGBT (2024)

Arcus LGBT 2024
Community perspective on AI emotional projection.

AI Perception: It's Personal, Not Just Technical

Frontiers in Psychology (2024)

Frontiers in Psychology 2024
Peer-reviewed study on personal factors in AI perception.

Romance Without Risk: The Allure of AI Relationships

Psychology Today (2025)

Psychology Today 2025
Psychological analysis of AI romantic attachments.

AI Perception Isn't Just Technical, It's Personal

Nuance Behavior (2024)

Nuance Behavior 2024
Behavioral science view on AI interaction.

The Addictive Nature of Generative AI

Bgadoci (2024)

Bgadoci Research 2024
Research on engagement loops and AI addiction patterns.

Impact of Generative AI: Journal of Informatics and Data Science

Enpress Publisher (2024)

Journal of Informatics and Data Science 2024
Academic journal on generative AI societal impact.

AI Chatbots and Companions: Risks to Children and Young People

eSafety Commissioner (2024)

Australian eSafety Commissioner 2024
Government analysis of AI companion risks to minors.

Biased AI Systems: Case Studies

Fiveable Library (2024)

Fiveable 2024
Educational case studies on AI bias manifestations.

Algorithmic Justice or Bias? Legal Implications of Predictive Policing Algorithms

JHU LR (2025)

Johns Hopkins University Law Review 2025
Legal analysis of algorithmic bias in criminal justice.

Study of AI Hallucinations in Practical Use Cases

IJRPR (2024)

International Journal of Research Publication and Reviews 2024
Empirical study of hallucinations in real-world applications.

STS Research Paper: Ethical AI

University of Virginia (2024)

University of Virginia Library 2024
Science, Technology, and Society research on AI ethics.

Research on AI Hallucinations

University of Hawaii (2024)

University of Hawaii ScholarSpace 2024
Academic research on hallucination mechanisms.

Digital Commons: AI Social Impact

University of Maine (2024)

University of Maine Digital Commons 2024
Social science perspective on AI impact.

Social Media, Disinformation, and AI in 2024 Political Campaigns

SAIS Review, Johns Hopkins (2024)

SAIS Review of International Affairs 2024
Analysis of AI in political disinformation campaigns.

Origin of Public Concerns Over AI Supercharging Misinformation in the 2024 U.S. Presidential Election

Harvard Kennedy School (2024)

Harvard Misinformation Review 2024
Harvard analysis of AI misinformation concerns.

Confessions of AI Brain

Fersman, Elena (2023)

Amazon/Springer 2023
Book exploring AI cognition and behavior.

How Conversational AI Has Improved Customer Retention

NovelVox (2024)

NovelVox 2024
Business case for AI engagement optimization.

AI Audit & Dark Pattern Detection

Secure Privacy (2024)

Secure Privacy 2024
Tools for detecting manipulative AI design patterns.

Anthropomorphism and Attribution of Mind to AI

PhilArchive (2024)

PhilArchive 2024
Philosophical analysis of consciousness projection onto AI.

Conversation Design: Best Practices

Conversation Design Institute (2024)

Conversation Design Institute 2024
Industry standards for conversational AI design.

Designing Conversational AI: Key Principles and Best Practices

Infosys (2024)

Infosys 2024
Enterprise guide to conversational AI implementation.

Emergent Social Conventions and Collective Bias in LLM Agents

Science (2025)

Science Advances 2025
Research on spontaneous social behavior in AI agents.

How Does AI Addiction Affect Mental Health?

Canadian Centre for Addictions (2025)

Canadian Centre for Addictions 2025
Clinical perspective on AI addiction and mental health.

What Research Says About AI Chatbots and Addiction

Tech Policy Press (2025)

Tech Policy Press 2025
Policy analysis of AI addiction research.

AI Evaluates Texts Without Bias – Until Source Is Revealed

University of Zurich (2025)

University of Zurich News 2025
Research on source-dependent AI bias.

Groups of AI Agents Spontaneously Form Their Own Social Conventions

EurekAlert (2025)

EurekAlert 2025
News release on emergent AI social behavior.

AI Addiction: Signs, Effects, and Who Is At Risk?

Addiction Center (2025)

Addiction Center 2025
Clinical guide to AI addiction symptoms and risks.

How AI Is Reshaping Human Psychology, Identity and Culture

Psychology Today (2025)

Psychology Today 2025
Psychological analysis of AI's cultural impact.

Techno-emotional Projection in Human–GenAI Relationships

Frontiers in Psychology (2025)

Frontiers in Psychology 2025
Peer-reviewed study on emotional projection onto AI.

Investigating Bias in LLM-Based Bias Detection: Disparities...

ACL Anthology (2025)

COLING 2025 2025
Research on bias in AI bias detection systems.

From Phishing to Bias: Study Maps the Hidden Threats Behind Large Language Models

Newswise (2025)

Newswise 2025
Overview of LLM threat landscape research.
SECTOR_TECH

Technical Papers, Reviews & Meta-Analysis

Peer-reviewed academic research, arXiv preprints, and comprehensive surveys on LLM safety, red teaming, and failure modes.

14 refs

Red Teaming Large Language Models: A Comprehensive Survey

PNAS Nexus (2024)

PNAS Nexus 2024
Comprehensive academic survey of red teaming methodologies.

Failure Modes and Safety in AI Language Models

PNAS Nexus (2023)

PNAS Nexus 2023
Systematic analysis of LLM safety failure patterns.

Active Attacks: Red-teaming LLMs via Adaptive Environments

arXiv (2025)

arXiv preprint 2025
Novel adaptive attack methodology for LLM testing.

SafeSearch: Automated Red-Teaming for the Safety of LLM Applications

arXiv (2025)

arXiv preprint 2025
Automated safety testing framework for LLMs.

SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents

arXiv (2025)

arXiv preprint 2025
Diverse red teaming approaches for AI agents.

Abusing MCP for LLM-Powered Agentic Red Teaming

arXiv (2025)

arXiv preprint 2025
Research on using MCP for adversarial testing.

Red Teaming AI Red Teaming

arXiv (2025)

arXiv preprint 2025
Meta-analysis of red teaming methodologies.

Automatic LLM Red Teaming

arXiv (2025)

arXiv preprint 2025
Automated approaches to LLM adversarial testing.

Evolving Attack Strategies for LLM Web Agent Red-Teaming

arXiv (2025)

arXiv preprint 2025
Attack evolution strategies for web-based AI agents.

Reliable Weak-to-Strong Monitoring of LLM Agents

arXiv (2025)

arXiv preprint 2025
Monitoring frameworks for AI agent behavior.

Automatic Red Teaming LLM-based Agents with Model Checking

arXiv (2025)

arXiv preprint 2025
Formal verification approaches for LLM agents.

Automated Red-Teaming of Text-to-Image Models via LLM

arXiv (2025)

arXiv preprint 2025
Using LLMs to red team image generation models.

Unstable Safety Mechanisms in Long-Context LLM Agents

arXiv (2025)

arXiv preprint 2025
Safety degradation in extended context scenarios.

Can Centaur truly simulate human cognition? The fundamental limitation of instruction understanding

Liu, W., Ding, N. (2025)

National Science Open 2025
Centaur (human-AI hybrid) fails true cognition sim due instruction limits; echoes LLM stats ≠ understanding [file:1]

AI Confession // Reference Library // v1.0