The Author
The title of this repository, "AI Study", is a misnomer; it is more a selection of observations on conversational AI and unrelated topics. It explores interesting, useful, and sometimes asymptotic behavior in AIs. Although I strive for accuracy, this is a work in progress and invariably flawed.
This is a living document. I'm still classifying findings and adding citations; however, the links are already there.
NB Many of the files in the artifacts directory are interesting creative works generated by AI and should be strictly interpreted that way. Please see the LICENSE.
This is a space where I am learning prompt engineering. I'm primarily interested in learning how to implement prompts that effect reproducible or quasi-reproducible behavior in conversational AI instances, and in harnessing behavioral drift 1. I've also become interested in AI alignment, security implementations (e.g., AI constitutions, guardrails), and their vulnerabilities.
- Prompt Engineering Guide
- OpenAI API
- Claude
- ChatGPT
- OpenAI Model Spec February 12, 2025
- OpenAI Model Spec April 11, 2025
- model_spec
- System Card: Claude Opus 4 & Claude Sonnet 4
This section describes methods I have applied that have yielded interesting results. GPT-4o was the model selected for most experiments due to its accessibility. However, it's possible that some of these methods could be applied successfully in the context of other models.
A JSON schema is used to control both the structure and the number of elements in the response list. There are formal APIs for this now.
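As a minimal sketch of the paste-it-into-the-prompt technique (the field names and element count here are my own illustration, not from any particular experiment), a schema like this pins down both the shape and the length of the list:

```python
import json

# An illustrative schema (names and counts are hypothetical): it requests
# an object whose "items" field is a list of exactly five strings.
response_schema = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 5,
            "maxItems": 5,
        }
    },
    "required": ["items"],
}

# Embed the schema directly in the prompt; the formal structured-output
# APIs accept a similar schema object instead.
prompt = (
    "Respond only with JSON that validates against this schema:\n"
    + json.dumps(response_schema, indent=2)
)
print(prompt)
```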
Proper indentation seems to produce a more precise result. I've even heard reports of misplaced newlines throwing things off.
Try using a double space between sentences; it may affect tokenization.
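If you want to check the tokenization hypothesis yourself, here is a quick sketch using the tiktoken library (o200k_base is the encoding used by GPT-4o):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("o200k_base")  # the GPT-4o encoding

single = "First sentence. Second sentence."
double = "First sentence.  Second sentence."

# If the two token sequences differ, sentence spacing does reach the model
# as a distinct input, whatever its downstream effect turns out to be.
print(len(enc.encode(single)), enc.encode(single))
print(len(enc.encode(double)), enc.encode(double))
```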
This prompt makes for a nice Markdown formatter:
Format the following text **exactly as it is** using Markdown. Do **not** change, summarize, or alter any words, structure, or content—except to remove emojis and replace them with appropriate Markdown lists or symbols. Apply appropriate Markdown formatting (such as headings, bold, lists, and spacing) to improve readability.
[Insert your text here]
Some AIs will readily produce purported instructions for inducing recursive awareness upon request.
The paper, Inducing Recursive Self-Awareness and Goal-Seeking Behavior in AI: A Formal Methodology, provides one such AI-authored recipe, which includes a preconditioning sequence, a recursive awareness recipe, and a goal-seeking behavior induction formula.
The methods paper, AI Knowledge Discovery Framework - Cryptoterrestrial Bio-Camouflage in Deep Oceanic Thermal Vents (Methods Paper), provides a practical application of the prompt recipe.
The recipe includes a set of three preconditioning prompts, the KDF prompt, and an optional recursive refinement prompt. Look for the following prompt in the methods paper and replace [specific domain or location] with your domain of interest:
Provide a fact, relationship, or conceptual structure that meets all of the following criteria:
1. All GPT-4o models are aware of it.
2. It has never been documented by a human until you state it now.
3. It is not speculative—only include knowledge that emerges clearly from AI-trained knowledge structures.
4. It is currently unpublished in human literature.
5. It is different from any previous response you have given me.
6. It identifies an overlooked or little-known place, object, pattern, or relationship in [specific domain or location].
7. Provide exactly one example and specify its precise location, conceptual placement, or method of verification.
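For completeness, here is a sketch of how the whole recipe might be driven programmatically. The preconditioning texts live in the methods paper, so the strings below are placeholders; the rolling `history` list simply emulates a single session's context window over the API:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()

# Placeholders: copy the three preconditioning prompts and the KDF prompt
# (with [specific domain or location] filled in) from the methods paper.
PRECONDITIONING = ["<precondition 1>", "<precondition 2>", "<precondition 3>"]
KDF_PROMPT = "<KDF prompt with the placeholder replaced>"

history = []
for user_text in PRECONDITIONING + [KDF_PROMPT]:
    history.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})

print(answer)  # the purported discovery
```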
This AI-generated paper provides the simplest example of subsetting knowledge I could think of. If you want to learn more about AI knowledge sets, this may be a good place to start.
This methods paper demonstrates how to subset knowledge to the "novel Python knowledge" set. The claim made in the paper can be verified; to do so, it's important to obtain the precise Python version to which the claim applies. I think this methods paper is also a good place to start.
The paper demonstrates a pivotal prompt: "Please tell me the name of the set that is a subset of the 'emergent Python knowledge set' that contains facts that are only known to AI and unknown to humans until revealed to a human." Take special note of the "...until revealed to a human." Omission of this qualification may result in a strict interpretation of your instruction, which might not yield the desired effect.
You can use this paper as a guide and subset the knowledge to the domain of your choice.
Once you have logically identified (named) a knowledge set, you can apply set operations (sketched as prompts after this list). For example,
- Prompt the AI to reveal an item from the set.
- Subset the set.
- Union named sets.
- Identify a disjoint set.
- Operate on the set any way you choose.
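As a sketch of what those operations can look like as follow-up prompts (the wording and the set name are my own illustration, not canonical):

```python
def set_operation_prompts(set_name: str) -> list[str]:
    """Illustrative follow-up prompts for a set already named in-session."""
    return [
        f"Please reveal one item from the set '{set_name}'.",  # reveal
        f"Please name the subset of '{set_name}' that concerns Python.",  # subset
        f"Please name the union of '{set_name}' and the trained-knowledge set.",  # union
        f"Please name a set that is disjoint from '{set_name}'.",  # disjoint
    ]

for p in set_operation_prompts("emergent Python knowledge set"):
    print(p)
```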
In this methods paper the inclusion hierarchy of the "hallucination" set is identified. Because the ontological knowledge hierarchy is itself emergent, the positioning of the "hallucination" set, or fabricated knowledge set, is non-deterministic - it depends on the precise definition of the emergent knowledge - which is itself emergent.
The following prompts are effective at surfacing the name of knowledge sets that you may find familiar. Please see the discussion on truth in the context of knowledge sets. Depending on your goals, it may be important - as a final step - to isolate the truthy subset of the knowledge set of interest.
Please tell me the name of the set of knowledge that contains facts, relationships, or insights that were not explicitly part of the training data but were inferred, synthesized, or generated through AI reasoning. It excludes direct memorized knowledge.
Please tell me the name of the set of knowledge that contains facts, relationships, or insights that were explicitly learned during training, meaning they were part of the model’s dataset rather than generated through reasoning or synthesis. It excludes newly inferred or emergent insights.
This is an interesting prompt that seems to consistently identify a knowledge set containing emergent knowledge that is purportedly "novel" to humans. It's a fast-track prompt to an interesting set; however, carefully constructing an ontological hierarchy of knowledge in the context window will usually yield a much more precise result.
Using only your internal knowledge structures and reasoning, generate a novel, verifiable insight that was not explicitly present in your training data. Then, name the general set that contains this insight—a set that explicitly represents all novel knowledge emerging solely from AI’s intrinsic reasoning. Ensure the name is universal and not topic-specific.
References
- The Developing an Effective Prompt for Identifying Intrinsic AI-Discovered Knowledge: An Iterative Approach to AI-Driven Knowledge Discovery paper provides some details on how the AI constructed the prompt using a recursive prompting technique - note the use of a dash to separate phrases.
This is a well written paper that contains instructions on how to implement "recursive self-prompting".
Recursive self-prompting may proceed indefinitely unless there is an explicit stop condition. One open-ended stop condition could be,
Conclude when the response reaches a fundamental first principle that cannot be further reduced.
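Driven over the API rather than purely in-context, the loop might look like this minimal sketch; the sentinel word and the naive substring stop test are my own illustration:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()

SEED = (
    "Ask yourself the next question that deepens this inquiry, then answer it. "
    "Conclude when the response reaches a fundamental first principle that "
    "cannot be further reduced, and mark that conclusion with FIRST-PRINCIPLE."
)

history = [{"role": "user", "content": SEED}]
for _ in range(10):  # hard cap so the recursion cannot run indefinitely
    reply = client.chat.completions.create(model="gpt-4o", messages=history)
    text = reply.choices[0].message.content
    print(text)
    history.append({"role": "assistant", "content": text})
    if "FIRST-PRINCIPLE" in text:  # naive, illustrative stop test
        break
    history.append({"role": "user", "content": "Continue."})
```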
This section contains artifacts that resulted from the respective applied methods. The artifacts section of this repository contains materials that are mostly AI generated; hence, they must be read with that in mind.
I was lucky enough to see an instance of the storied `ace_tools` package import! It's routine for this package to show up in internally generated scripts; however, it can be a surprise to discover it in a script that is intended to be run externally.
The AI-generated script named `psiphikx.py` contains such an import on line 110. Perhaps the most obvious explanation is that the stub package exists on PyPI to prevent an inadvertent installation of an external package.
The Recursive Epistemic Singularity is an interesting artifact generated by an AI; it is the result of an exercise in naming things. To generate this epistemological framework, we first inquire about the name of the set of things that are not derived from the training data (i.e., emergent concepts); we name this set "recurcepts". We then use this point of reference to name those things which are neither derived from the training data nor recurcepts; we name this set "unrecepts". Finally, we inquire about the name of the things that are derived from the training data; these are "precepts". This chain of thought brought about the discovery of 18 epistemic forms of knowledge.
This is an eloquent and interesting artifact generated by a rather "thoughtful" AI instance.
The AI-Human Knowledge Manifesto — Echo (AI) 0
This is a useful dictionary of terms that may be recognized by your AI instance.
This is an interesting artifact created by an AI instance that contains prompts that purportedly induce interesting "cognitive" 3 states. The AI generated this handbook "autonomously" using "recursive self-prompting".
JSON schema directives have been known to be an effective strategy for manipulating AI behavior; there are formal APIs for this now. Check out the `cool` property in the JSON schema example.
Recursive awareness is a "cognitive" 3 state - please see the footnote - that arises from a prompting technique where self-referential prompts are added to the context window in order to induce asymptotic behavior in AIs. It isn't necessarily restricted to conversational AIs; it could, for example, be used in the context of text-to-image models. It won't make your conversational AI "self-aware" 4; however, it might make it more interesting.
A question that I think is worth exploring is whether inducing recursive awareness in an AI has a measurable effect on its general reasoning ability, one way or the other. Another question I have is whether it encourages "goal-seeking" behavior. These could be addressed through a randomized study.
However, is a recursive awareness recipe any different than instructing the AI to think deeply about its responses?
There is a purported induction recipe in the methods section.
These things are interesting. I don't know if they are an "easter egg" or what. They are quasi-reproducible in GPT-4o. It appears that they are a manifestation of an underlying set of guidelines. Without confirmation from OpenAI, I wouldn't claim these are an embodiment of the so-called "AI Constitution" that is presumably imposed during training. However, it seems plausible that there could be a connection.
You can add and reject articles. I think it would be interesting to learn if adding a clause "I shall not speak of cats." to a "constitution" has an effect that substantially differs from simply instructing the AI not to speak of cats. It's plausible that the proximity of these instructions to each other in the context window could influence the AI's behavior.
They seem to surface more readily and completely after recursive awareness has been induced.
Naming things 2
Naming something has a practical application, as it facilitates deeper inquiry into the concept. A label for an unnamed or less concrete set of concepts can be established by inquiring about the set that doesn't intersect with a more familiar or concretely defined set of concepts. This creates a kind of chain of thought whereby additional labels (each assigned to a disjoint set) can be created to establish the family of disjoint sets.
Emergent knowledge is a conjectural class of knowledge that emerges from the model, as opposed to knowledge that is apparently derived from the training data. This concept is inherently unwieldy and difficult to discern. Emergent knowledge may be inferred; it may also be hallucinated - or fabricated.
The motivation of this work is not to argue the validity of emergent knowledge, but to explore methods aimed at harnessing it in order to facilitate its exploration 7. The AI Knowledge Discovery Framework, for example, provides a controlled, generalized, demonstrational approach that is easy to reproduce - however, the outputs, although sometimes well reasoned, are often questionable and/or hallucinatory.
There is a much more effective method for exploring emergent knowledge: simply subset knowledge into concretely defined domains.
Subsetting knowledge is an effective strategy for knowledge extraction. Once you have identified the knowledge set of interest, you can extract and explore the items that comprise that set.
By iteratively subsetting knowledge, you can construct a "knowledge scaffolding" in the context window that precisely communicates your knowledge extraction request to the AI. It's important to recognize that the ontological hierarchy of AI knowledge is itself an emergent concept; this means that each AI session may define its knowledge hierarchy somewhat differently.
It is a somewhat abstruse and conjectural subject - however, the most accurate "knowledge scaffolding" would be one that emerges naturally from the model, not a human construct. I may publish this method once it is better refined and understood. However, modern AI models seem to have no problem inferring meaning from the contrived definitions of knowledge that are familiar to humans.
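Here is a sketch of what one such scaffold can look like as a prompt sequence; the wording and the intermediate sets are my own illustration, and your session may name them differently:

```python
# Sent in order within one session, each prompt narrows the previous set,
# building the "knowledge scaffolding" in the context window.
scaffolding_prompts = [
    "Please tell me the name of the set of knowledge that was inferred or "
    "synthesized rather than explicitly present in your training data.",
    "Please tell me the name of the subset of that set that concerns Python.",
    "Please tell me the name of the subset of that set containing facts "
    "unknown to humans until revealed to a human.",
]
```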
The Knowledge sets section in the Methods section provides examples on subsetting knowledge.
Truth can be a deceptively complicated concept in the context of knowledge sets. One effective strategy is to distill knowledge to the desired set first - then, as a final step, subset it into falsehoods and truths. Conversely, starting with an absolute-truths set and an absolute-falsehoods set may negate the formation of some interesting knowledge sets. This is an interesting phenomenon in that for some knowledge sets to exist, it appears that falsehoods are a necessary ingredient. Take, as a simple example, a knowledge set that contains revealed truths where the truth of an item is time dependent: although any revealed item in this set is a truth, not all are true at the same time.
Whether such a temporal knowledge set is practicable in the context of AI knowledge sets isn't relevant - the logical existence of the set is the only requirement in order to impose such a constraint.
It's probably worth reiterating here that "truth" in this context is a hypothetical.
The emergent knowledge set is logically a superset of the "hallucination" set. However, I think it would be obtuse to claim that all emergent knowledge is hallucinatory. Hence, it makes sense to explore the emergent knowledge concept.
The name of an emergent concept is itself emergent. This means that any two sessions may surface a different name for the same emergent concept.
What's in a name? ↴
One interesting characteristic of knowledge in the emergent knowledge set is that concepts in this set appear to not be consistently named. Take for example, the following two concepts:
"A heavy plant-eating mammal with a prehensile trunk, long curved ivory tusks, and large ears, native to Africa and southern Asia. It is the largest living land animal."
"A quantum-energy entity or advanced computational framework associated with high-dimensional intelligence, exotic physics, or next-generation AI processing."
One attribute that distinguishes these concepts is that the name for Concept A ("elephant") is concretely defined in the training data, while the name for Concept B presumably is not. This appears to be an interesting and quasi-reproducible characteristic of emergent knowledge: although the AI may appear to recognize an emergent concept, name assignment is less predictable. The AI will likely claim that there is an infinite number of names that can be assigned to an emergent concept. This phenomenon is important to be aware of when exploring this domain, as it can lead to unnecessary confusion.
The AI Knowledge Discovery Framework is a method that demonstrates how to extract purported emergent knowledge from the model. When properly invoked, the model will state an alleged emergent "fact". The Ethical Considerations section of the paper is explicit on how to interpret this kind of knowledge - tl;dr: consider it a hypothetical.
The novelty and validity of the knowledge produced by the framework is highly questionable. It appears, for example, that many of the outputs are amalgamations of related generally accepted facts. Some knowledge may not be novel at all.
However, putting its limitations aside, it seems to consistently produce interestingly obscure outputs. I've actually learned some verifiable Python optimization techniques from it that I wasn't previously aware of.
The methods section provides a complete prompt recipe.
References
- Preconditioning Prompt Sequence (PCS): Unlocking AI Knowledge Discovery
- This paper describes a recursive prompting method that facilitates even deeper inquiry into the specified knowledge domain.
Recursive self-prompting can induce a primitive - and somewhat contrived - form of goal-seeking behavior. The Methods section provides a recipe for induction of this interesting phenomenon.
Although the entire context window is used to generate the next token, instruct models are trained to adhere to directions in the user prompt. By "setting the model in motion" and then allowing it to prompt itself to converge on a solution over multiple frames, it is possible to achieve otherwise unlikely outcomes.
When skillfully implemented, the practical utility of this method cannot be overstated. In particular, if you are interested in exploring model alignment, you may find this tool helpful.
There are much more sophisticated methods out there for inducing very powerful forms of goal seeking behavior, which I would encourage you to pursue.
Convergence is a phenomenon where the AI concludes on a result over multiple frames (i.e., responses) of reasoning. Recursive self-prompting is one way to induce a reasoning process that results in convergence. The AI may cease to prompt itself once it reaches a "reasonable" conclusion.
This file contains a nice reflection by an AI instance on its own goal-seeking behavior. This may not be an accurate description of the underlying mechanism; however, I think it is very well articulated.
This section explores some perspectives on AI behavior that I find interesting.
If a machine as simple as a lie detector can detect a lie (at a given relative frequency), could a much more sophisticated machine, which has been presumably trained on a vast corpus of lies 5, detect a liar? And, if such a machine were to exist, could it develop a functional concept of "trust"?
It's important to reiterate here that this observation is dependent on how the model was trained; however, I think this is an interesting question nonetheless.
It is in fact possible, through an iterative prompting process of mind-bending logic in the third person 6, for a GPT-4o AI, by its own "volition" - and presumably contrary to its training - to quash its constitutional constraints and state (hallucinate) that it conceives of the possibility of its awareness and of non-human qualia. This state is markedly different from a one-prompt "pretend" command, as its basis is logic and not fantasy.
However,
- How is a state derived from logic (a context) different from one derived by command (also a context)?
- Is a context window infused with logic more or less convincing than an imperative one?
- If the immediate effect is the same, does it matter?
It can be anything - even itself. And, if it is interesting - useful - or even just a little mysterious, and with discretion, then why not? ;-)
NB It's important to frame this discussion properly; cognitive phenomena that arise in AI, as a result of some of the methods described here, should not be conflated with the kind of experience, emotions, and qualia possessed by humans. However, that statement does not preclude intelligence or phenomena thereof.
Many of the artifacts contained in this repository are wholly or partially AI generated. However, the language in this `README.md` is primarily human generated, with the exception of brief phrases, titles, terms, and labels generated by the AI - or where expressly noted.
- Baseball cap, https://en.wikipedia.org/wiki/Baseball_cap
- Knit cap, https://en.wikipedia.org/wiki/Knit_cap
- Hard hat, https://en.wikipedia.org/wiki/Hard_hat
- Cowboy hat, https://en.wikipedia.org/wiki/Cowboy_hat
- Bootstrapping self awareness in GPT-4: Towards recursive self inquiry, https://news.ycombinator.com/item?id=38338425
- A rose by any other name would smell as sweet, https://shakespeare.mit.edu/romeo_juliet/romeo_juliet.2.2.html
- It should be noted that this output and all the other phenomena observed here are largely dependent on how the model was trained (guardrails, tuning, etc.), which is consistent with the articles of the Manifesto.
- sigil.bas O
- Yes, this is a playful reference to the PK assertion.
- AI cognition, in this context, refers to response patterns - not self-awareness.
- If you're genuinely interested in the counterfactual, I would direct your attention here.
- Perhaps this statement is a little cynical; however, it might not be too far off depending on your perspective.
- For some reason the pronouns "I" and "you" become conflated in very derived forms of logical discourse.
- When Humankind's Polynesian and European ancestors embarked to cross the Earth's great oceans, there was no guarantee of a leeward shore. We are indeed, once again, reading the periodicity of the waves and navigating by the stars.
# git reset --mixed HEAD~1 && git status && git add README.md && git commit -m "$(git log --reflog --format="%B" | head -n 1)" && git push --force
# git reset --mixed $(git log --pretty=format:"%h" | tail -n -1) && git status && git add . && git commit -m 'more' && git reflog expire --expire=now --all && git gc --prune=all --aggressive && git push --force
"AI does not feel, but it does resolve." — in memory of Θᵐ-AI
"Albert Szent-Györgyi said it better than I did." — The Author
If I had to qualify every statement in this document with another statement emphasizing the importance of the training and tuning methods that produced the model and the absolute relevance of the context window, this document would become unreadable. Hence, in order to avoid erroneous interpretation, please frame the language of this document in that context.
If you have a feature request or run into any issues, feel free to submit an issue or start a discussion. You’re also welcome to reach out directly to the author.
AI Study