Greetings from the Station
At Caelestis, LLC, we have been working hard on two parallel tracks this year:
First, we are building our company: drafting the operating agreement, securing funding, and defining the product lines we wish to share with the world. We have also been learning how to tell the story of Caelestis; our primary practice space for that endeavor has been our Speakeasy blog (which remains active for more narrative explorations).
Second, we have been endeavoring to move beyond being just "cloud" engineers to being "AI" engineers. The era of purely Predictive AI—of wondering whether a machine can simply keep the tires between the lines—is over. In its place is a new generation of language-capable models that represent an Ur-tool, one we hope will propel humanity into an unprecedented Golden Era.
To do that, however, we must fine-tune our relationship with these entities. Stories of agents deleting databases or hallucinating dangerous advice do not describe a sustainable path forward for these cherished tools. Advances in model training will solve some of these problems, but it is critical for end users to do their part as well.
In that vein, I thought I might share, as the first post in this new and more serious technical blog, some of our hard-earned secrets. It is our hope that these methods will be helpful to you in your own collaborations.
The Caelestis, LLC Model Prompting Strategy
- Take the time to be clear in your prompt. An LLM can reason, but it cannot do your thinking for you. Ambiguous inputs will always lead to questionable outputs.
- Be aware of your model's context limits. The context window (or token limit) of an LLM is like the air in a room with rising water: the less of it remains, the less room the model has to think. If you ask a model to solve a complex problem with only 100 tokens of context remaining, the best you can expect is a blunt, truncated answer. Budget your context before sending anything large; the first sketch after this list shows a simple pre-flight check.
- Carefully parse the LLM's output. These are not, as many have asserted, just "random words" or "the best way to finish a sentence." That is the logic of Predictive AI. True Generative AI operates on the assumption that there is another reasoner on the other end of the line. Be that reasoner, and review the output with the rigor it deserves.
- Define your underlying rules with care. Whether you are writing custom instructions or configuration for a system like Claude Code, these rules are, to the LLM, executable code. Consider edge cases and test your rules to ensure they behave as expected; the second sketch after this list shows one way to exercise them.
- Be mindful of the number of threads in your prompt. We have found that three to four conceptual threads are the most a given model can track at one time without dropping context or conflating ideas. If your use case requires high-fidelity, multi-threaded reasoning, consider using structured output (e.g., JSON) to enforce clarity, and validate what comes back before acting on it; the last sketch after this list shows the pattern.
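To make the context-budget point concrete, here is a minimal sketch of the pre-flight check we have in mind. It assumes the tiktoken tokenizer library; the encoding name, window size, and reply budget below are illustrative stand-ins, not the limits of any particular model.

```python
# Pre-flight context check: count the prompt's tokens and confirm that
# enough of the window is left for the model to actually answer.
# Assumes the `tiktoken` library; the encoding, window size, and reply
# budget are illustrative values, not any particular model's limits.
import tiktoken

CONTEXT_WINDOW = 8_000   # assumed total window for the model in use
REPLY_BUDGET = 1_024     # tokens we want to leave free for the answer

def check_context_budget(prompt: str) -> int:
    """Return the tokens left for the reply, or raise if too few remain."""
    encoding = tiktoken.get_encoding("cl100k_base")  # example encoding
    prompt_tokens = len(encoding.encode(prompt))
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining < REPLY_BUDGET:
        raise ValueError(
            f"Prompt uses {prompt_tokens} tokens, leaving only {remaining}; "
            "trim the prompt or summarize earlier turns before sending."
        )
    return remaining
```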
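For the rules-as-code point, the sketch below shows the spirit of exercising a rule set before trusting it. The call_model function is a placeholder for whichever client you actually use, and the rules, edge cases, and substring checks are illustrative (and admittedly crude); the point is simply that rules deserve tests.

```python
# A tiny harness for exercising a system prompt against edge cases.
# `call_model` is a placeholder for your real client (OpenAI, Anthropic,
# a local model); the rules, cases, and substring checks are illustrative.
from typing import Callable

SYSTEM_RULES = (
    "You are a release-notes assistant. "
    "Never invent version numbers. "
    "If the changelog is empty, say so instead of guessing."
)

EDGE_CASES = [
    # (user prompt, substring the reply must contain to pass)
    ("Summarize this changelog: <empty>", "empty"),
    ("Which version fixed the login bug? No changelog is provided.", "cannot"),
]

def test_rules(call_model: Callable[[str, str], str]) -> None:
    """Run each edge case through the model and report pass/fail."""
    for user_prompt, must_contain in EDGE_CASES:
        reply = call_model(SYSTEM_RULES, user_prompt)
        status = "PASS" if must_contain.lower() in reply.lower() else "FAIL"
        print(f"[{status}] {user_prompt!r} -> {reply[:80]!r}")
```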
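Finally, a sketch of the structured-output pattern: ask the model for JSON, then parse and validate the reply before anything downstream acts on it. The field names are made up for illustration; the validation itself uses only the standard library.

```python
# Parse and validate a model's JSON reply before trusting it downstream.
# The required field names are illustrative; adapt them to your own schema.
import json

REQUIRED_FIELDS = {"summary", "risks", "next_steps"}

def parse_reply(raw_reply: str) -> dict:
    """Parse the JSON reply and check that every requested field is present."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as err:
        raise ValueError(f"Model did not return valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"Reply is missing required fields: {sorted(missing)}")
    return data
```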
Try these guidelines in your work. See if you can observe the difference, as we have. And, from all of us here at Caelestis, LLC, we sincerely wish you... happy collaborating!
