Idea for an advanced code-optimization harness with an agent in the loop: the agent changes code, the code is built and benchmarked, and the results are fed back to the agent.
Overall design:
Metrics collection
- Metrics are streamed by appending to a CSV file
- First column is always the iteration count
- A column name can optionally indicate the direction of optimization by including the string “lower is better” or “higher is better”. If neither is present, the default is higher is better
- With multiple metrics, priority runs left to right: the leftmost metric is optimized ahead of the others
TODO: weighted metrics?
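A minimal sketch of how the harness could consume such a CSV, assuming the format above (function names and the sample data are illustrative, not part of the design):

```python
import csv
from io import StringIO

def parse_direction(column_name):
    """Default is 'higher is better' when no direction string is present."""
    return "lower" if "lower is better" in column_name else "higher"

def best_iteration(csv_text):
    """Return the iteration whose metrics are best.

    First column is the iteration count; remaining columns are metrics,
    prioritized left to right.
    """
    rows = list(csv.reader(StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    directions = [parse_direction(name) for name in header[1:]]

    # Build a sort key: negate "higher is better" metrics so that smaller
    # keys always mean better, preserving left-to-right priority via tuples.
    def key(row):
        return tuple(
            float(v) if d == "lower" else -float(v)
            for v, d in zip(row[1:], directions)
        )

    return min(data, key=key)[0]

sample = (
    "iteration,throughput,latency_ms (lower is better)\n"
    "1,120,35\n"
    "2,150,40\n"
    "3,150,31\n"
)
print(best_iteration(sample))  # "3": ties on throughput, wins on latency
```

Tuple comparison gives the left-to-right priority for free; a weighted scheme (the TODO above) would replace the tuple with a single scalar.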
Code access
The agent is configured to have read/write access only to a subset of the codebase.
The rest can be read-only or not accessible.
It should be possible to restrict to individual files, in addition to directories.
The configured restrictions are also injected into the system prompt, so the agent is “aware” of which files can be modified.
TODO: look into agent sandboxes, for example nono
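One way the access policy could be represented and checked, a sketch only: the path lists, tool name, and helper names below are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical policy: writable entries may be individual files or
# directory subtrees; everything readable-but-not-writable is read-only.
WRITABLE = [
    "src/hot_loop.c",   # individual file
    "src/kernels/",     # directory subtree
]
READABLE = ["src/", "include/"]

def _under(path, entry):
    """True if `path` equals `entry` or lies inside the `entry` subtree."""
    p = PurePosixPath(path)
    e = PurePosixPath(entry.rstrip("/"))
    return p == e or e in p.parents

def access(path):
    """Return 'rw', 'ro', or 'none' for a repo-relative path."""
    if any(_under(path, e) for e in WRITABLE):
        return "rw"
    if any(_under(path, e) for e in READABLE):
        return "ro"
    return "none"

print(access("src/kernels/gemm.c"))  # rw
print(access("src/main.c"))          # ro
print(access("secrets/key.pem"))     # none
```

The same structure could be serialized into the system prompt so the agent sees exactly the lists the enforcement code uses.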
Optimization tree
The optimization process starts from a baseline (the root of the tree) and generates branches as different options / “lines of inquiry” are explored
Each node of the tree stores inputs of the experiment (what code was changed, with a short description) and outputs (metrics and analysis of results)
Nodes are identified by dot-separated numeric IDs like 1.1 and 3.5.21
With the following meaning:
- First level is the optimization run. While exploring the solution space, we should be able to tweak optimization criteria and re-run the loop
- Every next level is the identifier of the branch taken
Example:
- 1 : baseline for the very first run
- 1.3 : experiment for the third independent change from the baseline
- 1.3.2 : a promising change on top of 1.3
- 5.1.23 : we tweaked the optimization criteria four times, so this is the fifth invocation of the optimizer. After the first change over that run’s baseline (5.1), we are now testing the 23rd variant on top of it
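The ID scheme above maps to a few small helpers; a sketch (function names are illustrative):

```python
def parse_node_id(node_id):
    """Split a dot-separated node ID into integer components."""
    return [int(part) for part in node_id.split(".")]

def run_number(node_id):
    """The first component identifies the optimization run."""
    return parse_node_id(node_id)[0]

def parent(node_id):
    """Parent node ID, or None for a run baseline (single component)."""
    parts = node_id.split(".")
    return ".".join(parts[:-1]) if len(parts) > 1 else None

def child(node_id, branch):
    """ID of the `branch`-th experiment under node_id."""
    return f"{node_id}.{branch}"

print(run_number("5.1.23"))  # 5
print(parent("5.1.23"))      # 5.1
print(child("1.3", 2))       # 1.3.2
```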
Code changes and revision control
TODO: investigate best way to track code changes
Options:
- in branches: the same tree structure defined above, with branch names that include node IDs (example: loopty-5.1.23)
- as diffs stored in each node (directory) of the optimization tree
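If the branch option is chosen, each experiment branches off its parent's branch; a sketch assuming a plain git repo (the `loopty` prefix follows the example above, the function names are hypothetical):

```python
import subprocess

def branch_name(node_id, prefix="loopty"):
    """Branch naming from the example above: loopty-5.1.23."""
    return f"{prefix}-{node_id}"

def create_experiment_branch(node_id, parent_id):
    """Create the branch for node_id, starting from its parent's branch."""
    subprocess.run(
        ["git", "checkout", "-b", branch_name(node_id), branch_name(parent_id)],
        check=True,
    )
```

The diff-per-node option avoids polluting the repo with hundreds of branches, but reconstructing a node's full state then requires replaying diffs from the root.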
Context management
Before every experiment, the tool builds context for the agent.
- Initial statement from the user, describing the problem
- Short descriptions of each code change, from the root down to the parent of the new experiment
- Short summaries of each result, along the same root-to-parent path
TODO: what if we provided summaries of all experiments, not just the current line of inquiry?
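The context assembly above can be sketched as follows, assuming nodes are kept in a mapping from node ID to its stored change description and result summary (the structure and sample data are illustrative):

```python
def lineage(node_id):
    """Node IDs from the run baseline down to node_id itself."""
    parts = node_id.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]

def build_context(problem_statement, parent_id, nodes):
    """Assemble the agent prompt for a new experiment under parent_id.

    `nodes` maps node ID -> {"change": short description, "result": summary}.
    """
    lines = [problem_statement, ""]
    for nid in lineage(parent_id):
        node = nodes[nid]
        lines.append(f"[{nid}] change: {node['change']}")
        lines.append(f"[{nid}] result: {node['result']}")
    return "\n".join(lines)

nodes = {
    "1": {"change": "baseline", "result": "120 req/s"},
    "1.3": {"change": "unrolled inner loop", "result": "150 req/s"},
}
print(build_context("Optimize request throughput.", "1.3", nodes))
```

The TODO variant would iterate over all of `nodes` instead of only `lineage(parent_id)`, trading context size for cross-branch awareness.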
Protections - rate limits and token budget
An unbounded loop that keeps consuming tokens could become very expensive under unexpected failure modes.
Include at least two protections:
- rate limit invocations of the main optimization loop
- token budget: the optimization run is terminated if a configured maximum number of tokens has been used across all agent invocations
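Both protections fit in a small guard object; a sketch with illustrative names and default limits:

```python
import time

class Budget:
    """Sketch of the two protections: call-rate limit and total token budget."""

    def __init__(self, min_seconds_between_calls=30, max_total_tokens=2_000_000):
        self.min_interval = min_seconds_between_calls
        self.max_tokens = max_total_tokens
        self.tokens_used = 0
        self._last_call = None

    def before_invocation(self):
        """Sleep if the loop is running faster than the configured rate."""
        now = time.monotonic()
        if self._last_call is not None:
            wait = self.min_interval - (now - self._last_call)
            if wait > 0:
                time.sleep(wait)
        self._last_call = time.monotonic()

    def record(self, tokens):
        """Account for tokens used; abort the run when the budget is exhausted."""
        self.tokens_used += tokens
        if self.tokens_used > self.max_tokens:
            raise RuntimeError(
                f"token budget exceeded: {self.tokens_used} > {self.max_tokens}"
            )
```

The main loop would call `before_invocation()` before each agent call and `record()` with the tokens reported afterwards; the raised error is the run-termination signal.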