Access to tools
Skills or MCP ?
https://david.coffee/i-still-prefer-mcp-over-skills/
Skills
Example of skill
https://github.com/retlehs/quien/blob/main/SKILL.md
Hooks
https://code.claude.com/docs/en/hooks
Software architecture
Resumable, cancellable token transport
https://zknill.io/posts/everyone-said-sse-token-streaming-was-easy/
Agent protocols
Overview of agent protocols to give access to existing systems:
https://arxiv.org/html/2504.16736v2
Model Context Protocol (MCP)
https://www.anthropic.com/news/model-context-protocol
https://modelcontextprotocol.io/
https://blog.sshh.io/p/everything-wrong-with-mcp
- Protocol security
- UI/UX Limitations
- LLM Security
- LLM Limitations
https://www.permit.io/blog/the-ultimate-guide-to-mcp-auth
- MCP authentication
Agent to Agent (A2A) protocol
https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
Apideck CLI
Code Mode / CodeAct
Instead of direct tool calling, the agent generates code (Python scripts) to execute actions:
https://openreview.net/forum?id=jJ9BoXAfFa
https://github.com/xingyaoww/code-act
https://blog.cloudflare.com/code-mode/
Research studies
Measure of autonomy
https://www.anthropic.com/research/measuring-agent-autonomy
Effectiveness of providing context
https://arxiv.org/abs/2602.11988
- AGENTS.md tend to be ineffective
Multi-agent configurations
UX challenges
https://justin.searls.co/posts/why-agents-are-bad-pair-programmers/
Infosec challenges
First, an analysis of (in)security of MCP implementations:
https://github.com/harishsg993010/damn-vulnerable-MCP-server
And a close look at the challenges related to choice of transport (stdio vs HTTP / SSE):
https://raz.sh/blog/2025-05-02_a_critical_look_at_mcp
https://forgecode.dev/blog/prevent-attacks-on-mcp/
Tools
Claude Code
Cheatsheet:
How Claude Code works in large code bases:
https://claude.com/blog/how-claude-code-works-in-large-codebases-best-practices-and-where-to-start
Claude Code + OpenSeek v4
https://github.com/aattaran/deepclaude
OpenAI Codex
OpenCode
https://github.com/anomalyco/opencode
https://jola.dev/posts/running-local-models-on-m4
Hermes Agent
https://github.com/nousresearch/hermes-agent
Pi
https://github.com/earendil-works/pi/tree/main/packages/coding-agent
zerostack
Minimalist coding agent written in Rust:
https://crates.io/crates/zerostack/1.0.0
- efficient
- supports multiple LLM providers
- optional sandboxing with
bubblewrap
MyManus
Open-source clone of Manus.ai:
https://github.com/emsi/MyManus/blob/master/prompts/prompt.md
OpenClaw
https://blog.nishantsoni.com/p/ive-seen-a-thousand-openclaw-deploys
- Challenges, limited use cases
- Long-running tasks, memory and context management
Shelley
https://github.com/boldsoftware/shelley/blob/main/ARCHITECTURE.md
Memory layer
https://extency.com/blog/markdown-versioned-folders-agent-brain-2026
Sandbox with previews
https://github.com/tastyeffectco/sandboxes
Agent harnesses and sandboxes
https://mendral.com/blog/agent-harness-belongs-outside-sandbox
Multi-platform sandbox
https://pierce.dev/notes/a-deep-dive-on-agent-sandboxes
Nono
https://github.com/always-further/nono
- Linux and Mac
- Designed for agent sandboxing
- flexible profiles
With high-quality Python bindings:
https://github.com/always-further/nono-py
Zeroboot
https://github.com/adammiribyan/zeroboot
Lima VM
Bubblewrap
https://github.com/containers/bubblewrap
Jai
Micro sandbox tor MCP
https://github.com/microsandbox/microsandbox
Terragon
To manage the work of multiple agents:
https://ymichael.com/2025/07/15/claude-code-unleashed.html
Provides seamless transition from background agents (running in the cloud) to local runs.
Details on the value of Claude Code
Code search
https://github.com/MinishLab/semble
Hands-on projects
Build an agent for long task planning
https://medium.com/@rogi23696/build-a-basic-ai-agent-from-scratch-long-task-planning-14e803f9bd6d
Code review agent with OpenCode
https://martinalderson.com/posts/using-opencode-in-cicd-for-ai-pull-request-reviews/
Developing with GH Copilot Agent Mode
https://austen.info/blog/github-copilot-agent-mcp/
Coding agent in Go
https://github.com/ghuntley/how-to-build-a-coding-agent
Agent to support investment decisions
https://github.com/lastmile-ai/mcp-agent/tree/main/examples/usecases/mcp_realtor_agent
To analyze huge amounts of logs
https://mendral.com/blog/frontier-model-lower-costs
- Low-cost model as triager
- Vector search
- Frontier model for analysis on subset
Building an agent / Harness engineering
https://sketch.dev/blog/agent-loop
https://ampcode.com/how-to-build-an-agent
Internals of Claude Code
APIs called from LLMs
Stripe
https://docs.stripe.com/building-with-llms
Tool calling
https://jngiam.bearblog.dev/mcp-large-data/
MCP + ollama
https://www.polarsparc.com/xhtml/MCP.html
Agent Development Kit (ADK)
https://developers.googleblog.com/en/agent-development-kit-easy-to-build-multi-agent-applications
12-factor agents
https://github.com/humanlayer/12-factor-agents
Ecosysyem
Large list of available MCP servers
https://github.com/modelcontextprotocol/servers
Research
Self-improving agents
Darwin Goedel Machines (DGMs)
https://arxiv.org/abs/2505.22954
https://richardcsuwandi.github.io/blog/2025/dgm/
See also
- Large Language Models for a deep dive into the models powering AI agents
- open weight models for fully-local agents