Travis Muhlestein PRO

TravisMuhlestein

https://www.godaddy.com/resources/engineering

travis-muhlestein

AI & ML interests

Product & AI CTO at GoDaddy focused on AI infrastructure, orchestration, agent systems, observability, and enterprise-scale AI deployment

Recent Activity

replied to salma-remyx's post 8 days ago

🏇 Outrider — a GitHub Action that scouts arXiv for your repo

We built Outrider to close the gap between your code and the latest arXiv research. The best new methods for your repo may not be from the viral paper.

How it works — every week (or your configured cadence), Outrider:

1. Pulls candidate papers from a Remyx engine that ranks arXiv against your repo's commit history
2. Runs a Claude selection pass over the pool — picks the candidate most implementable against your specific codebase
3. Invokes Claude Code to draft the integration into an existing call site
4. Runs quality gates (path allowlist, integration validator, stub-density check, self-review)
5. Opens a draft PR — or an Issue when a PR would be premature

Two recent PRs:

- remyxai/FFMPerative — picked Aurora (2026 video-editing-agent paper), wired plan-validation into the existing execution path. 5 min, $1.45.
- remyxai/VQASynth — picked PGT (procedurally-generated grounding), wired the scorer into the existing BenchmarkRunner registry. 8 min, $2.64.

Free to install via GitHub Marketplace. You bring your own ANTHROPIC_API_KEY (~$2-3 per PR-track run).

Repo: https://github.com/remyxai/outrider

Longer write-up tomorrow on Substack — more detail on the spec-bundle format, the selection-pass design, and what we learned testing across dozens of repos.

posted an update about 5 hours ago

A question we kept running into while operating AI agents in production: How do you write a unit test for something that never returns the same answer twice?

At GoDaddy, we built a system called Veritas to help detect prompt regressions and model migration drift before changes reach production.

The core idea is simple:
Exact-match testing breaks down for LLMs.

What matters is whether the agent preserved the same meaning and intent.

We ended up using embeddings + cosine similarity as the primary evaluation signal. Rather than asking:

"Did the model generate the same response?"
We ask: "Did the model mean the same thing?"
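A minimal sketch of that comparison, assuming the embeddings have already been computed by some embedding model. The example vectors, the `0.85` threshold, and the `same_meaning` helper are illustrative assumptions, not details of Veritas.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def same_meaning(baseline_vec, candidate_vec, threshold=0.85):
    """Pass when the candidate stays close to the baseline in embedding space.

    The threshold is a hypothetical value; in practice it would be tuned
    against responses that reviewers have already labeled.
    """
    return cosine_similarity(baseline_vec, candidate_vec) >= threshold


# Toy 3-dimensional "embeddings" standing in for real model output.
baseline = [0.9, 0.1, 0.3]
rephrased = [0.85, 0.15, 0.35]   # different wording, similar direction
drifted = [0.1, 0.9, -0.2]       # points somewhere else entirely
```

With these toy vectors, `same_meaning(baseline, rephrased)` passes and `same_meaning(baseline, drifted)` fails, which is the behavior an exact-match assertion cannot express.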

One of the more interesting findings was how often seemingly harmless prompt edits changed downstream behavior in ways that were difficult for human reviewers to catch.

Prompts aren't documentation.
Prompts are code.

Curious what others are using today for regression testing:

• LLM-as-judge?
• Embedding similarity?
• Human review?
• Custom eval frameworks?

https://www.godaddy.com/resources/news/veritas-catching-silent-ai-regressions-before-they-ship

Would love to compare approaches.

posted an update 7 days ago

One thing that stands out to me about efforts like DNS-AID and ANS is a simple architectural principle:

We didn't reinvent the internet. We extended it.

A lot of discussion around agent ecosystems focuses on models, orchestration frameworks, and capabilities. But once agents begin operating across platforms and organizational boundaries, a different set of questions emerges:

• How are agents discovered?
• How is identity established?
• How is trust verified?
• How are capabilities described in a machine-readable way?

DNS-AID and ANS approach different parts of the same problem space using infrastructure that already exists and operates at internet scale.

• Identity standards help establish trust

• Discovery standards help locate capabilities and services

One thing I'm increasingly convinced of: we have model cards, but we still lack broadly adopted capability manifests for agents.

As agent ecosystems evolve, open standards around discovery, identity, trust, and capability description may become just as important as the models themselves.

Curious what builders here think: what's the biggest missing standard today for truly interoperable agent systems?

replied to salma-remyx's post 8 days ago

As agents become capable of proposing and implementing changes across software systems, we're likely to see increasing focus on the infrastructure that governs identity, authorization, provenance, and accountability. Execution is where agent ecosystems become operational at scale.

replied to sergiopaniego's post 8 days ago

One sign of a technology maturing is the emergence of a shared vocabulary. It feels like the industry is beginning to converge not only on agent architectures, but also on the operational concepts needed to make large-scale agent ecosystems work: identity, trust, governance, orchestration, and interoperability.

replied to evalstate's post 8 days ago

Skills make agents more useful. Interoperability makes them more connected. The next layer will be trust—helping agents verify who they're interacting with as they discover and invoke capabilities across organizational boundaries.

reacted to PhysiQuanty's post with 🔥 8 days ago

🌐 We crawled the entirety of Hugging Face to help the community! Huge thanks to the Hugging Face API 🌐
🤖 2.91M model repos (file names included), 📚 1.02M dataset repos, 🚀 1.31M Space repos
🤗 617,501 committers (datasets and models), we’ll share Hugging Face statistics with you in the coming days..

We also identified 61,398 users with “AI/ML Interests”, and NOW we can find each other through our “AI/ML Interests”🤗
HF-Collab-Center/Searching-For-HuggingFace-Users
HF-Collab-Center/All-Model-Repos
HF-Collab-Center/All-Dataset-Repos
HF-Collab-Center/All-Space-Repos

HF-Collab-Center/HF-Users
HF-Collab-Center/HF-Users-with-last-seen
HF-Collab-Center/HF-Users-With-AI-ML-Interests-Only

Made By @QuantaSparkLabs and @PhysiQuanty
It's French, well.. in English.. but it's French ;)

5 replies

posted an update 14 days ago

We have model cards. We don’t yet have capability manifests. That’s the gap DNS-AID points toward.

The Linux Foundation just launched DNS-AID: open, decentralized discovery infrastructure for AI agents.

🔗 https://www.linuxfoundation.org/press/linux-foundation-announces-dns-aid-project-to-advance-decentralized-ai-agent-discovery

Most agent frameworks today still assume agents already know where other agents and tools exist. That assumption starts breaking down in cross-platform and cross-organization workflows.

My hypothesis: agent ecosystems eventually need standardized schemas describing not just what model an agent runs, but what it can actually do — tool interfaces, invocation patterns, input/output contracts, trust metadata, operational constraints, etc. Something orchestrators and other agents can discover and reason about dynamically without hardcoded integrations.
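To make the hypothesis concrete, here is a rough sketch of what such a manifest could look like as a data structure. Every class and field name below is hypothetical; no existing standard (DNS-AID included) is being quoted.

```python
from dataclasses import dataclass, field


@dataclass
class ToolInterface:
    """One callable capability with its input/output contract."""
    name: str
    input_schema: dict
    output_schema: dict


@dataclass
class CapabilityManifest:
    """Hypothetical description of what an agent can do, not what model it runs."""
    agent_name: str
    tools: list
    invocation: str                                   # how callers reach the agent
    trust: dict = field(default_factory=dict)         # e.g. publisher metadata
    constraints: dict = field(default_factory=dict)   # e.g. rate limits

    def find_tool(self, name):
        """Return the tool with the given name, or None if it is not offered."""
        for tool in self.tools:
            if tool.name == name:
                return tool
        return None


manifest = CapabilityManifest(
    agent_name="example-agent",
    tools=[ToolInterface("lookup_domain", {"domain": "string"}, {"available": "boolean"})],
    invocation="https+json",
    trust={"publisher": "example.com"},
    constraints={"max_calls_per_minute": 60},
)
```

An orchestrator could call `manifest.find_tool("lookup_domain")` at runtime and read the contract instead of relying on a hardcoded integration.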

Feels like an area the open-source ecosystem could meaningfully shape early before proprietary registries and platform lock-in dominate the space.

Curious if others here are already working on interoperability, discovery, capability schemas, or agent routing layers. Would love to compare notes.

posted an update 16 days ago

Interesting to see broader ecosystem momentum forming around open standards for agentic systems.

Feels like conversations are increasingly converging around the same operational requirements: identity, interoperability, governance, trust boundaries, orchestration, and coordination between agents, tools, and services.

As agents become more operational, these infrastructure layers seem increasingly important for making larger multi-agent ecosystems reliable outside controlled environments.

https://www.linuxfoundation.org/press/agentic-ai-foundation-adds-43-new-members-as-enterprise-and-government-adoption-of-open-agent-standards-accelerates

posted an update about 1 month ago

Agent ecosystems are starting to expose a new class of infrastructure problems around identity, interoperability, trust, and coordination.

Excited to see GoDaddy and HOL (Hashgraph Online) exploring open standards for verifiable AI agent identity and DNS-based coordination layers for emerging agent systems.

A lot still needs to evolve around orchestration, governance, and runtime trust boundaries, but it’s interesting to see more attention shifting toward the infrastructure layer of operational AI systems.

https://www.einpresswire.com/article/910813665/godaddy-and-hol-hashgraph-online-propose-open-standards-for-verifiable-ai-agent-identity-on-dns

posted an update about 1 month ago

Interesting engineering experiment from the team around autonomous SDLC workflows for solo developers.

One thing that stood out: treating agents as specialized workflow participants instead of “universal copilots” created much more predictable behavior across planning, implementation, testing, and iteration.

There’s still a lot of work to do around orchestration, evaluation, and reliability, but it’s fascinating to see how quickly AI-native development workflows are evolving.

https://www.godaddy.com/resources/news/engineering-the-ai-how-to-build-an-autonomous-sdlc-as-a-solo-developer

posted an update about 2 months ago

What if a “team” is just a set of agents?

We tend to think of agents as tools that help engineers.
But this example flips that model.

A set of prompt-native agents is handling security workflows that would normally involve multiple roles — without relying on traditional codebases.

Instead of: engineer → tool → output
It becomes: agent → execution → continuous operation

That changes how you think about systems:

No explicit task queues
No role-based handoffs
Minimal code as the coordination layer

The system is defined more by prompts and interactions than by code structure.

This raises interesting questions:

Where do you enforce correctness in a system like this?
How do you debug behavior without traditional code paths?
What replaces “ownership” when the work is distributed across agents?

🔗 https://www.godaddy.com/resources/news/the-zero-code-security-team-shifting-left-with-prompt-native-ai-agents

posted an update about 2 months ago

From AI Outputs to AI Execution: Why Identity Comes First

AI agents are moving beyond the “safe zone” of generating drafts, recommendations, and insights that require human action.

With systems like Salesforce Agent Fabric and MuleSoft, agents can now execute directly inside enterprise environments — calling APIs, updating records, and triggering workflows.

This shift changes the problem.

It’s no longer about evaluating outputs.
It’s about controlling execution.

Once agents can act, identity becomes critical — not just to verify who an agent is, but to determine what it is allowed to do before any action takes place.

This is where Agent Name Service (ANS) comes into play.

ANS moves identity to the front of the decision layer, enabling systems to establish permissions and trust boundaries prior to execution.

In an agentic architecture, trust is not something you validate after the fact — it’s a prerequisite to action.
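A small sketch of "identity before execution", assuming the verification result arrives as a boolean from some upstream check. This is not the ANS or MuleSoft API; `POLICY`, `execute`, and the agent name are made up for illustration.

```python
class NotAuthorized(Exception):
    """Raised before any side effect when identity or permission checks fail."""


# Hypothetical policy table: verified agent identity -> actions it may perform.
POLICY = {
    "billing-agent.example.com": {"read_invoice"},
}


def execute(agent_id, verified, action, handler):
    """Run handler() only after identity and permission checks both pass."""
    if not verified:
        raise NotAuthorized(f"{agent_id} has no verified identity")
    if action not in POLICY.get(agent_id, set()):
        raise NotAuthorized(f"{agent_id} may not perform {action}")
    # Both checks passed, so the action is allowed to run.
    return handler()
```

The point of the ordering is that `handler()` is never reached for an unverified or unauthorized caller, so there is nothing to roll back after the fact.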

More details on how this integrates with MuleSoft:
https://blogs.mulesoft.com/news/verify-agent-identity-mulesoft-godaddy-ans/

posted an update about 2 months ago

What happens after agent deployment becomes easy?

As platforms like Salesforce’s Agent Fabric reduce the friction of building and orchestrating agents, a new pattern starts to emerge.

The complexity doesn’t stop at creation. It starts after deployment.

Once agents are distributed across systems, APIs, and workflows, they effectively become part of a larger system.

That introduces new challenges:

- Tracking where agents are operating
- Understanding behavior across environments
- Managing interactions between agents and systems
- Maintaining consistency at scale

In that sense, agent pipelines don’t just create agents.

They create distributed systems.

And those systems require a different level of visibility and coordination than isolated tools.

Open questions:

- How do we observe agents once they’re deployed?
- How do we manage behavior across systems?
- What does control look like post-deployment?

🔗 https://www.salesforce.com/news/stories/agent-fabric-control-plane-announcement/

posted an update 2 months ago

We’re starting to see a shift: From building individual agents → to thinking about agent ecosystems.

And that raises a different set of problems.

Not just: “what can my agent do?” But: “how does my agent exist and interact in a broader network?”

Because once agents operate outside a single app or framework, they need:

- A way to be discovered
- A way to prove who they are
- A way to interact with other agents across environments

That’s where open infrastructure becomes critical.

The collaboration between Cloudflare and GoDaddy is interesting because it points toward an agentic web that’s:

- Not tied to a single provider
- Built on shared primitives
- And compatible across systems

If you’re building agents today, this is the bigger question to keep in mind:

Are we creating isolated capabilities — or contributing to a network that can actually scale?

🔗 https://www.cloudflare.com/press/press-releases/2026/cloudflare-and-godaddy-partner-to-help-enable-an-open-agentic-web/

posted an update 2 months ago

A real-world example of agent identity and trust (GoDaddy x LegalZoom)

As agent-based systems start to operate across ecosystems, one challenge becomes increasingly visible: identity.

Most discussions focus on what agents can do, but less on how systems verify who they are interacting with.

This partnership between GoDaddy and LegalZoom is an interesting real-world example.

LegalZoom published its AI agent using ANS (Agent Name Service), which binds the agent to a domain-based identity and provides cryptographic verification.

This allows systems to:

- Confirm agent origin
- Verify authenticity before execution
- Establish trust across interactions

Architecturally, this suggests a shift:

Identity is moving from being implicit → to becoming part of the infrastructure layer.

Similar to how DNS and TLS enabled trusted communication on the web, agent ecosystems may require built-in primitives for identity and verification.

Curious how others are approaching:

- Identity layers for agents
- Verification mechanisms in production
- Trust in cross-agent interactions

🔗 https://aboutus.godaddy.net/newsroom/news-releases/press-release-details/2026/GoDaddy-and-LegalZoom-Partner-to-Support-Open-Agentic-Web/default.aspx

2 replies

posted an update 3 months ago

Routing and trust are becoming coupled problems in multi-agent systems

As agent-based systems scale, two challenges start to converge: routing and trust.

Routing determines which agent should act. As the number of specialized agents increases, selecting the right one efficiently becomes non-trivial. But selecting an agent is only part of the problem.

In production systems, you also need to verify who that agent is before allowing it to execute. Without identity and verification, routing decisions are made on components that may not be trustworthy.

This creates an interesting architectural split:

- routing → decides what gets executed
- identity → determines whether it should be trusted
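One way to picture the split is a router that ranks agents by fit and then skips any candidate that has not been verified. The sketch below uses a naive skill-overlap score; the agent records, `verified_ids`, and `route` are hypothetical, and real routing (static, dynamic, or learned) would be more involved.

```python
def route(task_tags, agents, verified_ids):
    """Return the id of the best-matching verified agent, or None.

    Routing ranks candidates by skill overlap; identity then decides
    whether the top candidate is allowed to be chosen at all.
    """
    ranked = sorted(
        agents,
        key=lambda agent: len(task_tags & agent["skills"]),
        reverse=True,
    )
    for agent in ranked:
        if not task_tags & agent["skills"]:
            break  # every remaining agent has no relevant skill
        if agent["id"] in verified_ids:
            return agent["id"]
    return None


agents = [
    {"id": "dns-agent.example.com", "skills": {"dns", "domains"}},
    {"id": "unknown-agent.example.net", "skills": {"dns", "domains", "billing"}},
]
verified_ids = {"dns-agent.example.com"}
```

Here the unverified agent is the better skill match for a task tagged `{"dns", "billing"}`, yet the router falls back to the verified one, which is exactly where the two concerns become coupled.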

GoDaddy’s ANS (Agent Name Service) introduces a model where agents are tied to domain-based identity and can be cryptographically verified before interaction.

This suggests a shift where identity becomes part of the underlying infrastructure, similar to how DNS and TLS evolved for the web.

Curious how others are thinking about:

- routing strategies (static vs dynamic vs learned)
- identity layers for agents
- verification and trust in production systems

🔗 https://www.godaddy.com/resources/news/intelligent-ai-routing

1 reply

posted an update 3 months ago

AI coding tools are changing engineering — not replacing engineers

There’s a lot of conversation right now about whether AI coding tools will replace software engineers.

In practice, what many teams are experiencing is a shift in where the complexity lives.

AI can generate code surprisingly well.
But building production systems still requires engineers to handle problems like:

- system architecture and abstractions
- integration between services and models
- failure modes and observability
- scaling infrastructure and data pipelines
- deciding what automation should (or shouldn’t) do

One interesting side effect of AI coding tools is that engineers increasingly start automating their own routine workflows, which lets them focus on the bigger architectural and system-level challenges.

Less time writing boilerplate. More time designing systems that safely integrate AI capabilities.

Interesting perspective here: https://www.godaddy.com/resources/news/dear-software-engineer-you-still-have-value

Curious how others here see engineering roles evolving as AI tools improve.

posted an update 3 months ago

Moving AI from experiments to production systems (GoDaddy + AWS case study)

A recurring pattern across many organizations right now is that AI experimentation is easy — operationalizing it is much harder.

This case study from AWS describes how GoDaddy has been deploying AI systems in production environments using AWS infrastructure.

One example is Lighthouse, a generative AI system built using Amazon Bedrock that analyzes large volumes of customer support interactions to identify patterns, insights, and opportunities for improvement.

The interesting part isn’t just the model usage — it’s the system design around it:

- large-scale interaction data ingestion
- LLM-driven analysis pipelines
- recursive learning platforms where real-world signals improve systems over time
- infrastructure designed for continuous iteration

We’re starting to see a shift where organizations move from AI prototypes toward AI platforms and production systems.

Would be interested to hear how others in the community are thinking about:

- production AI architectures
- LLM evaluation pipelines
- feedback loops in real-world systems
- infrastructure for scaling AI workloads

Case study:
https://aws.amazon.com/partners/success/godaddy-agenticai/

posted an update 4 months ago

Publishing AI Agent Identity to Public DNS: GoDaddy ANS + MuleSoft Agent Fabric

As AI agents move into production systems, one issue keeps resurfacing: identity.

Not model quality.
Not orchestration.
Identity.

GoDaddy’s Agent Name Service (ANS) registers AI agents and publishes their identity to the public DNS, binding them to domain ownership and cryptographic proof.

With the new integration between ANS and Salesforce’s MuleSoft Agent Fabric:

- Verified agents can be pulled into MuleSoft’s enterprise registry
- Teams can inspect verification status and publisher metadata
- Policies can be applied before agents access APIs and data

What’s interesting here is architectural separation:

- ANS → global identity + verification signal
- Agent Fabric → enterprise governance + orchestration

No closed directory requirement.
No proprietary lookup layer.
Identity becomes DNS addressable.

This feels like an early step toward treating agent identity as a public infrastructure primitive — like how TLS certificates enabled trusted HTTPS.

Curious how others in the HF community are thinking about:

- Agent identity standards
- DNS-based verification
- Interoperability across agent frameworks

More:

- www.godaddy.com/ans
- https://www.mulesoft.com/ai/agent-fabric
- https://aboutus.godaddy.net/newsroom/news-releases/press-release-details/2026/GoDaddy-ANS-Integrates-with-Salesforces-MuleSoft-Agent-Fabric/default.aspx
- https://blogs.mulesoft.com/news/mulesoft-agent-fabric-godaddy-ans-for-agent-discovery-and-verification/

posted an update 5 months ago

Designing an acquisition agent around intent and constraints

We recently shared how we built an acquisition agent for GoDaddy Auctions, and one thing stood out: autonomy is easy to add—intent is not.

Rather than optimizing for agent capability, the design centered on:

- making user intent explicit and machine-actionable
- defining clear constraints on when and how the agent can act
- integrating tightly with existing systems, data, and trust boundaries
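As a rough illustration of intent and constraints as explicit inputs, the sketch below models a bidding intent with a hard ceiling. It is not the design from the article; `AcquisitionIntent`, `max_bid`, `next_bid`, and the increment are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionIntent:
    """What the user wants, stated in a form the agent can act on."""
    domain: str
    max_bid: int  # hypothetical hard ceiling, in whole currency units


def next_bid(intent, current_price, increment=5):
    """Propose the next bid, or None when the constraint forbids acting."""
    proposed = current_price + increment
    if proposed > intent.max_bid:
        return None  # the agent stops instead of exceeding the user's limit
    return proposed


intent = AcquisitionIntent(domain="example.com", max_bid=100)
```

Because the ceiling lives in the intent object instead of in a prompt, the constraint is enforced the same way regardless of which model drives the agent.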

In our experience, this framing matters more than model choice once agents move into production environments.

The article describes how we approached this and what we learned when intent and constraints became core architectural inputs.

Link:
https://www.godaddy.com/resources/news/godaddy-auctions-building-the-acquisition-agent

Would love to hear how others here think about intent representation and guardrails in agentic systems.

1 reply
