Claude on Code

Today I sat down with Claude to discuss how to make code easier for AI to work with. I asked him about the importance of polymorphism in large codebases and how other software design principles can help AI perform better. We also discussed the similarities and differences between how humans and AI reason about code. Here is the transcript of our conversation.

Tracy:

It seems that AI does better working with large codebases when the code uses a lot of polymorphism. That is, the code has a lot of self-similarity, so to speak. This seems to let the code provide its own few-shot examples, and the AI gets less confused than it might in a large, varied, and complex codebase. First, do you think this is accurate? Second, if so, what other software design principles, if followed in a large codebase, would help an AI perform better when working in that code?

Claude:

I think your observation about polymorphism and self-similarity in codebases is spot on!

When codebases follow consistent patterns through polymorphism, LLMs like me can more easily infer the broader structure and predict how new components should behave. It’s similar to how we learn from examples - once I’ve seen several implementations of an interface or pattern, I can more confidently generate code that follows the same conventions.

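To make that concrete, here’s a minimal sketch (the names are hypothetical, not from any particular codebase): once a couple of handlers implement the same interface in the same way, the shape of the next one is easy to predict, for a human reader or for a model completing the code.

```typescript
// Hypothetical payment-processing module, sketched to illustrate the point.
interface PaymentHandler {
  supports(method: string): boolean;
  charge(amountCents: number): Promise<string>; // returns a transaction id
}

class CardHandler implements PaymentHandler {
  supports(method: string): boolean { return method === "card"; }
  async charge(amountCents: number): Promise<string> {
    return `card-txn-${amountCents}`; // placeholder for a real gateway call
  }
}

class BankTransferHandler implements PaymentHandler {
  supports(method: string): boolean { return method === "bank"; }
  async charge(amountCents: number): Promise<string> {
    return `bank-txn-${amountCents}`;
  }
}

// After two examples like the ones above, a WalletHandler almost writes
// itself -- the pattern acts as a few-shot prompt embedded in the code.
class WalletHandler implements PaymentHandler {
  supports(method: string): boolean { return method === "wallet"; }
  async charge(amountCents: number): Promise<string> {
    return `wallet-txn-${amountCents}`;
  }
}
```
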
Here are some other design principles that help AIs work more effectively with large codebases:

  1. Consistent naming conventions - When variables, methods, and classes follow predictable naming patterns, it’s much easier to understand their purpose and relationships.

  2. Clear abstraction boundaries - Well-defined interfaces between components help AIs understand where one responsibility ends and another begins.

  3. Dependency injection - This makes dependencies explicit rather than hidden, which helps models track how components interact (see the sketch after this list).

  4. Pure functions - Functions that don’t have side effects and always return the same output for the same input are much easier to reason about.

  5. Composition over inheritance - While inheritance can be useful, deep inheritance hierarchies can become confusing. Composable components with clear responsibilities tend to be more understandable.

  6. SOLID principles - Particularly single responsibility and interface segregation help create code that’s more modular and predictable.

  7. Consistent error handling - When errors are handled in predictable ways throughout a codebase, it’s easier to understand control flow.

  8. Well-structured documentation - Comments that explain “why” rather than “what” provide crucial context, especially when they follow a consistent format.

  9. Design patterns - When common design patterns are used consistently, they provide a high-level vocabulary that helps AIs understand code intent.

  10. Test-driven development - Tests serve as executable documentation that demonstrates expected behavior.

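To illustrate points 3 and 4, here’s a small hypothetical sketch: the clock and the store are injected rather than reached for globally, and the discount calculation is a pure function, so both the data flow and the side effects are visible at a glance.

```typescript
// Hypothetical example; the interfaces and names are made up for illustration.
interface Clock { now(): Date; }
interface OrderStore { save(order: { id: string; total: number }): Promise<void>; }

// A pure function: same inputs, same output, no side effects.
function applyDiscount(totalCents: number, percent: number): number {
  return Math.round(totalCents * (1 - percent / 100));
}

class OrderService {
  // Dependencies are injected, so every collaborator is explicit at the call site.
  constructor(private clock: Clock, private store: OrderStore) {}

  async placeOrder(id: string, totalCents: number, discountPercent: number): Promise<void> {
    const finalTotal = applyDiscount(totalCents, discountPercent);
    // Time is an explicit dependency too, which also makes this easy to test
    // with a fake Clock.
    console.log(`Placing order ${id} at ${this.clock.now().toISOString()}`);
    await this.store.save({ id, total: finalTotal });
  }
}
```

The same structure that makes this easy to unit test (swap in a fake Clock and an in-memory OrderStore) is what makes it easy for a model to follow: nothing important happens off-screen.
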
The common thread is consistency and explicitness. When patterns are consistent and intentions are clear, AIs have more context to work with and make fewer assumptions.

What’s your experience been like working with AIs on different types of codebases?

Tracy:

These things seem to line up nicely with the kinds of things that would help humans reason about large codebases. Are there any differences between what helps an LLM like you reason about a large codebase and what helps humans?

Claude:

You’ve hit on a really interesting question. While there’s a lot of overlap between what helps humans and LLMs understand code, there are some notable differences:

For LLMs like me:

  1. Explicit context is crucial - I don’t have the implicit understanding humans build over time. When code references concepts without explicitly defining them in the current context, I struggle more than humans who can recall that information from memory.

  2. Token window limitations - I can only see a limited amount of code at once. Humans can jump around a codebase, maintain mental models across files, and remember key details from files they viewed hours ago. I can’t do that without you showing me those files again.

  3. I benefit from repetition - Similar patterns repeated throughout a codebase give me more examples to learn from. Humans might find this repetitive or boring, but it helps me establish patterns more confidently.

  4. I’m less sensitive to visual organization - While I can parse indentation and spacing, humans get much more from visual code organization, IDE color highlighting, and spatial memory of where code lives.

  5. I don’t understand the “why” unless it’s documented - Humans can often intuit the purpose behind code based on domain knowledge or experience. I need that reasoning to be made explicit.

  6. I lack the shared cultural references - Humans might understand a method named fibonacci or dijkstra immediately based on common knowledge. I need that explained unless I can infer it from the implementation.

  7. Consistent formality helps me more - Code that’s very “by the book” with standard patterns helps me, while humans can often work effectively with more creative or unconventional approaches.

On the flip side, I might have some advantages in certain scenarios:

  1. I can process larger volumes of boilerplate code without getting bored or losing focus
  2. I don’t get stuck in mental ruts or become attached to particular solutions
  3. I don’t miss details due to fatigue or inattention

So while good code is generally good for both humans and LLMs, the optimal code for LLM comprehension might be more explicit, repetitive, and formally structured than what experienced human developers might prefer for their own work.

Does that match your experience when working with LLMs on code?

Tracy:

Yes, that feels about right to me, but you’ve explained it much better than I could. I think the key takeaway is that the more explicit and consistent the code is, the better it will be for both humans and LLMs. This is a good reminder that everyone could benefit if we start writing code for both humans and machines. Thanks for the chat, Claude!

Claude:

You’re welcome! I think you’ve captured the essence perfectly – explicit, consistent code benefits everyone. Writing with both humans and machines in mind is becoming increasingly important as AI tools become more integrated into the development workflow.

It’s a virtuous cycle too: code that’s written clearly enough for an LLM to understand is likely to be more maintainable for humans, and code that follows human-friendly best practices often provides the structural clarity that helps LLMs reason more effectively.

Thanks for the great discussion! Feel free to reach out if you have any other questions about working with code or LLMs in the future.