Last Updated: April 14, 2026 at 15:30
Microservices Anti-Patterns: The Mistakes That Will Hurt You
What not to do when designing microservices — and how to recognise the warning signs before they become critical
Shared databases, chatty services, distributed monoliths — these anti-patterns have broken countless microservices projects. This guide walks you through the most common mistakes developers make, how to detect them early, and how to correct course before the damage becomes permanent. You do not need to have built microservices already to benefit from this. Learning what not to do before you start is far cheaper than learning it after you have shipped.

What you will discover in this guide
By the end of this article, you will understand:
- Why shared databases are the fastest path to a distributed monolith
- How to recognise chatty services before they kill your performance
- When services become too fine-grained — and why that hurts more than too coarse
- What a distributed monolith actually looks like and how to escape one
- How to handle data consistency across service boundaries
- API versioning strategies that prevent deployment chaos
- Organisational anti-patterns that no technology can fix
- How to detect each anti-pattern using simple, everyday observations
Anti-pattern 1: The shared database
What it looks like
Multiple services read and write to the same database. Your checkout service, inventory service, and user service all connect to the same ecommerce_db. They might use different tables, or worse, they might share the same tables.
At first everything works. Cross-service queries are easy because you can join tables. No data duplication. No eventual consistency headaches. Shipping is fast.
Why it becomes a problem
The problems emerge slowly, then all at once.
Hidden coupling. The checkout service changes a table schema. The inventory service uses that table. The inventory service breaks. Neither team realised they were coupled because they never discussed the shared schema. The coupling is invisible until something snaps.
Deployment coordination. A database migration that changes a shared table forces you to deploy all services that touch that table at the same time. Independent deployability disappears. You are back to the coordinated release ceremonies of a monolith.
Scaling conflicts. The checkout service needs more database connections during a sale. The inventory service needs more for a batch job. They compete for the same connection pool. You cannot tune the database for both workloads simultaneously.
Team bottlenecks. One team owns the schema, or worse, no team does. Changes require sign-off from a central database team. Every team is waiting on everyone else.
How to detect it
Watch for these warning signs:
- Two or more services connect to the same database instance
- Services share database credentials
- A single migration requires changes to multiple services
- One service's slow query degrades another service's performance
- There is a "database team" that everyone else must beg for changes
How to correct it
The fix is painful but necessary. You need to split the shared database.
Step 1. Identify which tables genuinely belong to which service. This might be obvious or might require careful analysis.
Step 2. For tables that belong to one service but are read by others, create APIs for those reads. Other services must call the API instead of querying the table directly.
Step 3. For tables that feel truly shared, you likely have a bounded context problem — two domains that were merged when they should have been separate. Split them.
Step 4. Migrate each service to its own database. Run both databases in parallel during the transition using data synchronisation scripts.
Step 5. Remove the shared database access entirely. Update connection strings. Deploy each service independently.
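To make step 4 concrete, here is a minimal sketch of a one-way synchronisation script that keeps a new per-service database in step with the shared one during the transition. The table name, columns, and in-memory SQLite databases are illustrative assumptions, not part of any real schema; a production version would run incrementally and handle deletes and conflicts.

```python
import sqlite3

# Hypothetical setup for illustration: the shared ecommerce_db and the new
# inventory-only database being split out. Both use in-memory SQLite here.
shared_db = sqlite3.connect(":memory:")
inventory_db = sqlite3.connect(":memory:")

shared_db.execute("CREATE TABLE stock (product_id TEXT PRIMARY KEY, quantity INTEGER)")
shared_db.executemany("INSERT INTO stock VALUES (?, ?)", [("p1", 10), ("p2", 3)])
shared_db.commit()

inventory_db.execute("CREATE TABLE stock (product_id TEXT PRIMARY KEY, quantity INTEGER)")

def sync_stock(source, target):
    """One-way sync: copy every stock row from the shared DB into the new DB.
    Run repeatedly (e.g. on a schedule) while both databases are live."""
    rows = source.execute("SELECT product_id, quantity FROM stock").fetchall()
    target.executemany(
        "INSERT OR REPLACE INTO stock (product_id, quantity) VALUES (?, ?)", rows)
    target.commit()

sync_stock(shared_db, inventory_db)
copied = inventory_db.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
```

Once consumers have migrated to the new database (and the API from step 2), the sync script and the shared tables can both be retired.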
This is a significant project. That is exactly why you want to avoid shared databases from the start.
Anti-pattern 2: Chatty services
What it looks like
You design services that are cleanly separated by business capability. But then you discover that a single operation requires dozens of network calls.
Consider a typical e-commerce checkout flow. The checkout service needs to:
- Call the user service for customer details
- Call the inventory service for product availability
- Call the pricing service for discounts
- Call the payment service for authorisation
- Call the shipping service for rates
- Call the tax service for calculations
- Call the fraud service for risk assessment
- Call the notification service for order confirmation
Each call is a network round trip. If each takes 50 milliseconds, that is 400 milliseconds of pure latency before any real work begins. Add processing time and your checkout flow takes over a second — for a flow that should feel instant.
Why it becomes a problem
Latency accumulates. Unlike function calls that take nanoseconds, network calls are measured in milliseconds. Twenty calls means real, user-visible seconds.
Failures multiply. Each network call can fail independently. With eight calls in a chain, the probability of at least one failure climbs dramatically. You need retry logic, timeout handling, and fallback behaviour for every single call.
Debugging becomes a nightmare. A slow checkout could be caused by any of the eight downstream services. Without distributed tracing — which is not trivial to set up — you are hunting blind.
Temporal coupling locks you in. The checkout service cannot proceed until all eight services respond. If any single service is slow, the entire flow slows down. The weakest link sets the pace.
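The failure arithmetic is worth seeing concretely. If each call in a chain succeeds independently with some probability, the whole chain succeeds only when every call does, so reliability decays exponentially with chain length:

```python
# Chance that an entire chain of n sequential calls succeeds, assuming each
# call succeeds independently with the same probability.
def chain_success_rate(per_call_success: float, n_calls: int) -> float:
    return per_call_success ** n_calls

# Even with 99%-reliable calls, a ten-call chain fails almost 1 request in 10.
rates = {n: round(chain_success_rate(0.99, n), 3) for n in (1, 5, 10, 20)}
```

The per-call success rate and chain lengths here are illustrative, but the shape of the curve is the point: chattiness converts individually reliable services into an unreliable whole.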
How to detect it
Watch for these warning signs:
- A single user action triggers more than three or four service calls in sequence
- Response times are much higher than processing times (most time is spent waiting on the network)
- Your distributed tracing shows long chains of sequential calls
- Engineers frequently redesign boundaries just to reduce call counts
How to correct it
API composition. Create an aggregator service that bundles multiple downstream calls. The client makes one call to the aggregator, which handles the fan-out internally. The client pays less latency; the complexity is hidden.
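A sketch of the aggregator idea, using simulated downstream calls (the function names, payloads, and 50 ms delays are invented for illustration; in a real aggregator each would be an HTTP or gRPC call). Because the lookups are independent, the aggregator can fan them out concurrently, so the client pays roughly one round trip of latency instead of three:

```python
import asyncio

# Hypothetical downstream lookups, each simulating a 50 ms network round trip.
async def fetch_user(user_id):
    await asyncio.sleep(0.05)
    return {"user_id": user_id, "name": "Ada"}

async def fetch_inventory(product_id):
    await asyncio.sleep(0.05)
    return {"product_id": product_id, "in_stock": True}

async def fetch_pricing(product_id):
    await asyncio.sleep(0.05)
    return {"product_id": product_id, "price": 19.99}

async def checkout_view(user_id, product_id):
    # The aggregator issues the independent calls concurrently; the client
    # makes a single call to this endpoint and never sees the fan-out.
    user, stock, price = await asyncio.gather(
        fetch_user(user_id),
        fetch_inventory(product_id),
        fetch_pricing(product_id),
    )
    return {**user, **stock, **price}

result = asyncio.run(checkout_view("u1", "p1"))
```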
Data replication. Copy frequently-needed data into the consuming service's database. If checkout always needs a customer's name and email, replicate those fields. You accept eventual consistency but eliminate the network round trip.
Batch operations. Redesign APIs to accept batch requests. Instead of calling the inventory service ten times for ten products, call once with a list of product IDs. This cuts round trips dramatically.
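The batching change is small in code but large in round trips. A minimal sketch, with an in-memory dict standing in for the inventory service's data (all names here are illustrative):

```python
# Stand-in for the inventory service's database.
STOCK = {"p1": 10, "p2": 0, "p3": 7}

def get_stock(product_id):
    """The chatty API: one network round trip per product."""
    return STOCK.get(product_id, 0)

def get_stock_batch(product_ids):
    """The batched replacement: one round trip for any number of products."""
    return {pid: STOCK.get(pid, 0) for pid in product_ids}

# Three products: three round trips with get_stock, one with get_stock_batch.
levels = get_stock_batch(["p1", "p2", "p3"])
```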
Asynchronous processing. If the downstream calls do not need to happen before the response, use a message queue. The checkout service publishes an event and immediately responds with "order confirmed." Background services handle the rest. The user sees speed; the system handles complexity behind the scenes.
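A toy version of this pattern, using a thread and an in-process queue as a stand-in for a real message broker such as RabbitMQ or Kafka (the event shape and service names are invented for illustration). The checkout function responds immediately; the notification work happens afterwards in the background:

```python
import queue
import threading

events = queue.Queue()        # stand-in for a message broker
sent_notifications = []

def checkout(order_id):
    events.put({"type": "OrderPlaced", "order_id": order_id})
    return {"status": "order confirmed", "order_id": order_id}  # instant response

def notification_worker():
    while True:
        event = events.get()
        if event is None:     # shutdown signal for this demo
            break
        sent_notifications.append(f"email for {event['order_id']}")

worker = threading.Thread(target=notification_worker)
worker.start()

response = checkout("o-42")   # returns before the email is sent
events.put(None)
worker.join()
```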
Reconsider your boundaries. Chatty services often signal poor boundary choices. If the checkout service constantly needs data from inventory, perhaps inventory belongs inside checkout, or the boundary should move.
Anti-pattern 3: Over-fine-grained decomposition
What it looks like
You embrace the principle that services should be small. Very small. Perhaps a single function per service.
You end up with a service for address validation. A service for tax calculation. A service for currency conversion. A service for discount application. A service for gift wrapping. A service for loyalty points.
Each small service has its own repository, its own CI/CD pipeline, its own deployment, its own database, its own monitoring setup, its own alerting rules.
Why it becomes a problem
Overhead explodes. Operating a service carries a fixed cost regardless of size. The overhead of a fifty-line service is nearly identical to a fifty-thousand-line service: build pipelines, deployment scripts, monitoring dashboards, log aggregation, on-call runbooks. Multiply this by dozens of nano-services and it consumes your team.
Debugging collapses into chaos. A single business transaction might touch ten tiny services. Understanding what happened requires tracing across ten service boundaries. The signal-to-noise ratio falls off a cliff.
Network traffic saturates. Every trivial operation becomes a remote call. Simple work that should be a library function call becomes an RPC with all its overhead: serialisation, network latency, connection management, error handling.
Developer friction grinds progress to a halt. A simple change might require touching five repositories, five pipelines, and five deployment processes. The coordination cost dwarfs the complexity of the actual change.
How to detect it
Watch for these warning signs:
- Most of your services contain only a small amount of code
- You have more services than developers
- A single business operation touches more than five services
- Your team spends more time managing infrastructure than writing business logic
- Onboarding a new developer requires explaining dozens of services before they can ship anything
How to correct it
Merge related services. Combine services that are always changed together or always called together. Address validation, tax calculation, and currency conversion may all belong in a single "checkout support" service.
Extract shared libraries instead. Some tiny services should never have been services at all. Pull their logic into a shared library. Yes, this creates coupling — but for genuinely utility-level code, the operational simplicity is worth it.
Accept duplication. Rather than creating a shared service for reusable logic, duplicate the logic across services. This sounds like a step backwards, but it is often correct. Duplication is cheaper than the operational overhead of a nano-service that does one thing.
Anti-pattern 4: The distributed monolith
This is the most dangerous anti-pattern because it looks like success from the outside. You have many services. They run separately. You call it microservices. But beneath the surface, it is a monolith wearing a costume.
What it looks like
You have twenty services, but:
- All services share a central database
- Deploying any service requires deploying several others due to tight coupling
- A change to one service's API requires changes to ten consumers
- There is no independent scalability — when one service needs more resources, everything scales together
- Failure in one service cascades because of synchronous dependencies everywhere
- Teams cannot work independently because every change requires coordination meetings
A distributed monolith has all the complexity of microservices but none of the benefits.
How a distributed monolith is created
The path is gradual and familiar.
Stage 1. You start with good intentions. Clear boundaries. Database per service. Independent teams.
Stage 2. A deadline approaches. A feature spans services. Instead of designing clean APIs, you take a shortcut — one service reads another's database directly. "Just this once," you tell yourself.
Stage 3. The shortcut becomes permanent. More shortcuts appear. The database that was supposed to belong to one service now has connections from three services.
Stage 4. Teams realise their services are coupled, but untangling them is expensive. The next deadline is next week. They keep going.
Stage 5. You now have a distributed monolith. Every deployment is terrifying. Every change requires coordination. Nobody is happy.
How to detect it
Watch for these warning signs:
- Services share databases (see Anti-pattern 1)
- Deployments must be coordinated across teams
- A change in one service frequently breaks others
- Services cannot be scaled independently
- Test suites require running most services together to work
- Team members are saying "microservices are too hard — maybe we should go back to a monolith"
How to correct it
Escaping a distributed monolith is hard. But it is possible.
Stop making it worse. Enforce one rule immediately: no new shared database access, and no new synchronous dependencies that create tight coupling.
Identify the worst coupling points. Which shared database is causing the most pain? Which coordinated deployment happens most often?
Extract one service at a time. Start with the service with the clearest ownership. Give it its own database. Create an API. Migrate consumers one by one.
Replace synchronous calls with asynchronous messaging where possible. If service A calls service B on every request, can service B consume events from a queue instead?
Accept that the migration will take months. The distributed monolith did not appear overnight. It will not disappear overnight either.
Anti-pattern 5: Data consistency traps
What it looks like
You try to maintain ACID transactions across service boundaries. You reach for distributed transactions — specifically two-phase commit (2PC) — to keep multiple services consistent.
Service A begins a transaction. It calls Service B, which begins its own. Service B calls Service C. A transaction coordinator waits for all participants to confirm they can commit, then instructs everyone to commit. If any participant says no, everything rolls back.
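To make the protocol concrete, here is a toy two-phase commit coordinator (an illustrative sketch, not a real transaction manager: real 2PC also has to persist decisions and survive coordinator crashes). Each participant votes in the prepare phase; a single "no" rolls everyone back:

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit, self.state = name, can_commit, "pending"

    def prepare(self):
        return self.can_commit          # phase 1: vote yes or no

    def commit(self):
        self.state = "committed"        # phase 2: apply

    def rollback(self):
        self.state = "rolled_back"      # phase 2: abort

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # every participant must say yes
        for p in participants:
            p.commit()
        return True
    for p in participants:                       # any "no" aborts everyone
        p.rollback()
    return False

a, b, c = Participant("A"), Participant("B"), Participant("C", can_commit=False)
outcome = two_phase_commit([a, b, c])   # C refuses, so nothing commits
```

Notice that the coordinator must hear from every participant before anything moves: that waiting is exactly where the locks, bottlenecks, and availability problems described below come from.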
Why it becomes a problem
The coordinator becomes a bottleneck. All transactions flow through it. It is a single point of failure in a system that was designed to avoid single points of failure.
Locks are held for too long. Resources stay locked while the coordinator communicates across the network. What takes microseconds in a local database now takes milliseconds or seconds across services. Everything waiting for those locks stalls.
Availability collapses. If any participant is unreachable, the transaction cannot proceed. Your whole system becomes as available as its least available component.
Scalability hits a wall. The coordinator does not scale horizontally. You are bounded by the throughput of one process.
How to detect it
Watch for these warning signs:
- You are using distributed transaction coordinators (XA transactions, 2PC)
- Services hold database locks for unusually long periods
- Transactions frequently time out or fail during partial outages
- A single slow service degrades the entire system's availability
How to correct it
Accept eventual consistency. Most business operations do not actually require immediate consistency. An order can be confirmed right away while inventory updates propagate a few seconds later. Design your system around eventual consistency from the beginning, not as an afterthought.
Use the Saga pattern. A Saga breaks a distributed transaction into a sequence of local transactions, each with a compensating action that can undo it if a later step fails. If step 3 fails, you run compensating transactions for steps 2 and 1. This is more complex to implement than 2PC, but it works reliably at scale. (Covered in Tutorial 21.)
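A minimal saga skeleton, assuming an invented order flow where the payment step fails (all step names are illustrative). Each step is a local transaction paired with a compensating action; when a step fails, the completed steps are undone in reverse order:

```python
def run_saga(steps):
    """Run (action, compensate) pairs; on failure, compensate in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
    return True

log = []

def charge_payment():
    raise RuntimeError("payment declined")   # this step fails

saga = [
    (lambda: log.append("order created"),  lambda: log.append("order cancelled")),
    (lambda: log.append("stock reserved"), lambda: log.append("stock released")),
    (charge_payment,                       lambda: log.append("payment refunded")),
]

ok = run_saga(saga)
# The failed payment triggers compensation for the two earlier steps,
# in reverse order: stock released, then order cancelled.
```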
Reconsider your service boundaries. If two services genuinely need strong consistency between them, they may belong in the same bounded context. Merging them into a single service with a single database restores ACID guarantees without any distributed coordination.
Anti-pattern 6: The versioning nightmare
What it looks like
You change a service's API. All consumers break. You now need to coordinate updates across every consuming service before you can deploy anything. Independent deployability vanishes.
The checkout service calls GET /stock/{productId} on the inventory service. You rename it to GET /availability/{productId}. The checkout service fails. You cannot deploy the new inventory service until checkout is updated, and the checkout team has its own priorities. You are stuck in a coordination deadlock.
Why it becomes a problem
Breaking API changes couple services together at deployment time. The promise of independent deployability — the core reason microservices exist — disappears. You are back to coordinated releases.
How to detect it
Watch for these warning signs:
- API changes require coordinated deployments across teams
- Multiple versions of the same service are running because some consumers cannot upgrade
- Engineers avoid changing APIs because of the coordination cost
- API documentation is perpetually out of date because nobody wants to trigger the update process
How to correct it
Make only backward-compatible changes. Design your APIs so changes are additive only. Add new fields. Add new endpoints. Do not remove or rename existing fields. Do not change what a field means.
Version your APIs explicitly. Include the version in the URL: GET /v1/stock/{productId} and GET /v2/availability/{productId}. Run both versions simultaneously while consumers migrate at their own pace. Deprecate v1 only after no consumers remain.
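A stripped-down sketch of running both versions side by side, with a plain dict standing in for a real framework's router (the paths and response shapes are invented for illustration):

```python
def stock_v1(product_id):
    # Legacy response shape that existing consumers depend on.
    return {"productId": product_id, "stock": 10}

def availability_v2(product_id):
    # New response shape; consumers migrate to this at their own pace.
    return {"productId": product_id, "available": True, "quantity": 10}

ROUTES = {
    "/v1/stock": stock_v1,             # removed only once v1 traffic hits zero
    "/v2/availability": availability_v2,
}

def handle(path, product_id):
    return ROUTES[path](product_id)

old = handle("/v1/stock", "p1")
new = handle("/v2/availability", "p1")
```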
Use consumer-driven contract testing. Let consumers define the contract they expect. The service provider runs tests against those contracts before every deployment. If a change would break a consumer, the build fails before anything reaches production.
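The essence of a consumer-driven contract check fits in a few lines. In this sketch (consumer names, fields, and the response shape are all invented; real tooling such as Pact does far more), each consumer declares the fields it relies on, and the provider's build fails if a response stops satisfying any contract:

```python
# Fields each consumer has declared it depends on.
CONSUMER_CONTRACTS = {
    "checkout":  {"productId", "quantity"},
    "reporting": {"productId", "warehouse"},
}

def provider_response(product_id):
    # The response the provider currently produces.
    return {"productId": product_id, "quantity": 4, "warehouse": "BER-1"}

def verify_contracts(response, contracts):
    """Return the set of consumers whose expected fields are missing."""
    return {name for name, fields in contracts.items()
            if not fields <= response.keys()}

broken = verify_contracts(provider_response("p1"), CONSUMER_CONTRACTS)
# Empty set: every consumer's contract is satisfied, so the change can ship.
```

Dropping the `warehouse` field from the response would make this check return `{"reporting"}`, failing the provider's build before the break ever reaches production.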
Publish events instead of shared APIs. Instead of a directly shared API, let each consumer build its own read model from published events. The inventory service publishes stock events; checkout maintains its own local view of stock. Consumers are decoupled. Changing the event schema is still a careful process, but consumers are no longer blocking each other's deployments.
Anti-pattern 7: Synchronous cascade chains
What it looks like
Service A calls B. B calls C. C calls D. A single user request triggers a deep call stack that cascades across your infrastructure.
A user requests their order history. The order service calls the user service for customer details. The user service calls the address service for shipping addresses. The address service calls the geocoding service for coordinates. The geocoding service calls the map service for display data. Five services. Five sequential round trips. Five points of failure.
If any service in that chain is slow or failing, the entire request fails.
Why it becomes a problem
Failures amplify. A slow geocoding service makes order history slow. The user does not care about geocoding. They just want to see their orders.
Threads block and waste resources. Threads sit idle while waiting for downstream responses. Your servers need far more threads than the actual work requires, just to absorb the latency.
Finding the bottleneck is hard. A slow request could be caused by any service in the chain. Without distributed tracing, you are guessing.
How to detect it
Watch for these warning signs:
- Service call chains are deeper than three services
- Response times are unpredictable and spike unexpectedly
- A failure in a minor, peripheral service breaks major, user-facing features
- Thread pools exhaust themselves during partial outages
How to correct it
Use timeouts aggressively. Every synchronous call must have a timeout. Decide upfront how long you are willing to wait. After that, fail fast. A quick, clear failure is vastly better than a slow, confused hang.
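One way to sketch a hard deadline around a blocking call is with a worker thread and a bounded wait (the service name, delay, and error payload are invented for illustration; note that the thread itself keeps running to completion in the background, since Python cannot forcibly kill it):

```python
import concurrent.futures
import time

def call_with_timeout(fn, timeout_seconds, *args):
    """Run fn in a worker thread; fail fast if it exceeds the deadline."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=timeout_seconds)
        except concurrent.futures.TimeoutError:
            return {"error": "downstream timed out"}   # clear, immediate failure

def slow_geocoder(address):
    time.sleep(1.0)                                    # simulated slow service
    return {"lat": 52.5, "lon": 13.4}

result = call_with_timeout(slow_geocoder, 0.1, "Some Street 1")
```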
Implement circuit breakers. When a downstream service is failing, stop calling it for a period. Return a cached or default response. Let the downstream service recover before you try again. This prevents cascading failures from propagating up the chain.
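A minimal circuit breaker might look like this (a sketch with invented thresholds; production libraries add half-open probing, metrics, and per-endpoint state). After a configured number of consecutive failures the circuit opens and calls are short-circuited to a fallback until a cool-down period passes:

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()                  # open: skip the downstream call
            self.opened_at = None                  # cool-down over: try one real call
            self.failures = 0
        try:
            result = fn()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback()

def flaky_service():
    raise ConnectionError("inventory unreachable")

breaker = CircuitBreaker(max_failures=2, reset_after=30.0)
responses = [breaker.call(flaky_service, lambda: "cached stock level")
             for _ in range(5)]
# After two failures the breaker opens; later calls never touch the service.
```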
Move non-essential work to the background. If downstream data is not needed to render the immediate response, load it asynchronously after the page appears. Order history can load the customer's address details in a follow-up request after the primary content is shown.
Replicate data. Copy the data you frequently need into the calling service's own database. The order service can store a snapshot of relevant customer details. This eliminates the call entirely and makes the service resilient to downstream failures.
Anti-pattern 8: Ignoring observability
What it looks like
You build microservices but monitor them like a monolith. Logs sit on individual servers with no central aggregation. You have metrics but no dashboards. You have no distributed tracing.
A user reports a slow checkout. You SSH into the checkout server. Nothing obvious. You SSH into the inventory server. Nothing. The payment server. Still nothing. You spend two hours searching through five different log files on five different servers, manually correlating timestamps, trying to reconstruct what happened.
Why it becomes a problem
Microservices produce far too much data to inspect manually. Logs are scattered across dozens — sometimes hundreds — of containers. A single request touches many services. Without proper tooling, you are flying blind.
How to detect it
Watch for these warning signs:
- Engineers SSH into production servers to read logs
- You cannot trace a single request end-to-end across services
- You cannot identify which service is causing latency
- Mean time to detection (MTTD) and mean time to resolution (MTTR) are high
How to correct it
You need all three pillars of observability — and you need them before incidents happen, not after.
Centralised logging. All logs flow to a single queryable system (Elasticsearch with Kibana, Grafana Loki, or a cloud provider's logging service). Engineers search one place, not many servers.
Structured metrics. Every service emits consistent metrics for latency, error rate, and throughput. You have dashboards that reveal the health of the system at a glance, and alerts that fire before customers notice problems.
Distributed tracing. Every request receives a unique trace ID that propagates across every service it touches. You can visualise the complete path of a request, see exactly where time was spent, and identify which service introduced latency or caused a failure.
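The core mechanism is small enough to sketch. Here the trace ID is generated once at the entry point, held in a context variable, and stamped onto every structured log line (a single-process illustration with invented service names; across real services the ID travels in a request header, as standards like W3C Trace Context define):

```python
import contextvars
import json
import uuid

trace_id_var = contextvars.ContextVar("trace_id", default=None)
log_lines = []   # stand-in for a centralised log store

def log(service, message):
    # Every log line carries the current trace ID, so one query in the log
    # store reconstructs the full path of a request.
    log_lines.append(json.dumps({
        "service": service,
        "trace_id": trace_id_var.get(),
        "message": message,
    }))

def handle_checkout(order_id):
    trace_id_var.set(str(uuid.uuid4()))   # assigned once, at the edge
    log("checkout", f"received order {order_id}")
    check_inventory(order_id)

def check_inventory(order_id):
    log("inventory", f"checking stock for {order_id}")   # same trace ID

handle_checkout("o-7")
entries = [json.loads(line) for line in log_lines]
# Both entries share one trace_id, so the request can be followed end to end.
```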
This is not optional. Observability is not a nice-to-have that you add later. Do not start building microservices without a plan for it.
Anti-pattern 9: Organisational failures
Not all anti-patterns are technical. Organisational problems routinely cause technical problems — and no amount of technical sophistication can fix a broken team structure.
Centralised governance
An architecture review board must approve every service design. A central team dictates tooling that all services must use. A separate operations team controls production access and must be involved in every deployment.
This creates bottlenecks at every step. Teams cannot make decisions without approval. The central team is overwhelmed. Development slows to a crawl. The irony is that you built microservices to move faster, and the governance structure ensures you move slower than before.
Correction. Decentralise governance. Give teams end-to-end ownership of their services: design, build, deploy, and operate. Define standards as guidelines, not mandates enforced by gatekeepers. Use automation (linting, policy-as-code, contract tests) to enforce what genuinely must be enforced. Trust your teams.
Unclear service ownership
You have more services than teams. Team A owns checkout. Team B owns inventory. Team C owns users. But you have fifty services and five teams. Most services have no clear owner. Nobody knows who to call when something breaks. Nobody feels responsible for keeping them healthy.
Correction. A team owning multiple services is perfectly fine. The goal is not one-service-per-team. The goal is unambiguous ownership. A team that owns ten services and actively cares for them is far healthier than ten services with nobody responsible.
No DevOps culture
Developers write code and pass it to an operations team to deploy and run. Developers have no idea how their services behave in production. Operations engineers have no idea how the code works. Each side blames the other when things go wrong.
This is deadly for microservices. Microservices require developers who understand operations, and operations engineers who understand the code. The wall between them must come down.
Correction. Adopt a genuine DevOps culture. The people who build a service are responsible for running it. Developers are on-call for their own services. Operations engineers work with developers on automation and tooling. The same team that ships the feature is the same team that gets paged at 2 AM when it breaks. Responsibility and capability align.
The big-bang migration
You decide to rewrite your entire monolith as microservices. Feature development pauses for six months. A large team works in isolation. At the end, you flip a switch and hope.
The new system has different bugs than the old system. The cutover is chaotic. Users notice. The team burns out from the pressure of a high-stakes big-reveal deployment.
Correction. Use the strangler fig pattern. Incrementally extract pieces of the monolith into services. Run the old and new systems in parallel. Migrate traffic gradually — first internal users, then a small percentage of real traffic, then more. This takes longer but is dramatically safer. You learn continuously instead of discovering all your mistakes at once.
How to detect anti-patterns before they become critical
Anti-patterns are far easier to correct when they are small. The key is noticing the early warning signs.
Week-to-week signals. You needed to coordinate a deployment with another team. You changed two services to implement one feature. You could not deploy because another team's change broke your tests. You considered reading another service's database directly, even briefly.
Month-to-month signals. Your team spent more time on infrastructure than on features. A production incident required manually correlating logs across five services. You have services that nobody fully understands or actively owns. Build and deployment pipelines take longer than the code changes themselves.
Quarter-to-quarter signals. Your team is afraid to touch certain services. New features take longer than they should because of cross-service coordination overhead. Operational costs are higher than expected for the traffic you are handling. Team members are openly discussing going back to a monolith.
If you notice any of these patterns, do not ignore them. They will not resolve themselves. They compound.
Summary
Here is a quick reference to the anti-patterns covered and how to address each one.
Shared database is detected when multiple services connect to the same database instance. The fix is to split databases, create service APIs for shared data, and migrate each service to its own storage.
Chatty services show up as many sequential network calls per operation. Address them through API composition, data replication, batch calls, or moving to asynchronous processing.
Over-fine-grained decomposition appears when you have more services than developers and most are tiny. Merge related services, extract shared libraries, or accept duplication.
Distributed monolith is revealed by coordinated deployments, shared databases, and cascading failures across services. Escape it by enforcing boundaries strictly and extracting one service at a time.
Data consistency traps arise from distributed transactions and long-held locks. Replace them with eventual consistency and the Saga pattern.
Versioning nightmares surface as breaking API changes that lock deployments together. Solve them with backward-compatible changes, explicit API versioning, and consumer-driven contracts.
Synchronous cascade chains manifest as deep call stacks and unpredictable response times. Break them with timeouts, circuit breakers, async processing, and data replication.
Ignoring observability is detected when engineers SSH into servers to diagnose incidents. Fix it with centralised logging, structured metrics, and distributed tracing — before incidents happen.
Organisational failures appear as governance bottlenecks, unclear ownership, and wall-throwing between dev and ops. Fix them with decentralised ownership, a genuine DevOps culture, and incremental migration strategies.
About N Sharma
Lead Architect at StackAndSystem
N Sharma is a technologist with over 28 years of experience in software engineering, system architecture, and technology consulting. He holds a Bachelor’s degree in Engineering, a DBF, and an MBA. His work focuses on research-driven technology education—explaining software architecture, system design, and development practices through structured tutorials designed to help engineers build reliable, scalable systems.
Disclaimer
This article is for educational purposes only. Assistance from AI-powered generative tools was taken to format and improve language flow. While we strive for accuracy, this content may contain errors or omissions and should be independently verified.
