Error
Error Code: 189

MongoDB Error 189: Primary Stepped Down

📦 MongoDB

Description

Error 189, 'Primary Stepped Down' (code name `PrimarySteppedDown`), indicates that the primary node of a MongoDB replica set has voluntarily or involuntarily transitioned to the secondary role, typically while a client operation was in progress. The replica set then holds an election to choose a new primary, and writes are not accepted until one is elected. An occasional step-down is a normal part of replica set failover, but frequent occurrences usually point to network, resource, or configuration problems.
💬

Error Message

Primary Stepped Down
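Applications usually see this error as a server error object carrying the numeric code. A minimal sketch of recognizing it in JavaScript, assuming the error exposes `code` and `codeName` properties (as server errors from the Node.js driver generally do); the helper name `isPrimarySteppedDown` is hypothetical:

```javascript
// Hypothetical helper: returns true when an error object looks like
// MongoDB error 189 (PrimarySteppedDown).
// Assumption: the error carries `code` and/or `codeName` properties.
function isPrimarySteppedDown(err) {
  if (err === null || typeof err !== "object") return false;
  return err.code === 189 || err.codeName === "PrimarySteppedDown";
}
```

An operation that fails this way can generally be retried once a new primary has been elected.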
🔍

Known Causes

4 known causes
⚠️
Network Connectivity Loss
The primary node lost network connectivity to a majority of the replica set members, causing it to step down.
⚠️
Resource Contention or Overload
The primary experienced high CPU, memory, or disk I/O load and became too slow to answer heartbeats within the election timeout, so the other members elected a new primary and the old one stepped down.
⚠️
Manual Primary Step Down
An administrator explicitly ran `rs.stepDown()` for planned maintenance, upgrades, or configuration changes.
⚠️
Configuration or Replication Issues
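A replica set reconfiguration (for example, changing member priorities or votes) can cause the primary to step down. Replication lag occurs on secondaries, not on the primary, and does not by itself force a step-down; it can, however, delay or block a planned one, because `rs.stepDown()` waits for an electable secondary to catch up.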
🛠️

Solutions

4 solutions available

1. Investigate and Resolve Network Connectivity Issues (difficulty: medium)

Address underlying network problems that are causing the primary to step down.

1
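Check network connectivity between replica set members. Ensure that no firewall blocks the MongoDB port (27017 by default) and that there are no intermittent network failures. `ping` only tests ICMP reachability, so also test the MongoDB TCP port.
ping <replica_set_member_ip>
nc -zv <replica_set_member_ip> 27017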
2
Verify that all replica set members can resolve each other's hostnames correctly. Use `nslookup` or `dig` to test.
nslookup <replica_set_member_hostname>
3
Monitor network latency and packet loss between replica set members. High latency or packet loss can lead to elections.
mtr <replica_set_member_ip>
4
Review system logs on the affected server(s) for any network-related error messages.
sudo tail -f /var/log/syslog  # Or equivalent log file for your OS
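The replica set's own view of member reachability can complement the OS-level checks above. A minimal sketch, assuming it is given the document returned by `rs.status()` and relying on the `health`, `name`, `stateStr`, and `lastHeartbeatMessage` fields of each member; the helper name `unreachableMembers` is hypothetical:

```javascript
// Hypothetical helper: list members that the queried node currently
// reports as unreachable (health === 0) in an rs.status() document.
function unreachableMembers(status) {
  return (status.members || [])
    .filter((m) => m.health === 0)
    .map((m) => ({
      name: m.name,
      state: m.stateStr,
      // May be empty when the node has no heartbeat error to report.
      lastHeartbeatMessage: m.lastHeartbeatMessage || "",
    }));
}

// In mongosh, this could be called as: unreachableMembers(rs.status())
```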

2. Analyze and Address Resource Constraints (difficulty: medium)

Identify and resolve resource limitations (CPU, RAM, Disk I/O) on the primary that might be causing it to become unresponsive.

1
Monitor CPU usage on the primary server. High CPU utilization can make the server unresponsive, leading to timeouts and elections.
top
2
Check available RAM on the primary server. Insufficient memory can lead to excessive swapping, degrading performance.
free -h
3
Monitor disk I/O performance on the primary. Slow disk operations can significantly impact MongoDB's ability to respond.
iostat -xz 1
4
Review MongoDB logs for any messages indicating resource pressure, such as slow queries or write stalls.
sudo tail -f /var/log/mongodb/mongod.log
5
Consider scaling up the server resources (CPU, RAM) or optimizing application queries and indexing if resource constraints are identified.
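Memory pressure can also be checked from inside MongoDB. A minimal sketch, assuming the WiredTiger storage engine and the `wiredTiger.cache` section of the `db.serverStatus()` output (statistics named `bytes currently in the cache` and `maximum bytes configured`); the helper name `cacheFillRatio` is hypothetical, and what counts as a worrying ratio depends on the workload:

```javascript
// Convert a plain number or a driver Long-like value to a JS number.
function toNum(v) {
  return v !== null && typeof v === "object" && typeof v.toNumber === "function"
    ? v.toNumber()
    : Number(v);
}

// Hypothetical helper: fraction of the configured WiredTiger cache
// currently in use, or null when the statistics are unavailable.
function cacheFillRatio(serverStatus) {
  const cache = serverStatus.wiredTiger && serverStatus.wiredTiger.cache;
  if (!cache) return null;
  const used = toNum(cache["bytes currently in the cache"]);
  const max = toNum(cache["maximum bytes configured"]);
  if (!Number.isFinite(used) || !Number.isFinite(max) || max <= 0) return null;
  return used / max;
}

// In mongosh, this could be called as: cacheFillRatio(db.serverStatus())
```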

3. Review and Adjust Replica Set Configuration (difficulty: medium)

Examine and modify replica set settings to prevent unnecessary elections.

1
Connect to your MongoDB replica set using the `mongosh` shell.
mongosh
2
Run `rs.conf()` to view the current replica set configuration.
rs.conf()
3
Check the `settings.electionTimeoutMillis` value. If it is too low for your network, consider increasing it so the primary has more time to respond before an election is called; note that a higher value also delays detection of a genuinely failed primary. The default is 10000 milliseconds (10 seconds).
db.adminCommand({ replSetGetConfig: 1 }).config.settings.electionTimeoutMillis
4
If you need to change the election timeout, run `rs.reconfig()` on the primary with the updated configuration. *Caution: Modifying configuration requires careful consideration and understanding of replica set dynamics.*
var cfg = rs.conf();                       // Fetch the current configuration
cfg.settings.electionTimeoutMillis = 20000; // Example: increase to 20 seconds
rs.reconfig(cfg);                          // Apply it (must run on the primary)
5
Ensure the `members` array in `rs.conf()` correctly lists all replica set members with their respective hostnames and ports.
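The configuration change in step 4 can also be prepared without mutating the fetched document. A minimal sketch, assuming the input has the shape returned by `rs.conf()`; the helper name `withElectionTimeout` is hypothetical:

```javascript
// Hypothetical helper: return a shallow copy of a replica set config
// document with a new election timeout, leaving the original untouched.
function withElectionTimeout(cfg, timeoutMillis) {
  if (!Number.isInteger(timeoutMillis) || timeoutMillis <= 0) {
    throw new RangeError("timeoutMillis must be a positive integer");
  }
  return {
    ...cfg,
    settings: { ...(cfg.settings || {}), electionTimeoutMillis: timeoutMillis },
  };
}

// In mongosh, on the primary, this could be used as:
//   rs.reconfig(withElectionTimeout(rs.conf(), 20000))
```

Keeping the original document unchanged makes it easy to compare the old and new settings before applying them.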

4. Perform a Graceful Primary Re-election (difficulty: easy)

Manually trigger an election to shift the primary role to a different member.

1
Connect to the current primary node using the `mongosh` shell.
mongosh
2
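Initiate a primary step-down. By default, `rs.stepDown()` waits up to 10 seconds for an electable secondary to catch up, then makes the current primary step down and stay ineligible for re-election for 60 seconds, which triggers an election.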
rs.stepDown()
3
Monitor the replica set status (`rs.status()`) to confirm a new primary has been elected and the replica set is healthy.
rs.status()
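Confirming the outcome of step 3 amounts to finding the member whose state is `PRIMARY` in the status output. A minimal sketch, assuming the document shape returned by `rs.status()` (a `members` array with `name` and `stateStr` fields); the helper name `currentPrimary` is hypothetical:

```javascript
// Hypothetical helper: return the name of the single member reported
// as PRIMARY in an rs.status() document, or null if there is not
// exactly one (for example, while an election is still in progress).
function currentPrimary(status) {
  const primaries = (status.members || []).filter(
    (m) => m.stateStr === "PRIMARY"
  );
  return primaries.length === 1 ? primaries[0].name : null;
}

// In mongosh, this could be called as: currentPrimary(rs.status())
```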