Monthly Archives: November 2017

Skype for Business unplanned DR failover and failback

This article outlines the unplanned failover and failback procedure for Skype for Business. The DR setup must already be provisioned before DR can be activated.

The deployment must have the following setup:

1) The Skype for Business HQ Front End, HQ SQL and HQ OOS servers will be part of the HQ Active Directory site.

2) The HQ site will have its own dedicated Edge server.

3) The Skype for Business DR Front End, DR SQL and DR OOS servers will be part of the DR Active Directory site.

4) The DR site will have its own dedicated Edge server.

5) The DR Front End and Edge servers will be in the same Skype for Business site, since the DR site is a standby site.

6) Synchronous data replication will be enabled between the HQ FE pool and the DR FE pool.

7) The DR SQL store information must be published in Topology Builder.

8) The DR Skype for Business FE pool must be specified as the associated backup pool in Topology Builder. The DR file stores must also be published in Topology Builder.

9) The Edge server DNS namespaces for the HQ and DR sites can be load balanced. The DR Edge must be unavailable during normal operation, and connections to it must be allowed only during a DR scenario.

10) The required network communication from HQ to the DR FE and SQL servers must be in place for pool replication to happen.

Example of DR setup with main site:

 

[Image: SFB112]

Procedure to activate unplanned DR failover:

An unplanned failover means a total disaster: the main site is completely unavailable.

So the CMS (Central Management Store), the HQ FE pool and the HQ Edge services will not be accessible during this scenario.

The steps below can be used:

1) Configure the DNS load balancer and make sure the Edge server DNS namespaces are ready to accept connections on the DR site Edge server. There are multiple ways to achieve this, depending on the network setup. As a last resort, we can simply add two entries (HQ and DR) for each DNS namespace and keep the DR Edge services stopped, starting them only during a DR scenario.

2) Activate the CMS

We can try running the command below to see the CMS status:

Invoke-CsManagementServerFailover -WhatIf

This command will throw an error, because the CMS was hosted in the main site and the main site is totally inaccessible.

[Image: SFBDR7]

In a normal state, when the main site is available during a planned failover, the command returns the result shown here:

[Image: SFBDR8]

It shows the current state of the CMS and the proposed state of the CMS after the failover.

[Image: SFBDR9]

3) In this scenario the CMS needs to be activated forcefully with the command below:

Invoke-CsManagementServerFailover -BackupSqlServerFqdn "DRSQLFQDN" -BackupSqlInstanceName "BACKUPDRSQLINSTANCE" -Force:$true

[Image: Untitled11]

[Image: SFBDR1]
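
To confirm that the forced failover took effect, the active CMS master can be queried. This is a minimal sketch, assuming it is run from the Skype for Business Management Shell on a DR Front End server. It only reads status and changes nothing.

```powershell
# Show which server currently holds the active Central Management Store.
# After the forced failover this is expected to point at the DR pool.
Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus | Format-List
```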

4) Wait for replication to complete:

We can check the replication status with the command below:

Get-CsManagementStoreReplicationStatus | ft
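
Rather than re-running the command by hand, the check can be polled. The sketch below is one possible approach, not a definitive script: the replica FQDNs, the 30 minute timeout and the 60 second interval are all assumptions. It filters on the DR servers only, because the HQ replicas are unreachable during the disaster and would never report as up to date. If the list matches no replica, the loop exits at once, so verify the FQDNs first.

```powershell
# Hypothetical DR replica FQDNs; replace with the servers of your DR site.
$drReplicas = @("drfe01.contoso.com", "dredge01.contoso.com")
$deadline   = (Get-Date).AddMinutes(30)   # assumed timeout

do {
    # Keep only the DR replicas that have not yet caught up.
    $pending = @(Get-CsManagementStoreReplicationStatus |
        Where-Object { ($drReplicas -contains "$($_.ReplicaFqdn)") -and (-not $_.UpToDate) })

    if ($pending.Count -eq 0) {
        Write-Host "No listed DR replica is pending replication."
        break
    }

    # Show the replicas still waiting, then pause before the next check.
    $pending | Format-Table ReplicaFqdn, UpToDate, LastStatusReport -AutoSize
    Start-Sleep -Seconds 60
} while ((Get-Date) -lt $deadline)
```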

5) Reconfigure the Edge federation route to go via the DR Edge, publish the topology and run setup on all Edge servers.

Enable federation on the DR Edge and change the federation route to use the DR Edge.

[Image: Untitled21]

[Image: Untitled12]

6) Fail over the pool using the disaster mode switch.

Invoke-CsPoolFailOver -PoolFqdn "poolfqdn" -Force -DisasterMode

[Image: Untitled14]
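
The pairing and the result of the pool failover can be checked with read-only cmdlets. This is a hedged sketch: the pool FQDN and the SIP address are hypothetical placeholders, and which pool a user reports depends on the deployment.

```powershell
$hqPool   = "hqpool.contoso.com"         # hypothetical HQ pool FQDN
$testUser = "sip:testuser@contoso.com"   # hypothetical test account

# Confirm which pool is paired with the HQ pool as its backup.
Get-CsPoolBackupRelationship -PoolFqdn $hqPool

# Show the primary and backup pool information recorded for a sample user.
Get-CsUserPoolInfo -Identity $testUser | Format-List
```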

Failback to HQ site:

Once the main site is back, make sure the DNS namespaces are available in the main site.

1) Fail the CMS back to the main site

Invoke-CsManagementServerFailover

Wait for the CMS replication to complete in the main site.

2) Fail back the FE pool to the main site.

Invoke-CsPoolFailBack -PoolFqdn "poolfqdn"

[Image: SFBDR2]

3) Reconfigure the Edge federation route, publish the topology and run setup on all Edge servers.

Note: The DNS routing and the VoIP components (SIP/PSTN integration) vary from deployment to deployment. The DR setup and failover procedure need to be planned according to these configurations.

Thanks & Regards
Sathish Veerapandian

Enterprise Vault – There has been no mailbox synchronization with Exchange since 3 days

Recently one of the mailbox servers was not syncing with the Enterprise Vault server, and the status monitor screen showed a message that it had been unable to synchronize with one Exchange server for the last 4 days.

Checked the below things:

  1. Forced a synchronization and checked whether the A7 queue was populated – no luck.
  2. Checked the status of the archiving service mailbox for that server (verified whether this service mailbox is hidden) – no issues.
  3. Checked the throttling policy – all values are set to unlimited, no issues.
  4. Tried to manually synchronize a mailbox from the affected archiving task – no luck.
  5. Took the Enterprise Vault Task Controller Service offline and brought it back online – no luck.
  6. Restarted the affected Exchange server and also the EV nodes – still the same.
  7. Checked the moved items update summary of the affected server – the number of failed updates was 0 and no affected mailboxes were listed.
  8. Checked whether the Enterprise Vault related MSMQ queues were clear by running the command below, and they were clear.

Get-MsmqQueue | where {$_.QueueName -like "*private*<affected exchange server name>*"} | ft QueueName,MessageCount -AutoSize
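
A variation on the same check, as a rough sketch: instead of filtering by server name, list every private queue on the local EV server that still holds messages, so a backlog on any queue stands out. It assumes the MSMQ PowerShell module is available on the EV server.

```powershell
# List private MSMQ queues on the local server that still hold messages,
# largest backlog first.
Get-MsmqQueue -QueueType Private |
    Where-Object { $_.MessageCount -gt 0 } |
    Sort-Object MessageCount -Descending |
    Format-Table QueueName, MessageCount -AutoSize
```

An empty result means all private queues are clear.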

But the interesting thing was that the affected server had no new moved items summary files generated after the date the issue was reported.

Started looking into the event logs to see whether any additional information was logged right after the issue was reported.

Found event ID 3349 logged right after the mailbox synchronization started for the affected server.

Looked for this mailbox and found that it was present on the affected mailbox server.

b7a7db31-7190-43a7-8fc5-03b7fa068c56

Looked up the existing legacy DN entry of the affected mailbox in the EV Directory database by running the query below:

[Image: EV1]

Then looked at the legacyExchangeDN of the affected mailbox in Active Directory Users and Computers, through the Attribute Editor. The two values were different.
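
The same comparison can be sketched in PowerShell instead of the Attribute Editor. This assumes the ActiveDirectory module is installed; the account name is a hypothetical placeholder, and the EV value has to be pasted in from event 3349 or from the Directory database.

```powershell
Import-Module ActiveDirectory

$samAccountName = "jdoe"                                    # hypothetical account name
$evLegacyDn     = "<legacy DN recorded by Enterprise Vault>" # paste the value from EV

# Read the legacyExchangeDN currently stamped on the AD user object.
$adLegacyDn = (Get-ADUser -Identity $samAccountName -Properties legacyExchangeDN).legacyExchangeDN

# Compare the two values (case-insensitive).
if ($adLegacyDn -ieq $evLegacyDn) {
    Write-Host "Legacy DN matches: $adLegacyDn"
} else {
    Write-Host "Mismatch. AD: $adLegacyDn / EV: $evLegacyDn"
}
```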

In this case the admin had removed the user and the mailbox, but the archive remained. After some time the same user rejoined and was created again with the same UPN and SAM ID.

So when mailbox synchronization ran after the new mailbox was created, EV found this mismatched entry while synchronizing this server and got stuck. After that it could not proceed with the synchronization.

Solution:

Delete the old, stale legacy mailbox DN entry from the Enterprise Vault Directory database. The query below locates the entry:

Use EnterpriseVaultDirectory
Select * from ExchangeMailboxEntry where LegacyMbxDN = 'mention the old dn showing from the Event viewer'
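
If SQL Server Management Studio is not at hand, the same lookup can be run from PowerShell. This is only a sketch under assumptions: the SQL instance name is hypothetical, and it relies on the Invoke-Sqlcmd cmdlet from the SQL Server PowerShell module. It runs the SELECT only; take a database backup before deleting anything.

```powershell
$sqlInstance = "EVSQL01"                                   # hypothetical SQL Server instance
$staleDn     = "<old DN shown in the Event Viewer>"        # paste the value from event 3349

# Escape single quotes so the DN can be embedded in the query text.
$escapedDn = $staleDn.Replace("'", "''")
$query     = "Select * from ExchangeMailboxEntry where LegacyMbxDN = '$escapedDn'"

# Read-only lookup of the stale entry in the EV Directory database.
Invoke-Sqlcmd -ServerInstance $sqlInstance -Database "EnterpriseVaultDirectory" -Query $query
```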

Once this is done, we can manually run Synchronize Mailboxes from the affected archiving task, and the synchronization of all mailboxes on the affected mailbox server will succeed.

Thanks
Sathish Veerapandian