API Rate-Shaping with F5 iRules

New theme, new blog post…

Many larger websites running software as a service platforms may opt to provide web API’s or other integration points for third-party developers to consume, thus providing an open-architecture for sharing content or data. Obviously when allowing others to reach into your application there is always the possibility that the integration point could be abused… perhaps someone writes some rubbish code and attempt to call your API 500 times a second and effectively initiates a denial of service (DoS). One method is to check something unique, such as an API key in your application and check how frequently its called, however this can become expensive especially if you need to spark up a thread for each check.

The solution – Do checking in software, but also on the edge, perhaps on an F5 load balancer using iRules…

The concept is fairly simple – We want to take both the users IP address and API Key concatenate it together and store it in a session table with a timeout. If the user/application requesting the resource attempts to call your API endpoint beyond a a pre-configured threshold (i.e. 3 times per second) they are returned a 503 HTTP status and told to come back later. Alternatively, if they don’t even pass in an API Key they get a 403 HTTP status returned. This method is fairly crude, but its effective when deployed alongside throttling done in the application. Lets see how it fits together:

As mentioned above the users IP/API Key are inserted into an iRule Table – This is a global table shared across all F5 devices in a H.A deployment and, it stores values that are indexed by keys.

Each table contains the following columns:

  • Key – This is the unique reference to the table entry and is used during table look up’s
  • Value – This is the concatenated IP/API Key
  • Timeout – The timeout type for the session entry
  • Lifetime - This is the lifetime for the session, it will expire after a certain period of time no matter how many changes or lookups are performed on it. An entry can have a lifetime and a timeout at the same time. It will expire whenever the timeout OR the lifetime expires, whichever comes first.
  • Touch Time – Indicates when the key entry was last touched – It’s used internally by the session table to keep track of when to expire entries.
  • Create Time – Indicates when the key was created.

The table would look something like this:
F5 iRule Session Table

The Rule itself:

The iRule DataGroup:

RateLimit URI(Click for larger Image)

So how does this iRule work? Lets step through it:

  1. When the rule i initialized two static variables are set: The “Max Rate”, how many requests are allowed within the “windowSecs” period. i.e. 3 requests per 1 second.
  2. When the HTTP request is parsed, the rule scans HTTP paths (i.e. /someservice.svc”) inside an iRule Datagroup named “Ratelimit-URI” to check if its a page that requires rate-limiting, if not breaks and returns the page content.
  3. We check if the request is coming from a white-listed IP address, if it is we return the page content without rate-limiting, otherwise the rule will continue
  4. The rule then checks if the request contains an HTTP header of “APIKey”, if not a 403 message is returned and the connection is dropped, if it is the rule continues.
  5. We then setup the variables that will be inserted into the iRule table. First we hash the APIKey as a CRC32 value to cut down on the size if its large. We then concatenate the client IP address with the resulting hash. Finally we drop it into an table
  6. A check is then performed to see if the count of requests didn’t breach the maximum number of requests set when the rule initialized, if it didn’t then when the count of requests is incremented by one and the table is updated. Otherwise if the count did breach the maximum number of requests, a 503 is returned to the user and the connection is dropped.

That’s it, simple – fairly crude, but effective as a first method of protection from someone spamming your API. Making changes to the rule is fairly simple (i.e. changing whats checked, perhaps you want to look for full URI’s instead of just the path). It may also be worth while adding a check for the size of the header before you hash to ensure no one abuses the check and forces your F5 to do a lot of expensive work, perhaps do away with the hashing all together… your call :)

It must be noted that the LTM platform places NO limits on the amount of memory that can be consumed by tables, because of this its recommenced that you don’t do this on larger platforms or investigate some time in setting up monitoring on your F5 device to warn you if memory is getting drastically low – “tmsh show sys mem” is your friend.

Let me know if you have any questions.

-Patrick

Posted in F5

Windows Server App-Fabric “failed to connect to hosts in cluster”

I’ve just completed the process of building a new AppFabric Cluster on version 1.1 with a SQL backend over an existing XML Based 1.0 cluster… The new version appeared to fix a lot of issues that existed in V1.0, plus by installing a Cumulative Update  you are able to use Windows Server 2012 standard to host a Highly Available cache cluster (now that it includes cluster functionality that only previously existed in Server Enterprise in 2008/R2)

Fortunately my old automated deployment scripts did not need that much tweaking aside from the obvious changes required to use a SQL server to store the configuration + changing my secondary cache count for scaling.

After the script established the Cache Cluster and added the host I ran into an issue when attempting to start the cluster to add the individual caches and assign permissions, I got the following error: “Use-CacheCluster : ErrorCode<ERRCAdmin040>:SubStatus<ES0001>:Failed to connect to hosts in the cluster

 

cache-error
 

There appeared to be no real help on MSDN  to help me solve my problem… A bit of research yielded the following fixes:

  1. Ensure that the AppFabric Cache Host can resolve itself (and other Cache Lead-Hosts) via DNS, Hosts Files etc.
  2. Ensure that the Remote Registry Service has been started and the rule “Remote Service Management (NP-In)” on the Windows Firewall rule is allowed.
  3. Ensure that Firewall rules exist to allow App-Fabric communication (e.g. Port 22233 for cache port, 22234 for Cluster port etc).

My script opened firewall ports but didn’t start the remote registry service… After starting this service and reconfiguring my cache once more everything came online – I was able to add all all of my cache nodes to my brand new cluster.

Extend AD Schema to allow greater Office 365 Management

If you run Office 365 and use Directory sync to push Active Directory objects to Microsoft Online then you’ll likely know that if you want to make a change to a mailbox, contact or distribution group, then it needs to be done on that object within AD.
This is great, and Directory Sync is a brilliant idea but it seems to have a slight pitfall; It assumes that you’ve previously had Exchange deployed… Dirsync wants to sync Exchange AD Attributes

As an Example; You may have run into an instance where you’ve wanted apply settings such as delivery options or mail tips to a distribution group; Searching through Active directory yields no results for the correct attribute so the the setting has to be changed online/via powershell? Wrong:

 

 Error: The action ‘Set-DistributionGroup’, ‘RequireSenderAuthenticationEnabled’, can’t be performed on the object ‘RESDEVManagers’ because the object is being synchronized from your on-premises organization. This action should be performed on the object in your on-premises organization.

 

Now, in order to set this attribute manually I could set the MsExchRequireAuthToSendTo to ‘true’ from the attribute editor in Active Directory Users and Computers (or ADSI)… But I don’t have Exchange, I never had exchange and therefore I don’t have that attribute in my AD schema.

This Microsoft KB article (http://support.microsoft.com/kb/2256198) explains what AD attributes are referenced and written to/from AD and a quick look in the FIM Metaverse designer confirms this:

 

Fim Attributes
 So, We need to add these Exchange attributes to our Schema – To do so, we have a couple of options

  • You could manually create the attributes from ADSI edit and set them to the correct Type as per FIM’s Metaverse designer – Messy and could cause issues
  • Run the Exchange 2010 Installation and extend your AD schema to include all MsExch* attributes so you can set them from ADUC/Powershell/Some other management tool

We’ll opt for the 2nd option (its easier and automated) – Let’s get started:

  1. Download the Exchange 2010 Trial media from here. Run the executable and extract the files to a temp location.
  2. Ensure your account is a member of Enterprise Admins and Schema Admins in Active directory. Change Directory to your extracted Installation media  and run the following: Setup /PrepareSchema 

    prepare Schema

    Wait for the tool to complete.
  3. Open up Active Directory Users and Computers and enable View > Advanced features (If you haven’t already). 

    Active Directory Users and computers
  4. Locate an object from the AD tree and click the Attribute Editor Tab and Scroll down to MSExch- ; Your AD Schema has been extended successfully and you now have a bit more control over objects in Office 365. 

    RESDEV Managers

Hopefully everything’s there and the process went smoothly

You can go ahead and edit MsExchServerHintTranslations for Mailtips and MSExchRequireAuthtoSendTo for Distribution group send as permissions (as two examples)

-Patrick

 

F5 Monitor text file contents

Another quick F5 related post:

Below is a neat little GET request that can be used in a F5 monitor to check the contents of a text file (or if it even exists) and degrade the performance of pool members if it doesnt. This could be useful for a Maintenance mode/Sorry site monitor for when a deployment is triggered.

To create the monitor:

  1. Create a new monitor from Local Traffic -> Monitors; Give it a name and a description
  2. Set Monitor type to HTTP
  3. Specify the interval times you require
  4. In the Send String Simply swap “/testfile.txt”  for your own text file name and “nlb.resdevops.com” in the example below for the target website the monitor will query for said text file.
  5. In the  Receive String enter the contents of your text file
  6. Save it and apply the monitor to a Pool

Remember: The F5 must be able to resolve the host in the above query (you will need correct DNS/Gateway information set)

-Patrick

 

HTTP Get text File Monitor
 

Posted in F5

Hosting Maintenance/Sorry site from a F5 NLB

As I’ve said in a previous post, I’m fairly new to the world of F5; That being said, I’m really enjoying the power and functionality of LTM!

One task I recently undertook was to implement a maintenance/sorry site to display when we do patch releases to our software (or if something was to go horribly wrong at a server level). The solution we opted for was to essentially use our F5 device as a web-server and host the “sorry site” from the NLB appliance. The page would show if the custom monitors we defined on a virtual server reported <1 healthy server in its respective NLB pool; if this criteria was met then an iRule would fire and show our page before relaying the request it onto the VIP.

 

Since I am using some LTM’s running v10.x and v11.x I’ve opted to use a method that works across both versions. This post has been written for v10.x, but rest assured, it’s a trivial task to get it working on v11; Feel free to ask if you get stuck.

 

So… Lets get started:

  1. First we need to generate a .class file with the content of our images (encoded in a Base64 format). This class file is called by our iRule and images are decoded when its run.
    Save the following Shell script to a location on your F5; I have called it “Base64encode.sh” 

    SCP your image files to your F5 and place them in “/var/images”

    Fire up your favourite SSH client and Call Base64encode.sh to enumerate all images in “/var/images” and generate an image.class file (which is exported to /var/class/images.class) of key/values with the following syntax:
    “image.png” := “<Base64EncodedString”,

    (If you intend to call this class file directly from the iRule OR are referencing this from using  a data-group list in LTM v10.x you may need to execute a “B Load” or a “TMSH Load Sys Config” command from SSH so the class file is referenced).

  2. Next we need to create a Data-group list so our iRule can reference the encoded images, If we were running LTM v11.x we would be forced to download the class file and upload it from “System > File Management > Data Group File List”; however, since this is tutorial is for v10 we can simply reference our class file using a file location. From the GUI navigate to “Local Traffic > iRules > Data Group List > Create” and create the Data group as follows:
    Name: images_class
    Type: (External File)
    Path/Filename: /var/class/images.class
    File Contents: String
    Key/Value Pair Separator: :=
    Access Mode: Read Only
  3. Now we can assemble our iRule using  a template similar to the one written by thepacketmaster. The iRule below does the following:
    1. Invokes on HTTP Request
    2. Establishes What VirtualServer pool is responsible for serving up the requested website’s content
    3. Checks to see if the Active members (health) of the Pool has less than one healthy member
    4. Adds a verbose entry to F5 log with Client address and requested URL
    5. Responds with a 200 HTTP code for each image and decodes our Base64 encoded image by referencing the images_class Data Group and subsequently Images.class file we defined in the previous step
    6. Responds with the HTML of the Sorry/Maintenance Mode Page.

    Replace the CSS/HTML in the above iRule with your own (same goes for the images). Remember that you MUST escape any quote marks in your HTML/JS with a “\

    Please note: I had a hard time getting the above to work with LTM v11; My HTML would show but my text would not. After a bit of head-scratching and research I re-factored the Base64 decode lines (i.e. Lines 6 & 9 above) with the following:

    You may also want to look at using the iFile list functionality of LTM11 to serve up images instead of manually base encoding them (even though the above should work): https://devcentral.f5.com/tech-tips/articles/v111-ndashexternal-file-access-from-irules-via-ifiles

  4. Apply your new iRule to your respective Virtual Server and test it out. Make your Virtual Server monitors “trip” by manually shutting down pool members from the F5 to bring the overall pool into an unhealthy state

 

 

webfefail
 

 

I hope this helps anyone struggling to get something similar to this working. As always, feel free to ask questions.

Thanks,

Patrick

Lync 2013 executable name change can break QoS

A quick note to those with Group Policy-based QoS for Lync/OCS that I thought worthy of a blog post:

The Lync 2013 client executable file name has changed from communicator.exe in OCS and Lync 2010 to Lync.exe.

So, why does this matter – If you are applying QoS DSCP markings on Lync traffic for specific port ranges for communicator.exe (using Policy-Based QoS), you’ll likely find no markings applied to Lync 2013 client traffic. As a result, there is a possibility that call quality can be reduced or other Lync functionality slow due to a lack of traffic prioritisation.

If you’re currently running a hybrid of both clients it would be worthwhile updating these GPO’s to add in the new executable name… or replacing it if you have finished upgrading :) .

 

-Patrick

 

Lync GPO's

SCOM Win-RM Powershell – Hitting WSMan Memory Limits

scomposhWinRM can be a very useful tool (even if it is somewhat of a challenge to set-up [esp if you want to use CredSSP]. I am however finding it more and more useful to execute cmdlets for Powershell modules you cant easily install; An example is the Operations Manager 2012 cmdlets that are only installed with the Console.

For a maintance mode script we recently used (similar to this one) I found myself tripping over one of the following errors when attempting to remotely connect to my SCOM server and put a group into maintenance mode:

 

Processing data from remote server failed with the following error message: TheWSMan Providor host did not return a proper response. A Providor in the host process may have behaved improperly. For more information, see the about help _Remoting_troubleshooting help topic.

Or

Processing data for a remote command failed with the following error message: Deserialized objects exceed the memory quota. For more information, see
the about_Remote_Troubleshooting Help topic

 

These messages are nice and generic with no real hint to the cause. So why does it happen?

When attempting to run up the OperationsManager Module (and subsequantly connect to the SCOM SDK) to run cmdlets we will exhaust the 150MB defualt WSMan memory allocation. In order to continue we will need to exend the “MaxMemoryPerShellMb” allocation Run the following to view/alter these values:

Easy :)

-Patrick

 

Office 365 Search and Delete mail using Powershell

A neat feature of Exchange is the ability to run up a search across mailboxes within an organization from Powershell using the Search-Mailbox cmdlet and delete inappropriate or harmful messages using the -DeleteContent parameter. Fortunately these features exist in Office 365 if you are an Exchange Online Administrator

While administrators can use the Multi-Mailbox search feature in the Exchange control panel UI to locate mail, you  may discover you are unable to remove messages directly without some PowerShell magic.

The below script requires you add your admin account to the “Discovery Management” role from Roles & Auditing on the Exchange Control Panel (ECP).

 

 

Warning: Use the above script with caution; When using the DeleteContent parameter, messages are permanently deleted from the user’s mailbox and cannot be recovered. (It could also take some time to run over a big Office365 Tenant)

See related Office 365 help article here:  here: http://help.outlook.com/en-ca/140/gg315525.aspx

-Patrick

IIS Logging broken when traffic proxied Via F5 NLB

So you have a new F5 NLB, You have a new site hosted on IIS behind said F5…And now you have broken IIS Logging…

You may find that after deploying F5, any IIS logging will now reflect the internal IP of the F5 unit, and not the external address of the actual client. Why? When requests are passed through proxies/load balancers, the client no longer has a direct connection to the web-server itself, all traffic is proxied by the F5-Unit and the traffic looks like its coming from the last hop in the chain (F5)

 

 

X-Forwarded-For Diagram

So how do we get our logging back? Easy, it just requires two simple pre-requisites (and no downtime).

 

 

First is to insert an “X-Forwarded-For” header into each request to the web server. This header is a non-standardised header used for identifying the originating IP address of a client connecting to a web server via an HTTP proxy or load balancer.
To Insert X-Forwarded header:

  1. From your F5 Web console select Local Traffic > Select Profiles > Select Services
  2. Choose One of your custom HTTP profiles or select the default HTTP profile to edit all child profiles
  3. Scroll down the page and locate the “Insert X-Forwarded-For” property and enable it (you may need to select the custom check-box first depending on your profile type)
  4. Select update to apply changes

Next step is to install an ISAPI filter developed by F5 to amend IIS’s logging with the correct requester IP using the X-Forwarded for HTTP header Syntax {X-Forwarded-For: clientIP, Proxy1IP, Proxy2IP} (this filter is supported on both IIS6 & 7)
Download the ISAPI filter here: https://devcentral.f5.com/downloads/codeshare/F5XForwardedFor.zip

 

  1. Copy the F5XForwardedFor.dll file from the x86\Release or x64\Release directory (depending on your platform) into a target directory on your system.  Let’s say C:\ISAPIFilters.
  2. Ensure that the containing directory and the F5XForwardedFor.dll file have read permissions by the IIS process.  It’s easiest to just give full read access to everyone.
  3. Open the IIS Admin utility and navigate to the web server you would like to apply it to.
  4. For IIS6, Right click on your web server and select Properties.  Then select the “ISAPI Filters” tab.  From there click the “Add” button and enter “F5XForwardedFor” for the Name and the path to the file “c:\ISAPIFilters\F5XForwardedFor.dll” to the Executable field and click OK enough times to exit the property dialogs.  At this point the filter should be working for you.  You can go back into the property dialog to determine whether the filter is active or an error occurred.
  5. For II7, you’ll want to select your website and then double click on the “ISAPI Filters” icon that shows up in the Features View.  In the Actions Pane on the right select the “Add” link and enter “F5XForwardedFor” for the name and “C:\ISAPIFilters\F5XForwardedFor.dll” for the Executable.  Click OK and you are set to go.

If you’re that way inclined – there is also an IIS Module available if you think ISAPI filters are not for you (See: https://devcentral.f5.com/weblogs/Joe/archive/2009/12/23/x-forwarded-for-http-module-for-iis7-source-included.aspx)

Let me know if you have any questions :)

-Patrick

Unable to Open Lync CSCP from Lync Server

When deploying a Lync Server the other day I spent a good 15 mins (stupid me) trying to figure out why I couldn’t open the Lync CSCP control panel from the Lync Server – I kept getting:

HTTP Error 401.1 – Unauthorized

You do not have permission to view this directory or page using the credentials that you supplied.

I had defined an Admin URL when establishing my topology (and published it), plus I had set the appropriate DNS records within my domain to make the CSCP site resolve – still no Dice. I finally ended up trying from another server which had Silverlight installed… It worked!?!

So what was the cause?
Back in Win Server 2003 Sp1 (and subsequent versions of Windows) Microsoft introduced a loop-back security check. This feature prevents access to a web application using a fully qualified domain name (FQDN) if the attempt to access it takes place from a machine that hosts that application. The end result is a 401. 1 Access denied from the web server and a logon failure event in the Windows event log.

A work around for the issue if you really want to access the Lync CSCP from the Lync server itself (using anything other than https://localhost/cscp):

  1. Logon on to the Lync server with an account that is member of the local admins group
  2. Start “regedit”
  3. Navigate and expand the following reg key “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters”
  4. Right-click Parameters, click New, and then click DWORD (32-bit) Value.
  5. In the Value name box, type DisableStrictNameChecking, and then press ENTER.
  6. Now navigate to “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0″
  7. Right-click MSV1_0, point to New, and then click Multi-String Value.
  8. Type BackConnectionHostNames, and then press ENTER
  9. Right-click BackConnectionHostNames, and then click Modify.
  10. In the Value data box, type the host name (or the host names) for the sites that are on the local computer, and then click OK.
  11. Quit Registry Editor, and then restart your server

You should be good to go :)

For more reading + another possible (and less-secure) work around for a lab environment see KB896861

-Patrick

Manual WSUS Sync using Powershell

A quick blog post tonight:

When setting up WSUS, its common practice to trial updates internally  prior to deployment across an entire environment…

For a recent WSUS setup I completed we decided to leave auto update approval on for all Windows Critical & Security updates so they could be tested after release every Patch Tuesday. After Microsoft’s recent spate of Out-Of-Band updates we were finding that machines were being updated half way through a month due the limited update controls you have with WSUS… We could opt to manually sync updates or select the longest time between syncs of “check once every 24 hours”.

Using some cool tips from the Hey Scripting Guy Blog I’ve slapped together this script that now runs as a scheduled task to download updates once a month on Patch Tuesday.

This is an impractical approach to updating, critical updates should be applied as soon as possible, however forcing a manual WSUS update could come in handy for a select few:

 

-Patrick

Remote logoff all Windows users with Powershell

An annoyance for anyone when running a large scripted software release is when the deployment fails half way through because someone has left a logged-on Windows session with an Open file or Window. You may then need to invest large amounts of time identifying the cause, reverting changes that only deployed half-way and eventually re-run the release package…

In one of my earlier blog posts I explained how I would drop servers into maintenance mode when running a software deployment to prevent being spammed by Operations Manager Alerts… I targeted my specific environment in SCOM using a filter that would only add servers with a NetBIOS name starting with PROD or TEST.

I have quickly whipped up something similar to boot all users off of servers (log-off) prior to executing release package. The snippet below will grab all servers in the “Servers OU” on active directory and throw them into an array. It will then enumerate each server and use the Inbuilt “MSG” command (similar to NET SEND) to warn each user logged onto servers to save their work and log off (along with a countdown timer). Once the timer hits zero the Win32Shutdown() WMI method is run to Force log-off all users so the deployment can proceed

This script could easily be turned into a function that passed along a parameter for Prod/Test/Dev environments.

Enjoy…

-Patrick

Add Proxy Addresses via PowerShell to Office 365 Users

Its safe to say that one of the most useful features of Office 365 from an administrative point of view is Directory-Sync via Forefront Identity Manager (FIM).

When given the task to add a new email alias/address to all users, we can simply add to the existing Proxy Address attribute in Active directory programmatically through PowerShell (and sync it up to the cloud using FIM). The code snippit below will take the take each users first& last names plus the $proxydomain to add “smtp:firsname.lastname@domain.com” to their AD user object if they reside in $usersOU.

If the “Coexistence-Configuration” PSSnapin exists on the running system a manual directory sync will be initiated.

 

-Patrick

F5 Network Load Balancing using Route Domains

I’m pretty new to the F5 NLB scene, any network load balancing I had previously done had been through the inbuilt Windows Network Load balancing (WLB) Server role. Recently I was asked to deploy a F5 configuration to an already running production environment to handle SSL Termination, Caching and (of course) Load balancing on both web and app tiers.

The existing deployment comprised of two /22 segments (Internal and DMZ networks) with a single router as the default gateway; Everything I read online told me to use a “One armed F5 Config”. This type of config “should” let me add a single physical network to one of my segments and reach both networks using a SNAT rule to adjust the origin address on the reply; But how could this work if a router was translating requests between the two networks? I discovered after many hours, it can’t…

My answer, ensure the F5 is connected to both network segments and use Route Domains to solve my routing problems. Here’s what I did:

 

 

 

  1. Utilize a free port on my F5 to connect into both networks – Most people could probably just add another VLAN to their existing network, however I don’t have the ability to control the managed network
  2. Establish the 2nd untagged VLAN for the 2nd connection in Step 1
  3. Establish a new route domain from Network -> Route Domains -> Create.
      1. Enter a new description ID (I used 2 [to increment by 1 from the default Route domain used for my other External network])
      2. Give your route domain a description
      3. Enable Strict Isolation to enable cross-routing restrictions
      4. Add new VLAN as a member to Route domain
      5. Click Finished
  4. Add a new Self-IP for the 2nd network connection on your new VLAN – Add %<routedomainID> after your IP

     

     

  5. Create a new Load balanced application/Virtual server and add %<routedomainID> to your VIP Address (NOTE: you will likely need to enable SNAT auto-map to the Virtual Server profile to allow return traffic via the Routedomain)
  6. You should “hopefully” be good to go – your application is now hopefully responding correctly and your health monitor is showing a friendly green status

 

This solution is merely a stop-gap until we can convert it into a routed configuration (recommended setup) – where the F5 unit will be the default gateway on both networks with something like a /29 stub network between the F5 and the router.

All in all, the F5 units are pretty blimmin powerful devices – I have a good handle on the UI but have only really scratched the surface using the BigPipe/TMSH commands. I KNOW that I’ve glossed over details, please feel free to leave a comment if you have any questions (it looks like there are many stuck Engineers on the net with the same problem)

-Patrick

SCOM Remote Powershell Maintenance Mode

I’m currently working a neat project with a nice automated deployment process – The Deployment (Powershell) does everything from configuring IIS to Updating SQL with Delta schema changes etc. One of the problems for the people in charge of Ops Manager (or those on-call for that matter) is that they will be spammed with Alert emails if the appropriate servers are not dropped into Maintenance mode when this deployment is run.

The team deploying the latest build don’t want mucking around with SCOM while trying to focus on a Prod release for exmple – What we do… Set appropriate servers to maintenance mode at the start of a deployment and pull them out once its complete. Here’s how we do it:

  1. Establish a new SCOM Group which our Maintenance Mode script will target when run.:
    Each server name in our environment is prefixed with its environment type e.g. PRODWEB1, TESTAPP2, PRODSQL1, etc. I’ve setup a Dynamic group with the following rule to target “PROD” machines (and another for test): ( Object is Windows Computer AND ( NetBIOS Computer Name Contains PROD ) AND True )
  2. Enable PowerShell remoting on SCOM server so we can utilize the OpsManager Powershell Cmdlets:
  3. Add the appropriate Powershell snippits (or add them as functions) to your deployment to enter and Exit Maintenance Mode:

To Enter Maintenance Mode:

Exit Maintenance Mode:

Let me know if you have any Questions.

-Patrick