We’ve just released a significant update to the Office365Mon Distributed Probes and Diagnostics feature. For those of you who aren’t familiar with this feature, it was originally released a little over a year ago. From the beginning it was designed to do two things:
- Work in conjunction with the Office365Mon cloud service to issue health probes from different geographic regions where you have users. That allows you to check the availability and performance not only from our cloud service, but also from all of the locations where you have users.
- When there’s a problem connecting to Office 365, it runs a series of diagnostics on the local network to try and determine if there are any issues. That includes things like checking local network cards, DNS, gateway and a non-Office 365 Internet site.
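To give a feel for the second item, here is a minimal Python sketch of a local diagnostics checklist. This is not the agent's actual code: the function names, the check names and the use of `www.example.com` as the non-Office 365 site are all assumptions made for illustration, and only the DNS and Internet site checks are covered.

```python
import socket
from typing import Callable, Dict, List


def dns_resolves(hostname: str) -> bool:
    """Return True if the local resolver can look up the hostname."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False


def site_reachable(hostname: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the host can be opened."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_diagnostics(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every named check and return the names of the ones that failed."""
    failed = []
    for name, check in checks.items():
        if not check():
            failed.append(name)
    return failed


# Hypothetical checklist; www.example.com stands in for
# "a non-Office 365 Internet site".
DEFAULT_CHECKS = {
    "dns": lambda: dns_resolves("www.example.com"),
    "internet": lambda: site_reachable("www.example.com"),
}
```

Calling `run_diagnostics(DEFAULT_CHECKS)` returns an empty list when both checks pass, and the names of the failed checks otherwise.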
In addition to the tasks above, it also allows you to set a performance threshold – for example, let me know when it takes longer than x seconds to connect to and get data back from Office 365. You can set “x” to whatever value you want, so it allows you to set different minimum performance thresholds for each location where you have users. One of the big reasons we did this is because we got a lot of feedback from our enterprise customers that they had situations where performance may be great in the US for example, but poor or completely down for users in another region, like Europe.
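Conceptually, a per-location threshold check could look something like the Python sketch below. The names `timed_probe` and `is_too_slow`, the location keys and the threshold values are all hypothetical; the agent itself is configured through its own settings, not through code like this.

```python
import time
from typing import Callable, Dict

# Hypothetical per-location thresholds, in seconds (the "x" for each location).
THRESHOLDS: Dict[str, float] = {"US": 5.0, "Europe": 10.0}


def timed_probe(probe: Callable[[], object]) -> float:
    """Run the probe and return how long it took, in seconds."""
    start = time.perf_counter()
    probe()
    return time.perf_counter() - start


def is_too_slow(location: str, elapsed: float, thresholds: Dict[str, float]) -> bool:
    """Return True when the elapsed time exceeds that location's threshold."""
    return elapsed > thresholds[location]
```

With these example values, a 6-second response would be flagged for the US location but not for the European one.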
In the previous version of Distributed Probes and Diagnostics, any issues with health probes, performance, or any of the items in the diagnostics checklist were written to the local event log. You could then monitor the event log in each location where you have it installed to find out when there are issues at a particular location. That also proved to be pretty helpful if you had to open a support case with Microsoft because of connectivity issues. They will typically try to triage the issue by looking to see if there are local network issues, versus an issue with the Office 365 service. By using the Distributed Probes and Diagnostics feature, you can quickly check the event log on the machine(s) where it’s running, and if any local network issues were found, they will be logged there. That tells you whether a local issue was found, saving you a call and allowing you to focus on the real problem, or else letting you validate with support that your local network is fine.
Our new update has all of the same features described above that you’ve come to depend upon, but we’ve also built some very important new pieces to complement them. Now, in addition to logging data to the local event log, the feature also reports data and sends out alerts through our cloud service, as long as you have a working Internet connection. This opens up some very interesting possibilities, from both a reporting and a notification perspective.
When you configure the Distributed Probes and Diagnostics feature now, you are asked to enter the ZIP code for the location of the computer on which you installed it. We use that data for both local and regional geographic data that feeds into the new reports that have been built for the service. Overall we added 10 new reports to the service to accommodate this new data stream – two Basic reports and eight Advanced reports.
During the beta phase for this release we had the feature running in 8 different countries and more than a dozen locations. From that data we can create a performance heat map across the globe from all of our customers that are running this service:
The picture above shows data from locations in Europe, India and Australia. You can tell based on the intensity of the color around the push pin which locations are performing worse than others. For example, Australia has the most intense colors and France has some of the lightest colors, so you can tell at a glance that you have much worse performance in Australia than France. That’s going to be pretty important to know when supporting your Australian users.
We also create bubble maps to represent the performance in different locations for your Office365Mon subscription. This gives you another “at a glance” snapshot of how things are going in different locations. The key distinction here is that in the report above, you get to see what the data looks like across the globe for all Office365Mon customers; the bubble map lets you see the performance just for the locations associated with your Office365Mon subscription. That gives you the capability to compare how others are doing in a particular region relative to your users. If your users are doing noticeably worse than others in the same region, that may indicate that you have problems in your network in those locations that should be addressed.
Here’s a screenshot of that report, where we’ve drilled down to see just locations in the US:
Here we can see that folks out West are getting much better performance than their counterparts in the East.
The graphical maps are a great way to get an “at a glance” view of the performance for your user base, wherever they may be located. We also offer more traditional views of this data, so you can quickly compare performance on each computer where you’ve installed the Distributed Probes and Diagnostics agent, as shown here:
In our case, we had the agent installed in a LOT of locations, so you see a lot of data there. Again, the number of locations in which it’s installed is completely up to you.
Of course, availability is just as important as performance, and we have definitely seen scenarios where the service as a whole may be up, but individual regions may be down. A good example of this is the handful of times a few months ago when there were problems with Azure Active Directory in certain European regions. Since our cloud service currently runs out of data centers in the US, it did not have any issues connecting to the service because the regional Azure AD services it uses were working. However, our customers that had the Distributed Probes and Diagnostics agent running in Europe were able to find out first that there was an issue over there, because the probe and authentication process occurred there, where their users are.
We also saw this occur at times during the beta for this release, and you can see that reflected in the new availability reports. They show availability based on the agents where the Distributed Probes and Diagnostics feature is installed; here’s a screenshot of that:
New Notification Capabilities
While we’ve added a bunch of new reports, we’ve also vastly improved upon the notification capabilities. As I was describing earlier, in the previous release of Distributed Probes and Diagnostics, all notifications went exclusively to the local event log. We still do that, but now these events are also wired up to go out to our cloud service as long as you have a working Internet connection. Just like you might expect, you get notifications for the same kinds of things you get from our cloud monitoring service – when outages start and end. But now you are getting those notifications from a specific location, so you can know right away if the service overall is up, but just one or two locations are down.
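To illustrate the idea of per-location “outage started” and “outage ended” notifications, here is a small Python sketch that compares the health of each location between two probe rounds. It is only a sketch under stated assumptions: the function name `outage_events`, the location names and the rule that a newly seen location is treated as previously up are all made up for the example, not taken from the service.

```python
from typing import Dict, List, Tuple


def outage_events(
    previous: Dict[str, bool], current: Dict[str, bool]
) -> List[Tuple[str, str]]:
    """Compare per-location health (True = up) and report the changes.

    A location missing from `previous` is assumed to have been up.
    """
    events = []
    for location, is_up in current.items():
        was_up = previous.get(location, True)
        if was_up and not is_up:
            events.append((location, "outage started"))
        elif not was_up and is_up:
            events.append((location, "outage ended"))
    return events
```

Locations whose state did not change produce no event, so a notification only goes out when an outage begins or ends at a specific location.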
We also send notifications when the performance for a health probe doesn’t meet the threshold you have defined. So for example, you could define a threshold of 15 seconds from Melbourne, Australia and 8 seconds from Glasgow, Scotland. If a probe takes longer than the threshold you’ve defined for that location, then you’ll get notifications on all of the “channels” that you’ve configured for your Office365Mon subscription – emails, text messages, and webhook data if you have that configured – that indicate the issue and where it’s occurring. You really will have an up-to-date, around-the-world view of your users’ ability to connect to Office 365 in a reasonable time frame.
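Using the Melbourne and Glasgow numbers from the example above, the flow from a slow probe to a notification on every configured channel can be sketched in Python like this. The function names, the message wording and the idea of a channel being a simple callable are assumptions for illustration only; this is not how the service is actually implemented.

```python
from typing import Callable, Dict, List, Optional

# Thresholds from the example above, in seconds.
THRESHOLDS: Dict[str, float] = {
    "Melbourne, Australia": 15.0,
    "Glasgow, Scotland": 8.0,
}


def build_alert(
    location: str, elapsed: float, thresholds: Dict[str, float]
) -> Optional[str]:
    """Return an alert message if the probe was too slow, otherwise None."""
    limit = thresholds[location]
    if elapsed <= limit:
        return None
    return f"Health probe from {location} took {elapsed:.1f}s (threshold {limit:.1f}s)"


def notify(message: str, channels: Dict[str, Callable[[str], None]]) -> List[str]:
    """Send the message to every configured channel; return the channel names used."""
    sent = []
    for name, send in channels.items():
        send(message)
        sent.append(name)
    return sent
```

In this sketch a 9.5 second probe produces an alert for Glasgow but not for Melbourne, and the same message is handed to each channel (email, text message, webhook) in turn.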
Get Started Now
This feature is available to use now for all Office365Mon customers that are either in their 90-day trial period, or that have the Enterprise Premium license. We hope that you’ll give it a try and, as always, let us know how we can improve upon it. The features in this update were all driven by feedback from our customers so it DOES matter when you make suggestions.
To get more information on this feature, see our original post about it here: https://samlman.wordpress.com/2015/09/28/announcing-the-availability-of-office-365-on-premise-health-probes-for-office365mon-customers/. To get the documentation and agent, visit the Distributed Probes and Diagnostics configuration page on our site here: https://www.office365mon.com/Configure/OnPremProbes.
Thanks from sunny Phoenix,