Wednesday, 9 June 2021

Using RPA to control my ISP settings for kids gaming - Power Automate Desktop for web automation - part 1

Like many families, we have annoying kids who would spend all their time gaming or on screens given half the chance. So, our tyrannical regime approach essentially grants them a certain amount of time per week, with controls put in place through our ISP which allows me to block or allow individual websites. Unfortunately with my ISP (Virgin Media here in the UK) there's no scheduling capability for this, only for parental controls overall - which means I have to manually go to the portal and tap in the sinful URLs many times per week, and then unblock later on. A recurring conversation in our house goes something like this:

  • Kid - "Dad, you know how you block our gaming websites when it's not gaming time so we can't sneak on?"
  • Me - "Yes?"
  • Kid - "Well, you're always on calls when it's 5pm on a gaming day and we're losing lots of time. It's so unfair! Can't you schedule it or something?"
  • Me - "Sorry son. Despite using a company who claim to be the UK's leading ISP and paying them through the nose each month, they chronically underinvest in their management portal and it was built by interns in the 1990s. There are no APIs either, so it's not possible unfortunately. Trust me, I'd like nothing more than to never visit that thing again or have notes shoved under the bedroom door when I'm on calls - but that's life son."
  • Kid - "Erm. Could we not just......turn the controls off?"   
  • Me - "Hahahahaha!"

After the 86th iteration of this conversation, I decided to spend a weekend looking at Power Automate Desktop for this since it does web automation. Microsoft announced in March 2021 that Power Automate Desktop is now free with Windows 10 as they expand the Power Platform into other forms of automation. In truth, the licensing means that if you want true unattended execution (rather than attended automation where you manually press a button and all the steps execute for you), Microsoft's RPA technology is more suited to workplace scenarios than home or personal automation. I'll talk more about licensing in the next article. Nevertheless, my ISP scenario was a good excuse to automate something else and try another scenario with Power Automate. 

What I'm automating

The basic process can be described as:

Once logged into the Virgin Media portal I can manage my "web safe" settings through a series of tabs. I'd need the automation to load the website (ensuring the session is authenticated with my credentials), and firstly navigate to the "Websites" tab shown below:

The websites tab provides an interface to specify the sites to block. I leave some sites permanently blocked, but what my automation needs to do is come in here and add or remove the "scheduled" websites. As the numbers in the image below depict, it's a 3-step process involving the URL being typed into the box, the "Add" button click and then settings applied with the "Apply" button once all changes are in:

The unblocking is effectively a reversal - this time the automation needs to find the right entries in the list and click the "Remove" link next to each. This involves a slightly different series of steps and there are a couple of ways in Power Automate Desktop to do this, but we'll get to that later. 
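
To make the blocking sequence concrete, here is a minimal JavaScript sketch of the same three steps expressed as a page script. It is an illustration only: the flow in this article is built from recorded Power Automate Desktop actions, and the selectors (`#site-url`, `#add-site`, `#apply-settings`) are hypothetical placeholders, not the real element IDs in the Virgin Media portal.

```javascript
// Sketch only: these selectors are hypothetical, not taken from the real portal.
const SELECTORS = {
  urlBox: "#site-url",            // text box where the URL is typed
  addButton: "#add-site",         // the "Add" button
  applyButton: "#apply-settings", // the "Apply" button
};

// Adds each site to the block list, then applies the settings once at the end.
// `root` is anything that offers querySelector (e.g. `document` in a browser).
function blockSites(root, sites) {
  const urlBox = root.querySelector(SELECTORS.urlBox);
  const addButton = root.querySelector(SELECTORS.addButton);
  const applyButton = root.querySelector(SELECTORS.applyButton);
  if (!urlBox || !addButton || !applyButton) {
    throw new Error("Expected page elements were not found");
  }

  for (const site of sites) {
    urlBox.value = site; // step 1: type the URL into the box
    addButton.click();   // step 2: click "Add"
  }
  applyButton.click();   // step 3: apply once all changes are in
  return sites.length;
}
```

In a browser this would be called as `blockSites(document, ["roblox.com"])`. Passing the root in as a parameter keeps the function easy to try out against a stand-in object without a real page.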

Getting started with Power Automate Desktop

Given that the tool is all about automation from the desktop (although cloud-integrated), we need to download some software. There are a number of ways you can do this, including from the "Install" menu within the Power Automate portal:






Once installed and signed-in, your machine effectively has a binding to a Power Platform environment:

Power Automate Desktop capabilities

The overall capability set is very powerful indeed, and of course the whole premise of desktop automation is that you can record (or manually create) steps which drive keyboard and mouse actions. You can open applications and browser windows, interact with their controls, and perform steps in systems that would be otherwise difficult to automate. Here are some of the things you can do:

Control elements of the desktop PC:      Work with files and folders:

Control applications through UI automation:     Control applications through web automation:

This distinction between "web automation" and "UI automation" is important - we'll come back to this later, but notice there are some similarities and some differences between the possible actions.

Overall, the toolbox covers many different areas of automation and there are almost infinite permutations of how these things can be combined:

Now let's look at the specific process to implement the automation I need.

Getting started with Power Automate Desktop

Like other desktop automation software, Power Automate Desktop brings a recorder to allow you to record the screen as a one-time operation to create your automated process. Whilst you *can* create your automation by directly dragging and dropping actions from the toolbox, in most cases you'll use the recorder somewhere in there. In fact, there are two:

  • A web recorder
  • A UI recorder
These are represented as two icons at the top of the designer:

As we're talking about this distinction, here's a tip:

Tip #1 - Web automation vs. UI automation
Since it's web automation I need to perform, using the web recorder is likely to give me the best results since it has a deeper understanding of browsers, web page structure and HTML input elements such as textboxes, radio buttons and dropdowns. The UI recorder is more suited to automation of desktop apps. This rule of thumb works in most cases, though I could imagine the occasional need to use UI automation in the browser (e.g. for a particularly complex web UI where pixel locations are more effective than navigating a DOM structure or dealing with content represented in images).

When using the web recorder you choose which browser to use - regardless, you'll always need the Power Automate Desktop browser extension installed and enabled (here are the links for Edge and Chrome). You select your browser of choice to get started:

Tip #2 - Chrome vs. Edge
At the moment, web automation with Chrome seems to be much more reliable than with Edge. The notorious "failed to get window" error seems much more common with Edge, and operations that fail in Edge tend to just work in Chrome. For now, I recommend switching to Chrome if you're trying with Edge but run into unexplained challenges.

As you record the steps you want to automate, they are captured by the Web Recorder:

In the end, my automated process for blocking the websites looks like this:

Tip #3 - use variables in your automated steps
Whilst the recorder will do a good job of detecting text you type into forms, replacing these with variables to more clearly separate out the values is a good idea. They become easier to replace and less buried in your script - just like in code.

In my case I have variables for the two websites I want to block and my Alexa access code:

The result

The video below shows the automated process in action - some observations:
  • It happens very fast :) 
  • It's hard to see what happens because some UI elements are off-screen or barely in the screen - however the web automation doesn't care about this 


Reversing the process - unblocking the sinful websites


As you might expect, I have a similar process to reverse the actions - in essence I have two distinct Flows (which can be scheduled independently):

In my case, unblocking is a little more challenging because I have to identify the correct sites in the list. This proved to be one of those web automation challenges where it takes more than the recorder to nail the steps. I found myself with two options:
  • Identify the right CSS selector to find the element with "roblox.com", and navigate in the DOM to a sibling element which is the "Remove" hyperlink
  • Leverage the fact that the websites I'm removing will always be the last ones in the list. This was far simpler!
In the end I used Power Automate Desktop's ability to run some JavaScript on the page you're automating - very powerful! My actions looked like this:
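
As a rough illustration of the two options above, here is a minimal JavaScript sketch. It is not the exact script from the flow, and the markup is an assumption: it supposes each blocked site sits in a table row (`<tr>`) with an `<a>` whose text is "Remove". The function names are hypothetical too.

```javascript
// Sketch only: assumes each blocked site is a table row (<tr>) containing
// the site name and an <a> whose text is "Remove". This markup is hypothetical.

// Returns every "Remove" link under `root`, in document order.
function findRemoveLinks(root) {
  return Array.from(root.querySelectorAll("a")).filter(
    (a) => a.textContent.trim() === "Remove"
  );
}

// Option 1: locate the row that mentions `site` and return its "Remove" link,
// or null if there is no such row.
function findRemoveLinkFor(root, site) {
  const row = Array.from(root.querySelectorAll("tr")).find((tr) =>
    tr.textContent.includes(site)
  );
  if (!row) return null;
  return findRemoveLinks(row)[0] || null;
}

// Option 2: click the "Remove" links of the last `count` entries in the list.
function removeLastSites(root, count) {
  if (count <= 0) return 0; // slice(-0) would otherwise select every link
  const targets = findRemoveLinks(root).slice(-count);
  // Work from the bottom up so earlier entries keep their positions.
  targets.reverse().forEach((a) => a.click());
  return targets.length;
}
```

On a real page, `removeLastSites(document, 2)` would click the last two "Remove" links. If the page redraws the list after each click, the links would need to be looked up again before every click rather than collected once up front.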

The final step - announcing to Alexa

Since the whole problem scenario is the fact that I'm often on calls, I needed to signal to the kids that gaming was either available or unavailable. Alexa automation is a world I hadn't really dug into until now, but I found there are essentially two ways of getting what I wanted -
  • Alexa Notifications - this pings the Alexa devices, but only in a "you have a notification" way. To listen to the notification, someone has to say "Alexa, read me my messages", which is sub-optimal for what I wanted
  • Alexa Routines - this allows you to do any number of things, including make an announcement in the true sense with no "pull" of the message required
To integrate with Alexa Routines, I used the Virtual Buttons Alexa skill - this is a service which abstracts the triggering of Alexa Routines and gives you a REST endpoint to call. If you want more than one virtual button, and I did because I wanted two different announcements, the service is chargeable but the cost is fairly negligible and it does seem to simplify things.

There's some setup work to do, but the steps are documented in an e-mail the service creators send. Part of the process involves defining the Routine in the Alexa app - in this case, to announce "Sorry kids, gaming is blocked" or similar:   

To integrate this into my automated process, I made use of the "Run PowerShell script" action in Power Automate Desktop and called the REST API through Invoke-RestMethod, passing the appropriate JSON in the body to make the blocked/unblocked announcement as needed (via the Virtual Button ID):
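
The flow itself uses PowerShell's Invoke-RestMethod for this call. Purely as an illustration of the request shape, here is a minimal JavaScript sketch of an equivalent call. The endpoint URL is passed in rather than hard-coded, and the JSON field names (`virtualButton`, `accessCode`) are assumptions rather than the service's documented contract, so check the setup e-mail from the service for the real values.

```javascript
// Sketch only: the JSON field names below are placeholders, not the
// documented Virtual Buttons API. The endpoint URL is supplied by the caller.

// Builds the request description without sending anything.
function buildAnnouncementRequest(endpointUrl, accessCode, virtualButtonId) {
  return {
    url: endpointUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        virtualButton: virtualButtonId, // which button (and so which Routine) to trigger
        accessCode: accessCode,         // the access code issued by the service
      }),
    },
  };
}

// Sends the request using the standard fetch API (browsers, Node.js 18+).
async function announce(endpointUrl, accessCode, virtualButtonId) {
  const { url, options } = buildAnnouncementRequest(
    endpointUrl,
    accessCode,
    virtualButtonId
  );
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Announcement request failed: ${response.status}`);
  }
  return response.status;
}
```

With two virtual buttons, one mapped to each Routine, the blocking flow would call `announce` with one button ID and the unblocking flow with the other.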

Summary

Solution complete! I now have an automated process which opens a browser, drives the keyboard and mouse to make the changes to my ISP settings, and makes an Alexa announcement when done - fantastic.

Except at this point, things aren't fully automated in an unattended sense - up to now, I have a button to click in Power Automate Desktop which will fire up the browser and execute the steps. I have to press the button manually though. The next step is to make this an automated process that runs from the cloud on a schedule. So next time, we'll talk about:

  • Turning this into a cloud Flow which runs on a schedule
  • Licensing considerations for Power Automate Desktop
  • Authentication challenges and solutions
For the last point, I'm talking about things like this - Captcha image verifications which are designed to foil malicious automation:



More on this next time!