Tuesday, 17 December 2019

Using Gatling to dynamically generate lots of complex JSON

This is a quick post sharing some recent work I’ve done to investigate using Gatling to generate large amounts of load, specifically in the form of complex JSON documents. As the JSON documents are complex, with nesting, relationships and logic, I didn’t want to use the usual method of string replacements with Gatling session variables. I wanted to construct the JSON in code so I could make use of helpful code patterns and practices to make it easier to build and maintain.
At the same time, I wanted to make use of Gatling’s scenario functionality because it’s a useful way of modelling and shaping data in a realistic manner, and it also gives you a lot of load-generating code for free. I already knew it was possible to have Gatling call and use Scala code, as I had done it before.


The code structure and building JSON

You can find the code here:
https://github.com/matthewbretten/gatling-json-generator

The first point of entry for the code is the “TestSimulation.scala” file which defines and executes the Gatling session. I have included a simplistic e-commerce example, where there are two main user stories - your casual shopper who buys 1 or 2 items and big spenders who buy lots of items. In the comments, you can see an example of how I use this to control the load - letting me define a scenario where we have lots of casual shoppers regularly sending data, whereas big spenders are rarer and only occasionally send data.

The key parts for this post are the feeders (defined by “.feed”), which pull data defined by a Scala object imported into this test. This is how I bring JSON objects defined by Scala code into the Gatling session.

If you follow this code, you will see how I’ve written Scala case classes that define the shape of the JSON (under the folder “objects” in my code) and I’ve written Scala objects that define how to generate their respective classes. This gives me a nice separation of maintaining the JSON structure and modifying and maintaining the data set I populate the JSON with.


Relational data in JSON

Sometimes you need to create JSON that has a relationship, such as a variable that sums the numbers of other items or a summed price. If you look at the ItemGenerator code you can see how I’ve been able to dynamically generate a random list of items with random prices but still have a related field “totalPrice” correctly equal the sum of the individual item prices.
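
As a language-neutral illustration of that idea, here is a minimal sketch in Python rather than the repository's Scala. The names `Item`, `Order` and `generate_order` are hypothetical and are not taken from the ItemGenerator code; the point is only that the related field is derived from the generated items rather than generated separately.

```python
import json
import random
from dataclasses import asdict, dataclass
from decimal import Decimal


@dataclass
class Item:
    name: str
    price: str  # stored as a string so the JSON carries an exact decimal value


@dataclass
class Order:
    items: list
    totalPrice: str


def generate_order(rng, max_items=5):
    """Build a random order whose totalPrice is derived from its items."""
    items = []
    for index in range(rng.randint(1, max_items)):
        pence = rng.randint(100, 9999)  # random price between 1.00 and 99.99
        items.append(Item(name=f"item-{index}", price=str(Decimal(pence) / 100)))
    # The related field is computed from the items, so it can never drift.
    total = sum((Decimal(item.price) for item in items), Decimal("0"))
    return Order(items=items, totalPrice=str(total))


def to_json(order):
    """Serialise the nested dataclasses to a JSON document."""
    return json.dumps(asdict(order))
```

Passing in a seeded `random.Random` keeps a run repeatable, which can help when you need to reproduce a particular payload.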

Generating random data outside of Gatling’s DSL
In addition to this JSON object generation, I’ve also included some examples of generating random data. Why not use Gatling’s features for this? Well, because I want to define the JSON up front, I can’t use Gatling functions without starting a new Gatling session, which you cannot have multiple instances of. So I wrote some of my own code to allow me to randomly generate things, like loading values from files.
Why did I copy the function “RandomIntBetween” from scala.util.Random? This is because it’s only available in Scala 2.13 and currently Gatling only works with 2.12.


Debugging Gatling

Also as an aside, I’ve included comments on how to print out Gatling session variables during its run. Gatling can be difficult to debug and sometimes it’s useful to see and tweak its behaviour during the run.

Summary

I hope someone finds this useful, I had fun writing this and learned a lot about Gatling and Scala in the process, and I will for sure be referring back to this code in future. I also found myself refactoring this code a whole lot more in the process of uploading and sharing it!

Thursday, 12 September 2019

Testing in DevOps

I've just spent the past year embedded in a "devops" team (quotation marks explained later) and I've got a few different points to make, so bear with me, this is going to be a long post. It's also a bit of a brain dump, so it might not be my best writing ever, as I want to write this while it's relatively fresh in my head.
This post is also going to be a little technical and assume some knowledge of DevOps, if you're new to the phrase, I highly recommend Katrina Clokie's book "A Practical Guide to Testing in DevOps" found here - https://leanpub.com/testingindevops

That word "Devops"

In my experience, there are two different understandings of the word/phrase "devops". Basically it boils down to:
  • "Devops" is not a role, it's a set of practices, which makes it a bit woolly and vague but in general is about bringing two traditionally separate roles together so that a "team" can both deliver (for example) a software application and its hardware but also maintain, operate and support it in production/live/whatever you want to call it. This can be achieved by training the team in operations or by embedding ops engineers.
  • "Devops" is where developers write infrastructure-as-code. These are typically Ops engineers interested in programming, but sometimes also software programmers interested in getting their hands dirty with more Opsy work. In this definition, you tend to see this become a role called "Devops engineer" and they tend to write re-usable chunks of code that build infrastructure for software teams to use. For example, creating a generic set of code that provides a MySQL cluster in AWS.

I highlight this because I've realised people aren't aware that there is this difference, and personally I prefer to encourage the former rather than the latter. The latter is a bit like SDETs/Developers in Test, where you're creating a whole new role with extra communication overhead and distance from the end goal, writing tests that are broken by the dev team because they have no idea about them. The first definition I like because it's about teamwork and delivery, encouraging the team to take a more holistic view of software delivery, or rather product delivery, as a whole. After all, who cares if your code is shit hot if we've put it on hardware too small to run it?

Cloud and infrastructure-as-code

Regardless of those definitions, if you're going to work with cloud-based infrastructure (as opposed to on-premise, where your company owns or rents the physical servers), you're going to be writing infrastructure-as-code. Why? Because in the cloud you are sharing your hardware with other companies and people, which means you have less control over how it operates and this changes the risk profile. The cloud is cheaper than owning the physical servers but this comes at the cost of reliability.
Therefore you may want to automate many aspects of operating your product, such as recoverability and scalability, which is where infrastructure-as-code comes in.
There are two main areas that can be expressed in code:

  • Code for provisioning the infrastructure you need e.g. the hardware, the network routing, firewall rules and so on.
  • Code for configuring an individual server, e.g. setting up users and file permissions, installing software, configuring software (such as setting up Java and then setting up a Java based application to run).
Why consider these two separate areas? In my opinion there is so much to consider and think about in both that it's worthwhile considering them in their own discussions, even though you will want to develop and test them together.

Testing in DevOps

So what to test? How do you test? What tools do you use? Is there anything to really test?
Hell yes there’s loads to test, you’ve now got code that builds the foundations of your product, and not just that, but it’s code that defines how your product will scale and recover and also determines its reliability. Suddenly a developer can potentially make a small change and open up all of your servers to the public.
These are some ideas of what you can test in an Ops world:

  • You can manually run and try out the code as an obvious starting point. Does it build the servers correctly? Can you use the application after destroying it and building it again?
  • Destroying the servers leads to OAT (Operational Acceptance Testing) or general operability. Testing what happens in disaster and failure scenarios. Will your product recover if the servers suddenly disappear and new ones are built? Do you test your backups regularly? This also neatly leads to ideas such as chaos engineering.
  • The code itself can be sort-of unit tested. For example, Ansible has a framework called Molecule which allows you to run your Ansible scripts against a Docker container and assert what state the scripts will leave a server in. There are also broader integration test tools such as Test Kitchen, which has slightly more capabilities.
  • Using tools such as AWS Trusted Advisor or Well-Architected to analyse your infrastructure for common mistakes (such as setting up firewall rules completely open to the public) or under-utilised hardware that could be run more cheaply.
  • Given cloud infrastructure is inherently prone to failure, can you monitor and alert on those failures? Do you know if your servers fell over overnight? How many errors are happening in your environments? Usually cloud providers don’t have access to your servers to know what is going on, so you need to set up your own access to software logs (e.g. Java app errors). Have you centralised these logs for easy access?
  • Tools like Sensu allow you to write custom automated checks to monitor your servers. This is very useful for more granular checks like specific software health checks (e.g. your server never failed but the software application has crashed, can you tell from your monitoring?). I think there is a lot of value here for testers to help design, write and create new simple but smart checks and improve how observable systems are, not just in production but in all environments!
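
To give a feel for what a simple but smart check might look like, here is a minimal sketch in Python. It assumes the usual Sensu check convention of exit status 0 for OK, 1 for WARNING and 2 for CRITICAL; the function names and the one-second threshold are hypothetical, and fetching the health endpoint is deliberately left out so the decision logic stays easy to test.

```python
import sys

# Exit statuses a monitoring check is expected to return.
OK, WARNING, CRITICAL = 0, 1, 2


def check_health(status_code, response_seconds, warn_after=1.0):
    """Map one health-endpoint observation to (exit_code, message).

    status_code is None when the application did not respond at all.
    """
    if status_code is None:
        return CRITICAL, "CRITICAL: no response from health endpoint"
    if status_code != 200:
        return CRITICAL, f"CRITICAL: health endpoint returned {status_code}"
    if response_seconds > warn_after:
        return WARNING, f"WARNING: healthy but slow ({response_seconds:.2f}s)"
    return OK, "OK: application is healthy"


def run_check(status_code, response_seconds):
    """Print the message and exit with the matching status."""
    code, message = check_health(status_code, response_seconds)
    print(message)
    sys.exit(code)
```

Keeping the decision in a pure function like `check_health` means the check itself can be unit tested, which helps avoid monitoring checks that have bugs of their own.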


Some other things I’ve worked on that were very context specific, and not very DevOpsy or Testing, but could be useful ideas:

  • Creating Jenkins pipelines to test out rebuilds of infrastructure-as-code on a daily basis - this was to cover a specific risk we had that our code was very tightly coupled and we were breaking a lot of projects with our changes. This was an interim solution until we un-coupled our architecture but it was useful to be able to do it.
  • Off the back of Jenkins pipelines for rebuilding infrastructure-as-code, I also created jobs that would manage downtime periods to help save a good two-thirds of our monthly costs. In general we only needed our test environments to be up 8 hours a day, 5 days a week, not 24-7. I used Jenkins so I could manage dependencies and run custom checks but this could also be achieved with the right auto-scaling policies in AWS too.
  • Fixing and writing my own Sensu checks. This was useful, although I would be careful as it’s easy to write a check that produces false positives or negatives. It’s very hard to think of all the scenarios the check could encounter, so avoid writing scenario-based checks where possible. It’s not helpful to have monitoring checks that have bugs themselves and are difficult to debug when they fail, so keep them simple.
  • Hooking together Dev team Selenium tests. This was because my context was an Ops team changing infrastructure for Dev teams. I wanted a way to test our changes before we potentially broke the Dev teams’ dev environments. This isn’t recommended if you can avoid it, as obviously it’s not very DevOps. But in general, finding a way for infrastructure-as-code to be eventually end-to-end tested in an automated way is useful because it’s hard to really test things like firewall and network routing configuration or file permissions until you actually try to perform certain actions from the server. The hard part is knowing when to run the tests: you need to know when your infrastructure-as-code is finished and when the servers and various components have actually completed their setup and the app is running. I achieved this with a Jenkins pipeline which polled a Sensu check that would look at a health endpoint from the app. When this went green I knew to proceed with the test, and it would time out if it took longer than usual.
  • Writing simple scripts for monitoring or analysing our AWS account. In our context we needed to tag the hardware we were using for our own internal billing purposes so we could appropriately budget for certain projects. As this relied on humans remembering to include the tagging in their infrastructure-as-code, it was useful to regularly audit the account for servers that were missing tags and therefore wouldn’t be billed appropriately. This also made it easier to investigate under-utilised servers and talk to the owners about saving costs.
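
The tag audit in the last point can be sketched roughly as below. This is an illustration, not the script I used: it assumes the instance records have already been fetched and have the shape that the EC2 describe-instances call returns (an `InstanceId` plus a `Tags` list of `Key`/`Value` pairs), and the tag names passed in are hypothetical.

```python
def find_untagged(instances, required_tags):
    """Return {instance_id: [missing tag keys]} for instances missing a required tag."""
    report = {}
    for instance in instances:
        # Instances with no tags at all have no "Tags" key in the record.
        present = {tag["Key"] for tag in instance.get("Tags", [])}
        missing = [key for key in required_tags if key not in present]
        if missing:
            report[instance["InstanceId"]] = missing
    return report
```

Running something like this on a schedule and posting the report to the team gives owners a prompt to fix their infrastructure-as-code before the billing run.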

Monday, 26 August 2019

Using Bowtie Diagrams to describe test strategy

Introduction

A long time ago in this blog post I was introduced to the Bowtie Diagram. I love how this visualises how we manage risks and I feel this complements test strategy. Why? Well surely our test strategies should be accounting for risk and how we manage it. Whether you define testing in a specific focused sense (like functionally testing code) or in a holistic or broader sense (like viewing code reviews as testing, or monitoring or simply asking the question “What do end users want?”) - these activities are ways of either preventing or mitigating risks. I feel if you want to be effective in helping improve the quality of a project or product, you need to assess the potential risks and how you are going to prevent or mitigate them and therefore also assess where your time is best placed.
What are Bowtie Diagrams?
The short version - it’s a diagram that looks like a bowtie, where you describe a hazard and the top likely harmful event, then list the threats that will trigger the event and the consequences of the event happening. For each threat and consequence, you describe a prevention or mitigation.



The long version (a better, more comprehensively described version) can be read here - https://www.cgerisk.com/knowledgebase/The_bowtie_method

What I love about these diagrams is that they more visually describe and explain risks and how we intend to manage them. Creating them is a useful exercise in exploring risks we may not have normally thought of, but in particular, I find we don’t explore the right hand side (consequences) of these diagrams very often. I find most of the time in software development that we are very reactive to consequences, and even then, we don’t typically spend much time on improving mitigation.


Managing the threats
I’ve started using these diagrams to explain my recent approaches to test strategy because they neatly highlight why I’m not focusing all of my attention on automated test scripts or large manual regression test plans. I view automated test scripts as a barrier to the threat of code changes. Perhaps the majority of people out there view this as the biggest threat to quality, perhaps many people understand the word “bug” to mean something the computer has done wrong. These automated test scripts or regression test plans may well catch most of these. But are these the only threats?

I see other threats to quality and I feel we have a role to play in “testing” for these and helping prevent or mitigate them.

Managing the consequences
Any good tester knows you cannot think of or test for everything. There are holes in our test coverage, we constantly make decisions to focus our testing to make the most of precious time. We knowingly or unknowingly choose to not test some scenarios and therefore make micro judgements on risk all of the time. There are also limits to our knowledge and limits to the environments in which we test, sometimes we don’t have control over all of the possible variables. So what happens when an incident happens and we start facing the consequences? Have we thought about how we prevent, mitigate or recover from those consequences? Do we have visibility of these events occurring or signs they may be about to occur?
I find this particular area is rarely talked about in such terms. Perhaps there is some notion of monitoring and alerting. Usually there are some disaster recovery plans. But are testers actively involved? Are we actively improving and re-assessing these? I typically find most projects do not consider this as part of their strategy, in most cases it seems to be an after-thought. I think most of this stems from these areas typically being an area of responsibility for Ops Engineers, SysAdmins, DBAs and the like, whereas as testers we have typically focused on software and application teams. As the concept of DevOps becomes ever more popular, we can now start to get more involved in the operational aspects of our products, which I think can relieve a lot of the pressure on us to prevent problems from ever occurring.


Mapping the diagram to testing
An example of using a diagram like this within software development and testing:







I feel our preventative measures to an incident occurring are typically pretty good from a strategic view, especially lately where it has become more and more accepted that embedding testers within development teams improves their effectiveness. Yes, maybe we aren’t writing unit tests or involving testers in the right ways sometimes still. But overall, even with such issues, efforts are being made to improve how we deliver software to production.

But on the right-hand side, we generally suffer in organisations where DevOps has not been adopted. And when I say DevOps, I don’t mean devs that write infrastructure-as-code, I mean teams who are responsible for and capable of both delivering software solutions and operating and maintaining them. Usually, we see the Ops side of things still separated into its own silo, with very little awareness of or involvement in their activities from the software development team. But Ops plays a very key role in the above diagram because they tend to be responsible for implementing and improving the barriers or mitigations that help reduce the impact of an incident.

I feel the diagram neatly brings this element into focus and helps contribute to the wider DevOps movement towards a holistic view of software development, towards including aspects such as maintenance, operability, network security, architecture performance and resilience as qualities of the product too.

As testers I feel we can help advocate for this by:

  • Asking questions such as “if this goes wrong in production, how will we know?”
  • Requesting access to production monitoring and regularly checking for and reporting bugs in production.
  • Encouraging teams to use a TV monitor to show the current usage of production and graphs of performance and errors.
  • If you have programming/technical skills, helping the team add new monitoring checks, not dissimilar to automation checks. (e.g. Sensu checks)
  • Becoming involved with and performing OAT (Operational Acceptance Testing) where you test what happens to the product in both expected downtime (such as deploying new versions) and disaster scenarios, including testing the guides and checklists for recovery.
  • Advocating for Chaos Engineering.

Friday, 17 May 2019

Using character recognition for browser automation

Introduction

I was venting recently to my colleague Chris Johnson about the frustration of working with yet another horrendous website that didn’t have any usable locators and he put to me the idea of “what if we had visual driven automation, where the automation treats the website as a complete black box, like a human user?”. It got me thinking, surely this could be possible already in some respect. As I am already very familiar with AWS (Amazon Web Services), I had a look if they had any services that could be useful for this and found their brand new Textract service, which lets you send image documents (.jpg, .png or .pdf) and analyses them for text. This then gave me the idea to try and use this service with Selenium to find elements on a page and click them without any need to interact or understand the HTML. This post is a report on my findings.

What is Textract and how does it work?

Textract is a type of OCR (Optical Character Recognition) service that detects text and data in image documents. You supply it with an image file and it responds with the results of its analysis: a list of words, sentences and objects (like forms and tables) that it has identified. Each word or sentence has a set of data including its location in terms of a rectangular box and a percentage confidence in the accuracy of its findings. Behind the scenes it manages this via a pre-trained machine learning algorithm.





 

As with other AWS services, you can access Textract via the AWS CLI or an AWS SDK, so for my research I used the boto3 library for Python3 (the AWS SDK for Python), which lets you call AWS services from within Python very easily.

When you are using Textract, you receive JSON responses that look like this:

{
    "Blocks": [
        {
            "Geometry": {
                "BoundingBox": {"Width": 1.0, "Top": 0.0, "Left": 0.0, "Height": 1.0},
                "Polygon": [
                    {"Y": 0.0, "X": 0.0},
                    {"Y": 0.0, "X": 1.0},
                    {"Y": 1.0, "X": 1.0},
                    {"Y": 1.0, "X": 0.0}
                ]
            },
            "Text": "Store",
            "Confidence": "99.1231233333",
            "BlockType": "WORD"
        }
    ]
}
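
A response shaped like the one above can be searched for a piece of text with a few lines of Python. This is a minimal sketch rather than the code from my script: `find_words` and the 90% confidence threshold are hypothetical, and the `float()` call is only there so the function copes with the confidence arriving either as a number or as a string.

```python
def find_words(response, text, min_confidence=90.0):
    """Return the BoundingBox of every WORD block whose text matches."""
    matches = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue  # skip pages, lines, tables and so on
        if block.get("Text", "").lower() != text.lower():
            continue
        if float(block.get("Confidence", 0)) < min_confidence:
            continue  # ignore text the service was unsure about
        matches.append(block["Geometry"]["BoundingBox"])
    return matches
```

Returning every match rather than the first one makes it obvious when the same text appears more than once on the page, so the caller has to decide which one to use.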

My test

So the idea for my test was to create a simple script that would:
  1.     Use Selenium to load a website.
  2.     Take a screenshot.
  3.     Send the screenshot to Textract.
  4.     Find the location of some text on the website that would take us to a new page.
  5.     Click the location on the website.
  6.     Check that we had gone to the correct new page.
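
Steps 4 and 5 depend on turning Textract's bounding box, which is expressed as ratios of the image width and height, into a pixel position to click. A minimal sketch of that conversion is below; `click_point` is a hypothetical name, and it assumes the screenshot maps 1:1 onto the browser window.

```python
def click_point(bounding_box, image_width, image_height):
    """Return the (x, y) pixel at the centre of a Textract BoundingBox."""
    # Left/Top/Width/Height are fractions of the image size, not pixels.
    x = (bounding_box["Left"] + bounding_box["Width"] / 2) * image_width
    y = (bounding_box["Top"] + bounding_box["Height"] / 2) * image_height
    return round(x), round(y)
```

Aiming for the centre of the box rather than its top left corner gives a little tolerance if the detected box is slightly off.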

Pre-requisites

In order to get this test working, this is what I needed:
  • A computer with Python3 installed and the relevant libraries (boto3, Selenium).
  • Chromedriver and Chrome installed.
  • An AWS account with access to Textract. At the time of writing Textract is in preview and you need to ask Amazon nicely for access. It took about a week for me to get access.
  • An AWS IAM (Identity and Access Management) user on my AWS account that has the permissions to interact with Textract.
  • AWS CLI installed on my machine and configured with the IAM user and an AWS region that Textract is available in (it’s currently only available in a few regions).
I chose to use Python because I’m very familiar with it, and it allows me to write a simple script like this very rapidly with little setup needed. You can use any programming language, you just need to be able to call the AWS API, for example through one of the AWS SDKs or the AWS CLI.

My findings

Did it work? Yes, effectively. I was able to create a script that did all of the above and successfully navigate between pages...but with a very big caveat - only if the text appeared in the top left corner of the page in any resolution. Why? When you request a screenshot from web browsers, they do not take a screenshot of what you can see, but instead they take a full page screenshot. This means you cannot easily map the pixel locations from the screenshots to the browser window, particularly when the real browser window is a smaller resolution and the website dynamically changes visible location of elements. Even at larger resolutions there is a small difference in resolution sizes because you almost never display the full page in a window.

This means that my idea has limited usefulness until I could find a better way of generating screenshots that are just the visible area rendered by the browser.

However, in principle my idea did work if you can provide an image that maps 1:1 with the browser window, but there are still other issues. I tried testing against different websites just to see what other issues would crop up:

  • Naturally you need relatively clear text in order for Textract to confidently find it. However, it could be argued that unclear text would be an accessibility issue anyway, and in the below screenshot you can see Textract does a reasonable job of even reading unusual fonts or text that isn’t perfectly straight.
  • Obviously if you want to find elements that are not text, Textract isn’t useful at all. Potentially other pattern recognition services could be useful though if you could provide it an image pattern to find rather than just a text string.
  • It could be tricky to figure out which is the right element if there is more than one example of some text on the page.
  • While Textract is quite fast (roughly 2 seconds or so for it to return a response), some dynamic elements of the page could change in that time. Some websites have scrolling elements or pop-ups that only appear after a short time. This means you might not be able to run a totally black box test, as you need to check for these elements before proceeding.
Due to these issues, I don’t think you could totally remove the need for HTML locators yet. But there is certainly some potential here to expand the abilities of automation tests and maybe make parts of them more powerful. Even if you don’t use this tech to replace locators for Selenium, there are other applications that might not have been easy before - such as testing PDF files generated by the systems you test.

Other considerations

If you were to consider using this tech in your automation, there are some other factors to consider:
  • Costs, AWS charge for the use of Textract. Depending on your context and how you use it, this may be a limiting factor. The pricing can be found here: https://aws.amazon.com/textract/pricing/
  • Data privacy, AWS state they may store and use the documents you upload to maintain and improve the machine learning algorithm behind Textract. You may work in a workplace where the screenshots contain sensitive information that AWS employees are not authorised to look at. You can request AWS delete documents you upload, but this is a manual process of contacting their support team.
  • While I did find Textract responded quite quickly (around 2 seconds), I don’t know its capabilities under significant load nor did I hit the API request limits. I was only running very low numbers of requests.
  • Textract is currently only available in certain AWS regions, so if you require data to be held in particular geographic locations then this may be a limiting factor.
Of course, other services similar to AWS Textract may be available which overcome these factors.

Summary

This was a fun little exploration into technology I’ve not used before and it was nice to find it so easy to use. I think there is some potential in using this service for automation tooling, particularly if you need to analyse image documents.

If you’re looking to totally remove the need for HTML locators, I think there is more work required though, if you know of a good solution to the browser screenshot resolution problem let me know!

My code

Here’s my code if you’d like to try it yourself:

https://github.com/Ardius/python-selenium-textract/blob/master/textract_test.py

Wednesday, 1 November 2017

So you can test an API, what to learn next?

Introduction

Last week I ran my first ever workshop at a conference for TestBash Manchester! It was an awesome experience, totally different to the talks I’ve done before at meetups and smaller conferences. The workshop was all about how to get started with web API testing and I targeted it at beginners who had no prior knowledge of APIs. For this post though, I’m not going to talk about the workshop, but more about what happens next. Several people have asked me about a more advanced workshop and what they could study in their own time next. I don’t have a quick or easy answer to this as I feel there is lots more to learn and it really depends on your context. However, I’m going to try and discuss some areas and ideas.

Where are we starting from?

Before I get started, I’d like to clarify where my workshop left people and what this post is assuming you’re already familiar with.
  • What APIs are.
  • What APIs look like and how they work.
  • Why they are useful to understand for testing.
  • What API documentation looks like.
  • What paths and query strings are.
  • How methods work.
  • What status codes mean.
  • The concepts of authentication and authorisation.
  • How to create requests with authentication using Basic Auth.
  • The concept of resources and IDs.
  • What headers are.
  • What request and response bodies are.
  • An introduction to JSON & XML data formats and data types.
  • Using Postman’s basic features.
  • Understanding how Postman collections could be used.
  • Awareness of how basic automation can be created using Postman’s runner.
If you feel lacking in these areas, I would still spend more time on understanding the basics of these before starting on anything more advanced.


So with that, I’ll go over some areas that we could explore for a more advanced workshop or for you to explore in your own time.

Try testing other, more complex APIs

In my workshop, we learned to interact with a simple API that my friend Lee Goodman and I built together. This API was intentionally designed in a way that allowed attendees to learn in stages, introducing different concepts at each stage. However, this is not how an API will look in reality. You won’t have nice ways to learn it in stages, real APIs will immediately throw everything at you. You typically won’t be able to interact with most APIs without authentication (which will be more complex and won’t be the same for every API) and they will provide varying levels of quality of documentation.


One of the aspects of APIs that I didn’t cover in my workshop is that they are an abstract representation of the resources, objects, functions and capabilities of an application or system. In plain English this means when you learn to use an application via an API, the picture you build up in your head based on the API’s structure and responses is based on a translation of the system underneath the API. Just as with translating from Japanese to English, there are concepts that are not easy or even possible to express in an API. People also make mistakes in translation or have many different ways of creating the translation. It’s useful to get some experience of this by using more APIs, you may start to notice these differences and get a feel for what works well, what doesn’t and perhaps get some sense of the compromises made. You will also see where some of the language that even I use is not consistent across all APIs.


You can try out various public APIs for free, one example being Twitter’s API. You can find documentation for lots of public APIs you can try here:
There is also a simpler and neater API to play with produced by Swagger here:

Learn about more complex forms of authentication

In the workshop I only covered the basics of how to authenticate your requests in Postman with one of the simplest forms of authentication, called Basic Auth. There are many, many more types and technologies for authentication that you could learn about, some of them are very complicated and deserve an entire workshop in themselves! I don’t feel it’s necessary to understand them all because you probably won’t come across many of them. But it could be useful to understand the more popular (and secure) kinds of authentication such as OAuth 1.0 and OAuth 2.0.
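
Before moving on to those, it can help to see how little Basic Auth actually does. The sketch below builds the header by hand in Python: the username and password are joined with a colon and Base64 encoded, which is encoding, not encryption. The function name is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header that Basic Auth sends with each request."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Because anyone who sees the header can decode it straight back to the password, Basic Auth is only reasonable over HTTPS, which is part of why the more involved schemes exist.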

Learn about different kinds of headers

I briefly talked about headers in the workshop, mainly in reference to the “Authorization” header (which is the one Postman creates for you when you add authentication) and the “Content-Type” header which we used in the workshop to tell the server whether we were sending JSON or XML with our POST and PUT requests (again automatically generated by Postman). There are a couple more headers that you can experiment with when sending requests, such as the “Accept” header, which tells the server which format you would like the response in, in the same way that the Content-Type header describes the format of what you are sending. This means you can do weird stuff like send JSON data but demand the response is in XML. I have accidentally killed servers in the past with typos in my headers too! You can read more about different kinds of HTTP headers here.
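
Outside of Postman, the same experiment looks roughly like this in Python using only the standard library. It is a sketch under assumptions: the function name is hypothetical, the request is only built and never sent, and whether a server honours the Accept header depends entirely on the API.

```python
import json
import urllib.request


def build_request(url, payload):
    """Build (but do not send) a POST that sends JSON and asks for XML back."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",  # format of what we send
            "Accept": "application/xml",         # format we would like back
        },
    )
```

Swapping the two header values, or deliberately sending a Content-Type that does not match the body, is a cheap way to see how an API handles mismatches.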

Learn to use the more advanced features of Postman

Postman has a lot of neat features which can be used to augment your testing in different ways. Learning to use collections can allow you to create documentation of an API you’re exploring, which you can share with other testers and developers (particularly useful when a new person starts on a project, as you can give them a collection to get them up and running much faster). If you’re finding yourself repeating the same requests a lot, especially to create data, using Postman’s collection runner can allow you to create automated scripts of requests that quickly generate test data for you.


You can further extend the capabilities of your collections with automated checks which can be rapidly run and tell you if the API you are testing is ready for deeper exploratory testing or if there is something significantly wrong. You can do this by using Postman’s test scripts function. While these tests are written in JavaScript, it’s possible to write these scripts with little knowledge of JavaScript using the example snippets. However, it may be helpful to learn a little bit about JavaScript to get the most out of these scripts. You can learn JavaScript via sites like this one, which is a free 30-day coding challenge.


Combined with Postman’s pre-request script functionality, you can then create more complex collections using functions such as loops and branching. In addition to these, you can also learn about environments and variables, which let you parameterise data that needs to change every time you run a request. The most common example of using environments is where you have multiple test environments with different domain names (e.g. www.live.test.com & www.stage.test.com) but you don’t want to keep re-writing the requests.
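Postman writes variables as {{name}} inside a request and fills them in from the selected environment. The Python sketch below is not how Postman is implemented; it is just a small imitation of the idea, using the two example domains above. The https:// scheme, the /booking path and the booking_id variable are assumptions made for the illustration.

```python
import re

# One set of variables per environment, as you would define them in Postman.
ENVIRONMENTS = {
    "live": {"base_url": "https://www.live.test.com"},
    "stage": {"base_url": "https://www.stage.test.com"},
}

# Matches placeholders written in the {{name}} style.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def resolve(template: str, variables: dict) -> str:
    """Replace every {{name}} in the template with its value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value for variable {name!r}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


# The same request template works against either environment.
template = "{{base_url}}/booking/{{booking_id}}"
for env_name, env in ENVIRONMENTS.items():
    print(env_name, resolve(template, {**env, "booking_id": 42}))
```

A value such as booking_id would typically be captured from an earlier response and stored as a variable, which is the same mechanism that makes chaining requests possible.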


These features can allow you to chain requests together, so rather than manually copying an ID from one request to use in another, Postman can run the two requests together for you. This is a good blog post explaining how to do this.

Try integrating your Postman collections as part of a CI pipeline

One of the most popular topics in software development currently is DevOps and the related topics of CI (Continuous Integration) and CD (Continuous Delivery). Typically a team that works to these methodologies has a ‘deployment pipeline’ where they build their code and run unit and integration tests. If you work with a team like this, you can set up their deployment pipeline to run your Postman collections for you. This means that the collections that help you create test data, or check the environment is ready for deeper exploratory testing, can be run for you every time you create a new build of the codebase. The tool that allows you to do this is called Newman. Newman simply allows Postman collections to be run from the command line, which means any CI build tool, such as Jenkins, Bamboo, TeamCity or GoCD, can run them too. Here is a blog about how to do that with Jenkins.

Have a look at any existing APIs you may work with already

This seems obvious, but ask around about any APIs you might already work with; there may be systems you didn’t realise existed that you could have a look at. Or there could be third-party integrations that your team use within your application. There may be some existing documentation or even monitoring. It can be especially interesting to have a look at any monitoring you already have. Tools such as AppDynamics gather a lot of data such as API requests, their speed and responses. This can give you an insight into how people use those APIs and what problems may already be occurring.

Try out other tools

Postman isn’t the only way to interact with APIs. It’s a great tool and I especially like using it to teach with because it’s popular and has a nicer interface which isn’t as cluttered as some others. Your team may use a totally different tool, or you may need to use another tool in future, and they all have different strengths and weaknesses. So it may be useful to learn some other tools such as:
Another popular tool for interacting with APIs is JMeter; however, I highlight this separately as it’s actually a load testing tool. It can be used in a similar fashion to the Postman collections but is designed for running many, many concurrent requests and designing performance test runs. However, I have seen it successfully used as an automated functional checking tool, and it can be integrated into build pipelines too.

Write automated checks in a scripting or programming language

There will come a point where creating automated checks using Postman becomes very complicated or unwieldy. This is a point where it's typically easier to write them in a scripting or programming language. Why? Postman (and also JMeter) are GUI-based and so they enforce a particular pattern and design on your tests in order to work. Sometimes what you are trying to do doesn’t neatly fit their structure, or sometimes you want to integrate with more systems or perform functions they don’t provide.


Which programming language? It pretty much doesn’t matter: almost every popular programming language comes with libraries for making HTTP requests and frameworks for running tests. What you decide to learn should be guided by:
  • Your level of experience with programming.
  • Who is going to write and maintain these checks.
  • What languages people use around your workplace.
  • What languages are used for the application you are testing.
If you are working with a Java application and your team is happy to share the work and help out, it may make the most sense to write your automated checks in Java using frameworks like JUnit and libraries like REST Assured. This means they can be easily incorporated into the rest of the integration tests your team already has and removes the need to find more tools to run them in your pipeline.
However, you may not be closely working with a team like this or have developers to support you. You may have decided that it will be best for you to maintain the tests. In this case it’s more important to choose a language you are comfortable learning. In this scenario I personally like teaching people about Python and the requests library because Python can be an easier language for newbies learning to program. However, there are lots of other languages such as Ruby, JavaScript, C# and more, and none of them are bad to learn. They all have the capability to create these checks and much of what you learn will be transferable to other languages.
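To give a feel for what such a check can look like, here is a minimal Python sketch. To keep it runnable anywhere it uses only the standard library (urllib) rather than the requests library, and it starts a tiny stand-in server on your own machine instead of calling a real API. The /health path, the response body and all of the names are invented for the example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the API under test, so the sketch runs offline."""

    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the output quiet while the check runs


def check_health(base_url: str) -> dict:
    """The automated check: status code, content type and body must all match."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as response:
        assert response.status == 200, f"unexpected status {response.status}"
        content_type = response.headers.get("Content-Type", "")
        assert content_type.startswith("application/json"), content_type
        payload = json.loads(response.read())
    assert payload["status"] == "ok", payload
    return payload


def run_check_against_stub() -> dict:
    """Start the stub on a free local port, run the check, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return check_health(f"http://127.0.0.1:{server.server_port}")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_check_against_stub())
```

In a real project you would point check_health at your test environment and let a test runner such as pytest or unittest collect and report the checks.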

Learn how to work with mocks

Sometimes you may need to test an application that relies upon a third party API. Maybe it’s a website that hasn’t got a back-end finished yet. Perhaps you are working with lots of other development teams and some of the work has been finished early. In these situations it’s useful to be able to create pretend versions of these APIs so you can test as if they were there. These are referred to as ‘mocks’. You can have a play with websites such as this one to create a fake API that responds exactly the same as an application you want to fake. You can then point an application you are testing at it to begin testing against your contract.
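If you would rather run a mock on your own machine than use a website, the idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the /products paths and the canned responses are invented, and a real mock would mirror whatever contract the third-party API has agreed with you.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned responses keyed by path: (status code, JSON body). All hypothetical.
CANNED_RESPONSES = {
    "/products/1": (200, {"id": 1, "name": "Teapot"}),
    "/products/999": (404, {"error": "product not found"}),
}


class MockHandler(BaseHTTPRequestHandler):
    """Answers GET requests from the table above instead of running real logic."""

    def do_GET(self):
        status, payload = CANNED_RESPONSES.get(
            self.path, (501, {"error": "no canned response for this path"})
        )
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def start_mock():
    """Start the mock on a free local port and return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), MockHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"


def fetch(url: str):
    """Small client helper returning (status, parsed JSON), even for errors."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as error:
        return error.code, json.loads(error.read())


if __name__ == "__main__":
    server, base_url = start_mock()
    try:
        print(fetch(base_url + "/products/1"))
        print(fetch(base_url + "/products/999"))
    finally:
        server.shutdown()
        server.server_close()
```

Including error cases in the table is deliberate: a mock is often the easiest way to see how your application copes when the third party misbehaves.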

Learn about contract testing

Speaking of contracts and mocks, there are now tools that let you create these mocks in a more reliable and automated way. Tools such as Pact allow teams to run automated checks against each other’s services without having to understand how to run the services. While not strictly about APIs, learning how you could automatically check an API you provide for another team, or vice versa, can be a lot more useful than creating massive Postman collections or custom mocks that fall out of date as teams update the behaviour of their services.

This video gives a helpful explanation in a conversational style about Pact and contract testing (thanks Conny!):

Thursday, 17 August 2017

Best of the BSides - A friendly security conference in Manchester


Introduction

Today I attended a great little conference in Manchester called BSides Manchester. This was a free conference about security, run by members of the security community in a similar way to TestBash. In fact the whole event was a bit of a “SecurityBash” in so many respects, which is awesome, and I recognised many familiar topics, concerns and ideas. Whether you're experienced with security or a newbie, I highly recommend this conference. I went along with no expectations, just hoping to learn as much as I could and expose my brain to new ideas; even if I didn’t pick it all up immediately, it could give my brain a place to start. Not only did I actually learn quite a bit, but I also noticed a great many similarities to testing, so I thought I’d talk about the conference from that angle.

The similarities and parallels to testing

In no particular order:
  • The security community seem to be very keen to promote leaner and more effective ways of improving security, such as getting involved earlier and trying to be involved in discussions about new projects or approaches. This is exactly the same as with testing in general, and both are frustrated when they are only asked for their opinion very late in projects. Perhaps this is the biggest area we have in common and maybe we could share our experiences and lessons with each other. Perhaps we can also be allies on this: for example, where a tester has managed to get involved in the project early, we could be advocates for involving security professionals earlier too, and vice versa.
  • Carolyn Yates gave a great talk on the bowtie method which is very applicable to testing too and reminds me of how we look to use examples like mind-maps to effectively visualise our work. She also made the point that “not all tools need be programs, sometimes they can be visual aids” which I think as testers we can certainly appreciate too.
  • There was a great talk by Collette Weston about echo chambers - in particular the difficulty for women and other industry minorities to break into the InfoSec industry and community and what can be done about it. I think we can all agree this is an issue across the software industry as a whole too, and while I feel testing is a little bit better in this regard, it’s definitely not as good as it could be. This talk also prompted a great discussion about how some companies had started trying to diversify their security personnel (including hiring people with biology degrees), and I know in testing it’s well appreciated that we benefit greatly from our diverse backgrounds.
  • In two separate talks by Ian Trump and Charl Van Der Walt there were discussions of what the future might hold and how artificial intelligence and the advance of technology will shape the industry and the work of security professionals. It seems obvious, but I found it quite reassuring to know that it’s not just testers who are wondering how these advances will affect their jobs. There was also discussion of the effects of automation and whether people were really considering its effects on jobs and on how humans interact with and use the automation. This echoes the concerns I’ve heard many testers raise and reminds me of my old blog post on this subject.
  • Naturally there were several more technical talks focusing on particular types of hacks, attacks and penetration tests. This included discussions of how to defend against these attacks too. The mindsets and techniques that security professionals use to find and report these exploits are exactly the same as those testers use to find and report bugs. I think we have a lot in common on this subject (as, well, it is a form of testing) and I think we could do more to engage with the security community and share our experience - just as much as we can learn a lot from them too! All the things we talk about in testing were present here - such as trying to turn exploits into the most damaging problem they could find to justify and explain to companies why they need to fix it. I believe as testers we can also become more effective at general testing by learning about these exploits too - both in helping raise security issues earlier and in giving us more ideas for other kinds of testing. Perhaps we could share our knowledge, approaches and experience of exploratory testing with them.
  • Another common theme of the conference was that security was not really a technological problem, but a people problem. This is of course not a new revelation; there are many historical quotes and philosophical discussions, for example, “a bad workman blames his tools”, “pick the right tool for the job” and so on. However, as humans we clearly find it difficult to keep these lessons in mind, and it is easy with bias to miss that we are making assumptions about our problems. As testers I feel we should be very aware of this too, and typically many challenges we face are nothing to do with the particular technologies involved. Many software bugs are caused by humans, with machines simply doing as they are told, and the same applies to security exploits.

The differences

Of course, for all our similarities, there are also differences:
  • As part of the discussion about diversity in the industry from Collette’s talk, there was also discussion about autism and how there was a general belief that many “black hats” may have struggled at school, dropped out and only picked up hacking because they had no other options. It was pointed out that because many companies require specific levels of education (such as GCSEs), it meant there was no way for these individuals to become security professionals. Why is this different to testing? Well in the testing industry I don’t feel we have such a specific concern with autism (though it will definitely also affect the testing industry and community too!), I feel our concerns are more about increasing awareness of testing as a possible career in the first place!
  • I think this one is probably obvious, but the security community is more naturally technically focused and capable. In tandem with the above point, most people seem to join the industry because of their interest in it and interest in technology. As such, while there is diversity, I get the general impression the range of backgrounds is a lot narrower than the very broad backgrounds of testers. As a result I feel testers tend to be less technically focused and have a more balanced spread of soft skills to go with the technical subjects. That said, the conference did feature plenty of talks that were more about the soft skills, although probably with a different balance compared to some testing conferences.
  • I feel that the security community is even more aware of justifying their testing and explaining the effects of the exploits they find than the average tester, because of both the ethics of the testing and its very technical nature. Not only must they be very careful not to break laws or damage a company, but they also have to be very good at explaining why they think something is a significant problem and helping the company fix it. I think as testers we have a lot we can learn from this, not because we don’t do a good job of this, but because our testing is a lot safer and doesn’t always require as much explanation. However, I think this will change over time if we get more involved with DevOps, challenging requirements and testing in production.

You should go too!

All in all, it was a great conference; I took a lot away and enjoyed myself. It was very reassuring to see so many similarities to testing and to see ways in which we could work together. I hope to go to some other conferences in other areas around software development like Programming, Project Management, UX, Business Analysis, Operations and Systems Administration and continue learning from them. Maybe I can even begin talking about testing at their meetups and conferences and see more sharing across our disciplines.