The Tragedy of Given-When-Then (theitriskmanager.com)
119 points by wheresvic1 on April 16, 2019 | hide | past | favorite | 61 comments


Whereas:

> The reduction of the tester to an expert translator/typist is a tragedy

> The Given-When-Then detracts from understanding and readability but provides the much prized automation through tools like Cucumber

> Instead of abstract domain models in Rational Rose

Now I see how we got into this situation. All this misery comes from the people who understand the code and the people who understand the business being different, non-intersecting groups of people.

Since the BAs can't read the code, there is a desire to construct a technical representation that they can read, representing some kind of formal specification, and then derive as much of the program as possible from it. This was the promise of Rational Rose and the whole UML project. Had it been successful, it would have reduced programmers to stenographers, "coders" in a very limited sense. As it is, in a "Rational" system of this kind, coders end up producing increasingly elaborate polyfills between the machine-generated parts of the system and the rest.

The target of this is always the "5GL" dream:

     - specify program in English
     - ???
     - automated tools transform to code
... without programmers involved. Unfortunately we've not managed to reduce the irreducible complexity of the step in the middle.

GWT does this de-skilling to testers instead; it's effectively TDD with tests in this quasi-human-readable format that gets unimaginatively translated into programs by the testers.
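To make that "unimaginative translation" concrete, here is a toy sketch (the scenario, step names and regexes are all invented; real Cucumber works similarly but with far more machinery): each English line is matched by a regex and dispatched to a hand-written step function, which is where all the actual meaning lives.

```python
import re

# Toy Given-When-Then interpreter: each English line of the scenario is
# matched against a regex and dispatched to a hand-written step function.
SCENARIO = """
Given a balance of 100
When 30 is withdrawn
Then the balance is 70
"""

state = {}

def given_balance(amount):
    state["balance"] = int(amount)

def when_withdrawn(amount):
    state["balance"] -= int(amount)

def then_balance(amount):
    assert state["balance"] == int(amount)

STEPS = [
    (r"Given a balance of (\d+)", given_balance),
    (r"When (\d+) is withdrawn", when_withdrawn),
    (r"Then the balance is (\d+)", then_balance),
]

def run(scenario):
    for line in filter(None, map(str.strip, scenario.splitlines())):
        for pattern, step in STEPS:
            match = re.fullmatch(pattern, line)
            if match:
                step(*match.groups())
                break
        else:
            raise ValueError(f"no step definition matches: {line!r}")

run(SCENARIO)  # the English is "readable", but all meaning lives in STEPS
```

The scenario reads like prose, but changing the system still means editing the step functions, which is the de-skilled translation work the comment describes.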

The proposed solution is to do more communication through spreadsheet-prototypes. Since these are effectively programs in Excel, with all the real programming capabilities thereof, you get a real working model of the system that actually behaves like a program.

Many businesses simplify this process by then just deploying the spreadsheet to production.

(I'm not familiar with Cucumber, but it looks like a thing for constructing toy "human readable" DSLs for tests? I wonder if we could get BAs to learn INFORM?)


I haven't studied this deeply, but i see a deep irony here.

The old school business guys had a plan (or at least a wish) to get rid of programmers. Ironically, I now see the best programs being written by programmers who have no BA's (essentially becoming BA's themselves). It turns out that it was way easier to just get the BA's to talk the language of the computer, rather than the other way around.

From what I've read, the same is true in testing. The best BA's are people who are actively interested in testing, and willing to learn the language they need to best help out. You cut out the middleman, not by making translators, but by learning to speak the language natively.

As it often is, it's easier to change the people than to change the world.


I'm surprised you missed it.

Businesses solve this problem by hiring and training programmers in their business model and domain.

Coders are worthless. Coders with business acumen and domain knowledge are invaluable.

This is also why I will never work with anyone with a job title of BA. They are trying to do my job but can only do one half of it. They're a pointless bottleneck in the system.

And no, there is no place for BAs who can code, only those who do. They're called software developers.

At this point I don't know if software engineer, developer, programmer or coder is really the right term any more.

I'm more of a systems analyst. The amount of time I spend coding is minuscule and I don't think I'm alone in this.


That's a bit inflammatory... My experience with BA's is that they're like anyone else on a team. They are more skilled in certain areas, and they can provide a lot of value if their skills are used appropriately.

I've found BA's are very useful in the following areas specifically

- Communication with the market (either customers or in-market staff such as sales/support) to gather feedback about what they'd want

- Orchestrating decisions (asking all the dozen stakeholders what they want to do and arriving at a sensible conclusion)

- Arbitrating feature/UI decisions (people on the team aren't agreeing on whether we should do something or not)

- Documenting requirements (by this I mean the desired functionality that we believe customers need/want)

- Iterating and getting feedback on requirements

- there's more, but off the top of my head that's all I can think of

Sure, I as a lead developer have business knowledge and can also do the job of business analysis, but a dedicated BA will be able to spend more time doing that - increasing their skills, and freeing me up to do other things. If I take on all of the above tasks in a decent-sized project, it's going to suck up a huge amount of my time - in many cases almost all of it - so then basically I'm just a BA with a different job title. BA's are great if you learn to understand the role and how it can help a team.


All of those things are what a developer worth anything to a company should be doing.

Similarly, BAs without business domain knowledge are even more worthless because of this.

I.e. consultant BAs are, as most of us know, a waste of money.

BAs are struggling to find a language to describe their business requirements in a way the programmer or computer can understand.

It's code. It's what programmers have been writing all along. That is the god damn language they need.

Given when then? If then else.

Surprise!


Ricardo strikes again. Comparative advantage says that even if a developer is both a better coder and a better BA, they're better off spending their time coding and letting non-coders be BAs, to maximize overall throughput.


This assumes a BA is cheaper and more available than a programmer.

If they are cheaper, I guarantee you're going to have problems.

You're literally installing a weak link in the most crucial part of the chain.


BAs are by definition more available than developers, as all developers are BAs but not vice versa.

And given basic supply and demand they are cheaper.

Anyway, read your Ricardo; you're right but wrong about the implications. Your developer is an even weaker link in the chain, because good developers are even rarer than good BAs.


> BAs are by definition more available than developers, as all developers are BAs but not vice versa.

Plenty of developers are not BAs. BAs are systems analysts with expertise in requirements elicitation (basically, goal-directed interviewing) and technical writing. Plenty of developers are neither systems analysts nor skilled at interviewing or technical writing.


This point is brought up in the book Domain Modeling Made Functional [1], where the author contends that F# types (including functions) are readable to BA folk, which helps lower the communication barrier between them and devs. I highly recommend the book and agree with that statement at the highest level, though I could see actual implementation details going out into the weeds (result & async types etc).

I know other functional languages like Idris take this approach too, where the designers declare the input/output types and, as long as the implementation conforms, you have higher confidence in program correctness (at least contractually, at the API level). Interesting stuff.

[1] https://pragprog.com/book/swdddf/domain-modeling-made-functi...
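For readers without F#, here is a rough Python analogue of that modelling style (the type and field names are invented for illustration, not taken from the book): the domain is expressed as named types, and the function signature itself states which outcomes are possible, in a form a BA can plausibly read aloud.

```python
from dataclasses import dataclass
from typing import Union

# Illustrative domain types; an analogue of F#-style domain modelling
# using Python dataclasses and a union return type.

@dataclass(frozen=True)
class UnvalidatedOrder:
    order_id: str
    quantity: int

@dataclass(frozen=True)
class ValidatedOrder:
    order_id: str
    quantity: int

@dataclass(frozen=True)
class ValidationError:
    reason: str

# The signature states the rule: validation either yields a ValidatedOrder
# or a ValidationError, and nothing else can happen.
def validate_order(order: UnvalidatedOrder) -> Union[ValidatedOrder, ValidationError]:
    if order.quantity <= 0:
        return ValidationError("quantity must be positive")
    return ValidatedOrder(order.order_id, order.quantity)
```

In F# the sum type and exhaustive matching make this even tighter, but even this sketch reads close to the business rule "an order with a non-positive quantity is rejected".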


The article seems to assume that it is, in fact, possible to completely and correctly specify a system, and not only that, do so ahead of development.

This is a pipe dream in all but a minuscule slice of software projects.

GWT/cucumber is as decent a tool as any for creating automated tests of system actions (and especially interactions) in a mostly-human-readable format that is likely to be understood by BAs, testers, and devs, even if not all of them are expected to be able to write them.

I've used them across many projects with great success, with the understanding that it's not intended to replace unit tests for calculations, nor stories for initial specification.

As endian says downthread [0]: "G/W/T isn't the solution to domain understanding, conversations are. You should be continually communicating between all stakeholders to maintain a current domain understanding."

[0] https://news.ycombinator.com/item?id=19674013


It seems pretty obvious to me that any way to "completely and correctly specify a system" is going to look a hell of a lot like a programming language, and be no less complex or hard to understand than one... it's almost a tautology.


Hi Josh

The article definitely does not make that assumption. The expectation is that the business analyst (someone with business analysis skills) will specify a small slice of the solution that will deliver value. Based on feedback it is expected that the specification will evolve.

I think you are spot on about readability. A specification in human readable form can easily be verified by a wider range of stakeholders.

Chris


>Although Given-When-Then is a fantastic way to describe interactions, state and behaviour, it is a lousy way to describe data and calculations.

I follow a slightly different technique that I think makes given/when/then also a useful tool for describing data and calculations.

1) Write the test as follows, using deliberately simplified but still realistic data (this is crucial):

> Given <a set of market data and trades in a table>
> When <arbitrary event such as calculation is performed>
> Then <leave blank>

2) Write the code that outputs the data/calculation.

3) Run the test in a "rewrite" mode that fills in the results of Then based upon actual output (this process is somewhat similar to golden master).

4) You now have a passing test and generated data which you can eyeball to see if it is in line with what you would expect. This test with recorded output can then be shown to the PO (or whomever) and committed to source control and used for regression testing.

This obviously isn't possible with cucumber or other gherkiny tools and it does require your processes to be fully deterministic (a laudable goal anyway), but it works pretty well IMHO.


That's a pretty common technique. The problem I've found is that it risks incorrect results being assumed correct. If I refactor your algorithm, and fix a bug in the process, I'll be very confused when your tests start failing.

The benefit of calculating it yourself first is obvious: you get to cross-check your result. That tends to highlight minor errors such as sign (+/-) errors.

This is not a problem if stability is more important than correctness of course, but that's not where I am.


Act 1 - G/W/T can be used to iteratively express aspects of an algorithm such that you can derive it without knowing or fully understanding it (algorithm triangulation: think of GPS, where each satellite is a constraint and you derive the general algorithm through the iteration of each passing scenario).

Act 2 - is really about process rather than G/W/T (which is really just AAA: arrange, act & assert).

Act 3 - again process. G/W/T isn't the solution to domain understanding, conversations are. You should be continually communicating between all stakeholders to maintain a current domain understanding.

We wrote an article recently on the limits of BDD [1]. G/W/T didn't really come up; there are other, more glaring issues with BDD when it comes to systems that intersect mismatched understandings of the real world between experts and users. Unrealistic wants and goals are a killer. Additionally, we started writing a tool called SpecStack [2] to attach metadata to scenarios (G/W/T) so you can capture technical details.

[1] https://endian.io/articles/limits-of-bdd/ [2] https://github.com/endiangroup/specstack


Hi Endian

Interesting article. I would be interested to see if feature injection ("Working Backwards") would help you. I do not think the first of the problems is a BDD problem; it's more a "reality" issue. The only way to solve it is to use financial derivatives, and I suspect the lack of liquidity in crypto would make that prohibitively expensive.

Judging from your web-site, you are only 5 mins up the road from me. Would you be interested in a lunchtime session to see if FI would help?

Chris


G/W/T is great at capturing the context/action/outcome of a test scenario in English. It isn’t great for all cases though. Furthermore Gherkin could be updated to allow more flexibility.

Having structured format for tests/requirements, in English (et. al.), can be incredibly helpful. I would love to see some innovation around helping programmers and non-programmers reach a shared understanding of what the system should do and the cases we will use to verify it.

I don’t think unstructured conversation, unstructured English, or Excel tables are the solution. This is still an unsolved problem in our industry.


Hi Ryan

How about structured Excel sheets? Excel sheets with guide rails for business people to structure the expression of their thoughts?

That's what I'm getting at.

Chris


The main problem with Cucumber / GWT is that in most implementations it serves as an opaque abstraction layer which is an incomplete/incorrect model abstraction of the system itself.

Been doing a lot of API testing recently with karate dsl and writing cucumber tests that include json expressions with some syntactical sugar for validation. The tests serve as a specification for the system which is actually quite a bit more precise than even swagger since you can even go back in time and compare the deltas on request/response between test executions to troubleshoot regressions.

Agree that GWT can't help the business understand the inherent complexity of a state machine, but individual tests can be used effectively to model state transitions, especially at the API level.
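A sketch of that idea, with invented order states and events (not from any real API): the machine as a whole is an opaque table, but each individual transition reads as its own clean Given/When/Then.

```python
# Illustrative state machine: (current state, event) -> next state.
TRANSITIONS = {
    ("created", "pay"):    "paid",
    ("paid",    "ship"):   "shipped",
    ("created", "cancel"): "cancelled",
}

def apply_event(state, event):
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal event {event!r} in state {state!r}")

# One transition per test:
# Given an order in state "paid", When it is shipped, Then it is "shipped".
assert apply_event("paid", "ship") == "shipped"
```

No single scenario conveys the whole machine, but the full set of transition tests does, one edge at a time.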


I think a lot of the issue is the misconception that G/W/T feature files can essentially replace specifications/requirements. They can and absolutely should REFLECT the requirements, but they are ill suited to act as the actual specifications themselves.

Feature files to me are most effective when they act as a roadmap of the steps one needs to take to effectively test a given set of use cases, NOT a 1-1 carbon copy of the requirements. It should tell a non-programmer what the test is doing without having to dive into the code, while being a scaffold to which a programmer can build their test logic into. If one needs to look at the requirements, one should do just that, read the requirements document, or in the developer's case, read the requirements set forth in the user story. Scenarios are guides to how to navigate what the test is doing, not what the application itself should be doing.

Also, I often see programmers attempt to write a BDD test and run into a case that doesn't quite fit flush into G/W/T, then ask the community of how they might go about writing that test. Instead of understanding flexibility, they are met with an abrupt "that's not BDD, you are doing it wrong, if you did it the BDD way everything would work out". That's discouraging, frustrating, and destructive. G/W/T is not gospel, and it doesn't fit all test cases. I see nothing wrong with fudging some tests to not follow Gherkin to-the-letter if it better facilitates a test while still remaining clear what the test is doing in plain English within the feature file's scenario.


I can't agree more. I've yet to see a testing tool that can be used for specification and used by end users. Today everything is developer-centric. Maybe there's no escape from this. The solution is really to make your developers understand the business.


That's where things like Domain-Driven Design are going, and really it makes a lot of sense. Why express and model your domain in terms outside of it? It's like forever living through a translator instead of just learning the language in the first instance; you add more points of potential failure.


Domain-Driven Design is one of those things that seems so easy to write off as "just another process", but looking closer it seems like it really is a solid method for resolving issues of modeling, requirements, implementation and communication on a project.

I expect to see it continue to organically grow in popularity as people try it and adopt it. It's both obvious and insightful at the same time.


I recently read the red book, and I'm a little torn. The strategic patterns are great (I love the idea of bounded contexts, Ubiquitous languages, and integration), but the tactical patterns i'm less sure of.

I'm not sure i like the idea of going through what's essentially a batch job to export my events inside one bounded context. I'm also not sure i like going through an external message bus to operate between aggregates.

I don't have any better ideas, but i wonder if it's necessary to discard my reservations to succeed in the strategic goal.


Shameless plug. We've built Gauge and Taiko to address this problem. Here the specifications are in Markdown and the API for automating the browser is user centric. It treats the browser like a black box.

You can find more info at https://gauge.org and https://taiko.gauge.org


Hi Neves

Could not agree more. That's why Andy Pols and I recast the BA role as a Business Coach back in the day.

Chris


> We will realise that describing data and calculations using the Given-When-Then format leads to tragedy, and will create and popularise tools and approaches using Excel to document examples.

Given all the examples given, it sounds more like Cucumber is the problem. Having specifications/tests written in the form of given-when-then isn't shown to have any issues. Rather, taking those specs/requirements and disconnecting them from the people who need them is the issue.


Is the complaint literally just about aligning the text in tables of a Cucumber file because business analysts are more comfortable in Excel? Or just managing the text of your test suite across a whole project? I am struggling to parse out the underlying point of the article.


What I understood is:

You are supposed to capture requirements in Given/When/Then form before coding, and they should be written in a way that is independent of the implementation

The complaint seems to be that what happens in practice is "the devs have taken over" the system, so Cucumber just becomes a test automation software

This has led to writing tests in Given/When/Then format which test overly specific elements of the implementation, and are often written after the fact, instead of as a way to capture requirements.

And some other stuff, but that's primarily what I got out of it.


> the devs have taken over [authoring tests]

That's what happened in the projects where I used fitnesse, jbehave, cucumber, etc. The users couldn't understand the artifacts and what they're supposed to do, and the tools were highly idiosyncratic. It ended in becoming just yet another hassle for developers.


Exactly.

Whereas using Excel for specification of calculations (by Business Users or Business Analysts) actually brings the business and developers closer together, removing the need for intermediaries. Business Users are very happy using Excel and checking calculations etc. as it tends to be their home environment.


You just have to look for 10 minutes at Cucumber and what's needed to make it work and you'll see that it's simply a bad idea. It's the same nonsense as generating code from UML or visual coding. Looks good to the non programmer but horrible to work with.


The GWT problem and Cucumber's issues remind me of COBOL or even AppleScript. They worked out an English-like syntax that looks great in a demo, but winds up being harder to write code in than most other programming languages.

And the reason seems to be that it's requiring us to express ideas in a vastly simplified language. In order to do that, we have to expand a complex idea to many simpler ideas.

That's making us do the job of the compiler! Real high-level languages are rich with many idioms and structures, and the compiler then reduces that to a simpler ISA.


A common anti-pattern with Given-When-Then has the Three Amigos collaborating on scenarios that are stored as acceptance criteria in user stories.

I agree with this. Writing Cucumber/Gherkin scenarios is extra effort if the Cucumber files themselves aren't used/read elsewhere. -- It'd be simpler to embed "Given/When/Then" statements within test code (like RSpec).

I'd emphasise that Gojko Adzic's "Specification By Example" suggests discussing examples before refining to a specification; that may get around the author's complaint that non-table formats don't allow for important cases.

That said, "Given/When/Then" is hardly magical, so doesn't deserve much praise/criticism itself. Any test involves "do an action, check the result" (with "setup the system" and "cleanup" being implied). Sometimes called "Assemble, Act, Assert". "G/W/T" is just a neat, consistent format for describing behaviour in English. A table of values specifies some computation; the column titles help to describe that behaviour.
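The "table of values" point can be shown in plain code (simple_interest is an invented example calculation, not from the article): the header comment names the columns, and each row drives one complete Assemble/Act/Assert cycle.

```python
# Illustrative calculation under test.
def simple_interest(principal, rate_pct, years):
    return principal * rate_pct * years / 100

# Table of values: the column names carry the description of behaviour.
CASES = [
    # principal | rate_pct | years | expected_interest
    (1000,        5,         1,      50.0),
    (1000,        5,         2,      100.0),
    (500,         10,        3,      150.0),
]

for principal, rate_pct, years, expected in CASES:
    # Assemble (the row), Act (the call), Assert (the comparison)
    assert simple_interest(principal, rate_pct, years) == expected
```

Whether the table lives in a Gherkin Examples block, an Excel sheet, or a plain list like this is mostly a question of who needs to read and edit it.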


You can embed G/W/T into RSpec. There's a library called Turnip[0], a play on Cucumber, that extends the syntax to Feature tests in RSpec.

- [0] https://github.com/jnicklas/turnip

The reason RSpec didn't include this originally probably stems from the library being designed for testing at the class or module level, rather than an entire application.


This is why I like Spock [0]. You can go G/W/T, W/T, T, or Expect [1]. It's billed as "multi-paradigm", which really just means you can do whatever feels right for a given case. Also its data table feature is wonderful [2][3].

[0]: http://spockframework.org/

[1]: http://spockframework.org/spock/docs/1.3/spock_primer.html#_...

[2]: http://spockframework.org/spock/docs/1.3/data_driven_testing...

[3]: https://twitter.com/dminkovsky/status/1116727735399976966


> you can do whatever feels right for a given case

Too bad you can't use whatever specification language feels right. Spock only provides Apache Groovy, with its tacky syntax hacks like long strings for function names, block labels having special meanings based on their name, or the OR and logical-OR operators (| and ||) being used for drawing tables in the code. In the past, when software has provided Groovy for writing specs, an alternative has eventually been added once Groovy's shortfalls become obvious, e.g. Kotlin for Gradle [1], or the Declarative Pipeline Syntax [2] for Jenkins.

[1]: https://docs.gradle.org/5.0/userguide/kotlin_dsl.html

[2]: https://jenkins.io/blog/2016/12/19/declarative-pipeline-beta


> Groovy, with its tacky syntax hacks like long strings for function names

Actually, Kotlin does that too. I think it is even encouraged for writing tests.


Interesting, thank you. Not sure why the string function names, named blocks etc are bad. I enjoy them. Looking forward to reading about this Kotlin DSL though... I like Groovy for testing (your tests are more likely to compile even if they're broken) but it could be better.


This article would have been better if at some point it explained what Given-When-Then is.


That would have gotten in the way of making wild generalizations about what Developers & BAs do.


>Before the internet, user experience was considered of little value because most users of systems were internal employees of companies.

I'm skeptical of this claim. It would help if there were a year, considering it's not clear if the author means the modern Internet, the ARPANET, or when the modern Internet became widely available to a larger group of people. Based on the mentions of Excel, I'm inclined to believe it's the last option I listed.

The MIT AI lab and other research areas come to mind as places that cared about how the programs were operated and whatnot and these weren't exclusively used by employees. SHRDLU comes to mind. While I'm thinking about it, the Apple Macintosh also does.

I don't believe I was familiar with this Given-When-Then model beforehand, but I also think that's because it's somewhat natural, or at least seems natural. The author has failed to convince me why this is a bad thing. I suppose I can see why these larger business practices are poor, but that has me failing to see why this particular practice is singled out.


Yeah; I suspect the core of the problem is that too many software developers fancy themselves business analysts while too many business analysts start to cower in fear whenever you suggest they check something in to GitHub.


When I teach BDD classes I sometimes joke that we might not need acceptance tests written in G/W/T if the biz people could read programmer tests.


You might want to check out Fit as a testing framework that may be more appropriate for your use case than Cucumber. Not sure if it's still well maintained, but it could be brought up to speed without much effort.


TL;DR: given-when-then can sometimes obscure requirements rather than illuminate them. Use Excel or other tools to document such scenarios, as everyone in the software design process can understand Excel formulas.


Why would moving a table from The Place All The Other Requirements Are Kept to an Excel file stored elsewhere do anything other than obscure requirements?


I think there are a lot of people down on Cucumber after experience with it, in several different contexts... what do you think, what has been your experience?


The dream of having the requirements be the test is not realistic:

* Good luck telling your PM their English has a syntax error

* Once that's out the window and the cucumber files are just maintained by devs, is the regex translation layer worth it?

* A lot of implicit state also gets wrapped up in the test to make these phrases at least semi-readable, which makes debugging failures a pain.


You can also enter Cucumber-developer double jeopardy where you need to meet with the PM for an hour to write Cucumber specs, and then meet with the QA for an hour to implement them.

When you could have just written a simple integration test in 15 minutes, but the Scrum master thought Cucumber would be a good idea.


We tried it for a while and concluded that it made nothing really better but a lot of things worse. We should just accept that a tester needs to learn to write code. I wish it were different, but Cucumber is not a solution.


I once was tasked with writing a script to generate the cucumber implementation boilerplate, from the cucumber tests themselves. Because the tester would constantly mess it up if he did it manually.

Needless to say, it wasted a ton of time and never produced any meaningful value.


Did you verify the functionality of your cucumber-implementation-generating code with cucumber? :)


> The discussion helped me realise that Given-When-Then is as much of a hindrance in some contexts as it is a help in other contexts.

I like when authors express a strong viewpoint but then also include descriptions of circumstances where their viewpoint may not be applicable. This seems to be alluded to in the above quote, but are there specific contexts where Given-When-Then _is_ helpful and the appropriate mechanism to document requirements?


I think rspec's Cucumber docs are fine. https://relishapp.com/rspec/rspec-expectations/v/3-8/docs/bu...

It's an executable specification, but it's also in a format which is readable. If not "documentation", then it's at least "small, verified examples".

I'd find a table of values less readable. I don't mind if a document like this is the output of running the tests themselves, though. (Obviously, the important details are "input program, expected output", which might not lead you to use Cucumber, but I think it's a fine use of Cucumber).


That is pretty much the core of software engineering, I'd call it Data-If-Then and I fail to see why this is a tragedy.


I think that is the tragedy the article covers: a scenario where the engineering has ended up squeezed into the business analysis, and there isn't good communication between that and the development side, who are seen as mere implementers.

This doesn't feel familiar to me, but I've tended to always work in legacy systems where the opposite situation (no BA at all) is the more familiar problem.


What ever happened to Systems Analysts?


arcfide has claimed on several occasions that non-programmers take to APL quite easily, and can read and collaborate on it with a programmer, be talked through it directly, in a way they can't/won't do for mainstream languages.

I find this such an unlikely sounding claim that I want to reject it without consideration, or at least assume that he's only talking to a very restricted subset of engineering non-programmers.

APL is decades old and very much a business language by origin, on the IBM System/360; there ought to be decades of people's experience on both sides to back this up or refute it - programmer and non-programmer. Is there?


It's a fallacy that GWT has to be at the browser automation level. And it's a lot easier to understand the point of a test if it's got GWT comments interspersed with the code.
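For instance, a plain unit test with G/W/T comments might look like this (Cart is an invented class, not from any library): no browser, no step definitions, just comments marking the sections.

```python
# Illustrative class under test.
class Cart:
    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)

def test_total_of_two_items():
    # Given a cart with two items
    cart = Cart()
    cart.add("book", 12.0)
    cart.add("pen", 3.0)
    # When the total is computed
    total = cart.total()
    # Then it is the sum of the item prices
    assert total == 15.0

test_total_of_two_items()
```

The comments give a reader the intent at a glance without routing the test through an English-to-code translation layer.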



