In every software application I’ve ever worked on, no matter the industry or maturity of the team or number of weeks in a sprint, there have been three questions that always come up:
- What is the best way to center a div?
- Should we use tabs or spaces?
- How should we implement complex workflows?
It was with that third question in mind that I stumbled upon a link on Hacker News a few weeks back about Temporal.io announcing that its .NET SDK is now in alpha.
If you aren’t in the know, Temporal is a platform that lets you describe workflows as code. SDKs are available in multiple languages, but the Python flavor is the most popular.
So, I took a leap of faith, tried out the Temporal .NET SDK, and decided to recap my thoughts for you all as a blog. I’ll walk through, at a high level, what the Temporal approach is, the implications of workflows at the different zones of enterprise architecture, and where I see Temporal being useful in a large organization’s software strategy.
The Temporal Approach
Temporal bills itself as a way to implement a workflow as code, perhaps as a subtweet against competing products that believe it can all be done through a .yaml configuration file.
To keep it simple for this blog, a workflow to Temporal is a series of activities that need to occur within a given time limit, and any activity could fail and require retry logic. The classic examples they give are API calls or DB commands that need a higher guarantee of reliability in the event the happy path does not succeed.
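To make that concrete, here is roughly what a workflow and an activity look like in Temporal's Python SDK (this post is about the .NET SDK, but the shape is similar across languages). `greet` and `GreetingWorkflow` are hypothetical names, and the snippet needs the `temporalio` package plus a running Temporal server to actually execute:

```python
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy

# A hypothetical activity: the unreliable piece (an API call, a DB command)
# that Temporal will retry for us according to the policy below.
@activity.defn
async def greet(name: str) -> str:
    return f"Hello, {name}!"

# The workflow: plain code that sequences activities, each with a time
# limit and a retry policy, instead of a YAML description of the same.
@workflow.defn
class GreetingWorkflow:
    @workflow.run
    async def run(self, name: str) -> str:
        return await workflow.execute_activity(
            greet,
            name,
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )
```

The point is that the timeout and retry behavior live right next to the call they govern, as code, rather than in a separate configuration file.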
If you’re into reinventing the wheel, you could try to build your own retry-and-rollback code to wrap these areas, or you could describe your workflow and activities using Temporal.
If you did try to implement a bespoke workflow library by yourself, you would quickly need to be able to see the status of workflows – whether they’ve failed or succeeded, why something failed, and potentially replay a particular workflow run. This is somewhat difficult to do for two reasons.
First, it’s challenging because there are never enough hours in the day to implement business-specific functionality within a sprint, much less purely technical functionality. Second, it can be difficult because you need a strategy that effectively captures state within a workflow instance in a potentially polyglot architecture. Not every team uses the same language, much less the same frameworks and libraries within a language. Suffice it to say, going it alone is rarely the choice with the best ROI.
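To see why, here is a sketch of the bare minimum you would end up building yourself: execute activities in order, retry each one, and record enough history per attempt that a failed run can be inspected afterward. This is pure illustrative Python, not anyone's API; all the names are made up:

```python
def run_with_history(activities, state, max_attempts=3):
    """Run each activity in order, retrying failures and logging every attempt."""
    history = []  # one record per attempt: (activity name, attempt, outcome)
    for step in activities:
        for attempt in range(1, max_attempts + 1):
            try:
                state = step(state)
                history.append((step.__name__, attempt, "ok"))
                break
            except Exception as exc:
                history.append((step.__name__, attempt, f"failed: {exc}"))
        else:
            return state, history  # gave up on this activity; stop the workflow
    return state, history

# A hypothetical flaky activity that fails once, then succeeds.
calls = {"n": 0}

def flaky(state):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient error")
    state["done"] = True
    return state

state, history = run_with_history([flaky], {"id": 1})
```

Even this toy version hints at the real cost: the history log is in memory, so persisting it, querying it across services, and replaying from it is where the real engineering effort goes.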
Temporal helps to alleviate that problem by offering a centralized dashboard that, drumroll please, allows you to see the status of workflows, the state of the workflow instance at each action, and replay failed workflows. The Temporal server is something you can share between teams or keep “in the family,” depending on your use case.
It comes in the form of a Docker container (because this is 2023, after all) and does require several supporting services to get running. What I like is that if you want to test it for development purposes and only want to run one container, it will happily spin up those backing services with sensible defaults for you. That’s not how it would be run in production, but it’s good enough to help a developer get productive quickly.
So, to recap: a Temporal workflow is composed of activities, and the retry and rollback strategy for those activities is described as code. A Temporal server gives us an easy-to-use dashboard to see the status and state of the workflow’s instances.
Temporal .NET SDK
The strategy the Temporal team needs to take to implement the architecture is slightly different from language to language. Though the basic premise remains the same, reality always hits us like a brick wall, and tough choices need to be made to actually get working code.
This is what I found most revealing about the Temporal .NET SDK. It led me to realize how workflows, in general, should be implemented, but more on that later.
Earlier versions of the SDK used a singleton to register the workflow so it could be invoked. This is because C# doesn’t allow references to non-static methods on a class itself, only on instances of the class. The Temporal team has found a way around that in newer versions of the library, hence the “alpha” designation.
The major value add with Temporal is that it can encapsulate activities that commonly come up in enterprise application development and require retries as well as timeouts, without you having to roll your own implementation. The primary examples they provide center around DB access and API calls, both of which are common in typical applications.
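As a rough illustration of what “retries plus timeouts” means if you do roll it yourself, here is a stdlib-only Python sketch. The function names are invented, and note the caveat in the comments: a timed-out attempt can’t actually be cancelled, its thread keeps running in the background, which is part of why a purpose-built tool is attractive:

```python
import concurrent.futures
import time

# Hypothetical flaky call standing in for a DB command or API request:
# the first attempt hangs, subsequent attempts return promptly.
attempts = {"n": 0}

def fetch_balance(account_id):
    attempts["n"] += 1
    if attempts["n"] == 1:
        time.sleep(0.5)  # simulate a hung call on the first try
    return {"account": account_id, "balance": 100}

def call_with_timeout_and_retry(fn, *args, timeout_s=0.1, max_attempts=3, backoff_s=0.01):
    """Run fn in a worker thread, abandoning any attempt that exceeds timeout_s."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_attempts) as pool:
        for attempt in range(max_attempts):
            future = pool.submit(fn, *args)
            try:
                return future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                # The abandoned attempt keeps running; we only stop waiting for it.
                time.sleep(backoff_s * 2 ** attempt)  # exponential backoff
    raise TimeoutError(f"{fn.__name__} timed out after {max_attempts} attempts")

result = call_with_timeout_and_retry(fetch_balance, "acct-1")
```

Here the second attempt succeeds within the budget, so `result` holds the balance. What this sketch can’t do, and Temporal can, is survive the process itself dying mid-workflow.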
Starting from the most myopic level: within a single microservice, I do agree that Temporal can add benefits. I am skeptical, though, that encapsulating all or even most DB calls is appropriate (or even possible) without over-engineering.
As we all learned with Hadoop, it’s ideal to have close proximity between where our data lives and the system that processes it. Bandwidth is one of the most expensive, nontrivial costs in the architecture of an application; keeping the DB in the same zone/region as the API reduces the barrier to high performance.
Regulatory concerns such as HIPAA and GDPR keep this reality top of mind, to the point where it becomes unlikely that a zonal or regional outage could be mitigated at the API layer when it accesses the DB through Temporal.
API calls come in two essential species: internal and third-party. Some architects distinguish between intracompany and intercompany API calls as well. I feel this is a distinction that lacks a difference: priorities and SLA expectations are heterogeneous within an organization, sometimes even within a team, and the more effective approach is to keep all API calls at arm’s length.
Sometimes a microservice truly must call out to others in the same ecosystem to fulfill a given piece of business functionality. This can be a smell that the microservices are too granular and should be combined.
Our architectural olfactory senses should also cringe, on principle alone, at coupling our application code to the whims and uptime of a third party’s API. If they change their request/response contract without properly informing our team, wrapping calls in Temporal won’t really help.
A better approach is to externalize our API calls in an Integration Engine, a separate deployable unit from our application itself, which can be upgraded and deployed on an as-needed basis. The Integration Engine is a more appropriate location for these calls, Temporal or not, as it embraces the reality of enterprise application lifecycles without pretending “this time will be different.”
The next level of complexity is a workflow between applications within a company. Possibly higher than that would be B2B workflows. I suppose it is theoretically possible to use Temporal to help manage these instances, but I sincerely doubt it is the appropriate tool.
These layers of complexity are very difficult to manage and implement, much less get prioritized on each team’s respective board. No single tool is a salve for this kind of reality, and if anyone says something to the contrary, they are selling you something.
The Temporal .NET SDK has changed significantly even within the time of authoring this blog post, so teams should proceed with appropriate caution. Temporal is one of the best tools I’ve seen in terms of its dashboard and replay capability. I am still unsure of the appropriate use cases between microservices, but it could become more obvious for real-world use cases, above and beyond the theoretical.
Let me know your thoughts on Temporal .NET SDK in the comments below. I’m curious to hear how you believe it could (or couldn’t) be useful. As always, if you enjoyed what you read today, head over to the Keyhole Dev Blog for more.