More tests, docs, and minor things #41

cretz · 2022-06-07T13:09:37Z

What was changed

More README docs including build docs (opened Add more robust docs to docstrings #40 for more docstring docs)
Update core submodule origin URL to be HTTP instead of SSH
Add retry policy to workflow info
Support disabling deadlock timeout with env var
Track tasks ourselves since Python only keeps weak references to them (see https://bugs.python.org/issue21163 and others)
Give descriptive task names in Python versions that support it
Many more tests

Checklist

Closes More workflow tests #28
Closes Add build docs to README #39

bergundy · 2022-06-08T20:00:03Z

README.md

+
+#### Testing
+
+Tests currently require [Go](https://go.dev/) to be installed since they use an embedded Temporal server as a library.


Are you going to get rid of this soon?

There are two things in Go right now: Temporalite wrapper and "kitchen-sink" workflow/worker. I am waiting on TLS support and distributed Temporalite binaries for the former. For the latter, probably at the same time I get rid of the former, I will rewrite the workflow in Python and get rid of the Go worker.

I could move to docker compose I suppose, though Temporalite is very fast for our use case.

bergundy · 2022-06-08T20:14:22Z

temporalio/common.py

@@ -35,6 +37,21 @@ class RetryPolicy:
    non_retryable_error_types: Optional[Iterable[str]] = None
    """List of error types that are not retryable."""

+    @staticmethod


nit: could use a classmethod and instantiate cls() below instead of referencing the class by name

I like being explicit on class instantiation here (and I have chosen to use @staticmethod throughout, though Google style guide wants @classmethod; I have documented the deviation in the README too)
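
To make the trade-off concrete, here is a small sketch of how the two decorators differ once a subclass is involved. `Policy` and `StrictPolicy` are hypothetical stand-ins, not the SDK's `RetryPolicy`.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical stand-in for a config class; not the SDK's RetryPolicy."""

    maximum_attempts: int = 0

    @staticmethod
    def from_static(attempts: int) -> "Policy":
        # Names the class explicitly, so the result is always a Policy.
        return Policy(maximum_attempts=attempts)

    @classmethod
    def from_class(cls, attempts: int) -> "Policy":
        # Instantiates whichever class the method was called on.
        return cls(maximum_attempts=attempts)


class StrictPolicy(Policy):
    """Subclass used only to show which constructor follows inheritance."""


if __name__ == "__main__":
    print(type(StrictPolicy.from_static(3)).__name__)
    print(type(StrictPolicy.from_class(3)).__name__)
```

The explicit form is predictable when the class is not meant to be subclassed; the `cls` form matters only if subclasses should get instances of their own type back.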

bergundy · 2022-06-08T20:39:14Z

temporalio/worker/workflow_instance.py

+                # Mark it as started each loop because backoff could cause it to
+                # be marked as unstarted
+                handle._started = True


Is this referring to local activities?

Yes, but we share a good bit of code w/ local/non-local activity handling so it's in this shared space

bergundy · 2022-06-08T20:41:22Z

temporalio/worker/workflow_instance.py

+        if hasattr(task, "_log_destroy_pending"):
+            setattr(task, "_log_destroy_pending", False)


Should we only do this for workflows that we know we've evicted or after shutdown?

I think this only applies in those two cases since this is only used on GC and I hold a strong reference until eviction/shutdown or the task is done. I just choose to do this at registration time instead of looping over all tasks at the end.
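
A sketch of the registration-time approach described above, outside of any workflow code. `register_task` is a hypothetical helper name; `_log_destroy_pending` is a private CPython attribute, which is why its presence is checked before it is set.

```python
import asyncio
from typing import Any


def register_task(task: "asyncio.Task[Any]") -> None:
    """Silence the 'Task was destroyed but it is pending!' log for this task."""
    # Private attribute: guard with hasattr in case an implementation lacks it.
    if hasattr(task, "_log_destroy_pending"):
        setattr(task, "_log_destroy_pending", False)


async def main() -> bool:
    task = asyncio.get_running_loop().create_task(asyncio.sleep(3600))
    register_task(task)
    # True when the flag is off, or when the attribute does not exist at all.
    silenced = getattr(task, "_log_destroy_pending", None) is not True
    # Clean up so the example does not leave a pending task behind.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return silenced


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Setting the flag up front avoids a second pass over every task at eviction or shutdown, at the cost of also silencing the warning for tasks that would have completed normally.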

bergundy · 2022-06-08T20:48:01Z

tests/worker/test_workflow.py

+    @workflow.run
+    async def run(self) -> None:
+        # Wait forever
+        await asyncio.Future()


Wondering if it's worth having a workflow method to "block" like this

I am not sure a "block forever" is a primitive we need in any lang. I suppose it's just await new Promise(() => {}) in TS which I figure we don't have a utility for. I guess you could await workflow.wait_condition(lambda: False) too.

Yeah, in TS you have to use await new Trigger() or await CancellationScope.current().cancelled or await condition(() => false).

There's no exported function and I have gotten questions on how to achieve this.
I recommend waiting on the cancellation scope directly.

I think sleeping until cancel is a bit of an anti-pattern due to history growth, though some people trust they'll stay low until they cancel. Python doesn't have an explicit wait forever so not sure we need one. They could just as easily do await asyncio.Event().wait() or anything like that.
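
For reference, a plain-asyncio sketch of the "block until cancelled" idea discussed in this thread. It runs on an ordinary event loop rather than inside a workflow, and `wait_forever` is a hypothetical helper name, not an SDK function.

```python
import asyncio


async def wait_forever() -> None:
    # A fresh Future that nothing ever resolves; only cancellation ends the wait.
    # `await asyncio.Event().wait()` would behave the same way here.
    await asyncio.Future()


async def main() -> str:
    task = asyncio.ensure_future(wait_forever())
    # Give the task a moment to start; it should still be pending afterwards.
    await asyncio.sleep(0.01)
    still_blocked = not task.done()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled" if still_blocked else "finished early"
    return "finished"


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Unlike a long sleep, neither form schedules a timer; the awaiting coroutine simply stays suspended until something cancels it.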

bergundy · 2022-06-08T20:50:00Z

tests/worker/test_workflow.py

+
+    @workflow.query
+    def bad_query(self) -> NoReturn:
+        raise ApplicationError("query fail", 456)


nit: don't need an ApplicationError here

We do if I want to test how details are handled :-) Though I note "(no details on query failure)" in the test, so it became useless.

bergundy · 2022-06-08T20:51:01Z

tests/worker/test_workflow.py

+        assert isinstance(err.value.cause, ApplicationError)
+        assert list(err.value.cause.details) == [123]
+        # Fail query (no details on query failure)
+        with pytest.raises(RPCError) as rpc_err:


We wrap query failures in other SDKs with QueryRejectedError

Actually, at least in TS that's only on successful query response with a query_rejected field which is different than this failure.

But yes, in other SDKs we do sometimes make the RPC errors more palatable in some cases. But some SDKs get it wrong (e.g. assuming not found means workflow not found). This was discussed at #32 (comment) and temporalio/features#59 was opened as a result.

bergundy · 2022-06-08T21:00:16Z

tests/worker/test_workflow.py

+            await assert_eq_eventually(True, child_started)
+            # Send cancel signal and wait on the handle
+            await handle.signal(CancelChildWorkflow.cancel_child)
+            await handle.result()
        assert isinstance(err.value.cause, ChildWorkflowError)


Pardon my laziness, remind me what err.value is here? Is it the Failure object?

It's the Python Exception. Pytest wraps it, so err here is technically https://docs.pytest.org/en/7.1.x/reference/reference.html#exceptioninfo

bergundy · 2022-06-08T21:07:49Z

tests/worker/test_workflow.py

+                event.timer_started_event_attributes.start_to_fire_timeout.ToMilliseconds()
+                == 10


Hmm.. looking at this now I'm wondering if it's worth not sending this command if it's cancelled in the same activation. (Would be a backwards incompatible Core change).

From what I saw in history, it seems core properly handles cancelling out a command sent in the same activation completion. It's for that reason I didn't add extra code to remove commands of things I cancel in the same completion before sending the completion - it seems like core does it.

bergundy · 2022-06-08T21:15:49Z

tests/worker/test_workflow.py

+    assert 1 == sum(
+        1


Very pythonic :)
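
The idiom being praised here is counting matches by summing a generator of ones. A self-contained sketch, with a hypothetical `Event` type standing in for the protobuf history events the real test iterates over:

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """Hypothetical history event; real events are protobuf messages."""

    kind: str


def count_kind(events: List[Event], kind: str) -> int:
    # sum() over a generator of 1s counts matches without building a list.
    return sum(1 for event in events if event.kind == kind)


# Made-up sample data for illustration only.
history = [
    Event("timer_started"),
    Event("timer_canceled"),
    Event("workflow_completed"),
]

# Mirrors the shape of the test's assertion: exactly one matching event.
assert 1 == count_kind(history, "timer_started")
```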

bergundy

LGTM, nothing blocking here.

I think there were some other missing tests that I mentioned in the last PR, also not blocking this PR.

cretz · 2022-06-08T21:42:25Z

I think there were some other missing tests that I mentioned in the last PR, also not blocking this PR.

@bergundy - There are probably some I missed, but can you point me at them again just so I have them tracked?

bergundy · 2022-06-08T22:06:50Z

Did you check all interceptors e2e?
Did you check that if a workflow receives a signal while a WFT is in flight, the workflow is rewound and completes after the signal is processed?

cretz · 2022-06-08T22:55:40Z

I will add tests for those two

EDIT: Only adding test for the first one for now and will do the next in another PR (after off-PR discussions about how best to solve).

cretz added 4 commits June 3, 2022 17:13

More tests, docs, and minor fixes

1b0fdc2

More workflow tests

f3f6962

Work on docs

75d0e48

Merge remote-tracking branch 'remotes/origin/main' into more-tests

21b4f2d

# Conflicts: # tests/worker/test_workflow.py

cretz force-pushed the more-tests branch 2 times, most recently from 69dc4cf to 0355f47 on June 7, 2022 14:41

Minor PR update

8dad0fe

cretz force-pushed the more-tests branch from 0355f47 to 8dad0fe on June 7, 2022 16:29

cretz marked this pull request as ready for review June 7, 2022 16:58

cretz requested review from bergundy, Sushisource and a team June 7, 2022 16:58

Sushisource approved these changes Jun 7, 2022

cretz mentioned this pull request Jun 7, 2022

Add generated protos #42

Merged