refactor: Introduce new Apify storage client #470
base: master
Conversation
Force-pushed from d27c080 to 82efd3e
Force-pushed from 067b793 to 104a168
Force-pushed from dc7f0a7 to a3d68a2
@@ -11,14 +11,14 @@ async def main() -> None:
     await dataset.export_to(
         content_type='csv',
         key='data.csv',
-        to_key_value_store_name='my-cool-key-value-store',
+        to_kvs_name='my-cool-key-value-store',
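For reference, a call site migrated to the renamed argument might look like the sketch below. This is only a minimal illustration based on the hunk above, not this repo's actual example script: the pushed data and the target key-value store name are placeholders, and the surrounding `Actor` setup is just the usual SDK pattern.

```python
import asyncio

from apify import Actor


async def main() -> None:
    async with Actor:
        dataset = await Actor.open_dataset()
        await dataset.push_data([{'url': 'https://example.com', 'title': 'Example'}])

        # After this PR the keyword argument is `to_kvs_name`
        # (previously `to_key_value_store_name`).
        await dataset.export_to(
            content_type='csv',
            key='data.csv',
            to_kvs_name='my-cool-key-value-store',
        )


if __name__ == '__main__':
    asyncio.run(main())
```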
Is this BC break worth it?
let's evaluate all the potential BCs at the end
Sure. I thought we were nearing that now 😁
Since we're just re-exporting the storages from crawlee here, there will be many more cases than this one. I'm not saying we have to rename this particular argument (and I will undo it if you insist), it's just that I don't like those long identifiers, especially when we can use the KVS abbreviation.
While testing the change on the platform, I used modified benchmark tests. They are specific in that they scrape a locally hosted site, so scraping is unusually fast, with no network-related delays. This exposes a RequestQueue race condition that can happen when checking whether the RQ is empty.
When running the test on this branch, I saw that the crawler finished prematurely even though the RQ was still quite full. The hypothesis about the premature exit was confirmed by running the same test on a branch modified with an extra log message and sleep.
Please see the following tiny commit based on this branch and the attached log:
5b25728

From this run: https://console.apify.com/actors/UdJ0NQp2j8140G9db/runs/4yfZqLvLo1xhrk2Cb#log
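To make the suspected race concrete, here is a hypothetical toy sketch, not the crawlee/apify implementation: if the "is the queue finished?" check looks only at pending requests, a handler that is still running (and about to enqueue more work) slips past the check, and with a locally hosted site there is no network delay to hide that window.

```python
from __future__ import annotations

import asyncio
from collections import deque


class ToyRequestQueue:
    """Deliberately simplified in-memory queue; not the real crawlee/apify RQ."""

    def __init__(self) -> None:
        self._pending: deque[str] = deque(['https://example.com/start'])
        self._in_progress: set[str] = set()

    def fetch_next(self) -> str | None:
        if not self._pending:
            return None
        url = self._pending.popleft()
        self._in_progress.add(url)
        return url

    def add(self, url: str) -> None:
        self._pending.append(url)

    def mark_handled(self, url: str) -> None:
        self._in_progress.discard(url)

    def is_finished_naive(self) -> bool:
        # Race-prone: ignores requests that are still being handled and may
        # enqueue more work right after this check.
        return not self._pending

    def is_finished_safe(self) -> bool:
        # Only finished when nothing is pending *and* nothing is in progress.
        return not self._pending and not self._in_progress


async def handle(queue: ToyRequestQueue, url: str) -> None:
    await asyncio.sleep(0.01)  # simulated handler work
    queue.add('https://example.com/next')  # new request discovered mid-handling
    queue.mark_handled(url)


async def main() -> None:
    queue = ToyRequestQueue()
    url = queue.fetch_next()
    assert url is not None
    task = asyncio.create_task(handle(queue, url))
    await asyncio.sleep(0)  # let the handler start; it has not finished yet

    # With no network delay the main loop reaches this check while the
    # handler is still running.
    print('naive check:', queue.is_finished_naive())  # True  -> premature exit
    print('safe check: ', queue.is_finished_safe())   # False -> keep crawling

    await task


if __name__ == '__main__':
    asyncio.run(main())
```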
Force-pushed from 2f39b35 to ca72313
- Add more integration tests for RQ (mostly to better test the new Apify RQ in #470); a hedged sketch of such a test body follows below.
- Use `Actor.log`, so that its messages are visible in the test output.
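The sketch below shows the kind of actor body such an integration test might run. The method names follow the public RequestQueue API, but exact signatures differ between apify-sdk-python and crawlee versions, so treat this as an assumption rather than the tests actually added in that PR.

```python
from apify import Actor


async def main() -> None:
    async with Actor:
        rq = await Actor.open_request_queue()
        await rq.add_request('https://example.com')

        # `Actor.log` messages are what show up in the test run's output.
        Actor.log.info('is_empty after add: %s', await rq.is_empty())

        request = await rq.fetch_next_request()
        if request is not None:
            Actor.log.info('Processing %s', request.url)
            await rq.mark_request_as_handled(request)

        Actor.log.info('is_finished: %s', await rq.is_finished())
```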
Force-pushed from d59b100 to f5189c5