Probably not the first AI wrapper around Playwright this week, and certainly not the first this month.
I think this use case of automation in a BPA sense is more compelling than using it for test automation, because the latter is much more concerned with the precision and repeatability of the process. For the BPA task, arguably you care only about the outcome and it often doesn't matter if it gets there via some crazy route.
Part of the problem for me is that your example video shows a big wodge of prompt that had to be written to make this work and then a few kb of payload data (parameters) in a plaintext, non-csv format. If the expectation is that this replaces someone just using Playwright with codegen due to that being too technical, I'm not convinced there is a huge group of people who can manage one task but not the other.
Furthermore, you are expecting them to pass over their website login credentials and apparently their credit card details too, in plain text. You had better have a very solid idea of how to handle that sensitive data to avoid serious consequences if your users' Skyvern accounts are compromised.
I think the frequency of website redesigns is oversold by people producing these LLM-driven Playwright wrappers, especially when targeting old-fashioned or government sites. As an example, we have had a suite of lengthy Playwright browser automations to interact with a government site for a few years and have had to maintain them only once, when the agency's business process changed. The prompt would also have needed to change had we used Skyvern, as would the payload, because the process was different. The difference with the Playwright automation, though, is that we could use assertions to verify that steps had succeeded or failed and that data had been recorded correctly, so we would know when the process needed updating. I can't see that option in Skyvern, which would have me worrying that process changes would be overlooked and we would unknowingly start entering the wrong data or missing steps.
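To make the assertion point concrete, here is a minimal sketch in plain Python of pairing every action with a check, so that a changed process fails loudly instead of quietly recording the wrong data. It is not Playwright's or Skyvern's API: `Step`, `run_steps`, `StepFailed`, and the example fields are hypothetical names, and a dict stands in for the page.

```python
from dataclasses import dataclass
from typing import Callable


class StepFailed(Exception):
    """Raised when a step's post-condition does not hold."""


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]  # mutates the stand-in page state
    check: Callable[[dict], bool]   # post-condition verified after the action


def run_steps(steps: list[Step], state: dict) -> list[str]:
    """Run each step, then verify its post-condition before moving on."""
    completed = []
    for step in steps:
        step.action(state)
        if not step.check(state):
            # Stop immediately: later steps would build on bad data.
            raise StepFailed(f"post-condition failed after step: {step.name}")
        completed.append(step.name)
    return completed


# Hypothetical two-step flow against an in-memory stand-in for a web form.
steps = [
    Step(
        "fill reference",
        lambda s: s.update(reference="ABC-123"),
        lambda s: s.get("reference") == "ABC-123",
    ),
    Step(
        "submit",
        lambda s: s.update(status="received"),
        lambda s: s.get("status") == "received",
    ),
]
```

In a real Playwright script the checks would be assertions against the live page rather than a dict, but the shape is the same: if the agency changes the process, the first failing check tells you which step needs updating.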
1/ the current prompt + payload structure is definitely on the complicated end of the spectrum, but we've found that we can use an LLM to help generate this payload for our users.
The technical users want to learn more and generate their own payloads, and the non-technical users prompt LLMs to help them generate the final Skyvern prompt to get going.
This was very unexpected -- but a surprisingly logical chain of events.
Phase 1: build the thing the complex way (playwright)
Phase 2: build the playwright thing with complex prompts (we are here right now)
Phase 3: build the thing that builds the playwright thing with simpler prompts
Each phase lowers the technical bar to build your automations
2/ re: frequency of website changes
This IMO is a smaller value prop of LLM-based automations. The biggest one is being able to handle highly dynamic situations. Consider the case where you're automating an e-commerce website where the popup offer changes every week. Skyvern doesn't even notice those, but Playwright scripts would break.
Similarly, I love using the Geico example because it highlights something that was very difficult to automate before: the form changes every time you run it.
Skyvern breezes through it. It's just another case that was hard to automate before.
3/ data correctness
We're actually rolling out a workflows feature that allows you to chain multiple tasks together. The cool thing about this feature is that you can add steps to have Skyvern self-validate its own output before continuing.
For example, you can add n products to the cart, then navigate to the cart and validate the cart state.
... As you can guess, this creates the foundation to have another agent go and use these tools to self-build workflows with simpler prompts
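As a rough illustration of what a "validate the cart state" step boils down to, here is a sketch in plain Python. This is not Skyvern's workflow API: `cart_discrepancies` and the SKU names are hypothetical, and it assumes the cart contents have already been extracted as a mapping of item to quantity.

```python
def cart_discrepancies(expected: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Compare the cart we meant to build with the cart we actually see.

    Returns one message per mismatched item; an empty list means the
    workflow can safely continue to the next task.
    """
    problems = []
    # Look at every item that appears on either side, in a stable order.
    for sku in sorted(set(expected) | set(observed)):
        want = expected.get(sku, 0)
        got = observed.get(sku, 0)
        if want != got:
            problems.append(f"{sku}: expected {want}, found {got}")
    return problems


# Hypothetical usage: halt the workflow if anything is off.
issues = cart_discrepancies({"widget": 2, "gadget": 1}, {"widget": 2, "gadget": 1})
if issues:
    raise RuntimeError("cart validation failed: " + "; ".join(issues))
```

The point of the design is that the validation step returns something explicit to act on, so a later task only runs when the earlier one provably did what was asked.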
TL;DR -- we're on a pretty long journey to use LLMs to make BPA easier and easier, and this is just the first step