Glad you already found a solution, and thanks for sharing the resource!
Regarding the mass audio generation:
Japanese audio (for dictionary cards) is generated through Google APIs and cached on Kitsun’s hosting once generated, so no additional API requests are made for the same vocabulary word.
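To make the caching idea concrete, here is a rough sketch of a cache-first lookup. This is not Kitsun's actual code: `cache_path`, `get_audio`, the hashed file layout, and the `synthesize` callback (standing in for whatever Google API call is really used) are all assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, lang: str, text: str) -> Path:
    """Map a (language, text) pair to a stable file name (hypothetical layout)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return cache_dir / lang / f"{digest}.mp3"


def get_audio(
    cache_dir: Path,
    lang: str,
    text: str,
    synthesize: Callable[[str, str], bytes],
) -> Path:
    """Return the cached audio file, calling the paid API only on a cache miss."""
    path = cache_path(cache_dir, lang, text)
    if path.exists():
        # Cache hit: no API request, so no extra cost.
        return path
    # Cache miss: generate once and store the result for every later request.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(synthesize(lang, text))
    return path
```

With a layout like this, a second request for the same word never reaches `synthesize`, which is why already-cached vocabulary costs nothing extra.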
I just took a look, and the hosting now has 28,749 audio files for Japanese, which I think covers most of the common vocabulary. With that in mind, it might be okay to allow generation for complete decks, since most of the words should already be cached on the Kitsun hosting. That should keep it from incurring large additional costs (Google API requests cost money).
There already is an internal API function that generates audio for complete decks, so it wouldn’t be difficult to add this to the frontend of Kitsun for everyone to use.
The only problem I see is that users might generate audio for languages other than Japanese, or for fields that contain sentences. Neither is cached yet, so that could incur large additional costs.
Maybe as a workaround I could add a “Request audio generation” feature (button + field selection) that sends me a request, which I could then approve manually?
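One way such a request could be pre-screened is to estimate how much of the selected field is not cached yet, and only ask for manual approval when that amount is large. The sketch below is purely illustrative: `estimate_deck`, `needs_manual_approval`, the `is_cached` callback, and the 500-character threshold are made-up names and numbers, not existing Kitsun functions or limits.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class GenerationEstimate:
    total: int           # unique non-empty texts in the selected field
    cached: int          # texts that already have an audio file
    uncached_chars: int  # characters that would have to be sent to the API

    @property
    def uncached(self) -> int:
        return self.total - self.cached


def estimate_deck(
    texts: Iterable[str], is_cached: Callable[[str], bool]
) -> GenerationEstimate:
    """Count how much of a deck field would actually trigger paid API requests."""
    unique = {t.strip() for t in texts if t.strip()}
    missing = [t for t in unique if not is_cached(t)]
    return GenerationEstimate(
        total=len(unique),
        cached=len(unique) - len(missing),
        uncached_chars=sum(len(t) for t in missing),
    )


def needs_manual_approval(
    estimate: GenerationEstimate, max_uncached_chars: int = 500
) -> bool:
    """Hypothetical rule: large uncached requests wait for manual approval."""
    return estimate.uncached_chars > max_uncached_chars
```

A deck of common vocabulary would mostly hit the cache and pass straight through, while a sentence field or another language would show a high uncached character count and land in the approval queue.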