WordList for Manga

@Neicudi

Hi! I read your post on floflo about data mining manga for word lists. Can you elaborate on what software you used to generate the word list and add definitions? I have 600 manga in my house I would like to get started on… I found lots of software to convert images to text, but nothing for the next step of filtering and sorting by words. Did you code that up yourself?

Regards,
Dan

3 Likes

Nice shelf :ok_hand:

1 Like

Hi @2stash, do you perhaps have a link to the post you are referring to? I think it’s been a while since I last posted something about it, so I’m not sure what exactly I mentioned there ^^

Yeah, the hardest part is extracting the text from the manga. After that I run it through a custom-written Node.js script which creates wordlists and such.

That said, you could also just paste the extracted text into Kitsun’s Reader (which improves upon the above-mentioned script/parsing method). It can help you figure out sentence meanings and lets you create flashcards from vocabulary very easily. I think that’d pretty much do the same trick without you having to wade through a lot of unimportant cards.
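
For anyone wondering what the "create a wordlist" step could look like once the text is extracted, here is a minimal sketch in Python. It is not the actual Node.js script; the names `build_wordlist`, `known`, and `min_count` are made up for illustration. It also assumes the text has already been split into words by a morphological analyzer of your choice, since Japanese has no spaces and splitting it properly is a separate problem.

```python
import re
from collections import Counter

# Runs of hiragana/katakana (U+3040-U+30FF), kanji (U+4E00-U+9FFF),
# or the iteration mark 々 (U+3005). Punctuation like 。 does not match.
JAPANESE_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\u3005]+")


def build_wordlist(tokens, known=frozenset(), min_count=1):
    """Return (word, count) pairs, most frequent first.

    tokens:    words already split by a tokenizer (assumption).
    known:     words you already know and want left out (hypothetical filter).
    min_count: drop words that appear fewer times than this.
    """
    counts = Counter(
        t for t in tokens
        if JAPANESE_RUN.fullmatch(t) and t not in known
    )
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


# Pre-tokenized sample text, separated by spaces for the sake of the example.
sample = "猫 が 好き 。 猫 は 可愛い".split()
for word, n in build_wordlist(sample, known={"が", "は"}):
    print(word, n)
```

Sorting by frequency puts the words you will run into most often at the top, and the `known` set is one simple way to avoid wading through cards for words you already have down. Looking up definitions would be a further step on top of this.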

1 Like

@jprspereira Thanks! They sell for like 80 cents apiece in Japan! So I got a few when I lived there :slight_smile:

Hey @Neicudi, thanks for replying! I read it at “https://floflo.moe/final-thoughts-on-floflo/”, which is a pretty cool and inspirational story. I am curious what you are doing now besides kitsun.io. I am currently switching careers into software development (after 15 years in aerospace engineering). I actually made a Japanese learning website 2 years ago with WordPress, and was trying to figure out how to build it out. I was also trying to make a Japanese learning video game. I ended up giving up because everything felt impossible, but I started up again this year and found a way to learn that fits my style. I have finished a few projects (React, Node) and I am working on getting a few more done, mostly bigger ones that could be profitable, but also for my resume. I am kind of looking for a job, but I really want to get more of my own projects done first. I also wanted to say that, if you haven’t checked out the job market, it’s easy to get a software job after you have built a website like kitsun.io.

Back to the wordlist: I am specifically trying to make wordlists for reading Japanese manga. Even after studying Japanese for 10 years, I still forget either the meaning, the kanji, or the reading. So if possible, I like to have a wordlist I can refer to while reading, so I don’t have to stop and look something up, because that takes the fun out of it.

And how is your Japanese study going? In your blog you talked about spending all your time on the website and it was taking away from studying.

Dan

1 Like

Heya! Actually, Kitsun is not affiliated with floflo, as that is made by another developer. I did comment on their (now closed) thread on the WaniKani forums though, which I thought you were talking about in your original post haha :laughing: This also means that the blog you are mentioning is not written by me :stuck_out_tongue:

In case you are still interested: next to Kitsun I have a day job that I’m slowly transitioning away from (working 3 days a week now) so I can focus more on Kitsun. If all goes well I’ll probably switch to full-time Kitsun within the next 6 months, but that depends on the growth of the platform.

2 Likes

lol. Just as impressive