Waiting impatiently for the day when we can pull in ideal customer datasets to help us learn about different buyer personas. Note to B2B: if you're not archiving your chat and bot data in an organized way -- and so many firms aren't -- please start. 🙀
Well that was fun to chat with :D
Seems like that conversation about my code not compiling quickly turned into some deep discussion about each of us ;)
Hey guys.
Don’t expect accurate answers here - it’s just another weekend hack for fun from the 🤗 team (following https://medium.com/@julien_c/cha... which got over 100,000 messages in just a few hours).
With this experiment, we wanted to understand how we could train a neural network on new external datasets very quickly, and to study how its language and answers differ depending on the topic of the dataset.
No scripted chatbot here -- we're talking about neural nets trained in the wild :) We're using the celebrated seq2seq model, which computes a “thought vector” from an input sentence and generates an output sentence conditioned on this vector. We gathered various datasets from Stack Exchange and launched a big overnight training run of our models to have some surreal morning coffee talks with our AI.
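For the curious, here's a minimal sketch of that seq2seq idea in PyTorch (not our actual training code -- the vocabulary size, dimensions and toy batch below are made up for illustration): an encoder compresses the input sentence into a single hidden state, the "thought vector", and a decoder generates the reply token by token, conditioned on it.

```python
# Minimal seq2seq sketch: encoder -> "thought vector" -> decoder.
# Hypothetical sizes and toy data, just to show the shape of the idea.
import torch
import torch.nn as nn

VOCAB_SIZE, EMB_DIM, HIDDEN_DIM = 1000, 64, 128

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.gru = nn.GRU(EMB_DIM, HIDDEN_DIM, batch_first=True)

    def forward(self, src):                    # src: (batch, src_len) token ids
        _, hidden = self.gru(self.embed(src))  # hidden: (1, batch, HIDDEN_DIM)
        return hidden                          # the "thought vector"

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.gru = nn.GRU(EMB_DIM, HIDDEN_DIM, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, tgt, thought):           # tgt: (batch, tgt_len) token ids
        output, _ = self.gru(self.embed(tgt), thought)
        return self.out(output)                # logits over the vocabulary

encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, VOCAB_SIZE, (2, 7))     # toy batch: 2 sentences, 7 tokens
tgt = torch.randint(0, VOCAB_SIZE, (2, 5))
logits = decoder(tgt, encoder(src))            # (2, 5, VOCAB_SIZE)
print(logits.shape)
```

Swapping in a different Stack Exchange subset just means feeding the same architecture different (question, answer) pairs, which is why we can retrain on a new topic overnight.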
There’s still a ton of work before the answers start to make sense (longer training and bigger datasets would of course improve the quality, and we can easily add components to improve the variety and coherence of the responses), but you can already notice a clear difference depending on the dataset subset.
This will allow us to test a variety of new datasets way faster than before.
Let us know what you think and post your funniest exchanges below! And if you think of a cool dataset to test, just let us know on GitHub.
Julien from HF