Comments on Salmon Run: Visualizing Intermediate Outputs of a Similarity Network

Hi Aarthi, I treat words with 1 occurrence as rare...

2017-09-15T10:07:26.910-07:00

Hi Aarthi, I treat words with 1 occurrence as rare and lump them with other OOV words. It's an approximation to keep the size of the embedding down and to improve generalization. Using the full vocabulary is definitely a valid option, but I didn't try it, so I can't say for sure how it will impact the results.
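
The idea of lumping single-occurrence words with the OOV words can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the notebook: the function names build_vocab and encode, the min_count parameter, the PAD and UNK tokens, and the toy sentences are all hypothetical.

```python
from collections import Counter


def build_vocab(sentences, min_count=2, pad_token="PAD", unk_token="UNK"):
    """Map words seen at least min_count times to integer ids.

    Ids 0 and 1 are reserved for padding and unknown (OOV) words.
    These names and the threshold are assumptions for illustration.
    """
    counts = Counter(w for s in sentences for w in s.lower().split())
    word2id = {pad_token: 0, unk_token: 1}
    for word, count in counts.most_common():
        if count >= min_count:
            word2id[word] = len(word2id)
    return word2id


def encode(sentence, word2id, unk_token="UNK"):
    """Replace each word by its id; rare and unseen words share the UNK id."""
    unk_id = word2id[unk_token]
    return [word2id.get(w, unk_id) for w in sentence.lower().split()]


# Toy data, not the STS dataset.
sentences = ["a man is playing a guitar", "a woman is playing a violin"]
word2id = build_vocab(sentences)
print(len(word2id))  # size of the embedding's input dimension
print(encode("a man is playing a flute", word2id))
```

Setting min_count=1 keeps the full vocabulary, which is the alternative raised in the question below; the trade-off is a larger embedding matrix whose rows for rare words are trained on a single example each.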

Hi Sujit, While you form the vocabulary of the dat...

2017-09-14T23:05:14.952-07:00

Hi Sujit, when you form the vocabulary of the dataset, you ignore the words which have just one occurrence in the dataset. How can that be justified? Shouldn't I use the entire vocabulary, with a size of 10270?

Hi Aarthi, the sts-vocab file is generated using t...

2017-08-15T08:42:47.473-07:00

Hi Aarthi, the sts-vocab file is generated using the 06-sts-data-analysis notebook.

Hi Sujit Where do i find the sts-vocab.tsv file? ...

2017-08-14T21:18:13.368-07:00

Hi Sujit

Where do I find the sts-vocab.tsv file? I would like to visualize the model.