Comments on Salmon Run: Visualizing Intermediate Outputs of a Similarity Network
Sujit Pal (feed updated 2024-03-05T03:17:02.289-08:00)

Sujit Pal, 2017-09-15T10:07:26.910-07:00
Hi Aarthi, I treat words with 1 occurrence as rare and lump them with other OOV words. It's an approximation to keep the size of the embedding down and to improve generalization. Using the full vocabulary is definitely a valid option, but I didn't try it, so I can't say for sure how it will impact the results.

Aarthi, 2017-09-14T23:05:14.952-07:00
Hi Sujit, when you form the vocabulary of the dataset, you have ignored the words that have just one occurrence in the dataset. How can that be justified? Shouldn't I use the entire vocabulary, with a size of 10270?

Sujit Pal, 2017-08-15T08:42:47.473-07:00
Hi Aarthi, the sts-vocab file is generated using <a href="https://github.com/sujitpal/eeap-examples/blob/master/src/06-sts-data-analysis.ipynb" rel="nofollow">the 06-sts-data-analysis notebook</a>.

Aarthi, 2017-08-14T21:18:13.368-07:00
Hi Sujit,

Where do I find the sts-vocab.tsv file? I would like to visualize the model.
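
For readers following the exchange about words with a single occurrence, here is a minimal sketch of that kind of vocabulary pruning. It is not the code from the notebook: the function names `build_vocab` and `encode`, the `min_count` parameter, the `<pad>` and `<unk>` tokens, and the toy sentences are all assumptions made for illustration. The idea is that any word seen fewer than `min_count` times shares one OOV index, so the embedding matrix needs fewer rows.

```python
from collections import Counter

# Hypothetical reserved tokens; the original notebook may use different ones.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(sentences, min_count=2):
    """Map each word seen at least min_count times to its own integer id.

    Words below the threshold get no entry, so they fall back to the
    shared UNK id at lookup time.
    """
    counts = Counter(w for s in sentences for w in s.lower().split())
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    word2id = {PAD: 0, UNK: 1}
    for w in kept:
        word2id[w] = len(word2id)
    return word2id


def encode(sentence, word2id):
    """Convert a sentence to ids, sending rare and unseen words to UNK."""
    unk = word2id[UNK]
    return [word2id.get(w, unk) for w in sentence.lower().split()]


# Toy data, assumed for illustration only.
sentences = ["a man is playing a guitar", "a woman is playing a violin"]
word2id = build_vocab(sentences)
ids = encode("a man is playing a flute", word2id)
```

Calling `build_vocab(sentences, min_count=1)` keeps every word, which corresponds to the full-vocabulary option raised in the question. The embedding layer would then need one row per entry in `word2id`.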