Comments on Salmon Run: Sentence Similarity using Word2Vec and Word Movers Distance
Feed author: Sujit Pal (http://www.blogger.com/profile/06835223352394332155)
Feed updated: 2024-03-17T13:30:18.387-07:00
11 comments, newest first

Sujit Pal, 2017-05-02T10:04:04.721-07:00
Thank you. To answer your question, it doesn't; it just uses the shortest distance between single words on the LHS and RHS. I guess we could do a limited form of that by adding 2-grams and 3-grams to our list of tokens on either end, and removing the subsumed tokens after an n-gram has matched.

Mr Elusive (https://www.blogger.com/profile/10795151344292949879), 2017-05-01T10:00:38.564-07:00
Nice work! How does your implementation account for the "flow" of one LHS word to multiple RHS words?

Sujit Pal, 2016-06-02T23:47:56.026-07:00
Thank you, and sorry, but I don't know of R packages that implement WMD. But it should be possible to implement it yourself; <a href="http://vene.ro/blog/word-movers-distance-in-python.html" rel="nofollow">this blog post</a> has a nice explanation and a Python implementation.

Anonymous, 2016-06-01T04:49:45.092-07:00
Hi, this is a great post, and I would like to implement WMD using R. Do you know of any existing R packages or references for implementing it in R?
It would be great to have a reply, and thanks for your time.

Sujit Pal, 2016-04-24T09:45:17.600-07:00
Thank you. I did this using a Scala Databricks Notebook, and I have already provided snippets in the post. I have also downloaded the notebook as HTML and put it into a <a href="https://gist.github.com/sujitpal/a9ce8aad1d80c92f70c01acab62749b9" rel="nofollow">Github Gist here</a>. Unfortunately the gist does not render the page, so you will have to download it locally and view it in a browser.

Anonymous, 2016-04-20T04:37:48.452-07:00
It is great that you are publishing open source code for learning. I'm also very new to Scala. Could you post the full code files on GitHub or send them to (touy_say@hotmail.com)? That would give a deeper understanding of what you did in your implementation.

Thanks in advance for posting it.

Sujit Pal, 2016-01-08T11:55:40.150-08:00
@Anonymous: apologies for the delay in replying, I just saw your comment sitting in my queue and must have missed it earlier. I used <a href="https://radimrehurek.com/gensim/models/word2vec.html" rel="nofollow">Gensim's Word2Vec module</a> to do the conversion from BIN to TSV.

@Sander: I don't have code for WMD in either of these languages, but here is the definition: The WMD is a ...

Anonymous (https://www.blogger.com/profile/00815025625155764930), 2016-01-08T11:10:16.091-08:00
I see. I am working with Python, R, and Matlab, and it is a pity I cannot understand this code. Do you know of Python, R, or Matlab examples, or some simple description? By the reference in
http://sujitpal.blogspot.ca/2014/12/semantic-similarity-for-short-sentences.html
to "A reader recently recommended a paper for me to read - Sentence Similarity Based on Semantic Nets and Corpus Statistics" ...

Sujit Pal, 2016-01-08T09:41:04.261-08:00
Hi Sander, there is a link to the original WMD paper at the top of the post. Also, I calculate WMD in my post using the code snippet that starts with "val bestWMDs".

Anonymous (https://www.blogger.com/profile/00815025625155764930), 2016-01-08T09:29:02.750-08:00
Sounds great. It is very popular now to add more information to word2vec, for example Learning Word Vectors for Sentiment Analysis (http://ai.stanford.edu/~amaas/papers/wvSent_acl2011.pdf), Learning Sentiment-Specific Word Embedding (http://anthology.aclweb.org/P/P14/P14-1146.pdf), or Coooolll: A Deep Learning System for Twitter Sentiment Classification (http://www.aclweb.org/anthology/S14-2033), but could you share some ...

Anonymous, 2015-12-27T08:27:46.230-08:00
How did you convert GoogleNews-vectors-negative300.bin to GoogleNews-vectors-negative300.tsv?
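
On the BIN to TSV question above: the reply in this thread says the conversion was done with Gensim's Word2Vec module, but it does not show the code. As a rough illustration only, the sketch below reads the word2vec binary file directly with the Python standard library instead of going through Gensim. It assumes the usual word2vec C binary layout (an ASCII header line with the vocabulary size and vector size, then each word followed by a space and that many little-endian float32 values). The function name `word2vec_bin_to_tsv`, the `limit` parameter, and the TSV layout (word, then one tab-separated column per vector component) are assumptions made for this example, not the author's actual format.

```python
import struct


def word2vec_bin_to_tsv(bin_path, tsv_path, limit=None):
    """Convert a word2vec C-format binary file to TSV rows: word<TAB>v1<TAB>v2...

    Hypothetical helper; returns the number of rows written.
    """
    written = 0
    with open(bin_path, "rb") as src, open(tsv_path, "w", encoding="utf-8") as dst:
        # Header is ASCII text: "<vocab_size> <vector_size>\n"
        vocab_size, dim = (int(x) for x in src.readline().split())
        rows = vocab_size if limit is None else min(limit, vocab_size)
        for _ in range(rows):
            # The word runs up to the first space. A newline that terminates
            # the previous record may precede it, so skip newlines.
            chars = bytearray()
            while True:
                ch = src.read(1)
                if ch == b"":
                    raise EOFError("file ended while reading a word")
                if ch == b" ":
                    break
                if ch != b"\n":
                    chars.extend(ch)
            word = chars.decode("utf-8", errors="replace")
            # The vector is `dim` little-endian 4-byte floats.
            raw = src.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("file ended while reading a vector")
            vector = struct.unpack("<%df" % dim, raw)
            # Assumed TSV layout: the word, then one column per component.
            dst.write(word + "\t" + "\t".join(repr(v) for v in vector) + "\n")
            written += 1
    return written
```

Calling `word2vec_bin_to_tsv("GoogleNews-vectors-negative300.bin", "GoogleNews-vectors-negative300.tsv", limit=1000)` would write only the first 1000 rows, which is a cheap way to inspect the output before converting the whole file.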