Comments on Salmon Run: Clustering Word Vectors using a Self Organizing Map
Blog author: Sujit Pal. 9 comments, newest first. Feed last updated 2024-03-17.

Sujit Pal, 2020-04-29 13:35:
If I understand the question correctly, you want to derive document vectors from word vectors and then cluster the documents using these document vectors? So there are two questions: (1) how to convert word vectors to document vectors, and (2) how to cluster document vectors. For the first question, if you know nothing about your documents, you can simply add up the word vectors, take their mean, and call ...

Anonymous, 2018-10-08 10:05:
How do I cluster word2vec vectors with respect to the number of clusters?

Anonymous, 2018-10-08 10:01:
Sujit Sir,

Thank you for the nice post. I need a little guidance on how to use word2vec with respect to documents, or how to cluster the word2vec model with respect to clusters.

For example, I take 10 sentences and word2vec creates 30 vectors. How can I map them to documents? Or how can I cluster the documents using the word2vec vectors? I can cluster the ...

Sujit Pal, 2016-07-18 08:56:
You could transpose the TD (term-document) matrix so that each row is the TF-IDF of a word across the various documents. When brought down to 2 dimensions, you will see similar words (in the sense that they co-occur in the same way) close together. Not sure if this would be useful for summarization though, since (at least for extractive summaries) you are looking inside the same document for the top N sentences that are ...

Anonymous, 2016-07-17 14:29:
Can I use TF-IDF vectors to map words instead of documents? Organizing or clustering documents wouldn't help me with the summarization process.

Sujit Pal, 2016-07-16 18:50:
Given your pipeline, after TF-IDF, each document is represented as a vector of shape (N,), where N is the number of words in your vocabulary. SOM will reduce each document to a point in two-dimensional space. Documents that are related will be close to each other in this space. Not sure if this is what you want, since it doesn't seem to help with the summarization process. But assuming you do (perhaps this ...

Anonymous, 2016-07-16 17:19:
Hello Sujit,

I am doing text summarization, for which I plan the following pipeline: raw text files -> lemmatization -> stop word removal -> TF-IDF -> SOM clustering and visualization.

Can you explain how to feed the TF-IDF output from the Spark ML library into the SOM (that is, how to design the input vector for the SOM)? The encog package takes an array of lists of doubles as input to the SOM. How to transform from spark ml ...

Sujit Pal, 2016-03-15 07:26:
Sure, just let me know where you are stuck and what you have already done to handle it, so that I don't duplicate your work.

Unknown, 2016-03-15 02:43:
Hello Mr. Sujit,

I am very new to Scala coding and want to use the SOM code for clustering line vectors generated using gensim. Can you please help me run your code?

Regards,
Sachin
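
Editor's illustration: the 2020-04-29 reply in this thread suggests turning word vectors into a document vector by adding them up and taking their mean. A minimal Python sketch of that idea follows. It is not the blog author's code. The toy 3-dimensional embeddings and the names `word_vectors` and `document_vector` are hypothetical; in practice the lookup table would come from a trained word2vec model, and skipping out-of-vocabulary tokens is an assumption of this sketch.

```python
from typing import Dict, List, Sequence


def document_vector(tokens: Sequence[str],
                    word_vectors: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the tokens that have an entry in word_vectors.

    Tokens without a vector are skipped (an assumption of this sketch).
    If no token is known, a zero vector of the embedding size is returned.
    """
    dim = len(next(iter(word_vectors.values())))
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy embeddings, not taken from any real model.
word_vectors = {
    "map": [1.0, 0.0, 2.0],
    "vector": [3.0, 2.0, 0.0],
    "cluster": [2.0, 4.0, 4.0],
}

# Two tokenized "documents"; "unknown" has no vector and is ignored.
docs = [["map", "vector"], ["cluster", "unknown", "map"]]
doc_vectors = [document_vector(d, word_vectors) for d in docs]
```

Each document ends up as one fixed-length vector regardless of how many words it contains, so the rows of `doc_vectors` can be passed to a SOM or any other clustering method in the same way as individual word vectors.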